[openstack-dev] [Nova] [Cyborg] Tracking multiple functions

Alex Xu soulxu at gmail.com
Wed Mar 7 02:21:58 UTC 2018


2018-03-06 22:45 GMT+08:00 Mooney, Sean K <sean.k.mooney at intel.com>:

>
>
>
>
> *From:* Matthew Booth [mailto:mbooth at redhat.com]
> *Sent:* Saturday, March 3, 2018 4:15 PM
> *To:* OpenStack Development Mailing List (not for usage questions) <
> openstack-dev at lists.openstack.org>
> *Subject:* Re: [openstack-dev] [Nova] [Cyborg] Tracking multiple functions
>
>
>
> On 2 March 2018 at 14:31, Jay Pipes <jaypipes at gmail.com> wrote:
>
> On 03/02/2018 02:00 PM, Nadathur, Sundar wrote:
>
> Hello Nova team,
>
> During the Cyborg discussion at the Rocky PTG, we proposed a flow for
> FPGAs wherein the request spec asks for a device type as a resource class,
> and optionally a function (such as encryption) in the extra specs. This
> does not seem to work well for the usage model that I’ll describe below.
>
> An FPGA device may implement more than one function. For example, it may
> implement both compression and encryption. Say a cluster has 10 devices of
> device type X, and each of them is programmed to offer 2 instances of
> function A and 4 instances of function B. More specifically, the device may
> implement 6 PCI functions, with 2 of them tied to function A, and the other
> 4 tied to function B. So, we could have 6 separate instances accessing
> functions on the same device.
>
>
>
> Does this imply that Cyborg can't reprogram the FPGA at all?
>
> *[Mooney, Sean K] Cyborg is also intended to support fixed-function
> accelerators, so it will not always be able to program the accelerator.
> When an FPGA is preprogrammed with a statically provisioned multi-function
> bitstream, Cyborg will not be able to reprogram the slot if any of the
> functions from that slot are already allocated to an instance. In that
> case it will have to treat the slot like a fixed-function device and simply
> allocate an unused VF of the correct type, if one is available.*
>
>
>
>
>
> In the current flow, the device type X is modeled as a resource class, so
> Placement will count how many of them are in use. A flavor for ‘RC
> device-type-X + function A’ will consume one instance of the RC
> device-type-X.  But this is not right because this precludes other
> functions on the same device instance from getting used.
>
> One way to solve this is to declare functions A and B as resource classes
> themselves and have the flavor request the function RC. Placement will then
> correctly count the function instances. However, there is still a problem:
> if the requested function A is not available, Placement will return an
> empty list of RPs, but we need some way to reprogram some device to create
> an instance of function A.
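
To make the counting difference concrete, here is a toy sketch in plain Python (not Placement code). It uses the numbers from the example above; the names DEVICE_TYPE_X, FUNCTION_A and FUNCTION_B are hypothetical labels, not real resource classes.

```python
# Toy model of the two inventory schemes discussed above; not Placement code.
DEVICES = 10
PER_DEVICE = {"FUNCTION_A": 2, "FUNCTION_B": 4}  # hypothetical names


def capacity_device_rc(devices):
    # Device type as the resource class: one allocation consumes a whole device.
    return {"DEVICE_TYPE_X": devices}


def capacity_function_rc(devices, per_device):
    # Functions as resource classes: each function instance is counted on its own.
    return {rc: count * devices for rc, count in per_device.items()}


device_model = capacity_device_rc(DEVICES)
function_model = capacity_function_rc(DEVICES, PER_DEVICE)
print(device_model)
print(function_model)
```

With the device as the resource class the cluster can serve 10 requests in total; with functions as resource classes it can serve 20 requests for function A and 40 for function B.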
>
>
> Clearly, nova is not going to be reprogramming devices with an instance of
> a particular function.
>
> Cyborg might need a separate agent that listens to the nova notifications
> queue and, upon seeing an event that indicates a failed build due to lack
> of resources, tries to reprogram a device and then tries rebuilding the
> original request.
>
>
>
> It was my understanding from that discussion that we intend to insert
> Cyborg into the spawn workflow for device configuration in the same way
> that we currently insert resources provided by Cinder and Neutron. So while
> Nova won't be reprogramming a device, it will be calling out to Cyborg to
> reprogram a device, and waiting while that happens.
>
> My understanding is (and I concede some areas are a little hazy):
>
> * The flavor says device type X with function Y
>
> * Placement tells us everywhere with device type X
>
> * A weigher orders these by devices which already have an available
> function Y (where is this metadata stored?)
>
> * Nova schedules to host Z
>
> * Nova host Z asks cyborg for a local function Y and blocks
>
>   * Cyborg hopefully returns function Y which is already available
>
>   * If not, Cyborg reprograms a function Y, then returns it
>
> Can anybody correct me/fill in the gaps?
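
The steps listed above could be sketched roughly as follows. This is only an illustration under my own assumptions: FakeCyborg, get_function and weigh are made-up names, not Cyborg or Nova APIs, and the sketch ignores the case where a slot cannot be reprogrammed because some of its functions are already allocated.

```python
class FakeCyborg:
    """Stand-in for the Cyborg agent on one host; hypothetical, not a real API."""

    def __init__(self, free_functions):
        self.free = dict(free_functions)  # function name -> free instances
        self.reprogrammed = []            # functions we had to program on demand

    def get_function(self, name):
        # Return an already-available function if there is one...
        if self.free.get(name, 0) > 0:
            self.free[name] -= 1
            return name
        # ...otherwise reprogram a device to provide it, then return it.
        self.reprogrammed.append(name)
        return name


def weigh(hosts, function):
    # Order hosts so that those already offering the function come first.
    return sorted(hosts, key=lambda h: hosts[h].free.get(function, 0) > 0,
                  reverse=True)


hosts = {
    "host1": FakeCyborg({"B": 4}),
    "host2": FakeCyborg({"A": 2, "B": 4}),
}
ordered = weigh(hosts, "A")                  # the weigher prefers host2
granted = hosts[ordered[0]].get_function("A")
fallback = hosts["host1"].get_function("A")  # host1 has no free A: reprogram
```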
>
> *[Mooney, Sean K] That matches my recollection closely. As for the
> metadata, I think the weigher may need to call out to Cyborg to retrieve
> it, as it will not be available in the host state object.*
>
Is this a nova scheduler weigher, or do we want to support weighing in
placement? I think functions are traits, so could we have preferred_traits?
I remember we talked about that parameter in the past, but we didn't have a
good use case at the time. This is a good use case.
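
A toy sketch of what I mean by preferred traits, in plain Python. This is not placement code: rank_by_preferred_traits is a made-up helper, the CUSTOM_FUNCTION_* trait names are hypothetical, and how a preferred_traits parameter would actually be passed is left open.

```python
def rank_by_preferred_traits(providers, preferred):
    """Order providers by how many preferred traits they expose, most first."""
    wanted = set(preferred)
    return sorted(providers,
                  key=lambda rp: len(wanted & set(providers[rp])),
                  reverse=True)


# Both providers satisfy the required resources; only rp2 already has function A.
providers = {
    "rp1": ["CUSTOM_FUNCTION_B"],
    "rp2": ["CUSTOM_FUNCTION_A", "CUSTOM_FUNCTION_B"],
}
ranked = rank_by_preferred_traits(providers, ["CUSTOM_FUNCTION_A"])
```

Unlike a required trait, a preferred trait only changes the ordering: rp1 stays in the list, so a device there could still be reprogrammed if rp2 is full.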


> Matt
>
>
>
> --
>
> Matthew Booth
>
> Red Hat OpenStack Engineer, Compute DFG
>
>
>
> Phone: +442070094448 (UK)
>
>
>