[openstack-dev] [nova] [cyborg] Race condition in the Cyborg/Nova flow

Eric Fried openstack at fried.cc
Thu Mar 29 17:57:08 UTC 2018


> That means that for the (re)-programming scenarios you need to
> dynamically adjust the inventory of a particular FPGA resource provider.

Oh, see, dynamically adjusting inventory is something I had *thought*
was a non-starter.  If it's allowed, it makes the "single program" case
way easier to deal with, and lets it be handled on the fly:

* Model your region as a provider with separate resource classes for
each function it supports.  The inventory totals for each would be the
total number of virtual slots (or whatever they're called) of that type
that are possible when the device is flashed with that function.
* An allocation is made for one unit of class X.  This percolates down
to cyborg to do the flashing/attaching.  At that point, cyborg *deletes*
the inventories for all the other resource classes.
* In a race with different resource classes, whoever gets to cyborg
first, wins.  The second one will see that the device is already flashed
with X, and fail.  The failure will bubble up, causing the allocation to
be released.
* Requests for multiple different resource classes at once will have to
filter out allocation candidates that put both on the same device.  I'm
not completely sure how that filtering would happen.  Without it, such
requests would have to fail at cyborg, resulting in the same
bubble/deallocate as above.
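
The flow in the bullets above can be sketched as a toy, in-memory model.
This is not the Placement or Cyborg API: the Region class, the boot()
helper, the AlreadyFlashed exception and the CUSTOM_FPGA_X/Y resource
class names are all hypothetical, and a single lock stands in for
whatever serialization the real services would provide.

```python
import threading


class AlreadyFlashed(Exception):
    """The region is already programmed with a different function."""


class Region:
    """Hypothetical stand-in for one FPGA region modeled as a provider."""

    def __init__(self, inventories):
        # resource class -> total virtual slots when flashed with it
        self.inventories = dict(inventories)
        self.allocations = {}  # consumer -> resource class
        self.flashed = None
        self._lock = threading.Lock()

    def allocate(self, consumer, rc):
        # "Placement" side: claim one unit if rc still has free inventory.
        with self._lock:
            used = sum(1 for c in self.allocations.values() if c == rc)
            if used >= self.inventories.get(rc, 0):
                return False
            self.allocations[consumer] = rc
            return True

    def release(self, consumer):
        with self._lock:
            self.allocations.pop(consumer, None)

    def program(self, rc):
        # "Cyborg" side: the first caller flashes the region and deletes
        # the inventories of every other resource class.
        with self._lock:
            if self.flashed is None:
                self.flashed = rc
                self.inventories = {rc: self.inventories[rc]}
            elif self.flashed != rc:
                raise AlreadyFlashed(self.flashed)


def boot(region, consumer, rc):
    """Allocate, then program; release the allocation if programming fails."""
    if not region.allocate(consumer, rc):
        return False
    try:
        region.program(rc)
    except AlreadyFlashed:
        # The failure bubbles up and the allocation is released.
        region.release(consumer)
        return False
    return True
```

In this model, two consumers can both get allocations for different
classes before either reaches program(); the second program() call
raises, and its allocation is released.  Once the region is flashed,
later requests for the other class fail at allocate() because that
inventory no longer exists.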

-efried


