[openstack-dev] [Nova] [Cyborg] Tracking multiple functions

Jay Pipes jaypipes at gmail.com
Fri Mar 2 14:31:11 UTC 2018


On 03/02/2018 02:00 PM, Nadathur, Sundar wrote:
> Hello Nova team,
> 
>      During the Cyborg discussion at Rocky PTG, we proposed a flow for 
> FPGAs wherein the request spec asks for a device type as a resource 
> class, and optionally a function (such as encryption) in the extra 
> specs. This does not seem to work well for the usage model that I’ll 
> describe below.
> 
> An FPGA device may implement more than one function. For example, it may 
> implement both compression and encryption. Say a cluster has 10 devices 
> of device type X, and each of them is programmed to offer 2 instances of 
> function A and 4 instances of function B. More specifically, the device 
> may implement 6 PCI functions, with 2 of them tied to function A, and 
> the other 4 tied to function B. So, we could have 6 separate instances 
> accessing functions on the same device.
> 
> In the current flow, the device type X is modeled as a resource class, 
> so Placement will count how many of them are in use. A flavor for ‘RC 
> device-type-X + function A’ will consume one instance of the RC 
> device-type-X.  But this is not right because this precludes other 
> functions on the same device instance from getting used.
> 
> One way to solve this is to declare functions A and B as resource 
> classes themselves and have the flavor request the function RC. 
> Placement will then correctly count the function instances. However, 
> there is still a problem: if the requested function A is not available, 
> Placement will return an empty list of RPs, but we need some way to 
> reprogram some device to create an instance of function A.
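
To make the counting concrete, here is a minimal sketch of the two
models using the numbers from your example. The helper names are made
up for illustration and this is not Placement code, just the arithmetic.

```python
# Numbers from the example above: 10 devices of type X, each
# programmed with 2 instances of function A and 4 of function B.
DEVICES = 10
PER_DEVICE = {"A": 2, "B": 4}


def capacity_device_as_rc(devices):
    # One unit of the device-type resource class is consumed per
    # instance, so at most one instance can use each device.
    return devices


def capacity_function_as_rc(devices, per_device):
    # Each function is its own resource class; the inventory is the
    # number of function instances summed over all devices.
    return {fn: devices * count for fn, count in per_device.items()}


device_total = capacity_device_as_rc(DEVICES)
function_totals = capacity_function_as_rc(DEVICES, PER_DEVICE)
```

With the device type as the resource class, the cluster tops out at
device_total instances even though sum(function_totals.values())
function instances exist.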

Clearly, nova is not going to be reprogramming devices with an instance 
of a particular function.

Cyborg might need a separate agent that listens to the nova 
notifications queue. Upon seeing an event that indicates a failed build 
due to lack of resources, the agent could try to reprogram a device and 
then retry the original request.
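
A rough sketch of what that agent's handler might look like is below.
The event type, the payload keys, and the reprogram/rebuild callables
are all hypothetical placeholders, not nova's actual notification
schema or any existing Cyborg API.

```python
# Hypothetical event type and payload keys, for illustration only;
# these are assumptions, not nova's real notification schema.
FAILED_BUILD_EVENT = "build.failed"
NO_RESOURCES = "no_valid_host"


def handle_notification(event, reprogram, rebuild):
    """Handle one notification; return True if a rebuild was attempted.

    reprogram(function) is assumed to return True when some device now
    offers the function; rebuild(request_id) is assumed to resubmit
    the original request.
    """
    if event.get("event_type") != FAILED_BUILD_EVENT:
        return False
    payload = event.get("payload", {})
    if payload.get("reason") != NO_RESOURCES:
        return False
    function = payload.get("requested_function")
    if function is None:
        return False
    # Only retry if a device could actually be reprogrammed.
    if not reprogram(function):
        return False
    rebuild(payload["request_id"])
    return True
```

The handler ignores everything except failed builds caused by a lack of
resources, and only retries when reprogramming succeeded, so it should
not loop on requests that can never be satisfied.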

Best,
-jay


