[nova] [cyborg] Impact of moving bind to compute

Matt Riedemann mriedemos at gmail.com
Thu Jun 6 20:32:57 UTC 2019


On 5/23/2019 7:00 AM, Nadathur, Sundar wrote:
> Hi,
> 
>      The feedback in the Nova – Cyborg interaction spec [1] is to move 
> the call for creating/binding accelerator requests (ARQs) from the 
> conductor (just before the call to build_and_run_instance, [2]) to the 
> compute manager (just before spawn, without holding the build semaphore 
> [3]). The point where the results of the bind are needed is in the virt 
> driver [4] – that is not changing. The reason for the move is to enable 
> Cyborg to notify Nova [5] instead of the Nova virt driver polling Cyborg, 
> thus making the interaction similar to other services like Neutron.
> 
> The binding involves device preparation by Cyborg, which may take some 
> time (ballpark: milliseconds to a few seconds to perhaps tens of seconds – 
> of course devices vary a lot). We want to overlap as much of this as 
> possible with other tasks, by starting the binding as early as possible 
> and making it asynchronous, so that bulk VM creation rates and the 
> like are not affected. These considerations are probably specific to 
> Cyborg, so trying to make the interaction uniform with other projects 
> deserves a closer look before we commit to it.
> 
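> To make the overlap concrete, here is a self-contained sketch (pure 
> illustration: a thread stands in for Cyborg's asynchronous device 
> preparation, and the names are made up):
> 
>     import threading
>     import time
> 
>     def bind_arqs(arq_uuids, done):
>         # Stand-in for Cyborg device preparation: milliseconds to
>         # tens of seconds depending on the device.
>         time.sleep(2)
>         done.set()
> 
>     done = threading.Event()
>     # 1. Kick off the binding as early as possible.
>     threading.Thread(target=bind_arqs, args=(['arq-1'], done)).start()
>     # 2. The rest of the build (scheduling, image download, network
>     #    plumbing) overlaps with the device preparation.
>     time.sleep(1)
>     # 3. Only the virt driver, just before spawn, has to wait.
>     done.wait(timeout=30)
> 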
> Moving the binding from [2] to [3] reduces this overlap. I did some 
> measurements of the time window from [2] to [3]: it was consistently 
> between 20 and 50 milliseconds, whether I launched 1 VM at a time, 2 at 
> a time, etc. This seems acceptable.
> 
> But this was just in a two-node deployment. Are there situations where 
> this window could get much larger (thus reducing the overlap)? Such as 
> in larger deployments, or issues with RabbitMQ messaging, etc. Are there 
> larger considerations of performance or scaling for this approach?
> 
> Thanks in advance.
> 
> [1] https://review.opendev.org/#/c/603955/
> 
> [2] https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1501
> 
> [3] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1882
> 
> [4] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3215
> 
> [5] https://wiki.openstack.org/wiki/Nova/ExternalEventAPI
> 
> Regards,
> 
> Sundar
> 

I'm OK with binding in the compute since that's where we trigger the 
callback event and want to set up something to wait for it before 
proceeding, like we do with port binding.
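
To be concrete about the pattern I mean, here is a rough sketch. The 
"accelerator-request-bound" event name and the cyborg client call are 
assumptions on my part, but wait_for_instance_event is the same helper 
the libvirt driver uses to wait for network-vif-plugged:

    # In the compute manager, before spawn (sketch):
    events = [('accelerator-request-bound', arq_uuid)
              for arq_uuid in arq_uuids]
    with self.virtapi.wait_for_instance_event(instance, events,
                                              deadline=deadline):
        # Kick off the async bind; Cyborg confirms each binding via
        # os-server-external-events, which completes the wait.
        cyborg_client.bind_arqs(context, arq_uuids, host=self.host)
    # A timeout surfaces the same way it does for vif plugging, and
    # we'd abort or reschedule instead of spawning.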

What I've talked about in detail in the spec is doing the ARQ *creation* 
in conductor rather than compute. I realize that doing the creation in 
the compute service means fewer (if any) RPC API changes to get phase 1 
of this code going, but I can't imagine the RPC API changes for 
creating in conductor would be very big (it's either a new parameter 
to the compute service methods or something we lump into the 
RequestSpec).
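
Purely illustrative (the field and parameter names here are made up), 
the two shapes I can think of:

    # Option A: stash the ARQ ids on the RequestSpec via a new
    # versioned field:
    request_spec.accel_arq_uuids = [arq['uuid'] for arq in arqs]

    # Option B: a new kwarg on the compute RPC method (with the usual
    # RPC API version bump); existing kwargs elided:
    self.compute_rpcapi.build_and_run_instance(
        context, instance, host, image, request_spec,
        filter_properties, accel_arq_uuids=[arq['uuid'] for arq in arqs])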

The bigger concern I have is that we've long talked about moving port 
(and at times volume) creation from the compute service to conductor 
because it's less expensive to manage external resources there if 
something fails, e.g. going over-quota creating volumes. The problem 
with failing late in the compute is that we have to clean up other things 
(ports and volumes) and then reschedule, which may also fail on the next 
alternate host. Failing fast in conductor is more efficient and also 
helps take some of the guesswork out of which service is managing the 
resources (we've had countless bugs over the years about ports and 
volumes being leaked because we didn't clean them up properly on 
failure). Take a look at any of the error handling in the server create 
flow in the ComputeManager and you'll see what I'm talking about.
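
A compressed sketch of that asymmetry (treat this as pseudocode; the 
helper names are made up):

    # Failing late in compute: everything created so far has to be
    # unwound before rescheduling, and each cleanup step can itself
    # fail and leak (the classic port/volume leak bugs).
    try:
        spawn(instance)
    except Exception:
        deallocate_ports(instance)      # can fail -> leaked ports
        delete_volumes(instance)        # can fail -> leaked volumes
        delete_arqs(instance)           # now one more thing to leak
        reschedule(instance)            # may fail again on the next
                                        # alternate host

    # Failing fast in conductor: if ARQ creation fails (e.g. going
    # over quota), nothing else external exists yet, so we just put
    # the instance into ERROR and stop.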

Anyway, if we're voting I vote that ARQ creation happens in conductor 
and binding happens in compute.

-- 

Thanks,

Matt


