[nova] [cyborg] Impact of moving bind to compute
Matt Riedemann
mriedemos at gmail.com
Thu Jun 6 20:32:57 UTC 2019
On 5/23/2019 7:00 AM, Nadathur, Sundar wrote:
> Hi,
>
> The feedback in the Nova – Cyborg interaction spec [1] is to move
> the call for creating/binding accelerator requests (ARQs) from the
> conductor (just before the call to build_and_run_instance, [2]) to the
> compute manager (just before spawn, without holding the build semaphore
> [3]). The point where the results of the bind are needed is in the virt
> driver [4] – that is not changing. The reason for the move is to enable
> Cyborg to notify Nova [5] instead of Nova virt driver polling Cyborg,
> thus making the interaction similar to other services like Neutron.
>
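> For concreteness, here is a minimal sketch of what such a notification
> could look like using the existing external event API [5] via
> python-novaclient; the 'accelerator-request-bound' event name and the
> helper function are assumptions on my part, not settled interfaces:
>
>     from novaclient import client as nova_client
>
>     def notify_bind_complete(session, instance_uuid, arq_uuid, ok):
>         # Cyborg-side sketch: tell Nova that an ARQ bind finished,
>         # the same way Neutron reports network-vif-plugged events.
>         nova = nova_client.Client('2.1', session=session)
>         nova.server_external_events.create([{
>             'server_uuid': instance_uuid,
>             'name': 'accelerator-request-bound',  # assumed event name
>             'tag': arq_uuid,
>             'status': 'completed' if ok else 'failed',
>         }])
>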
> The binding involves device preparation by Cyborg, which may take some
> time (ballpark: milliseconds to a few seconds, perhaps tens of seconds;
> of course devices vary a lot). We want to overlap as much of this as
> possible with other tasks, by starting the binding as early as possible
> and making it asynchronous, so that bulk VM creation rates and the like
> are not affected. These considerations are probably specific to Cyborg,
> so making the interaction uniform with other projects deserves a closer
> look before we commit to it.
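>
> As a rough sketch of the overlap I mean (illustrative only; bind_arqs,
> do_other_build_prep and spawn are hypothetical names):
>
>     from concurrent.futures import ThreadPoolExecutor
>
>     executor = ThreadPoolExecutor(max_workers=4)
>
>     def build_instance(cyborg_client, instance, arqs):
>         # Start the slow device preparation as early as possible ...
>         bind_future = executor.submit(cyborg_client.bind_arqs, arqs)
>         # ... let it overlap the rest of the build work ...
>         do_other_build_prep(instance)  # networks, volumes, image
>         # ... and only block on it where the virt driver needs it.
>         bound_arqs = bind_future.result(timeout=300)
>         spawn(instance, bound_arqs)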
>
> Moving the binding from [2] to [3] reduces this overlap. I did some
> measurements of the time window from [2] to [3]: it was consistently
> between 20 and 50 milliseconds, whether I launched 1 VM at a time, 2 at
> a time, etc. This seems acceptable.
>
> But this was just in a two-node deployment. Are there situations where
> this window could get much larger (thus reducing the overlap), such as
> in larger deployments or with RabbitMQ messaging issues? Are there
> broader performance or scaling considerations with this approach?
>
> Thanks in advance.
>
> [1] https://review.opendev.org/#/c/603955/
>
> [2]
> https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1501
>
> [3]
> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1882
>
> [4]
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3215
>
>
> [5] https://wiki.openstack.org/wiki/Nova/ExternalEventAPI
>
> Regards,
>
> Sundar
>
I'm OK with binding in the compute since that's where we trigger the
callback event and want to set up something to wait for it before
proceeding, like we do with port binding.
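
Roughly what I have in mind, loosely modeled on how we wait for
Neutron's network-vif-plugged events today; the ARQ event name, the
config option and the cyborg client here are made up for illustration:

    # In ComputeManager, just before spawn; arqs is the list of
    # accelerator requests for this instance.
    events = [('accelerator-request-bound', arq.uuid) for arq in arqs]
    with self.virtapi.wait_for_instance_event(
            instance, events, deadline=CONF.arq_bind_timeout):
        # Kick (or confirm) the bind inside the context manager; the
        # exit blocks until Cyborg sends the events or we time out.
        cyborg_client.bind_arqs(arqs)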
What I've talked about in detail in the spec is doing the ARQ *creation*
in conductor rather than compute. I realize that doing the creation in
the compute service means fewer (if any) RPC API changes to get phase 1
of this code going, but I can't imagine any RPC API changes for that
would be very big (it's a new parameter to the compute service methods,
or something we lump into the RequestSpec).
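
For phase 1 that could be as small as stashing the ARQ UUIDs on the
RequestSpec in conductor; a sketch, where the arq_uuids field, the
device_profile_name attribute and the cyborg client are assumptions:

    def create_arqs_in_conductor(context, request_spec, cyborg_client):
        # Create the ARQs up front so a failure (e.g. over-quota) is
        # caught before we ever cast to a compute host.
        arqs = cyborg_client.create_arqs(
            request_spec.device_profile_name)  # assumed attribute
        # Carry the UUIDs to compute without a new RPC parameter.
        request_spec.arq_uuids = [arq['uuid'] for arq in arqs]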
The bigger concern I have is that we've long talked about moving port
(and at times volume) creation from the compute service to conductor
because it's less expensive to manage external resources there if
something fails, e.g. going over-quota creating volumes. The problem
with failing late in the compute is that we have to clean up other things
(ports and volumes) and then reschedule, which may also fail on the next
alternate host. Failing fast in conductor is more efficient and also
helps take some of the guesswork out of which service is managing the
resources (we've had countless bugs over the years about ports and
volumes being leaked because we didn't clean them up properly on
failure). Take a look at any of the error handling in the server create
flow in the ComputeManager and you'll see what I'm talking about.
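
To make the contrast concrete, here's the shape of the fail-fast
version; every name in it is illustrative rather than a real Nova or
Cyborg API:

    def create_external_resources(context, spec):
        # Conductor-side: create everything before casting to compute.
        created = []
        try:
            created.append(('port', neutron.create_port(spec)))
            created.append(('volume', cinder.create_volume(spec)))
            created.append(('arq', cyborg.create_arq(spec)))
            return created
        except Exception:
            # Nothing is on a host yet, so cleanup is one loop here:
            # no reschedule, no per-compute error handling, no leaks.
            for kind, resource in reversed(created):
                delete_resource(kind, resource)
            raise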
Anyway, if we're voting I vote that ARQ creation happens in conductor
and binding happens in compute.
--
Thanks,
Matt