[nova] [cyborg] Impact of moving bind to compute

Dan Smith dms at danplanet.com
Tue Nov 26 23:06:07 UTC 2019

> But now we are close to having an improved wait_for_instance_event() [3]. So I propose to:
> A. Start the binding in the conductor. This gets maximum concurrency between binding and other tasks.
> B. Wait for the binding notification in the compute manager (without losing the event). In fact, we can wait inside _build_resources, which is where
> Neutron/Cinder resources are gathered as well. That will allow for doing the cleanup in a consistent manner as today.
> C. Call Cyborg to get the ARQs in the virt driver, like today.

We actually collect the neutron event in the virt driver. We kick off
some of the early stuff in _build_resources(), but those are things that
we want to be able to do from conductor.

I'd ideally like to move the wait further down the stack purely so
we overlap with the image fetch. That's the thing that will take the
longest on the compute node. If the system is unloaded, the
conductor->compute->virt stuff could happen pretty quickly, and if we
wait a minute (for example) for programming to finish before we start
spawn(), that's enough time that we could potentially have already
finished the image fetch. This is also time during which we're holding a
spot in the parallel build limit queue without doing anything useful.
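
To make the timing argument concrete, here is a toy sketch. It is not
nova code: the function names and the scaled-down durations are made up
for illustration, and the "bind" is just a timer. It only shows why
waiting after the image fetch costs less wall-clock time than waiting
before it.

```python
import threading
import time

# Hypothetical durations in seconds, scaled down for illustration only.
PROGRAMMING_TIME = 0.2   # stands in for accelerator programming (the bind)
IMAGE_FETCH_TIME = 0.2   # stands in for the image fetch on the compute node


def start_binding():
    """Kick off the simulated bind; the returned event is set when it is done."""
    done = threading.Event()
    threading.Timer(PROGRAMMING_TIME, done.set).start()
    return done


def fetch_image():
    """Simulate the image fetch by sleeping."""
    time.sleep(IMAGE_FETCH_TIME)


def wait_before_spawn():
    """Wait for the bind first, then fetch the image (no overlap)."""
    start = time.monotonic()
    bound = start_binding()
    bound.wait()
    fetch_image()
    return time.monotonic() - start


def wait_after_fetch():
    """Fetch the image while the bind runs, then wait for whatever is left."""
    start = time.monotonic()
    bound = start_binding()
    fetch_image()
    bound.wait()
    return time.monotonic() - start
```

With these numbers the first variant takes roughly the sum of the two
durations and the second roughly the longer of the two.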

That said, things can move around inside the compute manager and virt
driver without affecting upgrades, so if it's easier to do it in
_build_resources() now, we can see about optimizing later. It should,
however, happen as the last step in _build_resources() so that we
overlap with all the network and block stuff that happens there already.
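
As a rough sketch of the ordering I mean (again not the real
_build_resources(); the names, the log list, and the wait_for_bind
callable are all hypothetical), the wait goes last inside the same
context manager, so a failed or timed-out bind hits the same cleanup
path as everything else:

```python
import contextlib


@contextlib.contextmanager
def build_resources(log, wait_for_bind):
    """Toy stand-in for a resource-building context manager."""
    try:
        log.append("network")        # kick off network allocation
        log.append("block devices")  # prepare block devices
        wait_for_bind()              # last step: wait for the bind event
        log.append("bind complete")
        yield
    except Exception:
        # One cleanup path, whether the failure came from the wait above
        # or from the body of the "with" block (e.g. spawn).
        log.append("cleanup")
        raise


def run_build(wait_for_bind):
    """Drive the context manager and return the ordered log of steps."""
    log = []
    try:
        with build_resources(log, wait_for_bind):
            log.append("spawn")
    except RuntimeError:
        pass
    return log


def failing_wait():
    """Simulate a bind that fails or times out."""
    raise RuntimeError("bind failed")
```

Calling run_build(lambda: None) shows the wait completing after the
network and block steps and before spawn; run_build(failing_wait) shows
the cleanup running without spawn ever starting.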


