[nova] compute's reaction to finding unmanaged VMs
Hi Nova Team,

How does Nova relate to having unmanaged libvirt VMs alongside VMs managed by nova-compute? Is this considered supported, unsupported, or somewhere in between? I know about the warning in [1] and the compute startup time exception in [2].

Colleagues working on a major OpenStack version uplift in a downstream OpenStack distro started encountering the above exception. This distro utilizes some non-Nova-managed VMs (in particular, to orchestrate the OpenStack deployment itself). To reduce the deployment footprint, these VMs are co-located with Nova-managed VMs.

AFAIU, this exception was introduced by the stable-compute-uuid blueprint [3]. With that in mind, I found this behavior:

* Before the blueprint was implemented, Nova only logged a warning if it found an unmanaged VM.
* After stable-compute-uuid, the compute service refuses to start in the same case.
* If I make sure that the stable UUID is already present, then it seems to fall back to the earlier warning-only behavior.

If I let nova-compute create the stable UUID itself, this leads to somewhat surprising behavior:

* If nova-compute finds an unmanaged VM during its first startup, it refuses to start.
* If it finds an unmanaged VM only during its second (or subsequent) startup (when the stable UUID has already been generated), it starts and logs a warning only.

This made me think that the distro should pre-generate the compute UUID. Then I found in the stable-compute-uuid blueprint (and in the docs [4]) that this approach is clearly supported.

It seems the compute service does not look at the metadata present in the libvirt domain definitions, and I started wondering why. AFAICT, the exception introduced by the stable-compute-uuid blueprint only really applies if Nova finds VMs that contain Nova metadata but are unknown to the compute service starting up. And this line of thought led me to my original question and these:

* Could/should nova-compute inspect the metadata of libvirt VMs at startup and act differently if an unexpected VM has (or doesn't have) Nova metadata? (A rough sketch of what I mean is below, after the references.) Would this be considered a meaningful Nova feature? Or should the distro just pre-generate the stable UUID and ignore the warning (as it did before the blueprint)?
* Does Nova currently not look at the libvirt domain definition metadata simply because nobody has implemented this yet? Or is there another reason?
* Is it considered supported (and to what level) to have non-Nova-managed VMs co-located with Nova-managed VMs?

Thanks in advance,
Bence Romsics (rubasov on irc)

[1] https://github.com/openstack/nova/blob/54b65d5bf2b23fa8a4612fd3adddc8751192a...
[2] https://github.com/openstack/nova/blob/54b65d5bf2b23fa8a4612fd3adddc8751192a...
[3] https://specs.openstack.org/openstack/nova-specs/specs/2023.1/implemented/st...
[4] https://docs.openstack.org/nova/latest/admin/compute-node-identification.htm...
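In case it helps make the question concrete, here is a minimal sketch (using the python libvirt bindings, outside of Nova) of the kind of check I have in mind. I am assuming the namespace URI below is the one the libvirt driver uses for the instance metadata it embeds in the domain XML; please correct me if that is wrong.

  import libvirt

  # Namespace the Nova libvirt driver appears to use for the <nova:instance>
  # metadata element in the domain XML (an assumption on my part, not
  # verified against every release).
  NOVA_METADATA_NS = "http://openstack.org/xmlns/libvirt/nova/1.0"

  def classify_domains(uri="qemu:///system"):
      conn = libvirt.open(uri)
      try:
          for dom in conn.listAllDomains():
              try:
                  # metadata() raises libvirtError if no element exists for
                  # this namespace, i.e. the domain carries no Nova metadata.
                  dom.metadata(libvirt.VIR_DOMAIN_METADATA_ELEMENT,
                               NOVA_METADATA_NS)
                  print("%s: has Nova metadata" % dom.name())
              except libvirt.libvirtError:
                  print("%s: no Nova metadata (likely unmanaged)" % dom.name())
      finally:
          conn.close()

  if __name__ == "__main__":
      classify_domains()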
How does Nova relate to having unmanaged libvirt VMs alongside VMs managed by nova-compute? Is this considered supported, unsupported, or somewhere in between?
Unrelated to stable-compute-uuid, we do not support this at all, in any way. Nova expects (and has always expected) to be the only thing managing the VMs on a libvirt instance, full stop. --Dan
On 04/08/2025 15:14, Dan Smith wrote:
How does Nova relate to having unmanaged libvirt VMs alongside VMs managed by nova-compute? Is this considered supported, unsupported, or somewhere in between?

Unrelated to stable-compute-uuid, we do not support this at all, in any way. Nova expects (and has always expected) to be the only thing managing the VMs on a libvirt instance, full stop.
There is one untested caveat to that. If you use vcpu_pin_set, cpu_dedicated_set and/or cpu_shared_set to define which cores are available to Nova, and you adjust the host's reserved RAM/disk/hugepage values, there was the capability to run additional host-level VMs on the compute nodes, for things like a vrouter for networking or other infra-level use cases (rough config example below). I don't really know of anyone that has actually done that since circa the 2015 era.

This type of deployment was most common in installers that used "seed VMs" to do the deployment, which could be shut down once the cloud was deployed and you were not performing day-2 operations like update/upgrade. In general you should run those VMs on a separate host that is not a Nova compute node, such as the controller hosts.

The other common example was providing infra-level VNFs like routing, load balancing, VPNs or firewalling as VMs on the computes, which are then consumed by the OpenStack deployment itself. Again, ideally you would not run those VMs on the compute nodes unless you can run them as Nova instances; they should be moved to dedicated networker nodes if possible. Where the logical network switch for the OpenStack VMs runs in a separate VM, like the early days of vrouter (there was one network backend that used a VM for the vswitch, but I don't recall exactly which), it is not possible to move that VM to a separate host, but that type of integration is not really supported upstream by the Nova project.

With that context in mind, Dan is absolutely right that for Nova-provisioned VMs nothing other than Nova is allowed to interact with them. We do not document or test this colocation use case, even if very old installers sometimes did it, because it is not generally a use case we want to support in Nova. The capability exists, but any issues encountered by using this partitioning approach are not upstream Nova bugs. So strictly speaking it has never been officially supported; it has not been stated as unsupported in the docs, as there were existing deployments in production that made it work, but you are going outside the scope of upstream-supported use cases if you attempt it.
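To make that partitioning concrete, here is a rough nova.conf illustration of the options I mean. The option names are real, but the values are made up and would have to be sized for the actual host; vcpu_pin_set is the deprecated predecessor of the [compute] options:

  [DEFAULT]
  # hold RAM and disk back from the Nova/placement inventory for the
  # non-Nova VMs
  reserved_host_memory_mb = 16384
  reserved_host_disk_mb = 102400

  [compute]
  # cores Nova is allowed to hand out to guests; the remaining host cores
  # stay free for the infra-level VMs
  cpu_shared_set = 4-15
  cpu_dedicated_set = 16-31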