[nova][dev] Revisiting qemu emulation where guest arch != host arch
All,
A few years ago I asked a question[1] about why nova, when given a hw_architecture property from glance for an image, would not end up using the correct qemu-system-xx binary when starting the guest process on a compute node if that compute node's architecture did not match the proposed guest architecture. As an example, if we had all x86 hosts but wanted to run an emulated ppc guest, we should be able to do that, given that at least one compute node had qemu-system-ppc already installed and libvirt was successfully reporting that as a supported architecture to nova. It seemed like a heavy lift at the time, so it was put on the back burner.
I am now in a position to fund a contract developer to make this happen, so the question is: would this be a useful blueprint that would potentially be accepted?
Most of the time when people want to run an emulated guest, they just nest it inside an already running guest of the native architecture, but that severely limits observability, and managing more than a handful of instances this way quickly becomes a tangled nightmare of networking, etc. I see real benefit in allowing this scenario to run natively so that all of the tooling that exists for fleet management 'just works'. This would also be a significant differentiator for OpenStack as a whole.
Thoughts?
[1] http://lists.openstack.org/pipermail/openstack-operators/2018-August/015653....
Chris Apsey Director | Georgia Cyber Range GEORGIA CYBER CENTER
100 Grace Hopper Lane | Augusta, Georgia | 30901 https://www.gacybercenter.org
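The selection step described in the message above (look up the emulator that libvirt reports for the requested guest architecture, and note whether it differs from the host's) might be sketched as follows. This is only an illustration, not nova's code: `SAMPLE_CAPS` is a hand-written, trimmed sample in the general shape of libvirt's capabilities XML, and `guest_emulators` and `pick_emulator` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hand-written sample (an assumption, not real host output) in the general
# shape of libvirt's capabilities XML: an x86_64 host that can also emulate ppc.
SAMPLE_CAPS = """
<capabilities>
  <host><cpu><arch>x86_64</arch></cpu></host>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <emulator>/usr/bin/qemu-system-x86_64</emulator>
      <domain type='qemu'/>
      <domain type='kvm'/>
    </arch>
  </guest>
  <guest>
    <os_type>hvm</os_type>
    <arch name='ppc'>
      <emulator>/usr/bin/qemu-system-ppc</emulator>
      <domain type='qemu'/>
    </arch>
  </guest>
</capabilities>
"""


def guest_emulators(caps_xml):
    """Map each guest architecture in the capabilities XML to its emulator path."""
    root = ET.fromstring(caps_xml)
    emulators = {}
    for arch in root.findall("./guest/arch"):
        emulator = arch.findtext("emulator")
        if emulator:
            emulators[arch.get("name")] = emulator
    return emulators


def pick_emulator(caps_xml, hw_architecture):
    """Return (emulator_path, is_emulated) for the requested guest architecture.

    hw_architecture stands in for the image property of the same name.
    Raises LookupError if the host does not report that architecture.
    """
    root = ET.fromstring(caps_xml)
    host_arch = root.findtext("./host/cpu/arch")
    emulator = guest_emulators(caps_xml).get(hw_architecture)
    if emulator is None:
        raise LookupError(f"host does not report support for {hw_architecture}")
    return emulator, hw_architecture != host_arch
```

With the sample above, asking for `ppc` yields the `qemu-system-ppc` path flagged as emulated, while `x86_64` yields the native binary.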
On Wed, 2020-07-15 at 14:17 +0000, Apsey, Christopher wrote:
All,
A few years ago I asked a question[1] about why nova, when given a hw_architecture property from glance for an image, would not end up using the correct qemu-system-xx binary when starting the guest process on a compute node if that compute node's architecture did not match the proposed guest architecture. As an example, if we had all x86 hosts but wanted to run an emulated ppc guest, we should be able to do that, given that at least one compute node had qemu-system-ppc already installed and libvirt was successfully reporting that as a supported architecture to nova. It seemed like a heavy lift at the time, so it was put on the back burner.
I am now in a position to fund a contract developer to make this happen, so the question is: would this be a useful blueprint that would potentially be accepted?
This came up during the PTG, and the overall feeling was that it should really work already, and if it does not, it's a bug. So yes, if a blueprint was filed to support emulation based on the image hw_architecture property, I don't think you will get objections, although we will probably also want scheduler support for this, reporting it to placement or having a weigher of some kind, to make it a complete solution. That is, enhance the virt driver to report all the architectures it supports via traits, and add a weigher to prefer native execution over emulation. That way placement can tell us where it can run, and the weigher can say where it will run best. See line 467 of https://etherpad.opendev.org/p/nova-victoria-ptg
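The split suggested in this reply, where a filter step decides where a guest can run and a weigher decides where it runs best, could look roughly like the sketch below. It is a standalone illustration under stated assumptions: `Host`, `can_run`, `weigh`, and `select_hosts` are hypothetical names, and nothing here uses nova's actual scheduler or placement interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for what a virt driver might report per compute node."""
    name: str
    native_arch: str
    emulated_archs: frozenset = frozenset()


def can_run(host, guest_arch):
    """Filter step: the host supports the guest natively or through emulation."""
    return guest_arch == host.native_arch or guest_arch in host.emulated_archs


def weigh(host, guest_arch):
    """Weigher step: score native execution above emulation."""
    return 1.0 if host.native_arch == guest_arch else 0.0


def select_hosts(hosts, guest_arch):
    """Return hosts able to run the guest, native hosts first."""
    candidates = [h for h in hosts if can_run(h, guest_arch)]
    # sorted() is stable, so hosts with equal weight keep their input order.
    return sorted(candidates, key=lambda h: weigh(h, guest_arch), reverse=True)


hosts = [
    Host("x86-node", "x86_64", frozenset({"ppc"})),
    Host("ppc-node", "ppc"),
    Host("arm-node", "aarch64"),
]
ranked = [h.name for h in select_hosts(hosts, "ppc")]
```

Here `ranked` lists `ppc-node` ahead of `x86-node`, and `arm-node` is filtered out because it reports no ppc support.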
On Wed, 2020-07-15 at 14:17 +0000, Apsey, Christopher wrote:
All,
A few years ago I asked a question[1] about why nova, when given a hw_architecture property from glance for an image, would not end up using the correct qemu-system-xx binary when starting the guest process on a compute node if that compute node's architecture did not match the proposed guest architecture. As an example, if we had all x86 hosts but wanted to run an emulated ppc guest, we should be able to do that, given that at least one compute node had qemu-system-ppc already installed and libvirt was successfully reporting that as a supported architecture to nova. It seemed like a heavy lift at the time, so it was put on the back burner.
I am now in a position to fund a contract developer to make this happen, so the question is: would this be a useful blueprint that would potentially be accepted?
Yes. I can't really speak to how much use it would get or how useful it would be to the majority of users, but I would be supportive of adding this capability. We have to be a little careful to get the design right. For example, we might want to differentiate between the native architecture and emulated architectures, e.g. use HW_ARCH_X86 to identify the host as being x86 and COMPUTE_ARCH_X86 for emulated. With the new "in" support being added to placement, we can use required=in:HW_ARCH_X86,COMPUTE_ARCH_X86 in cases where you don't care, and by default. If you wanted native only, you could use the HW_ARCH_* traits in the image, and we can have a prefilter add both traits by default if the architecture is set in the image and there is no arch trait in the flavor or image. I will certainly review a spec if you propose one, but you might not have time to get it approved this cycle; the spec deadline would have been Thursday, but it has been moved to next week.
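A rough sketch of the prefilter behaviour proposed here: when the image sets hw_architecture and neither the flavor nor the image already requires an architecture trait, ask for either the native or the emulated trait. The HW_ARCH_* and COMPUTE_ARCH_* names are the ones proposed in this message, not traits assumed to exist; `arch_traits` and `required_for` are hypothetical helpers, and the `trait:<NAME>=required` keys are assumed to follow the usual flavor and image property convention.

```python
ARCH_PREFIXES = ("HW_ARCH_", "COMPUTE_ARCH_")


def arch_traits(arch):
    """Build the proposed (native, emulated) trait names for an architecture."""
    suffix = arch.upper()
    return f"HW_ARCH_{suffix}", f"COMPUTE_ARCH_{suffix}"


def explicit_arch_traits(props):
    """Collect architecture traits requested via 'trait:<NAME>=required' keys."""
    return [
        key[len("trait:"):]
        for key, value in props.items()
        if key.startswith("trait:")
        and value == "required"
        and key[len("trait:"):].startswith(ARCH_PREFIXES)
    ]


def required_for(image_props, flavor_extra_specs=None):
    """Return a 'required=...' query fragment, or None if no arch is requested."""
    arch = image_props.get("hw_architecture")
    if arch is None:
        return None
    explicit = explicit_arch_traits(flavor_extra_specs or {})
    explicit += explicit_arch_traits(image_props)
    if explicit:
        # The operator was explicit (e.g. native only), so leave it alone.
        return "required=" + ",".join(explicit)
    native, emulated = arch_traits(arch)
    # Default: either a native or an emulating host is acceptable.
    return f"required=in:{native},{emulated}"


default_query = required_for({"hw_architecture": "x86"})
native_only_query = required_for(
    {"hw_architecture": "x86", "trait:HW_ARCH_X86": "required"}
)
```

In this sketch `default_query` is `required=in:HW_ARCH_X86,COMPUTE_ARCH_X86`, while `native_only_query` is `required=HW_ARCH_X86`, leaving the preference between native and emulated hosts to a weigher.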
participants (2)
- Apsey, Christopher
- Sean Mooney