[Openstack] [Nova][virt-driver-numa-placement] How to enable an instance with NUMA?

Chris Friesen chris.friesen at windriver.com
Thu Feb 5 17:37:27 UTC 2015


On 02/05/2015 11:23 AM, Daniel P. Berrange wrote:
> On Thu, Feb 05, 2015 at 11:21:05AM -0600, Chris Friesen wrote:
>> On 02/05/2015 10:32 AM, Daniel P. Berrange wrote:
>>> On Thu, Feb 05, 2015 at 10:28:56AM -0600, Chris Friesen wrote:
>>
>>>> For what it's worth, I was able to make hugepages work with an older qemu by
>>>> commenting out two lines in
>>>> virt.libvirt.config.LibvirtConfigGuestMemoryBacking.format_dom()
>>>>
>>>>      def format_dom(self):
>>>>          root = super(LibvirtConfigGuestMemoryBacking, self).format_dom()
>>>>
>>>>          if self.hugepages:
>>>>              hugepages = etree.Element("hugepages")
>>>>              #for item in self.hugepages:
>>>>              #    hugepages.append(item.format_dom())
>>>>              root.append(hugepages)
>>>>
>>>>
>>>> This results in XML that looks like:
>>>>
>>>>    <memoryBacking>
>>>>      <hugepages/>
>>>>    </memoryBacking>
>>>>
>>>>
>>>> And a qemu commandline that looks like
>>>>
>>>> -mem-prealloc -mem-path /mnt/huge-2048kB/libvirt/qemu
>>>
>>> With that there is no guarantee that the huge pages are being allocated
>>> from the NUMA node on which the guest is actually placed by Nova, hence
>>> we did not intend to support that.
>>
>> It's possible that the end-user didn't indicate a preference for NUMA.  If
>> they just asked for hugepages and we have the ability to give it to them I
>> think we should do so.
>>
>> In the likely common case of an instance with a single NUMA node, this
>> should give the desired behaviour, since the kernel's default policy is to
>> allocate memory from the NUMA node of the CPU that requested it.  As long
>> as qemu's affinity is set before it allocates memory we should be okay.
>>
>> The only case that isn't covered is a flavor that specifies multiple NUMA
>> nodes.  In that case maybe the scheduler filters should be aware of that and
>> refuse to assign an instance with multiple NUMA nodes to a compute node with
>> an older qemu.
>
> Having the scheduler need to care about versions of software installed on
> nodes is a whole heap of extra complexity for no credible gain. It is
> perfectly reasonable to just mandate the newer QEMU for this IMHO and
> avoid that complexity in Nova.

Okay, then just let it fail on that compute node and the scheduler will retry 
somewhere else.  My point is that it's silly to require a very recent qemu 
version just to enable hugepages, when the common case of a single-NUMA-node VM 
will likely still work just fine with a much older qemu.
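(For reference, here is a sketch of the per-node XML that the two commented-out 
lines would otherwise build.  The page attributes below follow libvirt's domain 
XML format; this is illustrative, not output I captured from Nova:)

```python
import xml.etree.ElementTree as etree

# Build the per-NUMA-node form of <memoryBacking> that format_dom()
# emits when the hugepage items are appended rather than skipped.
root = etree.Element("memoryBacking")
hugepages = etree.SubElement(root, "hugepages")
etree.SubElement(hugepages, "page",
                 {"size": "2048", "unit": "KiB", "nodeset": "0"})

xml = etree.tostring(root).decode()
# Roughly:
#   <memoryBacking>
#     <hugepages>
#       <page size="2048" unit="KiB" nodeset="0" />
#     </hugepages>
#   </memoryBacking>
```

It's that nodeset attribute (and the corresponding per-node memory backends) 
that needs the newer qemu; the bare <hugepages/> form above works on older ones.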

Also, in either case shouldn't we have a check in the code against 
MIN_QEMU_NUMA_PIN_VERSION or something like that?  As it stands, nothing in 
the codebase or requirements.txt indicates that you need qemu 2.1 or later 
for hugepages or NUMA pinning support.  That could be confusing for people 
who try to get it working -- I know it took me a while to track down.
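Something as simple as the following would do -- the constant name and helpers 
are just my sketch, not anything that exists in the tree today:

```python
# Hypothetical minimum-version gate for hugepage/NUMA pinning support.
# MIN_QEMU_NUMA_PIN_VERSION and these helpers are sketch names, not
# actual Nova code.
MIN_QEMU_NUMA_PIN_VERSION = (2, 1, 0)


def _parse_version(version_str):
    """Turn a dotted version string like '2.1.0' into a comparable tuple."""
    return tuple(int(part) for part in version_str.split("."))


def supports_numa_pinning(qemu_version_str):
    """True if this qemu is new enough for hugepage/NUMA pinning."""
    return _parse_version(qemu_version_str) >= MIN_QEMU_NUMA_PIN_VERSION
```

The libvirt driver could run a check like this once at startup and log a clear 
message, instead of letting the guest definition fail later with an obscure 
libvirt error.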

Chris



