[openstack-dev] [nova] NUMA, huge pages, and scheduling

Paul Michali pc at michali.net
Tue Jun 14 19:30:42 UTC 2016

Well, looks like we figured out what is going on - maybe folks have some
ideas on how we could handle this issue.

What I see is that each VM create (small flavor) uses 1024 huge pages and
is placed on NUMA node 0. It appears that, when there are no longer enough
huge pages on that NUMA node, Nova will then schedule to the other NUMA
node and use those huge pages.
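
For anyone wanting to watch this per-node usage on a compute node, here is a
minimal sketch that reads the sysfs counters Chris pointed to in the quoted
reply below. The function name and the "base" parameter are mine (hypothetical,
added so it can be pointed at a test directory); nr_hugepages and
free_hugepages are the standard kernel files under that path.

```python
from pathlib import Path


def hugepage_counts(base="/sys/devices/system/node", size_kb=2048):
    """Return {node_name: (total_pages, free_pages)} for each NUMA node."""
    counts = {}
    for node_dir in sorted(Path(base).glob("node[0-9]*")):
        hp_dir = node_dir / "hugepages" / f"hugepages-{size_kb}kB"
        if not hp_dir.is_dir():
            # This node has no pool for the requested page size.
            continue
        total = int((hp_dir / "nr_hugepages").read_text())
        free = int((hp_dir / "free_hugepages").read_text())
        counts[node_dir.name] = (total, free)
    return counts


if __name__ == "__main__":
    for node, (total, free) in hugepage_counts().items():
        print(f"{node}: {free} free of {total}")
```

Running it before and after each VM create should show which node's free
count drops.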

In our case, we happen to have a special container running on the compute
nodes that uses 512 huge pages. As a result, when there are 768 huge pages
left, Nova thinks there are 1280 pages left and thinks one more VM can be
created. It tries, but the create fails.
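
To spell out the arithmetic, here is a small sketch using the numbers above.
The names are illustrative only, and the "fits" check is a simplification
for this example, not Nova's actual scheduling code.

```python
PAGES_PER_VM = 1024    # 2MB huge pages consumed by one small-flavor VM
CONTAINER_PAGES = 512  # pages held by our container, which Nova does not see


def can_fit(free_pages, pages_needed=PAGES_PER_VM):
    """Simplified check: does a VM fit in the given number of free pages?"""
    return free_pages >= pages_needed


actual_free = 768                          # pages really free on the node
nova_view = actual_free + CONTAINER_PAGES  # what Nova believes is free

print(nova_view)             # 1280
print(can_fit(nova_view))    # True: Nova decides one more VM fits
print(can_fit(actual_free))  # False: the create then fails
```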

Some questions...

1) Is there some way to "reserve" huge pages in Nova?
2) If the create fails, should Nova try the other NUMA node (or does it
not retry because it doesn't know why the create failed)?
3) Any ideas on how we can deal with this - without changing Nova?



On Tue, Jun 14, 2016 at 1:09 PM Paul Michali <pc at michali.net> wrote:

> Great info Chris and thanks for confirming the assignment of blocks of
> pages to a numa node.
> I'm still struggling with why each VM is being assigned to NUMA node 0.
> Any ideas on where I should look to see why Nova is not using NUMA id 1?
> Thanks!
> On Tue, Jun 14, 2016 at 10:29 AM Chris Friesen <chris.friesen at windriver.com> wrote:
>> On 06/13/2016 02:17 PM, Paul Michali wrote:
>> > Hmm... I tried Friday and again today, and I'm not seeing the VMs being evenly
>> > created on the NUMA nodes. Every Cirros VM is created on nodeid 0.
>> >
>> > I have the m1/small flavor (2GB) selected and am using hw:numa_nodes=1 and
>> > hw:mem_page_size=2048 flavor-key settings. Each VM is consuming 1024 huge pages
>> > (of size 2MB), but is on nodeid 0 always. Also, it seems that when I reach 1/2
>> > of the total number of huge pages used, libvirt gives an error saying there is
>> > not enough memory to create the VM. Is it expected that the huge pages are
>> > "allocated" to each NUMA node?
>> Yes, any given memory page exists on one NUMA node, and a single-NUMA-node VM
>> will be constrained to a single host NUMA node and will use memory from that
>> host NUMA node.
>> You can see and/or adjust how many hugepages are available on each NUMA node via
>> /sys/devices/system/node/nodeX/hugepages/hugepages-2048kB/* where X is the host
>> NUMA node number.
>> Chris