[openstack-dev] [nova] NUMA, huge pages, and scheduling

Jay Pipes jaypipes at gmail.com
Tue Jun 14 20:09:03 UTC 2016


On 06/14/2016 12:30 PM, Paul Michali wrote:
> Well, looks like we figured out what is going on - maybe folks have some
> ideas on how we could handle this issue.
>
> What I see is that for each VM create (small flavor), 1024 huge pages
> are used and NUMA node 0 is used. It appears that, when there are no
> longer enough huge pages on that NUMA node, Nova will then schedule to
> the other NUMA node and use those huge pages.
>
> In our case, we happen to have a special container running on the
> compute nodes, which uses 512 huge pages. As a result, when there are 768
> huge pages actually left, Nova thinks there are 1280 pages left and that
> one more VM can be created. It tries, but the create fails.
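The accounting mismatch above can be sketched with the numbers from the report (Python used purely for illustration; the container's pages are invisible to Nova's tracking):

```python
# Model of the hugepage accounting mismatch described above.
PAGES_PER_VM = 1024        # 2GB flavor backed by 2MB huge pages
CONTAINER_PAGES = 512      # consumed on the host, outside Nova's accounting

nova_free = 1280                           # what Nova believes is free
actual_free = nova_free - CONTAINER_PAGES  # what the kernel can really give

assert actual_free == 768                  # matches the observed count

# Nova sees 1280 >= 1024 and schedules one more VM; libvirt then fails
# because only 768 pages are genuinely available.
nova_says_it_fits = nova_free >= PAGES_PER_VM   # True  -> Nova tries the boot
actually_fits = actual_free >= PAGES_PER_VM     # False -> the create fails
```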
>
> Some questions...
>
> 1) Is there some way to "reserve" huge pages in Nova?

Yes. Code merged recently from Sahid does this:

https://review.openstack.org/#/c/277422/
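That change adds a `reserved_huge_pages` option to nova.conf. A sketch of reserving the 512 pages the container consumes on node 0 (check the merged review and release docs for the exact syntax on your Nova version):

```ini
[DEFAULT]
# Reserve 512 x 2048kB huge pages on host NUMA node 0 for the
# out-of-band container, so Nova's accounting excludes them.
reserved_huge_pages = node:0,size:2048,count:512
```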

Best,
-jay

> 2) If the create fails, should Nova try the other NUMA node (or is this
> because it doesn't know why it failed)?
> 3) Any ideas on how we can deal with this - without changing Nova?
>
> Thanks!
>
> PCM
>
>
>
> On Tue, Jun 14, 2016 at 1:09 PM Paul Michali <pc at michali.net> wrote:
>
>     Great info Chris and thanks for confirming the assignment of blocks
>     of pages to a numa node.
>
>     I'm still struggling with why each VM is being assigned to NUMA node
>     0. Any ideas on where I should look to see why Nova is not using
>     NUMA id 1?
>
>     Thanks!
>
>
>     PCM
>
>
>     On Tue, Jun 14, 2016 at 10:29 AM Chris Friesen
>     <chris.friesen at windriver.com> wrote:
>
>         On 06/13/2016 02:17 PM, Paul Michali wrote:
>          > Hmm... I tried Friday and again today, and I'm not seeing the
>          > VMs being evenly created on the NUMA nodes. Every Cirros VM is
>          > created on nodeid 0.
>          >
>          > I have the m1.small flavor (2GB) selected and am using
>          > hw:numa_nodes=1 and hw:mem_page_size=2048 flavor-key settings.
>          > Each VM is consuming 1024 huge pages (of size 2MB), but is
>          > always on nodeid 0. Also, it seems that when half of the total
>          > number of huge pages are used, libvirt gives an error saying
>          > there is not enough memory to create the VM. Is it expected
>          > that the huge pages are "allocated" to each NUMA node?
>
>         Yes, any given memory page exists on one NUMA node, and a
>         single-NUMA-node VM will be constrained to a single host NUMA
>         node and will use memory from that host NUMA node.
>
>         You can see and/or adjust how many hugepages are available on
>         each NUMA node via
>         /sys/devices/system/node/nodeX/hugepages/hugepages-2048kB/*,
>         where X is the host NUMA node number.
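The sysfs paths Chris mentions can also be read programmatically; a minimal sketch (Python for illustration, standard Linux sysfs layout assumed; it returns an empty dict on hosts with no 2MB pools configured):

```python
import glob
import os

def hugepage_pools(pagesize_kb=2048):
    """Return {node_name: (total, free)} for each host NUMA node's pool
    of huge pages of the given size, using the standard sysfs layout."""
    pools = {}
    pattern = ("/sys/devices/system/node/node*/hugepages/"
               "hugepages-%dkB" % pagesize_kb)
    for path in sorted(glob.glob(pattern)):
        node = path.split("/")[-3]  # e.g. "node0"
        with open(os.path.join(path, "nr_hugepages")) as f:
            total = int(f.read())
        with open(os.path.join(path, "free_hugepages")) as f:
            free = int(f.read())
        pools[node] = (total, free)
    return pools

print(hugepage_pools())
```

Writing a new count to a node's nr_hugepages file (as root) adjusts that node's pool, which is the "adjust" part above.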
>
>         Chris
>
>
>         __________________________________________________________________________
>         OpenStack Development Mailing List (not for usage questions)
>         Unsubscribe:
>         OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>         <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>


