<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">


    <div class="moz-text-plain" wrap="true" style="font-family:

      -moz-fixed; font-size: 12px;" lang="x-unicode">

      <pre wrap="">Hello everyone,

We're experiencing issues running large instances (~60GB RAM) on
fairly large NUMA hosts (4 CPUs, 256GB RAM, i.e. 64GB per NUMA node)
while using CPU pinning. The problem is that in some extreme cases
qemu/KVM can have significant memory overhead (10-15%?), which the
nova-compute service doesn't take into account when launching VMs.
Using our configuration as an example: imagine running two VMs with
30GB RAM each on one NUMA node (because we use CPU pinning), therefore
using 60GB out of the 64GB in that NUMA domain. If both VMs consume
their entire memory (plus 10% KVM overhead), the OOM killer steps in,
despite there being plenty of free RAM in other NUMA nodes. (The
numbers are arbitrary; the point is that nova-scheduler schedules the
instance onto the host because the memory seems 'free enough', while
the specific NUMA node lacks the memory reserve.)
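
To make the arithmetic concrete, here is a small sketch using the
example numbers above. The 10% overhead is only our rough estimate, and
the function name is made up for illustration:

```python
# Illustrative arithmetic only; the 10% overhead is a rough estimate.
NODE_RAM_GB = 64   # RAM available in one NUMA node
VM_RAM_GB = 30     # flavor RAM per VM
VM_COUNT = 2       # VMs pinned to the same NUMA node
OVERHEAD = 0.10    # assumed qemu/KVM overhead per VM


def numa_node_demand_gb(vm_ram_gb, vm_count, overhead):
    """Total RAM the pinned VMs may need on one NUMA node, overhead included."""
    return vm_ram_gb * vm_count * (1 + overhead)


flavor_total = VM_RAM_GB * VM_COUNT
real_total = numa_node_demand_gb(VM_RAM_GB, VM_COUNT, OVERHEAD)

print("flavor RAM   :", flavor_total, "GB of", NODE_RAM_GB, "GB")
print("worst case   :", round(real_total, 1), "GB of", NODE_RAM_GB, "GB")
print("over capacity:", real_total > NODE_RAM_GB)
```

The flavor RAM alone fits into the NUMA node, but the worst case with
overhead does not.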

Our initial solution was to set ram_allocation_ratio below 1 to keep
some memory in reserve, but this didn't work. Upon studying the nova
source, it turns out that ram_allocation_ratio is ignored when using
CPU pinning (see
<a class="moz-txt-link-freetext" href="https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L859">https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L859</a>
and
<a class="moz-txt-link-freetext" href="https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L821">https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L821</a>
). We're running Mitaka, but this piece of code is implemented the same
way in Ocata.

We're considering writing a patch that takes ram_allocation_ratio into
account.
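
Roughly, the kind of per-NUMA-cell check we have in mind could look like
the sketch below. This is not nova code: cell_can_fit and its parameters
are hypothetical names, and the real logic in hardware.py is structured
differently.

```python
def cell_can_fit(cell_total_mb, cell_used_mb, instance_mb, ram_allocation_ratio):
    """Hypothetical check: does the instance fit under the ratio-scaled limit?"""
    # Scale the cell's memory by the ratio; a ratio below 1 keeps a reserve.
    limit_mb = cell_total_mb * ram_allocation_ratio
    return limit_mb >= cell_used_mb + instance_mb


GB = 1024  # MB per GB

# A 64GB NUMA cell already hosting one 30GB VM, asked to take a second one.
print(cell_can_fit(64 * GB, 30 * GB, 30 * GB, 1.0))  # ratio 1.0, no reserve
print(cell_can_fit(64 * GB, 30 * GB, 30 * GB, 0.9))  # ratio 0.9 keeps a reserve
```

With a ratio of 0.9 the second 30GB VM would be rejected for that cell,
which is the behaviour we expected from setting ram_allocation_ratio.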

My question is: is ram_allocation_ratio ignored on purpose when using
CPU pinning? If so, what is the reasoning behind it? And what would be
the right way to ensure reserved RAM on the NUMA nodes?

Thanks.

Regards,

Jakub Jursa

</pre>

    </div>

  </body>

</html>