[openstack-dev] [nova] Running large instances with CPU pinning and OOM

Jakub Jursa jakub.jursa at chillisys.com
Wed Sep 27 09:58:08 UTC 2017

On 27.09.2017 11:12, Jakub Jursa wrote:
> On 27.09.2017 10:40, Blair Bethwaite wrote:
>> On 27 September 2017 at 18:14, Stephen Finucane <sfinucan at redhat.com> wrote:
>>> What you're probably looking for is the 'reserved_host_memory_mb' option. This
>>> defaults to 512 (at least in the latest master) so if you up this to 4192 or
>>> similar you should resolve the issue.
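[For reference, the option Stephen mentions goes in nova.conf on each compute node; a minimal sketch, where the 4096 value is illustrative rather than a recommendation:

```ini
# nova.conf on the compute node (illustrative value)
[DEFAULT]
# Memory (MB) set aside for the host OS and never offered to guests;
# defaults to 512 in recent releases.
reserved_host_memory_mb = 4096
```
]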
>> I don't see how this would help given the problem description -
>> reserved_host_memory_mb would only help avoid causing OOM when
>> launching the last guest that would otherwise fit on a host based on
>> Nova's simplified notion of memory capacity. It sounds like both CPU
>> and NUMA pinning are in play here, otherwise the host would have no
>> problem allocating RAM on a different NUMA node and OOM would be
>> avoided.
> I'm not quite sure if/how OpenStack handles NUMA pinning (i.e. why the VM
> is being killed by the OOM killer rather than having its memory allocated
> on a different NUMA node). Anyway, good point, thank you, I should have a
> look at the exact parameters passed to QEMU when using CPU pinning.
>> Jakub, your numbers sound reasonable to me, i.e., use 60 out of 64GB
> Hm, but the question is, how to prevent some smaller instance (e.g. 2GB
> RAM) from being scheduled on such a NUMA node?
>> when only considering QEMU overhead - however I would expect that
>> might be a problem on NUMA node0 where there will be extra reserved
>> memory regions for kernel and devices. In such a configuration where
>> you are wanting to pin multiple guests into each of multiple NUMA
>> nodes I think you may end up needing different flavor/instance-type
>> configs (using less RAM) for node0 versus other NUMA nodes. Suggest
> What do you mean by using a different flavor? From what I understand (
> http://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/virt-driver-numa-placement.html
> https://docs.openstack.org/nova/pike/admin/cpu-topologies.html ) a flavor
> can specify that it 'wants' different amounts of memory from its
> (virtual) NUMA nodes, but the vCPU <-> pCPU mapping is more or less
> arbitrary (meaning there is no way to specify, for NUMA node0 on a
> physical host, that it has less memory available for VM allocation)

Can't the 'reserved_huge_pages' option be used to reserve memory on
certain NUMA nodes?
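[If I read the config reference correctly, it takes per-node entries; a sketch reserving huge pages on host node 0 only, with illustrative counts:

```ini
# nova.conf -- reserve 64 x 2 MiB huge pages on host NUMA node 0
# (size is in KiB; the option can be repeated for other nodes/sizes)
[DEFAULT]
reserved_huge_pages = node:0,size:2048,count:64
```
]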

>> freshly booting one of your hypervisors and then with no guests
>> running take a look at e.g. /proc/buddyinfo and /proc/zoneinfo to see
>> what memory is used/available and where.
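[Blair's inspection step can be sketched as the following read-only commands, safe to run on any Linux host:

```shell
# Per-node, per-zone counts of free pages by allocation order --
# large instances need the higher-order columns to be non-empty.
cat /proc/buddyinfo

# Per-zone totals: 'present' vs 'managed' pages show how much of each
# node's memory is actually available to the allocator (node0 typically
# loses some to the kernel and reserved device regions).
grep -E '^Node|present|managed' /proc/zoneinfo
```
]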
> Thanks, I'll look into it.
> Regards,
> Jakub
