[nova] NUMA scheduling

Satish Patel satish.txt at gmail.com
Mon Oct 19 13:25:28 UTC 2020


Sean,

Awesome write-up. It would be great to have this explanation on the
official website, here:
https://docs.openstack.org/nova/pike/admin/cpu-topologies.html

~S
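
For anyone finding this thread later: per Sean's explanation below, the
fix is to set the page-size extra spec on the flavor (or the equivalent
image property). A minimal sketch (the flavor and image names here are
just illustrations; any valid page size value works):

    openstack flavor set pinned.medium \
      --property hw:cpu_policy=dedicated \
      --property hw:mem_page_size=small

    # or on the image instead of the flavor:
    openstack image set myimage --property hw_mem_page_size=small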

On Mon, Oct 19, 2020 at 8:02 AM Sean Mooney <smooney at redhat.com> wrote:
>
> sorry to top post but i was off on friday.
> the issue is that hw:mem_page_size has not been set.
>
> if you are using any numa feature you always need to set hw:mem_page_size to a value.
> it does not matter which valid value you set it to, but you need to define it in the flavor
> or image.
> if you do not, then you have not activated per-numa-node memory tracking in nova and your
> vms will eventually be killed by the OOM reaper.
>
>
> the minimum valid numa-aware vm config is hw:mem_page_size=any
>
> that implicitly expands to hw:mem_page_size=any hw:numa_nodes=1
>
> since 1 numa node is the default if you do not set hw:numa_nodes
>
> today when we generate a numa topology for hw:cpu_policy=dedicated we effectively set hw:numa_nodes=1 internally,
> but we do not define hw:mem_page_size=small/any/large
>
> so if you simply define a flavor with hw:numa_nodes=1 or hw:cpu_policy=dedicated and no other extra specs, then technically
> that is an invalid flavor for the libvirt driver.
>
> hw:numa_nodes=1 is valid for the hyperv driver on its own, but not for the libvirt driver.
>
> if you are using any numa feature with the libvirt driver, hw:mem_page_size in the flavor or hw_mem_page_size in the image
> must be set for nova to correctly track and allocate memory for the vm.
>
> Sat, 2020-10-17 at 13:44 -0400, Satish Patel wrote:
> > or try "hw:numa_nodes=2" to see if the vm's vcpus spread across both numa nodes.
> >
> > On Sat, Oct 17, 2020 at 1:41 PM Satish Patel <satish.txt at gmail.com> wrote:
> > >
> > > I would say try without "hw:numa_nodes=1" in the flavor properties.
> > >
> > > ~S
> > >
> > > On Sat, Oct 17, 2020 at 1:28 PM Eric K. Miller
> > > <emiller at genesishosting.com> wrote:
> > > >
> > > > > What is the error thrown by Openstack when NUMA0 is full?
> > > >
> > > > OOM is actually killing the QEMU process, which causes Nova to report:
> > > >
> > > > /var/log/kolla/nova/nova-compute.log.4:2020-08-25 12:31:19.812 6 WARNING nova.compute.manager [req-62bddc53-ca8b-4bdc-bf41-8690fc88076f - - - -
> > > > -] [instance: 8d8a262a-6e60-4e8a-97f9-14462f09b9e5] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current
> > > > task_state: None, original DB power_state: 1, current VM power_state: 4
> > > >
> > > > So, there isn't a NUMA or memory-specific error from Nova - Nova is simply scheduling a VM on a node that it thinks has enough memory, and
> > > > Libvirt (or Nova?) is configuring the VM to use CPU cores on a full NUMA node.
> > > >
> > > > NUMA Node 1 had about 240GiB of free memory with about 100GiB of buffer/cache space used, so plenty of free memory, whereas NUMA Node 0 was
> > > > pretty tight on free memory.
> > > >
> > > > These are some logs in /var/log/messages (not for the nova-compute.log entry above, but the same condition for a VM that was killed - logs were
> > > > rolled, so I had to pick a different VM):
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: CPU 0/KVM invoked oom-killer: gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0,
> > > > oom_score_adj=0
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: CPU: 15 PID: 30468 Comm: CPU 0/KVM Not tainted 5.3.8-1.el7.elrepo.x86_64 #1
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Hardware name: <redacted hardware>
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Call Trace:
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: dump_stack+0x63/0x88
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: dump_header+0x51/0x210
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: oom_kill_process+0x105/0x130
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: out_of_memory+0x105/0x4c0
> > > >
> > > > …
> > > >
> > > > …
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: active_anon:108933472 inactive_anon:174036 isolated_anon:0#012 active_file:21875969
> > > > inactive_file:2418794 isolated_file:32#012 unevictable:88113 dirty:0 writeback:4 unstable:0#012 slab_reclaimable:3056118
> > > > slab_unreclaimable:432301#012 mapped:71768 shmem:570159 pagetables:258264 bounce:0#012 free:58924792 free_pcp:326 free_cma:0
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 active_anon:382548916kB inactive_anon:173052kB active_file:0kB inactive_file:2272kB
> > > > unevictable:289840kB isolated(anon):0kB isolated(file):128kB mapped:16696kB dirty:0kB writeback:0kB shmem:578812kB shmem_thp: 0kB
> > > > shmem_pmdmapped: 0kB anon_thp: 286420992kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 DMA free:15880kB min:0kB low:12kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB
> > > > inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15880kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB
> > > > free_pcp:0kB local_pcp:0kB free_cma:0kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: lowmem_reserve[]: 0 1589 385604 385604 385604
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 DMA32 free:1535904kB min:180kB low:1780kB high:3380kB active_anon:90448kB inactive_anon:0kB
> > > > active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1717888kB managed:1627512kB mlocked:0kB kernel_stack:0kB
> > > > pagetables:0kB bounce:0kB free_pcp:1008kB local_pcp:248kB free_cma:0kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: lowmem_reserve[]: 0 0 384015 384015 384015
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 Normal free:720756kB min:818928kB low:1212156kB high:1605384kB active_anon:382458300kB
> > > > inactive_anon:173052kB active_file:0kB inactive_file:2272kB unevictable:289840kB writepending:0kB present:399507456kB managed:393231952kB
> > > > mlocked:289840kB kernel_stack:58344kB pagetables:889796kB bounce:0kB free_pcp:296kB local_pcp:0kB free_cma:0kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: lowmem_reserve[]: 0 0 0 0 0
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U)
> > > > 1*2048kB (M) 3*4096kB (M) = 15880kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 DMA32: 1*4kB (U) 1*8kB (M) 0*16kB 9*32kB (UM) 11*64kB (UM) 12*128kB (UM) 12*256kB (UM)
> > > > 11*512kB (UM) 11*1024kB (M) 1*2048kB (U) 369*4096kB (M) = 1535980kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 Normal: 76633*4kB (UME) 30442*8kB (UME) 7998*16kB (UME) 1401*32kB (UE) 6*64kB (U) 0*128kB
> > > > 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 723252kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 24866489 total pagecache pages
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 0 pages in swap cache
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Swap cache stats: add 0, delete 0, find 0/0
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Free swap  = 0kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Total swap = 0kB
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 200973631 pages RAM
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 0 pages HighMem/MovableOnly
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 3165617 pages reserved
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: 0 pages hwpoisoned
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Tasks state (memory values in pages):
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   2414]     0  2414    33478    20111   315392        0             0 systemd-journal
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   2438]     0  2438    31851      540   143360        0             0 lvmetad
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   2453]     0  2453    12284     1141   131072        0         -1000 systemd-udevd
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   4170]     0  4170    13885      446   131072        0         -1000 auditd
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   4393]     0  4393     5484      526    86016        0             0 irqbalance
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: [   4394]     0  4394     6623      624   102400        0             0 systemd-logind
> > > >
> > > > …
> > > >
> > > > …
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: oom-
> > > > kill:constraint=CONSTRAINT_MEMORY_POLICY,nodemask=0,cpuset=vcpu0,mems_allowed=0,global_oom,task_memcg=/machine.slice/machine-
> > > > qemu\x2d237\x2dinstance\x2d0000fda8.scope,task=qemu-kvm,pid=25496,uid=42436
> > > >
> > > > Oct 10 15:17:01 <redacted hostname> kernel: Out of memory: Killed process 25496 (qemu-kvm) total-vm:67989512kB, anon-rss:66780940kB, file-
> > > > rss:11052kB, shmem-rss:4kB
> > > >
> > > > Oct 10 15:17:02 <redacted hostname> kernel: oom_reaper: reaped process 25496 (qemu-kvm), now anon-rss:0kB, file-rss:36kB, shmem-rss:4kB
> >
>
>


