[nova] OOM Killed Processes

Sean Mooney smooney at redhat.com
Mon Feb 21 13:21:12 UTC 2022

On Mon, 2022-02-21 at 12:24 +0500, Ammad Syed wrote:
> Hi,
> I am having trouble with my compute node: the nova-compute and ovs processes
> are being killed by OOM. I have a lot of memory available in the system.
> # free -g
>               total        used        free      shared  buff/cache   available
> Mem:           1006         121         881           0           2         879
> Swap:             7           0           7
> But I am seeing processes being killed in dmesg.
> [Sat Feb 19 03:46:26 2022] Memory cgroup out of memory: Killed process
> 2080898 (ovs-vswitchd) total-vm:9474284kB, anon-rss:1076384kB,
> file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2776kB oom_score_adj:0
> [Sat Feb 19 03:47:01 2022] Memory cgroup out of memory: Killed process
> 2081218 (ovs-vswitchd) total-vm:9475332kB, anon-rss:1096988kB,
> file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2780kB oom_score_adj:0
> [Sat Feb 19 03:47:06 2022] Memory cgroup out of memory: Killed process
> 2081616 (ovs-vswitchd) total-vm:9473252kB, anon-rss:1073052kB,
> file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2784kB oom_score_adj:0
> [Sat Feb 19 03:47:16 2022] Memory cgroup out of memory: Killed process
> 2081940 (ovs-vswitchd) total-vm:9471236kB, anon-rss:1070920kB,
> file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2776kB oom_score_adj:0
> [Sat Feb 19 03:47:16 2022] Memory cgroup out of memory: Killed process 6098
> (nova-compute) total-vm:3428356kB, anon-rss:279920kB, file-rss:9868kB,
> shmem-rss:0kB, UID:64060 pgtables:1020kB oom_score_adj:0
> [Mon Feb 21 11:15:08 2022] Memory cgroup out of memory: Killed process
> 2082296 (ovs-vswitchd) total-vm:9475372kB, anon-rss:1162636kB,
> file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2864kB oom_score_adj:0
> Any advice on how to fix this ? Also any best practices document on
> configuring memory optimizations in nova compute node.
so one thing to note is that the OOM reaper service runs per numa node,
so the global free memory values are not really what you need to look at.

cgroups/systemd also provide ways to limit the max memory a process/cgroup tree can consume.

so your first step should be to determine if the OOM event was triggered by exhausting the memory in
a specific numa node or if you are hitting a different cgroup memory limit.
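a rough sketch of how to check both possibilities from the host; the service names and cgroup paths below are assumptions for a systemd host using cgroup v2, so adjust them for your distro:

```shell
# Free memory per numa node (the global "free" total can hide one full node)
numactl --hardware | grep -i free

# Finer-grained per-node memory breakdown
numastat -m

# Which cgroup each daemon actually runs in
systemctl show -p ControlGroup openvswitch-switch.service nova-compute.service

# Memory limit on that cgroup tree (cgroup v2; "max" means unlimited)
cat /sys/fs/cgroup/system.slice/openvswitch-switch.service/memory.max
```

if the killed process's cgroup shows a finite memory.max that is close to the anon-rss values in your dmesg output, you are hitting a cgroup limit rather than a numa node exhaustion.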

in terms of how to optimize memory, it really depends on what your end goal is here.

obviously we do not want ovs or nova-compute to be killed in general.
the virtual and resident memory for ovs are in the 10 GB and 1 GB ranges respectively, so that is not excessively large.

that also should not be anywhere near the numa limit, but if you were incorrectly creating numa-affined
vms without setting hw:mem_page_size via nova then that could perhaps trigger out-of-memory events.

effectively with openstack, if your vm is numa-affined, either explicitly via the hw:numa_nodes extra spec or implicitly via
cpu pinning or otherwise, then you must ensure that the memory is tracked using the numa-aware path, which requires
you to define hw:mem_page_size in the flavor or hw_mem_page_size in the image.

if you do not want to use hugepages, hw:mem_page_size=small is a good default, but just be aware that if the
vm has a numa topology then memory oversubscription is not supported in openstack. i.e. you cannot use cpu pinning
or any other feature that requires a numa topology, like virtual persistent memory, and also use memory oversubscription.
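for illustration, setting that property on a flavor looks like the following (the flavor name m1.numa is hypothetical; hw:numa_nodes=1 is just an example of a spec that makes the vm numa-affined):

```shell
# Give the flavor a numa topology and track its memory on the
# numa-aware path using small (4k) pages, i.e. no hugepages needed.
openstack flavor set m1.numa \
  --property hw:numa_nodes=1 \
  --property hw:mem_page_size=small
```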

assuming these events do not correlate with vm boots, then I would investigate the cgroup memory limits you set on the ovs and compute service
cgroups. if they are correlated with vm boots, check if the vm is numa-affined and, if it is, which page size is requested.
if it's hw:mem_page_size=small then you might need to use the badly named reserved_huge_pages config option to reserve 4k pages for the host per numa node,
e.g. to reserve 4G on nodes 0 and 1:
reserved_huge_pages = node:0,size:4,count:1048576
reserved_huge_pages = node:1,size:4,count:1048576
the sum of all the 4k page size reservations should equal the value of reserved_host_memory_mb.
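a quick sanity check on that arithmetic, assuming 4 kB pages and the two reservations of 1048576 pages above:

```shell
# 1048576 pages * 4 kB = 4 GB reserved per numa node
PAGES_PER_NODE=1048576
KB_PER_NODE=$((PAGES_PER_NODE * 4))
# total across both nodes, in MB -- this is what
# reserved_host_memory_mb should be set to
MB_TOTAL=$((2 * KB_PER_NODE / 1024))
echo "$MB_TOTAL"   # prints 8192
```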

this is only really needed where you are using numa instances, since reserved_host_memory_mb does not account
for the host numa topology, so it will not prevent a numa node from being exhausted.

if you are using an AMD EPYC system and the NPS (numa per socket) bios option is set to say 4 or 8
on a dual- or single-socket system respectively, then the 1TB of ram you have on the host would be divided
into numa nodes of 128GB each, which is very close to the 121GB used you have when you start seeing issues.
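the division behind that figure, as a sketch (assuming the host ram rounds to 1 TB and both NPS settings yield 8 nodes total):

```shell
# NPS4 on a dual-socket box or NPS8 on a single-socket box
# both split the host into 8 numa nodes.
TOTAL_GB=1024
NODES=8
echo $((TOTAL_GB / NODES))   # prints 128, the GB per numa node
```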

nova currently tries to fill the numa nodes in order when you have numa instances too, which causes the OOM issue to manifest much sooner than
people often expect due to the per-numa nature of the OOM reaper.

that may not help you in your case, but that is how I would approach tracking down this issue.
> Ammad

More information about the openstack-discuss mailing list