[nova] OOM Killed Processes

Sean Mooney smooney at redhat.com
Tue Feb 22 13:10:08 UTC 2022


On Tue, 2022-02-22 at 17:51 +0500, Ammad Syed wrote:
> On Mon, Feb 21, 2022 at 6:21 PM Sean Mooney <smooney at redhat.com> wrote:
> 
> > On Mon, 2022-02-21 at 12:24 +0500, Ammad Syed wrote:
> > > Hi,
> > > 
> > > I am having trouble with my compute node: the nova-compute and ovs
> > > processes are being killed by the OOM killer. I have a lot of memory
> > > available in the system.
> > > 
> > > # free -g
> > >               total        used        free      shared  buff/cache   available
> > > Mem:           1006         121         881           0           2         879
> > > Swap:             7           0           7
> > > 
> > > But I am seeing processes being killed in dmesg.
> > > 
> > > [Sat Feb 19 03:46:26 2022] Memory cgroup out of memory: Killed process
> > > 2080898 (ovs-vswitchd) total-vm:9474284kB, anon-rss:1076384kB,
> > > file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2776kB oom_score_adj:0
> > > [Sat Feb 19 03:47:01 2022] Memory cgroup out of memory: Killed process
> > > 2081218 (ovs-vswitchd) total-vm:9475332kB, anon-rss:1096988kB,
> > > file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2780kB oom_score_adj:0
> > > [Sat Feb 19 03:47:06 2022] Memory cgroup out of memory: Killed process
> > > 2081616 (ovs-vswitchd) total-vm:9473252kB, anon-rss:1073052kB,
> > > file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2784kB oom_score_adj:0
> > > [Sat Feb 19 03:47:16 2022] Memory cgroup out of memory: Killed process
> > > 2081940 (ovs-vswitchd) total-vm:9471236kB, anon-rss:1070920kB,
> > > file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2776kB oom_score_adj:0
> > > [Sat Feb 19 03:47:16 2022] Memory cgroup out of memory: Killed process
> > > 6098 (nova-compute) total-vm:3428356kB, anon-rss:279920kB, file-rss:9868kB,
> > > shmem-rss:0kB, UID:64060 pgtables:1020kB oom_score_adj:0
> > > [Mon Feb 21 11:15:08 2022] Memory cgroup out of memory: Killed process
> > > 2082296 (ovs-vswitchd) total-vm:9475372kB, anon-rss:1162636kB,
> > > file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:2864kB oom_score_adj:0
> > > 
> > > Any advice on how to fix this? Also, is there any best practices
> > > document on configuring memory optimizations on a nova compute node?
> > so one thing to note is that the OOM reaper runs per NUMA node, so the
> > global free memory values are not really what you need to look at.
> > 
> > cgroups/systemd also provide ways to limit the maximum memory a
> > process/cgroup tree can consume.
> > 
> > so your first step should be to determine whether the OOM event was
> > triggered by exhausting the memory of a specific NUMA node or by hitting
> > a different cgroup memory limit.
> > 
> 
> As the logs suggest, it looks like the cgroup memory is being exhausted.
> The memory limit of the system.slice cgroup is 4G and of user.slice is 2G
> by default. I have increased system.slice to 64GB.
ack, this seems to be outside the scope of nova then.
nova does not manage host cgroups.
libvirt does create cgroups for the VMs, but nova has no role in any cgroup management.
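
for reference, on a cgroup v2 host you can confirm what limit and OOM history
a slice currently has with something like the following (paths and output are
illustrative and may differ on your distro):

# limit systemd has applied to the slice
systemctl show system.slice -p MemoryMax
cat /sys/fs/cgroup/system.slice/memory.max

# memory.events records whether that limit has been hit (oom / oom_kill counters)
cat /sys/fs/cgroup/system.slice/memory.events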
> 
> 
> > 
> > in terms of how to optimise memory, it really depends on what your end
> > goal is here.
> > 
> > obviously we do not want ovs or nova-compute to be killed in general.
> > the virtual and resident memory for ovs is in the 10 GB to 1 GB range, so
> > that is not excessively large.
> > 
> > that also should not be anywhere near the NUMA limit, but if you were
> > incorrectly creating NUMA-affined VMs without setting hw:mem_page_size
> > via nova, then that could perhaps trigger out-of-memory events.
> > 
> 
> Currently I am not using any page_size in my flavors.
ack
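
for what it's worth, you can confirm that by listing the extra specs on a
flavor; m1.large here is just an example flavor name:

# shows the flavor's extra specs; no hw:mem_page_size means no page size is requested
openstack flavor show m1.large -c properties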
> 
> > 
> > effectively, with openstack, if your VM is NUMA-affined, either
> > explicitly via the hw:numa_nodes extra spec or implicitly via CPU pinning
> > or otherwise, then you must ensure the memory is tracked using the
> > NUMA-aware path, which requires you to define hw:mem_page_size in the
> > flavor or hw_mem_page_size in the image.
> > 
> 
> I am only using CPU soft pinning (vcpu placement), i.e. cpu_shared_set.
> However, I have only configured hw:cpu_sockets='2' in flavors to give the
> VM two sockets. This helps with effective CPU utilization in Windows
> guests. However, in the VM I can only see one NUMA node of memory. Will
> this possibly cause trouble?
no, it should not.
hw:cpu_sockets='2' alters the CPU topology but does not modify the guest's
virtual NUMA topology. by default all guests will be reported as having 1 NUMA
node, but without requesting a NUMA topology, directly or indirectly, we will
not provide any NUMA affinity by default.

old servers (12+ years old) with a front-side-bus architecture had multiple
sockets per NUMA node, since the memory controller was located on the north
bridge. while this is not a common topology these days, i would not expect it
to have any negative performance impact on the VM or on Windows running in
the VM.

setting hw:cpu_sockets='1' would also likely improve the Windows guest CPU
utilisation while being more typical of a real host topology, but i doubt you
will see any meaningful performance delta. i generally recommend setting
hw:cpu_sockets equal to the number of NUMA nodes, more out of consistency
than anything else.

if you explicitly have multiple NUMA nodes (hw:numa_nodes=2) then
hw:cpu_sockets='2' can help the guest kernel make better scheduling
decisions, but i don't think hw:cpu_sockets='2' when the guest has 1 NUMA
node will degrade the performance.
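
purely as an illustration (m1.numa is just an example flavor name), a flavor
that explicitly requests two guest NUMA nodes, two sockets and the NUMA-aware
small-page memory tracking would look something like:

openstack flavor set m1.numa \
  --property hw:numa_nodes=2 \
  --property hw:cpu_sockets=2 \
  --property hw:mem_page_size=small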
> 
> 
> > 
> > if you do not want to use hugepages, hw:mem_page_size=small is a good
> > default, but just be aware that if the VM has a NUMA topology then memory
> > oversubscription is not supported in openstack. i.e. you cannot use CPU
> > pinning, or any other feature that requires a NUMA topology such as
> > virtual persistent memory, and also use memory oversubscription.
> > 
> 
> Got it.
> 
> 
> > 
> > assuming these events do not correlate with VM boots, then i would
> > investigate the cgroup memory limits set on the ovs and compute service
> > cgroups. if they are correlated with VM boots, check if the VM is NUMA
> > affined and, if it is, which page size is requested.
> > if it is hw:mem_page_size=small then you might need to use the badly
> > named reserved_huge_pages config option to reserve 4k pages for the host
> > per NUMA node.
> > 
> > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_huge_pages
> > e.g. to reserve 4G each on node 0 and node 1:
> > reserved_huge_pages = node:0,size:4,count:1048576
> > reserved_huge_pages = node:1,size:4,count:1048576
> > the sum of all the 4k page-size reservations should equal the value of
> > reserved_host_memory_mb
> > 
> > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_host_memory_mb
> 
> 
> Currently I have reserved_host_memory_mb set to reserve 64GB of memory for
> the host, and no oversubscription, i.e. the memory overprovisioning factor
> is set to 1.0 on the compute nodes.
ack.
since you are not NUMA-affining the VMs, that should be sufficient.
it really does seem that this is just related to the cgroup config on the host.
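
for completeness, a minimal nova.conf sketch of that setup, assuming two host
NUMA nodes (the commented reserved_huge_pages lines would only matter if you
later start creating NUMA instances with hw:mem_page_size=small):

[DEFAULT]
# 64G reserved for the host and no memory oversubscription
reserved_host_memory_mb = 65536
ram_allocation_ratio = 1.0
# 32G of 4k pages per NUMA node (2 x 8388608 x 4KiB = 64GiB, matching the above)
#reserved_huge_pages = node:0,size:4,count:8388608
#reserved_huge_pages = node:1,size:4,count:8388608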
> 
> 
> > 
> > 
> > this is only really needed where you are using NUMA instances, since
> > reserved_host_memory_mb does not account for the host NUMA topology, so
> > it will not prevent a NUMA node from being exhausted.
> > 
> > if you are using an AMD EPYC system and have the NPS (NUMA per socket)
> > BIOS option set to, say, 4 or 8 on a dual- or single-socket system
> > respectively, then the 1TB of RAM you have on the host would be divided
> > into NUMA nodes of 128GB each, which is very close to the 121GB used you
> > have when you start seeing issues.
> > 
> 
> Yes, I am using an EPYC system, and I checked in the BIOS that NPS is set to 1.
ack, so in that case you likely are not exhausting the NUMA node.
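
if you ever want to confirm that, the per-node totals and free memory are
visible with either of the following (node0 is shown just as an example):

numactl --hardware
grep -E 'MemTotal|MemFree' /sys/devices/system/node/node0/meminfo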
> 
> > 
> > nova currently tries to fill the NUMA nodes in order when you have NUMA
> > instances too, which causes the OOM issue to manifest much sooner than
> > people often expect due to the per-NUMA nature of the OOM reaper.
> > 
> > that may not help you in your case, but that is how i would approach
> > tracking down this issue.
> > 
> 
> This indeed helped a lot. It's been 18 hours now, and no OOM kills have
> been observed so far.
based on what you have said, tweaking the system and user slices is probably
the way to address this. it sounds like your nova config is fine for how you
are creating VMs.

i'm not sure how you have deployed openstack/openvswitch in this case, but i
suspect the cgroup limits the installer or you applied as part of the
installation are just a little too low, and if you increase them it will work
ok.
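
for example, something like the following should persist a higher limit on a
cgroup v2 host (on cgroup v1 the property is MemoryLimit= instead); the 64G
value just mirrors what you mentioned, so adjust as needed:

# applies at runtime and writes a persistent drop-in under /etc/systemd/system.control/
systemctl set-property system.slice MemoryMax=64G

# or create a drop-in such as /etc/systemd/system/system.slice.d/override.conf
# with a [Slice] section containing MemoryMax=64G, then run:
systemctl daemon-reload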
> 
> 
> > > 
> > > Ammad
> > 
> > 



