How much are you reserving for Openstack vs the VM?
On Mon, 2022-07-25 at 18:06 -0400, Laurent Dumont wrote:
That is a very good question. Many people fail to account for the QEMU overhead and fail to allocate swap. Even if you are not using memory oversubscription, you should have 8-16GB of swap on any nova compute host. In addition to how much is being reserved, it is also important to ensure that, if you are doing memory oversubscription, there is enough swap to cover it, and to understand that the kernel OOM reaper runs per NUMA node. So even if there is plenty of free memory on NUMA node 1, if the kernel needs memory on NUMA node 0 it will trigger an OOM reaping cycle. So if you are using hugepages, it is important to ensure that you still have enough memory on all NUMA nodes where kernel processes can run.
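To make the per-NUMA-node point concrete, here is a minimal Python sketch that reports free memory per node rather than host-wide. It assumes a Linux host that exposes the usual sysfs files under /sys/devices/system/node/; the function names and the 1 GiB threshold are illustrative choices, not anything from nova.

```python
import glob
import re

# Matches lines such as "Node 0 MemFree:  1234567 kB".
# Fields without a kB unit (for example the hugepage counters) are skipped.
LINE_RE = re.compile(
    r"^Node[ \t]+(\d+)[ \t]+(\w+):[ \t]+(\d+)[ \t]*kB", re.MULTILINE
)


def parse_node_meminfo(text):
    """Return {node_id: {field: kib}} parsed from per-node meminfo text."""
    nodes = {}
    for node, field, kib in LINE_RE.findall(text):
        nodes.setdefault(int(node), {})[field] = int(kib)
    return nodes


def low_memory_nodes(nodes, min_free_kib):
    """Return the sorted node ids whose MemFree is below min_free_kib."""
    return sorted(
        n for n, fields in nodes.items()
        if fields.get("MemFree", 0) < min_free_kib
    )


def read_host_nodes(pattern="/sys/devices/system/node/node*/meminfo"):
    """Read every per-node meminfo file that exists on this host."""
    text = ""
    for path in glob.glob(pattern):
        with open(path) as fh:
            text += fh.read() + "\n"
    return parse_node_meminfo(text)


if __name__ == "__main__":
    nodes = read_host_nodes()
    for node in sorted(nodes):
        print(node, nodes[node].get("MemFree"), "kB free")
    # 1 GiB is an arbitrary illustrative threshold.
    print("nodes below 1 GiB free:", low_memory_nodes(nodes, 1024 * 1024))
```

A host can look healthy in aggregate while one node is nearly exhausted, which is why the check is done per node.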
On Mon, Jul 25, 2022 at 2:19 PM hai wu <haiwu.us@gmail.com> wrote:
Understood. The same concern is also raised in the following Red Hat KB article: https://access.redhat.com/solutions/4670201.
Just be aware that ^ is not something that is supported in the Red Hat OpenStack product, and implementing it would void your support for the VMs. Knowledge base articles are generally written by support engineers when debugging a problem, with possible solutions they tried. They are not part of our product docs and are not reviewed for correctness by the engineering teams that maintain OpenStack upstream or downstream, so take anything you find there with a grain of salt. Libvirt hooks are not and never have been supported upstream or downstream. But if you are maintaining and/or operating the cloud yourself, then that might work for you.
But we could also protect some critical OpenStack services, like neutron and libvirtd, in the same way by setting OOMScoreAdjust for those to -1000. If we do that, we should probably be OK. We protect both critical OpenStack services and all OpenStack VMs in this way.
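Before changing any OOM scores, it may help to see what is already exempt on the host. The following is a rough Python sketch, assuming a Linux /proc filesystem; it reads the per-process oom_score_adj files, where -1000 means the OOM killer will never select that process. The function names are made up for illustration.

```python
import os


def read_oom_adjustments(proc_root="/proc"):
    """Return {pid: (comm, oom_score_adj)} for every readable process."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "comm")) as fh:
                comm = fh.read().strip()
            with open(os.path.join(base, "oom_score_adj")) as fh:
                adj = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # process exited meanwhile or is not readable
        result[int(entry)] = (comm, adj)
    return result


def exempt_processes(adjustments):
    """Return sorted process names with oom_score_adj of -1000."""
    return sorted(
        {comm for comm, adj in adjustments.values() if adj == -1000}
    )


if __name__ == "__main__":
    print(exempt_processes(read_oom_adjustments()))
```

The more processes end up in that exempt list, the fewer candidates the kernel has left when it does need to reclaim memory.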
On Thu, Jul 21, 2022 at 6:42 AM Sean Mooney <smooney@redhat.com> wrote:
On Wed, 2022-07-20 at 20:25 -0500, hai wu wrote:
You are correct, there's no way to set OOMScoreAdjust for machine.slice. It errored out when trying to do that, with an "Unknown assignment" error.
If you mess with the cgroups behind nova's back, then any hope of support you have with your vendor or upstream is gone.
You should really find out why you're running out of memory.
It usually means you have not configured nova and the host correctly.
Most often this happens because people use CPU pinning without enabling per-NUMA-node memory tracking by setting a page size.
It could also be because you have not allocated enough swap.
So before you try to adjust things with cgroups yourself or explore other options, you should determine why the host is running out of memory.
If you prevent it from killing the guests, it will pick something else: I have seen it kill OVS or nova itself before, in cases where the guests were unkillable or unlikely to be killed because they used hugepages.
So you will likely just shift the problem elsewhere, where it will be more impactful.
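As an illustration of the pinning-without-page-size misconfiguration, here is a small Python sketch that flags flavors with pinned CPUs but no explicit page size. It assumes the settings in question are the Nova flavor extra specs hw:cpu_policy and hw:mem_page_size (the message above does not name them), and the flavor names and data are hypothetical. Image properties can set the same values and are not covered here.

```python
def pinned_without_page_size(extra_specs):
    """True if a flavor pins CPUs but does not set an explicit page size."""
    pinned = extra_specs.get("hw:cpu_policy") == "dedicated"
    return pinned and "hw:mem_page_size" not in extra_specs


def risky_flavors(flavors):
    """Return sorted names of flavors that pin CPUs without a page size."""
    return sorted(
        name for name, specs in flavors.items()
        if pinned_without_page_size(specs)
    )


if __name__ == "__main__":
    # Hypothetical flavors, written out by hand for the example.
    flavors = {
        "pinned.large": {"hw:cpu_policy": "dedicated"},
        "pinned.numa": {
            "hw:cpu_policy": "dedicated",
            "hw:mem_page_size": "small",
        },
        "general.medium": {},
    }
    print(risky_flavors(flavors))
```

In practice the extra specs would come from the flavor listing of the cloud in question rather than a hand-written dictionary.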
On Wed, Jul 20, 2022 at 6:48 PM hai wu <haiwu.us@gmail.com> wrote:
In this case there's no memory oversubscription. This oom killer event happened when we did "swapoff -a; swapon -a" to push processes in swap back to memory, which is very strange.
On Wed, Jul 20, 2022 at 6:39 PM Clark Boylan <cboylan@sapwetik.org> wrote:
On Wed, Jul 20, 2022, at 4:04 PM, hai wu wrote:
> After installing some systemd package, and starting up machine.slice,
> systemd-machined, and hard rebooting the vm from openstack side, I
> could now see the VM showing up under machine.slice. all vms were
> showing up under libvirtd.service, which is under system.slice.
>
> What are the benefits of running libvirt managed guest instances under
> machine.slice?
You can use machine.slice to set system resource options that each sub slice inherits. Those options are documented at https://www.freedesktop.org/software/systemd/man/systemd.resource-control.ht... (per my earlier link https://www.freedesktop.org/software/systemd/man/systemd.slice.html). I don't see OOMScoreAdjust listed there so I am unsure if you can actually set it via this method.
That all said, if you are oversubscribing memory this is likely to always be an issue. If you adjust the oom score for your VMs then the oomkiller is just going to find other victims to kill. Losing your nova compute agent or NetworkManager or iscsid may be just as problematic. Instead, I suspect that you may need to stop oversubscribing memory.
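Whether a host is actually oversubscribed can be checked with simple arithmetic. The sketch below, in Python, subtracts guest memory, a per-guest overhead, and a host reservation from total host memory. All figures are hypothetical, and the per-guest overhead in particular is an assumed placeholder, not a measured QEMU value.

```python
def memory_headroom_mib(host_total_mib, reserved_mib, guest_mib,
                        per_guest_overhead_mib=0):
    """Return MiB left after guests, their overhead, and the reservation.

    A negative result means the host is committed beyond physical memory.
    """
    committed = sum(guest_mib) + per_guest_overhead_mib * len(guest_mib)
    return host_total_mib - reserved_mib - committed


if __name__ == "__main__":
    # Hypothetical 256 GiB host with 8 GiB reserved for host services.
    guests = [65536, 65536, 65536, 32768]
    print(memory_headroom_mib(262144, 8192, guests,
                              per_guest_overhead_mib=512))
```

If the result is close to zero or negative once overhead is included, the host is effectively oversubscribed even when the scheduler's allocation ratio says otherwise.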
> On Wed, Jul 20, 2022 at 5:53 PM Clark Boylan <cboylan@sapwetik.org> wrote:
> >
> > On Wed, Jul 20, 2022, at 3:17 PM, hai wu wrote:
> > > Is there any configuration file that is needed to ensure guest domains
> > > are under systemd machine.slice? not seeing anything under
> > > machine.slice ..
> >
> > I think that https://www.freedesktop.org/software/systemd/man/systemd.slice.html and https://libvirt.org/cgroups.html covers this for libvirt managed VMs.
> >
> > >
> > > On Wed, Jul 20, 2022 at 3:33 PM Dmitriy Rabotyagov
> > > <noonedeadpunk@gmail.com> wrote:
> > > >
> > > > I believe you can decrease OOMScoreAdjust for systemd machines.slice, under which guest domains are to reduce chances of oom killing them.
> > > >
> > > > Wed, 20 Jul 2022, 21:52 hai wu <haiwu.us@gmail.com>:
> > > > >
> > > > > nova hypervisor sometimes oom would kill some openstack guests.
> > > > >
> > > > > Is it possible to not allow kernel to oom kill any openstack guests?
> > > > > ram is not oversubscribed much ..