[nova] nova hypervisor oom killed some openstack guest
Laurent Dumont
laurentfdumont at gmail.com
Mon Jul 25 22:06:02 UTC 2022
How much memory are you reserving for OpenStack versus the VMs?
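
A minimal sketch of the arithmetic behind that question, assuming the reservation is made with nova's `reserved_host_memory_mb` option and that `ram_allocation_ratio` is in effect. Neither option is named in this thread, and the host size and flavor sizes below are hypothetical. The functions are illustrative helpers, not part of nova.

```python
from typing import Iterable


def guest_capacity_mb(total_mb: int,
                      reserved_host_memory_mb: int,
                      ram_allocation_ratio: float) -> int:
    """Memory (MiB) that can be handed out to guests on one host.

    Follows the usual capacity rule: (total - reserved) * allocation ratio.
    """
    if reserved_host_memory_mb > total_mb:
        raise ValueError("reservation exceeds total host memory")
    return int((total_mb - reserved_host_memory_mb) * ram_allocation_ratio)


def headroom_mb(total_mb: int,
                reserved_host_memory_mb: int,
                ram_allocation_ratio: float,
                guest_flavor_mb: Iterable[int]) -> int:
    """Capacity left after summing the flavor memory of the guests.

    Per-guest QEMU overhead is NOT counted here, which is one reason the
    reservation has to be generous enough in practice.
    """
    capacity = guest_capacity_mb(total_mb, reserved_host_memory_mb,
                                 ram_allocation_ratio)
    return capacity - sum(guest_flavor_mb)


# Hypothetical host: 256 GiB RAM, 8 GiB reserved, no oversubscription,
# three guests with 64 GiB flavors.
capacity = guest_capacity_mb(262144, 8192, 1.0)
left = headroom_mb(262144, 8192, 1.0, [65536, 65536, 65536])
```

If the headroom is small or negative once real guest overhead is added, the host can run out of memory even though RAM does not look oversubscribed.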
On Mon, Jul 25, 2022 at 2:19 PM hai wu <haiwu.us at gmail.com> wrote:
> Understood. The same concern is also raised in the following Red Hat
> KB: https://access.redhat.com/solutions/4670201.
>
> But we could also protect some critical OpenStack services, such as
> neutron and libvirtd, in the same way by setting OOMScoreAdjust for
> those to -1000. If we do that, we should probably be OK. We would
> protect both the critical OpenStack services and all OpenStack VMs
> this way.
>
> On Thu, Jul 21, 2022 at 6:42 AM Sean Mooney <smooney at redhat.com> wrote:
> >
> > On Wed, 2022-07-20 at 20:25 -0500, hai wu wrote:
> > > You are correct, there's no way to set OOMScoreAdjust for
> > > machine.slice. Trying to do that errored out with an "Unknown
> > > assignment" error.
> >
> > If you mess with the cgroups behind nova's back, then any hope of support
> > you have with your vendor or upstream is gone.
> >
> > You should really find out why you're running out of memory.
> >
> > It usually means you have not configured nova and the host correctly.
> >
> > Most often this happens because people use CPU pinning without enabling
> > per-NUMA-node memory tracking by setting a page size.
> >
> > It could also be because you have not allocated enough swap.
> >
> > So before you try to adjust things with cgroups yourself or explore
> > other options, you should determine why the host is running out of memory.
> >
> > If you prevent it from killing the guests, I have seen it kill OVS or nova
> > itself before, in cases where the guests were unkillable or unlikely to be
> > killed because they used hugepages.
> >
> > So you will likely just shift the problem elsewhere, where it will be
> > more impactful.
> >
> > >
> > > On Wed, Jul 20, 2022 at 6:48 PM hai wu <haiwu.us at gmail.com> wrote:
> > > >
> > > > In this case there's no memory oversubscription. This OOM killer event
> > > > happened when we did "swapoff -a; swapon -a" to push processes in swap
> > > > back to memory, which is very strange.
> > > >
> > > > On Wed, Jul 20, 2022 at 6:39 PM Clark Boylan <cboylan at sapwetik.org> wrote:
> > > > >
> > > > > On Wed, Jul 20, 2022, at 4:04 PM, hai wu wrote:
> > > > > > After installing some systemd package, starting up machine.slice
> > > > > > and systemd-machined, and hard rebooting the VM from the OpenStack
> > > > > > side, I could now see the VM showing up under machine.slice. Before
> > > > > > that, all VMs were showing up under libvirtd.service, which is under
> > > > > > system.slice.
> > > > > >
> > > > > > What are the benefits of running libvirt managed guest instances
> > > > > > under machine.slice?
> > > > >
> > > > > You can use machine.slice to set system resource options that each
> > > > > sub slice inherits. Those options are documented at
> > > > > https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#
> > > > > (per my earlier link
> > > > > https://www.freedesktop.org/software/systemd/man/systemd.slice.html).
> > > > > I don't see OOMScoreAdjust listed there, so I am unsure if you can
> > > > > actually set it via this method.
> > > > >
> > > > > That all said, if you are oversubscribing memory this is likely to
> > > > > always be an issue. If you adjust the OOM score for your VMs, then the
> > > > > OOM killer is just going to find other victims to kill. Losing your nova
> > > > > compute agent or NetworkManager or iscsid may be just as problematic.
> > > > > Instead, I suspect that you may need to stop oversubscribing memory.
> > > > >
> > > > > >
> > > > > > On Wed, Jul 20, 2022 at 5:53 PM Clark Boylan <cboylan at sapwetik.org> wrote:
> > > > > > >
> > > > > > > On Wed, Jul 20, 2022, at 3:17 PM, hai wu wrote:
> > > > > > > > Is there any configuration file that is needed to ensure guest
> > > > > > > > domains are under systemd machine.slice? I am not seeing anything
> > > > > > > > under machine.slice.
> > > > > > >
> > > > > > > I think that
> > > > > > > https://www.freedesktop.org/software/systemd/man/systemd.slice.html and
> > > > > > > https://libvirt.org/cgroups.html cover this for libvirt managed VMs.
> > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Jul 20, 2022 at 3:33 PM Dmitriy Rabotyagov
> > > > > > > > <noonedeadpunk at gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I believe you can decrease OOMScoreAdjust for the systemd
> > > > > > > > > machine.slice, under which guest domains run, to reduce the
> > > > > > > > > chances of the OOM killer killing them.
> > > > > > > > >
> > > > > > > > > On Wed, Jul 20, 2022 at 21:52 hai wu <haiwu.us at gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > The OOM killer on a nova hypervisor sometimes kills some
> > > > > > > > > > OpenStack guests.
> > > > > > > > > >
> > > > > > > > > > Is it possible to prevent the kernel from OOM killing any
> > > > > > > > > > OpenStack guests? RAM is not oversubscribed much.
> > > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>
>
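
For the OOMScoreAdjust discussion quoted above, a minimal sketch of how one might check which processes on the hypervisor currently carry an adjusted OOM score, before changing anything. It reads the per-process `comm` and `oom_score_adj` files under `/proc`. The helper names `read_oom_score_adj` and `protected` are hypothetical, and the -1000 threshold is simply the value mentioned in the thread.

```python
from pathlib import Path


def read_oom_score_adj(proc_root: Path = Path("/proc")) -> dict:
    """Return {pid: (command name, oom_score_adj)} for every process found."""
    result = {}
    for entry in proc_root.iterdir():
        # Only numeric directory names under /proc are processes.
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
            adj = int((entry / "oom_score_adj").read_text())
        except (OSError, ValueError):
            # The process exited while we were reading, or it is unreadable.
            continue
        result[int(entry.name)] = (comm, adj)
    return result


def protected(scores: dict, threshold: int = -1000) -> list:
    """Sorted command names whose oom_score_adj is at or below the threshold."""
    return sorted(comm for comm, adj in scores.values() if adj <= threshold)
```

Calling `protected(read_oom_score_adj())` on the hypervisor lists the processes the OOM killer is told to skip. Whatever is left unprotected, such as OVS or the nova compute agent, becomes the next candidate, which is the concern raised in the replies above.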