[nova] workarounds and operator experience around bug 1522307/1908133
Rodrigo Barbieri
rodrigo.barbieri2010 at gmail.com
Thu Jan 14 16:50:03 UTC 2021
Hello there,
Thanks Sean for the suggestions. I've tested them and reported my findings
in https://bugs.launchpad.net/nova/+bug/1908133
Your links helped me a lot in figuring out that my placement aggregates
were set up incorrectly, and the fake reservation worked slightly better
than reserved_host_disk_mb (more details on that in the bug update).
And it works very well on Rocky+, so that's very good.
This problem is now much more manageable, thanks for the suggestions!
Regards,
On Fri, Jan 8, 2021 at 7:13 PM Sean Mooney <smooney at redhat.com> wrote:
> On Fri, 2021-01-08 at 18:27 -0300, Rodrigo Barbieri wrote:
> > Thanks for the responses Eugen and Sean!
> >
> > The placement.yaml approach sounds good if it can prevent the compute
> > host from reporting local_gb repeatedly, and then, as you suggested, using
> > Placement Aggregates I can perhaps make that work for a subset of use
> > cases. Too bad it is only available on Victoria+. I was looking for
> > something that could work, even if partially, on Queens and Stein.
> >
> > The cron job updating the reservation, I'm not sure if it will clash with
> > the host updates (being overridden, as I've described in the LP bug), but
> > you actually gave me another idea. I may be able to create a fake
> > allocation on the nodes to cancel out their reported values, and then
> > rely only on the shared value through placement.
> well, actually you could use the host reserved disk space config value to
> do that on older releases: just set it equal to the pool size.
>
> https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_host_disk_mb
> not sure why that is in MB (it really should be GB), but if you set it,
> the reserved value will be reflected in placement.
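> for a 10 TiB pool, for example, that would look something like this in
> each compute node's nova.conf (the size here is purely illustrative):
>
>     [DEFAULT]
>     # reserve roughly the whole shared ceph pool (10 TiB expressed in MiB)
>     # so each host's local DISK_GB inventory is effectively cancelled out
>     reserved_host_disk_mb = 10485760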
>
> >
> > Monitoring Ceph is only part of the problem. The second part, if you end
> > up needing it (and you may if you're not very conservative in the
> > monitoring parameters and have an unpredictable workload), is to prevent
> > new instances from being created, and thus new data from being stored, to
> > keep the pool from filling up before you can react (think of an
> > accidental DoS attack from running certain storage-heavy workloads).
> >
> > @Eugen, yes. I was actually looking for more reliable ways to prevent it
> > from happening.
> >
> > Overall, the shared placement + fake allocation sounded like the cleanest
> > workaround for me. I will try that and report back.
>
> if I get time in the next week or two I'm hoping to try and tweak our ceph
> CI job to test that topology in the upstream CI, but just looking at the
> placement functional tests it should work.
>
> This covers the use of sharing resource providers:
>
> https://github.com/openstack/placement/blob/master/placement/tests/functional/gabbits/shared-resources.yaml
>
> the final section tests the allocation candidates endpoint and asserts we
> get an allocation for both providers:
>
> https://github.com/openstack/placement/blob/master/placement/tests/functional/gabbits/shared-resources.yaml#L135-L143
>
> it's relatively simple to read this file top to bottom (it's only 143 lines
> long), but it basically steps through and constructs the topology I was
> describing, or at least a similar one, and shows step by step what the
> different behavior will be as the resource providers and aggregates are
> created, etc.
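> done by hand with the osc-placement CLI, the core of that topology would
> look roughly like the sketch below (uuids and sizes are placeholders, and
> option spellings can vary a little between releases):
>
>     # a resource provider representing the shared ceph pool
>     openstack resource provider create ceph-shared-disk
>
>     # give it the pool's DISK_GB inventory (size made up here)
>     openstack resource provider inventory class set <ceph_rp_uuid> DISK_GB \
>         --total 10240
>
>     # mark it as a sharing provider
>     openstack resource provider trait set <ceph_rp_uuid> \
>         --trait MISC_SHARES_VIA_AGGREGATE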
>
> the main issue with this approach is we don't really have a good way to
> upgrade existing deployments to this topology beyond live migrating
> everything one node at a time, so that their allocations get reshaped as a
> side effect of the move operation.
>
> looking at the history of this file, it was added 3 years ago
> https://github.com/openstack/placement/commit/caeae7a41ed41535195640dfa6c5bb58a7999a9b
> around Stein, although it may also have worked before that; I'm not sure
> when we added sharing providers.
>
> >
> > Thanks for the help!
> >
> > On Wed, Jan 6, 2021 at 10:57 AM Eugen Block <eblock at nde.ag> wrote:
> >
> > > Hi,
> > >
> > > we're using OpenStack with Ceph in production and also have customers
> > > doing that.
> > > From my point of view fixing nova to be able to deal with shared
> > > storage of course would improve many things, but it doesn't liberate
> > > you from monitoring your systems. Filling up a ceph cluster should be
> > > avoided and therefore proper monitoring is required.
> > >
> > > I assume you were able to resolve the frozen instances?
> > >
> > > Regards,
> > > Eugen
> > >
> > >
> > > Quoting Sean Mooney <smooney at redhat.com>:
> > >
> > > > On Tue, 2021-01-05 at 14:17 -0300, Rodrigo Barbieri wrote:
> > > > > Hi Nova folks and OpenStack operators!
> > > > >
> > > > > I have had some trouble recently where, while using the "images_type
> > > > > = rbd" libvirt option, my ceph cluster got filled up without me
> > > > > noticing and froze all my nova services and instances.
> > > > >
> > > > > I started digging and investigating why and how I could prevent or
> > > > > work around this issue, but I didn't find a very reliable, clean way.
> > > > >
> > > > > I documented all my steps and investigation in bug 1908133 [0]. It
> > > > > has been marked as a duplicate of 1522307 [1], which has been around
> > > > > for quite some time, so I am wondering if any operators have been
> > > > > using nova + ceph in production with "images_type = rbd" config set
> > > > > and how you have been handling/working around the issue.
> > > >
> > > > this is indeed a known issue, and the long-term plan to fix it was to
> > > > track shared storage as a sharing resource provider in placement. that
> > > > never happened, so there is currently no mechanism available to prevent
> > > > this explicitly in nova.
> > > >
> > > > the disk filter, which is no longer used, could prevent the boot of a
> > > > vm that would fill the ceph pool, but it could not protect against two
> > > > concurrent requests from filling the pool.
> > > >
> > > > placement can protect against that due to the transactional nature of
> > > > allocations, which serialises all resource usage; however, since each
> > > > host reports the total size of the ceph pool as its local storage, that
> > > > won't work out of the box.
> > > >
> > > > as a quick hack, what you can do is set
> > > > [DEFAULT]/disk_allocation_ratio = (1 / number of compute nodes)
> > > >
> > > > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.disk_allocation_ratio
> > > >
> > > > in each of your compute agents' configs.
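> > > > with 10 compute nodes sharing the pool, for example, each nova.conf
> > > > would carry something like this (numbers illustrative only):
> > > >
> > > >     [DEFAULT]
> > > >     # 1/10: each host may only schedule a tenth of the DISK_GB it
> > > >     # reports, so the 10 hosts together cannot oversubscribe the pool
> > > >     disk_allocation_ratio = 0.1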
> > > >
> > > >
> > > > that will prevent oversubscription, however it has other negative
> > > > side effects: mainly that you will fail to schedule an instance that
> > > > could otherwise boot once a host exceeds its 1/n usage, so unless you
> > > > have perfectly balanced consumption this is not a good approach.
> > > >
> > > > a better approach, but one that requires external scripting, is to
> > > > have a cron job that updates the reserved value of each host's DISK_GB
> > > > inventory to the actual amount of storage allocated from the pool.
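> > > > a very rough sketch of such a cron job, assuming the osc-placement
> > > > plugin and jq are installed and the nova pool is named "vms" (the ceph
> > > > json field names and exact option spellings can differ between
> > > > releases):
> > > >
> > > >     #!/bin/bash
> > > >     # GiB of the pool already consumed
> > > >     USED_GB=$(ceph df -f json | jq '.pools[] | select(.name == "vms") | .stats.bytes_used / 1073741824 | floor')
> > > >     # mirror that into the reserved field of every compute node's
> > > >     # DISK_GB inventory (total is the pool size, a placeholder here)
> > > >     for RP in $(openstack resource provider list -f value -c uuid); do
> > > >         openstack resource provider inventory class set "$RP" DISK_GB \
> > > >             --total <pool_size_gb> --reserved "$USED_GB"
> > > >     done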
> > > >
> > > > the real fix, however, is for nova to track its shared usage in
> > > > placement correctly as a sharing resource provider.
> > > >
> > > > it's possible you might be able to do that via the provider.yaml file,
> > > >
> > > > by overriding the local disk_gb to 0 on all compute nodes and then
> > > > creating a single sharing resource provider of disk_gb that models the
> > > > ceph pool.
> > > >
> > > > https://specs.openstack.org/openstack/nova-specs/specs/ussuri/approved/provider-config-file.html
> > > > currently that does not support adding providers to placement
> > > > aggregates, so while it could be used to zero out the compute node disk
> > > > inventories and to create a sharing provider with the
> > > > MISC_SHARES_VIA_AGGREGATE trait, it can't do the final step of mapping
> > > > which compute nodes can consume from the sharing provider via the
> > > > aggregate, but you could do that yourself (a rough sketch follows
> > > > below).
> > > > that assumes that "sharing resource providers" actually work.
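> > > > that last manual step would look roughly like this with the
> > > > osc-placement CLI (the aggregate uuid is just any uuid you generate;
> > > > depending on the release you may also need to pass the provider's
> > > > current --generation):
> > > >
> > > >     AGG=$(uuidgen)
> > > >     # put the sharing ceph provider and each compute node provider into
> > > >     # the same placement aggregate so they can share DISK_GB
> > > >     openstack resource provider aggregate set <ceph_rp_uuid> --aggregate "$AGG"
> > > >     openstack resource provider aggregate set <compute_rp_uuid> --aggregate "$AGG"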
> > > >
> > > >
> > > > basically, what it comes down to today is you need to monitor the
> > > > available resources yourself externally and ensure you never run out
> > > > of space.
> > > > that sucks, but until we properly track things in placement there is
> > > > nothing we can really do.
> > > > the two approaches I suggested above might work for a subset of use
> > > > cases, but really this is a feature that needs native support in nova
> > > > to address properly.
> > > >
> > > > >
> > > > > Thanks in advance!
> > > > >
> > > > > [0] https://bugs.launchpad.net/nova/+bug/1908133
> > > > > [1] https://bugs.launchpad.net/nova/+bug/1522307
> > > > >
> > >
> > >
> > >
> > >
> > >
> >
>
>
>
>
--
Rodrigo Barbieri
MSc Computer Scientist
OpenStack Manila Core Contributor
Federal University of São Carlos