[oslo][nova][stable][requirements] Fixing a high CPU usage from oslo.service into stable/rocky branch
Eric Fried
openstack at fried.cc
Wed Nov 21 16:15:06 UTC 2018
I have no preference.
For #1, the backports won't be perfectly clean, because the requirements
files will have different versions of oslo.service in them. But that's
no big deal. It's also more "work", but the patches are already
proposed, so meh.
For #2, the original objection [1] was that it's still mocking private
things from a 3rd-party library. But that's not as big a deal for
stable, which is less likely to yank the rug out. (You know, except for
the backport that led to this issue...)
-efried
[1]
https://review.openstack.org/#/c/615724/1/nova/tests/unit/compute/test_compute_mgr.py@6440
On 11/21/18 09:28, Herve Beraud wrote:
> I agree this is not a feature in the normal sense.
> The fixture is isolated from the rest and doesn't affect the
> oslo.service consumers.
> I have a slight preference for the second solution, but I have no
> problem with using the first proposed solution (backport).
>
> On Wed, Nov 21, 2018 at 16:02, Sean Mooney <smooney at redhat.com>
> wrote:
>
> On Wed, 2018-11-21 at 09:57 -0500, Doug Hellmann wrote:
> > Herve Beraud <hberaud at redhat.com> writes:
> >
> > > Hey all!
> > >
> > > Here is a thread to coordinate all the teams (oslo, nova, stable,
> > > requirements) working on the update of the oslo.service constraint
> > > in the Rocky requirements.
> > >
> > > # Summary
> > >
> > > Using a threading event with eventlet resulted in inefficient code
> > > (many useless system calls and high CPU usage).
> > > This issue is already fixed on oslo.service master and we also want
> > > to fix it in stable/rocky.
> > >
> > > Our main issue is how to fix the high CPU usage on stable/rocky
> > > without breaking the nova CI.
> > >
> > > Indeed, we have already backported the eventlet-related fix to
> > > oslo.service, but this fix also requires a nova update: the removal
> > > of threading from oslo.service is what introduces the nova CI
> > > errors.
> > >
> > > A fix was proposed and merged on oslo.service master to introduce a
> > > new feature (a fixture) that avoids the nova CI errors, but
> > > backporting the master fix to Rocky introduces a new feature into a
> > > stable branch, so this is also an issue.
> > >
> > > So we need to discuss with all the teams to find a proper solution.
> > >
> > > # History
> > >
> > > A few weeks ago this issue was opened on oslo.service (
> > > https://bugs.launchpad.net/oslo.service/+bug/1798774) and it was
> > > fixed by this submitted patch on the master branch (
> > > https://review.openstack.org/#/c/611807/ ).
> > >
> > > This change uses the proper event primitive to fix the performance
> > > issue.
> > >
> > > A new version of oslo.service was released (1.32.1).
> > >
> > > Since these changes were introduced into oslo.service master, nova
> > > has faced some issues in the master CI process due to the threading
> > > changes, and they were fixed by these patches (
> > > https://review.openstack.org/#/c/615724/,
> > > https://review.openstack.org/#/c/617989/ ) in master.
> > >
> > > A few weeks ago I backported some changes to oslo.service (
> > > https://review.openstack.org/#/c/614489/ ) from master to
> > > stable/rocky to also fix the problem in the rocky release.
> > >
> > > When this backport was merged we created a new release of
> > > oslo.service (1.31.6) ( https://review.openstack.org/#/c/616505/ )
> > > (stable/rocky version).
> > >
> > > Then the openstack proposal bot submitted a patch to requirements on
> > > stable rocky to update the oslo.service version to the latest
> > > version (1.31.6), but if we use it we will break the CI
> > > https://review.openstack.org/#/c/618834/ so this patch is currently
> > > blocked to avoid nova CI errors.
> > >
> > > # Issue
> > >
> > > Since the oslo.service threading changes were backported to rocky,
> > > we risk facing the same issues inside the nova rocky CI if we update
> > > the requirements.
> > >
> > > In parallel, in oslo.service we have started to backport a new patch
> > > that introduces a fixture ( https://review.openstack.org/#/c/617989/ )
> > > from master to rocky, and we have also started to backport to the
> > > nova rocky branch ( https://review.openstack.org/619019,
> > > https://review.openstack.org/619022 ) patches that use
> > > oslo.service.fixture and that solve the nova CI issue. The patch on
> > > oslo.service exposes a public oslo_service.fixture.SleepFixture for
> > > this purpose. It can be maintained opaquely as internals change
> > > without affecting its consumers.
> > >
> > > The main problem is that the patch brings new functionality to a
> > > stable branch (oslo.service rocky), but this patch helps to fix the
> > > nova issue.
> > >
> > > Also, the openstack proposal bot submitted a patch to requirements
> > > on stable rocky to update the oslo.service version to the latest
> > > version (1.31.6), but if we use it we will break the CI
> > > https://review.openstack.org/#/c/618834/ since oslo.service 1.31.6
> > > is incompatible with nova's stable rocky unit tests due to the
> > > threading changes.
> > >
> > > # Questions and proposed solutions
> > >
> > > This thread tries to summarize the current situation.
> > >
> > > We need to find a way to proceed, so this thread aims to let the
> > > teams discuss and find the best fix.
> > >
> > > 1. Do we need to continue to try to backport the fixture on
> > > oslo.service to fix the CI problem
> > > (https://review.openstack.org/#/c/617989/) ?
> > >
> > > 2. Do we need to find another approach, like mocking
> > > oslo.service.loopingcall._Event.wait in nova instead of mocking
> > > oslo_service.loopingcall._ThreadingEvent.wait (example:
> > > https://review.openstack.org/#/c/616697/2/nova/tests/unit/compute/test_compute_mgr.py)
> > > ?
> > > This is only a fix on the nova side, and it allows us to update the
> > > oslo.service requirements and fix the high CPU usage issue. I've
> > > submitted this patch (https://review.openstack.org/619246), which
> > > implements the description above.
> > >
> > > Personally I think we need to find another approach, like the
> > > mocking replacement (cf. 2).
> > >
> > > We need to decide which way to go and to discuss other solutions.
> > >
> > > --
> > > Hervé Beraud
> > > Senior Software Engineer
> > > Red Hat - Openstack Oslo
> > > irc: hberaud
> >
> > Thank you for summarizing this issue, Hervé, and for working on the
> > patches we need.
> >
> > I think I would be happy with either solution. Using clean backports
> > seems less risky, and even though we are adding a new feature to
> > oslo.service it's only a unit test fixture. On the other hand if we
> > want to be very strict about not adding features in stable branches
> > and we are OK with creating a change to nova's unit tests that is not
> > backported from master, then that works for me, too.
>
> It should be noted this is not just a blocker for the nova CI.
> If we don't fix the unit tests, it will break distributions that run
> the unit tests as part of their packaging of nova downstream.
> I would prefer to backport the fixture personally and do a clean
> backport of the nova patches as well, rather than a stable-only patch.
> While technically a feature, I don't really consider a test fixture to
> be a feature in the normal sense, and it is relatively small and
> self-contained.
> >
> > I have a slight preference for the first proposal, but not strong
> > enough to fight for it if the majority decides to go with the second
> > option.
> >
>
>
>
> --
> Hervé Beraud
> Senior Software Engineer
> Red Hat - Openstack Oslo
> irc: hberaud
>