[oslo][nova][stable][requirements] Fixing a high CPU usage from oslo.service into stable/rocky branch

Doug Hellmann doug at doughellmann.com
Wed Nov 21 14:57:02 UTC 2018


Herve Beraud <hberaud at redhat.com> writes:

> Hey all!
>
> Here is a thread to coordinate all the teams (oslo, nova, stable,
> requirements) working on the update of the oslo.service constraint in the
> Rocky requirements.
>
> # Summary
>
> Using threading events with eventlet made oslo.service inefficient: it
> issued many useless system calls and caused high CPU usage.
> This issue was already fixed on oslo.service master and we also want to fix
> it in stable/rocky.
>
> Our main issue is how to fix the high CPU usage on stable/rocky without
> breaking the nova CI.
>
> Indeed, we have already backported the eventlet-related fix to
> oslo.service, but this fix also requires a nova update, because removing
> threading from oslo.service introduces errors in the nova CI.
>
> A fix was proposed and merged on oslo.service master to introduce a new
> feature (a fixture) that avoids the nova CI errors, but backporting it
> to Rocky introduces a new feature into a stable branch, so this is also
> an issue.
>
> So we need to discuss with all the teams to find a proper solution.
>
> # History
>
> A few weeks ago this issue was opened on oslo.service (
> https://bugs.launchpad.net/oslo.service/+bug/1798774) and it was fixed by
> this patch submitted on the master branch (
> https://review.openstack.org/#/c/611807/ ).
>
> This change uses the proper event primitive to fix the performance issue.
>
> A new version of oslo.service was released (1.32.1).
>
> Since these changes were introduced into oslo.service master, nova has
> faced some issues in the master CI process due to the threading changes,
> and they were fixed by these patches (
> https://review.openstack.org/#/c/615724/,
> https://review.openstack.org/#/c/617989/ ) on master.
>
> A few weeks ago I backported some changes (
> https://review.openstack.org/#/c/614489/ ) from oslo.service master to
> stable/rocky to also fix the problem in the Rocky release.
>
> When this backport was merged we created a new release of oslo.service
> (1.31.6) ( https://review.openstack.org/#/c/616505/ ) (stable/rocky
> version).
>
> Then the OpenStack proposal bot submitted a patch to requirements on
> stable/rocky to update oslo.service to the latest version (1.31.6), but
> using it would break the CI ( https://review.openstack.org/#/c/618834/ ),
> so this patch is currently blocked to avoid nova CI errors.
>
> # Issue
>
> Since the oslo.service threading changes were backported to Rocky, we risk
> facing the same issues in the nova Rocky CI if we update the requirements.
>
> In parallel, in oslo.service we have started to backport a new patch that
> introduces a fixture ( https://review.openstack.org/#/c/617989/ ) from
> master to Rocky, and we have also started to backport to the nova Rocky
> branch ( https://review.openstack.org/619019,
> https://review.openstack.org/619022 ) patches that use
> oslo.service.fixture and solve the nova CI issue. The patch on
> oslo.service exposes a public oslo_service.fixture.SleepFixture for this
> purpose. It can be maintained opaquely as internals change without
> affecting its consumers.
>
> The main problem is that the patch brings new functionality to a stable
> branch (oslo.service Rocky), but this patch helps to fix the nova issue.
>
> Also, the OpenStack proposal bot submitted a patch to requirements on
> stable/rocky to update oslo.service to the latest version (1.31.6), but
> using it would break the CI ( https://review.openstack.org/#/c/618834/ ),
> since oslo.service 1.31.6 is incompatible with nova's stable/rocky unit
> tests due to the threading changes.
>
> # Questions and proposed solutions
>
> This thread tries to summarize the current situation.
>
> We need to find a way to proceed, so this thread aims to let the teams
> discuss and find the best way to fix this.
>
> 1. Do we need to continue trying to backport the fixture to oslo.service
> to fix the CI problem ( https://review.openstack.org/#/c/617989/ )?
>
> 2. Do we need to find another approach, like mocking
> oslo.service.loopingcall._Event.wait in nova instead of mocking
> oslo_service.loopingcall._ThreadingEvent.wait (example:
> https://review.openstack.org/#/c/616697/2/nova/tests/unit/compute/test_compute_mgr.py)
> ?
> This is a fix only on the nova side; it allows us to update the
> oslo.service requirement and to fix the high CPU usage issue. I've
> submitted this patch ( https://review.openstack.org/619246 ) that
> implements the approach described above.
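
For readers unfamiliar with the second option, here is a minimal sketch of
what mocking a private wait method in a unit test looks like. It is not the
nova or oslo.service code: the _Event class and the run_with_retries helper
below are hypothetical stand-ins defined only for this illustration, and
only unittest.mock from the standard library is used.

```python
import threading
from unittest import mock


class _Event:
    """Hypothetical stand-in for a private event class in a library."""

    def __init__(self):
        self._event = threading.Event()

    def wait(self, timeout=None):
        # Blocks for up to `timeout` seconds unless the event is set.
        return self._event.wait(timeout)


def run_with_retries(func, attempts, interval):
    """Hypothetical looping call: retry func, waiting between attempts."""
    event = _Event()
    for attempt in range(1, attempts + 1):
        if func():
            return attempt
        event.wait(interval)
    return None


def demo():
    results = iter([False, False, True])
    # Patch wait on the class so the test never really sleeps for 60 seconds.
    with mock.patch.object(_Event, "wait", return_value=None) as mock_wait:
        attempt = run_with_retries(
            lambda: next(results), attempts=5, interval=60)
    return attempt, mock_wait.call_count
```

The trade-off is visible in the patch target: the test names a private
class, so it has to change whenever the library renames or replaces that
class.
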
>
> Personally I think we need to find another approach, like the mocking
> replacement (cf. 2).
>
> We need to decide which way to go and to discuss other solutions.
>
> -- 
> Hervé Beraud
> Senior Software Engineer
> Red Hat - Openstack Oslo
> irc: hberaud

Thank you for summarizing this issue, Hervé, and for working on the
patches we need.

I think I would be happy with either solution. Using clean backports
seems less risky, and even though we are adding a new feature to
oslo.service it's only a unit test fixture. On the other hand if we want
to be very strict about not adding features in stable branches and we
are OK with creating a change to nova's unit tests that is not
backported from master, then that works for me, too.

I have a slight preference for the first proposal, but not strong enough
to fight for it if the majority decides to go with the second
option.

-- 
Doug



More information about the openstack-discuss mailing list