[oslo][nova][stable][requirements] Fixing a high CPU usage from oslo.service into stable/rocky branch

Tony Breeds tony at bakeyournoodle.com
Thu Nov 22 04:55:07 UTC 2018


Hi folks,
    I admit my initial response to this was a more pragmatic 'take the
backport', but as I thought it through I saw more problems with that
approach.

On Wed, Nov 21, 2018 at 03:47:25PM +0100, Herve Beraud wrote:
 
> Since these changes were introduced into oslo.service master, nova faced
> some issues in the master CI process due to the threading changes, and
> they were fixed by these patches ( https://review.openstack.org/#/c/615724/,
> https://review.openstack.org/#/c/617989/ ) on master.
> 
> A few weeks ago I backported some changes to oslo.service (
> https://review.openstack.org/#/c/614489/ ) from master to stable/rocky to
> also fix the problem in the rocky release.

Okay, that was a mistake: backporting a patch from master to stable that
is known to break consumers.  I admit this isn't explicitly called out in
the stable policy, but it goes against the spirit of it.

The quickest fix would be to revert 614489, release 1.31.7 and blacklist
1.31.6.

Yes this leaves the High CPU usage bug open on rocky.  That isn't great
but it also isn't terrible.
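
The blacklisting part is the usual exclusion in the stable/rocky requirements
repo, roughly the lines below.  The minimum version here is a placeholder, I
haven't checked what rocky actually carries.

    # global-requirements.txt on stable/rocky (minimum version is a placeholder)
    oslo.service>=1.24.0,!=1.31.6  # Apache-2.0

    # upper-constraints.txt, once 1.31.7 is released
    oslo.service===1.31.7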
 
> When this backport was merged we created a new release of oslo.service
> (1.31.6) ( https://review.openstack.org/#/c/616505/ ) (stable/rocky
> version).
> 
> Then the openstack proposal bot submitted a patch to requirements on stable
> rocky to update the oslo.service version to the latest version (1.31.6),
> but using it would break the CI
> ( https://review.openstack.org/#/c/618834/ ), so this patch is currently
> blocked to avoid the nova CI error.

Huzzah for cross-project gating!
 
> # Issue
> 
> Since the oslo.service threading changes were backported to rocky, we risk
> facing the same issues in the nova rocky CI if we update the
> requirements.
> 
> In parallel, in oslo.service we have started to backport a new patch that
> introduces a fixture ( https://review.openstack.org/#/c/617989/ ) from
> master to rocky, and we have also started to backport to the nova rocky branch (
> https://review.openstack.org/619019, https://review.openstack.org/619022 )
> patches that use oslo.service.fixture and solve the nova CI issue. The
> patch on oslo.service exposes a public oslo_service.fixture.SleepFixture
> for this purpose. It can be maintained opaquely as internals change without
> affecting its consumers.
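
For anyone who hasn't looked at it, consumers would use the fixture roughly
like this.  This is only a sketch: the test class is made up, and I'm going
from memory on the mock_wait attribute, so check the review for the exact
interface.

    import testtools

    from oslo_service import fixture as service_fixture


    class RetryingThingTestCase(testtools.TestCase):

        def setUp(self):
            super(RetryingThingTestCase, self).setUp()
            # Patches the looping call's internal wait so tests don't
            # actually sleep (or spin) between retry iterations.
            self.sleep_fx = self.useFixture(service_fixture.SleepFixture())

        def test_retries_do_not_sleep(self):
            # Drive the code under test that uses a looping call here; the
            # fixture's mock (self.sleep_fx.mock_wait, if I remember the
            # attribute name right) records the wait() calls the loop would
            # have made, in case the test wants to assert on them.
            pass
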
> 
> The main problem is that the patch brings new functionality to a stable
> branch (oslo.service rocky), but this patch helps fix the nova issue.
> 
> Also, the openstack proposal bot submitted a patch to requirements on stable
> rocky to update the oslo.service version to the latest version (1.31.6), but
> using it would break the CI
> ( https://review.openstack.org/#/c/618834/ ), since oslo.service 1.31.6 is
> incompatible with nova's stable/rocky unit tests due to the threading changes.
> 
> # Questions and proposed solutions
> 
> This thread tries to summarize the current situation.
> 
> We need to find a way to proceed, so this thread aims to let the teams
> discuss and find the best way to fix it.
> 
> 1. Do we need to continue trying to backport the fixture to oslo.service to
> fix the CI problem (https://review.openstack.org/#/c/617989/)?

Doing this is a violation of the stable policy.  I get that the new
feature is just a testing-only fixture, but that doesn't really matter:
it's still a feature.  To use it, consumers would need to raise
the minimum version of oslo.service in lower-constraints.txt, which is itself
a policy violation.

There is an additional complication: this backport adds fixtures to
requirements.txt for oslo.service.  At the very least this would mean
we're into a minor semver bump (1.32.X, which is already taken).  This
also means vendors need to ensure that there is a 'fixtures' package
available.  Now, I expect that all packagers have such a thing, but there
is a small chance that it exists as a build-only package and needs to be
exposed/published.  We've previously said to vendors we wouldn't do that
on stable branches.
 
> 2. Do we need to find another approach, like mocking
> oslo_service.loopingcall._Event.wait in nova instead of mocking
> oslo_service.loopingcall._ThreadingEvent.wait (example:
> https://review.openstack.org/#/c/616697/2/nova/tests/unit/compute/test_compute_mgr.py)?
> This is only a fix on the nova side, and it allows us to update the
> oslo.service requirements and fix the high CPU usage issue. I've submitted
> this patch (https://review.openstack.org/619246), which implements the
> approach described above.
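
Just to make the proposed swap concrete, it boils down to changing the mock
target in the affected unit tests, roughly as below.  This is not the actual
619246 change; the test method is invented, and the _ThreadingEvent / _Event
names are the ones mentioned above, so which of them exists depends on the
oslo.service version installed.

    from unittest import mock

    import testtools


    class ComputeRetryTestCase(testtools.TestCase):

        # Old target, tied to the pre-threading-change internals:
        #   @mock.patch('oslo_service.loopingcall._ThreadingEvent.wait')
        #
        # New target: patch the event the threading-based loop actually
        # waits on, so the interval sleep is skipped and the test neither
        # hangs nor burns CPU.
        @mock.patch('oslo_service.loopingcall._Event.wait')
        def test_retry_loop(self, mock_wait):
            # Drive the nova code under test that retries via a
            # FixedIntervalLoopingCall here.
            pass
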
> 
> Personally, I think we need to find another approach, like the mocking
> replacement (cf. 2).
> 
> We need to decide which way to go and discuss other solutions.

I think the only way forward is the revert, release and block path.  The
existing open reviews just add more policy violations.

Yours Tony.