2013.2.2 exception requests
Hi all, I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now: * https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana! * https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions. Cheers, Alan
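For reference, a "reserved" migration slot in the Nova style is nothing more than a no-op script holding the number on master, so the real migration can later land under that number on the stable branch. A minimal sketch, assuming sqlalchemy-migrate conventions; the file name (e.g. 035_havana_backport_placeholder.py) is illustrative, not the actual Keystone code:

    def upgrade(migrate_engine):
        # Intentionally empty: this number is reserved so a stable/havana
        # backport can ship the real migration as 035 without renumbering
        # anything on master.
        pass

    def downgrade(migrate_engine):
        # Nothing to undo for a placeholder migration.
        pass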
On Fri, Jan 31, 2014 at 11:33:09AM +0100, Alan Pevec wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules.
This one is for keystone: https://review.openstack.org/#/c/69514/ - the Keystone trusts list API is broken. In theory this changes the interface by removing roles from the list responses, but the current response is a 500 error in all cases, so nobody can be relying on the current response format. This also aligns the API with what is documented. Steve
On Fri, Jan 31, 2014 at 5:33 AM, Alan Pevec <apevec@gmail.com> wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now:
* https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana!
* https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions.
I raised this in another thread, but I'll follow up here so we have all of the requests in one place: I'm particularly interested in https://review.openstack.org/#/c/66149/ as a fix for https://bugs.launchpad.net/keystone/+bug/1251123 DreamHost is using this patch with havana. Without the patch, we had to use the sql backend for tokens in order to achieve reasonable performance (we have a few poorly-behaved user scripts with a rather large number of tokens). Doug
Cheers, Alan
On Tue, Feb 4, 2014 at 11:42 AM, Doug Hellmann <doug.hellmann@dreamhost.com>wrote:
On Fri, Jan 31, 2014 at 5:33 AM, Alan Pevec <apevec@gmail.com> wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now:
* https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana!
* https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions.
I raised this in another thread, but I'll follow up here so we have all of the requests in one place:
I'm particularly interested in https://review.openstack.org/#/c/66149/ as a fix for https://bugs.launchpad.net/keystone/+bug/1251123
DreamHost is using this patch with havana. Without the patch, we had to use the sql backend for tokens in order to achieve reasonable performance (we have a few poorly-behaved user scripts with a rather large number of tokens).
+++ Morgan just brought to my attention that the severity of bug 1251123 was originally underestimated, and it has now been upgraded to Critical: https://bugs.launchpad.net/keystone/+bug/1251123/comments/8
Doug
Cheers, Alan
On 31/01/14 11:33 +0100, Alan Pevec wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now:
* https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana!
* https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions.
I'd like to request an exception for [0]. This is an important - quite invasive - cinder fix. The issue arises when the driver is not initialized correctly - or goes down - which makes volume operations on that node fail without updating the volume status correctly. I hesitated a bit before proposing it for stable/havana, but I believe it can be a show-stopper in some cases and, most importantly, it causes a lot of frustration for end users. Several bug reports have been filed because of this. After talking to John, Eric and Huang, we agreed it is a good backport. [0] https://review.openstack.org/#/c/67097/
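A rough sketch of the idea (illustrative only, not the actual cinder patch; the self.driver/self.db attribute names are assumptions): guard each volume operation on driver readiness, and record an error status instead of failing silently when the driver is down.

    def guarded(op):
        # Wrap a volume operation so an uninitialized (or crashed)
        # driver marks the volume 'error' rather than leaving its
        # status untouched when the operation fails.
        def wrapper(self, context, volume_id, *args, **kwargs):
            if not self.driver.initialized:
                self.db.volume_update(context, volume_id,
                                      {'status': 'error'})
                raise RuntimeError('volume driver not initialized')
            return op(self, context, volume_id, *args, **kwargs)
        return wrapper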
Cheers, Alan
-- @flaper87 Flavio Percoco
On 31/01/14 11:33 +0100, Alan Pevec wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now:
* https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana!
* https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions.
I'd also like to request an exception for [0]. This patch fixes the simple space quota feature in glance. The feature, as it stands, is broken since it counts deleted, killed and pending_delete images. This is a non-invasive fix that makes the feature usable. [0] https://review.openstack.org/#/c/63455/
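Roughly, the fix amounts to excluding those statuses when totalling a user's storage for the quota check. A sketch of the idea only, not the actual glance code; the helper and its inputs are illustrative:

    EXCLUDED_STATUSES = ('deleted', 'killed', 'pending_delete')

    def user_storage_usage(images):
        # 'images' is any iterable of objects with .status and .size;
        # skip images that no longer consume usable space so the quota
        # check only counts real usage.
        return sum(img.size or 0
                   for img in images
                   if img.status not in EXCLUDED_STATUSES)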
Cheers, Alan
-- @flaper87 Flavio Percoco
Hi, I would like to request an FFE for neutron: https://review.openstack.org/#/c/71859/. This patch is a workaround for a bug where an external network can become invisible to non-admin users after neutron-server is restarted (https://bugs.launchpad.net/neutron/+bug/1254555). Once an external network becomes invisible, non-admin users cannot connect a router to the outside and cannot allocate new floating IPs. This badly breaks usability. According to the analysis of the cause, the more neutron agents we have, the more likely this bug occurs, which means it usually happens in production environments (when neutron-server is restarted). The patch was merged in master two months ago and we have confirmed that it works well. I believe this patch is low risk and fixes a user-visible issue. Thanks, Akihiro On Fri, Jan 31, 2014 at 7:33 PM, Alan Pevec <apevec@gmail.com> wrote:
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules. I've two for now:
* https://review.openstack.org/69884 - Keystone: significant performance improvement. It includes db migrations, which are verboten in general, so my first reaction was -2. But on closer look, I think this is acceptable, since even without running the migrations the code will work after upgrade. The migrations are numbered sequentially after the last one in Havana (034), so 035-036 could be treated as "reserved for Havana backports", like Nova did in https://github.com/openstack/nova/commit/ab2c467da951071a8aac4eb6ca032371c69... Of course that means no more db migrations in Keystone Havana!
* https://review.openstack.org/70016 - Horizon: session data type change - I'm not a Django expert, so I'd like input from the Horizon team to verify Kieran's claims that it won't break existing sessions.
Cheers, Alan
Hi, all exception requests seem sensible to me and will be approved when the havana gate is unblocked. Mark's devstack backport https://review.openstack.org/71401 is the only fix left that I know about. Is there anything else missing? Cheers, Alan
Alan Pevec wrote:
all exception requests seem sensible to me and will be approved when the havana gate is unblocked. Mark's devstack backport https://review.openstack.org/71401 is the only fix left that I know about. Is there anything else missing?
Maybe the Glance Security patch @ https://review.openstack.org/#/c/71643/ -- Thierry Carrez (ttx)
Alan Pevec wrote:
Is there anything else missing?
Someone proposed on the -dev list: https://review.openstack.org/#/c/68601/ -- Thierry Carrez (ttx)
2014-02-12 11:57 GMT+01:00 Thierry Carrez <thierry@openstack.org>:
Alan Pevec wrote:
Is there anything else missing?
Someone proposed on the -dev list: https://review.openstack.org/#/c/68601/
Yep, I've redirected it here - looks like .3 is fine for that one. Cheers, Alan
----- Original Message -----
From: "Alan Pevec" <apevec@gmail.com> To: "openstack-stable-maint" <openstack-stable-maint@lists.openstack.org> Sent: Friday, January 31, 2014 11:33:09 AM Subject: [Openstack-stable-maint] 2013.2.2 exception requests
Hi all,
I'm opening the thread to discuss proposed exceptions for the stable/havana, either breaking the freeze or breaking the backport rules.
I'd like to request an exception for https://review.openstack.org/72575 - instances using the RBD backend cannot be created right now because a parameter is missing in the libvirt_info method redefinition. Thanks, --- xqm
2014-02-11 15:44 GMT+01:00 Xavier Queralt-Mateu <xqueralt@redhat.com>:
I'd like to request an exception for https://review.openstack.org/72575 - instances using the RBD backend cannot be created right now because a parameter is missing in the libvirt_info method redefinition.
The fix itself is fine, but the bug is only Medium and has been open for a long time - does it really need to be included as an exception, or would 2013.2.3 be fine? I'll wait for Nova stable-maint members to review this. Cheers, Alan
Hi, I'd like to request the following freeze break: https://review.openstack.org/#/c/72754/ This one-liner patch makes a significant difference in performance when instantiating multiple nova instances at once if they use metadata from neutron-metadata-agent (e.g. instances that get their configuration from cloud-init). The patch is Thanks, /Ihar
That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing? That should be the last exception; only regressions found during testing can be approved at this point. Cheers, Alan
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/ That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though." Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs? Thanks, Alan
On 2/13/14 12:44 AM, "Alan Pevec" <apevec@gmail.com> wrote:
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/
That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though."
Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs?
I will backport the SSH change that I made in Tempest. The purpose of that change was to get more visibility into the problems that occur. I hope it will help. Thanks Gary
Thanks, Alan
Retrying the ssh connection on all ssh exceptions may help. It is possible that the ssh server causes this type of exception while the key or the ssh service is being configured by cloud-init. It can also hide a temporary network black-hole issue. These are not scientifically proven things, but see https://review.openstack.org/#/c/73186/. NOTE: we have been using the same ssh code to make connections in nova-network jobs for a long time. The other mentioned changes probably do not have an impact on stability; they mainly improve the logging of failures. 9f756a081533b55f212221ea5de8ed968acea273 and the following patches might decrease the load on the l3 agent, but they would be more difficult to backport. I do not remember anything else in tempest that might help make the stable/havana neutron jobs more stable. Best Regards, Attila
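A retry-on-SSHException loop of the sort described above might look like the following - a sketch only, assuming paramiko; the function name, timeouts and attempt counts are illustrative, not the actual tempest code:

    import time

    import paramiko

    def connect_with_retries(host, username, pkey, attempts=5, delay=2):
        # Retry the whole connect on any paramiko SSHException; this
        # covers auth/banner failures seen while sshd and the keys are
        # still being set up on a freshly booted instance.
        for attempt in range(1, attempts + 1):
            client = paramiko.SSHClient()
            client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            try:
                client.connect(host, username=username, pkey=pkey,
                               timeout=10)
                return client
            except paramiko.SSHException:
                client.close()
                if attempt == attempts:
                    raise
                time.sleep(delay)

----- Original Message -----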
From: "Alan Pevec" <apevec@gmail.com> To: "Gary Kotton" <gkotton@vmware.com>, "Attila Fazekas" <afazekas@redhat.com>, "Joe Gordon" <joe.gordon0@gmail.com>, "David Kranz" <dkranz@redhat.com>, mtreinish@kortar.org, "Sean Dague" <sean@dague.net> Cc: "openstack-stable-maint" <openstack-stable-maint@lists.openstack.org> Sent: Wednesday, February 12, 2014 11:44:58 PM Subject: Re: [Openstack-stable-maint] 2013.2.2 exception requests
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/ That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though."
Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs?
Thanks, Alan
Hi, see below. ----- Original Message -----
Retrying the ssh connection on all ssh exceptions may help.
It is possible that the ssh server causes this type of exception while the key or the ssh service is being configured by cloud-init.
First, the tests don't use cloud-init based images to start new nova instances. Cirros images use a similar, but different, service to set the instance up. See: http://bazaar.launchpad.net/~smoser/cirros/trunk/view/head:/src/sbin/cirros-... The fix in question is for neutron-metadata-agent, and it was not hit by any requests from the new instance created by tempest, meaning the instance either failed to run or the network connection was not properly established. The nova-api log shows that the new nova instance's state is polled for some time (~6 mins), but its port is always in DOWN state.
It can also hide a temporary network black-hole issue.
The instance is created at ~00:59:??, and the test fails at ~01:06:??, so it's hardly temporary.
These are not scientifically proven things, but see https://review.openstack.org/#/c/73186/.
NOTE: we have been using the same ssh code to make connections in nova-network jobs for a long time.
That review catches another exception type (SSHException). Does it mean that, if that were our issue, we would see SSHException tracebacks in the tempest log? There's no such thing there.
The other mentioned changes probably do not have an impact on stability; they mainly improve the logging of failures.
9f756a081533b55f212221ea5de8ed968acea273 and the following patches might decrease the load on the l3 agent, but they would be more difficult to backport.
I do not remember anything else in tempest that might help make the stable/havana neutron jobs more stable.
There was also some bug in file injection into a new instance in the gate that made ssh sessions fail. Something related to guestfs, but I don't know all the details. Adding Russell to Cc since he may have more info on this.
Best Regards, Attila
----- Original Message -----
From: "Alan Pevec" <apevec@gmail.com> To: "Gary Kotton" <gkotton@vmware.com>, "Attila Fazekas" <afazekas@redhat.com>, "Joe Gordon" <joe.gordon0@gmail.com>, "David Kranz" <dkranz@redhat.com>, mtreinish@kortar.org, "Sean Dague" <sean@dague.net> Cc: "openstack-stable-maint" <openstack-stable-maint@lists.openstack.org> Sent: Wednesday, February 12, 2014 11:44:58 PM Subject: Re: [Openstack-stable-maint] 2013.2.2 exception requests
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/ That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though."
Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs?
Thanks, Alan
I agree; in this specific case it will not help. Maybe in 10-100 ppm of the failure cases it would help. ----- Original Message -----
From: "Ihar Hrachyshka" <ihrachys@redhat.com> To: "Attila Fazekas" <afazekas@redhat.com> Cc: "Alan Pevec" <apevec@gmail.com>, "Joe Gordon" <joe.gordon0@gmail.com>, "openstack-stable-maint" <openstack-stable-maint@lists.openstack.org>, "Sean Dague" <sean@dague.net>, "Russell Bryant" <rbryant@redhat.com> Sent: Thursday, February 13, 2014 1:09:16 PM Subject: Re: [Openstack-stable-maint] 2013.2.2 exception requests
Hi, see below.
----- Original Message -----
Retrying the ssh connection on all ssh exceptions may help.
It is possible that the ssh server causes this type of exception while the key or the ssh service is being configured by cloud-init.
First, the tests don't use cloud-init based images to start new nova instances. Cirros images use a similar, but different, service to set the instance up. See: http://bazaar.launchpad.net/~smoser/cirros/trunk/view/head:/src/sbin/cirros-...
The fix in question is for neutron-metadata-agent, and it was not hit by any requests from the new instance created by tempest, meaning the instance either failed to run or the network connection was not properly established. The nova-api log shows that the new nova instance's state is polled for some time (~6 mins), but its port is always in DOWN state.
It can also hide a temporary network black-hole issue.
The instance is created at ~00:59:??, and the test fails at ~01:06:??, so it's hardly temporary.
These are not scientifically proven things, but see https://review.openstack.org/#/c/73186/.
NOTE: we have been using the same ssh code to make connections in nova-network jobs for a long time.
That review catches another exception type (SSHException). Does it mean that, if that were our issue, we would see SSHException tracebacks in the tempest log? There's no such thing there.
The other mentioned changes probably do not have an impact on stability; they mainly improve the logging of failures.
9f756a081533b55f212221ea5de8ed968acea273 and the following patches might decrease the load on the l3 agent, but they would be more difficult to backport.
I do not remember anything else in tempest that might help make the stable/havana neutron jobs more stable.
There was also some bug in file injection into a new instance in the gate that made ssh sessions fail. Something related to guestfs, but I don't know all the details. Adding Russell to Cc since he may have more info on this.
Best Regards, Attila
----- Original Message -----
From: "Alan Pevec" <apevec@gmail.com> To: "Gary Kotton" <gkotton@vmware.com>, "Attila Fazekas" <afazekas@redhat.com>, "Joe Gordon" <joe.gordon0@gmail.com>, "David Kranz" <dkranz@redhat.com>, mtreinish@kortar.org, "Sean Dague" <sean@dague.net> Cc: "openstack-stable-maint" <openstack-stable-maint@lists.openstack.org> Sent: Wednesday, February 12, 2014 11:44:58 PM Subject: Re: [Openstack-stable-maint] 2013.2.2 exception requests
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/ That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though."
Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs?
Thanks, Alan
On Wed, Feb 12, 2014 at 11:44:58PM +0100, Alan Pevec wrote:
Copying authors of the tempest patches referenced below + a few Tempest core members who might be interested.
https://review.openstack.org/#/c/72754/ That's a good candidate for an exception, and I see Neutron stable-maint members have already approved it, but it's failing the *-isolated gate jobs. I'll try throwing the dice a few more times, but could someone familiar have a look? What are those jobs doing?
Ihar commented in the review: " I suspect tempest lacks some of those ssh.py fixes from master: c3128c085c2635d82c4909d1be5d016df4978632 ad7ef7d1bdd98045639ee4045144c8fe52853e76 31a91a605a25f578b51a7bed2df8fde5c5f49ffc I'm not sure this would be enough to stabilize gate though."
Gary, Attila, Joe - would you like to backport your patches to stable/havana Tempest? Do you agree they should improve gate stability and is there anything else to be backported to stabilize *-isolated gate jobs?
So honestly, I'd probably recommend disabling the isolated neutron gate jobs for the stable/havana branch. Getting those working stably has been a goal we've been working on for the whole cycle, which included many improvements on the neutron side, more recently:
https://review.openstack.org/#/c/61964/
https://review.openstack.org/#/c/63100/
https://review.openstack.org/#/c/63558/
https://review.openstack.org/#/c/66736/
https://review.openstack.org/#/c/65838/
https://review.openstack.org/#/c/66899/
https://review.openstack.org/#/c/66928/
https://review.openstack.org/#/c/67475/
and numerous changes to tempest as well:
https://review.openstack.org/#/c/66871/
https://review.openstack.org/#/c/66970/
https://review.openstack.org/#/c/67218/
https://review.openstack.org/#/c/67172/
https://review.openstack.org/#/c/67108/
https://review.openstack.org/#/c/64217/
https://review.openstack.org/#/c/67371/
Those are just the patches from the Montreal code sprint to improve things, so there was definitely more that was done to improve the isolated jobs. There were also infra/nova changes required to disable file injection, which fixed a kernel panic that was also breaking the isolated jobs on neutron. Only recently, as in the last week, has the isolated job been stable enough on master (after the file injection changes) that I enabled it for all the neutron jobs and removed the isolated jobs. I think most of this work (probably not the file injection on the infra side) is too large for backporting. So we should probably try disabling file injection on the jobs and see if that improves things; if it doesn't, then I really think we should just pull the isolated jobs. -Matt Treinish
participants (12)
- Akihiro Motoki
- Alan Pevec
- Attila Fazekas
- Dolph Mathews
- Doug Hellmann
- Flavio Percoco
- Gary Kotton
- Ihar Hrachyshka
- Matthew Treinish
- Steven Hardy
- Thierry Carrez
- Xavier Queralt-Mateu