[openstack-dev] [qa] [neutron] Neutron Full Parallel job - Last 4 days failures

Salvatore Orlando sorlando at nicira.com
Tue Mar 25 08:17:13 UTC 2014


Replies inline.

Salvatore


On 24 March 2014 23:01, Matthew Treinish <mtreinish at kortar.org> wrote:

> On Mon, Mar 24, 2014 at 09:56:09PM +0100, Salvatore Orlando wrote:
> > Thanks a lot!
> >
> > We now need to get on these bugs, and define with QA an acceptable
> > failure rate criterion for switching the full job to voting.
> > It would be good to have a chance to only run the tests against code
> > which is already in master.
> > To this aim we might push a dummy patch, and keep it spinning in the
> > check queue.
>
> Honestly, there isn't really a number. I had a thread trying to get
> consensus on that back when I first made tempest run in parallel. What I
> ended up doing back then, and what we've done since for this kind of
> change, is to just pick a slower week for the gate and green-light it, of
> course after checking to make sure that if it blows up we're not blocking
> anything critical.


Then I guess the ideal period would be after the RC2s are cut.
Also, we'd need to run at least a postgres flavour of the job as well.
This means that the probability of a patch passing in the gate is actually
the combined probability of the two jobs completing successfully.
On another note, we noticed that the duplicated jobs currently executed for
redundancy in neutron all seem to point to the same build id.
I'm not sure, then, whether we're actually executing each job twice or just
duplicating lines in the Jenkins report.


> If it looks like it's passing at roughly the same rate as everything
> else and you guys think it's ready. 25% is definitely too high; for
> comparison, when I looked a couple of minutes ago at the numbers for the
> past 4 days on the equivalent job with nova-network, it only failed 4% of
> the time (12 out of 300). But that number does fluctuate quite a bit: for
> example, looking at the past week it grows to 11.6% (171 out of 1480).


Even at 11.6% I would not enable it.
Running both the mysql and pg jobs would give us a combined success rate of
78.1%, which pretty much means the chance of successfully clearing a 5-deep
queue in the gate would be a mere 29%. My "gut" metric is that we should
achieve a pass rate which allows us to clear a 10-deep gate queue with a
50% success rate. This translates to a 3.5% failure rate per job, which is
indeed in line with what's currently observed for nova-network.
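
For reference, a quick back-of-the-envelope sketch of that arithmetic
(plain Python; the only input is the 11.6% failure rate quoted above):

    # Per-job failure rate observed over the past week (11.6%)
    fail_rate = 0.116

    # A patch passes only if both the mysql and pg jobs pass
    combined_success = (1 - fail_rate) ** 2   # ~0.781

    # Chance of a 5-deep gate queue clearing without a reset
    print(combined_success ** 5)              # ~0.29

    # Target: clear a 10-deep queue with 50% probability, i.e. a
    # per-patch success rate of 0.5 ** 0.1, split evenly across
    # the two jobs
    print(1 - (0.5 ** 0.1) ** 0.5)            # ~0.034, i.e. ~3.5%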

> Doing it this way doesn't seem like the best approach, but until it's
> gating, things really don't get the attention they deserve and more bugs
> will just slip in while you wait. There will most likely be initial pain
> after it merges, but it's the only real way to lock it down and make
> forward progress.
>

> -Matt Treinish
>
> >
> >
> > On 24 March 2014 21:45, Rossella Sblendido <rsblendido at suse.com> wrote:
> >
> > > Hello all,
> > >
> > > here is an update regarding the Neutron full parallel job.
> > > I used the following Logstash query [1], which checks the failures of
> > > the last 4 days (the last bug fix related to the full job was merged
> > > 4 days ago). These are the results:
> > >
> > > 123 failures (25% of the total)
> > >
> > > I took a sample of 50 failures and obtained the following breakdown:
> > >
> > > 22% legitimate failures (due to the code change introduced by the
> > > patch)
> > > 22% infra issues
> > > 12% https://bugs.launchpad.net/openstack-ci/+bug/1291611
> > > 12% https://bugs.launchpad.net/tempest/+bug/1281969
> > > 8% https://bugs.launchpad.net/tempest/+bug/1294603
> > > 3% https://bugs.launchpad.net/neutron/+bug/1283522
> > > 3% https://bugs.launchpad.net/neutron/+bug/1291920
> > > 3% https://bugs.launchpad.net/nova/+bug/1290642
> > > 3% https://bugs.launchpad.net/tempest/+bug/1252971
> > > 3% https://bugs.launchpad.net/horizon/+bug/1257885
> > > 3% https://bugs.launchpad.net/tempest/+bug/1292242
> > > 3% https://bugs.launchpad.net/neutron/+bug/1277439
> > > 3% https://bugs.launchpad.net/neutron/+bug/1283599
> > >
> > > cheers,
> > >
> > > Rossella
> > >
> > > [1] http://logstash.openstack.org/#eyJzZWFyY2giOiJidWlsZF9uYW1lOiBcImNoZWNrLXRlbXBlc3QtZHN2bS1uZXV0cm9uLWZ1bGxcIiBBTkQgbWVzc2FnZTpcIkZpbmlzaGVkOiBGQUlMVVJFXCIgQU5EIHRhZ3M6Y29uc29sZSIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiY3VzdG9tIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7ImZyb20iOiIyMDE0LTAzLTIwVDEzOjU0OjI1KzAwOjAwIiwidG8iOiIyMDE0LTAzLTI0VDEzOjU0OjI1KzAwOjAwIiwidXNlcl9pbnRlcnZhbCI6IjAifSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIiLCJzdGFtcCI6MTM5NTY3MDY2ODc0OX0=
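> > > (decoded, the query filters on build_name:"check-tempest-dsvm-neutron-full"
> > > AND message:"Finished: FAILURE" AND tags:console, over the window
> > > 2014-03-20T13:54:25 to 2014-03-24T13:54:25 UTC)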
> > >