Proposal to move cinder backup tests out of the integrated gate

Ghanshyam Mann gmann at ghanshyammann.com
Thu Apr 11 17:14:22 UTC 2019


 ---- On Thu, 11 Apr 2019 11:50:06 -0500 Matt Riedemann <mriedemos at gmail.com> wrote ----
 > On 12/12/2018 2:00 PM, Matt Riedemann wrote: 
 > > I wanted to send this separate from the latest gate status update [1]  
 > > since it's primarily about latent cinder bugs causing failures in the  
 > > gate for which no one is really investigating. 
 > >  
 > > Running down our tracked gate bugs [2] there are several related to  
 > > cinder-backup testing: 
 > >  
 > > * http://status.openstack.org/elastic-recheck/#1483434 
 > > * http://status.openstack.org/elastic-recheck/#1745168 
 > > * http://status.openstack.org/elastic-recheck/#1739482 
 > > * http://status.openstack.org/elastic-recheck/#1635643 
 > >  
 > > All of those bugs were reported a long time ago. I've done some  
 > > investigation into them (at least at the time of reporting) and some are  
 > > simply due to cinder-api using synchronous RPC calls to cinder-volume  
 > > (or cinder-backup) and that doesn't scale. This bug isn't a backup  
 > > issue, but it's definitely related to using RPC call rather than cast: 
 > >  
 > > http://status.openstack.org/elastic-recheck/#1763712 
 > >  
 > > Regarding the backup tests specifically, I don't see a reason why they  
 > > need to be run in the integrated gate jobs, e.g. tempest-full(-py3).  
 > > They don't involve other services, so in my opinion we should move the  
 > > backup tests to a separate job which only runs on cinder changes to  
 > > alleviate these latent bugs failing jobs for unrelated changes and  
 > > resetting the entire gate. 
 > >  
 > > I would need someone from the cinder team that is more involved in  
 > > knowing what their job setup looks like to identify a candidate job for  
 > > these tests if this is something everyone can agree on doing. 
 > >  
 > > [1]  
 > > http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000867.html  
 > >  
 > > [2] http://status.openstack.org/elastic-recheck/ 
 >  
 > This is an old thread but gmann recently skipping a cinder backup test  
 > which was failing a lot [1] prompted me to revisit this. 
 >  
 > As such I've proposed a change [2] which will disable the cinder-backup  
 > service in the tempest-full job which is in the integrated-gate project  
 > template and run by most projects. 

I agree with this as the end goal, but it will also skip all the backup tests which are running fine,
so let's wait until we move those tests to the cinder tempest plugin or run them in another integrated
job, etc.

 >  
 > There is a voting job running against cinder changes named  
 > "cinder-tempest-dsvm-lvm-lio-barbican" which will still test the backup  
 > service but it's not gating - it's up to the cinder team if they want to  
 > make that job gating. The other thing is it doesn't look like that job runs 
 > on glance (or swift) changes so if the cinder team is interested in  
 > co-gating changes between at least cinder and glance, they could add  
 > cinder-tempest-dsvm-lvm-lio-barbican to glance so it runs there and/or  
 > create a new cinder-backup job which just runs backup tests and gate on  
 > that in both cinder and glance. 

Initially, I was in favor of testing/running everything together, but on second thought,
and having seen how unstable tempest-full is, I agree with you that we should find some solution to make
integrated-gate template testing (tempest-full) more efficient and stable for each service.

Neutron also faces a lot of test failures due to volume backup or image tests, which are definitely
not related to neutron, and it is not worth blocking neutron development for that.

I have added this topic to the QA PTG etherpad to find the best possible solution.
- https://etherpad.openstack.org/p/qa-train-ptg 

-gmann

 >  
 > [1] https://review.openstack.org/#/c/651660/ 
 > [2] https://review.openstack.org/#/c/651865/ 
 >  
 > --  
 >  
 > Thanks, 
 >  
 > Matt 
 >  
 > 



