---- On Thu, 13 Dec 2018 05:00:25 +0900 Matt Riedemann <mriedemos@gmail.com> wrote ----
I wanted to send this separately from the latest gate status update [1] since it's primarily about latent cinder bugs that are causing failures in the gate and that no one is really investigating.
Running down our tracked gate bugs [2] there are several related to cinder-backup testing:
* http://status.openstack.org/elastic-recheck/#1483434
* http://status.openstack.org/elastic-recheck/#1745168
* http://status.openstack.org/elastic-recheck/#1739482
* http://status.openstack.org/elastic-recheck/#1635643
I agree that those are long-pending bugs, but they do not seem to occur very frequently. The first two have occurred ~20 times in the last 10 days, and the last two even less often.
All of those bugs were reported a long time ago. I've done some investigation into them (at least at the time of reporting) and some are simply due to cinder-api using synchronous RPC calls to cinder-volume (or cinder-backup) and that doesn't scale. This bug isn't a backup issue, but it's definitely related to using RPC call rather than cast:
http://status.openstack.org/elastic-recheck/#1763712
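To illustrate the call-versus-cast difference mentioned above, here is a minimal sketch in plain Python. It does not use oslo.messaging or any cinder code: `Worker`, `call`, `cast`, and `demo` are hypothetical names, and the sleep and timeout values are arbitrary. It only shows how a blocking call queued behind slow work can time out while casts return immediately.

```python
import queue
import threading
import time


class Worker:
    """Hypothetical stand-in for a backend service (not real cinder code)."""

    def __init__(self, work_seconds):
        self.work_seconds = work_seconds
        self.inbox = queue.Queue()
        self.done = []
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Process requests one at a time, in arrival order.
        while True:
            name, reply = self.inbox.get()
            time.sleep(self.work_seconds)  # simulate slow backend work
            self.done.append(name)
            if reply is not None:
                reply.put(name + " finished")
            self.inbox.task_done()

    def call(self, name, timeout):
        """Synchronous style: block until a reply arrives or the timeout hits."""
        reply = queue.Queue(maxsize=1)
        self.inbox.put((name, reply))
        return reply.get(timeout=timeout)  # raises queue.Empty on timeout

    def cast(self, name):
        """Asynchronous style: enqueue the request and return immediately."""
        self.inbox.put((name, None))


def demo():
    worker = Worker(work_seconds=0.2)

    start = time.monotonic()
    for i in range(3):
        worker.cast(f"cast-{i}")
    cast_elapsed = time.monotonic() - start  # casts do not wait for the work

    # This call sits behind three queued casts, so the reply takes longer
    # than the (arbitrary) 0.3 second timeout allows.
    try:
        worker.call("call-0", timeout=0.3)
        timed_out = False
    except queue.Empty:
        timed_out = True

    worker.inbox.join()  # let the worker drain its queue before returning
    return cast_elapsed, timed_out, list(worker.done)


if __name__ == "__main__":
    print(demo())
```

Note that the work behind the timed-out call still completes on the worker; only the caller gives up waiting. That is the sense in which a synchronous API-side call is fragile under load.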
Regarding the backup tests specifically, I don't see a reason why they need to be run in the integrated gate jobs, e.g. tempest-full(-py3). They don't involve other services, so in my opinion we should move the backup tests to a separate job which only runs on cinder changes, so that these latent bugs stop failing jobs for unrelated changes and resetting the entire gate.
I would need someone from the cinder team who knows more about what their job setup looks like to identify a candidate job for these tests, if this is something everyone can agree on doing.
Also, I would like to know whether cinder backup is a standard feature (including snapshot backup, etc.). There is no harm in testing those in the integrated job. But I agree that if those tests/features are not stable, then we can skip or remove them from integrated gate testing and let cinder test them on their gate in a specific job until they are stable. As you mentioned, let's wait for the Cinder team to respond on these.
-gmann
[1] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000867....
[2] http://status.openstack.org/elastic-recheck/
--
Thanks,
Matt