<div dir="ltr"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, May 7, 2019 at 12:31 AM Tim Burke &lt;<a href="mailto:tim@swiftstack.com">tim@swiftstack.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">


    <div class="gmail-m_620502633609308714moz-cite-prefix">On 5/5/19 12:18 AM, Ghanshyam Mann

      wrote:<br>

    </div>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">The current integrated-gate jobs (tempest-full) are not very stable because of various bugs, especially timeouts. We tried
to improve this by moving the slow tests into a separate tempest-slow job, but the situation has not improved much.

We talked about ideas to make the gate more stable and faster for projects, especially when a failure is not
related to the project under test. We are planning to split the integrated-gate template (only the tempest-full job as a
first step) per related services.

Idea:

- Run only dependent service tests on project gate.</pre>

    </blockquote>

    I love this plan already.<br>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">- Tempest gate will keep running all the services' tests as the integrated gate in a centralized place, without any change to the current job.

- Each project can run the template mentioned below.

- All templates below will be defined and maintained by the QA team.</pre>

    </blockquote>

    My biggest regret is that I couldn't figure out how to do this

    myself. Much thanks to the QA team!<br>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">I would like to go over each of the 6 services that run integrated-gate jobs:

1. "Integrated-gate-networking" (job to run on neutron gate)

Tests to run in this template: neutron APIs, nova APIs, keystone APIs? All scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial)

Improvement for neutron gate: exclude the cinder API tests, glance API tests, swift API tests.

2. "Integrated-gate-storage" (job to run on cinder gate, glance gate)

Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, Nova APIs and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial)

Improvement for cinder, glance gate: exclude the neutron API tests, Keystone API tests.

3. "Integrated-gate-object-storage" (job to run on swift gate)

Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial)

Improvement for swift gate: exclude the neutron API tests, Keystone API tests, Nova API tests.</pre>

    </blockquote>

    This sounds great. My only question is why Cinder tests are still

    included, but I trust that it's there for a reason and I'm just

    revealing my own ignorance of Swift's consumers, however removed.<br>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">Note: swift does not run integrated-gate as of now.</pre>

    </blockquote>

    <p>Correct, and for all the reasons that you're seeking to address.

      Some eight months ago I'd gotten tired of seeing spurious failures

      that had nothing to do with Swift, and I was hard pressed to find

      an instance where the tempest tests caught a regression or

      behavior change that wasn't already caught by Swift's own

      functional tests. In short, the signal-to-noise ratio for those

      particular tests was low enough that a failure only told me "you

      should leave a recheck comment," so I proposed

      <a class="gmail-m_620502633609308714moz-txt-link-freetext" href="https://review.opendev.org/#/c/601813/" target="_blank">https://review.opendev.org/#/c/601813/</a> . There was also a side

      benefit of having our longest-running job change from

      legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to

      swift-probetests-centos-7 (at ~30 minutes), tightening developer

      feedback loops.</p>

    <p>It sounds like this proposal addresses both concerns: by reducing

      the scope of tests to what might actually exercise the Swift API

      (if indirectly), the signal-to-noise ratio should be much better

      and the wall-clock time will be reduced.<br>

    </p>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">4. "Integrated-gate-compute" (job to run on Nova gate)

Tests to run in this template: Nova APIs, Cinder APIs, Glance APIs?, neutron APIs and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial)

Improvement for Nova gate: exclude the swift API tests (not running in the current job, but in future they might), Keystone API tests.

5. "Integrated-gate-identity" (job to run on keystone gate)

Tests to run in this template: all. As all projects use keystone, we might need to run all tests as they run in integrated-gate now.

But is keystone being used differently by each service? If not, is it enough to run only a single service's tests, say Nova or neutron?

6. "Integrated-gate-placement" (job to run on placement gate)

Tests to run in this template: Nova API tests, Neutron API tests + scenario tests + any new service that depends on placement APIs

Improvement for placement gate: exclude the glance API tests, cinder API tests, swift API tests, keystone API tests.

Thoughts on this approach?

The important point is that we must not lose the coverage of integrated testing per project. So I would like to
get each project's view on whether we are missing any dependency (proposed test removals) in the templates proposed above.</pre>

    </blockquote>

    As far as Swift is aware, these dependencies seem accurate; at any

    rate, *we* don't use anything other than Keystone, even by way of

    another API. Further, Swift does not use particularly esoteric

    Keystone APIs; I would be OK with integrated-gate-identity not

    exercising Swift's API with the assumption that some other (or

    indeed, almost *any* other) service would likely exercise the parts

    that we care about.<br>

    <blockquote type="cite">

      <pre class="gmail-m_620502633609308714moz-quote-pre">- <a class="gmail-m_620502633609308714moz-txt-link-freetext" href="https://etherpad.openstack.org/p/qa-train-ptg" target="_blank">https://etherpad.openstack.org/p/qa-train-ptg</a> 

-gmann

</pre></blockquote></div></blockquote><div><br></div><div>While I'm all for limiting the scope Tempest targets for each patch to save time and our precious infra resources, I have a feeling that we might end up missing something here. Honestly, I'm not sure what that something would be, and maybe it's just me thinking about the scopes the wrong way around.</div><div><br></div><div>For example:</div><div><pre class="gmail-m_620502633609308714moz-quote-pre">4. "Integrated-gate-compute" (job to run on Nova gate)<br></pre><pre class="gmail-m_620502633609308714moz-quote-pre">I'm not exactly sure what any given Nova patch would be able to break in Cinder, Glance or Neutron or, for number 2, what Swift depends on in Glance and Cinder that we could break when we introduce a change.<br><br></pre><pre class="gmail-m_620502633609308714moz-quote-pre">Shouldn't we be asking "Which projects are consuming service X?" and targeting those Tempest tests? From Glance's perspective this would be (among core projects) Glance, Cinder and Nova; Cinder is probably interested in Cinder, Glance and Nova (anyone else consuming Cinder?), etc.<br><br></pre><pre class="gmail-m_620502633609308714moz-quote-pre">I'd like to propose an approach where we define these jobs and run them in check to start with, and let gate run the full suites until we figure out whether we are catching something in gate that we did not catch in check. Once we have reached the understanding that we have sufficient coverage, we can go ahead and switch gate to those jobs as well. This approach would give us the benefit where the impact is highest until we are confident we got the coverage right.
I think the biggest issue is that, for the transition period, _everyone_ needs to understand that gate might catch something check did not, and a simple "recheck" might not be sufficient when tempest succeeded in check but failed in gate.<br><br></pre><pre class="gmail-m_620502633609308714moz-quote-pre">Best,<br></pre><pre class="gmail-m_620502633609308714moz-quote-pre">Erno "jokke_" Kuvaja<br></pre></div></div></div>