<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 3, 2017 at 11:53 PM, Emilien Macchi <span dir="ltr"><<a href="mailto:emilien@redhat.com" target="_blank">emilien@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">(cross-posting)<br>

<br>

I've seen a bunch of interesting thoughts here.<br>

The most relevant feedback I've seen so far:<br>

<br>

- TripleO folks want to keep testing fast and efficient.<br>

- Tempest folks understand this problem and are willing to collaborate.<br>

<br>

I propose that we move forward and experiment with using Tempest in<br>

TripleO CI for one job that could be experimental or non-voting to<br>

start.<br>

Instead of running the Pingtest, we would execute a Tempest Scenario<br>

that boots an instance from a volume (as the pingtest already does)<br>

and see how it goes (in terms of coverage and runtime).<br>

I volunteer to kick off the work with someone more expert than I am<br>

with quickstart (Arx maybe?).<br>

<br></blockquote><div><br></div><div>Sure, let's work on that :)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Another iteration could be to start building an easy interface to<br>

select which Tempest tests we want a TripleO CI job to run and plug it<br>

into our CI tooling (tripleo-quickstart, I presume).<br>

I also hear some feedback about keeping the pingtest alive for some<br>

use cases, and I agree we could keep some CI jobs to run the pingtest<br>

when it makes more sense (when we want to test Heat for example, or<br>

just maintain it for developers who use it).<br>

<br>

How does it sound? Please share your feedback.<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

On Tue, Apr 18, 2017 at 7:41 AM, Attila Fazekas <<a href="mailto:afazekas@redhat.com">afazekas@redhat.com</a>> wrote:<br>

><br>

><br>

> On Tue, Apr 18, 2017 at 11:04 AM, Arx Cruz <<a href="mailto:arxcruz@redhat.com">arxcruz@redhat.com</a>> wrote:<br>

>><br>

>><br>

>><br>

>> On Tue, Apr 18, 2017 at 10:42 AM, Steven Hardy <<a href="mailto:shardy@redhat.com">shardy@redhat.com</a>> wrote:<br>

>>><br>

>>> On Mon, Apr 17, 2017 at 12:48:32PM -0400, Justin Kilpatrick wrote:<br>

>>> > On Mon, Apr 17, 2017 at 12:28 PM, Ben Nemec <<a href="mailto:openstack@nemebean.com">openstack@nemebean.com</a>><br>

>>> > wrote:<br>

>>> > > Tempest isn't really either of those things.  According to another<br>

>>> > > message<br>

>>> > > in this thread it takes around 15 minutes to run just the smoke<br>

>>> > > tests.<br>

>>> > > That's unacceptable for a lot of our CI jobs.<br>

>>> ><br>

>><br>

>><br>

>> I'd rather spend 15 minutes running tempest than add a regression or a new<br>

>> bug, which has already happened in the past.<br>

>><br>

> The smoke tests might not be the best test selection anyway; you should pick<br>

> some scenarios which, for example, take snapshots<br>

> of images and volumes. Yes, these are the slow ones,<br>

> but they can run in parallel.<br>

><br>

> Very likely you do not really want to run all tempest tests, but a 10~20 minute<br>

> run time<br>

> sounds reasonable for a sanity test.<br>

><br>

> The tempest config utility should also be extended with some parallel<br>

> capability,<br>

> and should be able to use already downloaded (part of the image) resources.<br>

><br>

> The Tempest/testr/subunit worker balance is not always the best;<br>

> technically it would be possible to do dynamic balancing, but it would require<br>

> a lot of work.<br>

> Let me know when it becomes the main concern, I can check what can/cannot be<br>

> done.<br>

><br>

><br>

>><br>

>>><br>

>>> > Ben, is the issue merely the time it takes? Is it the effect that time<br>

>>> > taken has on hardware availability?<br>

>>><br>

>>> It's both, but the main constraint is the infra job timeout, which is<br>

>>> about<br>

>>> 2.5hrs - if you look at our current jobs many regularly get close to (and<br>

>>> sometimes exceed) this, so we just don't have the time budget available<br>

>>> to<br>

>>> run exhaustive tests on every commit.<br>

>><br>

>><br>

>> We have a green light from infra to increase the job timeout to 5 hours; we<br>

>> do that in our periodic full tempest job.<br>

><br>

><br>

> Sounds good, but I am afraid it could hurt more than help: it could delay<br>

> other fixes by a lot,<br>

> especially if we get some extra flakiness because of foobar.<br>

><br>

> You cannot have all possible tripleo configs on the gate anyway,<br>

> so something will pass which will require a quick fix.<br>

><br>

> IMHO the only real solution is making the pre-test-run steps faster or<br>

> shorter.<br>

><br>

> Do you have any option to start the jobs that run tempest in a more developed<br>

> state?<br>

> I mean, having more things already done at start time (images/snapshots)<br>

> and just doing a fast upgrade at the beginning of the job.<br>

><br>

> OpenStack installation can be completed in a `fast` way (~minute) on<br>

> RHEL/Fedora systems<br>

> after the yum steps. Also, if you are able to aggregate all yum steps into a<br>

> single<br>

> command execution (transaction), you are generally able to save a lot of time.<br>

><br>

> There are plenty of things that can be made more efficient before the test<br>

> run;<br>

> once you start treating as evil everything that accounts for more<br>

> than 30 sec<br>

> of time, this can happen soon.<br>

><br>

> For example, just executing the cpython interpreter for the openstack<br>

> commands takes more than 30 sec;<br>

> the work they are doing can be done in a much, much faster way.<br>

><br>

> A lot of install steps actually do not depend on each other,<br>

> which allows more things to be done in parallel; we can generally get more<br>

> cores than GHz.<br>

><br>

><br>

>><br>

>>><br>

>>><br>

>>> > Should we focus on how much testing we can get into N time period?<br>

>>> > Then how do we decide an optimal N<br>

>>> > for our constraints?<br>

>>><br>

>>> Well yeah, but that's pretty much how/why we ended up with pingtest, it's<br>

>>> simple, fast, and provides an efficient way to do smoke tests, e.g<br>

>>> creating<br>

>>> just one heat resource is enough to prove multiple OpenStack services are<br>

>>> running, as well as the DB/RPC etc etc.<br>

>>><br>

>>> > I've been working on a full up functional test for OpenStack CI builds<br>

>>> > for a long time now, it works but takes<br>

>>> > more than 10 hours. If you're interested in the results, click through to<br>

>>> > Kibana here [0]. Let me know off list if you<br>

>>> > have any issues, the presentation of this data is all experimental<br>

>>> > still.<br>

>>><br>

>>> This kind of thing is great, and I'd support more exhaustive testing via<br>

>>> periodic jobs etc, but the reality is we need to focus on "bang for buck"<br>

>>> e.g the deepest possible coverage in the most minimal amount of time for<br>

>>> our per-commit tests - we rely on the project gates to provide a full API<br>

>>> surface test, and we need to focus on more basic things like "did the<br>

>>> service<br>

>>> start", and "is the API accessible".  Simple CRUD operations on a subset<br>

>>> of<br>

>>> the APIs are totally fine for this IMO, whether via pingtest or some<br>

>>> other<br>

>>> means.<br>

>>><br>

>><br>

>> Right now we do have a periodic job running full tempest, with a few<br>

>> skips, and because of the lack of tempest tests in the patches, it has been<br>

>> pretty hard to keep it stable enough to have a 100% pass, and of course,<br>

>> also the installation very often fails (like in the last five days).<br>

>> For example, [1] is the latest run in the periodic job for which we got<br>

>> results from tempest. We have 114 failures that were caused by some new<br>

>> code/change, and I have no idea which one it was. Just looking at the failures,<br>

>> I can see that smoke tests plus minimum basic scenario tests would have caught<br>

>> these failures, and the developer could have fixed them and made me happy :)<br>

>> Now I have to spend several hours installing and debugging each one of<br>

>> those tests to identify where/why it fails.<br>

>> Before this run, we got 100% pass, but unfortunately I don't have the<br>

>> results anymore; they were already removed from <a href="http://logs.openstack.org" rel="noreferrer" target="_blank">logs.openstack.org</a>.<br>

>><br>

>><br>

>>><br>

>>> Steve<br>

>>><br>

>>><br>

>>> ______________________________<wbr>______________________________<wbr>______________<br>

>>> OpenStack Development Mailing List (not for usage questions)<br>

>>> Unsubscribe:<br>

>>> <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>

>>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>

>><br>

>><br>

>> [1]<br>

>> <a href="http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-nonha-tempest-oooq/0072651/logs/oooq/stackviz/#/stdin" rel="noreferrer" target="_blank">http://logs.openstack.org/<wbr>periodic/periodic-tripleo-ci-<wbr>centos-7-ovb-nonha-tempest-<wbr>oooq/0072651/logs/oooq/<wbr>stackviz/#/stdin</a><br>

>><br>


>><br>

><br>

><br>


><br>

<br>

<br>

<br>

</div></div><span class="HOEnZb"><font color="#888888">--<br>

Emilien Macchi<br>

</font></span><div class="HOEnZb"><div class="h5"><br>


</div></div></blockquote></div><br></div></div>