<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Oct 20, 2013 at 12:18 PM, Tim Bell <span dir="ltr"><<a href="mailto:Tim.Bell@cern.ch" target="_blank">Tim.Bell@cern.ch</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>

Is it easy? No... it is hard, whether in an integrated test suite or on its own.<br>

Can it be solved? Yes, we have done incredible things with the current QA infrastructure.<br>

Should it be split off from other testing? No, I want EVERY commit to have this check. Performance through benchmarking at scale is fundamental.<br>

<br>

Integration with the current process means all code has to pass the bar... a new project makes it optional and therefore makes the users do the debugging... just the kind of thing that drives those users away... </blockquote>


<div><br></div><div><br></div><div>Tim, we already have a very basic gating test that checks for degraded performance (the large-ops test in tempest). Does it have all the issues listed below? Yes: it doesn't detect minor performance degradation, it doesn't work on the RAX cloud (slower VMs), et cetera. But it's a start.</div>
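<div><br></div><div>To make the "minor degradation" limitation concrete, here is a rough sketch in plain Python (not tempest code). The 1% and 50% figures come from Alex's example quoted below; the 10% gate threshold and the function names are hypothetical, chosen only for illustration. It shows how a per-commit threshold check lets small slowdowns compound unnoticed:</div>

```python
def compounded_slowdown(per_commit_regression, commits):
    """Total slowdown factor after `commits` regressions of equal relative size."""
    return (1.0 + per_commit_regression) ** commits


def passes_gate(per_commit_regression, threshold):
    """A per-commit gate only sees one commit's regression at a time."""
    return per_commit_regression < threshold


small = compounded_slowdown(0.01, 100)  # one hundred 1% regressions
large = compounded_slowdown(0.50, 1)    # a single 50% regression

print("100 x 1% regressions: {:.2f}x slower".format(small))
print("1 x 50% regression:   {:.2f}x slower".format(large))
# Hypothetical 10% threshold: every individual 1% step gets through.
print("each 1% step passes a 10% gate: {}".format(passes_gate(0.01, 0.10)))
```

<div>Under these assumptions the hundred small regressions add up to roughly a 2.7x slowdown, far worse than the single 50% one, while no individual commit trips the gate. Catching that needs trend tracking on top of a per-commit check.</div>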


<div><br></div><div>After debugging the issues with nova-networking / rootwrap (<a href="https://bugs.launchpad.net/oslo/+bug/1199433">https://bugs.launchpad.net/oslo/+bug/1199433</a>, <a href="https://review.openstack.org/#/c/38000/">https://review.openstack.org/#/c/38000/</a>) that caused nova to time out when booting just 50 instances, I added a test that boots n instances at once (n=150 in the gate) using the fake virt driver. We are now gating on the nova-network version and are getting ready to gate on the neutron version too.</div>
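<div><br></div><div>The shape of that check can be sketched roughly as follows. This is a self-contained illustration, not the actual tempest test or the nova fake driver: <code>FakeDriver</code>, <code>boot_many</code>, the worker count, and the 30 second timeout are all hypothetical names and values. The idea is simply to request n instances at once against a driver that does no real virtualization, and to fail if they are not all active before a deadline:</div>

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


class FakeDriver:
    """Hypothetical stand-in for a fake virt driver: 'boots' without real VMs."""

    def spawn(self, name):
        time.sleep(0.001)  # pretend control-plane work; no hypervisor involved
        return {"name": name, "status": "ACTIVE"}


def boot_many(driver, count, timeout_s):
    """Request `count` instances at once; return (all_active, elapsed_seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=32) as pool:
        futures = [pool.submit(driver.spawn, "vm-%d" % i) for i in range(count)]
        # Anything still unfinished when the timeout expires lands in not_done.
        done, not_done = wait(futures, timeout=timeout_s)
    elapsed = time.monotonic() - start
    all_active = not not_done and all(
        f.result()["status"] == "ACTIVE" for f in done
    )
    return all_active, elapsed


ok, elapsed = boot_many(FakeDriver(), count=150, timeout_s=30.0)
print("150 instances active: %s (%.2fs)" % (ok, elapsed))
```

<div>Because the driver is fake, a check like this exercises only the control plane (scheduling, networking setup, database and RPC traffic in the real thing), which is where the rootwrap slowdown showed up.</div>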


<div><br></div><div>The tests are pretty fast too:</div><div><br></div><div><div>gate-tempest-devstack-vm-large-ops SUCCESS in 13m 44s</div><div>gate-tempest-devstack-vm-neutron-large-ops SUCCESS in 16m 09s (non-voting)</div>


</div><div><br></div><div>best,<br>Joe</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


 </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<span class=""><font color="#888888"><br>

Tim<br>

</font></span><div class="im"><br>

> -----Original Message-----<br>

> From: Robert Collins [mailto:<a href="mailto:robertc@robertcollins.net">robertc@robertcollins.net</a>]<br>

> Sent: 20 October 2013 21:03<br>

> To: OpenStack Development Mailing List<br>

> Subject: Re: [openstack-dev] Announce of Rally - benchmarking system for OpenStack<br>

><br>

</div><div class=""><div class="h5">> On 21 October 2013 07:36, Alex Gaynor <<a href="mailto:alex.gaynor@gmail.com">alex.gaynor@gmail.com</a>> wrote:<br>

> > There are several issues involved in doing automated regression checking<br>

> > for benchmarks:<br>

> ><br>

> > - You need a platform which is stable. Right now all our CI runs on<br>

> > virtualized instances, and I don't think there's any particular<br>

> > guarantee it'll be the same underlying hardware. Furthermore, virtualized<br>

> > systems tend to be very noisy and not give you the stability you need.<br>

> > - You need your benchmarks to be very high precision, if you really<br>

> > want to rule out regressions of more than N% without a lot of false positives.<br>

> > - You need more than just checks on individual builds, you need long<br>

> > term trend checking - 100 1% regressions are worse than a single 50% regression.<br>

><br>

> Let me offer a couple more key things:<br>

>  - you need a platform that is representative of your deployments:<br>

> 1000 physical hypervisors have rather different checkin patterns than<br>

> 1 qemu hypervisor.<br>

>  - you need a workload that is representative of your deployments:<br>

> 10000 VMs spread over 500 physical hypervisors routing traffic through one neutron software switch will have rather different load<br>

> characteristics than 5 qemu VMs in a kvm VM hosted all in one configuration.<br>

><br>

> neither the platform - # of components, their configuration, etc, nor the workload in devstack-gate are representative of production<br>

> deployments of any except the most modest clouds. That's fine - devstack-gate to date has been about base functionality, not digging down<br>

> into race conditions.<br>

><br>

> I think having a dedicated tool aimed at:<br>

>  - setting up *many different* production-like environments and running<br>

>  - many production-like workloads and<br>

>  - reporting back which ones work and which ones don't<br>

><br>

> makes a huge amount of sense.<br>

><br>

> From the reports from that tool we can craft targeted unit tests or isolated functional tests to capture the problem and prevent it<br>

> worsening or regressing (once fixed). See for instance Joe Gordon's<br>

> fake hypervisor which is great for targeted testing.<br>

><br>

> That said, I also agree with the sentiment expressed that the workload-driving portion of Rally doesn't seem different enough to Tempest<br>

> to warrant being separate; it seems to me that Rally could be built like this:<br>

><br>

> - a thing that does deployments spread out over a phase space of configurations<br>

> - instrumentation for deployments that permits the data visibility needed to analyse problems<br>

> - tests for tempest that stress a deployment<br>

><br>

> So the single-button-push Rally would:<br>

>  - take a set of hardware<br>

>  - in a loop<br>

>  - deploy a configuration, run Tempest, report data<br>

><br>

> That would reuse Tempest and still be a single button push data gathering thing, and if Tempest isn't capable of generating enough<br>

> concurrency/load [for a single test - ignore parallel execution of different tests] then that seems like something we should fix in Tempest,<br>

> because concurrency/race conditions are things we need tests for in devstack-gate.<br>

><br>

> -Rob<br>

><br>

> --<br>

> Robert Collins <<a href="mailto:rbtcollins@hp.com">rbtcollins@hp.com</a>><br>

> Distinguished Technologist<br>

> HP Converged Cloud<br>

><br>

> _______________________________________________<br>

> OpenStack-dev mailing list<br>

> <a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>

> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

<br>


</div></div></blockquote></div><br></div></div>