[openstack-dev] Announce of Rally - benchmarking system for OpenStack

Joe Gordon joe.gordon0 at gmail.com
Mon Oct 21 12:11:10 UTC 2013


On Sun, Oct 20, 2013 at 12:18 PM, Tim Bell <Tim.Bell at cern.ch> wrote:

>
> Is it easy? No... it is hard, whether in an integrated test suite or on
> its own.
> Can it be solved? Yes, we have done incredible things with the current
> QA infrastructure.
> Should it be split off from other testing? No, I want EVERY commit to
> have this check. Performance through benchmarking at scale is fundamental.
>
> Integration with the current process means all code has to pass the bar...
> a new project makes it optional and therefore makes the users do the
> debugging... just the kind of thing that drives those users away...



Tim, we already have a very basic gating test to check for degraded
performance (the large-ops test in tempest). Does it have all the issues
listed below? Yes: it doesn't detect a minor performance degradation, it
doesn't work on the RAX cloud (slower VMs), et cetera. But it's a start.

After debugging the issues with nova-networking / rootwrap (
https://bugs.launchpad.net/oslo/+bug/1199433,
https://review.openstack.org/#/c/38000/) that caused nova to time out when
booting just 50 instances, I added a test to boot up n (n=150 in gate)
instances at once using the fake virt driver.  We are now gating on the
nova-network version and are getting ready to enable gating on the neutron
version too.
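
For reference, the heart of that test is roughly the following - a
simplified sketch rather than the actual tempest code, assuming an
already-authenticated python-novaclient client (the boot_many helper and
the polling details are mine):

    import time

    def boot_many(nova, image, flavor, count=150, timeout=600):
        # A single API call asks nova to create `count` instances; with
        # the fake virt driver configured no real VMs are spawned, so
        # this exercises only the control plane (API, scheduler,
        # networking, rootwrap, ...).
        nova.servers.create(name='large-ops', image=image, flavor=flavor,
                            min_count=count)
        deadline = time.time() + timeout
        while time.time() < deadline:
            servers = [s for s in nova.servers.list()
                       if s.name.startswith('large-ops')]
            if (len(servers) >= count and
                    all(s.status == 'ACTIVE' for s in servers)):
                return servers
            if any(s.status == 'ERROR' for s in servers):
                raise RuntimeError('some instances went to ERROR')
            time.sleep(5)
        raise RuntimeError('timed out waiting for %d instances' % count)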

The tests are pretty fast too:

gate-tempest-devstack-vm-large-ops SUCCESS in 13m 44s
gate-tempest-devstack-vm-neutron-large-ops SUCCESS in 16m 09s (non-voting)
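
The reason these jobs are cheap enough to gate on is the fake virt driver:
nova-compute does all of its normal API/scheduler/network work but never
starts a real VM. To reproduce something similar locally, the relevant
nova.conf option is compute_driver (the devstack variable in the comment
is from memory, so treat it as an assumption):

    # nova.conf: run nova-compute with the fake virt driver so that
    # "booting" an instance exercises the control plane without
    # spawning real VMs.
    [DEFAULT]
    compute_driver = nova.virt.fake.FakeDriver

    # Roughly equivalent in a devstack localrc (name from memory):
    # VIRT_DRIVER=fake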

best,
Joe



>


> Tim
>
> > -----Original Message-----
> > From: Robert Collins [mailto:robertc at robertcollins.net]
> > Sent: 20 October 2013 21:03
> > To: OpenStack Development Mailing List
> > Subject: Re: [openstack-dev] Announce of Rally - benchmarking system for
> > OpenStack
> >
> > On 21 October 2013 07:36, Alex Gaynor <alex.gaynor at gmail.com> wrote:
> > > There are several issues involved in doing automated regression
> > > checking for benchmarks:
> > >
> > > - You need a platform which is stable. Right now all our CI runs on
> > > virtualized instances, and I don't think there's any particular
> > > guarantee it'll be the same underlying hardware; further, virtualized
> > > systems tend to be very noisy and not give you the stability you need.
> > > - You need your benchmarks to be very high precision if you really
> > > want to rule out regressions of more than N% without a lot of false
> > > positives.
> > > - You need more than just checks on individual builds; you need
> > > long-term trend checking - 100 1% regressions are worse than a single
> > > 50% regression.
> >
> > Let me offer a couple more key things:
> >  - you need a platform that is representative of your deployments:
> > 1000 physical hypervisors have rather different check-in patterns than
> > 1 qemu hypervisor.
> >  - you need a workload that is representative of your deployments:
> > 10,000 VMs spread over 500 physical hypervisors routing traffic through
> > one neutron software switch will have rather different load
> > characteristics than 5 qemu VMs in a KVM VM hosted in an all-in-one
> > configuration.
> >
> > Neither the platform - # of components, their configuration, etc. - nor
> > the workload in devstack-gate is representative of production
> > deployments of any except the most modest clouds. That's fine -
> > devstack-gate to date has been about base functionality, not digging
> > down into race conditions.
> >
> > I think having a dedicated tool aimed at:
> >  - setting up *many different* production-like environments and running
> >  - many production-like workloads and
> >  - reporting back which ones work and which ones don't
> >
> > makes a huge amount of sense.
> >
> > From the reports from that tool we can craft targeted unit tests or
> > isolated functional tests to capture the problem and prevent it from
> > worsening or regressing (once fixed). See for instance Joe Gordon's
> > fake hypervisor, which is great for targeted testing.
> >
> > That said, I also agree with the sentiment expressed that the
> > workload-driving portion of Rally doesn't seem different enough from
> > Tempest to warrant being separate; it seems to me that Rally could be
> > built like this:
> >
> > - a thing that does deployments spread out over a phase space of
> > configurations
> > - instrumentation for deployments that permits the data visibility
> > needed to analyse problems
> > - tests for tempest that stress a deployment
> >
> > So the single-button-push Rally would:
> >  - take a set of hardware
> >  - in a loop
> >  - deploy a configuration, run Tempest, report data
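> >
> > In rough Python terms - with deploy, run_tempest and report passed in
> > as stand-ins for whatever tooling actually performs each step - that
> > loop is just:
> >
> >     def rally_run(hardware, configurations, deploy, run_tempest, report):
> >         # one button push: deploy each configuration on the given
> >         # hardware, drive it with Tempest, and report the results
> >         results = []
> >         for config in configurations:
> >             deployment = deploy(hardware, config)
> >             result = run_tempest(deployment)
> >             report(config, result)
> >             results.append((config, result))
> >         return results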
> >
> > That would reuse Tempest and still be a single-button-push data-gathering
> > thing, and if Tempest isn't capable of generating enough
> > concurrency/load [for a single test - ignore parallel execution of
> > different tests] then that seems like something we should fix in Tempest,
> > because concurrency/race conditions are things we need tests for in
> > devstack-gate.
> >
> > -Rob
> >
> > --
> > Robert Collins <rbtcollins at hp.com>
> > Distinguished Technologist
> > HP Converged Cloud