[openstack-dev] [nova][powervm] my notes from the meeting on powervm CI

Matthew Treinish mtreinish at kortar.org
Fri Oct 11 03:31:29 UTC 2013


On Thu, Oct 10, 2013 at 07:39:37PM -0700, Joe Gordon wrote:
> On Thu, Oct 10, 2013 at 7:28 PM, Matt Riedemann <mriedem at us.ibm.com> wrote:
> > >
> > > > 4. What is the max amount of time for us to report test results?  Dan
> > > > didn't seem to think 48 hours would fly. :)
> > >
> > > Honestly, I think that 12 hours during peak times is the upper limit of
> > > what could be considered useful. If it's longer than that, many patches
> > > could go into the tree without a vote, which defeats the point.
> >
> > Yeah, I was just joking about the 48 hour thing, 12 hours seems excessive
> > but I guess that has happened when things are super backed up with gate
> > issues and rechecks.
> >
> > Right now things take about 4 hours, with Tempest being around 1.5 hours
> > of that. The rest of the time is setup and install, which includes heat
> > and ceilometer. So I guess that raises another question: if we're
> > really setting this up right now because of nova, do we need to have
> > heat and ceilometer installed and configured in the initial delivery
> > if we're not going to run tempest tests against them (we don't right
> > now)?
> >
> 
> 
> In general the faster the better, and if things get slow enough that we
> have to wait for the powervm CI to report back, I
> think it's reasonable to go ahead and approve things without hearing back.
>  In reality, if you can report back in under 12 hours this will rarely
> happen (I think).
> 
> 
> >
> > I think some aspect of the slow setup time is related to DB2 and how
> > the migrations perform with some of that, but the overall time is not
> > considerably different from when we were running this with MySQL so
> > I'm reluctant to blame it all on DB2.  I think some of our topology
> > could have something to do with it too since the IVM hypervisor is running
> > on a separate system and we are gated on how it's performing at any
> > given time.  I think that will be our biggest challenge for the scale
> > issues with community CI.
> >
> > >
> > > > 5. What are the minimum tests that need to run (excluding APIs that the
> > > > powervm driver doesn't currently support)?
> > > >         - smoke/gate/negative/whitebox/scenario/cli?  Right now we have
> > > > 1152 tempest tests running, those are only within api/scenario/cli and
> > > > we don't run everything.

Well, that's almost a full run right now: the full tempest jobs have 1290
tests, of which we skip 65 because of bugs or configuration (e.g. we don't
run the neutron API tests without neutron). That number is actually pretty
high given that you are running with neutron. Right now the neutron gating
jobs only have 221 tests and skip 8 of those. Can you share the list of
things you've got working with neutron so we can bump up the number of
gating tests?
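To make the skip accounting concrete, here is a minimal sketch (the test
names and the helper function are hypothetical, not tempest code) of how
a skip list annotated with reasons might be applied when selecting which
tests a gating job runs:

```python
# Hypothetical skip list: maps a test id to the bug or configuration
# reason it is excluded, so the exclusions stay auditable.
SKIPS = {
    "tempest.api.network.test_ports": "no neutron in this job",
    "tempest.api.compute.test_live_migration": "unsupported by driver",
}

def select_tests(all_tests, skips=SKIPS):
    """Split a test list into tests to run and skipped tests with reasons."""
    to_run = [t for t in all_tests if t not in skips]
    skipped = {t: skips[t] for t in all_tests if t in skips}
    return to_run, skipped
```

The point of carrying the reason alongside each skip is that the list can
be re-audited as bugs get fixed, rather than rotting silently.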

> > >
> > > I think that "a full run of tempest" should be required. That said, if
> > > there are things that the driver legitimately doesn't support, it makes
> > > sense to exclude those from the tempest run, otherwise it's not useful.
> >
> 
> ++
> 
> 
> 
> >  >
> > > I think you should publish the tempest config (or config script, or
> > > patch, or whatever) that you're using so that we can see what it means
> > > in terms of the coverage you're providing.
> >
> > Just to clarify, do you mean publish what we are using now or publish
> > once it's all working?  I can certainly attach our nose.cfg and
> > latest x-unit results xml file.
> >
> 
> We should publish all logs, similar to what we do for upstream (
> http://logs.openstack.org/96/48196/8/gate/gate-tempest-devstack-vm-full/70ae562/
> ).

Yes, and part of that is the devstack logs, which show all the configuration
steps for getting an environment up and running. This is often very useful
for debugging, so it's probably information you'll want to replicate in
whatever the logging output for the powervm jobs ends up being.
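As an illustration, a devstack localrc along these lines (the paths are
examples; check the option names against your devstack tree) captures the
install log and per-service logs so they can be published alongside the
tempest results:

```shell
# Hypothetical localrc fragment for log capture.
LOGFILE=/opt/stack/logs/stack.sh.log   # full devstack install log
SCREEN_LOGDIR=/opt/stack/logs          # per-service logs (n-cpu, q-svc, ...)
LOGDAYS=2                              # keep a couple of days of history
```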

> > >
> > > > 6. Network service? We're running with openvswitch 1.10 today so we
> > > > probably want to continue with that if possible.
> > >
> > > Hmm, so that means neutron? AFAIK, not much of tempest runs with
> > > Nova/Neutron.
> > >
> > > I kinda think that since nova-network is our default right now (for
> > > better or worse) that the run should include that mode, especially if
> > > using neutron excludes a large portion of the tests.
> > >
> > > I think you said you're actually running a bunch of tempest right now,
> > > which conflicts with my understanding of how well neutron works there.
> > > Can you clarify?
> >
> > Correct, we're running with neutron using the ovs plugin. We basically
> > have the same issues that the neutron gate jobs have, which are related
> > to concurrency and tenant isolation (like devstack with neutron, we
> > don't run tempest with tenant isolation). We are running most of the
> > nova and most of the neutron API tests, though we don't have all of the
> > neutron-dependent scenario tests working (probably more due to
> > incompetence in setting up neutron than anything else).

I also agree with Dan here: in the short term you should probably at least
have a run with nova-network, since it's the default. It'll also let you use
tenant isolation, which should allow you to run these jobs in parallel and
might help with your speed issues. (It depends on several things.)
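For reference, enabling tenant isolation is a one-line tempest config
change, something like the following (the section and option name are from
the tempest of this era; verify against your checkout):

```ini
# Hypothetical tempest.conf fragment: give each test class its own tenant
# so tests can run in parallel without colliding on shared resources.
[compute]
allow_tenant_isolation = true
```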

There is only one neutron-dependent scenario test that gets run in the
neutron gating jobs right now, so I'm sure you're at least matching that.

> >
> > >
> > > > 7. Cinder backend? We're running with the storwize driver but we do we
> > > > do about the remote v7000?
> > >
> > > Is there any reason not to just run with a local LVM setup like we do in
> > > the real gate? I mean, additional coverage for the v7000 driver is
> > > great, but if it breaks and causes you to not have any coverage at all,
> > > that seems, like, bad to me :)
> >
> > Yeah, I think we'd just run with a local LVM setup, that's what we do for
> > x86_64 and s390x tempest runs. For whatever reason we thought we'd do
> > storwize for our ppc64 runs, probably just to have a matrix of coverage.
> >
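For what it's worth, matching the upstream gate's local LVM setup is just a
couple of devstack variables, roughly like this (variable names per the
devstack of this era; the size is an example):

```shell
# Hypothetical localrc fragment: back Cinder with a loopback-file LVM
# volume group, as the upstream gate does, instead of an external array.
VOLUME_GROUP="stack-volumes"
VOLUME_BACKING_FILE_SIZE=10250M
```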
> > >
> > > > Again, just getting some thoughts out there to help us figure out our
> > > > goals for this, especially around 4 and 5.
> > >
> > > Yeah, thanks for starting this discussion!
> > >


-Matt Treinish


