[openstack-dev] [Nova] Hypervisor CI requirement and deprecation plan
mriedem at linux.vnet.ibm.com
Tue Nov 26 16:49:43 UTC 2013
On Tuesday, November 26, 2013 10:07:02 AM, Sean Dague wrote:
> On 11/26/2013 09:56 AM, Russell Bryant wrote:
>> On 11/26/2013 09:38 AM, Bob Ball wrote:
>>>> -----Original Message-----
>>>> From: Russell Bryant [mailto:rbryant at redhat.com]
>>>> Sent: 26 November 2013 13:56
>>>> To: openstack-dev at lists.openstack.org
>>>> Cc: Sean Dague
>>>> Subject: Re: [openstack-dev] [Nova] Hypervisor CI requirement and
>>>> deprecation plan
>>>> On 11/26/2013 04:48 AM, Bob Ball wrote:
>>>>> I hope we can safely say that we should run against all "gating" tests which
>>>>> require Nova? Currently we run quite a number of tests in the gate that
>>>>> succeed even when Nova is not running as the gate isn't just for Nova but
>>>>> for all projects.
>>>> Would you like to come up with a more detailed proposal? What tests
>>>> would you cut, and how much time does it save?
>>> I don't have a detailed proposal yet - but it's very possible that we'll want one in the coming weeks.
>>> In terms of the time saved, I noticed that a tempest smoke run with Nova absent took 400 seconds on one of my machines (a particularly slow one) - so I imagine that would translate to maybe a 300 second / 5 minute reduction in overall time. Total smoke took approximately 800 seconds on the same machine.
>> I don't think the smoke tests are really relevant here. That's not
>> related to Nova vs non-Nova tests, right?
>>> If the approach could be acceptable then yes, I'm happy to come up with a detailed set of tests that I would propose cutting.
>>> My primary hesitation with the approach is it would need Tempest reviewers to be aware of this extra type of test, and flag up if a test is added to the full tempest suite which should also be in the nova tempest suite.
>> Right now I don't think it's acceptable. I was suggesting a more
>> detailed proposal to help convince me. :-)
> So we already have the beginnings of service tags in Tempest, that would
> let you slice exactly like this. I don't think the infrastructure is
> fully complete yet, but the idea being that you could run the subset of
> tests that interact with "compute" or "networking" in any real way.
> Realize... that's not going to drop that many tests for something like
> compute, it's touched a lot.
Good to know about the service tags. I think I remember being broken at
some point after those tempest.conf.sample changes. :)
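To make the slicing Sean describes concrete, here is a minimal sketch of
tagging tests by service and selecting only the ones that touch compute.
The `services` decorator, the test names, and `select_tests` are all
hypothetical and written for illustration; this is not Tempest's actual
API, just the general idea under those assumptions.

```python
def services(*names):
    """Hypothetical decorator: record which services a test exercises."""
    def decorator(func):
        func.services = frozenset(names)
        return func
    return decorator


# Made-up tests standing in for a real suite.
@services("compute", "network")
def test_boot_server_with_port():
    pass


@services("object_storage")
def test_upload_object():
    pass


@services("compute", "volume")
def test_attach_volume():
    pass


@services("identity")
def test_create_tenant():
    pass


ALL_TESTS = [
    test_boot_server_with_port,
    test_upload_object,
    test_attach_volume,
    test_create_tenant,
]


def select_tests(tests, wanted):
    """Keep tests tagged with at least one of the wanted services."""
    wanted = set(wanted)
    return [t for t in tests if wanted & getattr(t, "services", frozenset())]


# A virt driver CI would only care about tests that touch compute.
compute_subset = [t.__name__ for t in select_tests(ALL_TESTS, {"compute"})]
print(compute_subset)
```

Even in this toy suite, half of the tests touch compute, which matches
Sean's point that filtering on compute will not drop that many tests.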
My overall concern, and I think the other guys doing this for virt
drivers will agree, is trying to scope down the exposure to unrelated
failures. For example, if there is a bug in swift breaking the gate,
it could start breaking the nova virt driver CI as well. When things
get bad in the gate, it takes some monstrous effort to rally people
across the projects to come together to unblock it (like what Joe
Gordon was doing last week).
I'm running Tempest internally about once per day when we rebase code
with the community and that's to cover running with the PowerVM driver
for nova, Storwize driver for cinder, OVS for neutron, with qpid and
DB2. We're running almost a full run except for the third party boto
tests and swift API tests. The thing is, when something fails, I have
to figure out if it's environmental (infra), a problem with tempest
(think instability with neutron in the gate), a configuration issue, or
a code bug. That's a lot for one person to have to cover, even a small
team. That's why at some points we just have to ignore/exclude tests
that continuously fail but we can't figure out (think intermittent gate
breaker bugs that are open for months). Now multiply this out across
all the nova virt drivers, the neutron plugins and I'm assuming at some
point the various glance backends and cinder drivers (haven't heard if
they are planning on the same types of CI requirements yet). I think
either we're going to have a lot of flaky/unstable driver CI going on
so the scores can't be trusted, or we're going to develop a lot of
people that get really good at infra/QA (which would be a plus in the
long-run, but maybe not what those teams set out to be).
I don't have any good answers, I'm just trying to raise the issue since
this is complicated. I think it's also hard for people that aren't
forced to invest in infra/QA on a daily basis to understand and
appreciate the amount of effort it takes just to keep the wheels
spinning, so I want to keep expectations at a reasonable level.
Don't get me wrong, I absolutely agree with requiring third party CI
for the various vendor-specific drivers and plugins, that's a
no-brainer for OpenStack to scale. I think it will just be very
interesting to see the kinds of results coming out of all of these
disconnected teams come icehouse-3.