[openstack-dev] [Nova] [Ironic] [QA] Status update // SFE for deprecation of Baremetal
Sean Dague
sean at dague.net
Mon Jul 21 22:48:19 UTC 2014
On 07/21/2014 06:13 PM, Devananda van der Veen wrote:
> On Mon, Jul 21, 2014 at 11:15 AM, Dan Smith <dms at danplanet.com> wrote:
>
> > In addition to seeking a spec-freeze-exception for 95025, I would also
> > like some clarification of the requirement to test this upgrade
> > path. Some nova-core folks have pointed out that they do not want to
> > accept the nova.virt.ironic driver until the upgrade path from
> > nova.virt.baremetal *has test coverage*, but what that means is not
> > clear to me. It's been suggested that we use grenade (I am pretty sure
> > Sean suggested this at the summit, and I wrote it into my spec proposal
> > soon thereafter). After looking into grenade, I don't think it is the
> > right tool to test with, and I'm concerned that no one pointed this out
> > sooner.
>
> Grenade is our release test tool, so I think that, barring details, it's
> reasonable to use $GRENADE when talking about this sort of thing.
>
>
> Grenade uses tempest to validate the "old" and "new" stacks. Unless Sean
> and Matthew are willing to change that, this is a detail we can't ignore.
Old and new are just symbols, they can be anything you like. It is just
really 2 trees with 2 configs. For a bit of time after the juno release
we were accidentally upgrading from juno to juno, the tool chain didn't
really care (which actually made it a little harder to realize that we
borked up branch selection).
> I didn't realize that nova-bm doesn't work in devstack until you pointed
> it out in IRC last week. Since we're pretty good about requiring
> devstack support for new things like drivers, I would have expected
> nova-bm to work there, but obviously times were a bit different when
> that driver was merged.
>
>
> > Philosophically, this isn't an upgrade of one service from version X to
> > Y. It's a replacement of one nova driver with a completely different
> > driver. As I understand it, that's not what grenade is for. But maybe
> > I'm wrong on this, or maybe it's flexible.
>
> I think it's "start devstack on release X, validate, do some work,
> re-start devstack on release Y, validate". I'm not sure that it's
> ill-suited for this, but IANAGE.
>
>
>
>
> > I also have a technical objection: even if devstack can start and
> > properly configure nova.virt.baremetal (which I doubt, because it isn't
> > tested at all), it is going to fail the tempest/api/compute test suite
> > horribly. The baremetal driver never passed tempest, and never had
> > devstack-gate support. This matters because grenade uses tempest to
> > validate a stack pre- and post-upgrade. Therefore, since we know that
> > the old code is going to fail tempest, requiring grenade testing as a
> > precondition to accepting the ironic driver effectively means we need to
> > go develop the baremetal driver to a point it could pass tempest. I'm
> > going to assume no one is actually suggesting that, and instead believe
> > that none of us thought this through.
> >
> > (FWIW, Ironic doesn't pass the tempest/api/compute suite today, but
> > we're working hard on it.)
>
> Do the devstack exercises pass?
>
>
> A few of them passed, once upon a time, but the whole suite? It never
> passed on the baremetal driver for me. And it was never run or
> maintained in the gate.
>
>
> We test things like cells today (/me
> hears sdague scream in the background), which don't pass tempest, using
> the exercises to make sure it's at least able to create an instance.
We don't even really know that... but that's a longer story. :)
Anyway, I veto devstack exercises as a test for this; they are
impossible to debug.
> > So, I'd like to ask for suggestions on what sort of upgrade testing is
> > reasonable here. I'll toss out two ideas:
> > - load some fake data into the nova_bm schema, run the upgrade scripts,
> > start ironic, issue some API queries, and see if the data's correct
> > - start devstack, load some real data into the nova_bm schema, run the
> > upgrade scripts, then try to deploy an instance with ironic
>
> These were my suggestions last week, so I'll own up to them now.
> Obviously I think that something using grenade that goes from a
> functional environment on release X to a functional environment on
> release Y is best. However, I of course don't think it makes sense to
> spend a ton of time getting nova-bm to pass tempest just so we can shoot
> it in the head.
>
>
> I'm glad to hear that, since everything up to this point in your reply
> seems to indicate that we should go back and add test coverage (whether
> tempest or exercise.sh) for the very code we are trying to delete.
>
> So my question remains. Even for option #2, while we can "load some real
> data" into the nova_bm schema, since we can't do any functional testing
> on it today, I don't think we should be expected to go fix things to
> make that pass. This leaves us in the position of running tempest only
> once -- on the result of the migration. Is that sufficient from your
> perspective?
>
>
>
>
> I'm not really sure what to do here. I think that we need an upgrade
> path, and that it needs to be tested. I don't think our users would
> appreciate us removing any other virt driver and replacing it with a new
> one, avoiding an upgrade path because "it's a different driver now". I
> also don't want to spend a bunch of time on nova-bm, which we have
> already neglected in our other test requirements (which is maybe part of
> the problem here).
>
>
> Yea. Nova has kept the baremetal driver in tree with no testing
> whatsoever far beyond its relevance, hoping for Ironic to come along and
> replace it -- except no one was maintaining it, and the only user
> (TripleO) is eagerly moving to Ironic and doesn't care about a migration
> path.
>
> Virt drivers have been removed from Nova for less. Like some of those,
> baremetal was allowed in tree before the third-party CI requirements
> were in place. I don't understand the resistance to removing this driver
> which has never had any third-party CI, and does not have any team
> actively maintaining it.
>
>
> Assuming grenade can be flexible about what it runs against the old and
> new environments to determine "workyness", then I think the second
> option above is probably a pretty good level of assurance, given where
> we are right now.
>
>
> I would like to hear from the grenade engineers/maintainers on this one.
> My read of the code suggests this "test for workyness" is not pluggable
> at all -- but we can easily disable running tempest on the "old" cloud.
> Here's the code:
>
> # Validate the install
> echo_summary "Running base smoke test"
> if [[ "$BASE_RUN_SMOKE" == "True" ]]; then
>     cd $BASE_RELEASE_DIR/tempest
>     tox -esmoke -- --concurrency=$TEMPEST_CONCURRENCY
> fi
> stop $STOP base-smoke 110
> ....
> # Validate the upgrade
> if [[ "$TARGET_RUN_SMOKE" == "True" ]]; then
>     echo_summary "Running tempest scenario and smoke tests"
>     cd $TARGET_RELEASE_DIR/tempest
>     tox -esmoke -- --concurrency=$TEMPEST_CONCURRENCY
>     stop $STOP run-smoke 330
> fi
Right, we could skip old cloud validation for you. Though you'll have to
write something specific for the upgrade to test the resources. The
point would be to inject the configuration before the cut over.
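As a rough sketch of that idea (not something that exists in grenade today): the two *_RUN_SMOKE variables are the ones from the snippet quoted above, while the check_migrated_nodes helper and the idea of comparing plain lists of node identifiers are purely hypothetical, made up here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: skip tempest on the "old" (nova-bm) cloud and
# validate only the migrated resources on the "new" (ironic) cloud.

# Variables checked by the grenade snippet quoted above.
BASE_RUN_SMOKE=False    # the old cloud cannot pass tempest, so skip it
TARGET_RUN_SMOKE=True   # still validate the upgraded cloud

# check_migrated_nodes EXPECTED ACTUAL   (hypothetical helper)
# Both arguments are whitespace-separated lists of node identifiers:
# EXPECTED is what was injected before the cut over, ACTUAL is what the
# new cloud reports afterwards. Returns 0 only if every expected node is
# present in ACTUAL; each missing node is reported on stderr.
check_migrated_nodes() {
    local expected="$1" actual node missing=0
    # Collapse any whitespace (including newlines) to single spaces and
    # pad both ends so each identifier can be matched as a whole word.
    actual=" $(printf '%s ' $2)"
    for node in $expected; do
        case "$actual" in
            *" $node "*) ;;
            *) echo "missing after upgrade: $node" >&2; missing=1 ;;
        esac
    done
    return $missing
}
```

The upgrade step would then do something like `check_migrated_nodes "$nodes_before" "$nodes_after" || exit 1`, with both lists gathered however is convenient for the real job.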
It's also not clear to me that you'd actually need the whole Rube
Goldberg running on every run, because a relatively small number of
changes are going to impact this. A periodic + experimental job is
probably useful here.
-Sean
--
Sean Dague
http://dague.net