[openstack-dev] [nova] reason for python-novaclient revert
Sean Dague
sean at dague.net
Tue Oct 29 10:36:06 UTC 2013
Andrew Laski correctly called us out for not really providing enough
information in the python-novaclient revert yesterday -
https://review.openstack.org/#/c/54108/. Apologies there. At the time
we were dealing with a gate where grenade was failing every change (for
the prior 6 hours), we were all on our first cup of coffee, and while we
got to resolution, we did so with an entirely unhelpful commit message
to explain it.
Here's what happened. python-novaclient landed a change that changed the
user interface. This change meant that devstack exercises failed when
validating the details returned for aggregates.
However, upgrade testing is hard, and we had a loophole that led us to
a wedge in the gate.
For the grenade jobs we prep 2 versions of the OpenStack codebase,
grizzly and master (yes, still grizzly and master, we're working on
that). The grizzly tree is grizzly devstack, which means it's grizzly on
all the core servers, but master on all the clients. However, the
grizzly tree doesn't get "zuulified", which was the crux of the issue.
By zuulified I mean think about the zuul queue. How do we actually test
a change 15 deep in the gate? We aren't testing just that change, but
all the gerrit proposed changes ahead of it. That means that zuul needs
to go through and update the relevant git trees not just to master, but
to the proposed change sets for all the changes in front of it. This is
across projects, and should be across branches.
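
To make "zuulified" a bit more concrete, here is a minimal Python sketch.
The Change record, the queue contents, and speculative_state() are all
hypothetical names made up for illustration; this is not zuul's actual
code or data model. It only shows the idea: to test the change at a
given position, each (project, branch) tree has to be its branch tip
plus every proposed change ahead of it in the queue, plus the change
itself.

```python
from collections import defaultdict
from typing import NamedTuple


class Change(NamedTuple):
    # Hypothetical record for a proposed gerrit change; not zuul's data model.
    number: int
    project: str
    branch: str


def speculative_state(queue, position):
    """Return {(project, branch): [change numbers]} that must be applied on
    top of each branch tip to test the change at `position` in the queue."""
    state = defaultdict(list)
    # Everything ahead of the change, plus the change itself, in queue order.
    for change in queue[: position + 1]:
        state[(change.project, change.branch)].append(change.number)
    return dict(state)


# Illustrative queue; the change numbers are invented.
queue = [
    Change(101, "nova", "stable/grizzly"),
    Change(102, "python-novaclient", "master"),
    Change(103, "keystone", "master"),
    Change(104, "nova", "master"),
]

# Trees needed to test the last change in the queue.
for key, numbers in speculative_state(queue, 3).items():
    print(key, numbers)
```

Note that the result is keyed by branch as well as project, which is the
"should be across branches" part.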
But we'd not gotten the system to do this correctly on the "old" side
yet. Which means that python-novaclient landed a breaking change, but
the "old" side built a grizzly cloud with clients from master only, not
master + gerrit. It passed the verification of the "old" cloud, then
moved to the
new cloud, then ran a different set of tests to verify the new cloud,
which passed.
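
The loophole can be shown with a toy Python model. Everything here is an
assumption for illustration (the function name, the label for the
breaking change, and the simplification that old-side verification fails
exactly when the client in use carries that change); it is not how
grenade is implemented.

```python
BREAKING = "novaclient-ui-change"  # hypothetical label for the breaking change


def old_side_passes(merged, ahead, change, zuulified):
    """Toy model of the "old" half of one grenade run.

    merged:    changes already on master
    ahead:     proposed changes in front of `change` in the gate queue
    zuulified: whether the old side also applies the proposed changes
    """
    clients = set(merged)
    if zuulified:
        clients |= set(ahead) | {change}
    # Assumption: old-side verification fails whenever the client in use
    # carries the breaking change.
    return BREAKING not in clients


# 1. The breaking change's own gate run: the old side uses plain master,
#    which does not contain it yet, so the run passes and the change merges.
own_run = old_side_passes(merged=[], ahead=[], change=BREAKING, zuulified=False)

# 2. Any later change: master now contains the breaking change, so it fails.
later_run = old_side_passes(merged=[BREAKING], ahead=[], change="unrelated",
                            zuulified=False)

# 3. With a zuulified old side, the breaking change fails its own run.
fixed_run = old_side_passes(merged=[], ahead=[], change=BREAKING, zuulified=True)

print(own_run, later_run, fixed_run)  # True False False
```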
However, by threading the needle in this way, it meant no one else could
ever pass grenade again. The quick fix was the python-novaclient revert.
The real fix is probably this - https://review.openstack.org/#/c/53940/
which we were actually working on last week, to both update the set of
trees we are using, and update the zuul refs on the "old" side of the
equation. Once that lands I'll attempt to revert the revert, and ensure
that it actually gets caught in the system. Then we can work on updating
tests so it can get through. But right now it's a perfect test case to
prove that we did this right, so leaving it in the reverted state is
critical.
This also highlights one of the reasons I've been hard on folks recently
about some alternative upgrade or mixed version testing models, and
doing it outside of grenade. Everything is simple when you talk about a
single change. But when you are 15 or 20 deep in the zuul gate, and have
to handle 3 proposed stable nova changes, 5 proposed master nova changes,
a keystone stable, a keystone master, and a few cinder master changes in
front of you to build the environments you need to test in the gate,
this gets complicated fast. Basically you aren't allowed to use git
inside your upgrade tool for this reason, because your tool has no idea
what it's supposed to actually test, only zuul knows. And, as you can
see, we've yet to get this whole thing mapped out the first time. :)
-Sean
--
Sean Dague
http://dague.net