[openstack-dev] [nova] [ironic] how to remove check-tempest-dsvm-ironic-pxe_ssh on Nova check

Devananda van der Veen devananda.vdv at gmail.com
Tue Nov 25 22:44:19 UTC 2014


On Tue Nov 25 2014 at 7:20:00 AM Sean Dague <sean at dague.net> wrote:

> On 11/25/2014 10:07 AM, Jim Rollenhagen wrote:
> > On Tue, Nov 25, 2014 at 08:02:56AM -0500, Sean Dague wrote:
> >> When at Summit I discovered that check-tempest-dsvm-ironic-pxe_ssh is
> >> now voting on Nova check queue. The reasons given is that the Nova team
> >> ignored the interface contract that was being provided to Ironic, broke
> >> them, so the Ironic team pushed for co-gating (which basically means the
> >> interface contract is now enforced by a 3rd party outside of Nova /
> Ironic).
> >>
> >> However, this was all in vague term, and I think is exactly the kind of
> >> thing we don't want to do. Which is use the gate as a proxy fight over
> >> teams breaking contracts with other teams.
> >>
> >> So I'd like to dive into what changes happened and what actually broke,
> >> so that we can get back to doing this smarter.
> >>
> >> Because if we are going to continue to grow as a community, we have to
> >> co-gate less. It has become a crutch to not think about interfaces and
> >> implications of changes, and is something we need to be doing a lot
> less of.
>

Definitely -- and we're already co-gating less than other projects :)

You might notice that ironic jobs aren't running in the check queues for
any other Integrated project, even though Ironic depends on interacting
directly with Glance, Neutron, Keystone, and (for some drivers) Swift.


> >
> > Completely agree here; I would love to not gate Nova on Ironic.
> >
> > The major problem is that Nova's driver API is explicitly not stable.[0]
> > If the driver API becomes stable and properly versioned, Nova should be
> > able to change the driver API without breaking Ironic.
> >
> > Now, this won't fix all cases where Nova could break Ironic. The
> > resource tracker or scheduler could change in a breaking way. However, I
> > think the driver API has been the most common cause of Ironic breakage,
> > so I think it's a great first step.
> >
> > // jim
> >
> > [0] http://docs.openstack.org/developer/nova/devref/
> policies.html#public-contractual-apis
>
> So we actually had a test in tree for that part of the contract...
>
>
We did. It was added here:
  https://review.openstack.org/#/c/98201

.. and removed here:
  https://review.openstack.org/#/c/111425/

.. because there is now unit test coverage for the internal API usage by
in-tree virt drivers via assertPublicAPISignatures() here:
  https://git.openstack.org/cgit/openstack/nova/tree/nova/test.py#n370


We don't need the surface to be public contractual, we just need to know
> what things Ironic is depending on and realize that can't be changed
> without some compatibility code put in place.


There is the virt driver API, which is the largest and most obvious one.
Then there's the HostManager, ResourceTracker, and SchedulerFilter classes
(or sublcasses thereof). None of these are "public contractual APIs" as
defined by Nova. Without any guarantee of stability in those interfaces, I
believe co-gating is a reasonable thing to do between the projects.

Also, there have been discussions [2] at and leading up to the Paris summit
(some ongoing for many cycles now) regarding changing every one of those
interfaces. Until those interfaces are refactored / split out / otherwise
deemed stable, I would like to continue running Ironic's functional tests
on all Nova changes. If you think we don't need to co-gate while that work
is underway, I'd like to understand why, and what that would look like.

Again... without knowing exactly what happened (I was on leave) it's
> hard to come up with a solution. However, I think the co-gate was an
> elephant gun that we don't actually want.
>

Apologies, but I don't recall exactly what time period you were on leave
for, so you may have already seen some or all of these.

I have looked up up several cases of asymmetrical breaks that happened (due
to changes in multiple projects) during Juno (before Ironic was voting in
Nova's check queue). At least one of these was the result of a change in
Nova after the Ironic driver merged. Links at the bottom for reference [1].

Here is a specific example where a patch introduced subtle behavior changes
within the resource tracker that were not caught by any of Nova's tests,
and would not have been caught by the API contract test, even if that had
been in place at the time, nor any other API contract test, since it did
not, in fact, change the API. It changed a behavior of the resource tracker
which, it turns out, libvirt does not use (at least within the
devstack-gate environment).

https://review.openstack.org/#/c/71557/33/nova/compute/resource_tracker.py

The problem is at line 385, where the supplied 'stats' are overwritten.
None of the libvirt-based tests touched this code path, though.

> def _write_ext_resources(self, resources):
>    resources['stats'] = {}        ## right here
>    resources['stats'].update(self.stats)
>    self.ext_resources_handler.write_resources(resources)



-Devananda


[1]
https://review.openstack.org/#/c/120857/
https://review.openstack.org/#/c/111538/
https://review.openstack.org/#/c/107562/
https://review.openstack.org/#/c/105599/
https://review.openstack.org/#/c/71557/
https://review.openstack.org/#/c/68942/

[2]
splitting out virt drivers
http://lists.openstack.org/pipermail/openstack-dev/2014-September/044872.html

splitting out the scheduler and resource tracker
https://etherpad.openstack.org/p/kilo-nova-scheduler-rt

support for clustered hypervisors // why Ironic subclasses ComputeManager
https://etherpad.openstack.org/p/juno-nova-clustered-hypervisor-support
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20141125/0748dc67/attachment.html>


More information about the OpenStack-dev mailing list