[openstack-dev] [Nova] [Ironic] [TripleO] Fixing HostManager, take two

Devananda van der Veen devananda.vdv at gmail.com
Mon Jun 9 23:09:18 UTC 2014

Last week, we tried to fix a bug in the way that Nova's baremetal and
ironic drivers are using the HostManager / HostState classes --
they're incorrectly reporting capabilities in an older fashion, which
is not in use any more, and thus not exposing the node's "stats" to
the scheduler. The fix actually broke both drivers but went unnoticed
in reviews on the original patch. Reverting that took about a week,
and Ironic patches have been blocked since then, but that's not what
I'm writing about.

I'd like to present my view of all the related patches and propose a
way forward for this fix. I'd also like to thank Hans for looking into
this and proposing a fix in the first place, and thank Hans and many
others for helping to address the resulting issues very quickly.

This is the original bug:
  BaremetalHostManager cannot distinguish baremetal hosts from other hosts

The original attempted fix (now reverted):

This broke Ironic because it changed the signature of
HostState.__init__(), and it broke Nova-baremetal because it didn't
save "stats" in update_from_compute_node(). A fix was proposed for
each project...

for Nova:

for Ironic:

If 97806 had been part of the original 94043, this change would
probably not have negatively affected nova's baremetal driver.
However, it still would have broken Ironic until 97447 could have been
landed. I should have noticed this when the
check-tempest-dsvm-virtual-ironic-nv job on that patch failed (I, like
others, have apparently fallen into the bad habit of ignoring test
results which say "non-voting").

So, until such time as the necessary driver and other changes are able
to land in Nova, and at Sean's suggestion, we've proposed a change to
the nova unit tests to "watch" those internal APIs that Ironic depends

This will at least make it very explicit to any Nova reviewer that a
change to these APIs will affect Ironic. We can also set up a watch on
changes to this file, alerting us if there is a patch changing an API
that we depend on.

As for how to proceed, I would like to suggest the following:
- 97447 be reworked to support both the current and proposed HostState
parameter lists
- 94043 and 97806 be squashed and reproposed, but held until after
97447 and 98201 land
- a new patch be proposed to ironic to remove support for the now-old
HostState parameter list

Thoughts? Suggestions?


More information about the OpenStack-dev mailing list