Open Stack

Sat Apr 5 21:00:42 UTC 2014

One fairly common failure mode folk run into is registering a node
with a nova-bm/ironic environment that is itself part of that
environment. E.g. if you deploy ironic-conductor using Ironic (scaling
out a cluster say), that conductor can then potentially power itself
off if the node that represents itself happens to map to it in the
hash ring. It happens manually too, when folk are just entering lots of
nodes and don't realise one of them is also a deployment server :).

I'm thinking that a good solution will have the following properties:
 - it's possible to manually guard against this
 - we can easily make the guard work for nova deployed machines

And that we don't need to worry about:
 - arbitrary other machines in the cluster (because that's a heat
responsibility, to not request redeploy of too many machines at once).

For now, I only want to think about solving this for Ironic :).

I think the following design might work:
 - a config knob in ironic-conductor that specifies its own pm_address
 - we push that back up as part of the hash ring metadata
 - in the hash ring, don't set a conductor as primary or fallback for a
node if the node's pm_address matches that conductor's own pm_address
 - in the Nova Ironic driver add instance metadata with the pm address
(only) of the node

Then we can just glue the instance metadata field to the conductor config key.
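
To make the hash ring part concrete, here is a rough sketch of the
filtering step. It is not Ironic's actual hash ring code: the Conductor
and HashRing classes, the pick_conductors name and the replicas
parameter are all made up for illustration. The only idea taken from
the design above is that a conductor advertising its own pm_address is
skipped as primary or fallback for a node with the same pm_address.

```python
import hashlib
from bisect import bisect
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Conductor:
    # Hypothetical record; pm_address would come from the proposed
    # config knob and be published as hash ring metadata.
    hostname: str
    pm_address: Optional[str] = None


def _hash(key: str) -> int:
    # Stable hash so every conductor computes the same ring.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, conductors: List[Conductor], replicas: int = 16):
        # Place each conductor on the ring several times to even out load.
        points = []
        for c in conductors:
            for i in range(replicas):
                points.append((_hash("%s-%d" % (c.hostname, i)), c))
        points.sort(key=lambda p: p[0])
        self._keys = [p[0] for p in points]
        self._owners = [p[1] for p in points]

    def pick_conductors(self, node_uuid: str, node_pm_address: str,
                        count: int = 2) -> List[Conductor]:
        """Return up to `count` conductors (primary first, then fallbacks),
        never including a conductor whose own pm_address is the node's."""
        if not self._keys:
            return []
        start = bisect(self._keys, _hash(node_uuid))
        chosen: List[Conductor] = []
        for offset in range(len(self._keys)):
            c = self._owners[(start + offset) % len(self._keys)]
            # The guard: this node *is* that conductor, so skip it.
            if c.pm_address is not None and c.pm_address == node_pm_address:
                continue
            if c not in chosen:
                chosen.append(c)
            if len(chosen) == count:
                break
        return chosen


ring = HashRing([
    Conductor("cond-a", "10.0.0.1"),
    Conductor("cond-b", "10.0.0.2"),
    Conductor("cond-c", "10.0.0.3"),
])
# The node with pm_address 10.0.0.2 is cond-b itself, so cond-b is
# never asked to manage (and so never powers off) its own hardware.
owners = ring.pick_conductors("node-1234", "10.0.0.2")
```

A conductor with no pm_address configured is treated as before, so the
guard stays opt-in, which matches the "possible to manually guard"
property above.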

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

[openstack-dev] [TripleO][Ironic][Nova-BM] avoiding self-power-off scenarios
