<p dir="ltr">In case it isn't clear to others, or in case I've misunderstood, I'd like to start by rephrasing the problem statement.</p>
<p dir="ltr">* It is possible to use Ironic to deploy an instance of ironic-conductor on bare metal, which joins the same cluster that deployed it.<br>
* This, or some other event, could cause the hash ring distribution to change such that the instance of ironic-conductor is managed by itself.<br>
* A request to do any management of that instance (e.g., powering it off) will fail in interesting ways...</p>
<p dir="ltr">Adding a CONF setting that a conductor may optionally advertise, which alters the hash mapping and prevents self-managing is reasonable. The ironic.common.hash_ring will need to avoid mapping a node onto a conductor with the same advertised UUID, but I think that will be easy. We can't assume the driver has a "pm_address" key, though - some drivers may not. Since the hash ring already knows node UUID, and a node's UUID is known before an instance can be deployed to it, I think this will work. You can pass that node's UUID in via heat when deploying Ironic via Ironic, and the config will be present the first time the service starts, regardless of which power driver is used.</p>
<p dir="ltr">Also, the node UUID is already pushed out to Nova instance metadata :)</p>
<p dir="ltr">--<br>
Devananda</p>
<div class="gmail_quote">On Apr 5, 2014 2:01 PM, "Robert Collins" <<a href="mailto:robertc@robertcollins.net">robertc@robertcollins.net</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
One fairly common failure mode folks run into is registering a node<br>
with a nova-bm/ironic environment that the node is itself part of.<br>
E.g. if you deploy ironic-conductor using Ironic (scaling out a<br>
cluster, say), that conductor can then potentially power itself off<br>
if the node that represents it happens to map to it in the hash<br>
ring. It also happens manually when folks are entering lots of nodes<br>
and don't realise that one of them is also a deployment server :).<br>
<br>
I'm thinking that a good solution will have the following properties:<br>
- it's possible to manually guard against this<br>
- we can easily make the guard work for nova deployed machines<br>
<br>
And that we don't need to worry about:<br>
- arbitrary other machines in the cluster (because it's Heat's<br>
responsibility not to request redeployment of too many machines at once).<br>
<br>
For now, I only want to think about solving this for Ironic :).<br>
<br>
I think the following design might work:<br>
- a config knob in ironic-conductor that specifies its own pm_address<br>
- we push that back up as part of the hash ring metadata<br>
- in the hash ring, don't set a primary or fallback conductor if the<br>
node's pm address matches the conductor's own pm address<br>
- in the Nova Ironic driver, add instance metadata with the pm address<br>
(only) of the node<br>
<br>
Then we can just glue the instance metadata field to the conductor config key.<br>
<br>
-Rob<br>
<br>
--<br>
Robert Collins <<a href="mailto:rbtcollins@hp.com">rbtcollins@hp.com</a>><br>
Distinguished Technologist<br>
HP Converged Cloud<br>
<br>
</blockquote></div>