Not to sidetrack the conversation too much, but some of our mid- and long-term goals for TripleO (*) align closely with this discussion.

Specifically, we're working towards having Heat deploy HA pairs for all the important service bits inside an OpenStack cloud, and then using Heat to orchestrate no-downtime upgrades of those HA pairs. I believe folks are already working on this, but it isn't directly relevant to this discussion.
At some point, TripleO will also need a mechanism for no-downtime upgrades of nova-compute nodes. Whether that is an in-place restart, an evacuation, or (ideally) a live migration, Heat is going to need to drive it, which means it must be manageable via the API. The same mechanism could presumably be tied into monitoring, so that an evacuation is triggered if a compute host has been down for a certain length of time, and it would also need to coordinate with Heat so that autoscaling doesn't kick in at the same time. This would need to avoid split-brain situations and the like.
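
To make that concrete: a minimal sketch of what an external service driving this through the nova API might look like, using Grizzly-era python-novaclient. The credentials, endpoint, and host names below are made up, and on_shared_storage=True is just an assumption about the deployment:

    from novaclient.v1_1 import client

    # Made-up admin credentials and endpoint, purely for illustration.
    nova = client.Client('admin', 'secret', 'admin',
                         'http://keystone.example.com:5000/v2.0')

    failed_host = 'compute-03.example.com'  # flagged down by monitoring
    target_host = 'compute-07.example.com'  # a blueprint exists to let the
                                            # scheduler pick this instead

    # Rebuild every instance that was running on the failed host.
    for server in nova.servers.list(search_opts={'host': failed_host,
                                                 'all_tenants': 1}):
        nova.servers.evacuate(server, target_host, on_shared_storage=True)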

Regards,
Devananda

(*) https://github.com/tripleo/incubator/blob/master/README.md#what-is-tripleo

On Tuesday, April 23, 2013, Leen Besselink wrote:
> On Tue, Apr 23, 2013 at 10:08:19AM -0400, Russell Bryant wrote:
> > On 04/23/2013 03:31 AM, Leen Besselink wrote:
> > > > I was only talking about the fencing off a compute node part, since
> > > > that's what you started the thread with. :-)
> > >
> > > I know I'm going in circles, just trying to get a feel for the best way to handle it.
> > >
> > > >
> > > > Presumably you would still use nova APIs that already exist to move the
> > > > instances elsewhere. An 'evacuate' API went into Grizzly for this.
> > > >
> > > > https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha
> > > >
> > >
> > > So when any node fails in a Pacemaker cluster, you fence the node, tell OpenStack about the
> > > failed node, and call evacuate for all the instances. The scheduler will just place them anywhere
> > > it pleases.
> > >
> > > (There is already a blueprint for evacuate to call the scheduler, and even another for handling a whole node.)
> > >
> > > So, yeah, maybe that is enough.
> > >
> > > I guess I was hoping all machines would be the same. Now I'll need to make clusters. To OpenStack
> > > they will still all look the same, I guess.
> > >
> > > But it will work with existing, tested code; that is also important.
> >
> > Yeah, it's not really ideal, but it's something that works with existing
> > tools. I thought of a pretty big hole here, though. We want to
> > restrict what a compute node can do as much as possible for security
> > reasons. Clustering them together and allowing them to communicate back
> > to the nova API to perform administrative functions (evacuating
> > instances) is extremely contrary to that goal.
> >
> > In any case, I think the usage of fence-agents is good, but it should be
> > something outside of the existing OpenStack components that uses it.
> > Compute nodes need to be monitored, but they need to be restricted from
> > having any administrative capabilities for security reasons.
> >
> > What should perform the monitoring, fencing, evacuating, and whatnot is
> > a bit of a question mark. I think as a community we are seriously
> > lacking good open source cloud infrastructure management tools.
> > Companies are developing their own for private use, or as proprietary
> > value adds, but we need some open solutions here.
> >
>
> As Alex Glikson suggested, the ZooKeeper code in Nova could be used to centralize this.
>
> So security-wise that could be a good start: you don't need anything on the node itself. It is also
> simpler, because you don't need to create clusters.
>
> If OpenStack had a central service which handled fencing, that might be enough.
>
> A simple fencing implementation would be for the service to just send an IPMI poweroff request to
> the node, then send another IPMI request to check that the current power state is off, mark the
> node as down, and then call the scheduler to start the instances somewhere else (evacuate).
>
> Or am I missing something?
>
> > --
> > Russell Bryant
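
For concreteness, here is a minimal sketch of the central fencing flow Leen describes above (IPMI power-off, verify the power state, mark the node down, then evacuate), assuming ipmitool and python-novaclient are available. The BMC address, credentials, and host names are made up, and the nova client is assumed to be set up as in the earlier sketch:

    import subprocess
    import time

    def ipmi(bmc, user, password, *args):
        # Run an ipmitool command against the failed node's BMC.
        return subprocess.check_output(
            ['ipmitool', '-H', bmc, '-U', user, '-P', password]
            + list(args)).decode()

    def fence_and_evacuate(nova, bmc, user, password,
                           failed_host, target_host):
        # 1. Fence: force the node off so it can no longer touch
        #    shared state.
        ipmi(bmc, user, password, 'chassis', 'power', 'off')

        # 2. Verify: refuse to evacuate until the power state really
        #    reads "off", otherwise an instance could end up running
        #    in two places at once (split brain).
        for _ in range(30):
            if 'off' in ipmi(bmc, user, password,
                             'chassis', 'power', 'status'):
                break
            time.sleep(2)
        else:
            raise RuntimeError('%s did not power off; not evacuating'
                               % failed_host)

        # 3. Mark the node down so the scheduler stops placing new
        #    instances there, then rebuild its instances elsewhere.
        nova.services.disable(failed_host, 'nova-compute')
        for server in nova.servers.list(search_opts={'host': failed_host,
                                                     'all_tenants': 1}):
            nova.servers.evacuate(server, target_host,
                                  on_shared_storage=True)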