[openstack-dev] [nova][cinder] how to handle AZ bug 1496235?

Sam Morrison sorrison at gmail.com
Thu Sep 24 21:22:43 UTC 2015


> On 24 Sep 2015, at 6:19 pm, Sylvain Bauza <sbauza at redhat.com> wrote:
> 
> Ahem, Nova AZs are not failure domains, at least not in the current implementation and not in the sense most people understand a failure domain, i.e. a physical unit of machines (a bay, a room, a floor, a datacenter).
> All the AZs in Nova share the same control plane, with the same message queue and database, which means that a failure can propagate from one AZ to another.
> 
> To be honest, there is one very specific use case where AZs *are* failure domains: when cells match AZs exactly (i.e. one AZ grouping all the hosts behind one cell). That's the use case Sam mentions in his email, and I certainly understand we need to keep it.
> 
> What AZs are in Nova is explained pretty well in a fairly old blog post: http://blog.russellbryant.net/2013/05/21/availability-zones-and-host-aggregates-in-openstack-compute-nova/

Yes, an AZ may not be a failure domain in terms of control infrastructure; I think all operators understand this. If you want control infrastructure failure domains, use regions.

However, at the resource level (e.g. a running instance or a running volume) I would consider them some kind of failure domain. It’s a way of telling a user that if you have resources running in 2 AZs, you have a more available service.
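
To make the resource-level point concrete, here is a rough sketch in plain Python. It does not call any Nova or Cinder API; the helper names (spread_across_azs, survivors) and the AZ and instance names are made up for illustration. It only shows that resources spread over 2 AZs partly survive the loss of one AZ, assuming the AZs really do fail independently at the resource level.

```python
from itertools import cycle


def spread_across_azs(resource_names, az_names):
    """Assign each resource to an AZ in round-robin order (hypothetical helper)."""
    if not az_names:
        raise ValueError("need at least one AZ")
    az_cycle = cycle(az_names)
    return {name: next(az_cycle) for name in resource_names}


def survivors(placement, failed_az):
    """Return the resources still running if every resource in failed_az is lost."""
    return sorted(name for name, az in placement.items() if az != failed_az)


# Illustrative names only; nothing here comes from a real cloud.
instances = ["web-1", "web-2", "web-3", "web-4"]
placement = spread_across_azs(instances, ["az-a", "az-b"])
single_az = spread_across_azs(instances, ["az-a"])
```

With two AZs, survivors(placement, "az-a") still leaves half of the instances running; with everything in a single AZ, losing that AZ leaves none.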

Every cloud will have a different definition of what an AZ is: a rack, a collection of racks, a DC, etc. OpenStack doesn’t need to decide what that is.

Sam
