[openstack-dev] [nova][cinder] how to handle AZ bug 1496235?

Sylvain Bauza sbauza at redhat.com
Thu Sep 24 18:16:02 UTC 2015



On 24/09/2015 19:45, Matt Riedemann wrote:
>
>
> On 9/24/2015 11:50 AM, Mathieu Gagné wrote:
>> On 2015-09-24 3:04 AM, Duncan Thomas wrote:
>>>
>>> I proposed a session at the Tokyo summit for a discussion of Cinder AZs,
>>> since there was clear confusion about what they are intended for and how
>>> they should be configured. Since then I've reached out to, and gotten
>>> good feedback from, a number of operators.
>>
>> Thanks for your proposition. I will make sure to attend this session.
>>
>>
>>> There are two distinct
>>> configurations for AZ behaviour in cinder, and both sort-of worked until
>>> very recently.
>>>
>>> 1) No AZs in cinder
>>> This is the config where there is a single 'blob' of storage (most of the
>>> operators who responded so far are using Ceph, though that isn't
>>> required). The storage takes care of availability concerns, and any AZ
>>> info from nova should just be ignored.
>>
>> Unless I'm very mistaken, I think it's the main "feature" missing from
>> OpenStack itself. The concept of AZ isn't global and anyone can still
>> make it so Nova AZ != Cinder AZ.
>>
>> In my opinion, AZ should be a global concept where they are available
>> and the same for all services so Nova AZ == Cinder AZ. This could result
>> in a behavior similar to "regions within regions".
>>
>> We should survey and ask how AZs are actually used by operators and
>> users. Some might create an AZ for each server rack, others for each
>> power segment in their datacenter or even for business units so they can
>> segregate to specific physical servers. Some AZ use cases might just be
>> a "perverted" way of bypassing shortcomings in OpenStack itself. We
>> should find out those use cases and see if we should still support them
>> or offer existing or new alternatives.
>>
>> (I don't run Ceph yet, only SolidFire but I guess the same could apply)
>>
>> For people running Ceph (or other big clustered block storage), they
>> will have one big Cinder backend. For resource or business reasons,
>> they can't afford to create as many clusters (and Cinder AZs) as there
>> are AZs in Nova. So they end up with one big Cinder AZ (let's call it
>> az-1). Nova won't be able to create volumes in Cinder az-2 if
>> an instance is created in Nova az-2.
>>
>> May I suggest the following solutions:
>>
>> 1) Add ability to disable this whole AZ concept in Cinder so it doesn't
>> fail to create volumes when Nova asks for a specific AZ. This could
>> result in the same behavior as cinder.cross_az_attach config.
>
> That's essentially what this does:
>
> https://review.openstack.org/#/c/217857/
>
> It defaults to False though so you have to be aware and set it if 
> you're hitting this problem.
>
> The nova block_device code that tries to create the volume and passes 
> the nova AZ should have probably been taking into account the 
> cinder.cross_az_attach config option, because just blindly passing it 
> was the reason why cinder added that option.  There is now a change up 
> for review to consider cinder.cross_az_attach in block_device:
>
> https://review.openstack.org/#/c/225119/
>
> But that's still making the assumption that we should be passing the 
> AZ on the volume create request and will still fail if the AZ isn't in 
> cinder (and allow_availability_zone_fallback=False in cinder.conf).
>
> In talking with Duncan this morning he's going to propose a spec for 
> an attempt to clean some of this up and decouple nova from handling 
> this logic.  Basically a new Cinder API where you give it an AZ and it 
> tells you if that's OK.  We could then use this on the nova side 
> before we ever get to the compute node and fail.

My opinion matches yours: we should decouple Nova AZs from Cinder AZs and
just have a lazy relationship between them, with a way to call Cinder and
find out which AZ applies before calling the scheduler.
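
To make the idea a bit more concrete, here is a rough sketch of the kind of
check being discussed. It is not real Nova or Cinder code: the function
names, the exception and the plain list of Cinder AZs are all hypothetical,
and the allow_fallback flag only mimics what
allow_availability_zone_fallback is described as doing above. Falling back
to a default AZ when no AZ is requested is also an assumption of the sketch.

```python
from typing import Iterable, Optional


class InvalidVolumeAZ(Exception):
    """Hypothetical error: the requested AZ is unknown to the modelled Cinder."""


def resolve_volume_az(
    requested_az: Optional[str],
    cinder_azs: Iterable[str],
    default_az: str,
    allow_fallback: bool = False,
) -> str:
    """Return the AZ a volume would land in, or raise InvalidVolumeAZ.

    Models the behaviour described in the thread, not Cinder's actual code.
    """
    known = set(cinder_azs)
    if requested_az is None:
        # Assumption: no AZ requested means the default AZ is used.
        return default_az
    if requested_az in known:
        return requested_az
    if allow_fallback:
        # Mimics allow_availability_zone_fallback=True: unknown AZ, use default.
        return default_az
    raise InvalidVolumeAZ(f"AZ {requested_az!r} is not known to Cinder")


def az_ok_for_boot(
    nova_az: Optional[str],
    cinder_azs: Iterable[str],
    default_az: str,
    allow_fallback: bool = False,
) -> bool:
    """Hypothetical pre-scheduling check: would a volume create for this AZ work?"""
    try:
        resolve_volume_az(nova_az, cinder_azs, default_az, allow_fallback)
    except InvalidVolumeAZ:
        return False
    return True


if __name__ == "__main__":
    # One big Cinder AZ (az-1), instance requested in Nova az-2.
    print(az_ok_for_boot("az-2", ["az-1"], "az-1"))
    print(az_ok_for_boot("az-2", ["az-1"], "az-1", allow_fallback=True))
```

With a check like this done up front, the request for Nova az-2 against a
single Cinder az-1 could be rejected (or accepted, with fallback enabled)
before anything reaches the scheduler or a compute node.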


>
>>
>> 2) Add ability for a volume backend to be in multiple AZ. Of course,
>> this would defeat the whole AZ concept. This could however be something
>> our operators/users might accept.
>
> I'd nix this on the point about it defeating the purpose of AZs.

Well, if we rename Cinder AZs to something else, then I honestly don't have
a strong opinion, since the naming is already confusing: Nova AZs
are groups of hosts and nothing else.

If we keep calling them AZs, then I'm not OK with it, since it creates more
confusion.

-Sylvain


>
>>
>>
>>> 2) Cinder AZs map to Nova AZs
>>> In this case, some combination of storage / networking / etc couples
>>> storage to nova AZs. It may be that an AZ is used as a unit of
>>> scaling, or it could be a real storage failure domain. Either way, there
>>> are a number of operators who have this configuration and want to keep
>>> it. Storage can certainly have a failure domain, and limiting the
>>> scalability problem of storage to a single compute AZ can have definite
>>> advantages in failure scenarios. These people do not want cross-az attach.
>>>
>>> My hope at the summit session was to agree on these two configurations,
>>> discuss any scenarios not covered by these two configurations, and nail
>>> down the changes we need to get these to work properly. There's
>>> definitely been interest and activity in the operator community in
>>> making nova and cinder AZs interact, and every desired interaction I've
>>> gotten details about so far matches one of the above models.
>>
>>
>



