Open Stack

Fri Mar 29 09:32:14 UTC 2019

On 28/03, Sean McGinnis wrote:
> On Thu, Mar 28, 2019 at 08:56:25AM -0500, Ben Nemec wrote:
> > Fixed the cinder tag so this should show up for people who filter on it.
> >
> > On 3/28/19 3:18 AM, vkalaitzis wrote:
> > > I would like to ask if cinder-volume supports active/active setups.
> > > I found that there was a blueprint for adding support (https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support),
> > > but as far as I am concerned, no official documentation of such feature
> > > exists so far.
> >
> > My understanding is that Cinder does support active-active configuration
> > now, but someone from the Cinder team will have to provide details.
> >
>
> The support is merged in Cinder for active/active, but the hold up now has been
> getting Cinder driver maintainers to certify their drivers and storage devices
> will work correctly when running in that configuration.
>
> I seem to recall some work being done with the Ceph/RBD driver, but I don't
> believe anything has merged for any drivers to enable that.
>

Hi,

The RBD driver can be deployed in Active-Active [1], and there is work
underway for TripleO to support Active-Active deployments, but there
has not been exhaustive testing to confirm that the current state of the
feature, and the driver itself, have no issues under these conditions.
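
To illustrate what "can be deployed" means at the driver level, here is
a minimal, self-contained sketch.  It assumes that the line referenced
in [1] is a class attribute named SUPPORTS_ACTIVE_ACTIVE which the
driver sets to opt in, and that the volume service refuses to start in
a cluster when the driver has not opted in.  Every other name below
(BaseDriver, RBDLikeDriver, LegacyDriver, check_driver) is hypothetical
and is not Cinder code.

```python
class ActiveActiveNotSupported(Exception):
    """Hypothetical error for a driver that has not opted in."""


class BaseDriver:
    # Assumption: drivers are opt-in, so the default is False.
    SUPPORTS_ACTIVE_ACTIVE = False


class RBDLikeDriver(BaseDriver):
    # Stand-in for a driver that declares Active-Active support.
    SUPPORTS_ACTIVE_ACTIVE = True


class LegacyDriver(BaseDriver):
    # Stand-in for a driver whose maintainers have not certified it.
    pass


def check_driver(driver, cluster_name):
    """Refuse a clustered deployment for drivers that did not opt in.

    cluster_name is None for a classic Active-Passive service.
    """
    if cluster_name and not driver.SUPPORTS_ACTIVE_ACTIVE:
        raise ActiveActiveNotSupported(type(driver).__name__)
    return True
```

Note that a flag like this only records that the driver claims support;
it says nothing about whether the failure cases have been validated.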

As Sean says, the holdup is testing and validation.  We would need a
deterministic mechanism to model and run failure test cases.  Without
it, we won't know whether the tests are passing only because the failure
happened while the code was in a happy place.

We could also run some kind of chaos monkey, but then we would need to
be able to determine what the expected state of the system is, based on
where the code was when the monkey hit it.

For this purpose, a couple of years back, I started working on a testing
framework that would allow us to inject predefined actions, replace
method return values, generate side effects, and even inject arbitrary
code into a running Cinder deployment.  The API request included the
testing action as well as the normal request made to Cinder.  The
testing action also defined where in the code, and in which service, it
should be run.  I vaguely remember the injection mechanism also allowed
calls to be changed from async to sync in order to be able to return
data to inspect internal service states.
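
The general idea of a deterministic injection point can be sketched as
follows.  This is not the PoC mentioned above, only a minimal
single-process illustration; all names (injection_point, inject,
return_value, fail_after, FakeVolumeService, and the point name) are
hypothetical.

```python
import functools

# Hypothetical registry mapping an injection point name to a test action.
_INJECTIONS = {}


class InjectedFailure(Exception):
    """Simulates a service dying at a known place in the code."""


def inject(point, action):
    """Arm a test action at the named injection point."""
    _INJECTIONS[point] = action


def clear():
    """Disarm all injection points."""
    _INJECTIONS.clear()


def injection_point(name):
    """Mark a method as a place where a test action may be run."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            action = _INJECTIONS.get(name)
            if action is None:
                return func(*args, **kwargs)
            return action(func, *args, **kwargs)
        return wrapper
    return decorator


def return_value(value):
    """Action: skip the real method and return a canned value."""
    return lambda func, *args, **kwargs: value


def fail_after(exc):
    """Action: run the real method, then raise to simulate a crash."""
    def action(func, *args, **kwargs):
        func(*args, **kwargs)
        raise exc
    return action


class FakeVolumeService:
    """Toy stand-in for a service; state maps volume id to status."""

    def __init__(self):
        self.state = {}

    @injection_point("volume.create.after_status_update")
    def mark_creating(self, volume_id):
        self.state[volume_id] = "creating"
        return True

    def create(self, volume_id):
        self.mark_creating(volume_id)
        self.state[volume_id] = "available"
```

Because the failure is armed at a named point, the expected state after
the failure is known in advance: with fail_after armed on
"volume.create.after_status_update", a create call raises and leaves the
volume in "creating", which is exactly the state a cleanup test would
need to start from.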

Unfortunately, a change in priorities made me drop this work in the
early PoC stages, and I don't know when I'll be able to get back to it.

Cheers,
Gorka.

[1]: https://opendev.org/openstack/cinder/src/branch/master/cinder/volume/drivers/rbd.py#L218

[cinder] Cinder Volume Active/Active Support
