<div dir="ltr">@Gorka, is what you have of the testing framework available somewhere?<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Mar 29, 2019 at 06:35, Gorka Eguileor <<a href="mailto:geguileo@redhat.com">geguileo@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 28/03, Sean McGinnis wrote:<br>

> On Thu, Mar 28, 2019 at 08:56:25AM -0500, Ben Nemec wrote:<br>

> > Fixed the cinder tag so this should show up for people who filter on it.<br>

> ><br>

> > On 3/28/19 3:18 AM, vkalaitzis wrote:<br>

> > > I would like to ask if cinder-volume supports active/active setups.<br>

> > > I found that there was a blueprint for adding support (<a href="https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support" rel="noreferrer" target="_blank">https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support</a>),<br>

> > > but as far as I am aware, no official documentation of such a feature<br>

> > > exists so far.<br>

> ><br>

> > My understanding is that Cinder does support active-active configuration<br>

> > now, but someone from the Cinder team will have to provide details.<br>

> ><br>

><br>

> The support is merged in Cinder for active/active, but the holdup now has been<br>

> getting Cinder driver maintainers to certify that their drivers and storage devices<br>

> will work correctly when running in that configuration.<br>

><br>

> I seem to recall some work being done with the Ceph/RBD driver, but I don't<br>

> believe anything has merged for any drivers to enable that.<br>

><br>

<br>

Hi,<br>

<br>

The RBD driver can be deployed in Active-Active [1] and there is work<br>

underway for Triple-O to support Active-Active deployments, but there<br>

has not been exhaustive testing to confirm there are no issues with the<br>

current state of the feature and with the driver itself under these<br>

conditions.<br>

<br>

As Sean says, the holdup is caused by testing and validation.  We<br>

would need a deterministic mechanism to model and run failure test<br>

cases.  Without it we won't know if the tests are passing because the<br>

failure happened when the code was in a happy place.<br>

<br>

We could also run some kind of chaos monkey, but then we would need to<br>

be able to determine what the expected state of the system is, based on<br>

where the code was when the monkey hit it.<br>

<br>

For this purpose, a couple of years back, I started working on a testing<br>

framework that would allow us to inject predefined actions, replace<br>

method return values, generate side effects, and even inject arbitrary<br>

code into a running Cinder deployment.  The API request included the<br>

testing action as well as the normal request made to Cinder; the<br>

testing action also defined where in the code and in which service it<br>

should be run.  I vaguely remember the injection mechanism also allowed<br>

calls to be changed from async to sync in order to be able to return<br>

data to inspect internal service states.<br>
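
<br>

To illustrate the general idea only, here is a minimal Python sketch of a<br>

named injection point whose behaviour a test can replace at run time.  It is<br>

not the PoC code and not a Cinder API: INJECTIONS, injection_point,<br>

create_volume and the "volume.create.before_driver" label are all<br>

hypothetical names made up for this example.<br>

<pre>
```python
import functools

# Hypothetical registry: maps an injection-point name to a test action.
# A real framework would fill this from the testing action carried in
# the API request; here a test sets it directly.
INJECTIONS = {}


def injection_point(name):
    """Mark a function as a place where a test action may be injected."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            action = INJECTIONS.get(name)
            if action is None:
                return func(*args, **kwargs)      # normal, uninstrumented path
            if "raise" in action:
                raise action["raise"]             # simulate a failure here
            if "side_effect" in action:
                action["side_effect"](*args, **kwargs)
            if "return" in action:
                return action["return"]           # replace the return value
            return func(*args, **kwargs)
        return wrapper
    return decorator


@injection_point("volume.create.before_driver")
def create_volume(size_gb):
    # Stand-in for a service method; not real Cinder code.
    return {"size": size_gb, "status": "available"}
```
</pre>

A test would then register an action such as<br>

INJECTIONS["volume.create.before_driver"] = {"raise": RuntimeError("boom")}<br>

before issuing the request, so the failure always happens at the same point<br>

in the code instead of wherever the service happened to be.<br>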

<br>

Unfortunately, a change in priorities made me drop this work in the<br>

early PoC stages, and I don't know when I'll be able to get back to it.<br>

<br>

Cheers,<br>

Gorka.<br>

<br>

<br>

<br>

<br>

[1]: <a href="https://opendev.org/openstack/cinder/src/branch/master/cinder/volume/drivers/rbd.py#L218" rel="noreferrer" target="_blank">https://opendev.org/openstack/cinder/src/branch/master/cinder/volume/drivers/rbd.py#L218</a><br>

<br>

</blockquote></div>