[openstack-dev] [fuel] [HA] How long we need to wait for cloud recovery after some destructive scenarios?
tnurlygayanov at mirantis.com
Wed Jun 3 20:05:18 UTC 2015
Anastasia, thank you!
On Wed, Jun 3, 2015 at 1:55 PM, Anastasia Urlapova <aurlapova at mirantis.com> wrote:
> You can find some numbers and developer recommendations at the link; it is our
> HA Guide, feel free to contribute.
> On Wed, Jun 3, 2015 at 1:06 PM, Timur Nurlygayanov <
> tnurlygayanov at mirantis.com> wrote:
>> Looks like I forgot to add the link to HAOS in the first email:
>>  https://github.com/stackforge/haos
>> On Wed, Jun 3, 2015 at 12:50 PM, Timur Nurlygayanov <
>> tnurlygayanov at mirantis.com> wrote:
>>> Hi team,
>>> I'm working on automated HA / destructive / recovery tests for
>>> OpenStack clouds, and I want to get some expectations from users, operators
>>> and developers for the speed of OpenStack recovery after some destructive
>>> scenarios.
>>> For example, how long should the cluster be unavailable if one of three
>>> controllers is destroyed? I think that the right answer is '0 seconds,
>>> no downtime' - users shouldn't see anything strange when we lose one
>>> controller in our cloud (if it is a 'true' HA configuration).
>>> In the real world I can see that recovering the cloud after such
>>> destructive scenarios takes some time (1-15 minutes, depending on the
>>> case), so I want to get your expectations or requirements.
>>> How fast can / should we fully recover the cloud in the following cases:
>>> 1. Restart RabbitMQ services
>>> 2. Restart MySQL / Galera services
>>> 3. Restart Neutron services (like L3 agents)
>>> 4. Hard shutdown of any OpenStack controller
>>> 5. Shutdown of the Ethernet interfaces of management / data networks
>>> Of course, it depends on the configuration, but we can describe some
>>> common, 'expected' acceptance values (SLA) for downtime in different
>>> destructive cases and use them to verify clouds today and in the future.
>>> We will use these values in the HAOS project, which will allow us to
>>> validate any cloud with the same scenarios and the same SLA for
>>> recovery time.
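Inline note on measuring this: below is a minimal sketch, in Python and not taken from HAOS, of how the recovery time for one scenario could be measured and checked against an SLA value. The names measure_downtime and within_sla, the probe callable, and the thresholds in the usage note are assumptions made for illustration; nothing in this thread defines them.

```python
import time
from typing import Callable, Optional


def measure_downtime(
    probe: Callable[[], bool],
    window: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll probe() for `window` seconds after a fault has been injected.

    Returns 0.0 if the probe never failed, None if the service was still
    failing at the last poll, otherwise the seconds between the first
    failed poll and the first successful poll after the last failure.
    """
    start = clock()
    first_fail: Optional[float] = None
    recovered_at: Optional[float] = None
    while clock() - start < window:
        now = clock()
        try:
            ok = bool(probe())
        except Exception:
            # A probe that raises (timeout, connection refused) counts as down.
            ok = False
        if not ok:
            if first_fail is None:
                first_fail = now
            recovered_at = None  # a new failure cancels an earlier recovery
        elif first_fail is not None and recovered_at is None:
            recovered_at = now
        sleep(interval)
    if first_fail is None:
        return 0.0
    if recovered_at is None:
        return None
    return recovered_at - first_fail


def within_sla(downtime: Optional[float], sla_seconds: float) -> bool:
    """True only if the service recovered and the downtime fits the SLA."""
    return downtime is not None and downtime <= sla_seconds
```

For example, with a probe that calls an API health check of your choice, within_sla(measure_downtime(probe, window=900), 300) would report whether the cloud came back within a hypothetical 5-minute limit. The measured value is only as precise as the polling interval.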
>>> Any comments are welcome :)
>>> Thank you!
>>> Senior QA Engineer
>>> OpenStack Projects
>>> Mirantis Inc