[openstack-dev] [fuel] [HA] How long we need to wait for cloud recovery after some destructive scenarios?

Anastasia Urlapova aurlapova at mirantis.com
Wed Jun 3 10:55:02 UTC 2015


Timur,
you can find some numbers and developers' recommendations at link [0]. It is
our HA Guide, feel free to contribute.

Nastya.

[0]
https://wiki.openstack.org/wiki/HAGuideImprovements/TOC#HA_Intro_and_Concepts

On Wed, Jun 3, 2015 at 1:06 PM, Timur Nurlygayanov <
tnurlygayanov at mirantis.com> wrote:

> Looks like I forgot to add the link to [1] in the first email:
>
> [1] https://github.com/stackforge/haos
>
> On Wed, Jun 3, 2015 at 12:50 PM, Timur Nurlygayanov <
> tnurlygayanov at mirantis.com> wrote:
>
>> Hi team,
>>
>> I'm working on HA / destructive / recovery automated tests [1] for
>> OpenStack clouds and I want to get some expectations from users, operators
>> and developers for the speed of OpenStack recovery after some destructive
>> actions.
>> For example, how long should the cluster be unavailable if one of three
>> controllers is destroyed? I think the right answer is '0 seconds,
>> no downtime' - users shouldn't see anything strange when we lose one
>> controller in our cloud (if it is a 'true' HA configuration).
>> In the real world I can see that such destructive scenarios require some
>> time to recover the cloud (1-15 minutes in different cases) - and I just
>> want to get your expectations or requirements.
>>
>> How fast can / should we fully recover the cloud in the following cases:
>> 1. Restart RabbitMQ services
>> 2. Restart MySQL / Galera services
>> 3. Restart Neutron services (like L3 agents)
>> 4. Hard shutdown of any OpenStack controller
>> 5. Shutdown of the Ethernet interfaces of management / data networks
>>
>> Of course, it depends on the configuration, but we can describe some
>> common, 'expected' acceptance values (SLA) for downtime in different
>> destructive cases and use them to verify clouds today and in the future.
>> We will use these values in the HAOS project [1], which will allow us to
>> validate any cloud with the same scenarios and the same SLA for
>> recovery time.
>>
>> Any comments are welcome :)
>> Thank you!
>>
>> --
>>
>> Timur,
>> Senior QA Engineer
>> OpenStack Projects
>> Mirantis Inc
>>
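To make the idea of per-scenario acceptance values concrete, here is a minimal sketch of how a recovery-time check could look. It is not taken from the HAOS project: the names measure_downtime, within_sla and EXAMPLE_SLA_SECONDS are hypothetical, and the numbers in the table are placeholders only, since the thread does not agree on any values. The sketch assumes the destructive action has already been triggered and that the caller supplies a probe() callable that returns True while the cloud API answers correctly.

```python
import time
from typing import Callable, Dict, Optional

# Placeholder acceptance values in seconds (assumptions, not recommendations).
EXAMPLE_SLA_SECONDS: Dict[str, float] = {
    "restart_rabbitmq": 60.0,
    "restart_galera": 60.0,
    "restart_neutron_l3_agent": 60.0,
    "controller_hard_shutdown": 0.0,
    "network_interface_down": 60.0,
}


def measure_downtime(
    probe: Callable[[], bool],
    window: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll probe() for `window` seconds and return the observed downtime.

    Downtime runs from the first failed probe to the first successful probe
    after the last failure. Returns 0.0 if no probe failed, and None if the
    service was still failing when the observation window ended.
    """
    start = clock()
    first_failure: Optional[float] = None
    recovered_at: Optional[float] = None
    while clock() - start < window:
        now = clock()
        try:
            ok = bool(probe())
        except Exception:
            # A probe that raises (timeout, refused connection) counts as down.
            ok = False
        if not ok:
            if first_failure is None:
                first_failure = now
            recovered_at = None  # still down, or down again
        elif first_failure is not None and recovered_at is None:
            recovered_at = now
        sleep(interval)
    if first_failure is None:
        return 0.0
    if recovered_at is None:
        return None
    return recovered_at - first_failure


def within_sla(scenario: str, downtime: Optional[float],
               sla: Dict[str, float] = EXAMPLE_SLA_SECONDS) -> bool:
    """True if the cloud recovered and did so within the scenario's limit."""
    return downtime is not None and downtime <= sla[scenario]
```

The measured value is only as precise as the polling interval, so a '0 seconds, no downtime' expectation can only be checked down to that resolution. The clock and sleep parameters are injectable so the logic can be exercised without waiting in real time.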