Open Stack

Tue Nov 11 02:13:18 UTC 2014

Hi all

The HA session was really well attended and I'd like to give some feedback
from the session.

Firstly there is some really good content here:
https://etherpad.openstack.org/p/kilo-crossproject-ha-integration

1. We SHOULD provide better health checks for OCF resources
(http://linux-ha.org/wiki/OCF_Resource_Agents).
These should be fast and reliable. We should probably bikeshed on some
convention like "<project>-manage healthcheck"
and then roll this out for each project.
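
To make the convention above concrete, here is a minimal sketch of what a
"<project>-manage healthcheck" command could look like. The function names,
the HOST/PORT arguments and the TCP-connect probe are all assumptions for
illustration, not an existing OpenStack interface; only the exit code values
come from the OCF resource agent API.

```python
import socket
import sys

# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def healthcheck(host, port, timeout=2.0):
    """Return an OCF exit code based on whether a TCP connect succeeds.

    Hypothetical probe: a real check would likely call a cheap API
    endpoint of the service rather than only opening a socket.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OCF_SUCCESS
    except ConnectionRefusedError:
        # Nothing is listening: report the service as cleanly stopped.
        return OCF_NOT_RUNNING
    except OSError:
        # Timeouts, unreachable hosts and similar failures.
        return OCF_ERR_GENERIC


def main(argv):
    # Hypothetical CLI shape: "<project>-manage healthcheck HOST PORT"
    if len(argv) != 4 or argv[1] != "healthcheck":
        print("usage: %s healthcheck HOST PORT" % argv[0], file=sys.stderr)
        return OCF_ERR_GENERIC
    return healthcheck(argv[2], int(argv[3]))
```

A console entry point would call sys.exit(main(sys.argv)), so an OCF
resource agent's monitor action could pass the exit code straight through.
The short timeout is there to keep the check fast, as asked for above.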

2. We should really move
https://github.com/madkiss/openstack-resource-agents to stackforge or
openstack if the author is agreeable to it (it's referred to in our
official docs).

3. All services SHOULD support Active/Active configurations
    (better scaling, and every node is always in use, so it's always tested)

4. We should be testing HA (there are a number of ideas on the etherpad
about this)

5. Many services do not recover in the case of failure mid-task.
    This seems like a big problem to me (some leave the DB in a mess).
Someone linked to an interesting article on crash-only software
(http://lwn.net/Articles/191059/) that suggests that if we do this
correctly we should not need the concept of clean shutdown
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471).
    I'd be interested in how people think this needs to be approached
(just raise bugs for each?).
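
As a rough illustration of the crash-only idea in point 5, the sketch below
runs the same recovery step on every start, so a crash mid-task and a normal
restart take the same path. The "tasks" table, its states and the function
names are hypothetical and do not reflect any OpenStack service's schema; it
also assumes the work itself is safe to retry.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) task store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, state TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def recover(conn):
    """Run on every start: requeue tasks a previous process left mid-flight.

    Returns the number of tasks that were moved back to 'pending'.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET state = 'pending' WHERE state = 'running'"
        )
    return cur.rowcount


def run_task(conn, task_id, work):
    """Mark a task as running, do the work, then mark it done.

    If work() raises (or the process dies), the row stays 'running'
    and the next call to recover() picks it up again.
    """
    with conn:
        conn.execute(
            "UPDATE tasks SET state = 'running' WHERE id = ?", (task_id,)
        )
    work()
    with conn:
        conn.execute(
            "UPDATE tasks SET state = 'done' WHERE id = ?", (task_id,)
        )
```

Because recover() is always called at startup, there is no separate clean
shutdown path to get wrong; killing the process is the shutdown.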

Regards
Angus

[openstack-dev] [all] HA cross project session summary and next steps
