[openstack-dev] [all] HA cross project session summary and next steps

Miguel Ángel Ajo majopela at redhat.com
Fri Nov 14 06:23:35 UTC 2014


Thank you for sharing, I missed that session.

Somewhat related to the health checks: https://review.openstack.org/#/c/97748/

This is a spec for oslo that I’m working on. It lets daemons provide feedback to the process
manager that runs them (init.d, pacemaker, systemd, pacemaker+systemd, upstart).

The idea is that daemons themselves could report their inner status as a
status code plus a status message. This would make it possible to signal, for example, degraded operation.
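
A minimal sketch of what such a report could look like, assuming systemd's
notify socket (the NOTIFY_SOCKET environment variable and a "STATUS=" datagram)
as the transport. The Health codes and the format_status/report_status names
are hypothetical and are not taken from the spec; other process managers would
need a different channel.

```python
import os
import socket
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical status codes, not the ones defined in the spec."""
    OK = 0
    DEGRADED = 1
    FAILED = 2


def format_status(code, message):
    """Build a one-line payload combining the status code and message."""
    code = Health(code)
    return "STATUS={} {}: {}".format(int(code), code.name, message)


def report_status(code, message, notify_socket=None):
    """Send the status to the process manager.

    Returns False when no notify socket is configured, so the daemon
    keeps working when started by a manager without this channel.
    """
    path = notify_socket or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract namespace socket.
        path = "\0" + path[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_status(code, message).encode("utf-8"), path)
    finally:
        sock.close()
    return True
```

A daemon that loses one of several backends could then call
report_status(Health.DEGRADED, "1 of 3 backends unreachable") and keep serving.
Under systemd this assumes the unit accepts notifications from the service
(for example Type=notify).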

Feedback and comments on the spec are appreciated.

Best regards,
Miguel Ángel



ajo @ freenode.net


On Thursday, 13 November 2014 at 12:59, Angus Salkeld wrote:

> On Tue, Nov 11, 2014 at 12:13 PM, Angus Salkeld <asalkeld at mirantis.com> wrote:
> > Hi all
> >  
> > The HA session was really well attended and I'd like to give some feedback from the session.
> >  
> > Firstly there is some really good content here: https://etherpad.openstack.org/p/kilo-crossproject-ha-integration
> >  
> > 1. We SHOULD provide better health checks for OCF resources (http://linux-ha.org/wiki/OCF_Resource_Agents).  
> > These should be fast and reliable. We should probably bike shed on some convention like "<project>-manage healthcheck"
> > and then roll this out for each project.
> >  
> > 2. We should really move https://github.com/madkiss/openstack-resource-agents to stackforge or openstack if the author is agreeable to it (it's referred to in our official docs).
> >  
>  
> I have chatted to the author of this repo and he is happy for it to live under stackforge or openstack. Alternatively, each OCF resource agent could go into its respective project.
> Does anyone have any particular preference? I suspect stackforge will be the path of least resistance.
>  
> -Angus
>   
> > 3. All services SHOULD support Active/Active configurations
> >     (better scaling and it's always tested)
> >  
> > 4. We should be testing HA (there are a number of ideas on the etherpad about this)
> >  
> > 5. Many services do not recover in the case of failure mid-task
> >     This seems like a big problem to me (some leave the DB in a mess). Someone linked to an interesting article
> >     (crash-only software: http://lwn.net/Articles/191059/) that suggests that if we do this correctly we should not need the concept of clean shutdown
> >     (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471).
> >     I'd be interested in how people think this needs to be approached (just raise bugs for each?).
> >  
> > Regards
> > Angus
>  

