[Openstack-sigs] [meta] [self-healing] ANNOUNCE: Self-healing SIG officially formed

Adam Spiers aspiers at suse.com
Mon Nov 27 14:19:25 UTC 2017


TL;DR: the self-healing SIG is officially formed!  Watch the openstack-sigs 
mailing list for future developments. 

A longer version of this announcement can be found at 

    https://blog.adamspiers.org/2017/11/24/announcing-openstacks-self-healing-sig/


A SIG is born!
--------------

After an unofficial kick-off meeting at the last PTG in Denver, I 
proposed the creation of a new self-healing SIG: 

    http://lists.openstack.org/pipermail/openstack-sigs/2017-September/000054.html

At the recent Summit in Sydney, around 30 people attended our Forum
session, which was extremely encouraging! You can read more details
in the etherpad, but here is the quick summary ...

Most importantly, we resolved the naming and scoping issues, 
concluding that to avoid biting off too much in one go, it was better 
to be pragmatic and start small: 

  - Initially focus on cloud infrastructure, and not worry too much
    about the user-facing impact of failures yet; we can add that
    concern whenever it makes sense (which is particularly relevant
    for telcos / NFV).

  - Not worry too much about optimization initially; Watcher is
    possibly the only project focusing on this right now, and again we
    can expand to include optimization any time we want.

Now that the naming and scoping issues are resolved, I am excited to 
announce that the Self-healing SIG is officially formed! 

Discussion went beyond mere administrivia, however:

  - We collected a few initial use cases.

  - We informally decided the governance of the SIG. I asked if anyone
    else would like to assume leadership, but no one seemed keen,
    dashing my hopes of avoiding extra work ;-)  But Eric Kao, PTL of
    Congress, generously offered to act as co-chair.

  - We discussed health check APIs, which were mentioned in at least 2
    or 3 other Forum sessions this time round.

  - We agreed that we wanted an IRC channel, and that it could host
    bi-weekly meetings. However, as usual, there was no clean solution
    to choosing a time which would suit everyone ;-/  I'll try to
    figure out what to do about this!


Get involved
------------

You are warmly invited to join, if this topic interests you: 

  - Ensure you are subscribed to the openstack-sigs mailing list:

      http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-sigs

    and watch out for posts tagged [self-healing].

  - Bookmark https://wiki.openstack.org/wiki/Self_healing_SIG
    which is the SIG's official home.


Next steps
----------

I will set up the IRC channel, and see if we can make progress on 
agreeing times for regular IRC meetings. 

Other than this administrivia, it is of course up to the community to
decide in which direction the SIG should go, but my suggestions are: 

  - Continue to collect use cases.  It makes sense to have a very
    lightweight process for this (at least, initially), so Eric has
    created a Google Doc and populated it with a suggested template and
    a first example:

      https://docs.google.com/document/d/13N36g2RlUYs8mw7hbfRXw6y2Jc-V2XGrXgfPXPpUvuU/edit?usp=sharing

    Feel free to add your own based on this template.

  - Collect links to any existing documentation or other resources which
    describe how existing services can be combined.  This awesome talk
    on Advanced Fault Management with Vitrage and Mistral is a perfect
    example:

        https://www.openstack.org/videos/sydney-2017/advanced-fault-management-with-vitrage-and-mistral

    and here is another:

        https://www.openstack.org/videos/barcelona-2016/building-self-healing-applications-with-aodh-zaqar-and-mistral

    but we need to make it easier for operators to understand which
    combinations like this are possible, and easier for them to be set
    up.

  - Finish the architecture diagram drafted in Denver:

      https://docs.google.com/drawings/d/1kEFtVpQ4c8HipSp34EVAkcSGmwyg1MzWf_H5oGTtl-Y/edit?usp=sharing

  - At a higher level, we could document reference stacks which address
    multiple self-healing cases.

  - Talk more with the OPNFV community to find out what capabilities
    they have which could be reused within non-NFV OpenStack clouds.

  - Perform gap analysis on the use cases, and liaise with specific
    projects to drive development in directions which can address those
    gaps.


