<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">I'm not fully familiar with the
      Heat/Ceilometer parts, but I know there are some attempts to
      combine them in order to get some HA capabilities based on metrics
      provided by Ceilometer.<br>
      <br>
      -Sylvain<br>
      <br>
      <br>
      On 22/04/2013 14:58, Alex Glikson wrote:<br>
    </div>
    <blockquote
cite="mid:OFB08ACA85.9B9D8660-ONC2257B55.0043EE7D-C2257B55.0047436A@il.ibm.com"
      type="cite"><font face="sans-serif" size="2">I think this is a
        good idea.</font>
      <br>
      <font face="sans-serif" size="2">We already have a framework in
        Nova
        to detect and report the failure (service group monitoring APIs,
        with DB
        and ZK backends already implemented), as well as APIs to list
        instances
        on a host and to evacuate individual instances (soon with
        destination selected
        by the scheduler). Indeed, the missing pieces now are the
        end-to-end orchestration
        (which is probably not going to happen within Nova, at least at
        the moment),
        and the mechanism(s) to isolate the failed host (e.g., to
        protect against
        false failure detection events) -- which could potentially
        happen in several
        places, as you mentioned. It might be the case that whatever can
        be done
        within Nova is already there -- the corresponding nova-compute
        will be
        considered down. So, maybe now the question is which additional
        components
        might be used (as you mentioned -- bare-metal, quantum, cinder,
        etc). Once
        the individual measures are clear (and implemented), the
        orchestration
        logic (wherever that would be) can use them.</font>
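      <br>
      <br>
      <font face="sans-serif" size="2">As a rough illustration of how
        such orchestration logic could tie these pieces together, here
        is a minimal Python sketch. Everything in it is an assumption
        made for illustration: HostRecovery and its four hooks
        (is_host_up, fence_host, list_instances, evacuate) are
        hypothetical names, not existing Nova APIs. A real deployment
        would back them with service group monitoring, an isolation
        mechanism, and the list and evacuate calls mentioned above.</font>
      <pre wrap="">
```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class HostRecovery:
    """Hypothetical glue object; none of these hooks are real Nova APIs."""

    is_host_up: Callable[[str], bool]            # failure detection
    fence_host: Callable[[str], bool]            # True once isolation is confirmed
    list_instances: Callable[[str], List[str]]   # instances on a host
    evacuate: Callable[[str], None]              # evacuate a single instance

    def recover(self, host):
        """Return the instances for which evacuation was requested."""
        if self.is_host_up(host):
            return []  # host is healthy, nothing to do
        if not self.fence_host(host):
            # Never evacuate from a host that may still be running instances.
            return []
        evacuated = []
        for instance in self.list_instances(host):
            self.evacuate(instance)
            evacuated.append(instance)
        return evacuated


if __name__ == "__main__":
    requested = []
    recovery = HostRecovery(
        is_host_up=lambda host: False,
        fence_host=lambda host: True,
        list_instances=lambda host: ["vm-1", "vm-2"],
        evacuate=requested.append,
    )
    print(recovery.recover("compute-1"))
```
</pre>
      <font face="sans-serif" size="2">The sketch refuses to evacuate
        unless isolation is confirmed, which is one way to protect
        against false failure detection events.</font>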
      <br>
      <br>
      <font face="sans-serif" size="2">Regards,</font>
      <br>
      <font face="sans-serif" size="2">Alex</font>
      <br>
      <br>
      <br>
      <br>
      <br>
      <font face="sans-serif" size="1" color="#5f5f5f">From:      
         </font><font face="sans-serif" size="1">Leen Besselink
        <a class="moz-txt-link-rfc2396E" href="mailto:ubuntu@consolejunkie.net">&lt;ubuntu@consolejunkie.net&gt;</a></font>
      <br>
      <font face="sans-serif" size="1" color="#5f5f5f">To:      
         </font><font face="sans-serif" size="1">OpenStack Development
        Mailing List <a class="moz-txt-link-rfc2396E" href="mailto:openstack-dev@lists.openstack.org">&lt;openstack-dev@lists.openstack.org&gt;</a></font>
      <br>
      <font face="sans-serif" size="1" color="#5f5f5f">Date:      
         </font><font face="sans-serif" size="1">22/04/2013 03:18 PM</font>
      <br>
      <font face="sans-serif" size="1" color="#5f5f5f">Subject:    
           </font><font face="sans-serif" size="1">[openstack-dev]
        blueprint proposal nova-compute fencing for HA ?</font>
      <br>
      <hr noshade="noshade">
      <br>
      <br>
      <br>
      <tt><font size="2">Hi,<br>
          <br>
          As I was not at the summit and the technical videos of the
          summit are not yet online, I am not aware of what was
          discussed there.<br>
          <br>
          But I would like to submit a blueprint.<br>
          <br>
          My idea is:<br>
          <br>
          It is a step to support VM High availability.<br>
          <br>
          This part is about handling compute node failure.<br>
          <br>
          My proposal would be to create a framework/API/plugin/agent or
          whatever
          is needed for fencing off a nova-compute node.<br>
          <br>
          So when something detects that a nova-compute node isn't
          functional anymore
          it can fence off that nova-compute node.<br>
          <br>
          After that, it can call 'evacuate' to start the instance(s)
          that were previously
          running on the failed compute node on other compute node(s).<br>
          <br>
          The code that handles the fencing could be implemented
          in different ways for different environments:<br>
          <br>
          - The IPMI code that handles bare-metal provisioning could,
          for example, be
          used to power off or reboot the node.<br>
          <br>
          - The Quantum networking code could be used to "disconnect"
          the
          instance(s) of the failed compute node (or the whole compute
          node) from
          their respective networks. If you are using overlays, you
          could configure
          other machines not to accept tunnel traffic from the failed
          compute node
          for the networks of the instance(s).<br>
          <br>
          - You could also have a firewall agent configure the shared
          storage servers
          (or a firewall in between) to not accept traffic from the
          failed compute
          node.<br>
          <br>
          I am sure other people have other ideas.<br>
          <br>
          My request would be to have an API and general framework which
          can call
          the different implementations that are configured for that
          environment.<br>
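          <br>
          A minimal Python sketch of what such a framework could look
          like follows. It is only an illustration under stated
          assumptions: FenceDriver, PowerOffDriver,
          NetworkIsolationDriver, DRIVERS, load_drivers and fence_host
          are hypothetical names, and the two drivers are placeholders
          that only record the request instead of talking to IPMI or
          the network service.<br>
        </font></tt>
      <pre wrap="">
```python
from abc import ABC, abstractmethod


class FenceDriver(ABC):
    """Hypothetical interface; one subclass per fencing mechanism."""

    @abstractmethod
    def fence(self, host):
        """Isolate the host and return True on confirmed success."""


class PowerOffDriver(FenceDriver):
    """Placeholder for an IPMI-style power-off; it only records the request."""

    def __init__(self):
        self.powered_off = []

    def fence(self, host):
        self.powered_off.append(host)
        return True


class NetworkIsolationDriver(FenceDriver):
    """Placeholder for network-level isolation; it only records the request."""

    def __init__(self):
        self.isolated = []

    def fence(self, host):
        self.isolated.append(host)
        return True


# Hypothetical registry mapping configuration names to driver classes.
DRIVERS = {"power_off": PowerOffDriver, "network": NetworkIsolationDriver}


def load_drivers(names):
    """Instantiate the drivers named in an (assumed) deployment configuration."""
    unknown = [name for name in names if name not in DRIVERS]
    if unknown:
        raise ValueError("unknown fencing drivers: %s" % ", ".join(unknown))
    return [DRIVERS[name]() for name in names]


def fence_host(host, drivers):
    """Run every configured driver; fenced only if all of them succeed."""
    results = [driver.fence(host) for driver in drivers]
    return bool(results) and all(results)


if __name__ == "__main__":
    configured = load_drivers(["power_off", "network"])
    print(fence_host("compute-1", configured))
```
</pre>
      <tt><font size="2">The sketch treats a host as fenced only when
          every configured driver reports success and never when no
          driver is configured. That is a deliberately conservative
          choice for the illustration, not a requirement of the
          proposal.<br>
</font></tt><tt><font size="2">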
          <br>
          Does that make any sense?<br>
          <br>
          Or should this perhaps be handled by creating clusters with,
          for example, Pacemaker,
          as I assume oVirt might be doing with their proposals:<br>
          <br>
        </font></tt><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/nova/+spec/rhev-m-ovirt-clusters-as-compute-resources/"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/rhev-m-ovirt-clusters-as-compute-resources/</font></tt></a><tt><font
          size="2"><br>
          <br>
          As I am not yet all that familiar with the structure of
          OpenStack or how
          it is organized, I may be asking in the wrong place to
          discuss this.
          If so, or if this does not fit in architecturally, do let me
          know where I went
          wrong.<br>
          <br>
          I've looked at the list of existing blueprints and I see at
          least these
          evacuate, fault-tolerance/HA and other related blueprints:<br>
          <br>
        </font></tt><a moz-do-not-send="true"
        href="https://blueprints.launchpad.net/nova/+spec/evacuate-host"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/evacuate-host</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/nova/+spec/find-host-and-evacuate-instance"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/find-host-and-evacuate-instance</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/nova/+spec/unify-migrate-and-live-migrate"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/unify-migrate-and-live-migrate</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
        href="https://etherpad.openstack.org/HavanaUnifyMigrateAndLiveMigrate"><tt><font
            size="2">https://etherpad.openstack.org/HavanaUnifyMigrateAndLiveMigrate</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/nova/+spec/live-migration-scheduling"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/live-migration-scheduling</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
href="https://blueprints.launchpad.net/nova/+spec/bare-metal-fault-tolerance"><tt><font
            size="2">https://blueprints.launchpad.net/nova/+spec/bare-metal-fault-tolerance</font></tt></a><tt><font
          size="2"><br>
        </font></tt><a moz-do-not-send="true"
href="http://openstacksummitapril2013.sched.org/event/92e3468e458c13616331e75f15685560#.UXUeVXyuiw4"><tt><font
            size="2">http://openstacksummitapril2013.sched.org/event/92e3468e458c13616331e75f15685560#.UXUeVXyuiw4</font></tt></a><tt><font
          size="2"><br>
          <br>
          I think it would be a good idea to get an overview of all
          the use cases
          and then split them up into tasks.<br>
          <br>
          Hope this is helpful.<br>
          <br>
          Have a nice day,<br>
                         
          Leen.<br>
          <br>
          _______________________________________________<br>
          OpenStack-dev mailing list<br>
          <a class="moz-txt-link-abbreviated" href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
        </font></tt><a moz-do-not-send="true"
        href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"><tt><font
            size="2">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</font></tt></a><tt><font
          size="2"><br>
          <br>
        </font></tt>
      <br>
      <br>
    </blockquote>
    <br>
  </body>
</html>