<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix"><br>
      Coarse-grained health checks and recovery at the host or instance
      level might be good enough in some cases, but I can imagine others
      where the host or instance is up but some aspect of the service
      has crashed or degraded. What's hard about dealing with the
      latter (i.e. service-specific monitoring) is that the
      implementation, and perhaps even the recovery, needs to be handled
      in a way that may be service specific as well. For those cases,
      I'd think it's likely that some HA solution (outside of OpenStack)
      is already being leveraged. Perhaps the question then becomes
      more about ensuring, in the context of OpenStack, that the right
      automation and orchestration exist to properly deploy and
      host that workload with its HA solution, and that the underlying
      infrastructure provisioned can meet requirements around isolation,
      performance, etc.<br>
      <br>
      -Eric<br>
      <br>
      On 12/3/2014 11:04 AM, Maish Saidel-Keesing wrote:<br>
    </div>
    <blockquote cite="mid:547F5ECB.8000604@maishsk.com" type="cite">
      <meta content="text/html; charset=ISO-8859-1"
        http-equiv="Content-Type">
      I agree with you wholeheartedly, Tyler.<br>
      <br>
      It is perhaps blasphemy, but enterprises want the same thing they can
      get today with VMware, and that is standard HA.<br>
      <br>
      Maish<br>
      <div class="moz-cite-prefix">On 03/12/2014 20:37, Britten, Tyler
        wrote:<br>
      </div>
      <blockquote
cite="mid:8E5538C32EC50A429F8963329BFD60593BE6172D@MX104CL01.corp.emc.com"
        type="cite">
        <meta http-equiv="Content-Type" content="text/html;
          charset=ISO-8859-1">
        <meta name="Generator" content="Microsoft Word 14 (filtered
          medium)">
        <style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.hoenzb
        {mso-style-name:hoenzb;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
        <div class="WordSection1">
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">It
              seems like the main ask from the ‘pets’ side of the
              enterprise is not instance monitoring/recovery but
              hypervisor monitoring for instance recovery: a KVM host
              fails, something checking for a heartbeat marks that
              host as offline, and it then checks the db for
              the instances running on that host and schedules them to
              start on the remaining hosts. Obviously this would
              require shared ephemeral storage (NFS) or limit recovery
              to boot-from-volume instances.<o:p></o:p></span></p>
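          <p class="MsoNormal">The recovery flow described above can be
            sketched roughly as follows. This is a minimal illustration,
            not Nova's implementation: the timeout value, the
            <code>Instance</code> record, and the function names are all
            hypothetical, and the round-robin placement simply stands in
            for a real scheduler.</p>
<pre>
```python
from dataclasses import dataclass

# All names here are hypothetical; this is not Nova's API or schema.
HEARTBEAT_TIMEOUT_S = 60.0  # assumed threshold before a host is marked offline


@dataclass
class Instance:
    name: str
    host: str
    boot_from_volume: bool  # True if the root disk lives on a volume


def offline_hosts(last_heartbeat, now):
    """Return hosts whose last heartbeat is older than the timeout."""
    return sorted(
        host for host, seen in last_heartbeat.items()
        if now - seen > HEARTBEAT_TIMEOUT_S
    )


def plan_recovery(instances, last_heartbeat, now, shared_storage=False):
    """Map each recoverable instance on a failed host to a surviving host.

    An instance is treated as recoverable only if its disk survives the
    host failure: either ephemeral storage is shared (e.g. NFS) or the
    instance boots from a volume.
    """
    dead = set(offline_hosts(last_heartbeat, now))
    alive = sorted(host for host in last_heartbeat if host not in dead)
    plan = {}
    if not alive:
        return plan  # nowhere to restart anything
    placed = 0
    for inst in instances:
        if inst.host not in dead:
            continue
        if not (shared_storage or inst.boot_from_volume):
            continue  # local ephemeral disk was lost with the host
        # Round-robin placement stands in for a real scheduler.
        plan[inst.name] = alive[placed % len(alive)]
        placed += 1
    return plan
```
</pre>
          <p class="MsoNormal">With shared storage disabled, only
            boot-from-volume instances on the failed host appear in the
            plan; passing <code>shared_storage=True</code> makes every
            instance on that host a candidate.</p>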
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Am
              I off base?<o:p></o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
          <p class="MsoNormal"><b><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray"
                lang="FR">Tyler Britten<o:p></o:p></span></b></p>
          <p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray"
              lang="FR">Global Cloud Solutions | </span><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray">EMC</span><sup><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray">2</span></sup><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray"
              lang="FR"><o:p></o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray">717.448.4057</span><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:#1F497D">
            </span><span
style="font-size:9.0pt;font-family:'Verdana','sans-serif';color:gray">|
              <a moz-do-not-send="true"
                href="mailto:tyler.britten@emc.com">tyler.britten@emc.com</a>
              | <a moz-do-not-send="true"
                href="https://twitter.com/vmtyler">@VMTyler</a><o:p></o:p></span></p>
          <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
          <p class="MsoNormal"><b><span
style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">From:</span></b><span
style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">
              Jason Venner [<a moz-do-not-send="true"
                class="moz-txt-link-freetext"
                href="mailto:jvenner@mirantis.com">mailto:jvenner@mirantis.com</a>]
              <br>
              <b>Sent:</b> Wednesday, December 03, 2014 13:31<br>
              <b>To:</b> Daniel P. Berrange<br>
              <b>Cc:</b> <a moz-do-not-send="true"
                class="moz-txt-link-abbreviated"
                href="mailto:Enterprise-wg@lists.openstack.org">Enterprise-wg@lists.openstack.org</a>;
              Stefano Maffulli<br>
              <b>Subject:</b> Re: [Win The Enterprise-wg]
              libvirtWatchdog status<o:p></o:p></span></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal">Would an event that we push onto the
              bus be sufficient?<o:p></o:p></p>
            <div>
              <p class="MsoNormal">I can see about having this added to
                our Nova contributors' work queues.<o:p></o:p></p>
            </div>
          </div>
          <div>
            <p class="MsoNormal"><o:p> </o:p></p>
            <div>
              <p class="MsoNormal">On Wed, Dec 3, 2014 at 2:19 AM,
                Daniel P. Berrange &lt;<a moz-do-not-send="true"
                  href="mailto:berrange@redhat.com" target="_blank">berrange@redhat.com</a>&gt;
                wrote:<o:p></o:p></p>
              <div>
                <div>
                  <p class="MsoNormal" style="margin-bottom:12.0pt">On
                    Tue, Dec 02, 2014 at 08:48:52PM -0500, Steve Gordon
                    wrote:<br>
                    > ----- Original Message -----<br>
                    > > From: "Stefano Maffulli" &lt;<a
                      moz-do-not-send="true"
                      href="mailto:stefano@openstack.org">stefano@openstack.org</a>&gt;<br>
                    > > To: "Daniel P. Berrange" &lt;<a
                      moz-do-not-send="true"
                      href="mailto:berrange@redhat.com">berrange@redhat.com</a>&gt;,
                    <a moz-do-not-send="true"
                      href="mailto:Enterprise-wg@lists.openstack.org">Enterprise-wg@lists.openstack.org</a><br>
                    > ><br>
                    > > hi Daniel,<br>
                    > ><br>
                    > > during today's meeting for the Win The
                    Enterprise working group we<br>
                    > > noticed libvirtWatchdog. The wiki page<br>
                    > > <a moz-do-not-send="true"
                      href="https://wiki.openstack.org/wiki/LibvirtWatchdog"
                      target="_blank">https://wiki.openstack.org/wiki/LibvirtWatchdog</a>
                    is authored by you<br>
                    > > originally so I'm reaching out to learn
                    more about the status of this<br>
                    > > feature.<br>
                    > ><br>
                    > > In the WTE team, one of the priorities is
                    to understand the status of<br>
                    > > features that allow non-ephemeral
                    (persistent) workloads on OpenStack<br>
                    > > (aka the "pet" use case). libvirtWatchdog
                    was mentioned during a session<br>
                    > > in Paris, saying that it currently
                    supports KVM and Linux guests only.<br>
                    > ><br>
                    > > What are the plans for its future
                    (can/should it be extended to other<br>
                    > > guests/hypervisors)? Who's maintaining it
                    at the moment? Is there any<br>
                    > > other documentation besides the wiki page?<br>
                    ><br>
                    > I'll take a crack at it and then Dan can tell
                    me how wrong I am, since it's probably my fault it
                    was in the etherpad ;). The watchdog feature in
                    OpenStack exposes capabilities in the underlying
                    Libvirt [1] and Qemu [2][3] layers which allow you
                    to attach an i6300esb watchdog device to the guest
                    and assign a lifecycle action to take if it is
                    triggered. Fundamentally there's nothing preventing
                    other hypervisor projects from implementing this,
                    but I'm not sure which ones, if any, actually have
                    (and when I cover the second part of your question
                    below it might become clear why).<br>
                    ><br>
                    > As to why it only works with Linux guests (or,
                    more accurately, why it doesn't work for Windows; I
                    wouldn't be surprised if the BSD family or other
                    OSes do support it to some degree, but I've never
                    checked), I believe it was originally intended to, but
                    there were a few issues uncovered during the chase,
                    in particular:<br>
                    ><br>
                    > 1) The default Windows driver for the device
                    only displays the PCI information for it (it doesn't
                    actually do anything with the device).<br>
                    ><br>
                    > 2) The Intel driver for this device on Windows
                    only ever worked with 32-bit editions of Windows.<br>
                    ><br>
                    > 3) The Intel driver for this device on Windows
                    always assumes it's in a specific PCI slot.<br>
                    ><br>
                    > 4) There's no framework within Windows for
                    triggering a watchdog device and we weren't able to
                    determine if there were any Windows applications
                    capable of  triggering one either.<br>
                    ><br>
                    > Basically, while you can attach the device to a
                    Windows guest, for it to actually be used someone
                    would need to write a proper driver for the
                    device that works on Windows, and there would need to
                    be applications that know how to actually make use
                    of it. In the Linux case I believe there is wider
                    support for it, and it can be triggered by common
                    panics and lockups (Rich's blog [3] gives some more
                    examples).<br>
                    ><br>
                    > For the gorier details see: <a
                      moz-do-not-send="true"
                      href="https://bugzilla.redhat.com/show_bug.cgi?id=610063"
                      target="_blank">
                      https://bugzilla.redhat.com/show_bug.cgi?id=610063</a>.<o:p></o:p></p>
                </div>
              </div>
              <p class="MsoNormal">Yep, that's pretty much it.<br>
                <br>
                Also note there's a missing feature in Nova in that we
                have no mechanism<br>
                to notify the end user when a watchdog fires on their
                VMs. Libvirt has<br>
                this notification ability but we've nowhere to send this
                info in OpenStack.<br>
                We need some kind of formal alerting system to get a
                message back to the<br>
                end user (or to an orchestration tool like Heat),  so
                they can take action<br>
                when it fires.<br>
                <br>
                Regards,<br>
                Daniel<br>
                <span class="hoenzb"><span style="color:#888888">--</span></span><span
                  style="color:#888888"><br>
                  <span class="hoenzb">|: <a moz-do-not-send="true"
                      href="http://berrange.com" target="_blank">http://berrange.com</a> 
                        -o-    <a moz-do-not-send="true"
                      href="http://www.flickr.com/photos/dberrange/"
                      target="_blank">http://www.flickr.com/photos/dberrange/</a>
                    :|</span><br>
                  <span class="hoenzb">|: <a moz-do-not-send="true"
                      href="http://libvirt.org" target="_blank">http://libvirt.org</a> 
                                -o-             <a
                      moz-do-not-send="true"
                      href="http://virt-manager.org" target="_blank">http://virt-manager.org</a>
                    :|</span><br>
                  <span class="hoenzb">|: <a moz-do-not-send="true"
                      href="http://autobuild.org" target="_blank">http://autobuild.org</a> 
                         -o-         <a moz-do-not-send="true"
                      href="http://search.cpan.org/%7Edanberr/"
                      target="_blank">http://search.cpan.org/~danberr/</a>
                    :|</span><br>
                  <span class="hoenzb">|: <a moz-do-not-send="true"
                      href="http://entangle-photo.org" target="_blank">http://entangle-photo.org</a> 
                         -o-       <a moz-do-not-send="true"
                      href="http://live.gnome.org/gtk-vnc"
                      target="_blank">http://live.gnome.org/gtk-vnc</a>
                    :|</span></span><o:p></o:p></p>
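              <p class="MsoNormal">For reference, the i6300esb watchdog
                device and lifecycle action discussed in this thread are
                expressed in libvirt domain XML as a watchdog element
                under the devices element, carrying model and action
                attributes. The sketch below, using only the Python
                standard library, adds such an element to a domain
                definition. The helper name is hypothetical, the list of
                actions is an assumption based on the libvirt domain XML
                documentation and should be checked against your libvirt
                version, and this is not how Nova itself generates its
                XML.</p>
<pre>
```python
import xml.etree.ElementTree as ET

# Lifecycle actions assumed to be accepted by libvirt's watchdog element;
# verify against the documentation for the libvirt version in use.
WATCHDOG_ACTIONS = frozenset(
    ["reset", "shutdown", "poweroff", "pause", "none", "dump", "inject-nmi"]
)


def add_watchdog(domain, action="reset", model="i6300esb"):
    """Attach a watchdog device to a libvirt domain element (hypothetical helper)."""
    if action not in WATCHDOG_ACTIONS:
        raise ValueError("unsupported watchdog action: " + action)
    devices = domain.find("devices")
    if devices is None:
        devices = ET.SubElement(domain, "devices")
    # This sketch keeps a single watchdog, replacing any existing one.
    for old in devices.findall("watchdog"):
        devices.remove(old)
    return ET.SubElement(devices, "watchdog", {"model": model, "action": action})


if __name__ == "__main__":
    # Build a bare domain element and print it with the watchdog attached.
    domain = ET.Element("domain", {"type": "kvm"})
    add_watchdog(domain, action="poweroff")
    print(ET.tostring(domain, encoding="unicode"))
```
</pre>
              <p class="MsoNormal">The action only describes what the
                hypervisor does to the guest when the watchdog fires;
                getting word of that back to the end user is the separate
                gap noted above.</p>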
              <div>
                <div>
                  <p class="MsoNormal"><br>
                    _______________________________________________<br>
                    Enterprise-wg mailing list<br>
                    <a moz-do-not-send="true"
                      href="mailto:Enterprise-wg@lists.openstack.org">Enterprise-wg@lists.openstack.org</a><br>
                    <a moz-do-not-send="true"
                      href="http://lists.openstack.org/cgi-bin/mailman/listinfo/enterprise-wg"
                      target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/enterprise-wg</a><o:p></o:p></p>
                </div>
              </div>
            </div>
            <p class="MsoNormal"><br>
              <br clear="all">
              <o:p></o:p></p>
            <div>
              <p class="MsoNormal"><o:p> </o:p></p>
            </div>
            <p class="MsoNormal">-- <o:p></o:p></p>
            <div>
              <div>
                <div>
                  <p class="MsoNormal">Jason Venner<o:p></o:p></p>
                </div>
                <p class="MsoNormal">Vice President and Chief Architect<br>
                  Mirantis Inc<o:p></o:p></p>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
              </div>
            </div>
          </div>
        </div>
      </blockquote>
      <br>
      <pre class="moz-signature" cols="72">-- 
Maish Saidel-Keesing
</pre>
    </blockquote>
    <br>
  </body>
</html>