[openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat / Healthcheck Monitoring

Vikash Kumar vikash.kumar at oneconvergence.com
Fri May 19 04:06:41 UTC 2017


Hi Greg,

    Please include me on this spec as well. We are also working on HA
for virtual instances (especially for vendors) and would like to participate.

On Thu, May 18, 2017 at 11:33 PM, Waines, Greg <Greg.Waines at windriver.com>
wrote:

> Yes, I am fine with writing a spec for this in masakari-specs.
>
> Do you use Gerrit for this git repository?
> Do you have a template for your specs?
>
>
>
> Greg.
>
>
>
>
>
>
>
> From: Sam P <sam47priya at gmail.com>
> Reply-To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Date: Thursday, May 18, 2017 at 1:51 PM
> To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] [masakari] VM
> Heartbeat / Healthcheck Monitoring
>
>
>
> Hi Greg,
>
> Thank you, Adam, for the follow-up.
>
> This is a new feature for masakari-monitors, and I think Masakari can
> accommodate it there.
> From the implementation perspective, it is not that hard to do.
>
> However, as you can see in our Boston presentation, Masakari will
> replace its monitoring parts (that is, masakari-monitors) with
> nova-host-alerter, **-process-alerter, and **-instance-alerter. (The **
> part is not defined yet. :p)
>
> Therefore, I would like to record this specification and make sure we
> do not miss anything in the transformation.
>
> Does it make sense to write a simple spec for this in masakari-specs [1],
> so we can discuss the requirements and how to implement them?
>
>
>
> [1] https://github.com/openstack/masakari-specs
>
>
>
> --- Regards,
>
> Sampath
>
>
>
>
>
>
>
> On Thu, May 18, 2017 at 2:29 AM, Adam Spiers <aspiers at suse.com> wrote:
>
> I don't see any reason why masakari couldn't handle that, but you'd
>
> have to ask Sampath and the masakari team whether they would consider
>
> that in scope for their roadmap.
>
>
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
>
>
> Sure.  I can propose a new user story.
>
>
>
> And then are you thinking of including this user story in the scope of
>
> what masakari would be looking at ?
>
>
>
> Greg.
>
>
>
>
>
> From: Adam Spiers <aspiers at suse.com>
>
> Reply-To: "openstack-dev at lists.openstack.org"
>
> <openstack-dev at lists.openstack.org>
>
> Date: Wednesday, May 17, 2017 at 10:08 AM
>
> To: "openstack-dev at lists.openstack.org"
>
> <openstack-dev at lists.openstack.org>
>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>
> Healthcheck Monitoring
>
>
>
> Thanks for the clarification Greg.  This sounds like it has the
>
> potential to be a very useful capability.  May I suggest that you
>
> propose a new user story for it, along similar lines to this existing
>
> one?
>
>
>
>
>
> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html
>
>
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
> Yes that’s correct.
>
> VM Heartbeating / Health-check Monitoring would introduce intrusive /
>
> white-box type monitoring of VMs / Instances.
>
>
>
> I realize this is somewhat in the gray zone of what a cloud should or
> should not be monitoring, but I believe it provides an alternative for
> applications deployed in VMs that do not have an external
> monitoring/management entity such as a VNF Manager in the MANO
> architecture.
>
> And even for VMs with VNF Managers, it provides a highly reliable
> alternate monitoring path that does not rely on tenant networking.
>
>
>
> You’re correct that VM HB/HC Monitoring would leverage
> https://wiki.libvirt.org/page/Qemu_guest_agent
> which would require the agent to be installed in the images in order to
> talk back to the compute host.
> (There are other examples of similar approaches in OpenStack, e.g. the
> murano-agent for installation and the swift-agent for object store
> management.)
>
> Here, however, in the case of VM HB/HC Monitoring via the QEMU Guest
> Agent, the messaging path is internal, through a QEMU virtual serial
> device, i.e. a very simple interface with very few dependencies. It is up
> and available very early in the VM lifecycle and is virtually always up.
>
>
>
> With regard to failure modes / use cases:
>
>   * A VM’s response to a Heartbeat Challenge Request can be as simple as
>     just ACK-ing. This alone allows for detection of:
>       - a failed or hung QEMU/KVM instance, or
>       - a failed or hung VM’s OS, or
>       - a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
>       - a failure of the VM to route basic IO via Linux sockets.
>
>   * I have had feedback that this is similar to the virtual hardware
>     watchdog of QEMU/KVM
>     ( https://libvirt.org/formatdomain.html#elementsWatchdog ).
>
>   * However, the VM Heartbeat / Health-check Monitoring
>       - provides a higher-level (i.e. application-level) heartbeating,
>           + i.e. if the Heartbeat requests are being answered by the
>             Application running within the VM
>       - provides more than just heartbeating, as the Application can use
>         it to trigger a variety of audits,
>       - provides a mechanism for the Application within the VM to report a
>         Health Status / Info back to the Host / Cloud,
>       - provides notification of the Heartbeat / Health-check status to
>         higher-level cloud entities through Vitrage
>           + e.g. VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager
>                                                   - (StateChange) - Nova - ... - VNF Manager
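
To make the challenge/ACK idea above a bit more concrete, here is a rough
sketch of how a host-side monitor might count missed heartbeats. It is only
an illustration under assumptions: the HeartbeatMonitor class, the
max_missed threshold and the two fake transports are hypothetical names
made up for this example, not anything from masakari-monitors. The only
real detail is the QEMU guest agent "guest-ping" command, to which a
healthy agent replies with {"return": {}}.

```python
import json
from typing import Callable

# QEMU guest agent "guest-ping" request; a healthy agent replies {"return": {}}.
GUEST_PING = json.dumps({"execute": "guest-ping"})


class HeartbeatMonitor:
    """Hypothetical monitor: flags a VM after N consecutive missed ACKs."""

    def __init__(self, send: Callable[[str], str], max_missed: int = 3):
        self.send = send              # transport to the guest agent (assumed)
        self.max_missed = max_missed  # illustrative threshold, not from the thread
        self.missed = 0

    def challenge(self) -> bool:
        """Send one heartbeat challenge; return True if the guest ACKed."""
        try:
            reply = json.loads(self.send(GUEST_PING))
            acked = isinstance(reply, dict) and "return" in reply
        except (TimeoutError, ValueError):
            # No reply in time, or a reply that is not valid JSON.
            acked = False
        # Any ACK resets the counter; a miss increments it.
        self.missed = 0 if acked else self.missed + 1
        return acked

    @property
    def failed(self) -> bool:
        return self.missed >= self.max_missed


# Fake transports standing in for the virtual serial channel.
def healthy_guest(request: str) -> str:
    return '{"return": {}}'


def hung_guest(request: str) -> str:
    raise TimeoutError("no reply from guest agent")


monitor = HeartbeatMonitor(hung_guest, max_missed=3)
for _ in range(3):
    monitor.challenge()
print(monitor.failed)
```

In a real deployment the send callable would have to go through the guest
agent channel on the compute host (for manual testing, something like
virsh qemu-agent-command <domain> '{"execute":"guest-ping"}' exercises the
same request). Requiring several consecutive misses before declaring a
failure is just one possible policy to avoid reacting to a single slow
reply.
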
>
>
>
>
>
> Greg.
>
>
>
>
>
> From: Adam Spiers <aspiers at suse.com>
> Reply-To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Date: Tuesday, May 16, 2017 at 7:29 PM
> To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
> Healthcheck Monitoring
>
>
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
> thanks for the pointers Sam.
>
>
>
> I took a quick look.
>
> I agree that the VM Heartbeat / Health-check looks like a good fit into
>
> Masakari.
>
>
>
> Currently your instance monitoring looks like it is strictly black-box
> monitoring through libvirt events.
> Is that correct?
> I.e., you do not do any intrusive monitoring of the instance through the
> QEMU Guest Agent facility, correct?
>
>
>
> That is correct:
>
>
>
>
>
> https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/instancemonitor/instance.py
>
>
>
> I think this is what VM Heartbeat / Health-check would add to Masakari.
>
> Let me know if you agree.
>
>
>
> OK, so you are looking for something slightly different I guess, based
>
> on this QEMU guest agent?
>
>
>
>     https://wiki.libvirt.org/page/Qemu_guest_agent
>
>
>
> That would require the agent to be installed in the images, which is
>
> extra work but I imagine quite easily justifiable in some scenarios.
>
> What failure modes do you have in mind for covering with this
>
> approach - things like the guest kernel freezing, for instance?
>
>
>
>
>
> __________________________________________________________________________
>
> OpenStack Development Mailing List (not for usage questions)
>
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Regards,
Vikash