[openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat / Healthcheck Monitoring

Sam P sam47priya at gmail.com
Thu May 18 18:14:15 UTC 2017


Hi Greg,

 Thank you.
> Do you use gerrit for this git ?
Yes, we use gerrit, same as other openstack projects.
https://review.openstack.org/#/admin/projects/openstack/masakari-specs
Here is the list of current and past spec work:
https://review.openstack.org/#/q/project:openstack/masakari-specs

> Do you have a template for your specs ?
Yes, please see the template in the pike directory.
https://github.com/openstack/masakari-specs/blob/master/doc/source/specs/pike/template.rst


--- Regards,
Sampath



On Fri, May 19, 2017 at 3:03 AM, Waines, Greg <Greg.Waines at windriver.com> wrote:
> Yes, I am good with writing a spec for this in masakari-specs.
>
>
>
> Do you use gerrit for this git ?
>
> Do you have a template for your specs ?
>
>
>
> Greg.
>
>
>
>
>
>
>
> From: Sam P <sam47priya at gmail.com>
> Reply-To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Date: Thursday, May 18, 2017 at 1:51 PM
> To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat /
> Healthcheck Monitoring
>
>
>
> Hi Greg,
>
> Thank you, Adam, for the follow-up.
>
> This is a new feature for masakari-monitors, and I think Masakari can
> accommodate it there.
> From the implementation perspective, it is not that hard to do.
> However, as you can see in our Boston presentation, Masakari will
> replace its monitoring parts (that is, masakari-monitors) with
> nova-host-alerter, **-process-alerter, and **-instance-alerter (the **
> part is not defined yet :p).
> Therefore, I would like to capture this specification and make sure we
> do not miss anything in the transition.
> Does it make sense to write a simple spec for this in masakari-specs [1]?
> Then we can discuss the requirements and how to implement them.
>
>
>
> [1] https://github.com/openstack/masakari-specs
>
>
>
> --- Regards,
>
> Sampath
>
>
>
>
>
>
>
> On Thu, May 18, 2017 at 2:29 AM, Adam Spiers <aspiers at suse.com> wrote:
>
> I don't see any reason why masakari couldn't handle that, but you'd
>
> have to ask Sampath and the masakari team whether they would consider
>
> that in scope for their roadmap.
>
>
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
>
>
> Sure.  I can propose a new user story.
>
>
>
> And then are you thinking of including this user story in the scope of
>
> what masakari would be looking at ?
>
>
>
> Greg.
>
>
>
>
>
> From: Adam Spiers <aspiers at suse.com>
>
> Reply-To: "openstack-dev at lists.openstack.org"
>
> <openstack-dev at lists.openstack.org>
>
> Date: Wednesday, May 17, 2017 at 10:08 AM
>
> To: "openstack-dev at lists.openstack.org"
>
> <openstack-dev at lists.openstack.org>
>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>
> Healthcheck Monitoring
>
>
>
> Thanks for the clarification Greg.  This sounds like it has the
>
> potential to be a very useful capability.  May I suggest that you
>
> propose a new user story for it, along similar lines to this existing
>
> one?
>
>
>
>
>
> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html
>
>
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
> Yes that’s correct.
>
> VM Heartbeating / Health-check Monitoring would introduce intrusive /
>
> white-box type monitoring of VMs / Instances.
>
>
>
> I realize this is somewhat in the gray-zone of what a cloud should be
>
> monitoring or not,
>
> but I believe it provides an alternative for Applications deployed in VMs
>
> that do not have an external monitoring/management entity like a VNF Manager
>
> in the MANO architecture.
>
> And even for VMs with VNF Managers, it provides a highly reliable
>
> alternate monitoring path that does not rely on Tenant Networking.
>
>
>
> You’re correct, that VM HB/HC Monitoring would leverage
>
> https://wiki.libvirt.org/page/Qemu_guest_agent
>
> that would require the agent to be installed in the images for talking
>
> back to the compute host.
>
> ( there are other examples of similar approaches in openstack ... the
>
> murano-agent for installation, the swift-agent for object store management )
>
> Although here, in the case of VM HB/HC Monitoring, via the QEMU Guest
>
> Agent, the messaging path is internal thru a QEMU virtual serial device.
>
> i.e. a very simple interface with very few dependencies ... it’s up and
>
> available very early in VM lifecycle and virtually always up.
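For illustration only, the ACK-only exchange over the guest agent channel might look roughly like the sketch below. It assumes the agent's standard `guest-ping` command stands in for the heartbeat challenge, which is not necessarily what the proposal would use. The helper names `build_ping_request` and `is_ack` are hypothetical, and the transport (the virtual serial device) is left out entirely.

```python
import json


def build_ping_request() -> str:
    """Serialize a guest-ping request as a single JSON line."""
    return json.dumps({"execute": "guest-ping"}) + "\n"


def is_ack(raw_reply: str) -> bool:
    """Return True if the reply parses as a success response from the agent."""
    try:
        reply = json.loads(raw_reply)
    except json.JSONDecodeError:
        # Garbage on the channel counts as a missed heartbeat.
        return False
    # A success reply carries a "return" key; failures carry "error" instead.
    return isinstance(reply, dict) and "return" in reply and "error" not in reply
```

A successful `guest-ping` reply is `{"return": {}}`, so `is_ack('{"return": {}}')` is true, while an error object or unparseable data is treated as no ACK.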
>
>
>
> Wrt failure modes / use-cases
>
>
>
> * A VM’s response to a Heartbeat Challenge Request can be as simple as
>   just ACK-ing. This alone allows for detection of:
>     - a failed or hung QEMU/KVM instance, or
>     - a failed or hung VM’s OS, or
>     - a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
>     - a failure of the VM to route basic IO via linux sockets.
>
> * I have had feedback that this is similar to the virtual hardware
>   watchdog of QEMU/KVM
>   ( https://libvirt.org/formatdomain.html#elementsWatchdog ).
>
> * However, the VM Heartbeat / Health-check Monitoring
>     - provides a higher-level (i.e. application-level) heartbeating,
>       i.e. if the Heartbeat requests are being answered by the
>       Application running within the VM,
>     - provides more than just heartbeating, as the Application can use
>       it to trigger a variety of audits,
>     - provides a mechanism for the Application within the VM to report a
>       Health Status / Info back to the Host / Cloud,
>     - provides notification of the Heartbeat / Health-check status to
>       higher-level cloud entities thru Vitrage, e.g.
>
>       VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager
>                                           - (StateChange) - Nova - ... - VNF Manager
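As a rough sketch of the monitoring side described above, the loop below counts missed challenges and classifies the instance. Everything here is an assumption made for illustration: the `HeartbeatMonitor` class, the `challenge` callable (which would wrap the guest agent channel), the `max_missed` threshold, and the state names are hypothetical and are not part of Masakari or Vitrage.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HeartbeatMonitor:
    # Hypothetical callable: sends one challenge and returns the health
    # string reported by the application, or None if no reply arrived.
    challenge: Callable[[], Optional[str]]
    max_missed: int = 3  # consecutive misses before declaring failure
    missed: int = 0

    def poll(self) -> str:
        """Send one challenge and classify the instance state."""
        health = self.challenge()
        if health is None:
            self.missed += 1
            return "failed" if self.missed >= self.max_missed else "degraded"
        # Any reply proves the VM, its OS, and the agent are alive.
        self.missed = 0
        return "healthy" if health == "ok" else "unhealthy"
```

In this sketch a "failed" or "unhealthy" result is the point at which a notification would be raised toward higher-level entities; how that is delivered is left open.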
>
>
>
>
>
> Greg.
>
>
>
>
>
> From: Adam Spiers <aspiers at suse.com>
> Reply-To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Date: Tuesday, May 16, 2017 at 7:29 PM
> To: "openstack-dev at lists.openstack.org"
> <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
> Healthcheck Monitoring
>
> Waines, Greg <Greg.Waines at windriver.com> wrote:
>
> thanks for the pointers Sam.
>
>
>
> I took a quick look.
>
> I agree that the VM Heartbeat / Health-check looks like a good fit into
>
> Masakari.
>
>
>
> Currently your instance monitoring looks like it is strictly black-box
>
> type monitoring thru libvirt events.
>
> Is that correct ?
>
> i.e. you do not do any intrusive type monitoring of the instance thru the
> QEMU Guest Agent facility, correct ?
>
>
>
> That is correct:
>
>
>
>
>
> https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/instancemonitor/instance.py
>
>
>
> I think this is what VM Heartbeat / Health-check would add to Masakari.
>
> Let me know if you agree.
>
>
>
> OK, so you are looking for something slightly different I guess, based
>
> on this QEMU guest agent?
>
>
>
>     https://wiki.libvirt.org/page/Qemu_guest_agent
>
>
>
> That would require the agent to be installed in the images, which is
>
> extra work but I imagine quite easily justifiable in some scenarios.
>
> What failure modes do you have in mind for covering with this
>
> approach - things like the guest kernel freezing, for instance?
>
>
>
>
>
> __________________________________________________________________________
>
> OpenStack Development Mailing List (not for usage questions)
>
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


