[openstack-dev] [masakari] Intrusive Instance Monitoring

Sam P sam47priya at gmail.com
Thu May 18 18:06:30 UTC 2017


Hi Greg,

 Thank you for the proposal.
 #BTW, I replied to our discussion in [1].

 Masakari mainly focuses on black-box monitoring of VMs.
 But that does not mean Masakari cannot do white-box monitoring.
 There will be configuration options that let operators choose whether to
use it and how to configure it.
 For Masakari, this is one of the ways to extend its instance
monitoring capabilities.

 I would really appreciate it if you could write a spec for this in [2].
It will help the Masakari community and the openstack-ha community
understand the requirements and support them in future developments.

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-May/117003.html
[2] https://github.com/openstack/masakari-specs
--- Regards,
Sampath



On Thu, May 18, 2017 at 6:15 AM, Waines, Greg <Greg.Waines at windriver.com> wrote:
> ( I have been having a discussion with Adam Spiers on
> [openstack-dev][vitrage][nova] on this topic ... thought I would switch over
> to [masakari] )
>
>
>
> I am interested in contributing an implementation of Intrusive Instance
> Monitoring,
>
> initially specifically VM Heartbeat / Health-check Monitoring thru the QEMU
> Guest Agent (https://wiki.libvirt.org/page/Qemu_guest_agent).
>
>
>
> I’d like to know whether Masakari project leaders would consider a blueprint
> on “VM Heartbeat / Health-check Monitoring”.
>
> See below for some more details,
>
> Greg.
>
>
>
> -------------------------------------
>
>
>
>
>
> VM Heartbeating / Health-check Monitoring would introduce intrusive /
> white-box type monitoring of VMs / Instances to Masakari.
>
>
>
> Briefly, “VM Heartbeat / Health-check Monitoring”
>
> · is optionally enabled thru a Nova flavor extra-spec,
>
> · is a service that runs on an OpenStack Compute Node,
>
> · sends periodic Heartbeat / Health-check Challenge Requests to a VM
>   over a virtio-serial device set up between the Compute Node and the VM
>   thru QEMU ( https://wiki.libvirt.org/page/Qemu_guest_agent ),
>
> · on loss of heartbeat or a failed health-check status, reports a fault
>   event against the VM to Masakari and any other registered reporting
>   backends such as Mistral or Vitrage.
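
The host-side behaviour described in the list above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the proposed implementation: the extra-spec key `hw:guest_heartbeat`, the `HeartbeatMonitor` class, the `max_missed` threshold and the event fields are all hypothetical names chosen for illustration. `guest-ping` is an existing QEMU Guest Agent command, but the proposal does not say which command would carry the challenge. The transport is passed in as a plain callable so the sketch runs without libvirt.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical extra-spec key; the proposal does not name one.
HEARTBEAT_EXTRA_SPEC = "hw:guest_heartbeat"


def monitoring_enabled(extra_specs: Dict[str, str]) -> bool:
    """Return True if the flavor opts the instance into heartbeating."""
    return extra_specs.get(HEARTBEAT_EXTRA_SPEC, "false").lower() == "true"


def build_challenge() -> str:
    """Serialize a challenge; guest-ping is the guest agent's no-op command."""
    return json.dumps({"execute": "guest-ping"})


@dataclass
class HeartbeatMonitor:
    instance_id: str
    # Sends one challenge to the VM; returns True if it was ACKed in time.
    send: Callable[[str], bool]
    # Reporting backends (e.g. wrappers for Masakari, Mistral, Vitrage).
    reporters: List[Callable[[dict], None]] = field(default_factory=list)
    max_missed: int = 3  # assumed threshold, not from the proposal
    missed: int = 0
    faulted: bool = False

    def tick(self) -> None:
        """Run one heartbeat cycle; call this periodically."""
        try:
            acked = self.send(build_challenge())
        except Exception:
            # A transport error counts as a missed heartbeat.
            acked = False
        if acked:
            self.missed = 0
            self.faulted = False
            return
        self.missed += 1
        # Report once when the threshold is reached, not on every miss.
        if self.missed >= self.max_missed and not self.faulted:
            self.faulted = True
            event = {
                "instance_id": self.instance_id,
                "type": "heartbeat_lost",
                "missed": self.missed,
            }
            for report in self.reporters:
                report(event)
```

Each registered backend is just a callable that receives the event dict, so adding another backend would not touch the monitoring loop itself.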
>
>
>
> I realize this is somewhat in the gray-zone of what a cloud should be
> monitoring or not,
>
> but I believe it provides an alternative for Applications deployed in VMs
> that do not have an external monitoring/management entity like a VNF Manager
> in the MANO architecture.
>
> And even for VMs with VNF Managers, it provides a highly reliable alternate
> monitoring path that does not rely on Tenant Networking.
>
>
>
> VM HB/HC Monitoring would leverage the QEMU Guest Agent
> ( https://wiki.libvirt.org/page/Qemu_guest_agent ),
>
> which would require the agent to be installed in the images for talking back
> to the compute host.
>
> ( there are other examples of similar approaches in openstack ... the
> murano-agent for installation, the swift-agent for object store management )
>
> Although here, in the case of VM HB/HC Monitoring via the QEMU Guest Agent,
> the messaging path is internal, thru a QEMU virtual serial device, i.e. a
> very simple interface with very few dependencies ... it’s up and available
> very early in the VM lifecycle and virtually always up.
>
>
>
> Wrt failure modes / use-cases
>
> · a VM’s response to a Heartbeat Challenge Request can be as simple
>   as just ACK-ing; this alone allows for detection of:
>
>   o a failed or hung QEMU/KVM instance, or
>
>   o a failed or hung VM’s OS, or
>
>   o a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
>
>   o a failure of the VM to route basic IO via linux sockets.
>
> · I have had feedback that this is similar to the virtual hardware
>   watchdog of QEMU/KVM
>   ( https://libvirt.org/formatdomain.html#elementsWatchdog )
>
> · However, the VM Heartbeat / Health-check Monitoring
>
>   o provides a higher-level (i.e. application-level) heartbeating
>
>     § i.e. if the Heartbeat requests are being answered by the Application
>       running within the VM
>
>   o provides more than just heartbeating, as the Application can use it to
>     trigger a variety of audits,
>
>   o provides a mechanism for the Application within the VM to report a
>     Health Status / Info back to the Host / Cloud,
>
>   o provides notification of the Heartbeat / Health-check status to
>     higher-level cloud entities thru Masakari, Mistral and/or Vitrage
>
>     § e.g. VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager
>                                                - (StateChange) - Nova - ... - VNF-Manager
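
The guest-side, application-level part of the points above might be sketched as follows. This is an illustration under assumptions rather than a defined interface: `HealthCheckResponder`, the `seq` / `status` / `info` message fields and the audit callback signature are hypothetical, and the proposal does not specify a message format. The idea shown is that an application registers audits, each challenge triggers them, and the aggregated health status is returned to the host.

```python
from typing import Callable, Dict, Tuple

# An audit returns (healthy, detail). Hypothetical signature for illustration.
Audit = Callable[[], Tuple[bool, str]]


class HealthCheckResponder:
    """Answers challenges inside the VM on behalf of the application."""

    def __init__(self) -> None:
        self._audits: Dict[str, Audit] = {}

    def register(self, name: str, audit: Audit) -> None:
        """Register an application audit to run on every challenge."""
        self._audits[name] = audit

    def handle_challenge(self, challenge: dict) -> dict:
        """Run all audits and build the response sent back to the host."""
        failures: Dict[str, str] = {}
        for name, audit in self._audits.items():
            try:
                healthy, detail = audit()
            except Exception as exc:
                # A crashing audit is reported as a failed health check.
                healthy, detail = False, f"audit raised {exc!r}"
            if not healthy:
                failures[name] = detail
        return {
            "seq": challenge.get("seq"),  # echo so the host can match replies
            "status": "unhealthy" if failures else "healthy",
            "info": failures,
        }
```

With no audits registered, the responder degenerates to a plain ACK, which corresponds to the simplest heartbeat case described earlier.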
>
>
>
> NOTE: perhaps the reporting to Vitrage would be a separate blueprint within
> Masakari.
>
>
>
>
>
>