[openstack-dev] [vitrage] [nova] VM Heartbeat / Healthcheck Monitoring

Waines, Greg Greg.Waines at windriver.com
Wed May 10 17:24:21 UTC 2017

Some other UPDATES on this proposal (from outside the mailing list):

·         this should probably be based on an ‘image property’ rather than a ‘flavor extraspec’,
since it requires code to be included in the guest/VM image,

·         rather than using a dedicated virtio-serial link for the Heartbeat/Health-check Monitoring messaging,
I propose that we leverage the existing QEMU Guest Agent ( http://wiki.qemu.org/Features/GuestAgent )

o    NOVA already supports a ‘hw_qemu_guest_agent=True’ image property
which results in NOVA setting up a virtio-serial connection to a QEMU Guest Agent
within the Guest/VM,

o    use this for the transport messaging layer for VM Heartbeating/Health-checking
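
To illustrate the transport idea above, here is a minimal sketch of how a heartbeat could map onto the guest agent's existing 'guest-ping' command, which replies with an empty "return" object on success. The helper names (build_guest_ping, is_ping_ack) are hypothetical, and how the bytes actually reach the agent (e.g. via libvirt or the agent's socket) is deliberately left out; this is not an agreed design.

```python
import json


def build_guest_ping() -> bytes:
    """Encode a QEMU Guest Agent 'guest-ping' request as one JSON line."""
    return (json.dumps({"execute": "guest-ping"}) + "\n").encode("utf-8")


def is_ping_ack(raw: bytes) -> bool:
    """Return True if `raw` is a successful agent reply, i.e. {"return": {}}."""
    try:
        reply = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Garbage on the channel counts as "no ack".
        return False
    return isinstance(reply, dict) and "return" in reply and "error" not in reply
```

A monitor would treat a missing or failed ack within some timeout as a lost heartbeat; the timeout and retry policy are open questions.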

With respect to where to propose / contribute this functionality, given that

·         this may require very little work in NOVA (by using QEMU Guest Agent), and

·         the primary result of VM Heartbeating / Health-checking is to report per-instance HB/HC status to Vitrage,
I am thinking that this would fit better simply in Vitrage,
as an optional functionality enabled thru /etc/vitrage/vitrage.conf .

Comments ?

From: Greg Waines <Greg.Waines at windriver.com>
Reply-To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Date: Tuesday, May 9, 2017 at 1:11 PM
To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [vitrage] [nova] VM Heartbeat / Healthcheck Monitoring

I am looking for guidance on where to propose some “VM Heartbeat / Health-check Monitoring” functionality that I would like to contribute to OpenStack.

Briefly, “VM Heartbeat / Health-check Monitoring”

·         is optionally enabled thru a Nova flavor extra-spec,

·         is a service that runs on an OpenStack Compute Node,

·         sends periodic Heartbeat / Health-check Challenge Requests to a VM
over a virtio-serial device set up between the Compute Node and the VM thru QEMU,

·         on loss of heartbeat or a failed health-check status, reports a fault event against the VM
to Vitrage thru its data-source API.
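
The behaviour described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the max_missed threshold, and the report_fault callback (a stand-in for whatever actually posts the event to Vitrage) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat / health-check results for one VM (hypothetical helper)."""
    instance_id: str
    report_fault: Callable[[str, str], None]  # called as (instance_id, reason)
    max_missed: int = 3    # assumed threshold, not from the proposal
    missed: int = 0
    faulted: bool = False

    def on_result(self, acked: bool, healthy: Optional[bool] = None) -> None:
        """Record the outcome of one challenge request."""
        if not acked:
            self.missed += 1
            # Raise the fault once, when the threshold is first reached.
            if self.missed >= self.max_missed and not self.faulted:
                self.faulted = True
                self.report_fault(self.instance_id, "heartbeat-lost")
            return
        self.missed = 0
        if healthy is False:
            if not self.faulted:
                self.faulted = True
                self.report_fault(self.instance_id, "health-check-failed")
        else:
            # An ack without a failed health status clears the fault state.
            self.faulted = False
```

A compute-node service would hold one such object per monitored instance and call on_result() after each challenge or timeout.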

Where should I contribute this functionality ?

·         put it ALL in Vitrage ... both the monitoring and the data-source reporting ?

·         put the monitoring in Nova, and just the data source reporting in Vitrage ?

·         other ?


p.s. other info ...

Benefits of “VM Heartbeat / Health-check Monitoring”

·         monitors health of OS and Applications INSIDE the VM

o   i.e. even just a simple Ack of the Heartbeat would validate that the OS is running, that IO mechanisms (sockets, etc.)
are working, and that processes are getting scheduled

·         health-check status reporting can trigger and report on either high-level or detailed application-specific audits within the VM,

·         the simple virtio-serial-device interface thru QEMU is UP very early in VM life cycle and is virtually always up

o   i.e. it's available for reporting issues virtually all the time,

o          ... compared to reporting issues over Tenant Network to a remote VNFManager which relies on Ethernet and IP Networking within the VM itself and then any provider network and adjacent routers around the compute nodes ...

·         uses a simple “Line-Delimited JSON” Format over virtio serial device ( http://www.linux-kvm.org/page/Virtio-serial_API )

o   simple to implement protocol inside VM, in pretty much any language

o   ( although I would provide a reference implementation )

·         provides more thorough instance monitoring than libvirt’s emulated hardware watchdog ( https://libvirt.org/formatdomain.html#elementsWatchdog )
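
As a rough idea of how small the guest-side protocol code for the “Line-Delimited JSON” format mentioned above could be, here is a sketch in Python. The message fields ("type", "seq", "health") are invented for illustration and are not a defined wire format; reading from and writing to the actual virtio-serial port is omitted.

```python
import json
from typing import Iterator


def split_lines(buffer: bytearray) -> Iterator[dict]:
    """Yield each complete JSON message in `buffer`, leaving any partial line.

    Consumed bytes are removed from `buffer` in place. A malformed line
    raises json.JSONDecodeError; a real agent would decide how to handle that.
    """
    while True:
        idx = buffer.find(b"\n")
        if idx < 0:
            return  # no complete line yet; wait for more data
        line = bytes(buffer[:idx])
        del buffer[:idx + 1]
        if line.strip():
            yield json.loads(line)


def respond(msg: dict) -> bytes:
    """Build the guest's reply to one challenge (field names are made up)."""
    reply = {"type": "heartbeat-response", "seq": msg.get("seq"), "health": "ok"}
    return (json.dumps(reply) + "\n").encode("utf-8")
```

The guest would append whatever it reads from the serial device to the buffer, call split_lines(), and write respond(msg) back for each challenge.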