[openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat / Healthcheck Monitoring

Sam P sam47priya at gmail.com
Sat May 20 01:12:12 UTC 2017


Hi Vikash,
 Great. I will add you as a reviewer on this spec.
 Thank you.
--- Regards,
Sampath



On Fri, May 19, 2017 at 1:06 PM, Vikash Kumar
<vikash.kumar at oneconvergence.com> wrote:
> Hi Greg,
>
>     Please include my email in this spec also. We are also dealing with HA
> of Virtual Instances (especially for Vendors) and will participate.
>
> On Thu, May 18, 2017 at 11:33 PM, Waines, Greg <Greg.Waines at windriver.com>
> wrote:
>>
>> Yes, I am fine with writing a spec for this in masakari-specs.
>>
>>
>>
>> Do you use Gerrit for this git repository?
>>
>> Do you have a template for your specs?
>>
>>
>>
>> Greg.
>>
>>
>>
>>
>>
>>
>>
>> From: Sam P <sam47priya at gmail.com>
>> Reply-To: "openstack-dev at lists.openstack.org"
>> <openstack-dev at lists.openstack.org>
>> Date: Thursday, May 18, 2017 at 1:51 PM
>> To: "openstack-dev at lists.openstack.org"
>> <openstack-dev at lists.openstack.org>
>> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat
>> / Healthcheck Monitoring
>>
>>
>>
>> Hi Greg,
>>
>> Thank you, Adam, for the follow-up.
>>
>> This is a new feature for masakari-monitors, and I think Masakari can
>> accommodate it there.
>> From the implementation perspective, it is not that hard to do.
>>
>> However, as you can see in our Boston presentation, Masakari will
>> replace its monitoring parts (that is, masakari-monitors) with
>> nova-host-alerter, **-process-alerter, and **-instance-alerter (the **
>> part is not defined yet :p).
>> Therefore, I would like to record these specifications and make sure we
>> do not miss anything in the transformation.
>>
>> Does it make sense to write a simple spec for this in masakari-specs [1]?
>> Then we can discuss the requirements and how to implement them.
>>
>>
>>
>> [1] https://github.com/openstack/masakari-specs
>>
>>
>>
>> --- Regards,
>>
>> Sampath
>>
>>
>>
>>
>>
>>
>>
>> On Thu, May 18, 2017 at 2:29 AM, Adam Spiers <aspiers at suse.com> wrote:
>>
>> I don't see any reason why masakari couldn't handle that, but you'd
>>
>> have to ask Sampath and the masakari team whether they would consider
>>
>> that in scope for their roadmap.
>>
>>
>>
>> Waines, Greg <Greg.Waines at windriver.com> wrote:
>>
>>
>>
>> Sure.  I can propose a new user story.
>>
>>
>>
>> And then are you thinking of including this user story in the scope of
>>
>> what masakari would be looking at ?
>>
>>
>>
>> Greg.
>>
>>
>>
>>
>>
>> From: Adam Spiers <aspiers at suse.com>
>>
>> Reply-To: "openstack-dev at lists.openstack.org"
>>
>> <openstack-dev at lists.openstack.org>
>>
>> Date: Wednesday, May 17, 2017 at 10:08 AM
>>
>> To: "openstack-dev at lists.openstack.org"
>>
>> <openstack-dev at lists.openstack.org>
>>
>> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>>
>> Healthcheck Monitoring
>>
>>
>>
>> Thanks for the clarification Greg.  This sounds like it has the
>>
>> potential to be a very useful capability.  May I suggest that you
>>
>> propose a new user story for it, along similar lines to this existing
>>
>> one?
>>
>>
>>
>>
>>
>>
>> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html
>>
>>
>>
>> Waines, Greg <Greg.Waines at windriver.com> wrote:
>>
>> Yes, that’s correct.
>>
>> VM Heartbeating / Health-check Monitoring would introduce intrusive /
>>
>> white-box type monitoring of VMs / Instances.
>>
>>
>>
>> I realize this is somewhat in the gray-zone of what a cloud should be
>> monitoring or not,
>> but I believe it provides an alternative for Applications deployed in VMs
>> that do not have an external monitoring/management entity like a VNF
>> Manager in the MANO architecture.
>>
>> And even for VMs with VNF Managers, it provides a highly reliable
>> alternate monitoring path that does not rely on Tenant Networking.
>>
>>
>>
>> You’re correct that VM HB/HC Monitoring would leverage
>> https://wiki.libvirt.org/page/Qemu_guest_agent
>> and that this would require the agent to be installed in the images for
>> talking back to the compute host.
>> (There are other examples of similar approaches in OpenStack: the
>> murano-agent for installation, the swift-agent for object store management.)
>>
>> Although here, in the case of VM HB/HC Monitoring via the QEMU Guest
>> Agent, the messaging path is internal, thru a QEMU virtual serial device,
>> i.e. a very simple interface with very few dependencies. It’s up and
>> available very early in the VM lifecycle and virtually always up.
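
The heartbeat challenge described above could be sketched roughly as
follows. This is only an illustration, not masakari code: `heartbeat_ok`
and the `send` transport are hypothetical names, and the sketch assumes
the challenge is the guest agent's `guest-ping` command, with any reply
carrying a "return" key counted as an ACK. On a compute host the transport
would presumably wrap the libvirt guest agent channel; that wiring is left
out here.

```python
import json
from typing import Callable, Optional

# Hypothetical transport: takes a JSON command string and returns the
# agent's JSON reply string, or None (or raises) when the guest is silent.
AgentTransport = Callable[[str], Optional[str]]


def heartbeat_ok(send: AgentTransport) -> bool:
    """Return True if the guest agent ACKs a single ping challenge."""
    request = json.dumps({"execute": "guest-ping"})
    try:
        reply = send(request)
    except Exception:
        # Timeout or channel error: treat as a missed heartbeat.
        return False
    if reply is None:
        return False
    try:
        parsed = json.loads(reply)
    except ValueError:
        # Garbled reply: also treated as a missed heartbeat.
        return False
    # A successful guest agent reply carries a "return" key.
    return isinstance(parsed, dict) and "return" in parsed
```

Keeping the transport injectable means the ACK logic can be exercised
without a running VM, e.g. `heartbeat_ok(lambda cmd: '{"return": {}}')`.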
>>
>>
>>
>> Wrt failure modes / use-cases
>>
>>
>>
>> * A VM’s response to a Heartbeat Challenge Request can be as simple as
>>   just ACK-ing. This alone allows for detection of:
>>     - a failed or hung QEMU/KVM instance, or
>>     - a failed or hung VM’s OS, or
>>     - a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
>>     - a failure of the VM to route basic IO via linux sockets.
>>
>> * I have had feedback that this is similar to the virtual hardware
>>   watchdog of QEMU/KVM
>>   ( https://libvirt.org/formatdomain.html#elementsWatchdog )
>>
>> * However, the VM Heartbeat / Health-check Monitoring
>>     - provides a higher-level (i.e. application-level) heartbeating,
>>       i.e. if the Heartbeat requests are being answered by the
>>       Application running within the VM,
>>     - provides more than just heartbeating, as the Application can use it
>>       to trigger a variety of audits,
>>     - provides a mechanism for the Application within the VM to report a
>>       Health Status / Info back to the Host / Cloud,
>>     - provides notification of the Heartbeat / Health-check status to
>>       higher-level cloud entities thru Vitrage, e.g.
>>         VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager
>>                                             - (StateChange) - Nova - ... - VNF Manager
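
The detection and notification steps in the list above might look something
like the sketch below. Everything in it is an assumption made for
illustration: the `HeartbeatMonitor` class, the `max_missed` threshold and
the `notify` callback (a stand-in for whatever would forward the status to
Vitrage) are hypothetical and not part of masakari or Vitrage.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat ACKs for one instance and reports status changes."""

    instance_id: str
    # Hypothetical callback: receives (instance_id, status).
    notify: Callable[[str, str], None]
    max_missed: int = 3      # assumed threshold of consecutive misses
    missed: int = 0
    healthy: bool = True

    def record(self, acked: bool) -> None:
        """Record the outcome of one heartbeat challenge."""
        if acked:
            self.missed = 0
            if not self.healthy:
                # Instance recovered: report the change once.
                self.healthy = True
                self.notify(self.instance_id, "healthy")
            return
        self.missed += 1
        if self.healthy and self.missed >= self.max_missed:
            # Threshold reached: report the failure once, not on every miss.
            self.healthy = False
            self.notify(self.instance_id, "unhealthy")
```

Reporting only on state transitions keeps the higher-level entities from
being flooded while an instance stays down.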
>>
>>
>>
>>
>>
>> Greg.
>>
>>
>>
>>
>>
>> From: Adam Spiers <aspiers at suse.com>
>> Reply-To: "openstack-dev at lists.openstack.org"
>> <openstack-dev at lists.openstack.org>
>> Date: Tuesday, May 16, 2017 at 7:29 PM
>> To: "openstack-dev at lists.openstack.org"
>> <openstack-dev at lists.openstack.org>
>> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
>> Healthcheck Monitoring
>>
>> Waines, Greg <Greg.Waines at windriver.com> wrote:
>>
>> thanks for the pointers Sam.
>>
>>
>>
>> I took a quick look.
>>
>> I agree that the VM Heartbeat / Health-check looks like a good fit into
>>
>> Masakari.
>>
>>
>>
>> Currently your instance monitoring looks like it is strictly black-box
>> type monitoring thru libvirt events.
>> Is that correct?
>> i.e. you do not do any intrusive type monitoring of the instance thru the
>> QEMU Guest Agent facility, correct?
>>
>>
>>
>> That is correct:
>>
>>
>>
>>
>>
>>
>> https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/instancemonitor/instance.py
>>
>>
>>
>> I think this is what VM Heartbeat / Health-check would add to Masakari.
>>
>> Let me know if you agree.
>>
>>
>>
>> OK, so you are looking for something slightly different I guess, based
>>
>> on this QEMU guest agent?
>>
>>
>>
>>     https://wiki.libvirt.org/page/Qemu_guest_agent
>>
>>
>>
>> That would require the agent to be installed in the images, which is
>>
>> extra work but I imagine quite easily justifiable in some scenarios.
>>
>> What failure modes do you have in mind for covering with this
>>
>> approach - things like the guest kernel freezing, for instance?
>>
>>
>>
>>
>>
>> __________________________________________________________________________
>>
>> OpenStack Development Mailing List (not for usage questions)
>>
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
>
> --
> Regards,
> Vikash
>


