[openstack-dev] [nova] VM diagnostics - V3 proposal

Gary Kotton gkotton at vmware.com
Tue Dec 17 12:28:30 UTC 2013

Following the discussion yesterday I have updated the wiki - please see
https://wiki.openstack.org/wiki/Nova_VM_Diagnostics. The proposal is
backwards compatible and will hopefully provide us with the tools to be
able to troubleshoot VM issues.

On 12/16/13 5:50 PM, "Daniel P. Berrange" <berrange at redhat.com> wrote:

>On Mon, Dec 16, 2013 at 03:37:39PM +0000, John Garbutt wrote:
>> On 16 December 2013 15:25, Daniel P. Berrange <berrange at redhat.com> wrote:
>> > On Mon, Dec 16, 2013 at 06:58:24AM -0800, Gary Kotton wrote:
>> >> I'd like to propose the following for the V3 API (we will not touch
>> >> in case operators have applications that are written against this -
>> >> may be the case for libvirt or xen. The VMware API support was added
>> >> in I1):
>> >>
>> >>  1.  We formalize the data that is returned by the API [1]
>> >
>> > Before we debate what standard data should be returned we need
>> > detail of exactly what info the current 3 virt drivers return.
>> > IMHO it would be better if we did this all in the existing wiki
>> > page associated with the blueprint, rather than etherpad, so it
>> > serves as a permanent historical record for the blueprint design.
>> +1
>> > While we're doing this I think we should also consider whether
>> > the 'get_diagnostics' API is fit for purpose more generally.
>> > eg currently it is restricted to administrators. Some, if
>> > not all, of the data libvirt returns is relevant to the owner
>> > of the VM but they can not get at it.
>> Ceilometer covers that ground, we should ask them about this API.
>If we consider what is potentially in scope for ceilometer and
>subtract that from what the libvirt get_diagnostics impl currently
>returns, you pretty much end up with the empty set. This might cause
>us to question if 'get_diagnostics' should exist at all from the
>POV of the libvirt driver's impl. Perhaps vmware/xen return data
>that is out of scope for ceilometer ?
>> > For a cloud administrator it might be argued that the current
>> > API is too inefficient to be useful in many troubleshooting
>> > scenarios since it requires you to invoke it once per instance
>> > if you're collecting info on a set of guests, eg all VMs on
>> > one host. It could be that cloud admins would be better
>> > served by an API which returned info for all VMs on a host
>> > at once, if they're monitoring say, I/O stats across VM
>> > disks to identify one that is causing I/O trouble ? IOW, I
>> > think we could do with better identifying the usage scenarios
>> > for this API if we're to improve its design / impl.
>> I like the API that helps you dig into info for a specific host that
>> other systems highlight as problematic.
>> You can do things that could be expensive to compute, but useful for
>> troubleshooting.
>If things get expensive to compute, then it may well be preferable to
>have separate APIs for distinct pieces of "interesting" diagnostic
>data. eg If they only care about one particular thing, they don't want
>to trigger expensive computations of things they don't care about seeing.
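
To make the two ideas above concrete (one call covering all VMs on a host, and computing only the diagnostic data the caller asks for), here is a minimal sketch. It is not Nova code and not part of the wiki proposal: the function names, the collector registry and the returned values are all hypothetical stand-ins, and a real driver would query the hypervisor instead.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical collectors: each one computes a single piece of diagnostic
# data for a single instance. The return values are stand-ins only.
def cpu_time(instance_id: str) -> int:
    return 0

def disk_io(instance_id: str) -> Dict[str, int]:
    return {"read_bytes": 0, "write_bytes": 0}

# Hypothetical registry mapping a diagnostic name to its collector.
COLLECTORS: Dict[str, Callable[[str], object]] = {
    "cpu_time": cpu_time,
    "disk_io": disk_io,
}

def host_diagnostics(
    instance_ids: Iterable[str], wanted: List[str]
) -> Dict[str, Dict[str, object]]:
    """Return only the requested diagnostics for every given instance."""
    unknown = [name for name in wanted if name not in COLLECTORS]
    if unknown:
        raise KeyError("unknown diagnostics: %s" % ", ".join(unknown))
    # Collectors that were not requested are never called, so an
    # expensive one costs nothing unless the caller asks for it.
    return {
        iid: {name: COLLECTORS[name](iid) for name in wanted}
        for iid in instance_ids
    }
```

With this shape, an admin chasing an I/O problem could call host_diagnostics(ids, ["disk_io"]) once per host rather than fetching everything once per instance.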
