[openstack-dev] [nova] VM diagnostics - V3 proposal

John Garbutt john at johngarbutt.com
Thu Dec 19 14:27:40 UTC 2013


On 16 December 2013 15:50, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Mon, Dec 16, 2013 at 03:37:39PM +0000, John Garbutt wrote:
>> On 16 December 2013 15:25, Daniel P. Berrange <berrange at redhat.com> wrote:
>> > On Mon, Dec 16, 2013 at 06:58:24AM -0800, Gary Kotton wrote:
>> >> I'd like to propose the following for the V3 API (we will not touch V2
>> >> in case operators have applications that are written against this – this
>> >> may be the case for libvirt or xen. The VMware API support was added
>> >> in I1):
>> >>
>> >>  1.  We formalize the data that is returned by the API [1]
>> >
>> > Before we debate what standard data should be returned we need
>> > detail of exactly what info the current 3 virt drivers return.
>> > IMHO it would be better if we did this all in the existing wiki
>> > page associated with the blueprint, rather than etherpad, so it
>> > serves as a permanent historical record for the blueprint design.
>>
>> +1
>>
>> > While we're doing this I think we should also consider whether
>> > the 'get_diagnostics' API is fit for purpose more generally.
>> > eg currently it is restricted to administrators. Some, if
>> > not all, of the data libvirt returns is relevant to the owner
>> > of the VM but they can not get at it.
>>
>> Ceilometer covers that ground, we should ask them about this API.
>
> If we consider what is potentially in scope for ceilometer and
> subtract that from what the libvirt get_diagnostics impl currently
> returns, you pretty much end up with the empty set. This might cause
> us to question if 'get_diagnostics' should exist at all from the
> POV of the libvirt driver's impl. Perhaps vmware/xen return data
> that is out of scope for ceilometer ?

Hmm, a good point.

>> > For a cloud administrator it might be argued that the current
>> > API is too inefficient to be useful in many troubleshooting
>> > scenarios since it requires you to invoke it once per instance
>> > if you're collecting info on a set of guests, eg all VMs on
>> > one host. It could be that cloud admins would be better
>> > served by an API which returned info for all VMs on a host
>> > at once, if they're monitoring say, I/O stats across VM
>> > disks to identify one that is causing I/O trouble ? IOW, I
>> > think we could do with better identifying the usage scenarios
>> > for this API if we're to improve its design / impl.
>>
>> I like the API that helps you dig into info for a specific host that
>> other systems highlight as problematic.
>> You can do things that could be expensive to compute, but useful for
>> troubleshooting.
>
> If things get expensive to compute, then it may well be preferable to
> have separate APIs for distinct pieces of "interesting" diagnostic
> data. eg If they only care about one particular thing, they don't want
> to trigger expensive computations of things they don't care about seeing.

Maybe that is what we need:
* API to get what ceilometer would tell you, maybe using its format
* API to "perform" expensive diagnostics
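
As a rough sketch of that split (purely illustrative: none of these
function names, keys, or data shapes exist in nova or ceilometer, they
are made up for the example), the cheap call returns only basic
counters, while each expensive diagnostic is registered by name and
runs only when it is explicitly requested:

```python
from typing import Callable, Dict

# Hypothetical sketch; every name and key below is an assumption.

def get_basic_stats(instance: dict) -> dict:
    """Cheap call: return only counters that are already at hand."""
    return {
        "cpu_time_ns": instance["cpu_time_ns"],
        "memory_kb": instance["memory_kb"],
    }

# Registry of expensive checks, keyed by name, so a caller triggers
# only the one computation they actually care about.
EXPENSIVE_DIAGNOSTICS: Dict[str, Callable[[dict], dict]] = {}

def diagnostic(name: str):
    """Decorator that registers an expensive check under `name`."""
    def register(func):
        EXPENSIVE_DIAGNOSTICS[name] = func
        return func
    return register

@diagnostic("disk_io")
def disk_io(instance: dict) -> dict:
    # Stand-in for a costly per-disk scan: total bytes moved per disk.
    return {
        disk: stats["read_bytes"] + stats["write_bytes"]
        for disk, stats in instance["disks"].items()
    }

def perform_diagnostic(instance: dict, name: str) -> dict:
    """Run exactly one named expensive diagnostic."""
    try:
        check = EXPENSIVE_DIAGNOSTICS[name]
    except KeyError:
        raise ValueError(f"unknown diagnostic: {name}") from None
    return check(instance)
```

A caller would use get_basic_stats(vm) for routine monitoring and
perform_diagnostic(vm, "disk_io") only when chasing a specific problem.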

But then, we would just be duplicating ceilometer, which goes back to
your original point. And we are trying to get rid of the APIs that
just proxy to another service, so let's not add another one.

Maybe we should just remove this from the v3 API for now, and see who shouts?

John


