[openstack-dev] RFC: Synchronizing hypervisor <-> nova state with event notifications

Daniel P. Berrange berrange at redhat.com
Tue Jan 15 10:52:39 UTC 2013


This email is in relation to the following bug

  https://bugs.launchpad.net/nova/+bug/1099761

If a guest administrator shuts down their VM from inside the guest
OS using 'shutdown -h now' (or equivalent), it will take Nova up to
10 minutes to notice that the VM has stopped running.

The Nova<->hypervisor state synchronization is done in a periodic
task (nova.compute.manager._sync_power_states) which only runs
every 10 minutes. It used to be every 60 seconds until this bug
was addressed:

  https://bugs.launchpad.net/nova/+bug/928910

I put most of the blame for that bug's performance issues on the use
of the wrong libvirt APIs, but nonetheless I don't really want to
put the periodic task frequency back down to 60 seconds.

IMHO the key problem here is the design, not the implementation. In
particular the use of polling by the manager to detect the state
changes is inherently inefficient. Libvirt provides an async event
notification mechanism allowing applications to have a callback
triggered whenever a guest changes lifecycle state. Other hypervisor
APIs like ESX have similar notification mechanisms. Making use of
this would allow state changes to be detected pretty much immediately.
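
For illustration, here is a minimal sketch of receiving lifecycle
events through the libvirt Python bindings. The helper names
(make_lifecycle_callback, listen, notify) are hypothetical and not
part of Nova; the libvirt import is deferred so the callback logic
can be read and exercised without a hypervisor present.

```python
def make_lifecycle_callback(stopped_event, notify):
    """Build a lifecycle callback that reports stopped guests.

    stopped_event is the event code to react to (in real use,
    libvirt.VIR_DOMAIN_EVENT_STOPPED); notify is any callable that
    accepts the guest name. Both parameters are illustrative.
    """
    def _callback(conn, dom, event, detail, opaque):
        # libvirt invokes lifecycle callbacks with these five arguments.
        if event == stopped_event:
            notify(dom.name())
    return _callback


def listen(uri, notify):
    """Block forever, forwarding 'stopped' events to notify()."""
    import libvirt  # deferred: only needed when actually listening

    # The default event loop must be registered before opening
    # the connection.
    libvirt.virEventRegisterDefaultImpl()
    conn = libvirt.openReadOnly(uri)

    callback = make_lifecycle_callback(
        libvirt.VIR_DOMAIN_EVENT_STOPPED, notify)
    # Passing None as the domain subscribes to events for all guests.
    conn.domainEventRegisterAny(
        None, libvirt.VIR_DOMAIN_EVENT_ID_LIFECYCLE, callback, None)

    while True:
        # Each call processes one iteration of the event loop.
        libvirt.virEventRunDefaultImpl()
```

In a real driver the loop would need to run in its own thread and
hand events over to the rest of the service, which this sketch does
not attempt.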

The question is how to structure the processing of events. I don't
think that the hypervisor drivers should be directly processing events.
Instead I believe they need to pass along the event notifications to
the manager.py class. So my current thought is to introduce a new API
to nova.virt.api

  register_event_notifier(self, callback)

and have nova.compute.manager provide a callback impl to receive the
events. Before I start coding on this, I want some kind of confirmation
that this is an acceptable direction to go in, since there are currently
no callback-based interactions between nova.compute.manager & nova.virt.api
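
To make the proposed shape concrete, here is a rough sketch of how the
two sides might fit together. Only register_event_notifier(callback)
comes from the proposal above; the class names, the callback arguments
(instance_uuid, power_state) and the emit() helper are assumptions
made up for this example, not existing Nova code.

```python
class ComputeDriver(object):
    """Stand-in for the virt driver base class (hypothetical)."""

    def register_event_notifier(self, callback):
        # Drivers without event support keep the default behaviour.
        raise NotImplementedError()


class FakeEventDriver(ComputeDriver):
    """Toy driver that forwards events instead of processing them."""

    def __init__(self):
        self._callback = None

    def register_event_notifier(self, callback):
        self._callback = callback

    def emit(self, instance_uuid, power_state):
        # A real driver would call this from its hypervisor event
        # handler; here it is triggered by hand.
        if self._callback is not None:
            self._callback(instance_uuid, power_state)


class ComputeManager(object):
    """Stand-in for the manager, which owns the event processing."""

    def __init__(self, driver):
        self.driver = driver
        self.power_states = {}
        self.driver.register_event_notifier(self.handle_lifecycle_event)

    def handle_lifecycle_event(self, instance_uuid, power_state):
        # The real manager would update the database; the sketch
        # just records the latest state seen per instance.
        self.power_states[instance_uuid] = power_state
```

The point of the split is that the driver only translates and forwards
hypervisor events, while the decision about what a state change means
stays in the manager.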

Even with the register_event_notifier() callbacks, I figure we'd still
run the periodic _sync_power_states() task at 10 minute intervals. It
should merely not have any work to do when it runs. Alternatively we
could disable this periodic task if and only if register_event_notifier()
was successful (ie didn't raise NotImplementedError)
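
The second alternative could look roughly like the sketch below. It
assumes a driver signals lack of support by raising Python's built-in
NotImplementedError; choose_sync_mode() and both driver classes are
hypothetical names used only for illustration.

```python
class PollingOnlyDriver(object):
    """Hypothetical driver with no event support."""

    def register_event_notifier(self, callback):
        raise NotImplementedError()


class EventCapableDriver(object):
    """Hypothetical driver that accepts an event callback."""

    def __init__(self):
        self.callback = None

    def register_event_notifier(self, callback):
        self.callback = callback


def choose_sync_mode(driver, callback):
    """Return 'events' if registration worked, else 'polling'."""
    try:
        driver.register_event_notifier(callback)
    except NotImplementedError:
        # No event support: keep the periodic power state sync.
        return "polling"
    # Registration succeeded, so the periodic task could be skipped.
    return "events"
```

Keeping the periodic task enabled in both cases is the more
conservative choice, since it also covers events that are lost while
the compute service is down or restarting.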

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


