[Openstack] detecting errors when determining libvirt vm power state

Serge Hallyn serge.hallyn at canonical.com
Wed Sep 28 17:01:37 UTC 2011


Hi,

I'm looking at what first manifested as a bug when launching multiple
lxc containers simultaneously, e.g. 'euca-run-instances -n 4', as
reported at https://bugs.launchpad.net/ubuntu/+source/nova/+bug/842845.

The problem appears to be that nova calls self.driver.get_info() to
determine the vm's power state.  Libvirt can raise exceptions on this
for several reasons - the vm could be bad or not exist, or it could be
in a transient state, e.g. cgroups are not set up yet.

What is the right way to handle this?  Should the drivers categorize
their exceptions as either 'broken' or 'transient', so that nova can
detect the former and bail, and retry on the latter?
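
As a rough sketch of what I mean (not existing nova code): assume the
driver wraps whatever libvirt raises in one of two hypothetical
exception classes, TransientInfoError or BrokenInfoError, and the
caller goes through a hypothetical helper, get_info_with_retry(), that
retries only on the transient kind.  The names, the attempt count and
the delay are all made up for illustration.

```python
import time


class DriverInfoError(Exception):
    """Hypothetical base class for failures raised by a driver's get_info()."""


class TransientInfoError(DriverInfoError):
    """The vm may still be coming up (e.g. cgroups not set up yet); retry."""


class BrokenInfoError(DriverInfoError):
    """The vm is bad or does not exist; retrying will not help."""


def get_info_with_retry(get_info, instance_name, attempts=5, delay=0.5,
                        sleep=time.sleep):
    """Call get_info(instance_name), retrying only on transient errors.

    get_info is any callable standing in for the driver's get_info().
    A BrokenInfoError is not caught here, so it propagates on the first
    failure and the caller can bail immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return get_info(instance_name)
        except TransientInfoError:
            if attempt == attempts:
                # Still transient after the last attempt: give up.
                raise
            sleep(delay)
```

The open part is still deciding, inside each driver, which libvirt
errors count as transient and which as broken.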

Note that while the bug was raised for lxc, I suspect the same should
be possible with kvm guests.  However the qemu GetInfo method doesn't
get its cpu/mem usage info from cgroups, so it would not happen in
exactly the same way.

-serge



