[Openstack] [libvirt] [RFC PATCH] lxc: don't return error on GetInfo when cgroups not yet set up

Serge Hallyn serge.hallyn at canonical.com
Fri Sep 30 14:52:03 UTC 2011


Quoting Serge E. Hallyn (serge.hallyn at canonical.com):
> Quoting Daniel P. Berrange (berrange at redhat.com):
> > On Wed, Sep 28, 2011 at 02:14:52PM -0500, Serge E. Hallyn wrote:
> > > Nova (openstack) calls libvirt to create a container, then
> > > periodically checks using GetInfo to see whether the container
> > > is up.  If it does this too quickly, then libvirt returns an
> > > error, which in libvirt.py causes an exception to be raised,
> > > the same type as if the container was bad.
> > lxcDomainGetInfo() holds a mutex on 'dom' for the duration of
> > its execution. It checks for virDomainObjIsActive() before
> > trying to use the cgroups.
> 
> Yes, it does, but
> 
> > lxcDomainStart() holds the mutex on 'dom' for the duration of
> > its execution, and does not return until the container is running
> > and cgroups are present.
> 
> No.  It calls the lxc_controller with --background.  The controller
> main task in turn exits before the cgroups have been set up.  There
> is the race.

So what is the right fix here?  Should the controller write out another
file once it is past the part which should be locked, and the driver
wait for that file to exist before it drops the driver mutex?  If we
do that, do we risk having the driver hang when the controller has
hung?
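
To make the hang question concrete, here is a rough sketch of what a
bounded wait could look like.  It is in Python rather than the driver's
C, purely for brevity, and it is not existing libvirt code: the function
name, the timeout values, and the is_alive callback (standing in for
"is the controller process still there?") are all made up for
illustration.

```python
import os
import time


def wait_for_ready_file(path, timeout=5.0, poll_interval=0.05, is_alive=None):
    """Wait for a (hypothetical) handshake file written by the controller.

    Returns True once `path` exists.  Returns False if `timeout` seconds
    pass first, or if the optional `is_alive` callback reports that the
    peer has gone away without ever creating the file.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if is_alive is not None and not is_alive():
            # The peer may have written the file just before exiting,
            # so check one last time before giving up.
            return os.path.exists(path)
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

With a deadline like this the driver would hold its mutex for at most
the timeout, so a hung controller would turn into a start failure rather
than a hung driver.  Whether polling is acceptable, and what timeout is
reasonable, are open questions.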

-serge