[openstack-dev] [Containers] Nova virt driver requirements

Daniel P. Berrange berrange at redhat.com
Thu Jul 10 15:32:22 UTC 2014


On Thu, Jul 10, 2014 at 08:19:36AM -0700, James Bottomley wrote:
> On Thu, 2014-07-10 at 14:47 +0100, Daniel P. Berrange wrote:
> > On Thu, Jul 10, 2014 at 05:36:59PM +0400, Dmitry Guryanov wrote:
> > > I have a question about mounts: in the OpenVZ project each container has its
> > > own filesystem in an image file. So to start a container we mount this
> > > filesystem in the host OS (because all containers share the same Linux
> > > kernel). Is this a security problem from the OpenStack developers' point of view?
> > > 
> > > 
> > > I ask because libvirt's driver uses libguestfs to copy some files into
> > > the guest filesystem instead of a simple mount on the host. Mounting with
> > > libguestfs is slower than mounting on the host, so there must be strong
> > > reasons why the libvirt driver does it.
> > 
> > We consider mounting untrusted filesystems on the host kernel to be
> > an unacceptable security risk. A user can craft a malicious filesystem
> > that exploits bugs in the kernel filesystem drivers. This is particularly
> > bad if you allow the kernel to probe for filesystem type since Linux
> > has many many many filesystem drivers most of which are likely not
> > audited enough to be considered safe against malicious data. Even the
> > mainstream ext4 driver had a crasher bug present for many years:
> > 
> >   https://lwn.net/Articles/538898/
> >   http://libguestfs.org/guestfs.3.html#security-of-mounting-filesystems
> 
> Actually, there's a hidden assumption here that makes this statement not
> necessarily correct for containers.  You're assuming the container has
> to have raw access to the device it's mounting.  For hypervisors, this
> is true, but it doesn't have to be for containers because the mount
> operation is separate from raw read and write so we can allow or deny
> them granularly.

I wasn't actually. In the Libvirt LXC case, Nova takes an image from
glance and mounts it on the host, and then sets up the container to
have its root at the filesystem on the host where it mounted the
image. So the container does not have any raw block access, but Nova
is still mounting an untrusted image from Glance in the host which
is a risk.

> Consider the old use case, where the container root is actually a
> subdirectory of the host filesystem which gets bind mounted.  The
> container has no possibility of altering the underlying block device
> there.  For block roots, which we also do, at least in the VPS world,
> they're mostly initialised by the hosting provider and the VPS
> environment doesn't actually get to read or write directly to them
> (there's often a block on this).  Of course, they *can* be set up so the
> VPS has raw access and I believe some are, but it's a choice not a
> requirement.

Where you could avoid the risk is if the image you're getting from
glance is not in fact a filesystem, but rather a tar.gz of the container
filesystem. Then Nova would simply be extracting the contents of the
tar archive and not accessing an untrusted filesystem image from
glance. IIUC, this is more or less what Docker does.
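
A minimal sketch of that idea, assuming the image has already been
downloaded to a local path: the archive is unpacked in userspace, so
no kernel filesystem driver ever parses the untrusted data. The helper
name extract_rootfs and its checks are hypothetical and are not Nova
or Docker code; a real implementation would need more hardening (file
ownership, size limits, and so on).

```python
import os
import tarfile


def extract_rootfs(archive_path, dest_dir):
    """Unpack a tar(.gz) archive into dest_dir without mounting anything.

    Hypothetical helper for illustration only. It refuses members that
    would be written outside dest_dir, device nodes, and hard links
    whose target resolves outside dest_dir.
    """
    dest_root = os.path.realpath(dest_dir)
    os.makedirs(dest_root, exist_ok=True)

    def inside(path):
        # True if path, with symlinks resolved, is dest_root or below it.
        real = os.path.realpath(path)
        return os.path.commonpath([dest_root, real]) == dest_root

    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            target = os.path.join(dest_root, member.name)
            # Resolve the parent on every iteration so that a symlink
            # extracted earlier cannot redirect a later member.
            if not inside(os.path.dirname(target)):
                raise ValueError("member escapes root: %r" % member.name)
            # Do not write through a symlink left at the target path.
            if os.path.islink(target):
                raise ValueError("member overwrites link: %r" % member.name)
            if member.isdev():
                raise ValueError("device node in archive: %r" % member.name)
            if member.islnk() and not inside(
                    os.path.join(dest_root, member.linkname)):
                raise ValueError("hard link escapes root: %r" % member.name)
            tar.extract(member, dest_root)
    return dest_root
```

Extracting one member at a time, rather than validating the whole
list up front, is deliberate: the checks then see any symlinks that
earlier members created. The archive parser still handles untrusted
input, but it runs as an ordinary process instead of in the kernel.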

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


