[openstack-dev] [Containers] Nova virt driver requirements

Eric Windisch ewindisch at docker.com
Fri Jul 11 14:11:07 UTC 2014


> > We consider mounting untrusted filesystems on the host kernel to be
> > an unacceptable security risk. A user can craft a malicious filesystem
> > that exploits bugs in the kernel filesystem drivers. This is particularly
> > bad if you allow the kernel to probe for filesystem type since Linux
> > has many many many filesystem drivers most of which are likely not
> > audited enough to be considered safe against malicious data. Even the
> > mainstream ext4 driver had a crasher bug present for many years
> >
> >   https://lwn.net/Articles/538898/
> >   http://libguestfs.org/guestfs.3.html#security-of-mounting-filesystems
>
> Actually, there's a hidden assumption here that makes this statement not
> necessarily correct for containers.  You're assuming the container has
> to have raw access to the device it's mounting.


I believe it does in the context of the Cinder API, but it does not in the
general context of mounting devices.

I advocate having a filesystem-as-a-service or host-mount-API which nicely
aligns with desires to mount devices on behalf of containers "on the host".
However, that doesn't change the fact that there are APIs and services whose
contract is, explicitly, to provide block devices to guests. I'll reiterate
that this is where the contract should end (it should not extend to the
ability of guest operating systems to mount; that would be silly).

None of this excludes having an opinion that mounting inside of a guest is
a *useful feature*, even if I don't believe it to be a contractually
obligated one. There is probably no harm in contemplating what mounting
inside of a guest would look like.


> For hypervisors, this
> is true, but it doesn't have to be for containers because the mount
> operation is separate from raw read and write so we can allow or deny
> them granularly.
>

I have been considering allowing containers a read-only view of a block
device. We could use seccomp to allow the mount syscall to succeed inside a
container, even though it would otherwise be forbidden because the container
lacks the CAP_SYS_ADMIN capability. The syscall would instead be trapped and
performed by a privileged process elsewhere on the host.
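
To make the idea concrete, here is a rough sketch of the decision such a
privileged process might make once a mount call has been trapped. It is only
an illustration: MountRequest, decide, ALLOWED_FSTYPES and the
attached_volumes mapping are hypothetical names, not part of Nova, Docker or
seccomp, and the trapping itself is not shown. It assumes the helper insists
on an explicit filesystem type (so the kernel never probes) and only mounts
devices that were attached to the requesting container.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical allow-list; an explicit type means the kernel never probes.
ALLOWED_FSTYPES = frozenset({"ext4", "xfs"})


@dataclass(frozen=True)
class MountRequest:
    """A trapped mount call as seen by the helper (hypothetical shape)."""
    container_id: str
    source: str   # block device the container asked to mount
    target: str   # mount point, as an absolute path inside the container
    fstype: str   # must be explicit; values such as "auto" are rejected


def decide(request, attached_volumes, container_root):
    """Return (allowed, detail); on success detail is the host-side target."""
    if request.fstype not in ALLOWED_FSTYPES:
        return False, "filesystem type not allowed: " + request.fstype
    if request.source not in attached_volumes.get(request.container_id, ()):
        return False, "device not attached to container: " + request.source
    target = PurePosixPath(request.target)
    # Lexical check only; a real helper would also have to handle symlinks.
    if not target.is_absolute() or ".." in target.parts:
        return False, "invalid mount point: " + request.target
    # Re-root the container path under the container's root on the host.
    host_target = PurePosixPath(container_root).joinpath(*target.parts[1:])
    return True, str(host_target)
```

Only when decide() says yes would the helper go on to perform the real mount
on the container's behalf; a refusal would be reported back to the container
as a failed mount.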

The read-only view of the block device should not itself be a security
concern. In fact, it could prove to be a useful feature in its own right.
It is the ability to write to the block device which is a risk should it be
mounted.

Having that read-only view also provides a certain awareness to the
container of the existence of that volume. It allows the container to
ATTEMPT to perform a mount operation, even if it's denied by policy. That,
of course, is where seccomp would come into play...

-- 
Regards,
Eric Windisch