<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class=""><br>
> We consider mounting untrusted filesystems on the host kernel to be<br>
> an unacceptable security risk. A user can craft a malicious filesystem<br>
> that exploits bugs in the kernel filesystem drivers. This is particularly<br>
> bad if you allow the kernel to probe for filesystem type since Linux<br>
> has many many many filesystem drivers most of which are likely not<br>
> audited enough to be considered safe against malicious data. Even the<br>
> mainstream ext4 driver had a crasher bug present for many years<br>
><br>
> <a href="https://lwn.net/Articles/538898/" target="_blank">https://lwn.net/Articles/538898/</a><br>
> <a href="http://libguestfs.org/guestfs.3.html#security-of-mounting-filesystems" target="_blank">http://libguestfs.org/guestfs.3.html#security-of-mounting-filesystems</a><br>
<br>
</div>Actually, there's a hidden assumption here that makes this statement not<br>
necessarily correct for containers. You're assuming the container has<br>
to have raw access to the device it's mounting.</blockquote><div><br></div><div>I believe it does in the context of the Cinder API, but it does not in the general context of mounting devices.</div><div><br></div><div>
I advocate having a filesystem-as-a-service or host-mount API, which aligns nicely with the desire to mount devices "on the host" on behalf of containers. However, that doesn't change the fact that there are APIs and services whose contract is, explicitly, to provide block devices to guests. I'll reiterate that this is where the contract should end (it should not extend to the ability of guest operating systems to mount; that would be silly).</div>
<div><br></div><div>None of this excludes having an opinion that mounting inside of a guest is a *useful feature*, even if I don't believe it to be a contractually obligated one. There is probably no harm in contemplating what mounting inside of a guest would look like.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">For hypervisors, this<br>
is true, but it doesn't have to be for containers because the mount<br>
operation is separate from raw read and write so we can allow or deny<br>
them granularly.<br></blockquote><div><br></div><div>I have been considering allowing containers a read-only view of a block device. We could use seccomp to let the mount syscall succeed inside a container even though it would otherwise be denied for lack of the CAP_SYS_ADMIN capability. The syscall would instead be trapped and performed by a privileged process elsewhere on the host.</div>
<div><br></div><div>The read-only view of the block device should not itself be a security concern. In fact, it could prove to be a useful feature in its own right. It is the ability to write to the block device which is a risk should it be mounted.</div>
<div><br></div><div>Having that read-only view also makes the container aware that the volume exists. It allows the container to ATTEMPT a mount operation, even if it is denied by policy. That, of course, is where seccomp would come into play...</div>
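<div><br></div><div>As a rough illustration of the policy side of this idea, here is a minimal sketch of the decision a privileged host-side process might make once a container's mount attempt has been trapped. It does not show the seccomp trap itself. Every name in it (MountRequest, decide, ALLOWED_FSTYPES, the grants mapping) is hypothetical, and the two rules are assumptions: the device must have been granted to that container, and the filesystem type must be named explicitly from an allowlist rather than probed.</div>

```python
from dataclasses import dataclass

# Hypothetical allowlist: filesystem types the host-side process will mount.
# Anything else, including a request to probe ("auto"), is refused.
ALLOWED_FSTYPES = frozenset({"ext4", "xfs"})


@dataclass(frozen=True)
class MountRequest:
    """A mount attempt trapped from a container (all fields hypothetical)."""
    container_id: str
    device: str
    fstype: str


def decide(request, grants):
    """Return (allowed, reason) for a trapped mount request.

    grants maps a container id to the set of devices it may mount.
    """
    # The container may only mount devices it was explicitly given.
    if request.device not in grants.get(request.container_id, set()):
        return False, "device not granted to this container"
    # Never let the kernel probe for the type; require an allowed one.
    if request.fstype not in ALLOWED_FSTYPES:
        return False, "filesystem type not allowed"
    return True, "ok"
```

<div>With grants = {"c1": {"/dev/vdb"}}, a request from "c1" for "/dev/vdb" as "ext4" would be allowed, while the same request with fstype "auto", or from any other container, would be refused before any mount is attempted on the host.</div>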
</div><div><br></div>-- <br><div dir="ltr">Regards,<div>Eric Windisch</div></div>
</div></div>