[tripleo] Stop using host's /run|/var/run inside containers

Sofer Athlan-Guyot sathlang at redhat.com
Fri Jun 19 12:48:10 UTC 2020


Hi,

Not really a reply, just a quick command I ran because the title piqued
my curiosity; its output might give some more context.

Cédric Jeanneret <cjeanner at redhat.com> writes:

> On 6/18/20 9:42 AM, Cédric Jeanneret wrote:
>> Hello all!
>> 
>> While working on podman integration, especially the SELinux part of it,
>> I was wondering why we kept using the host's /run (or its replicated
>> /var/run) location inside containers. And I'm still wondering, 2 years
>> later ;).
>> 
>> Reasons:
>> - from time to time, there are patches adding a ":z" flag to the run
>> bind-mount. This breaks the system, since the host systemd can't
>> write/access container_file_t SELinux context. Doing a relabeling might
>> therefore prevent a service restart.
>> 
>> - in order to keep things in a clean, understandable tree, getting a
>> dedicated shared directory for the container's sockets makes sense, as
>> it might make things easier to check (for instance, "is this or that
>> service running in a container?")
>> 
>> - if an operator runs a restorecon during runtime, it will break
>> container services
>> 
>> - mounting /run directly in the containers might expose unwanted
>> sockets, such as DBus (this creates SELinux denials, and we're
>> monkey-patching things and doing really ugly changes to prevent it).
>> It's more than probable other unwanted shared sockets end up in the
>> containers, and it might expose the host at some point. Here again, from
>> time to time we see new SELinux policies being added in order to solve
>> the denials, and it creates big holes in the host security
>> 
>> AFAIK, no *host* service is accessed by any container services, right?
>> If so, could we imagine moving the shared /run to some other location on
>> the host, such as /run/containers, or /container-run, or any other
>> *dedicated* location we can manage as we want on a SELinux context?
>
> Small addendum/errata:
>
> some containers DO need to access some specific sockets/directories in
> /run, such as /run/netns and, probably, /run/openvswitch (iirc this one
> isn't running in a container).
> For those specific cases, we can of course mount the specific locations
> inside the container's /run.
>
> This addendum doesn't change the main question though :)
>

So, out of curiosity, I ran the command below on a controller and a
compute node (Train ... sorry, old version, but the command still stands).

List every container mount that involves /run or /var/run:

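# Print each container name, then its mounts as "source -> destination";
# awk remembers the latest name and prefixes every line that mentions "run".
for i in $(podman ps --format '{{.Names}}'); do
    echo "$i"
    podman inspect "$i" | jq '.[]|.Mounts[]|.Source + " -> " + .Destination'
done | awk '/^[a-z]/{container=$1} /run/{print container " : " $0}'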

# controller:

swift_proxy : "/run -> /run"
ceph-mgr-controller-0 : "/var/run/ceph -> /var/run/ceph"
ceph-mon-controller-0 : "/var/run/ceph -> /var/run/ceph"
openstack-cinder-backup-podman-0 : "/run -> /run"
ovn_controller : "/run -> /run"
ovn_controller : "/var/lib/openvswitch/ovn -> /run/ovn"
nova_scheduler : "/run -> /run"
iscsid : "/run -> /run"
ovn-dbs-bundle-podman-0 : "/var/lib/openvswitch/ovn -> /run/openvswitch"
ovn-dbs-bundle-podman-0 : "/var/lib/openvswitch/ovn -> /run/ovn"
redis-bundle-podman-0 : "/var/run/redis -> /var/run/redis"

# compute:

nova_compute : "/run -> /run"
ovn_metadata_agent : "/run/netns -> /run/netns"
ovn_metadata_agent : "/run/openvswitch -> /run/openvswitch"
ovn_controller : "/run -> /run"
ovn_controller : "/var/lib/openvswitch/ovn -> /run/ovn"
nova_migration_target : "/run/libvirt -> /run/libvirt"
iscsid : "/run -> /run"
nova_libvirt : "/run -> /run"
nova_libvirt : "/var/run/libvirt -> /var/run/libvirt"
nova_virtlogd : "/run -> /run"
nova_virtlogd : "/var/run/libvirt -> /var/run/libvirt"
neutron-haproxy-ovnmeta-a80e1d01-9c65-4fd3-8393-0bf5b66d175e : "/run/netns -> /run/netns"

So the usual suspects in this particular example seem to be
cinder-backup, iscsid, ceph, swift and redis.

Openvswitch seems to do the right thing here.

I guess that the nova one must be required somehow.
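
To narrow the listing down to the case the thread is really about, that is
containers that bind-mount the whole host /run (or /var/run) rather than one
specific subdirectory, something like the following could be used. It is only
a rough sketch: the function name whole_run_mounts is made up for this
example, and it assumes its input has exactly the shape printed above
(name : "source -> destination").

```shell
# Hypothetical helper (not part of TripleO or podman).
# Reads lines of the form
#   name : "source -> destination"
# on stdin and prints, once each, the names of containers whose mount
# source is the whole host /run or /var/run.
whole_run_mounts() {
    awk -F' : ' '
        {
            mount = $2
            gsub(/"/, "", mount)           # strip the quotes added by jq
            split(mount, parts, " -> ")    # parts[1] = source, parts[2] = destination
            if (parts[1] == "/run" || parts[1] == "/var/run")
                print $1
        }
    ' | sort -u
}
```

Piping the output of the loop above into whole_run_mounts should leave only
the containers that have a "/run -> /run" line; on the controller listing
that is five of them. Entries such as ceph and redis, which only mount their
own subdirectory of /var/run, are deliberately filtered out.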

>
>> 
>> I would therefore like to get some feedback about this proposed change.
>> 
>> For the containers, nothing should change:
>> - they will get their /run populated with other containers' sockets
>> - they will NOT be able to access the host services at all.
>> 
>> Thank you for your feedback, ideas and thoughts!
>> 
>> Cheers,
>> 
>> C.
>> 
>
> -- 
> Cédric Jeanneret (He/Him/His)
> Sr. Software Engineer - OpenStack Platform
> Deployment Framework TC
> Red Hat EMEA
> https://www.redhat.com/
>
-- 
Sofer Athlan-Guyot
chem on #irc
DFG:Upgrades



