[tripleo] Stop using host's /run|/var/run inside containers
Sofer Athlan-Guyot
sathlang at redhat.com
Fri Jun 19 12:48:10 UTC 2020
Hi,
not really a reply, but the title piqued my curiosity, so here is the
output of a quick command which might give more context.
Cédric Jeanneret <cjeanner at redhat.com> writes:
> On 6/18/20 9:42 AM, Cédric Jeanneret wrote:
>> Hello all!
>>
>> While working on podman integration, especially the SELinux part of it,
>> I was wondering why we kept using the host's /run (or its replicated
>> /var/run) location inside containers. And I'm still wondering, 2 years
>> later ;).
>>
>> Reasons:
>> - from time to time, there are patches adding a ":z" flag to the run
>> bind-mount. This breaks the system, since the host systemd can't
>> write/access container_file_t SELinux context. Doing a relabeling might
>> therefore prevent a service restart.
>>
>> - in order to keep things in a clean, understandable tree, getting a
>> dedicated shared directory for the container's sockets makes sense, as
>> it might make things easier to check (for instance, "is this or that
>> service running in a container?")
>>
>> - if an operator runs a restorecon during runtime, it will break
>> container services
>>
>> - mounting /run directly in the containers might expose unwanted
>> sockets, such as DBus (this creates SELinux denials, and we're
>> monkey-patching things and doing really ugly changes to prevent it).
>> It's more than probable other unwanted shared sockets end in the
>> containers, and it might expose the host at some point. Here again, from
>> time to time we see new SELinux policies being added in order to solve
>> the denials, and it creates big holes in the host security
>>
>> AFAIK, no *host* service is accessed by any container services, right?
>> If so, could we imagine moving the shared /run to some other location on
>> the host, such as /run/containers, or /container-run, or any other
>> *dedicated* location we can manage as we want on a SELinux context?
>
> Small addendum/errata:
>
> some containers DO need to access some specific sockets/directories in
> /run, such as /run/netns and, probably, /run/openvswitch (iirc this one
> isn't running in a container).
> For those specific cases, we can of course mount the specific locations
> inside the container's /run.
>
> This addendum doesn't change the main question though :)
>
Out of curiosity, I ran the command below on a controller and a compute
node (train ... sorry, old version, but the command still stands).
It lists all the containers that mount something under run:
for i in $(podman ps --format '{{.Names}}'); do
    echo "$i"
    podman inspect "$i" | jq '.[]|.Mounts[]|.Source + " -> " + .Destination'
done | awk '/^[a-z]/{container=$1}/run/{print container " : " $0}'
# controller:
swift_proxy : "/run -> /run"
ceph-mgr-controller-0 : "/var/run/ceph -> /var/run/ceph"
ceph-mon-controller-0 : "/var/run/ceph -> /var/run/ceph"
openstack-cinder-backup-podman-0 : "/run -> /run"
ovn_controller : "/run -> /run"
ovn_controller : "/var/lib/openvswitch/ovn -> /run/ovn"
nova_scheduler : "/run -> /run"
iscsid : "/run -> /run"
ovn-dbs-bundle-podman-0 : "/var/lib/openvswitch/ovn -> /run/openvswitch"
ovn-dbs-bundle-podman-0 : "/var/lib/openvswitch/ovn -> /run/ovn"
redis-bundle-podman-0 : "/var/run/redis -> /var/run/redis"
# compute
nova_compute : "/run -> /run"
ovn_metadata_agent : "/run/netns -> /run/netns"
ovn_metadata_agent : "/run/openvswitch -> /run/openvswitch"
ovn_controller : "/run -> /run"
ovn_controller : "/var/lib/openvswitch/ovn -> /run/ovn"
nova_migration_target : "/run/libvirt -> /run/libvirt"
iscsid : "/run -> /run"
nova_libvirt : "/run -> /run"
nova_libvirt : "/var/run/libvirt -> /var/run/libvirt"
nova_virtlogd : "/run -> /run"
nova_virtlogd : "/var/run/libvirt -> /var/run/libvirt"
neutron-haproxy-ovnmeta-a80e1d01-9c65-4fd3-8393-0bf5b66d175e : "/run/netns -> /run/netns"
So the usual suspects in this particular example seem to be
cinder-backup, iscsid, ceph, swift, redis.
Openvswitch seems to do the right thing here.
I guess that the nova ones must be required somehow.
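
To separate the containers that bind-mount the whole host /run (or
/var/run) from those that only mount a subdirectory, the listing can be
filtered further. This is only a minimal sketch: whole_run_mounts is a
hypothetical helper name, and it assumes its input has exactly the
'name : "source -> destination"' format printed above.

```shell
# Hypothetical helper: reads lines of the form
#   name : "source -> destination"
# on stdin and prints the names of containers whose mount source is
# exactly /run or /var/run (the whole directory, not a subdirectory).
whole_run_mounts() {
    awk -F' : ' '
        {
            mount = $2
            gsub(/"/, "", mount)          # drop the quotes added by jq
            split(mount, parts, " -> ")   # parts[1] = source, parts[2] = destination
            if (parts[1] == "/run" || parts[1] == "/var/run") print $1
        }
    ' | sort -u
}
```

Piping the output of the loop above into whole_run_mounts should leave
only the containers worth looking at first for the proposed change.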
>
>>
>> I would therefore like to get some feedback about this proposed change.
>>
>> For the containers, nothing should change:
>> - they will get their /run populated with other containers' sockets
>> - they will NOT be able to access the host services at all.
>>
>> Thank you for your feedback, ideas and thoughts!
>>
>> Cheers,
>>
>> C.
>>
>
> --
> Cédric Jeanneret (He/Him/His)
> Sr. Software Engineer - OpenStack Platform
> Deployment Framework TC
> Red Hat EMEA
> https://www.redhat.com/
>
--
Sofer Athlan-Guyot
chem on #irc
DFG:Upgrades