[openstack-dev] [kolla] on Dockerfile patterns

David Vossel dvossel at redhat.com
Wed Oct 15 15:09:03 UTC 2014



----- Original Message -----
> I'm not arguing that everything should be managed by one systemd; I'm just
> saying that, for certain types of containers, a single docker container with
> systemd in it might be preferable to trying to slice it unnaturally into
> several containers.
> 
> Systemd has invested a lot of time/effort to be able to relaunch failed
> services, support spawning and maintaining unix sockets and services across
> them, etc., all of which you'd have to push out of and across docker
> containers. All of that can be done, but why reinvent the wheel? Like you
> said, pacemaker can be made to make it all work, but I have yet to see a way
> to deploy pacemaker services anywhere near as easily as systemd+yum makes it.
> (Thanks be to redhat. :)
> 
> The answer seems to be, it's not "dockerish". That's ok. I just wanted to
> understand the issue for what it is: whether there is a really good reason
> for not wanting to do it, or whether it's just "not the way things are done".
> I've had kind of the opposite feeling regarding docker containers. Docker
> used to do very bad things when killing the container; nasty if you wanted
> your database not to go corrupt. Killing pid 1 is a bit sketchy, and then
> forcing the container down after 10 seconds was particularly bad. Having
> something like systemd in place allows the database to be notified and then
> shut down properly. Sure, you can script up enough shell to make this work,
> but you have to write some difficult code, over and over again... Docker has
> gotten better more recently, but it still makes me a bit nervous to use it
> for stateful things.
> 
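[The "script up enough shell" alternative mentioned above can be as small as a
signal-forwarding entrypoint. A minimal sketch, using `sleep` as a stand-in for
the real service; a real container would wrap its database server instead:]

```shell
# run_service: start a command, forward SIGTERM/SIGINT to it, and wait.
# Running something like this as pid 1 lets "docker stop" deliver SIGTERM
# to the wrapped service so it can shut down cleanly instead of being
# killed when the grace period expires.
run_service() {
  "$@" &                                     # launch the service in the background
  pid=$!
  trap 'kill -TERM "$pid" 2>/dev/null' TERM INT
  wait "$pid"                                # returns when the service exits
}

run_service sleep 0.1                        # demo: stand-in for the real service
echo "service exited cleanly"
```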
> As for recovery, systemd can do the recovery too. I'd argue at this point in
> time, I'd expect systemd recovery to probably work better than some custom

Yes, systemd can do recovery, and that is part of the problem. From my perspective
there should be exactly one resource management system. Whether that is pacemaker,
kubernetes, or some other distributed system doesn't matter. If you mix systemd
with these external distributed orchestration/management tools, you have containers
that are silently failing and recovering without the management layer having any clue.

What we want is centralized recovery: one tool responsible for detecting failures
and invoking recovery. Everything else in the system is designed to make that possible.

If we want to put a process in the container to manage multiple services, we'd need
the ability to escalate failures to the distributed management tool. Systemd could
work if it were able to act more as a watchdog after starting services than to invoke
recovery itself. If systemd could be configured to die (or potentially to clean up
the container's resources gracefully before dying) whenever a failure is detected,
then systemd might make sense.
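[For what it's worth, a unit-level knob along these lines exists; a sketch only,
assuming a systemd new enough to support FailureAction=exit and a hypothetical
mysqld unit:]

```ini
# Sketch: if the service fails, tell the service manager itself to exit.
# With systemd as pid 1 of the container, that ends the container, so the
# external management layer finally sees the failure. Unit name and
# ExecStart path are hypothetical.
[Unit]
Description=Database whose failure takes the whole container down
FailureAction=exit

[Service]
ExecStart=/usr/sbin/mysqld
Restart=no
```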

I'm approaching this from a system management point of view. Running systemd in a
one-off container that you're managing manually does not have the same drawbacks.
I don't have a vendetta against systemd or anything; I just think it's a step backwards
to put systemd in containers. I see little value in having containers become lightweight
virtual machines. Containers have much more to offer.

-- Vossel



> shell scripts when it comes to doing the right thing recovering at bring-up.
> The other thing is, recovery is not just about pid 1 going away. Often it
> sticks around while other badness is going on. Pid 1 dying is one way to know
> things are bad, but you can't necessarily rely on it to know the container's
> healthy. You need more robust checks for that.
> 
> Thanks,
> Kevin
> 
> ________________________________________
> From: David Vossel [dvossel at redhat.com]
> Sent: Tuesday, October 14, 2014 4:52 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns
> 
> ----- Original Message -----
> > Ok, why are you so down on running systemd in a container?
> 
> It goes against the grain.
> 
> From a distributed systems view, we gain quite a bit of control by maintaining
> "one service per container". Containers can be re-organised and re-purposed
> dynamically. If we have systemd trying to manage an entire stack of resources
> within a container, we lose this control.
> 
> From my perspective a containerized application stack needs to be managed
> externally by whatever is orchestrating the containers to begin with. When we
> take a step back and look at how we actually want to deploy containers,
> systemd doesn't make much sense. It actually limits us in the long run.
> 
> Also... recovery. Using systemd to manage a stack of resources within a single
> container makes it difficult for whatever is externally enforcing the
> availability of that container to detect the health of the container. As it is
> now, the actual service is pid 1 of a container. If that service dies, the
> container dies. If systemd is pid 1, there can be all kinds of chaos occurring
> within the container, but the external distributed orchestration system won't
> have a clue (unless it invokes some custom health monitoring tools within the
> container itself, which will likely be the case someday).
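[A crude sketch of what such external monitoring could look like. The decision
logic is factored out so it can be shown on its own; the docker invocation and
the container name are hypothetical:]

```shell
# check_state: decide, from the output of
#   docker inspect --format '{{.State.Running}}' <container>
# whether to escalate to the orchestration layer.
check_state() {
  if [ "$1" != "true" ]; then
    echo "container down; trigger recovery"
  else
    echo "container healthy"
  fi
}

# In real use (hypothetical container name):
#   check_state "$(docker inspect --format '{{.State.Running}}' nova-compute)"
check_state "false"   # prints "container down; trigger recovery"
```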
> 
> -- Vossel
> 
> 
> > Pacemaker works, but it's kind of a pain to set up compared to just yum
> > installing a few packages and setting init to systemd. There are some
> > benefits for sure, but if you have to force all the docker components onto
> > the same physical machine anyway, why bother with the extra complexity?
> >
> > Thanks,
> > Kevin
> >
> > ________________________________________
> > From: David Vossel [dvossel at redhat.com]
> > Sent: Tuesday, October 14, 2014 3:14 PM
> > To: OpenStack Development Mailing List (not for usage questions)
> > Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns
> >
> > ----- Original Message -----
> > > Same thing works with cloud init too...
> > >
> > >
> > > I've been waiting on systemd working inside a container for a while. it
> > > seems
> > > to work now.
> >
> > oh no...
> >
> > > The idea being it's hard to write a shell script to get everything up
> > > and running with all the interactions that may need to happen. The init
> > > system's already designed for that. Take a nova-compute docker container
> > > for example: you probably need nova-compute, libvirt,
> > > neutron-openvswitch-agent, and the ceilometer-agent all baked in. Writing
> > > a shell script to get it all started and shut down properly would be
> > > really ugly.
> > >
> > > You could split it up into 4 containers and try to ensure they are
> > > coscheduled and all the pieces are able to talk to each other, but why?
> > > Putting them all in one container with systemd starting the subprocesses
> > > is much easier and shouldn't have many drawbacks. The components' code is
> > > designed and tested assuming the pieces are all together.
> >
> > What you need is a dependency model that is enforced outside of the
> > containers: something that manages the order containers are
> > started/stopped/recovered in. This allows you to isolate your containers
> > with one service per container, yet still express that the container with
> > service A needs to start before the container with service B.
> >
> > Pacemaker does this easily. There's even a docker resource-agent for
> > Pacemaker now.
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker
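[A sketch of what driving that resource agent might look like; this is a setup
fragment, not runnable as-is, and assumes a working cluster with pcs installed.
All resource names and images are made up:]

```shell
# Hypothetical Pacemaker configuration: one docker resource per service,
# plus an ordering constraint so service A's container starts first.
pcs resource create svc-a ocf:heartbeat:docker image=myrepo/service-a
pcs resource create svc-b ocf:heartbeat:docker image=myrepo/service-b
pcs constraint order start svc-a then start svc-b
```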
> >
> > -- Vossel
> >
> > ps. don't run systemd in a container... If you think you should, talk to me
> > first.
> >
> > >
> > > You can even add a ssh server in there easily too and then ansible in to
> > > do
> > > whatever other stuff you want to do to the container like add other
> > > monitoring and such....
> > >
> > > Ansible or puppet or whatever should work better in this arrangement too
> > > since existing code assumes you can just systemctl start foo;
> > >
> > > Kevin
> > > ________________________________________
> > > From: Lars Kellogg-Stedman [lars at redhat.com]
> > > Sent: Tuesday, October 14, 2014 12:10 PM
> > > To: OpenStack Development Mailing List (not for usage questions)
> > > Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns
> > >
> > > On Tue, Oct 14, 2014 at 02:45:30PM -0400, Jay Pipes wrote:
> > > > With Docker, you are limited to the operating system of whatever the
> > > > image
> > > > uses.
> > >
> > > See, that's the part I disagree with.  What I was saying about ansible
> > > and puppet in my email is that I think the right thing to do is take
> > > advantage of those tools:
> > >
> > >   FROM ubuntu
> > >
> > >   RUN apt-get update && apt-get install -y ansible
> > >   COPY my_ansible_config.yaml /my_ansible_config.yaml
> > >   RUN ansible-playbook -i localhost, -c local /my_ansible_config.yaml
> > >
> > > Or:
> > >
> > >   FROM fedora
> > >
> > >   RUN yum install -y ansible
> > >   COPY my_ansible_config.yaml /my_ansible_config.yaml
> > >   RUN ansible-playbook -i localhost, -c local /my_ansible_config.yaml
> > >
> > > Put the minimal instructions in your dockerfile to bootstrap your
> > > preferred configuration management tool. This is exactly what you
> > > would do when booting, say, a Nova instance into an openstack
> > > environment: you can provide a shell script to cloud-init that would
> > > install whatever packages are required to run your config management
> > > tool, and then run that tool.
> > >
> > > Once you have bootstrapped your cm environment you can take advantage
> > > of all those distribution-agnostic cm tools.
> > >
> > > In other words, using docker is no more limiting than using a vm or
> > > bare hardware that has been installed with your distribution of
> > > choice.
> > >
> > > > [1] Is there an official MySQL docker image? I found 553 Dockerhub
> > > > repositories for MySQL images...
> > >
> > > Yes, it's called "mysql".  It is in fact one of the official images
> > > highlighted on https://registry.hub.docker.com/.
> > >
> > > > >I have looked into using Puppet as part of both the build and runtime
> > > > >configuration process, but I haven't spent much time on it yet.
> > > >
> > > > Oh, I don't think Puppet is any better than Ansible for these things.
> > >
> > > I think it's pretty clear that I was not suggesting it was better than
> > > ansible.  That is hardly relevant to this discussion.  I was only saying
> > > that is what *I* have looked at, and I was agreeing that *any*
> > > configuration management system is probably better than writing shell
> > > scripts.
> > >
> > > > How would I go about essentially transferring the ownership of the RPC
> > > > exchanges that the original nova-conductor container managed to the new
> > > > nova-conductor container? Would it be as simple as shutting down the
> > > > old
> > > > container and starting up the new nova-conductor container using things
> > > > like
> > > > --link rabbitmq:rabbitmq in the startup docker line?
> > >
> > > I think that you would not necessarily rely on --link for this sort of
> > > thing.  Under kubernetes, you would use a "service" definition, in
> > > which kubernetes maintains a proxy that directs traffic to the
> > > appropriate place as containers are created and destroyed.
> > >
> > > Outside of kubernetes, you would use some other service discovery
> > > mechanism; there are many available (etcd, consul, serf, etc).
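[As a concrete sketch of that kind of discovery, here is what it could look
like with etcd; a fragment only, assuming a reachable etcd and a made-up key
layout and address:]

```shell
# The rabbitmq container registers its address on start...
etcdctl set /services/rabbitmq/host 10.0.0.5
# ...and a consumer such as nova-conductor looks it up at connect time,
# instead of relying on --link:
etcdctl get /services/rabbitmq/host
```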
> > >
> > > But this isn't particularly a docker problem.  This is the same
> > > problem you would face running the same software on top of a cloud
> > > environment in which you cannot predict things like ip addresses a
> > > priori.
> > >
> > > --
> > > Lars Kellogg-Stedman <lars at redhat.com> | larsks @
> > > {freenode,twitter,github}
> > > Cloud Engineering / OpenStack          | http://blog.oddbit.com/
> > >
> > >
> > > _______________________________________________
> > > OpenStack-dev mailing list
> > > OpenStack-dev at lists.openstack.org
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > >
> >


