[OpenStack-Infra] ze04 & #532575

Paul Belanger pabelanger at redhat.com
Thu Jan 11 16:05:20 UTC 2018


On Thu, Jan 11, 2018 at 07:58:11AM -0500, David Shrewsbury wrote:
> This is probably mostly my fault since I did not WIP or -2 my change in
> 532575 to keep it from getting merged without some infra coordination.
> 
> Because of that change, it is also required that we change the user
> zuul-executor starts as from root to zuul [1], and that we also open up
> the new default finger port on the executors [2]. Once those are in
> place, we should be ok to restart the executors.
> 
> As for ze04, since that one restarted as the 'root' user, and never
> dropped privileges to the 'zuul' user due to 532575, I'm not sure what
> state it is going to be in after applying [1] and [2]. Would it create
> files/directories as root that would now be inaccessible if it were to
> restart with the zuul user? Think logs, work dirs, etc...
> 
For permissions, we should likely confirm that puppet-zuul will properly set
up zuul:zuul ownership on the required folders. Then, after the next puppet
run, we'd be protected.
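
Independent of puppet, one way to see whether ze04 left root-owned files
behind would be to walk the executor's directories and list anything not
owned by the zuul user. This is only a rough sketch: the function name is
made up for illustration, and which directories to scan (logs, work dirs,
etc.) is an assumption that would have to be checked against what
puppet-zuul actually manages.

```python
import os


def paths_not_owned_by(root, uid):
    """Return paths under root (root included) whose owner is not uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat: check a symlink itself rather than its target.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched
```

On the executor this could be called with pwd.getpwnam("zuul").pw_uid and
each directory of interest; an empty list would mean nothing under that
directory needs a chown before restarting as zuul.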
> 
> -Dave
> 
> 
> [1] https://review.openstack.org/532594
> [2] https://review.openstack.org/532709
> 
> 
> On Wed, Jan 10, 2018 at 11:53 PM, Ian Wienand <iwienand at redhat.com> wrote:
> 
> > Hi,
> >
> > To avoid you having to pull apart the logs starting ~ [1], we
> > determined that ze04.o.o was externally rebooted at 01:00UTC (there is
> > a rather weird support ticket which you can look at, which is assigned
> > to a rackspace employee but in our queue, saying the host became
> > unresponsive).
> >
> > Unfortunately that left a bunch of jobs orphaned and necessitated a
> > restart of zuul.
> >
> > However, recent changes to not run the executor as root [2] were thus
> > partially rolled out on ze04 as it came up after reboot.  As a
> > consequence when the host came back up the executor was running as
> > root with an invalid finger server.
> >
> > The executor on ze04 has been stopped, and the host placed in the
> > emergency file to avoid it coming back.  There are now some in-flight
> > patches to complete this transition, which will need to be staged a
> > bit more manually.
> >
> > The other executors have been left as is, based on the KISS theory
> > they shouldn't restart and pick up the code until this has been dealt
> > with.
> >
> > Thanks,
> >
> > -i
> >
> >
> > [1] http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-01-11.log.html#t2018-01-11T01:09:20
> > [2] https://review.openstack.org/#/c/532575/
> >
> > _______________________________________________
> > OpenStack-Infra mailing list
> > OpenStack-Infra at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
> 
> 
> 
> 
> -- 
> David Shrewsbury (Shrews)




