[OpenStack-Infra] ze04 & #532575

Ian Wienand iwienand at redhat.com
Thu Jan 11 04:53:30 UTC 2018


Hi,

To save you from having to pull apart the logs starting around [1]:
we determined that ze04.o.o was externally rebooted at 01:00 UTC
(there is a rather odd support ticket you can look at, assigned to a
Rackspace employee but sitting in our queue, saying the host became
unresponsive).

Unfortunately that left a bunch of jobs orphaned and necessitated a
restart of zuul.

However, the recent changes to stop running the executor as root [2]
were, as a result, partially rolled out on ze04 when it came up after
the reboot.  As a consequence, when the host came back up the executor
was running as root with an invalid finger server.

The executor on ze04 has been stopped, and the host placed in the
emergency file to avoid it coming back.  There are now some in-flight
patches to complete this transition, which will need to be staged a
bit more manually.

The other executors have been left as is, on the KISS theory that
they shouldn't restart and pick up the code until this has been dealt
with.

Thanks,

-i


[1] http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-01-11.log.html#t2018-01-11T01:09:20
[2] https://review.openstack.org/#/c/532575/


