[OpenStack-Infra] opendev.org downtime Thu Jul 25 07:00 UTC 2019

Ian Wienand iwienand at redhat.com
Thu Jul 25 07:57:44 UTC 2019


We received reports of connectivity issues to opendev.org at about
06:30 UTC [1].

After some initial investigation, I could not contact
gitea-lb01.opendev.org via IPv4 or IPv6.
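
For anyone wanting to repeat this kind of check, here is a minimal
sketch of probing a host over IPv4 and IPv6 separately.  It is not the
exact method used above; the helper name check_reachable and the
choice of TCP port are assumptions for illustration only.

```python
import socket


def check_reachable(host, port, timeout=5.0):
    """Try a TCP connection to host:port over IPv4 and IPv6 separately.

    Returns a dict with keys "ipv4" and "ipv6".  Each value is True if a
    connection succeeded, False if addresses resolved but none accepted
    a connection, or None if no address of that family could be resolved.
    """
    results = {}
    for name, family in (("ipv4", socket.AF_INET), ("ipv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No usable address for this family.
            results[name] = None
            continue

        ok = False
        for fam, socktype, proto, _canonname, sockaddr in infos:
            try:
                with socket.socket(fam, socktype, proto) as sock:
                    sock.settimeout(timeout)
                    sock.connect(sockaddr)
                    ok = True
                    break
            except OSError:
                # Refused, timed out, unreachable, or family unsupported
                # locally; try the next address, if any.
                continue
        results[name] = ok
    return results
```

Calling, for example, check_reachable("gitea-lb01.opendev.org", 443)
(the port here is a guess) would report False for both families if the
host is down in the way described above.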

Upon checking its console I saw a range of kernel errors that suggest
the host was probably having issues with its disk [2].

I attempted to hard-reboot it, and it went into an error state.  The
initial error in the server status was

 {'message': 'Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)', 'code': 500, 'created': '2019-07-25T07:25:25Z'}

After a short period, I tried again and got a different error state

 {'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

The vexxhost status page [3] is currently not showing any outages in
the sjc1 region where this resides.

I think this probably requires vexxhost to confirm the status of the
load-balancer VM.

I tried to launch a new node, at least to have it ready in case of
bigger issues.  This failed with errors about the image service [4].
This further suggests there might be some storage issues on the
backend.

I then checked on the gitea* backend servers, and they have similar
messages in their kernel logs referring to storage too (I should have
done this first, probably).  So this again suggests it is a
region-wide issue.

I have reached out to mnaser on IRC.  I think he is GMT-4 usually so
that gives a few hours to expect a response.  This will also mean more
experienced gitea admins will be around too.  Given it appears to be a
backend provider issue, I will not take further action at this point.



[1] http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2019-07-25.log.html#t2019-07-25T06:36:51
[2] http://paste.openstack.org/show/754834/
[3] https://status.vexxhost.com/
[4] http://paste.openstack.org/show/754835/

