[Third-party-announce] Gerrit account xio-ise-iscsi-ci is disabled

Hedlind, Richard Richard.Hedlind at X-IO.com
Thu Dec 17 22:04:10 UTC 2015


My triage notes so far:

The Jenkins service crashed on the local CI master after running out of memory (seemingly caused by zuul, which had consumed a large share of system memory by that point).
The zuul scheduler (version 2.1.1.dev15) could then no longer communicate with Jenkins to submit the job for change 252250,13 and hit the following exception:

2015-12-16 22:07:44,274 INFO zuul.Gerrit: Updating information for 252250,13
2015-12-16 22:07:45,386 ERROR zuul.Scheduler: Exception in run handler:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 831, in run
    while pipeline.manager.processQueue():
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1441, in processQueue
    item, nnfi, ready_ahead)
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1413, in _processOneItem
    self.reportItem(item)
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1498, in reportItem
    item.reported = not self._reportItem(item)
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1552, in _reportItem
    self.updateBuildDescriptions(item.current_build_set)
  File "/usr/local/lib/python2.7/dist-packages/zuul/scheduler.py", line 1460, in updateBuildDescriptions
    self.sched.launcher.setBuildDescription(build, desc)
  File "/usr/local/lib/python2.7/dist-packages/zuul/launcher/gearman.py", line 518, in setBuildDescription
    timeout=300)
  File "/usr/local/lib/python2.7/dist-packages/gear/__init__.py", line 1450, in submitJob
    raise GearmanError("Unable to submit job to any connected servers")
GearmanError: Unable to submit job to any connected servers
2015-12-16 22:07:45,387 INFO zuul.IndependentPipelineManager: Reporting change <Change 0x7f3f6ce87f50 252250,13>, actions: [<ActionReporter <zuul.reporter.gerrit.Reporter object at 0x7f40bf4bdb50>, {'verified': 0}>]

This sent zuul into an infinite loop: try to post the job, hit the exception, post a comment on the change in Gerrit, and try again.

Remediation steps identified so far:

1) Update zuul to the latest version, 2.1.1.dev109 (completed)
2) Review the zuul source to see whether an infinite loop can be identified after the above exception is hit.
3) Add a protection/alerting mechanism to handle a Jenkins crash.
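
As a rough illustration of the kind of protection meant in step 3, here is a minimal sketch of a per-change rate limit on report comments. It is not zuul code and does not use any zuul API: the class name ReportGuard, its parameters (max_reports, window_seconds, clock) and the way it would be wired into the reporting path are all assumptions made for illustration. The idea is only that once a change has received a few failure comments within a time window, further comments are suppressed (and an alert could be raised instead).

```python
import time


class ReportGuard(object):
    """Hypothetical helper: caps how many comments may be posted per change.

    Not part of zuul. A caller would ask allow(change_id) before posting
    a comment and skip (or alert) when it returns False.
    """

    def __init__(self, max_reports=3, window_seconds=3600.0,
                 clock=time.monotonic):
        self.max_reports = max_reports
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the guard can be tested
        self._history = {}       # change id -> timestamps of allowed reports

    def allow(self, change_id):
        """Return True if another comment may be posted for change_id."""
        now = self._clock()
        # Keep only the reports that still fall inside the time window.
        recent = [t for t in self._history.get(change_id, [])
                  if now - t < self.window_seconds]
        if len(recent) >= self.max_reports:
            self._history[change_id] = recent
            return False
        recent.append(now)
        self._history[change_id] = recent
        return True
```

With the defaults above, a change stuck in the failure loop would receive at most three comments per hour rather than an unbounded stream; the limits themselves are arbitrary and would need tuning.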

I would also like to know what steps are needed to have the CI account enabled again.

Richard

-----Original Message-----
From: Anita Kuno [mailto:anteaya at anteaya.info] 
Sent: Thursday, December 17, 2015 10:32 AM
To: Announcements for Third Party CI Operators. <third-party-announce at lists.openstack.org>
Subject: [Third-party-announce] Gerrit account xio-ise-iscsi-ci is disabled

https://wiki.openstack.org/wiki/ThirdPartySystems/X-IO_technologies_CI

This account is disabled and the connection to Gerrit is closed.

This account was autogenerating comments to Gerrit patch 252250 to the tune of 4MB of content:
http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2015-12-17.log.html#t2015-12-17T17:07:41

This account will remain disabled until the Zuul backlog created by this occurrence has been cleared: http://status.openstack.org/zuul/ and until I hear from the operators of this system telling me that they are willing to take responsibility for their actions and they will do so in the future.

Thank you,
Anita.

_______________________________________________
Third-party-announce mailing list
Third-party-announce at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/third-party-announce
Please attend the third party meetings: http://eavesdrop.openstack.org/#Third_Party_Meeting


