[OpenStack-Infra] Jenkins job may run on same node in CI environment.
Guo Qing GH Hu
hguoqing at cn.ibm.com
Tue Feb 11 08:47:26 UTC 2014
Dear all,
I found that in the following scenario, two Jenkins jobs may run on the same
node in our CI environment.
Nodepool tries to delete node 1 from Jenkins after its job finishes, but
Jenkins has already assigned the node to queued job 2, so nodepool fails to
delete node 1.
2014-01-21 03:13:16,520 DEBUG nodepool.NodeUpdateListener: Received: onFinalized {"name":"gate-ci-devstack-test","url":"job/gate-ci-devstack-test/","build":{"full_url":"https://172.16.2.115/job/gate-ci-devstack-test/3815/","number":3815,"phase":"FINISHED","status":"FAILURE","url":"job/gate-ci-devstack-test/3815/","parameters":{"BASE_LOG_PATH":"70/61470/3/check","LOG_PATH":"70/61470/3/check/gate-ci-devstack-test/739f893","ZUUL_BRANCH":"master","ZUUL_CHANGE":"61470","ZUUL_CHANGE_IDS":"61470,3","ZUUL_CHANGES":"openstack/nova:master:refs/changes/70/61470/3","ZUUL_COMMIT":"956132e8df4377e66d5b78b5b9864c7da37c6bde","ZUUL_PATCHSET":"3","ZUUL_PIPELINE":"check","ZUUL_PROJECT":"openstack/nova","ZUUL_REF":"refs/zuul/master/Z1473cd61831a445792d06152612ce7f9","ZUUL_URL":"http://172.16.2.118/p","ZUUL_UUID":"739f893379b84a64a22ea4db1721f7e7"},"node_name":"devstack-precise-check-v1-gemini-cdl-7323"}}
2014-01-21 03:13:16,544 DEBUG nodepool.NodeUpdateListener: Received: onStarted {"name":"gate-ci-devstack-test","url":"job/gate-ci-devstack-test/","build":{"full_url":"https://172.16.2.115/job/gate-ci-devstack-test/3823/","number":3823,"phase":"STARTED","url":"job/gate-ci-devstack-test/3823/","parameters":{"BASE_LOG_PATH":"61/43061/7/check","LOG_PATH":"61/43061/7/check/gate-ci-devstack-test/cf9f514","ZUUL_BRANCH":"master","ZUUL_CHANGE":"43061","ZUUL_CHANGE_IDS":"43061,7","ZUUL_CHANGES":"openstack/nova:master:refs/changes/61/43061/7","ZUUL_COMMIT":"a6cd36551b778be3903eb552c22338e16708ed6e","ZUUL_PATCHSET":"7","ZUUL_PIPELINE":"check","ZUUL_PROJECT":"openstack/nova","ZUUL_REF":"refs/zuul/master/Z8721bb3c40d743d882b4c18ff896a079","ZUUL_URL":"http://172.16.2.118/p","ZUUL_UUID":"cf9f514c02a04da8a83ba2222dc4bebd"},"node_name":"devstack-precise-check-v1-gemini-cdl-7323"}}
2014-01-21 03:13:16,551 INFO nodepool.NodeUpdateListener: Setting node id: 7323 to USED
2014-01-21 03:13:16,557 DEBUG nodepool.JenkinsManager: Manager jenkins01 running task <nodepool.jenkins_manager.NodeExistsTask object at 0x7faa14315090>
2014-01-21 03:13:17,576 DEBUG nodepool.JenkinsManager: Manager jenkins01 running task <nodepool.jenkins_manager.DeleteNodeTask object at 0x7faa10095e90>
2014-01-21 03:13:17,883 ERROR nodepool.NodeCompleteThread: Exception handling event for devstack-precise-check-v1-gemini-cdl-7323:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nodepool/nodepool.py", line 65, in run
    self.handleEvent(session)
  File "/usr/local/lib/python2.7/dist-packages/nodepool/nodepool.py", line 101, in handleEvent
    self.nodepool.deleteNode(session, node)
  File "/usr/local/lib/python2.7/dist-packages/nodepool/nodepool.py", line 1032, in deleteNode
    jenkins.deleteNode(jenkins_name)
  File "/usr/local/lib/python2.7/dist-packages/nodepool/jenkins_manager.py", line 118, in deleteNode
    return self.submitTask(DeleteNodeTask(name=name))
  File "/usr/local/lib/python2.7/dist-packages/nodepool/task_manager.py", line 90, in submitTask
    return task.wait()
  File "/usr/local/lib/python2.7/dist-packages/nodepool/task_manager.py", line 51, in run
    self.done(self.main(client))
  File "/usr/local/lib/python2.7/dist-packages/nodepool/jenkins_manager.py", line 64, in main
    return jenkins.delete_node(self.args['name'])
  File "/usr/local/lib/python2.7/dist-packages/python_jenkins-0.2.1-py2.7.egg/jenkins/__init__.py", line 508, in delete_node
    raise JenkinsException('delete[%s] failed' % (name))
JenkinsException: delete[devstack-precise-check-v1-gemini-cdl-7323] failed
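The two notifications in the log above carry the same node_name: build 3815 reports FINISHED and, 24 ms later, build 3823 reports STARTED on that node. A minimal sketch of how such reuse could be spotted from the notification payloads follows. The function name find_reused_nodes is hypothetical and not part of nodepool, and the payloads are trimmed copies of the ones in the log, keeping only the fields the check reads.

```python
import json


def find_reused_nodes(events):
    """Return (node_name, build_number) pairs for builds that STARTED
    on a node after an earlier build had FINISHED on it.

    `events` is an iterable of notification payloads (dicts) in arrival
    order, shaped like the JSON bodies in the log above.
    """
    finished = set()  # nodes on which a build has already finished
    reused = []       # later builds that started on one of those nodes
    for event in events:
        build = event["build"]
        node = build["node_name"].strip()
        if build["phase"] == "FINISHED":
            finished.add(node)
        elif build["phase"] == "STARTED" and node in finished:
            reused.append((node, build["number"]))
    return reused


# Trimmed copies of the two payloads from the log, in arrival order.
events = [
    json.loads('{"name":"gate-ci-devstack-test","build":{"number":3815,'
               '"phase":"FINISHED","status":"FAILURE",'
               '"node_name":"devstack-precise-check-v1-gemini-cdl-7323"}}'),
    json.loads('{"name":"gate-ci-devstack-test","build":{"number":3823,'
               '"phase":"STARTED",'
               '"node_name":"devstack-precise-check-v1-gemini-cdl-7323"}}'),
]

print(find_reused_nodes(events))
# [('devstack-precise-check-v1-gemini-cdl-7323', 3823)]
```

A check like this only reports the reuse after the fact; it does not prevent the second build from landing on the used node.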
This error occurs most often when every node is busy and builds are waiting
in the Zuul queue.
Changing the following code in nodepool.py so that there is no delay before
the delete reduces the probability:
time.sleep(DELETE_DELAY)
self.nodepool.deleteNode(session, node)
But as the log above shows, the failure can still happen. How can it be
resolved completely? Your opinions are welcome!
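One direction that might narrow the window, offered only as a sketch and not as nodepool's behaviour: take the node offline in Jenkins before deleting it, and retry the delete if it fails. Everything here is an assumption for illustration. retire_node, DeleteFailed and FakeClient are hypothetical names; delete_node appears in the traceback above, while disable_node is assumed to be available on the Jenkins client in use. Jenkins could still assign a queued build between the end of the previous build and the offline call, so this reduces the race rather than removing it.

```python
import time


class DeleteFailed(Exception):
    """Stand-in for the client's delete error (JenkinsException in the log)."""


def retire_node(client, name, attempts=3, pause=1.0, sleep=time.sleep):
    """Take a node offline, then delete it, retrying a failed delete.

    `client` is any object with disable_node(name) and delete_node(name);
    both are assumptions about the Jenkins client, not nodepool code.
    Returns the number of delete attempts that were needed.
    """
    client.disable_node(name)  # ask Jenkins not to schedule new builds here
    for attempt in range(1, attempts + 1):
        try:
            client.delete_node(name)
            return attempt
        except DeleteFailed:
            if attempt == attempts:
                raise  # give up and let the caller log the failure
            sleep(pause)


class FakeClient:
    """Hypothetical in-memory client: the first delete fails, the next works."""

    def __init__(self):
        self.calls = []
        self.failures_left = 1

    def disable_node(self, name):
        self.calls.append(("disable", name))

    def delete_node(self, name):
        self.calls.append(("delete", name))
        if self.failures_left:
            self.failures_left -= 1
            raise DeleteFailed("delete[%s] failed" % name)


client = FakeClient()
attempts_used = retire_node(
    client,
    "devstack-precise-check-v1-gemini-cdl-7323",
    sleep=lambda seconds: None,  # skip the real pause in this demo
)
print(attempts_used)  # 2
```

The fake client exists only so the sketch runs on its own; against a real Jenkins the retry count and pause would need tuning.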
Thanks & Best Regards,
Godwin Hu(胡国清)
Software Engineer
IBM China System and Technology Lab(CSTL), Beijing
Tel(Seat): 86-010-82451453
Location : Ring Building, 1BW270
E-mail Address: hguoqing at cn.ibm.com
Address: IBM ZGC Campus. Ring Building, 28# ZhongGuanCun Software Park,
Shang Di, Beijing P.R.China 100193