[OpenStack-Infra] Zuul get stuck when fetching patch from gerrit, how to deal with this?

Antoine Musso hashar at free.fr
Thu Mar 5 09:35:17 UTC 2015


On 04/03/15 14:29, liuxinguo wrote:
>
> The network of our CI is poor, so sometimes zuul gets stuck when
> fetching a patch from gerrit. When running “ps -ef|grep zuul” I find it
> stuck like this:
>
> root@3rd-ci-master:/var/log/zuul# ps -ef|grep zuul
> root       698  9074  0 16:48 pts/0    00:00:00 grep --color=auto zuul
> zuul     31730     1  2 03:01 ?        00:16:37 /usr/bin/python /usr/local/bin/zuul-server
> zuul     31739     1  0 03:01 ?        00:00:00 /usr/bin/python /usr/local/bin/zuul-merger
> zuul     31748 31730  0 03:01 ?        00:00:00 /usr/bin/python /usr/local/bin/zuul-server
> zuul     32008 31739  0 05:08 ?        00:00:00 git remote update origin
> zuul     32009 32008  0 05:08 ?        00:00:00 git fetch --multiple origin
> zuul     32010 32009  0 05:08 ?        00:00:00 git fetch --append origin
> zuul     32011 32010  0 05:08 ?        00:00:00 /bin/bash /var/lib/zuul/git/.ssh_wrapper -p 29418 huawei-volume-ci@review.openstack.org git-upload-pack '/openstack/cinder'
> zuul     32012 32011  0 05:08 ?        00:00:00 ssh -i /var/lib/zuul/ssh/id_rsa -p 29418 huawei-volume-ci@review.openstack.org git-upload-pack '/openstack/cinder'
>
> I have to add a crontab entry to restart the zuul and zuul-merger
> services and kill all these stuck processes, or do this manually.
>
> Is there any better method to avoid “git remote update origin”
> and “ssh_wrapper” getting stuck when the network is poor?
>
> Thanks for any input!
>

Hello,

The zuul-merger ensures the repository is up-to-date before attempting
to merge the proposed patch. It does indeed use 'git remote update'.
During the fetch, git sends the server a list of the objects it already
has, which lets the server send back only the missing objects.

Since the zuul-merger repositories have a reference created for each
patch merged, they are all sent to Gerrit, which takes quite a while. On
the Wikimedia setup it took up to 20 seconds for one of our biggest
repos; see https://phabricator.wikimedia.org/T70481 for details.

I proposed a tiny Python script which lets one clear out old references
from the zuul-merger repositories. It looks at the zuul refs and deletes
them when the commit date is older than a given number of days.

https://review.openstack.org/#/c/109276/

Safe example usage:

   zuul_clear_refs.py --verbose --dry-run --until 90 /srv/zuul/git/project
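
To illustrate the idea (this is not the script from the review above,
just a minimal sketch), the cleanup boils down to listing the refs with
their commit dates and deleting the old ones. The function names are
made up for the example, and it assumes the zuul refs live under
refs/zuul/ and that Python 3.7 or later is available:

```python
import subprocess
import time


def stale_refs(for_each_ref_output, max_age_days, now=None):
    """Return the ref names whose committer date is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for line in for_each_ref_output.splitlines():
        # Each line looks like: "<unix timestamp> <tz offset> <refname>"
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        timestamp, _tz, refname = parts
        if int(timestamp) < cutoff:
            stale.append(refname)
    return stale


def clear_old_zuul_refs(repo_dir, max_age_days, dry_run=True):
    """Delete (or, with dry_run, only print) old refs under refs/zuul/."""
    out = subprocess.check_output(
        ["git", "for-each-ref",
         "--format=%(committerdate:raw) %(refname)", "refs/zuul"],
        cwd=repo_dir, text=True)
    for ref in stale_refs(out, max_age_days):
        if dry_run:
            print("would delete", ref)
        else:
            subprocess.check_call(["git", "update-ref", "-d", ref],
                                  cwd=repo_dir)
```

Keeping dry_run as the default mirrors the --dry-run flag in the usage
above: nothing is deleted until you explicitly ask for it.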


Note that "git remote update" is essentially "git fetch --all". It might
be possible to limit the refs being considered during the fetch, i.e.
ignore the zuul ones when processing.
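
One possible way to do that, which I have not tested against a
zuul-merger, is the --negotiation-tip option of "git fetch" in newer Git
releases (2.19 and later, as far as I know). It restricts the commits
reported to the server to those reachable from the given refs. The
sketch below also runs the fetch with a timeout so a stalled connection
gets killed instead of hanging forever. The function names, the glob and
the 300 second default are assumptions for the example, and it is POSIX
only:

```python
import os
import signal
import subprocess


def build_fetch_cmd(remote="origin", tip_glob="refs/remotes/origin/*"):
    """Build a fetch command that only negotiates from the remote-tracking refs."""
    return ["git", "fetch", "--negotiation-tip=" + tip_glob, remote]


def fetch_with_timeout(repo_dir, timeout_s=300):
    """Run the fetch; return its exit code, or None if it had to be killed."""
    # start_new_session puts git and its ssh child in their own process
    # group, so the whole group can be signalled on timeout.
    proc = subprocess.Popen(build_fetch_cmd(), cwd=repo_dir,
                            start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGTERM)
        proc.wait()
        return None
```

Note this is a standalone wrapper, not something zuul-merger does by
itself; it only shows the shape of the command.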

You probably want to have a cron job which packs all the refs into a
single packed-refs file, i.e.:

   find /var/lib/zuul/git/ -maxdepth 3 -type d -name ".git" \
        -exec git --git-dir="{}" pack-refs --all \;

Taken from the OpenStack Puppet manifest for the zuul mergers:
https://github.com/openstack-infra/puppet-zuul/blob/0f585bef4f7822975/manifests/merger.pp#L27-L35
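
If you would rather drive this from a script than from find, a rough
Python equivalent could look like the following. It is only a sketch:
the function names are invented for the example, and the depth handling
is my reading of what "-maxdepth 3" does.

```python
import os
import subprocess


def find_git_dirs(root, max_depth=3):
    """Return the ".git" directories at most max_depth levels below root."""
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        if dirpath == root:
            depth = 0
        else:
            depth = os.path.relpath(dirpath, root).count(os.sep) + 1
        # Entries in dirnames sit one level below dirpath.
        if ".git" in dirnames and depth + 1 <= max_depth:
            found.append(os.path.join(dirpath, ".git"))
            dirnames.remove(".git")
        if depth + 1 >= max_depth:
            dirnames[:] = []  # do not descend any further
    return sorted(found)


def pack_all_refs(root="/var/lib/zuul/git/"):
    """Run "git pack-refs --all" in every repository found under root."""
    for git_dir in find_git_dirs(root):
        subprocess.check_call(
            ["git", "--git-dir=" + git_dir, "pack-refs", "--all"])
```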


Antoine Musso

