[OpenStack-Infra] zuul-merger and garbage collection
James E. Blair
corvus at inaugust.com
Tue May 12 14:51:20 UTC 2015
Clark Boylan <cboylan at sapwetik.org> writes:
> I would expect git gc on zuul merger repos to be safe. git gc only
> cleans up unreachable refs if they are 30 days old by default.
However, a big part of the performance impact on zuul-mergers comes from
the many transient refs they create. So there will be some work for
git gc to do, but the largest contributor to run time is actually the
number of refs. The solution to that might be to have zuul delete old
refs, though we have to figure out which ones are old first.
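
As a rough illustration of "figure out which ones are old", here is a
minimal sketch, not anything zuul does today. It assumes the transient
refs live under a prefix such as refs/zuul (an assumption, adjust as
needed), that the installed git supports the `committerdate:unix` format
in `git for-each-ref`, and that the committer date of the commit a ref
points at is an acceptable proxy for the age of the ref itself. All
function names are hypothetical.

```python
import subprocess
import time

# Assumption: transient merger refs live under this prefix.
REF_PREFIX = "refs/zuul"


def parse_ref_lines(output):
    """Parse '<unix time> <refname>' lines into (timestamp, refname) pairs.

    Lines without a numeric timestamp (for example refs that do not
    point at a commit) are skipped.
    """
    refs = []
    for line in output.splitlines():
        stamp, _, name = line.strip().partition(" ")
        if stamp.isdigit() and name:
            refs.append((int(stamp), name))
    return refs


def old_refs(refs, max_age_days, now=None):
    """Return the names of refs whose timestamp is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return [name for stamp, name in refs if stamp < cutoff]


def list_refs(repo_path, prefix=REF_PREFIX):
    """Ask git for the committer date and name of every ref under prefix."""
    out = subprocess.check_output(
        ["git", "for-each-ref",
         "--format=%(committerdate:unix) %(refname)", prefix],
        cwd=repo_path, text=True)
    return parse_ref_lines(out)


def delete_refs(repo_path, names):
    """Delete the given refs one at a time with 'git update-ref -d'."""
    for name in names:
        subprocess.check_call(["git", "update-ref", "-d", name],
                              cwd=repo_path)
```

Something like `old_refs(list_refs(path), 30)` would give a candidate
list to review before passing it to `delete_refs`. Whether a given ref
is still needed by a running job is a separate question that this
sketch does not answer.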
"Heald, Nicola" <nicola.heald at hp.com> writes:
> From: Clark Boylan [cboylan at sapwetik.org]
>> One example we have run into with GitPython is that if the repo is
>> repacked (which git can do for you when it decides to) object files may
>> not exist any longer and need to be refound in the pack file instead.
>> The only way to get GitPython to see that is to make a new repo object.
> Ah, this *might* be what we're seeing. During normal zuul-merger
> operation, it should never have files open that are deleted on disk,
> should it?
This sounds like it might be a bug in either the merger or GitPython.
As Clark said, since we regularly recreate the GitPython object, perhaps
something is keeping a stale reference to a Python object with an open
file. That's worth looking into.
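
One way to look into it, as a minimal sketch: on Linux, each entry in
/proc/<pid>/fd is a symlink to the open file, and the kernel appends
" (deleted)" to the target once the file has been unlinked. The helper
names below are hypothetical, and this only reports which descriptors
point at deleted files, not which Python object is holding them.

```python
import os


def is_deleted_target(target):
    """True if a /proc/<pid>/fd link target is marked as deleted."""
    return target.endswith(" (deleted)")


def deleted_open_files(pid):
    """Return (fd, target) pairs for open files that were deleted on disk.

    Linux only. Returns an empty list if /proc/<pid>/fd cannot be read,
    for example on another platform or without permission.
    """
    fd_dir = "/proc/%d/fd" % pid
    found = []
    try:
        fds = os.listdir(fd_dir)
    except OSError:
        return found
    for fd in fds:
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            # The descriptor may have been closed while we were looking.
            continue
        if is_deleted_target(target):
            found.append((int(fd), target))
    return found
```

Running `deleted_open_files(pid)` against the merger's pid before and
after a repack should show whether the stale handles are object files
under the repo's .git/objects directory.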
> Sounds like removing our `git gc` cronjob is a good idea?
It's probably not hurting anything because it is likely that whatever is
holding those deleted files open isn't actually in use. OTOH, it's
probably not doing much either.