[OpenStack-Infra] Fun (important!) project: optimize Gerrit's nova git repo
Zaro
zaro0508 at gmail.com
Mon Jun 13 16:29:21 UTC 2016
`git gc` prunes by default [1]. Running `git gc` cleans up the
objects (6.4G -> 380M) and moves the loose refs into the packed-refs
file (382M -> 6.1M). I see the exact same result whether I run it with
C git or jgit.
Original files:
~/temp/nova.git.test$ du -hsx * | sort -r | head -10
6.4G nova.git.orig/objects
6.1M nova.git.orig/info
4.0K nova.git.orig/config
4.0K nova.git.orig/HEAD
382M nova.git.orig/refs
2.1M nova.git.orig/logs
0B nova.git.orig/hooks
0B nova.git.orig/description
0B nova.git.orig/branches
After a `git gc`:
~/temp/nova.git.test$ git gc
Counting objects: 1210923, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (155559/155559), done.
Writing objects: 100% (1210923/1210923), done.
Total 1210923 (delta 1002442), reused 1205966 (delta 997777)
Removing duplicate objects: 100% (256/256), done.
Checking connectivity: 1210923, done.
~/temp/nova.git.test$ du -hsx * | sort -r | head -10
6.1M packed-refs
6.1M info
4.0K config
4.0K HEAD
380M objects
64K logs
0B refs
0B hooks
0B description
0B branches
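For anyone who wants to check the ref side of this on their own copy, here
is a rough sketch that counts loose refs versus packed refs in a bare repo.
The function names (count_loose_refs, count_packed_refs, repo_summary) are
made up for illustration; it only looks at the on-disk layout and assumes
the usual packed-refs format, where the header line starts with "#" and
peeled tag lines start with "^".

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from any tool.

# Count loose ref files under <repo>/refs.
count_loose_refs() {
    find "$1/refs" -type f | wc -l | tr -d ' '
}

# Count refs listed in <repo>/packed-refs, skipping the "#" header line
# and the "^" peeled-tag lines. Prints 0 if the file does not exist.
count_packed_refs() {
    if [ -f "$1/packed-refs" ]; then
        grep -c -v -e '^#' -e '^\^' "$1/packed-refs"
    else
        echo 0
    fi
}

# Print a one-line summary for a bare repository directory.
repo_summary() {
    printf '%s: loose=%s packed=%s\n' "$1" \
        "$(count_loose_refs "$1")" "$(count_packed_refs "$1")"
}
```

Running repo_summary on the repo directory before and after `git gc`
should show the loose count dropping and the packed count rising by
roughly the same amount.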
[1] https://git-scm.com/docs/git-gc ('prune is on by default')
On Mon, Jun 13, 2016 at 7:58 AM, James E. Blair <corvus at inaugust.com> wrote:
> Zaro <zaro0508 at gmail.com> writes:
>
>> I forgot to mention that the apps we use to host our git repos (gerrit
>> and cgit) read the repos directly from disk, so I think that
>> performing a gc on the repos would improve performance (CPU and
>> memory utilization) for gerrit and cgit. It might be difficult to
>> quantify the improvement, since both apps do some caching of the
>> repo data. Anyway, I think `git gc` would have other benefits over
>> `gerrit repack -adf` besides just recovering disk space. -Khai
>
> What was the effect of 'git prune' with 'git gc'? In my original
> message, I mentioned that the two together had the greatest effect on
> disk space -- the change in ref structure could have a significant
> impact on cloning time as well, and also on our ability to issue
> corrective modifications to the repos.
>
> -Jim