
Re: Git blame performance on files with a lot of history

On Fri, Dec 14, 2018 at 1:31 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> Please double-check that you have the 'core.commitGraph' config setting
> enabled, or you will not read the commit-graph at run-time:
>
>      git config core.commitGraph true
>

Yeah, this is what happens when trying too many things at once :( I
had disabled it to get with/without numbers, and forgot to re-enable it
before running my last set of experiments.
Here are the results with it enabled:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git rev-list --count --full-history HEAD -- important/file.C
10:32:06.665057 revision.c:483          bloom filter total queries: 286363 definitely not: 234605 maybe: 51758 false positives: 48212 fp ratio: 0.168360
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y  rev-list --count HEAD -  2.62s user 0.14s system 97% cpu 2.830 total
> time /path/to/git rev-list --count --full-history HEAD -- ic/lv/src/iclv/drc_compiler.C
3576
/path/to/git rev-list      8.86s user 0.15s system 99% cpu 9.031 total

So I'm getting roughly a 3x speedup (2.83s vs. 9.03s wall-clock), not
bad! This is on the re-repacked repo, which is why I ran again both
with and without the Bloom filter.
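
For anyone who wants to check the arithmetic, here is a small Python
sketch that recomputes the ratios from the numbers above. It assumes
the trace's "fp ratio" is false positives divided by total queries
(that matches the printed value, but I have not checked the PoC source),
and the ratio() helper is just an illustration, not anything from git.

```python
# Numbers copied from the trace and timing output above.
total_queries = 286363
definitely_not = 234605
maybe = 51758
false_positives = 48212

def ratio(numerator, denominator):
    """Hypothetical helper: numerator/denominator, or NaN for zero queries."""
    return numerator / denominator if denominator else float("nan")

# Every query is answered either "definitely not" or "maybe".
assert definitely_not + maybe == total_queries

# Assumption: fp ratio = false positives / total queries.
fp_ratio = ratio(false_positives, total_queries)
# Fraction of queries the filter rejected without a tree diff.
skipped = ratio(definitely_not, total_queries)
# Wall-clock time without the filter divided by time with it.
speedup = 9.031 / 2.830

print(f"fp ratio: {fp_ratio:.6f}")
print(f"skipped:  {skipped:.1%}")
print(f"speedup:  {speedup:.2f}x")
```

With zero total queries the helper returns NaN rather than raising,
which is comparable to a trace that reports no queries at all.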

Let's see what this does for blame:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git blame master -- important/file.C > /tmp/foo
Blaming lines: 100% (33179/33179), done.
12:50:42.703522 revision.c:483          bloom filter total queries: 0 definitely not: 0 maybe: 0 false positives: 0 fp ratio: -nan
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y  blame master -- >   132.59s user 2.15s system 99% cpu 2:14.95 total

The trace reports zero Bloom filter queries, so it seems the filter is
not hooked up for blame operations yet. I'll be happy to test any
implementation.

Take care,

Clément