Re: We should add a "git gc --auto" after "git clone" due to commit graph
- Date: Wed, 3 Oct 2018 16:53:08 +0200
- From: SZEDER Gábor <szeder.dev@xxxxxxxxx>
- Subject: Re: We should add a "git gc --auto" after "git clone" due to commit graph
On Wed, Oct 03, 2018 at 04:22:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Wed, Oct 03 2018, SZEDER Gábor wrote:
> > On Wed, Oct 03, 2018 at 04:01:40PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> On Wed, Oct 03 2018, SZEDER Gábor wrote:
> >> > On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> >> Don't have time to patch this now, but thought I'd send a note / RFC
> >> >> about this.
> >> >>
> >> >> Now that we have the commit graph it's nice to be able to set
> >> >> e.g. core.commitGraph=true & gc.writeCommitGraph=true in ~/.gitconfig or
> >> >> /etc/gitconfig to apply them to all repos.
> >> >>
> >> >> But when I clone e.g. linux.git stuff like 'tag --contains' will be slow
> >> >> until whenever my first "gc" kicks in, which may be quite some time if
> >> >> I'm just using it passively.
> >> >>
> >> >> So we should make "git gc --auto" be run on clone,
> >> >
> >> > There is no garbage after 'git clone'...
> >> "git gc" is really "git gc-or-create-indexes" these days.
> > Because it happens to be convenient to create those indexes at
> > gc-time. But that should not be an excuse to run gc when by
> > definition no gc is needed.
> Ah, I thought you just had an objection to the "gc" name being used for
> non-gc stuff,

But you thought right, I do have an objection against that. 'git gc'
should, well, collect garbage. Any non-gc stuff is already violating
separation of concerns.

> but if you mean we shouldn't do a giant repack right after
> clone I agree.

And, I also mean that since 'git clone' knows that there can't
possibly be any garbage in the first place, then it shouldn't call 'gc
--auto' at all. However, since it also knows that there is a lot of
new stuff, then it should create a commit-graph if enabled.
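
To make "create a commit-graph" without any gc concrete, here is a rough
sketch of doing it by hand after a clone. It is not what 'git clone' does
today and not a proposed patch. The scratch directory and the repository
names "upstream" and "fresh" are made up for illustration, and the
'--reachable' option assumes Git 2.19 or later.

```shell
#!/bin/sh
set -e

# Scratch directory; every path and repository name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Build a tiny source repository to stand in for a real upstream.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"

# Clone it into a new repository.
git clone -q upstream fresh

# Enable use of the commit-graph in the clone and write one directly.
# Nothing is repacked or pruned by this step.
git -C fresh config core.commitGraph true
git -C fresh commit-graph write --reachable

# Check the result: the file mentioned in this thread should now exist
# and pass Git's own consistency check.
ls -l fresh/.git/objects/info/commit-graph
git -C fresh commit-graph verify
```

On a large repository the 'commit-graph write' step is the only cost paid
here, which is the point of keeping it separate from repacking.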
> I meant that "gc --auto" would learn to do a subset of
> its work, instead of the current "I have work to do, let's do all of
> pack-refs/repack/commit-graph etc.".
> So we wouldn't be spending 5 minutes repacking linux.git right after
> cloning it, just ~10s generating the commit graph, and the same would
> happen if you rm'd .git/objects/info/commit-graph and ran "git commit",
> which would kick off "gc --auto" in the background and do the same thing.