Web lists-archives.com

Re: [PATCH/RFC 1/1] gc --auto: exclude the largest giant pack in low-memory config

On Fri, Mar 2, 2018 at 1:14 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@xxxxxxxxx> writes:
>> pack-objects could be a big memory hog especially on large repos,
>> everybody knows that. The suggestion to stick a .keep file on the
>> largest pack to avoid this problem is also known for a long time.
> Yup, but not that it is not "largest" per-se.  The thing being large
> is a mere consequence that it is the base pack that holds the bulk
> of older parts of the history (e.g. the one that you obtained via
> the initial clone).

Thanks, "base pack" sounds much better. I was having trouble with
wording because I didn't nail this one down.

>> Let's do the suggestion automatically instead of waiting for people to
>> come to Git mailing list and get the advice. When a certain condition
>> is met, gc --auto create a .keep file temporary before repack is run,
>> then remove it afterward.
>> gc --auto does this based on an estimation of pack-objects memory
>> usage and whether that fits in one third of system memory (the
>> assumption here is for desktop environment where there are many other
>> applications running).
>> Since the estimation may be inaccurate and that 1/3 threshold is
>> arbitrary, give the user a finer control over this mechanism as well:
>> if the largest pack is larger than gc.bigPackThreshold, it's kept.
> If this is a transient mechanism during a single gc session, it
> would be far more preferrable if we can find a way to do this
> without actually having a .keep file on the filesystem.

That was my first attempt, manipulating packed_git::pack_keep inside
pack-objects. Then my whole git.git was gone. I was scared off so I
did this instead.

I've learned my lesson though (never test dangerous operations on your
worktree!) and will do that pack_keep again _if_ this gc --auto still
sounds like a good direction to go.