Re: Auto-gc in the background can take a long time to be put in the background

On Tue, Mar 26 2019, Jeff King wrote:

> On Tue, Mar 26, 2019 at 08:22:23AM +0900, Mike Hommey wrote:
>> Recently, I've noticed that whenever the auto-gc message shows up about
>> being spawned in the background, it still takes a while for git to
>> return to the shell.
>> I've finally looked at what it was stuck on, and it's
>> `git reflog expire --all` taking more than 30s. I guess the question is
>> whether there's a reason this shouldn't run in the background? Another
>> is whether there's something that makes this slower than it should be.
> The reason is that it takes locks which can interfere with other
> operations; see 62aad1849f (gc --auto: do not lock refs in the
> background, 2014-05-25).

Even assuming we can never improve this, I think we should make this part
configurable. The current behavior assumes that the contention would
otherwise be with yourself in the same terminal, but that doesn't help if
the primary source of contention is, e.g., other concurrent processes in
the same repo.

> Unfortunately making it faster is hard. To handle expiring unreachable
> items, it has to know what's reachable. Which implies walking the commit
> graph. I don't recall offhand whether setting unreachable-expiration to
> "never" would skip that part. But if not, that should be low-hanging
> fruit.

I have a recent patch that does this that I need to re-roll:

> (I also wonder whether there is really much value in keeping
> unreachable things for a shorter period of time, and the default should
> simply be to just prune everything after 90 days, unreachable or not).

Do you mean unifying gc.reflogExpire and gc.pruneExpire (and other
variables)? Would that be cheaper somehow?

Or just blindly remove loose objects that are older than some mtime,
assuming that if anyone cared they'd be in a pack already?

The latter of those would be very useful, but if not carefully handled
could lead to corruption.
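
To make the mtime idea concrete, here is a rough Python sketch, not anything
git does today. The helper name `stale_loose_objects` and its parameters are
made up for illustration. It only lists loose objects whose mtime is older
than a cutoff and deletes nothing, since it does no reachability check at all:

```python
import os
import time
from pathlib import Path

HEX = set("0123456789abcdef")


def stale_loose_objects(git_dir, max_age_seconds, now=None):
    """Yield (object_id, path) for loose objects older than the cutoff.

    Hypothetical helper for illustration only: it reports candidates by
    mtime and never deletes anything.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    objects_dir = Path(git_dir) / "objects"
    for fanout in sorted(objects_dir.iterdir()):
        # Loose objects live in two-hex-digit fan-out directories,
        # so this skips "pack" and "info".
        if not (fanout.is_dir() and len(fanout.name) == 2
                and set(fanout.name) <= HEX):
            continue
        for obj in sorted(fanout.iterdir()):
            # Ignore anything that is not named like an object,
            # such as temporary files.
            if not (obj.is_file() and set(obj.name) <= HEX):
                continue
            if obj.stat().st_mtime < cutoff:
                yield fanout.name + obj.name, obj


if __name__ == "__main__":
    # Example: list loose objects older than 90 days in the current repo.
    for oid, path in stale_loose_objects(".git", 90 * 24 * 3600):
        print(oid, path)
```

Actually unlinking those paths is where the corruption risk comes in: an old
loose object can still be reachable, or can become referenced by a concurrent
process right after the mtime check.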