Re: Simultaneous gc and repack
- Date: Thu, 13 Apr 2017 14:28:07 -0400
- From: David Turner <novalis@xxxxxxxxxxx>
- Subject: Re: Simultaneous gc and repack
On Thu, 2017-04-13 at 12:08 -0600, Martin Fick wrote:
> On Thursday, April 13, 2017 11:03:14 AM Jacob Keller wrote:
> > On Thu, Apr 13, 2017 at 10:31 AM, David Turner <novalis@xxxxxxxxxxx> wrote:
> > > Git gc locks the repository (using a gc.pid file) so
> > > that other gcs don't run concurrently. But git repack
> > > doesn't respect this lock, so it's possible to have a
> > > repack running at the same time as a gc. This makes
> > > the gc sad when its packs are deleted out from under it
> > > with: "fatal: ./objects/pack/pack-$sha.pack cannot be
> > > accessed". Then it dies, leaving a large temp file
> > > hanging around.
> > >
> > > Does the following seem reasonable?
> > >
> > > 1. Make git repack, by default, check for a gc.pid file
> > > (using the same logic as git gc itself does).
> > > 2. Provide a --force option to git repack to ignore said
> > > check.
> > > 3. Make git gc provide that --force option when it calls
> > > repack under its own lock.
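
A rough sketch of the check proposed in steps 1 to 3 above, written in Python rather than git's C, so it is an illustration and not a patch. It assumes gc.pid holds "<pid> <hostname>" and that a lock older than 12 hours is treated as stale, which is how git gc appears to behave; the function names gc_lock_holder and may_repack are made up for this example.

```python
import os
import socket
import time

# Assumption: gc treats a gc.pid older than 12 hours as stale.
STALE_AFTER_SECONDS = 12 * 60 * 60


def gc_lock_holder(git_dir, now=None):
    """Return (pid, host) if gc.pid looks like a live lock, else None."""
    path = os.path.join(git_dir, "gc.pid")
    try:
        st = os.stat(path)
        with open(path) as f:
            fields = f.read().split()
    except OSError:
        return None  # no lock file, or it cannot be read

    if now is None:
        now = time.time()
    if now - st.st_mtime > STALE_AFTER_SECONDS:
        return None  # stale lock, ignore it
    if len(fields) != 2 or not fields[0].isdigit():
        return None  # malformed contents, ignore it

    pid, host = int(fields[0]), fields[1]
    if host != socket.gethostname():
        # A pid on another machine cannot be probed, so assume it is live.
        return (pid, host)
    try:
        os.kill(pid, 0)  # POSIX only: signal 0 just tests for existence
    except ProcessLookupError:
        return None  # the process that took the lock is gone
    except PermissionError:
        pass  # the process exists but belongs to another user
    return (pid, host)


def may_repack(git_dir, force=False):
    """Step 1: refuse to repack under a live gc lock. Step 2: unless forced."""
    return force or gc_lock_holder(git_dir) is None
```

Under this sketch, gc itself would call may_repack(git_dir, force=True) when it runs repack under its own lock (step 3), while a standalone repack would use the default and back off.
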
> > What about just making the code that calls repack today
> > just call gc instead? I guess it's more work if you don't
> > strictly need it but..?
> There are many scenarios where this does not achieve the
> same thing. On the obvious side, gc does more than
> repacking, but on the other side, repacking has many
> switches that are not available via gc.
> Would it make more sense to move the lock to repack instead
> of to gc?
Other gc operations might step on each other too (e.g. packing refs).
That would be less bad (and less common), but it still seems worth