
Re: Simultaneous gc and repack




On Thu, 2017-04-13 at 12:08 -0600, Martin Fick wrote:
> On Thursday, April 13, 2017 11:03:14 AM Jacob Keller wrote:
> > On Thu, Apr 13, 2017 at 10:31 AM, David Turner <novalis@xxxxxxxxxxx> wrote:
> > > Git gc locks the repository (using a gc.pid file) so
> > > that other gcs don't run concurrently. But git repack
> > > doesn't respect this lock, so it's possible to have a
> > > repack running at the same time as a gc.  This makes
> > > the gc sad when its packs are deleted out from under it
> > > with: "fatal: ./objects/pack/pack-$sha.pack cannot be
> > > accessed".  Then it dies, leaving a large temp file
> > > hanging around.
> > > 
> > > Does the following seem reasonable?
> > > 
> > > 1. Make git repack, by default, check for a gc.pid file
> > > (using the same logic as git gc itself does).
> > > 2. Provide a --force option to git repack to ignore said
> > > check.
> > > 3. Make git gc provide that --force option when
> > > it calls repack under its own lock.
> > 
> > What about just making the code that calls repack today
> > just call gc instead? I guess it's more work if you don't
> > strictly need it but..?
> 
> There are many scenarios where this does not achieve the
> same thing.  On the obvious side, gc does more than 
> repacking, but on the other side, repacking has many 
> switches that are not available via gc.
> 
> Would it make more sense to move the lock to repack instead 
> of to gc?

Other gc operations might step on each other too (e.g. packing refs). 
That would be less bad (and less common), but it still seems worth
avoiding.
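
To make step 1 of the proposal concrete, here is a rough sketch of the kind
of check repack might run before starting. It is not git's actual code. It
assumes gc.pid holds "<pid> <hostname>", that a lock older than 12 hours can
be ignored, and that a lock written on another host has to be treated as
held because its pid cannot be probed. The names parse_gc_pid, process_alive
and gc_lock_holder are made up for illustration, and the liveness probe is
POSIX-only.

```python
import os
import socket
import time
from pathlib import Path

# Assumed cutoff: lock files older than this are treated as stale.
STALE_AFTER_SECONDS = 12 * 60 * 60


def parse_gc_pid(text):
    """Parse '<pid> <hostname>' (assumed format); return (pid, host) or None."""
    parts = text.split()
    if len(parts) != 2 or not parts[0].isdigit() or int(parts[0]) <= 0:
        return None
    return int(parts[0]), parts[1]


def process_alive(pid):
    """POSIX-only probe: signal 0 tests for existence without signalling."""
    try:
        os.kill(pid, 0)
    except (ProcessLookupError, OverflowError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def gc_lock_holder(git_dir, now=None):
    """Return the pid that appears to hold gc.pid, or None if repack may run."""
    lock = Path(git_dir) / "gc.pid"
    try:
        mtime = lock.stat().st_mtime
        parsed = parse_gc_pid(lock.read_text())
    except (OSError, UnicodeDecodeError):
        return None  # no readable lock file
    now = time.time() if now is None else now
    if parsed is None or now - mtime > STALE_AFTER_SECONDS:
        return None  # malformed or stale lock
    pid, host = parsed
    if host != socket.gethostname():
        return pid  # cannot probe a pid on another host; assume the lock is held
    return pid if process_alive(pid) else None
```

A repack wrapper could call gc_lock_holder(".git") and refuse to start when
it returns a pid, unless the proposed --force option was given. Note that
this only reads the lock; it does not take it, so two repacks could still
race each other.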