Re: reftable [v5]: new ref storage format

On Mon, Aug 07, 2017 at 03:40:48PM +0000, David Turner wrote:

> > -----Original Message-----
> > From: Shawn Pearce [mailto:spearce@xxxxxxxxxxx]
> > In git-core, I'm worried about the caveats related to locking. Git tries to work
> > nicely on NFS, and it seems LMDB wouldn't. Git also runs fine on a read-only
> > filesystem, and LMDB gets a little weird about that. Finally, Git doesn't have
> > nearly the risks LMDB has about a crashed reader or writer locking out future
> > operations until the locks have been resolved. This is especially true with shared
> > user repositories, where another user might set up and own the semaphore.
> FWIW, git has problems with stale lock files in the event of a crash (refs/foo.lock
> might still exist, and git does nothing to clean it up).
> In my testing (which involved a *lot* of crashing), I never once had to clean up a
> stale LMDB lock.  That said, I didn't test on a RO filesystem.

Yeah, I'd expect LMDB to do much better than Git in a crash, because it
relies on flock. So when the kernel goes away, so too does your lock
(ditto if a git process dies without remembering to remove the lock,
though I don't think we've ever had such a bug).
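
To make the difference concrete, here is a rough sketch of the two
locking styles. It is not Git's or LMDB's actual code, and it only
illustrates plain flock(2) on a local POSIX filesystem, not LMDB's real
lock table. The helper name and the file names foo.lock and db.lck are
made up for the illustration. A child process takes both kinds of lock
and then dies without cleaning up:

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def stale_after_crash(workdir):
    """Return (lockfile_still_exists, flock_still_held) after the holder dies.

    Hypothetical helper for illustration only; POSIX systems only.
    """
    lockfile = os.path.join(workdir, "foo.lock")  # lock == the file exists
    flockfile = os.path.join(workdir, "db.lck")   # lock == kernel-tracked flock

    # The child takes both locks, then exits abruptly with no cleanup.
    child = (
        "import fcntl, os, sys\n"
        "os.close(os.open(sys.argv[1], os.O_CREAT | os.O_EXCL | os.O_WRONLY))\n"
        "fd = os.open(sys.argv[2], os.O_CREAT | os.O_RDWR)\n"
        "fcntl.flock(fd, fcntl.LOCK_EX)\n"
        "os._exit(1)\n"
    )
    subprocess.run([sys.executable, "-c", child, lockfile, flockfile])

    # The O_EXCL lock file outlives its creator until somebody unlinks it.
    lockfile_still_exists = os.path.exists(lockfile)

    # The flock was dropped by the kernel when the child's descriptor closed,
    # so a non-blocking attempt to take it should succeed.
    fd = os.open(flockfile, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        flock_still_held = False
    except BlockingIOError:
        flock_still_held = True
    finally:
        os.close(fd)

    return lockfile_still_exists, flock_still_held


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(stale_after_crash(d))
```

On a local filesystem I would expect the lock file to still be there
and the flock to be free again. Whether the flock half behaves the same
way over NFS is exactly the open question.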

But that's also why it may not work well over NFS (though my impression
is that flock _does_ work on modern NFS; I've been lucky enough not to
ever use it). Lack of NFS support wouldn't be a show-stopper for most
people, but it would be for totally replacing the existing code, I'd
think. I'm just not clear on what the state of lmdb-on-nfs is.

Assuming it could work, the interesting tradeoffs to me are:

  - something like reftable is hyper-optimized for high-latency
    block-oriented access. It's not clear to me if lmdb would even be
    usable for the distributed storage case Shawn has.

  - reftable is more code for us to implement, but we'd "own" the whole
    stack down to the filesystem. That could be a big win for debugging
    and optimizing for our use case.

  - reftable is re-inventing a lot of the database wheel. lmdb really is
    a debugged, turn-key solution.

I'm not opposed to a world where lmdb becomes the standard solution and
Google does their own bespoke thing. But that's easy for me to say
because I'm not Google. I do care about keeping complexity and bugs to a
minimum for most users, and it's possible that lmdb could do that. But
if it can't become the baseline standard (due to NFS issues), then we'd
still want something to replace the current loose/packed storage. And if
reftable does that, then lmdb becomes a lot less interesting.