Re: standalone library/tool to query commit-graph?

On Fri, May 24 2019, SZEDER Gábor wrote:

> On Fri, May 24, 2019 at 11:49:28AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, May 24 2019, SZEDER Gábor wrote:
>> > On Thu, May 23, 2019 at 07:48:33PM -0400, Derrick Stolee wrote:
>> >> On 5/23/2019 6:20 PM, SZEDER Gábor wrote:
>> >> > On Thu, May 23, 2019 at 11:54:22PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >> >> and since the commit graph doesn't include any commits outside of
>> >> >> packs you'd miss any loose commits.
>> >> >
>> >> > No, the commit-graph includes loose commits as well.
>> >>
>> >> Depends on how you build the commit-graph.
>> >
>> > Yeah; I just didn't want to go into details, hoping that this short
>> > reply will be enough to jog Ævar's memory to recall our earlier
>> > discussion about this :)
>> To clarify (and I should have said) I meant it'll include only packed
>> commits in the mode Karl Ostmo invoked it in, as Derrick points out.
> No, even in that mode it will include loose objects as well, if it has
> to; that's what the "and closes under reachability" part of Derrick's
> reply means and that's what I showed in our earlier discussion at:
>   https://public-inbox.org/git/20190322154943.GF22459@xxxxxxxxxx/

I should have said "include any commits outside of packs [to seed the
revision walk]".

As you correctly point out there *are* caveats to that, e.g. it's
possible to have packs & loose commits but you include everything
because of reachability.

For the purposes of the discussion Jakub started upthread, the
not-quite-correct-but-close-enough mental model, that we generally tend
to accumulate loose objects that later coalesce into packs, is close
enough.

I.e. for that reason, for most users a "git commit-graph write" won't
produce a graph with all reachable commits. E.g. try cloning git.git,
"git am"-ing a patch on top, and generating it again: it'll be the same
(unless you picked a humongous patch).
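
To make the seeding point concrete, here is a toy model in Python. It is
only a sketch of the behaviour described in this thread (seed from packed
commits, then close under reachability), not git's actual code; the
function name, the "parents" dict and the "packed" set are all made up
for illustration.

```python
from typing import Dict, List, Set


def commits_in_graph(parents: Dict[str, List[str]], packed: Set[str]) -> Set[str]:
    """Toy model: seed from packed commits, then close under reachability."""
    seen: Set[str] = set()
    stack = list(packed)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        # Follow parent links whether the parent is packed or loose.
        stack.extend(parents.get(commit, []))
    return seen


# Hypothetical linear history A <- B <- C <- D, with D as the tip.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# Assumption for the demo: A and C are in a pack, B and D are loose.
# B is an ancestor of a packed commit; D sits on top (think "git am").
packed = {"A", "C"}

graph = commits_in_graph(parents, packed)
print(sorted(graph))
```

In this model the loose commit B ends up in the graph because a packed
commit reaches it, while the loose commit D on top is missed since no
packed commit seeds a walk that gets there.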

Similarly it'll be incomplete for most users that have
gc.writeCommitGraph=true on since they use "gc --auto", and they're
likely in an in-between state where they have a semi-stale graph.

So building tools directly on top of it shouldn't be anyone's first
choice. Instead, walk the DAG and see if that walking code can, as an
optimization, optimistically consult the commit-graph.
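
A minimal sketch of that shape, in Python, assuming the object store and
the commit-graph can both be modelled as "commit -> list of parents"
lookups. Every name here (reachable_commits, read_parents_from_odb,
commit_graph) is hypothetical; nothing below is git's API.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple


def reachable_commits(
    tips: Iterable[str],
    read_parents_from_odb: Callable[[str], List[str]],
    commit_graph: Optional[Dict[str, List[str]]] = None,
) -> Tuple[Set[str], int]:
    """Walk the DAG from the tips; use the graph only as a shortcut."""
    graph = commit_graph or {}
    seen: Set[str] = set()
    graph_hits = 0
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        parents = graph.get(commit)
        if parents is None:
            # Not in the (possibly stale) graph: fall back to the slow path.
            parents = read_parents_from_odb(commit)
        else:
            graph_hits += 1
        stack.extend(parents)
    return seen, graph_hits


# Hypothetical object store: A <- B <- C <- D.
odb = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
# A semi-stale graph that was written before D existed.
stale_graph = {"A": [], "B": ["A"], "C": ["B"]}

with_graph, hits = reachable_commits(["D"], odb.__getitem__, stale_graph)
without_graph, _ = reachable_commits(["D"], odb.__getitem__)
print(sorted(with_graph), hits)
print(with_graph == without_graph)
```

The point of the design is that the answer never depends on how complete
the graph is; a stale or missing graph only costs more slow-path reads.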