Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids
- Date: Thu, 5 Oct 2017 06:00:02 -0400
- From: Jeff King <peff@xxxxxxxx>
- Subject: Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids
On Thu, Oct 05, 2017 at 06:48:10PM +0900, Junio C Hamano wrote:
> Jeff King <peff@xxxxxxxx> writes:
> > This is weirdly specific. Can we accomplish the same thing with existing
> > tools?
> > E.g., could:
> >
> >   git cat-file --batch-all-objects --batch-check='%(objectname)' |
> >   shuffle |
> >   head -n 100
> >
> > do the same thing?
> > I know that "shuffle" isn't available everywhere, but I'd much rather
> > see us fill in portability gaps in a general way, rather than
> > introducing one-shot C code that needs to be maintained (and you
> > wouldn't _think_ that t/helper programs need much maintenance, but try
> > perusing "git log t/helper" output; they have to adapt to the same
> > tree-wide changes as the rest of the code).
> I was thinking about this a bit more, and came to the conclusion
> that "sort -R" and "shuf" are wrong tools to use. We would want to
> measure with something close to a real-world workload. For example,
>
>   git rev-list --all --objects
>
> produces the list of objects in traversal order (i.e. this is very
> similar to the order in which "git log -p" needs to access the
> objects) and chomping at the number of sample objects you need in
> your test would give you such a list.
Actually, I'd just as soon see timings for "git log --format=%h" or "git
log --raw", as opposed to patches 1 and 2.
You won't see a 90% speedup there, but you will see the actual
improvement that real-world users are going to experience, which is way
more important, IMHO.
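
For concreteness, here is a rough sketch of the traversal-order sampling
discussed above. It is not taken from the patch series: the function name
sample_objects and the default count of 100 are illustrative assumptions.
It keeps only the object id from each line of "git rev-list --all
--objects" output, since tree and blob lines also carry a path.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or of this series): print the
# first N object ids in traversal order for the current repository.
sample_objects () {
	n=${1:-100}	# sample size; 100 is an arbitrary default
	# Lines are "<oid>" for commits and "<oid> <path>" for trees and
	# blobs; cut keeps the first field and passes through lines that
	# contain no space.
	git rev-list --all --objects |
	head -n "$n" |
	cut -d' ' -f1
}
```

The end-to-end timings suggested above could then be taken separately,
for example by running "time git log --format=%h >/dev/null" and
"time git log --raw >/dev/null" before and after the series is applied
(assuming a shell or system that provides "time").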