Web lists-archives.com

Re: [PATCH v1 1/2] log -G: Ignore binary files




On Thu, Nov 22, 2018 at 11:16:38AM +0100, Ævar Arnfjörð Bjarmason wrote:

> > +test_expect_success 'log -G looks into binary files with textconv filter' '
> > +	rm -rf .git &&
> > +	git init &&
> > +	echo "* diff=bin" > .gitattributes &&
> > +	printf "a\0b" >data.bin &&
> > +	git add data.bin &&
> > +	git commit -m "message" &&
> > +	git -c diff.bin.textconv=cat log -G a >actual &&
> > +	git log >expected &&
> > +	test_cmp actual expected
> > +'
> > +
> >  test_done
> 
> This patch seems like the wrong direction to me. In particular the
> assertion that "the concept of differences only makes sense for text
> files". That's just not true. This patch breaks this:

But "-G" is defined as "look for differences whose patch text contains
added/removed lines that match <regex>". We don't have patch text here,
let alone added/removed lines.

For binary files, "-Sfoo" is better defined. I think we _could_ define
"search for <pattern> in the added/removed bytes of a binary file".  But
I don't think that's what the current code does (it really does a line
diff on a binary file, which is likely to put tons of unchanged crap
into the "added and removed" lines, because the line divisions aren't
meaningful in the first place).

>     (
>         rm -rf /tmp/g-test &&
>         git init /tmp/g-test &&
>         cd /tmp/g-test &&
>         for i in {1..10}; do
>             echo "Always matching thensome 5" >file &&
>             printf "a thensome %d binary \0" $i >>file &&
>             git add file &&
>             git commit -m"Bump $i"
>         done &&
>         git log -Gthensome.*5
>     )
> 
> Right now this will emit 3/10 patches, and the right ones! I.e. "Bump
> [156]". The 1st one because it introduces the "Always matching thensome
> 5". Then 5/6 because the add/remove the string "a thensome 5 binary",
> respectively. Which matches /thensome.*5/.

Right, this will sometimes do the right thing. But it will also often do
the wrong thing. It's also very expensive (we specifically avoid feeding
large binary files to xdiff, but I think "-G" will happily do so -- I
didn't double check, though).

-Peff