Re: test suite: why does git add large_file create a pack, rather than an object?




On Mon, Apr 1, 2019 at 10:10 PM Philip Oakley <philipoakley@xxxxxxx> wrote:
>
> hi Junio,
> On 01/04/2019 11:47, Junio C Hamano wrote:
> > Philip Oakley <philipoakley@xxxxxxx> writes:
> >
> >> At the moment I'm using an extended _test_ case that starts by adding
> >> a ~5.1Gb file and then using verify-pack, which aborts with an error.
> >>
> >>          dd if=/dev/zero of=file bs=1M count=5100 &&
> >>          git config core.compression 0 &&
> >>          git config core.looseCompression 0 &&
> >>          git add file &&
> >>          git verify-pack -s .git/objects/pack/*.pack &&
> >>          git fsck --verbose --strict --full &&
> >>          ...
> >>
> >> If, however, I simply execute the commands from the GfW bash, the added
> >> file is stored as a blob object, rather than a pack.
> >>
> >> I'm at a loss to understand the reason for the change in behaviour
> >> [store file as pack, vs store as object] between running the code as a
> >> test script and at the terminal. What am I missing?
> > To which test are you adding the above piece?  Perhaps one of those
> > that configures core.bigFileThreshold?
> The test script (t-large-files-on-winows.sh: [1] below) was specific to
> this debugging.
>
> I didn't set core.bigfilethreshold - Is that done (or unset) by the test
> setup at all?
>
> It does prompt me to check that all the bigfilethreshold checks are
> actually size_t, rather than a simple 'long'/uInt, which would only be
> 32 bits on Windows and could lead to a downcast comparison, resulting in
> mistaken bigfile actions because of the modulo 2^32 wraparound.
>
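
To put numbers on the modulo 2^32 concern, here is a minimal sketch of
the arithmetic only. It assumes the documented default of 512 MiB for
core.bigFileThreshold and that dd's bs=1M means 1048576 bytes; the
truncate_to_u32 helper is hypothetical and just models keeping the low
32 bits of a size. It says nothing about which type git actually uses
at each comparison.

```shell
# Illustrative only: model what a downcast to a 32-bit unsigned type keeps.
# truncate_to_u32 is a hypothetical helper, not part of git.
truncate_to_u32() {
    # Keep only the low 32 bits of the given byte count.
    echo $(( $1 % 4294967296 ))
}

# Size of the dd-generated test file in bytes (bs=1M count=5100).
file_size=$(( 5100 * 1024 * 1024 ))
# Assumed threshold: the documented default of core.bigFileThreshold (512 MiB).
default_threshold=$(( 512 * 1024 * 1024 ))

truncated=$(truncate_to_u32 "$file_size")
echo "full size:   $file_size"
echo "low 32 bits: $truncated"

if [ "$truncated" -gt "$default_threshold" ]; then
    echo "still above the assumed threshold after truncation"
else
    echo "at or below the assumed threshold after truncation"
fi
```

Under those assumptions the wrapped size (1052770304 bytes) is still
above the threshold, so a wrap on the size alone would not flip the
decision for this particular file. It could for other sizes, so the
type check is still worth doing.
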
> So when I run the test script [1] on Windows I get my error from
> verify-pack, and the trash directory contains a single pack file.
> I tried doing the commands singly on a fresh repo, but that time found
> that the add/verify produced a blob object (rather than a pack with one
> object), so it got me wondering if I was testing like for like.
>
> When I tried using gdb at the add stage, with a breakpoint, I got a
> backtrace [2], and when run to completion it had the loose object, so I
> was confused. (my fixup code is at [3])
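
To check whether the two runs are like for like, it may help to record
how the object ended up stored right after "git add" in both places.
Below is a rough sketch that only looks at the directory layout (loose
objects in two-hex-digit fan-out directories, packs under
objects/pack). The count_storage name is made up for illustration.

```shell
# Hypothetical helper: count loose objects and packs under an objects dir.
# Defaults to .git/objects relative to the current directory.
count_storage() {
    objdir=${1:-.git/objects}
    loose=0
    packs=0
    # Loose objects live in fan-out directories named by two hex digits.
    for f in "$objdir"/[0-9a-f][0-9a-f]/*
    do
        if [ -f "$f" ]; then
            loose=$((loose + 1))
        fi
    done
    # Packs live in objects/pack; count only the .pack files, not the .idx.
    for f in "$objdir"/pack/*.pack
    do
        if [ -f "$f" ]; then
            packs=$((packs + 1))
        fi
    done
    echo "loose=$loose packs=$packs"
}
```

Calling count_storage in the trash directory and again in the
hand-made repo gives two lines that can be compared directly. "git
count-objects -v" reports similar numbers if you prefer a built-in.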

Streaming a blob directly to a pack is done by index_stream(). I
suggest you force a crash when that function is called (from your test
script), then examine it with gdb for more info. You should be able to
see what its caller is (in case it's not index_fd()); then perhaps you
could add a bunch of printfs to show all the conditions that lead (or
do not lead) to that function.

There are some would_convert_ calls in index_fd(). Maybe some other
config keys are affecting this.

PS. I also don't know what index_stream_convert_blob() does. Not sure
if it's really streaming to a blob or streaming from somewhere to a
converter. You might want to check that too.
-- 
Duy