
Re: Large object issue (Windows)

On 05/03/2019 03:35, brian m. carlson wrote:
On Mon, Mar 04, 2019 at 07:04:02PM -0500, Patrick Hogg wrote:
Hi all,

While investigating the last issue I reported (and fixed) I was trying
to come up with a good test case for repos with large objects. In the
process I found an issue on Windows with objects at least 4g large:

git init test
cd test
echo "*.exe binary" > .gitattributes
truncate -s 4g nullbytes.exe
git stage .
git commit -m "Test"
# This will break, complaining that the object is corrupt.
git fsck --full
# This will also break, complaining that the object is corrupt.
#git gc

I did some investigation and I think that this is a porting issue.
unpack_object_header_buffer in packfile.c uses an unsigned long for the
size. On Linux this will be 64 bits (at least on the Linux systems I've
tried) but on Windows it's 32 bits. The code then decides that the
object header is bad and bombs. However, if I move the repo to a Linux
machine it can handle the data just fine. (And ironically git generated
the object header when storing the object!)
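
To make the failure mode concrete, here is a minimal sketch of the variable-length size encoding used in pack object headers (3 type bits and 4 size bits in the first byte, then 7 size bits per continuation byte). It is not Git's actual code: encode_header, decode_header and the width_bits parameter are hypothetical names, and width_bits merely simulates how wide "unsigned long" is on a given platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode an object type and size; returns the number of bytes written.
 * Byte 0: continuation bit, 3 type bits, low 4 size bits.
 * Later bytes: continuation bit plus 7 more size bits each. */
size_t encode_header(unsigned type, uint64_t size, unsigned char *out)
{
    size_t n = 0;
    unsigned char c = (unsigned char)(((type & 7) << 4) | (size & 15));

    size >>= 4;
    while (size) {
        out[n++] = c | 0x80;              /* more bytes follow */
        c = (unsigned char)(size & 0x7f);
        size >>= 7;
    }
    out[n++] = c;
    return n;
}

/* Decode the size as if the accumulator were only width_bits wide
 * (32 to mimic Windows "unsigned long", 64 to mimic Linux x86-64).
 * Returns bytes consumed, or 0 if the header is truncated or the
 * size does not fit in width_bits. */
size_t decode_header(const unsigned char *buf, size_t len,
                     unsigned width_bits, uint64_t *size_out)
{
    size_t used = 0;
    uint64_t c, size;
    unsigned shift = 4;

    assert(len > 0 && width_bits <= 64);
    c = buf[used++];
    size = c & 15;
    while (c & 0x80) {
        if (used >= len || shift >= width_bits)
            return 0;                     /* reported as a bad header */
        c = buf[used++];
        size += (c & 0x7f) << shift;
        shift += 7;
    }
    *size_out = size;
    return used;
}
```

A size of exactly 4 GiB needs 33 bits, so the fifth continuation byte arrives with shift == 32. With width_bits set to 32 the decoder gives up at that point, while with 64 it recovers the full size.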

Is there any reason not to switch the unsigned longs in
unpack_object_header_buffer (and its callers, wherever that may lead)
to uint64_t? (Or any potential pitfalls in doing so that I would need
to look out for?)
It's known that there are several problems with this, affecting various
parts of the code. Patches to fix this are of course welcome.

I think we've chosen to specify size_t for values which are stored
entirely in memory, since a buffer can't be larger than this size, and
off_t for sizes which refer to files or object sizes. The latter will be
64-bit on 32-bit systems when compiled with _FILE_OFFSET_BITS set to 64,
while the former will be 32-bit.
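
As a rough illustration of that split (a sketch only, not code from Git), an object size can be carried around as off_t and converted to size_t only at the point where the whole object has to fit in a memory buffer. The helper name size_fits_in_memory is hypothetical.

```c
#define _FILE_OFFSET_BITS 64   /* request a 64-bit off_t where supported */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Returns 1 and stores the converted value if an on-disk size can be
 * used as an in-memory buffer length; returns 0 if it is negative or
 * larger than SIZE_MAX (possible when size_t is 32 bits wide). */
int size_fits_in_memory(off_t object_size, size_t *out)
{
    assert(out != NULL);
    if (object_size < 0 || (uintmax_t)object_size > SIZE_MAX)
        return 0;
    *out = (size_t)object_size;
    return 1;
}
```

On a 32-bit build a 4 GiB object would be rejected here instead of being silently truncated; on a 64-bit build the same call succeeds.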

Hi Patrick,

There is also a thread on the Git-for-Windows list at https://github.com/git-for-windows/git/issues/1063 and also here at https://public-inbox.org/git/994568940.109648.1548957557643@xxxxxxxxxxxxxxxx/

Part of the issue is that zlib on Windows 'sort of' fails to handle more than 4 GB (see question 32 in the zlib FAQ): its length value is only a 'long', which is only 32 bits there, so while zlib itself copes fine, it returns a length modulo that limit.
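
The "modulo" effect can be shown without zlib at all. The sketch below is an assumption-laden stand-in: byte_counters and count_chunk are hypothetical names, total32 plays the role of a counter stored in a 32-bit long, and total64 is what a caller would get by summing chunk lengths in a 64-bit variable of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t total32;   /* stand-in for a counter held in a 32-bit long */
    uint64_t total64;   /* the same running total kept in 64 bits */
} byte_counters;

/* Account for one processed chunk in both counters. */
void count_chunk(byte_counters *c, uint32_t chunk_len)
{
    assert(c != NULL);
    c->total32 += chunk_len;   /* unsigned arithmetic wraps modulo 2^32 */
    c->total64 += chunk_len;   /* keeps the true total */
}
```

Feeding in four 1 GiB chunks followed by 5 bytes leaves total32 at 5 while total64 holds 4294967301, which is the kind of mismatch a caller sees if it trusts the narrow counter.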

Trying to find all the places that should be upcast to size_t (ptr) or ptrdiff_t rather than coerced down to the Windows 32-bit long is part of the struggle.