Calculating pack file SHA value

Hi all,

I am trying to figure out how to calculate the SHA value of a pack file when you
run `git index-pack file.pack`. I am close, but having a bit of trouble at the
end. Here's my understanding so far.

Git buffers the data to be processed and, when the buffer is exhausted, updates
the SHA checksum with the data it has already read. This happens in
builtin/index-pack.c, specifically in fill(), which calls flush() to update the
SHA value. My question is: how does git determine how many bytes to process at
a time?

The number of buffered bytes is held in the file-scope variable input_len. It
seems to be 4096 for most reads and drops below 4096 only at the very end (this
obviously depends on the pack file; in my case the last value is 1074 bytes).
Ordinarily I would assume this is the result of the read() call not receiving
the full 4096 bytes, but my manual verification shows there are still bytes
remaining in the file that are never run through the SHA checksum.
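
For reference, here is a minimal Python sketch of the kind of manual check I am
attempting. It rests on an assumption: that, as the pack format documentation
describes, the file ends with a 20-byte SHA-1 trailer computed over everything
before it. The function name `pack_checksum` and the synthetic test data are my
own illustrative choices, not anything taken from git's source.

```python
import hashlib
import io

CHUNK = 4096       # assumed read size, matching the 4096 observed above
TRAILER_LEN = 20   # length of a SHA-1 digest in bytes


def pack_checksum(stream, total_size, chunk=CHUNK):
    """Hash every byte except the final TRAILER_LEN bytes.

    Returns (computed_digest, stored_trailer). Assumes the trailer is the
    SHA-1 of all preceding bytes; this is a sketch, not git's actual code.
    """
    sha = hashlib.sha1()
    remaining = total_size - TRAILER_LEN
    while remaining > 0:
        # Never read past the start of the trailer.
        data = stream.read(min(chunk, remaining))
        if not data:
            raise ValueError("file ended before the expected trailer")
        sha.update(data)
        remaining -= len(data)
    stored = stream.read(TRAILER_LEN)
    return sha.digest(), stored


if __name__ == "__main__":
    # Synthetic stand-in for a pack file: arbitrary payload plus its SHA-1.
    payload = bytes(range(256)) * 21
    fake_pack = payload + hashlib.sha1(payload).digest()
    computed, stored = pack_checksum(io.BytesIO(fake_pack), len(fake_pack))
    print("match" if computed == stored else "mismatch")
```

If that assumption holds, the chunk size should not affect the result, since
SHA-1 yields the same digest however the input is split across update calls.
The leftover bytes I am seeing might then simply be the trailer itself.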

How does git calculate a pack file's SHA checksum? How does it know how many
bytes (typically 4096) to feed into the hash when flush() runs? And how does it
know where in the file to stop updating the SHA-1?

I hope my questions are clear. Thanks!

Farhan Khan
PGP Fingerprint: 1312 89CE 663E 1EB2 179C 1C83 C41D 2281 F8DA C0DE