Re: Reduce pack-objects memory footprint?




On Fri, Mar 2, 2018 at 5:54 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:
>
>> On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>> > linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
>> > consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
>> > all apps nearly unusable (granted the problem is partly Linux I/O
>> > scheduler too). So I wonder if we can reduce pack-objects memory
>> > footprint a bit.
>>
>> Next low hanging fruit item:
>>
>> struct revindex_entry {
>>         off_t offset;
>>         unsigned int nr;
>> };
>>
>> We need one entry per object, so 6.5M objects * 16 bytes = 104 MB. If
>> we break this struct apart and store two arrays of offset and nr in
>> struct packed_git, we save 4 bytes per struct, 26 MB total.
>>
>> It's getting low but every megabyte counts for me, and it does not
>> look like breaking this struct will make horrible code (we recreate
>> the struct at find_pack_revindex()) so I'm going to do this too unless
>> someone objects. There will be a slight performance regression due to
>> cache effects, but hopefully it's ok.
>
> Maybe you will prove me wrong, but I don't think splitting them is going
> to work. The point of the revindex_entry is that we sort the (offset,nr)
> tuple as a unit.
>
> Or are you planning to sort it, and then copy the result into two
> separate arrays?

Yep.
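
To make the idea concrete, here is a rough sketch of "sort as a unit,
then copy into two arrays". It is only an illustration: struct
split_revindex and the split_revindex_*() helpers are hypothetical
names, not existing git code, and the real fields would live in struct
packed_git. It assumes a typical LP64 system where off_t is 8 bytes, so
the original struct pads out to 16 bytes while the two arrays cost
8 + 4 = 12 bytes per object.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>	/* off_t */

/* Existing layout: 8 + 4 bytes, usually padded to 16 on LP64. */
struct revindex_entry {
	off_t offset;
	unsigned int nr;
};

/* Hypothetical split layout (illustration only, not git's API). */
struct split_revindex {
	off_t *offset;		/* sorted pack offsets */
	unsigned int *nr;	/* nr[i] belongs to offset[i] */
	size_t count;
};

/* Order entries by pack offset. */
int cmp_offset(const void *a, const void *b)
{
	const struct revindex_entry *x = a, *y = b;
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Sort the (offset, nr) tuples as a unit in the temporary array, then
 * copy the result into two parallel arrays. Returns 0 on success, -1
 * if allocation fails. Assumes count > 0.
 */
int split_revindex_build(struct split_revindex *out,
			 struct revindex_entry *tmp, size_t count)
{
	size_t i;

	qsort(tmp, count, sizeof(*tmp), cmp_offset);

	out->offset = malloc(count * sizeof(*out->offset));
	out->nr = malloc(count * sizeof(*out->nr));
	if (!out->offset || !out->nr) {
		free(out->offset);
		free(out->nr);
		return -1;
	}
	for (i = 0; i < count; i++) {
		/* sanity check: input must be sorted at this point */
		assert(i == 0 || tmp[i - 1].offset <= tmp[i].offset);
		out->offset[i] = tmp[i].offset;
		out->nr[i] = tmp[i].nr;
	}
	out->count = count;
	return 0;
}

/*
 * Binary search by offset, recreating the struct for the caller the
 * way a lookup function would have to. Returns 1 if found, 0 if not.
 */
int split_revindex_find(const struct split_revindex *ri, off_t ofs,
			struct revindex_entry *found)
{
	size_t lo = 0, hi = ri->count;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;

		if (ri->offset[mi] == ofs) {
			found->offset = ofs;
			found->nr = ri->nr[mi];
			return 1;
		}
		if (ri->offset[mi] < ofs)
			lo = mi + 1;
		else
			hi = mi;
	}
	return 0;
}
```

Note that the temporary array still costs 16 bytes per entry while the
index is being built, so in this sketch the saving only shows up after
the caller frees it; peak usage during the build is actually higher.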

> I think that would work, but it sounds kind of nasty

Yeah :(

> (arcane code, and extra CPU work for systems that don't care about the
> 26MB).
-- 
Duy