Re: [RFC PATCH 0/4] name-rev: improve memory usage




On Fri, Mar 01, 2019 at 08:14:26PM +0100, Alban Gruin wrote:

> > diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> > index f1cb45c227..7aaa86f1c0 100644
> > --- a/builtin/name-rev.c
> > +++ b/builtin/name-rev.c
> > @@ -431,6 +431,8 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
> >  		OPT_END(),
> >  	};
> >  
> > +	save_commit_buffer = 0;
> > +
> [...]
> 
> Unfortunately this does not work in all cases, apparently.  On my git
> copy, I have 3 origins.  If I run this:
> 
> 	git log --graph --oneline --abbrev=-1 -5 | git name-rev --stdin
> 
> With or without your change, it uses 3GB of RAM.  With this series, it
> uses 25MB of RAM.

Sorry if I was unclear. I meant to try that _in addition_ to your
changes. It helps by avoiding keeping the useless commit-object buffers
in RAM as we traverse. But the most it can save is the total
uncompressed bytes of all commit objects. I.e., in git.git:

  $ git cat-file --batch-check='%(objectsize) %(objecttype)' --batch-all-objects |
    grep commit |
    perl -alne '$total += $F[0]; END { print $total }'
  74678114

or around 70MB. In linux.git, it's more like 700MB.
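
The same total can be computed without perl. This is only a minimal sketch, assuming a POSIX awk is available; the helper name sum_commit_bytes is hypothetical and not part of git. It matches the type field exactly rather than grepping the whole line:

```shell
# Hypothetical helper: sum the sizes of all "commit" lines read from stdin.
# Expects lines of the form "<size> <type>", as produced by
#   git cat-file --batch-check='%(objectsize) %(objecttype)'
sum_commit_bytes () {
	awk '$2 == "commit" { total += $1 } END { printf "%d\n", total }'
}

# Usage, inside a repository:
#   git cat-file --batch-check='%(objectsize) %(objecttype)' --batch-all-objects |
#   sum_commit_bytes
```

The result is the upper bound on what turning off save_commit_buffer can save, since it counts every commit object in the repository, not just the ones a given traversal visits.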

But in your examples, the problem is the inefficiencies in name-rev's
algorithm, and you're not actually traversing that many commits. So I
think you'd want to turn off save_commit_buffer as an extra patch in
your series. It may or may not be a big win for any given case, but it's
quite easy to do.

-Peff