Re: Questions on GSoC 2019 Ideas
- Date: Tue, 5 Mar 2019 20:46:50 -0300
- From: Matheus Tavares Bernardino <matheus.bernardino@xxxxxx>
- Subject: Re: Questions on GSoC 2019 Ideas
Estimating where git could gain performance from parallelism is turning
out to be harder than I expected. I'm also not yet familiar with git
packing, or with the parts of it that could benefit from parallelism.
Could anyone point me to some good references on this, so I can study
them and come back with more valuable suggestions?
On Tue, Mar 5, 2019 at 9:57 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Tue, Mar 5, 2019 at 11:51 AM Jeff King <peff@xxxxxxxx> wrote:
> > > processing power from multiple cores, but about _not_ blocking. I
> > > think one example use case here is parallel checkout. While one thread
> > > is blocked by pack access code for whatever reason, the others can
> > > still continue doing other stuff (e.g. write the checked out file to
> > > disk) or even access the pack again to check more things out.
Hmm, do you mean distributing the work of inflating, reconstructing
deltas and checking out files among the threads, with each thread
handling a different file?
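To make that reading concrete, here is a rough sketch in Python rather than git's C. It assumes zlib-compressed blobs standing in for pack entries, skips delta reconstruction entirely, and uses hypothetical names (`check_out_one`, `parallel_checkout`) that do not correspond to anything in git's code base. Each worker inflates one blob and writes it to disk, so a worker blocked on I/O does not stop the others.

```python
import os
import tempfile
import zlib
from concurrent.futures import ThreadPoolExecutor


def check_out_one(dest_dir, name, compressed):
    """Inflate one blob and write it to dest_dir; independent per file."""
    data = zlib.decompress(compressed)
    path = os.path.join(dest_dir, name)
    with open(path, "wb") as f:
        f.write(data)
    return name, len(data)


def parallel_checkout(dest_dir, blobs, workers=4):
    """Check out every blob using a pool of worker threads.

    blobs maps a file name to its zlib-compressed content (an assumption
    of this sketch, not git's on-disk pack format).
    Returns a dict mapping file name to the number of bytes written.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(check_out_one, dest_dir, name, compressed)
            for name, compressed in blobs.items()
        ]
        # result() re-raises any exception that occurred in a worker.
        return dict(f.result() for f in futures)
```

Whether this helps in practice depends on how much time is spent blocked versus contending on shared state, which is exactly what would have to be measured.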
> > I'm not sure if it would help much for packs, because they're organized
> > to have pretty good cold-cache read-ahead behavior. But who knows until
> > we measure it.
> > I do suspect that inflating (and delta reconstruction) done in parallel
> > could be a win for git-grep, especially if you have a really simple
> > regex that is quick to search.
> Maybe git-blame too. But this is based purely on me watching CPU
> utilization of one command with hot cache. For git-blame though, diff
> code has to be thread safe too, but that's another story.
I don't know if this relates to parallelizing pack access, but I
thought sharing it might bring some new insights (maybe even on
parallelizing some other part of git): I asked my friends who
contribute to the Linux kernel which git commands seem to take the
longest during their kernel work, and the answers were:
- git log and git status, sometimes
- using the pager's search in git log
- checking out an old commit
- git log --oneline --decorate --graph
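These reports are anecdotal, so a small timing harness could help turn them into numbers. The sketch below is only one possible approach: the helper name `time_command` and the `CANDIDATES` list are made up for illustration, and output is discarded, so the pager case is not covered. Taking the best of several runs mostly reflects the hot-cache case; cold-cache behavior would need a different setup.

```python
import subprocess
import time


def time_command(argv, repeat=3, cwd=None):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # Discard output so terminal and pager speed do not skew the timing.
        subprocess.run(
            argv,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best


# Commands taken from the list above; run inside a repository of interest.
CANDIDATES = [
    ["git", "status"],
    ["git", "log", "--oneline", "--decorate", "--graph"],
]

if __name__ == "__main__":
    for argv in CANDIDATES:
        print(f"{' '.join(argv)}: {time_command(argv):.3f}s")
```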