Re: Questions on GSoC 2019 Ideas
- Date: Sun, 3 Mar 2019 17:12:59 +0700
- From: Duy Nguyen <pclouds@xxxxxxxxx>
- Subject: Re: Questions on GSoC 2019 Ideas
On Sun, Mar 3, 2019 at 2:18 PM Christian Couder wrote:
> One thing I am still worried about is if we are sure that adding
> parallelism is likely to get us a significant performance improvement
> or not. If the performance of this code is bounded by disk or memory
> access, then adding parallelism might not bring any benefit. (It could
> perhaps decrease performance if memory locality gets worse.) So I'd
> like some confirmation either by running some tests or by experienced
> Git developers that it is likely to be a win.
This is a good point. My guess is that pack access consists of two
parts: decompression (inflating zlib streams and resolving delta
objects, which is just another form of compression) and actual I/O.
The former is CPU bound and may take advantage of multiple cores.
However, the cache we have already helps reduce the CPU workload, so
perhaps the actual gain is not that much (or maybe we could just
improve this cache to be more efficient).
I'm adding Jeff; maybe he has already done some experiments on
parallel pack access, who knows.
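
To get a rough feel for whether the CPU-bound half can use more cores
at all, something like the sketch below could be a starting point. It
is Python, not Git code, and it only times zlib inflation of synthetic
buffers: there is no delta resolution, no cache and no real pack I/O,
so it says nothing definitive about Git itself. The helper names
(make_blobs, inflate_serial, inflate_parallel) and the buffer sizes are
made up for illustration, and it assumes CPython's zlib module releases
the GIL while inflating.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_blobs(count=16, size=1 << 20):
    """Build `count` zlib-compressed buffers standing in for packed objects."""
    raw = bytes(range(256)) * (size // 256)
    return [zlib.compress(raw) for _ in range(count)]


def inflate_serial(blobs):
    """Inflate every buffer on the calling thread."""
    return [zlib.decompress(b) for b in blobs]


def inflate_parallel(blobs, workers=4):
    """Inflate the buffers on a small thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, blobs))


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    blobs = make_blobs()
    serial, t_serial = timed(inflate_serial, blobs)
    parallel, t_parallel = timed(inflate_parallel, blobs)
    assert serial == parallel  # both paths must produce identical data
    print(f"serial {t_serial:.3f}s, 4 threads {t_parallel:.3f}s")
```

Whether the threaded run is actually faster depends on the machine and
on the object sizes, which is exactly the kind of thing that would need
measuring on real repositories.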
The second benefit of parallel pack access is not about utilizing
processing power from multiple cores, but about _not_ blocking. I
think one example use case here is parallel checkout. While one thread
is blocked in the pack access code for whatever reason, the others can
still continue doing other stuff (e.g. write the checked out file to
disk) or even access the pack again to check more things out.
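
The non-blocking idea can be sketched as below. Again this is Python
and not Git's checkout code: read_object is a hypothetical stand-in for
pack access whose `delay` argument simulates a stall, and checkout_one
and parallel_checkout are names invented for the example. The point is
only that a worker stuck on one object does not stop the others from
writing their files.

```python
import os
import tempfile
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def read_object(name, delay):
    """Hypothetical stand-in for pack access; `delay` simulates a stall."""
    time.sleep(delay)
    return f"contents of {name}\n".encode()


def checkout_one(outdir, name, delay, done, lock):
    """Read one object, write it to disk, then record that it finished."""
    data = read_object(name, delay)
    with open(os.path.join(outdir, name), "wb") as f:
        f.write(data)
    with lock:
        done.append(name)


def parallel_checkout(outdir, entries, workers=4):
    """Check out (name, delay) entries on a thread pool.

    Returns the names in the order their files were completed.
    """
    done, lock = [], threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(checkout_one, outdir, name, delay, done, lock)
            for name, delay in entries
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return done


if __name__ == "__main__":
    entries = [("slow.txt", 0.3), ("a.txt", 0.0), ("b.txt", 0.0)]
    with tempfile.TemporaryDirectory() as outdir:
        order = parallel_checkout(outdir, entries)
        print("completion order:", order)
```

Here slow.txt is submitted first but its simulated stall does not hold
up a.txt and b.txt, which are written while the first worker is still
waiting.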