Re: Questions on GSoC 2019 Ideas
- Date: Sun, 3 Mar 2019 17:03:31 +0700
- From: Duy Nguyen <pclouds@xxxxxxxxx>
- Subject: Re: Questions on GSoC 2019 Ideas
On Sat, Mar 2, 2019 at 10:09 PM Thomas Gummerer <t.gummerer@xxxxxxxxx> wrote:
> On 03/01, Duy Nguyen wrote:
> > On Fri, Mar 1, 2019 at 5:20 AM Christian Couder
> > <christian.couder@xxxxxxxxx> wrote:
> > >
> > > Hi Matheus,
> > >
> > > On Thu, Feb 28, 2019 at 10:46 PM Matheus Tavares Bernardino
> > > <matheus.bernardino@xxxxxx> wrote:
> > > >
> > > > I've been in the mailing list for a couple weeks now, mainly working
> > > > on my gsoc micro-project and in other patches that derived from it.
> > > > I also have been contributing to the Linux Kernel for half an year,
> > > > but am now mainly just supporting other students here at USP.
> > > >
> > > > I have read the ideas page for the GSoC 2019 and many of them interest
> > > > me. Also, looking around git-dev materials on the web, I got to the
> > > > GSoC 2012 ideas page. And this one got my attention:
> > > > https://github.com/peff/git/wiki/SoC-2012-Ideas#improving-parallelism-in-various-commands
> > > >
> > > > I'm interested in parallel computing and that has been my research
> > > > topic for about an year now. So I would like to ask what's the status
> > > > of this GSoC idea. I've read git-grep and saw that it is already
> > > > parallel, but I was wondering if there is any other section in git in
> > > > which it was already considered to bring parallelism, seeking to
> > > > achieve greater performance. And also, if this could, perhaps, be a
> > > > GSoC project.
> > >
> > > I vaguely remember that we thought at one point that all the low
> > > hanging fruits had already been taken in this area but I might be
> > > wrong.
> > We still have to remove some global variables, which is quite easy to
> > do, before one could actually add mutexes and stuff to allow multiple
> > pack access. I don't know though if the removing global variables is
> > that exciting for GSoC, or if both tasks could fit in one GSoC. The
> > adding parallel access is not that hard, I think, once you know
> > packfile.c and sha1-file.c relatively well. It's mostly dealing with
> > caches and all the sliding access windows safely.
> I'm not very familiar with what's required here, but reading the above
> makes me think it's likely too much for a GSoC project. I think I'd
> be happy with a project that declares removing the global variables as
> the main goal, and adding parallelism as a potential bonus.
> I'm a bit wary of a too large proposal here, as we've historically
> overestimated what kind of project is achievable over a summer (I've
> been there myself, as my GSoC project was also more than I was able to
> do in a summer :)). I'd rather have a project whose goal is rather
> small and can be expanded later, than having something that could
> potentially take more than 3 months, where the student (or their
> mentors) have to finish it after GSoC.
This is why I'm not involved in GSoC. I often mis-estimate the size of
work (and yes I would still like your tree-based index format in,
can't remember why it never made it).
So yeah if you find removing global variables (which essentially
identifies shared states, a prerequisite for any parallel work)
reasonable for GSoC, I'd say go for it.
Be also aware that this kind of refactoring work could result in lots
of patches and it takes time to get them merged, if your GSoC goal is
to get merged.