Re: [PATCH 2/3] setup: have the_repository use the_index
- Date: Wed, 12 Jul 2017 14:33:39 -0700
- From: Jonathan Nieder <jrnieder@xxxxxxxxx>
- Subject: Re: [PATCH 2/3] setup: have the_repository use the_index
Junio C Hamano wrote:
> Brandon Williams <bmwill@xxxxxxxxxx> writes:
>> Since it is a pointer then using a '#define' to replace 'the_index'
>> (which is not a pointer) would be a little more challenging.
> The above is merely realizing another downside that stems from the
> earlier design decision that the index field is not a real embedded
> structure, but is a pointer. It does not explain why it is better
> to have a pointer to an allocated structure in the first place.
> I am not (yet) telling you to fix the design to have a pointer
> "index" by replacing it with an embedded structure. I may actually
> do so later, but I am first trying to find out if it is a right
> design decision with some advantage.
Consider a command that doesn't need to access the index at all (e.g.,
"git grep --recurse-submodules -e foo HEAD").
In favor of using an embedding instead of a pointer, there is the
advantage that it makes initialization simpler. (It also involves a
tiny speedup by avoiding a pointer indirection on access, but that is
negligible.) For that reason it was a good choice when there was
only one repository in memory: using such a small bounded portion of
.bss space in exchange for some convenience is a good trade.
When a process has multiple repositories in memory (for example one
per thread), the trade-off becomes different. Instead of .bss, the
unused embedded index is on the stack or heap. Using embedding would
mean that instead of an unused extra word in the per-repository
structure we get an unused ~24 words.
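To make the comparison concrete, here is a rough sketch of the two layouts in C. Everything named sketch_* is made up for illustration and is not git's actual definition of the repository or index structures; the 24-pointer array only stands in for the ~24 words estimated above, and the lazy allocation is one assumed way a pointer field could be populated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for index_state: 24 machine words (an assumption). */
struct sketch_index_state {
	void *words[24];
};

/* Embedding: every repository object carries the whole index, used or not. */
struct sketch_repo_embedded {
	const char *gitdir;
	struct sketch_index_state index;
};

/* Pointer: one word per repository; the index is allocated only on demand. */
struct sketch_repo_pointer {
	const char *gitdir;
	struct sketch_index_state *index;
};

/* Return the index, allocating a zeroed one on first access (NULL on OOM). */
static struct sketch_index_state *sketch_repo_index(struct sketch_repo_pointer *repo)
{
	assert(repo != NULL);
	if (!repo->index)
		repo->index = calloc(1, sizeof(*repo->index));
	return repo->index;
}

/* Release the index, if any, so the repository is back to its unused state. */
static void sketch_repo_clear(struct sketch_repo_pointer *repo)
{
	assert(repo != NULL);
	free(repo->index);
	repo->index = NULL;
}

/* Words occupied by the index field in a repository that never touches it. */
static size_t sketch_unused_words_embedded(void)
{
	return sizeof(((struct sketch_repo_embedded *)0)->index) / sizeof(void *);
}

static size_t sketch_unused_words_pointer(void)
{
	return sizeof(((struct sketch_repo_pointer *)0)->index) / sizeof(void *);
}
```

With these definitions, a repository that never calls sketch_repo_index() pays for one pointer-sized field in the pointer layout and for the full 24-word array in the embedded layout. The price of the pointer layout is the extra allocation and NULL check on first access, which is the initialization cost mentioned earlier.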
An argument could be made that we wouldn't want to waste either 1 word
or 24 words per in-memory repository object --- we'd want to waste 0
words and separately keep a map from repositories to index_state that
only gets populated when needed. That complicates index access a bit
too much for my taste. 1 word instead of 0 or 24 seems like a
All that said, I don't have a strong opinion on this. Both the 1-word
approach (a pointer) and the 24-word approach (embedding) are tolerable
and there are reasons to prefer each.