Re: [RFC PATCH 000/194] Moving global state into the repository object
- Date: Wed, 7 Feb 2018 10:06:43 -0800
- From: Stefan Beller <sbeller@xxxxxxxxxx>
- Subject: Re: [RFC PATCH 000/194] Moving global state into the repository object
On Wed, Feb 7, 2018 at 3:48 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Tue, Feb 6, 2018 at 6:51 AM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
>> This series moves a lot of global state into the repository struct.
>> It applies on top of 2512f15446149235156528dafbe75930c712b29e (2.16.0)
>> It can be found at https://github.com/stefanbeller/git/tree/object-store
>> Motivation for this series:
>> * Easier to reason about code when all state is stored in one place,
>> for example for multithreading
>> * Easier to move to processing submodules in-process instead of
>> calling a processes for the submodule handling.
>> The last patch demonstrates this.
> I have a mixed feeling about this. The end game is to keep
> "the_repository" references as minimum as possible, right? Like we
> only need to mention in in cmd_xxx() and not all over the place. If
> so, yay!!
Yes. And the super-end-game long after this series is to have
cmd_xxx(struct repository *r, argc, argv)
> Something else.. maybe "struct repository *" should not be a universal
> function argument to avoid global states. Like sha1_file.c is mostly about the
> object store, and I see that you added object store struct (nice!).
> These sha1 related function should take the object_store* (which
> probably is a combination of both packed and loose stores since they
> depend on each other), not repository*. This way if a function needs
> both access to object store and ref store, we can see the two
> dependencies from function arguments.
I tried that in the beginning and it blew up a couple of times when I realized
that I forgot to pass through one of these dependencies.
Maybe we can go to the repository and shrink the scope in a follow up?
> The alternate object store, if modeled right, could share the same
> object store interface. But I don't think we should do anything that
> big right now, just put alternate object store inside object_store.
yup that is the case, see struct raw_object_store at the end of the series
> Similarly those functions in blob.c, commit.c, tree.c... should take
> object_parser* as argument instead of repository*. Maybe there's a
> better name for this than object_parser. parsed_object_store I guess.
> Or maybe just object_pool.
Note that the initial few patches are misleading in the name,
which splits up the object handling into two layers, the "physical" layer
(where to get the data from, mmaping pack files, alternates, streams of bytes),
which is "struct raw_object_store objects;" in the repository, and the
"object" layer, which is about parsing the objects and making sense of the data
(which tree belongs to a commit, dereferencing tags)
So maybe I'll try to make the first layer into its own series, such
that we'll have a smaller series.