Re: [RFD] Long term plan with submodule refs?
- Date: Thu, 09 Nov 2017 14:08:51 +0900
- From: Junio C Hamano <gitster@xxxxxxxxx>
- Subject: Re: [RFD] Long term plan with submodule refs?
Stefan Beller <sbeller@xxxxxxxxxx> writes:
>> The relationship is indeed currently useful, but if the long term plan
>> is to strongly discourage detached submodule HEAD, then I would think
>> that these patches are in the wrong direction. (If the long term plan is
>> to end up supporting both detached and linked submodule HEAD, then these
>> patches are fine, of course.) So I think that the plan referenced in
>> Junio's email (that you linked above) still needs to be discussed.
> This email presents different approaches.
> This document should summarize the current situation of Git submodules
> and start a discussion of where it can be headed long term.
> Show different ways in which submodule refs could evolve.
> Submodules in Git are currently considered independent repositories.
> This is okay for current workflows, such as utilizing a library that is
> rarely updated. Other workflows that require a tighter integration between
> submodule and superproject are possible, but cumbersome as there is an
> additional step that has to be performed, which is the update of the gitlink
> pointer in the superproject.
I do not think "rarely updated" is an issue.
The problem is that we may want to make it easier to use a
superproject and its submodules as if the combined whole were a
single project, which currently is not easy, primarily because
submodules are separate entities with different sets of branches that
can be checked out independently from what branch the superproject
is working on.
> * Obtaining a copy of the Superproject tightly coupled with submodules
> solved via git clone --recurse-submodules=<pathspec>
> * Changing the submodule selection
> solved via submodule.active flags
> * Changing the remote / Interacting with a different remote for all submodules
> -> need to be solved, not core issue of this discussion
> * Syncing to the latest upstream
> solved via git pull --recurse
> * Working on a local feature in one submodule
> -> How do refs work spanning superproject/submodule?
> * Working on a feature spanning multiple submodules
> -> How do refs work spanning multiple repos?
> * Working on a bug fix (Changing the feature that you currently work on, branches)
> -> How does switching branches in the superproject affect submodules
These are good starting points for copying such a combined whole to
your local machine and starting to work on it. The more interesting,
important, and potentially difficult part is how the result of such
work is shared back to where you started from. "push --recursive"
may be a simple phrase, but a sensible definition of how it should
work won't be that simple.
> Possible data models and workflow implications
> In the following, different data models are presented that aid a
> submodule-heavy workflow, each with its pros and cons.
> Keep everything as is, superproject and submodule have their own refs
> * Current tools that manage multiple repositories (e.g. repo, git-slave)
> have "branches in parallel", i.e. each repo has a branch of the same
> name, instead of using a superproject to manage the state of all repos
> involved. So users of such tools may be confused by submodules.
> * when using a detached HEAD in the submodule, we may run into git-gc issues.
We should make detached HEAD safe against gc if it is not,
regardless of the use of submodules. I thought it already was made
safe a long time ago.
> Use replicated refs in submodules
> This approach will replicate the superproject refs into the submodule
> ref namespace, e.g. git-branch learns about --recurse-submodules, which
> creates a branch of a given name in all submodules. These (topic) branches
> should be kept in sync with the superproject
> * This seemed intuitive to Gerrit users
> * 'quick' to implement, most of the commands are already there,
> just git-branch is needed to have the workflows mentioned above complete.
> * What does "git checkout -b A B" mean? (special case: B == HEAD)
The command ran at which level? In the superproject, or in a single
> Is the branch name replicated as a string into the submodule operation,
> or do we dereference the superproject's gitlink and walk from there?
If they are "kept in sync with the superproject", then there should
be no difference between the two, so I do not see any room for
wondering about that. In other words, if there is need to worry
about the differences between the above two, then it probably is
fundamentally impossible to keep these in sync, and a design that
assumes it is possible would have to expose glitches to the end-user.
I do not know if glitches resulting from there would be so severe as to
be show-stoppers, though. It might be possible to paper them over.
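
To make the two readings above concrete, here is a minimal sketch in
Python. It is a toy model, not Git internals: plain dicts stand in for
ref stores, and every name in it (resolve_by_name, resolve_by_gitlink,
the commit ids, the "lib" path) is hypothetical. It only shows that the
two lookups agree while the submodule branch is in sync with the
superproject's gitlink and diverge once it drifts.

```python
# Toy model only: dicts stand in for ref stores; all names are hypothetical.

def resolve_by_name(sub_refs, branch):
    """Reading 1: look up the replicated branch name inside the submodule."""
    return sub_refs.get(branch)

def resolve_by_gitlink(super_refs, super_gitlinks, branch, path):
    """Reading 2: resolve the branch in the superproject, then follow its gitlink."""
    super_commit = super_refs[branch]
    return super_gitlinks[super_commit][path]

super_refs = {"B": "S2"}                 # superproject branch -> superproject commit
super_gitlinks = {"S2": {"lib": "c2"}}   # superproject commit -> {path: submodule commit}

in_sync = {"B": "c2"}   # submodule branch B matches the recorded gitlink
drifted = {"B": "c3"}   # submodule branch B moved without a gitlink update

def agree(sub_refs):
    """True when both readings name the same submodule commit."""
    by_name = resolve_by_name(sub_refs, "B")
    by_link = resolve_by_gitlink(super_refs, super_gitlinks, "B", "lib")
    return by_name == by_link

print(agree(in_sync), agree(drifted))
```

The drifted case is exactly where "git checkout -b A B" would have to
pick one reading over the other.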
> No submodule refstore at all
> Use refs and commits in the superproject to stitch submodule changes
> together. Disallow branches in the submodule. This is only restricted
> to the working tree inside the superproject, such that the output of git-branch
> changes depending on whether the working tree is in- or outside the superproject
> working tree.
This would need enhancement for reachability code, but it feels the
cleanest from the philosophical standpoint---if you want to treat a
superproject and its submodules as if they were a single project,
ability to check out a branch in a submodule that does not match
that of the superproject would only get in the way of preserving the
illusion of "single project"-ness.
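
As a rough illustration of what such a reachability enhancement would
have to compute, here is a sketch in Python under stated assumptions:
commits are plain strings, parent links and gitlinks are dicts, and all
function and variable names are made up for this example. With no refs
in the submodule, a submodule commit counts as reachable only if some
superproject commit reachable from a superproject ref records it (or one
of its descendants) as a gitlink.

```python
# Toy model: commits are strings, graphs are dicts. Names are hypothetical.

def ancestors(parents, tips):
    """Return every commit reachable from tips by following parent links."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen

def reachable_submodule_commits(super_refs, super_parents, gitlinks, sub_parents):
    """Submodule commits kept alive solely through superproject refs."""
    super_commits = ancestors(super_parents, super_refs.values())
    sub_tips = {gitlinks[c] for c in super_commits if c in gitlinks}
    return ancestors(sub_parents, sub_tips)

super_refs = {"master": "S2"}
super_parents = {"S2": ["S1"], "S1": []}
gitlinks = {"S1": "c1", "S2": "c3"}       # superproject commit -> submodule commit
sub_parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c1"]}

live = reachable_submodule_commits(super_refs, super_parents, gitlinks, sub_parents)
print(sorted(live))
```

In this model x1 was never recorded by any superproject commit, so it is
not in the reachable set.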
> New type of symbolic refs
> A symbolic ref can currently only point at a ref or another symbolic ref.
> This proposal showcases different scenarios on how this could change in the
> HEAD pointing at the superproject's index
This looks to me a mere implementation detail for a (part of)
necessary component to realize the above "No submodule refstore".
> Superproject operations spanning index and worktree
> E.g. git reset --mixed
> As the submodule's HEAD is defined in the index, we would reset it to the
> version in the last commit. As --mixed promises to not touch the working tree,
> the submodule's worktree would not be touched. git reset --mixed in the
> superproject is the same as --soft in the submodule.
I am not sure if you want to take the promises that low-level "single
repository" plumbing operations make too literally. "reset --mixed"
may promise not to touch the working tree, but it also promises not
to touch submodules at all. If you are breaking the latter anyway,
it would make more sense not to be afraid of breaking the former if
it makes sense in the context of allowing the command to do more by
breaking the latter.
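
For reference, the quoted "reset --mixed" behaviour can be pictured with
a small Python sketch. This models only the proposal as described in the
quoted text, not what Git does today; the class and field names are
invented for illustration.

```python
from dataclasses import dataclass

# Toy model of the quoted proposal; field and function names are hypothetical.

@dataclass
class SuperprojectState:
    head_gitlink: str    # submodule commit recorded in the superproject's last commit
    index_gitlink: str   # submodule commit recorded in the index (the submodule's HEAD)
    sub_worktree: str    # commit whose files are checked out in the submodule

def reset_mixed(state):
    """Reset the index entry to the last commit; leave the submodule worktree alone."""
    return SuperprojectState(
        head_gitlink=state.head_gitlink,
        index_gitlink=state.head_gitlink,
        sub_worktree=state.sub_worktree,
    )

before = SuperprojectState(head_gitlink="c1", index_gitlink="c2", sub_worktree="c2")
after = reset_mixed(before)
print(after)
```

Seen from the submodule, its HEAD moved from c2 to c1 while its files
stayed at c2, which is what a "--soft" reset there would do.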