Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default

(cc list snipped)

Eddy Petrișor wrote:

> Cc: [a lot of people]

Can you say a little about how this cc list was created?  E.g. should
"git send-email" get a feature to warn about long cc lists?

> Signed-off-by: Eddy Petrișor <eddy.petrisor@xxxxxxxxx>
> ---
> There are projects such as llvm/clang which use several repositories, and they
> might be forked for providing support for various features such as adding Redox
> awareness to the toolchain. This typically means the superproject will use
> another branch than master, occasionally even use an old commit from that
> non-master branch.
> Combined with the fact that when incorporating such a hierarchy of repositories
> usually the user is interested in just the exact commit specified in the
> submodule info, it follows that a desirable use case is to be also able to
> provide '--depth 1' to avoid waiting for ages for the clone operation to
> finish.

Some previous discussion is at

In theory this should be straightforward: Git protocol allows fetching
an arbitrary commit, so "git submodule update" and similar commands
could fetch the submodule commit by SHA-1 instead of by refname.  Poof!
Problem gone.

In practice, some complications:

 - Some servers do not permit fetch-by-sha1.  For example, GitHub does
   not permit it.  This is governed by the
   uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant
   configuration items.

   That should be surmountable by making the behavior conditional, but
   it's a complication.

 - When the user passes --depth=<num>, do they mean that to apply to
   the superproject, to the submodules, or both?  Documentation should
   make the behavior clear.

   Fortunately I believe this complication has been taken care of by
   the --shallow-submodules option.
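
To make the first complication concrete, here is a minimal sketch of a
fetch by SHA-1 against a throwaway local repository standing in for the
server.  The repository names (server, client), the file contents and
the commit identity are made up for illustration.  The only server-side
setting it relies on is uploadpack.allowReachableSHA1InWant, which a
hosting service may not let you change.

```shell
#!/bin/sh
# Sketch: fetch a non-tip commit by SHA-1 with --depth 1.
set -e
work=$(mktemp -d)
cd "$work"

# Illustrative identity so that "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# "server" stands in for the submodule's upstream; it gets two commits.
git init -q server
echo one >server/file
git -C server add file
git -C server commit -q -m first
echo two >server/file
git -C server commit -q -a -m second

# The superproject would record a commit that is not the branch tip.
old=$(git -C server rev-parse HEAD~1)

# Server side: allow "want" lines for commits reachable from a ref.
git -C server config uploadpack.allowReachableSHA1InWant true

# Client side: fetch exactly that commit, by SHA-1, without its history.
git init -q client
git -C client fetch -q --depth 1 "file://$work/server" "$old"

fetched=$(git -C client rev-parse FETCH_HEAD)
depth=$(git -C client rev-list --count FETCH_HEAD)
echo "fetched $fetched ($depth commit)"
```

Without the config line on the server side, the same fetch is expected
to be refused on servers that only accept ref tips, which is why the
behavior would have to be conditional.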

> Git submodule seems to be very stubborn and cloning master, although the
> wrapper script and the gitmodules-helper could work together to clone directly
> the branch specified in the .gitmodules file, if specified.

This could make sense.  For the same reason as --depth in the
superproject gives ambiguous signals about what should happen in
submodules, --single-branch in the superproject gives ambiguous
signals about what branch to fetch in submodules.

> Another wrinkle is that when the commit is not the tip of the branch, the depth
> parameter should somehow be stored in the .gitmodules info, but any change in
> the submodule will break the supermodule submodule depth info sooner or later,
> which is definitely fragile.

Hm, this seems to go in another direction.  I don't think we should
store the depth parameter in the .gitmodules file, since different
users are likely to have different preferences about what to make
shallow.  If we make --depth easy enough to use at the superproject
level then the user can specify what they want there.
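
For reference, a rough sketch of what the superproject-level interface
looks like with the existing options.  It builds a throwaway
superproject with one submodule (the names super, sub and copy are
invented for the example) and clones it with --depth 1 and
--shallow-submodules.  protocol.file.allow=always is there only because
the example uses file:// URLs on a single machine, which recent Git
versions restrict for submodules.

```shell
#!/bin/sh
# Sketch: shallow clone of a superproject and its submodule.
set -e
work=$(mktemp -d)
cd "$work"

# Illustrative identity so that "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Submodule upstream with two commits of history.
git init -q sub
echo a >sub/f
git -C sub add f
git -C sub commit -q -m one
echo b >sub/f
git -C sub commit -q -a -m two

# Superproject that records the current tip of "sub".
git init -q super
git -C super -c protocol.file.allow=always \
	submodule --quiet add "file://$work/sub" sub
git -C super commit -q -m "add sub"

# The user states the depth once, at the superproject level.
git -c protocol.file.allow=always clone -q --depth 1 \
	--recurse-submodules --shallow-submodules \
	"file://$work/super" copy

sub_depth=$(git -C copy/sub rev-list --count HEAD)
echo "submodule history length: $sub_depth"
```

Here the recorded submodule commit happens to be the branch tip, so a
depth of 1 is enough.  The non-tip case is the one where fetching by
SHA-1 would be needed.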

> I tried digging into this section of the code and debugging with bashdb to see
> where --depth might fit, but I got stuck on the shell-to-helper interaction and
> the details of the submodule implementation, so I want to lay out this first
> patch as starting point for the discussion in the hope somebody else picks it
> up or can provide some inputs. I have the feeling there are multiple code paths
> that are being run, depending on the moment (initial clone, submodule
> recursive, post-clone update etc.) and I have a gut feeling there shouldn't be
> any code duplication just because the operation is different.
> This first patch is only trying to use a non-master branch, I have some changes
> for the --depth part, but I stopped working on it due to the "default depth"
> issue above.
> Does any of this sound reasonable?
> Is this patch idea usable or did I manage to touch the part of the code that
> should not be touched?

I agree with the goal.  As mentioned above, I'm not confident about
the particular mechanism --- e.g. something using fetch-by-sha1 seems
likely to be more intuitive.

Today, the 'branch' setting in .gitmodules is only for "git submodule
update --remote".  This patch would be a significant expansion in
scope for it.  Hopefully others on the list can talk more about how
that fits into various workflows and whether it would work out well.
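
As an illustration of the mechanism the patch relies on, the sketch
below reads the 'branch' setting from a .gitmodules file and turns it
into a refspec with the same ${var:+...} expansion.  The .gitmodules
contents, the example.com URLs and the helper name branch_refspec are
hypothetical.  Note that the key is looked up by submodule name, which
in this example is the same as the path.

```shell
#!/bin/sh
# Sketch: derive an optional fetch refspec from .gitmodules.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical .gitmodules: one submodule names a branch, one does not.
cat >.gitmodules <<'EOF'
[submodule "llvm"]
	path = llvm
	url = https://example.com/llvm.git
	branch = redox
[submodule "clang"]
	path = clang
	url = https://example.com/clang.git
EOF

# Print a refspec for the configured branch, or nothing if none is set.
branch_refspec () {
	# "git config" exits non-zero for a missing key; treat that as empty.
	rbranch=$(git config -f .gitmodules "submodule.$1.branch" || true)
	# ${var:+word} expands to word only when var is set and non-empty.
	printf '%s' "${rbranch:+refs/heads/$rbranch:refs/heads/$rbranch}"
}

with_branch=$(branch_refspec llvm)
without_branch=$(branch_refspec clang)
echo "llvm:  '$with_branch'"
echo "clang: '$without_branch'"
```

An empty result leaves the fetch arguments unchanged, so submodules
without a 'branch' entry would keep the current behavior.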

Thanks and hope that helps,

>  git-submodule.sh | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 2491496..370f19e 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -589,8 +589,11 @@ cmd_update()
>  			branch=$(git submodule--helper remote-branch "$sm_path")
>  			if test -z "$nofetch"
>  			then
> +				# non-default branch
> +				rbranch=$(git config -f .gitmodules submodule.$sm_path.branch)
> +				br_refspec=${rbranch:+"refs/heads/$rbranch:refs/heads/$rbranch"}
>  				# Fetch remote before determining tracking $sha1
> -				fetch_in_submodule "$sm_path" $depth ||
> +				fetch_in_submodule "$sm_path" $depth $br_refspec ||
>  				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
>  			fi
>  			remote_name=$(sanitize_submodule_env; cd "$sm_path" && get_default_remote)