Re: Finding a tag that introduced a submodule change

On Wed, Mar 15, 2017 at 10:10 AM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Fri, Mar 3, 2017 at 7:40 AM, Robert Dailey <rcdailey.lists@xxxxxxxxx> wrote:
>> I have a repository with a single submodule in it. Since the parent
>> repository represents the code base for an actual product, I tag
>> release versions in the parent repository. I do not put tags in the
>> submodule since multiple other products may be using it there and I
>> wanted to avoid ambiguous tags.
>> Sometimes I run into a situation where I need to find out which
>> release of the product a submodule change was introduced in. This is
>> nontrivial, since there are no tags in the submodule itself. This is
>> one thing I tried:
>> 1. Do a `git log` in the submodule to find the SHA1 representing the
>> change I want to check for
>> 2. In the parent repository, do a git log with pickaxe to determine
>> when the submodule itself changed to the value of that SHA1.
>> 3. Based on the result of #2, do a `git tag --contains` to see the
>> lowest-version tag that contains the SHA1, which will identify the
>> first release that introduced that change
>> However, I was not able to get past #2 because apparently there are
>> cases where when we move the submodule "forward", we skip over
>> commits, so the value of the submodule itself never was set to that
>> SHA1.
>> I'm at a loss here on how to easily do this. Can someone recommend a
>> way to do this? Obviously the easier the better, as I have to somehow
>> train my team how to do this on their own.
>> Thanks in advance.
> I cannot offer an easy way. However I can come up with a proposal
> how to make this easy in the future. ;)
> "git-{branch,tag} --contains" currently only takes a commit id as that is
> easy to check for. (Just a revwalk from all commits, as we walk over the
> commits in the graph)
> We should extend the possible arguments to --contains, such that you can
> do
>     # check that a given path had this exact tree/blob id
>     git tag --contains <path>:<tree/blob-id>
>     # check if the given tree/blob was at any path
>     git tag --contains <tree/blob id>
>     # generalizing from above:
>     git tag --contains [<pathspec>:]<blob/tree id>
> With this designed API you could ask for
>     git tag --contains submodule:<sha1 from step 2>
> For the implementation of this feature the revwalk would also need
> to walk the object graph (as restricted by the pathspec) and
> see if there is the given object for each tag.
> Thanks,
> Stefan

This sounds useful, but it has a limitation with regard to submodules.
Let's say the parent project points to submodule commit 1.

In the submodule, you create commit 2, commit 3, and commit 4.

Then, in the parent project, you now move the submodule forward to commit 4.

I think the general goal for submodules is to say "which parent commit
included this submodule commit" but the parent never ACTUALLY included
commit 3, it only included commit 4 which happens to contain commit 3.
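
To make the distinction concrete, here is a minimal Python sketch of
the scenario above. It is a toy model, not git code: the commit names,
the SUBMODULE_PARENTS map, PARENT_HISTORY, and all three helper
functions are hypothetical. It contrasts an exact match on the recorded
submodule pointer with an ancestry check.

```python
# Toy model (hypothetical names): submodule commit -> list of its parents.
SUBMODULE_PARENTS = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
    "c4": ["c3"],
}

# Submodule pointer recorded by each parent-project commit, oldest first.
# The parent project jumps straight from c1 to c4.
PARENT_HISTORY = [("p1", "c1"), ("p2", "c4")]


def is_ancestor(ancestor, descendant, parents):
    """Return True if `ancestor` equals `descendant` or is reachable from it."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def first_exact(target, history):
    """First parent commit whose pointer is exactly `target`, else None."""
    for parent_commit, pointer in history:
        if pointer == target:
            return parent_commit
    return None


def first_containing(target, history, parents):
    """First parent commit whose pointer contains `target` in its history."""
    for parent_commit, pointer in history:
        if is_ancestor(target, pointer, parents):
            return parent_commit
    return None
```

In this model, first_exact("c3", PARENT_HISTORY) returns None because
no parent commit ever recorded c3, while
first_containing("c3", PARENT_HISTORY, SUBMODULE_PARENTS) returns "p2".
In a real repository the ancestry test would correspond to running
`git merge-base --is-ancestor` inside the submodule.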

I'm wondering if it might be worth adding an (optional) mode for
submodules which would disallow adding a submodule pointer if the
current submodule pointer is not an ancestor of the new value. This
seems like a valuable protection for many use cases (and preserves
the ability to bisect to find which commit added something). It
obviously shouldn't be mandatory, since people often rewind the
submodule pointer. With this enabled, the only way to rewind the
submodule pointer would be to rewind the parent history itself.
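
As a rough illustration of the check such a mode would perform, here
is a small Python sketch on a toy commit graph. Nothing in it is an
existing git option or API: the graph, the commit names, and the
pointer_update_allowed() helper are assumptions made for the example.

```python
# Toy submodule history (hypothetical names): commit -> list of its parents.
SUBMODULE_PARENTS = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
    "c4": ["c3"],
    "x1": ["c2"],  # a side branch that forks off after c2
}


def is_ancestor(ancestor, descendant, parents):
    """Return True if `ancestor` equals `descendant` or is reachable from it."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def pointer_update_allowed(current, new, parents):
    """Allow the update only if the current pointer is an ancestor of the new one."""
    return is_ancestor(current, new, parents)
```

Here moving the pointer from c1 to c4 would be allowed, while rewinding
from c4 to c2, or switching from c3 to the unrelated side branch x1,
would be rejected.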

You could make the --contains logic above smart enough to detect
"ancestor of", as it does for commits now, but I think that wouldn't
necessarily buy us much, and it seems pretty submodule-specific.