Re: [PATCH v4] log,diff-tree: add --combined-all-names option




On Thu, Feb 7, 2019 at 12:25 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Elijah Newren <newren@xxxxxxxxx> writes:
>
> > I think "copy from" and "rename from" should be relatively
> > straightforward.  However, in a combined diff, we could have a
> > modified status, a renamed status, and a copied status all at once,
> > meaning that
> > we'll need an array of both similarity and dissimilarity indexes...and
> > trying to present that to the user in a way that makes sense seems
> > like a lost cause to me.  Does anyone else know how to represent that?
> >  I'm inclined to just leave it out.
> >
> > Also, I'm afraid "copy to" and "rename to" could be confusing if both
> > appeared, since there's only one "to" path.  If there is both a copy
> > and a rename involved relative to different parents, should these be
> > coalesced into a "copy/rename to" line?
>
> There are three possible labels (i.e. 'in-place modification',
> 'rename from elsewhere' and 'copy from elsewhere'), and you can say
> "this commit created file F by renaming from X (or by copying X)"
> only when you know path F did not exist _immediately before_ this
> commit.  The distinction between rename and copy is whether the path
> X remains in the resulting commit (i.e. if there is no X, the commit
> created path F by moving X; if there is X, the commit copied the
> contents of X into a new path F).
>
> So telling renames and copies apart is probably straight-forward (if
> you have sufficient information---I am not sure if you do in this
> codepath offhand); as long as you know what pathname each preimage
> (i.e. parent of the merge) tree had and if that pathname is missing
> in the postimage (luckily there is only one---the merge result), it
> was renamed, and otherwise it was copied.

We have change status, M, C, A, R, D, etc.  So, R vs. C tells us
renamed or copied.  We also have the original filename.
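
To make the per-parent rule concrete, here is a minimal sketch of the
classification described in the quoted text.  It is not git's code: the
function name `classify` and the use of plain sets of path names to
stand in for trees are assumptions made purely for illustration.

```python
def classify(parent_paths, result_paths, source, target):
    """Return 'M', 'R' or 'C' for one parent of a merge.

    parent_paths / result_paths are sets of path names (hypothetical
    stand-ins for the parent tree and the merge result tree).
    source is the path the content came from in this parent;
    target is the path it has in the merge result.
    """
    if target in parent_paths:
        return "M"  # target already existed in this parent: in-place modification
    if source in result_paths:
        return "C"  # source survives in the merge result: copy
    return "R"      # source is gone from the merge result: rename


# The two-parent case from the quoted text: parent 1 has X, parent 2 has F,
# and the merge result has only F.
result = {"F"}
print(classify({"X"}, result, "X", "F"))  # R
print(classify({"F"}, result, "F", "F"))  # M
```

The same target path gets a different status relative to each parent,
which is why any "from" information has to be tracked per parent.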

> But telling in-place modification apart from the other two might be
> trickier. In one parent path F may be missing but in the other
> parent path F may exist, and the result of the merge is made by
> merging the contents of path X in the first parent and the contents
> of path F in the second parent.  From the view of the transition
> between the first parent to the merge result, we moved the path X to
> path F and made some modifications (i.e. renamed).  From the view of
> the transition from the other branch, we kept the contents in path F
> during the transition and there are no renames or copies involved.
>
> Actually what I had in mind when I mentioned the extended headers
> the first time in this discussion was that we would have "rename
> from", "copy from", etc. separately for each parent, as the contents
> may have come from different paths in these parents.  And that was
> where my earlier "... might only become waste of the screen real
> estate" comes from.

I think I'm with you on everything you said here, but perhaps not
since I can't see an answer to my question.  Maybe an example will
help:

Let's say we have an octopus merge.  Parent 1 had file F.  Parent 2
had file X.  Parent 3 had file Y.  The octopus has two files: F' and
X, with F' being very similar to F, X, and Y.

There's no "modified from" header; it's not needed (unless we want to
add a new kind of noise header?)
We could emit a "copied from X" header, due to parent 2.
We could emit a "renamed from Y" header, due to parent 3.

Now, the question: In addition to the two "from" headers, how many
"to" headers do we emit?  In particular, do we emit both a "copied to
F" and a "renamed to F" header, or just a combined "renamed/copied to
F" header?  I'm inclined to go with the latter, to avoid giving the
idea that there are multiple targets, but maybe folks expect there to
be one "rename to" and "copy to" for each "rename from" or "copy from"
that appeared.
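
A rough sketch of the two alternatives, to show what each would print
for the octopus example.  This is only an illustration: the function
`extended_headers`, its `coalesce` flag, and the coalesced "copy/rename
to" line are hypothetical and do not describe anything git emits today.
The "rename from"/"copy from" spelling follows git's existing two-way
diff headers.

```python
def extended_headers(per_parent, target, coalesce=True):
    """Build hypothetical extended header lines for a combined diff.

    per_parent: list of (status, source_path), one entry per parent,
                with status one of 'M', 'R', 'C'.
    target:     the single path in the merge result.
    coalesce:   True  -> one combined "to" line,
                False -> one "to" line per "from" line.
    """
    verbs = {"R": "rename", "C": "copy"}
    lines = []
    kinds = []
    for status, source in per_parent:
        verb = verbs.get(status)
        if verb is None:
            continue  # no header for an in-place modification
        lines.append(f"{verb} from {source}")
        kinds.append(verb)
    if coalesce:
        if kinds:
            lines.append("/".join(sorted(set(kinds))) + f" to {target}")
    else:
        lines.extend(f"{verb} to {target}" for verb in kinds)
    return lines


# Parent 1 modified F, parent 2 had X (still present), parent 3 had Y (gone).
parents = [("M", "F"), ("C", "X"), ("R", "Y")]
print(extended_headers(parents, "F"))
# ['copy from X', 'rename from Y', 'copy/rename to F']
print(extended_headers(parents, "F", coalesce=False))
# ['copy from X', 'rename from Y', 'copy to F', 'rename to F']
```

The second form repeats the same target path on two lines, which is the
part that could be read as two different targets.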

> So, again, do not spend too much effort to emit this textual info
> that can be easily seen with the N+1 plus/minus header lines.