Re: [idea] File history tracking hints
- Date: Mon, 11 Sep 2017 14:41:30 -0400
- From: Jeff King <peff@xxxxxxxx>
On Mon, Sep 11, 2017 at 10:11:31AM +0300, Pavel Kretov wrote:
> Unfortunately, the heuristic can only deal with simple file renames with
> no substantial content changes; it's helpless when you:
> - rename file and change its content significantly;
> - split single file into several files;
> - merge several files into another;
> - copy entire file from another commit, and do other things like these.
> However, if we're able to preserve this information, it's possible
> not only to do more accurate 'git blame', but also merge revisions with
> fewer conflicts.
This is definitely something that's been discussed before on the list
(though I'm not sure of the best keywords to dig for; Stefan found one
thread but I know there have been others).
And I don't think it's a totally unreasonable idea, but there are some
complications. The biggest one is that renames are really part of a
_diff_ between two endpoints. We think of them as attached to a commit
because we tend to talk about commits as a diff from state A to state B.
So obviously in the diff HEAD^ versus HEAD, we can look at the hints for
HEAD. But what about "git diff v1.0 v1.1", which may cover multiple
commits? Right now Git doesn't look at the intermediate commits at all.
And in fact we may not even know what they are, if the command is fed
two trees. Or the two endpoints may not have a sensible history (e.g.,
consider diffing between two branches, one of which has been rebased).
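To make the "renames are part of a diff" point concrete, here is a rough
sketch of endpoint-only rename detection. It is not Git's actual
algorithm: trees are modeled as plain dicts of path to content, and
difflib's SequenceMatcher stands in for Git's similarity scoring. The
function name and the 0.5 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair deleted paths with added paths by content similarity.

    Both trees are dicts mapping path -> file content (a stand-in for
    real tree objects). Only the two endpoints are consulted; nothing
    about the commits in between is available to this function.
    """
    deleted = [p for p in old_tree if p not in new_tree]
    added = [p for p in new_tree if p not in old_tree]
    renames = {}
    for src in deleted:
        best, best_score = None, threshold
        for dst in added:
            if dst in renames.values():
                continue  # this destination is already paired
            score = SequenceMatcher(None, old_tree[src], new_tree[dst]).ratio()
            if score >= best_score:
                best, best_score = dst, score
        if best is not None:
            renames[src] = best
    return renames


v1_0 = {"a.c": "int main(void) { return 0; }\n"}
v1_1 = {"b.c": "int main(void) { return 0; }\n"}
print(detect_renames(v1_0, v1_1))
```

Note that the signature takes two trees and nothing else, so there is no
obvious place for a per-commit hint to be fed in.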
But even if we had a sensible set of commits to pull hints from (e.g.,
if v1.0 and v1.1 were in a linear relationship), it's not clear to me
how you would want to apply them to an end-to-end diff.
So I don't think that this kind of tracking hint makes sense for a lot
of diffs (including merges, which use diffs between the endpoints and
the merge base).
Which isn't to say that they're useless. I agree that something like
"--follow" could benefit from an annotation that tells us when and how
to pick up the next step in the traversal. But of course somebody has to
make those annotations. If we had a tool to do it automatically, then we
could apply the same tool at run-time later.
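As a rough sketch of what such an annotation could mean for a
"--follow"-style traversal: walk a linear history newest-first, and when
a commit carries a hint for the path being followed, switch to the older
path. Everything here is hypothetical, including the "hints" and
"detected" fields and the dict-based commit model; Git has no such
annotation today.

```python
def follow(commits, path):
    """Track one path through a linear history, newest commit first.

    Each commit is a dict with:
      "id":       commit name
      "paths":    set of paths the commit touched
      "detected": optional {new_path: old_path} found by the rename heuristic
      "hints":    optional {new_path: old_path} recorded by hand (hypothetical)
    Returns a list of (commit id, path as it was named in that commit).
    """
    touched = []
    for commit in commits:
        if path not in commit["paths"]:
            continue
        touched.append((commit["id"], path))
        # Prefer the heuristic; fall back to the hand-made hint.
        older = commit.get("detected", {}).get(path)
        if older is None:
            older = commit.get("hints", {}).get(path)
        if older is not None:
            path = older  # pick up the traversal under the old name
    return touched


history = [
    {"id": "c3", "paths": {"parser.c"}},
    # parser.c was split out of lexer.c with heavy edits, so the
    # heuristic finds nothing and only the hint links the two.
    {"id": "c2", "paths": {"parser.c", "lexer.c"},
     "hints": {"parser.c": "lexer.c"}},
    {"id": "c1", "paths": {"lexer.c"}},
]
print(follow(history, "parser.c"))
```

Without the hint on c2, the walk would stop at c2 and never reach c1.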
But maybe if it were an optional annotation, people would want to use it
when the normal rename logic doesn't kick in. So perhaps a baby step in
this direction would be to teach something like "--follow" to "jump"
across a non-rename when it sees a special marking in the commit