Re: [PATCH] blame: add the ability to ignore commits
- Date: Tue, 8 Jan 2019 11:27:42 -0500
- From: Barret Rhoden <brho@xxxxxxxxxx>
- Subject: Re: [PATCH] blame: add the ability to ignore commits
On 2019-01-07 at 15:13 Junio C Hamano <gitster@xxxxxxxxx> wrote:
> If I read it correctly, this gives a very limited form of -S, in the
> sense that anything this can do can be expressed by using -S but the
> reverse is not true, but is designed to be easier to use, in the
> sense that unlike -S, this does not have to describe the part of the
> history you do not have to lie about. The documentation should say
> something about these pros-and-cons to help readers decide which
> feature to use.
Yeah, -S lists the revs to use, this lists the revs to *not* use. I'll
add a note.
> I somehow feel that this is rare enough that it should not squat on
> short-and-sweet '-i'. We would want to reserve it to something like
> "--ignore-case", for example.
Can do. I'll change the interface to your suggestion from down below.
> > The file .git-blame-ignore-revs is checked by default.
> Giving the projects a way to easily help participants to share the
> same distorted view of the history is a good idea, but I do not
> think we should allow projects to do so to those who merely clone it
> without their consent. IOW, "by default" is a terrible idea.
> Perhaps paying attention to blame.skipRevsFile configuration
> variable that is set by the end user would be sufficient----the
> project can ship .blame-skip-revs (or any random filename of their
> choice) in-tree as tracked contents and tell its users that they can
> optionally use it.
A blame config option works for me. I'd like the users/cloners of a
repo to not need to do anything extravagant, but a one-time config
would be fine.
> > It's useful to be alerted to the presence of an ignored commit in the
> > history of a line. Those lines will be marked with '*' in the
> > non-porcelain output. The '*' is attached to the line number to keep
> > from breaking tools that rely on the whitespace between columns.
> A policy decision like the above two shouldn't be hardcoded in the
> feature like this, but should be done as a separate option. By
> default, these shouldn't be marked with '*', as the same tools you
> said you are afraid of breaking would be expecting a word with only
> digits and no asterisk in the column where line numbers appear and
> will get broken by this change if done unconditionally.
Since users are already opting-in to the blame-ignore, do you also want
them to opt-in to the annotation? I can make a separate config option
to turn on the annotation. Any preference for how it is marked?
> In general, I find this patch trying to change too many things at
> the same time, without sufficient justification. Perhaps do these
> different things as separate steps in a single series?
> > A blame_entry attributed to an ignored commit will get passed to its
> > parent.
> Obviously, an interesting consideration is what happens when a merge
> commit is skipped. Is it sufficient to change this description to
> "...will get passed to its parentS", or would the code do completely
> nonsensical things without further tweaks (a possible simple tweak
> could be to refuse skipping merges)?
If we skip a merge commit, it might pick the wrong parent. For
example, this line was brought in via a merge:
$ ~/src/git/git-blame include/linux/mm.h | grep VM_SYNC
b6fb293f2497a (Jan Kara 2017-11-01 16:36:41 +0100 204) #define VM_SYNC
It's from merge: a3841f94c7ec ("Merge tag 'libnvdimm-for-4.15', and if
we ignore it:
$ ~/src/git/git-blame -i a3841f94c7ecb include/linux/mm.h | grep VM_SYNC
cc2383ec06be0 (Konstantin Khlebnikov 2012-10-08 16:28:37 -0700 204*) #define VM_SYNC
The wrong commit is blamed.
I can put a note in the doc about it and die if we're given a merge
commit. Is there a convenient helper for detecting a merge, or can I
just check for multiple parents? (something like commit->parents &&
> > If an ignored commit changed a line, an ancestor that changed
> > the line will get blamed. However, if an ignored commit added lines, a
> > commit changing a nearby line may get blamed. If no commit is found,
> > the original commit for the file will get blamed.
> The above somehow does not read as describing a carefully designed
> behaviour; rather, it sounds as if it is saying "the code does
> whatever random things it happens to do". For example, when there
> is a newly added line how is "A" commit changing a nearby line
> chosen (a line has lines before it and after it---they may be
> touched by different commits, and before and after that skipped
> commit, so there are multiple commits to choose from)?
This was more of a commentary about its behavior. If you ignore a
commit that added lines, it'd be nice to get a hint of what might have
caused it, and picking a commit that affected an adjacent line seemed
fine. But yeah, it's not doing anything crazy.
> > diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt
> > index 16323eb80e31..e41375374892 100644
> > --- a/Documentation/git-blame.txt
> > +++ b/Documentation/git-blame.txt
> > @@ -10,6 +10,7 @@ SYNOPSIS
> > [verse]
> > 'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental]
> > [-L <range>] [-S <revs-file>] [-M] [-C] [-C] [-C] [--since=<date>]
> > + [-i <rev>] [--no-default-ignores] [--ignore-file=<file>]
> > [--progress] [--abbrev=<n>] [<rev> | --contents <file> | --reverse <rev>..<rev>]
> > [--] <file>
> > @@ -84,6 +85,20 @@ include::blame-options.txt
> > Ignore whitespace when comparing the parent's version and
> > the child's to find where the lines came from.
> > +-i <rev>::
> > + Ignore revision when assigning blame. Lines that were changed by an
> > + ignored commit will be marked with a `*` in the blame output. Lines
> > + that were added by an ignored commit may be attributed commits making
> > + nearby changes or to the first commit touching the file.
> It probably deserves to be told that this option can be given
> multiple times and used cumulatively (unlike usual "last one wins"
> > +--no-default-ignores::
> > + Do not automatically ignore revisions in the file
> > + `.git-blame-ignore-revs`.
> This should not be "opt-out" like this.
> > +--ignore-file=<file>::
> > + Ignore revisions listed in `file`, one revision per line. Whitespace
> > + and comments beginning with `#` are ignored.
> Should it be capable to take two or more ignore-files? Or should we
> use the usual "the last one wins" rule?
> I think we should support blame.skipRevFile configuration variable
> so that the users do not have to constantly give the option from the
> command line. And with that, there is no need to have a hardcoded
> filename .git-blame-ignore-revs or anything like that.
> If we are to use configuration variable, however, we'd need a way to
> override its use from the command line, too. Perhaps a sane
> arrangement would be
> - if one or more --skip-revs-file=<file> are given from the
> command line, use all of them and ignore blame.skipRevsFile
> configuration variable.
> - if no --skip-revs-file=<file> is given from the command line,
> use blame.skipRevsFile configuration variable.
> - regardless of the above two, pay attention to --skip-rev=<rev>
> command line option(s).
Sounds fine to me.
> Somehow the damage to blame.c codebase looks way too noisy for what
> the code wants to do. If all we want is to pretend in a history,
> that commit B does not exist, i.e. use a distorted view of the
> wouldn't it be sufficient to modify pass_blame(), i.e. the only the
> caller of the pass_blame_to_parent(), where we find the parent
> commits of "C" and dig the history to pass blame to "B", and have it
> pretend as if "B" does not exist and pass blame directly to "A"?
I originally tried to skip 'B' in pass_blame() when B popped up as a
scapegoat. That broke the offsets of the blame_entries in the
parent. By running diff_hunks(), we get the opportunity to adjust the
offsets. Also, when it comes to marking the blame_entries for marking
the output, we want to know the specific diffs and to break them up at
the boundaries of [tlno,same) in blame_chunk().
> Thanks. I am personally not all that interested in this yet.
Thanks for taking a look.