Web lists-archives.com

Re: [PATCH v1] add status config and command line options for rename detection




Hi Ben,

Overall I think this is good, but I have lots of nit-picky things to
bring up.  :-)


On Wed, May 9, 2018 at 7:42 AM, Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=<n>] to enable detecting
> renames and optionaly setting the similarity index from the command line.

s/optionaly/optionally/

> Notes:
>     Base Ref:

No base ref?  ;-)

> +status.renameLimit::
> +       The number of files to consider when performing rename detection;
> +       if not specified, defaults to the value of diff.renameLimit.
> +
> +status.renames::
> +       Whether and how Git detects renames.  If set to "false",
> +       rename detection is disabled. If set to "true", basic rename
> +       detection is enabled.  Defaults to the value of diff.renames.

I suspect that status.renames should mention "copy", just as
diff.renames does.  (We didn't mention it in merge.renames, because
merge isn't an operation for which copy detection can be useful -- at
least not until the diffstat at the end of the merge is controlled by
merge.renames as well...)

Also, do these two config settings only affect 'git status', or does
it also affect the status shown when composing a commit message
(assuming the user hasn't turned commit.status off)?  Does it affect
`git commit --dry-run` too?

> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -109,6 +109,10 @@ static int have_option_m;
>  static struct strbuf message = STRBUF_INIT;
>
>  static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
> +static int diff_detect_rename = -1;
> +static int status_detect_rename = -1;
> +static int diff_rename_limit = -1;
> +static int status_rename_limit = -1;

Could we replace these four variables with just two: detect_rename and
rename_limit?  Keeping these separate invites people to write code
using only one of the settings rather than the appropriate inherited
mixture of them, resulting in a weird bug.  More on this below...

> @@ -1259,11 +1273,29 @@ static int git_status_config(const char *k, const char *v, void *cb)
>                         return error(_("Invalid untracked files mode '%s'"), v);
>                 return 0;
>         }
> +       if (!strcmp(k, "diff.renamelimit")) {
> +               diff_rename_limit = git_config_int(k, v);
> +               return 0;
> +       }
> +       if (!strcmp(k, "status.renamelimit")) {
> +               status_rename_limit = git_config_int(k, v);
> +               return 0;
> +       }

Here, since you're already checking diff.renamelimit first, you can
just set rename_limit in both blocks and not need both
diff_rename_limit and status_rename_limit.  (Similar can be said for
diff.renames/status.renames.)

<snip>

> @@ -1297,6 +1329,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>                   N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"),
>                   PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
>                 OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files in columns")),
> +               OPT_BOOL(0, "no-renames", &no_renames, N_("do not detect renames")),
> +               { OPTION_CALLBACK, 'M', "find-renames", &rename_score_arg,
> +                 N_("n"), N_("detect renames, optionally set similarity index"),
> +                 PARSE_OPT_OPTARG, opt_parse_rename_score },

Should probably also document these options in
Documentation/git-status.txt (and maybe Documentation/git-commit.txt
as well).

Not sure if we want to add a flag for copy detection (similar to
git-diff's -C/--find-copies and --find-copies-harder), or just leave
that for when someone finds a need.  If left out, might want to just
mention that it was considered and intentionally omitted for now in
the commit message.

> @@ -1336,6 +1372,27 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>         s.ignore_submodule_arg = ignore_submodule_arg;
>         s.status_format = status_format;
>         s.verbose = verbose;
> +       s.detect_rename = no_renames >= 0 ? !no_renames :
> +                                         status_detect_rename >= 0 ? status_detect_rename :
> +                                         diff_detect_rename >= 0 ? diff_detect_rename :

Combining the four vars into two as mentioned above would allow
combining the last two lines above into one.

> +       if ((intptr_t)rename_score_arg != -1) {

I don't understand why rename_score_arg is a (char*) and then you need
to cast to intptr_t, but that may just be because I haven't done much
of anything with option parsing.  A quick look at the docs isn't
making it clear to me, though; could you enlighten me?

> +               s.detect_rename = DIFF_DETECT_RENAME;

What if status.renames is 'copy' but someone wants to override the
score for detecting renames and pass --find-renames=40?  Does the
--find-renames override and degrade the 'copy'?  I think it'd make
more sense to increase s.detect_rename to at least DIFF_DETECT_RENAME,
rather than just outright setting it here.

> +               if (rename_score_arg)
> +                       s.rename_score = parse_rename_score(&rename_score_arg);
> +       }
> +       s.rename_limit = status_rename_limit >= 0 ? status_rename_limit :
> +                                        diff_rename_limit >= 0 ? diff_rename_limit :

Again, combination of variables could allow these last two lines to be combined.

> +                                        s.rename_limit;
> +
> +       /*
> +        * We do not have logic to handle the detection of copies.  In
> +        * fact, it may not even make sense to add such logic: would we
> +        * really want a change to a base file to be propagated through
> +        * multiple other files by a merge?
> +        */
> +       if (s.detect_rename > DIFF_DETECT_RENAME)
> +               s.detect_rename = DIFF_DETECT_RENAME;

This comment and code made sense in merge-recursive.c (which doesn't
show detected diffs/renames/copies but just uses them for internal
processing logic).  It does not make sense here; git status could show
detected copies much like `git diff -C --name-status` shows it.  In
fact, a quick grep for DIFF_STATUS_COPIED shows multiple hits in
wt-status.c, so I suspect it already has the necessary logic for
displaying copies.


I looked over the rest of the patch.  Nice testcases.  Adding a couple
more testcases around copy detection could be useful.