Web lists-archives.com

Re: [PATCH 1/3] ls-tree: make <tree-ish> optional

Thanks for contributing to Git. As this seems to be your first
submission to the project, don't be alarmed by the extent and nature
of the review comments. They are intended to help you polish the
submission, and are not meant with ill-intent.

On Mon, Jul 2, 2018 at 11:58 PM Joshua Nelson <jyn514@xxxxxxxxx> wrote:
> use syntax similar to `git-checkout` to make <tree-ish> optional for
> `ls-tree`. if <tree-ish> is omitted, default to HEAD. infer arguments as
> follows:

Nit: Capitalize first word of each sentence.

This commit message explains what the patch changes (which is a good
thing to do), but it's missing the really important explanation of
_why_ this change is desirable. Without such justification, it's
difficult to judge if such a change is worthwhile. As this is a
plumbing command, people may need more convincing (especially if other
plumbing commands don't provide convenient defaults like this).

> 1. if args start with --
>         assume <tree-ish> to be HEAD
> 2. if exactly one arg precedes --, treat the argument as <tree-ish>
> 3. if more than one arg precedes --, exit with an error
> 4. if -- is not in args
>         a) if args[0] is a valid <tree-ish> object, treat is as such
>         b) else, assume <tree-ish> to be HEAD
> in all cases, every argument besides <tree-ish> is treated as a <path>

This and the other patches are missing your Signed-off-by: (which
should be placed right here).

The three patches of this series are all directly related to this one
change. As such, they should be combined into a single patch. (At the
very least, 1/3 and 2/3 should be combined; one could argue that 3/3
is lengthy enough to make it separate, but that's a judgment call.)

Now, on to the actual code...

> diff --git builtin/ls-tree.c builtin/ls-tree.c
> @@ -163,10 +163,39 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>             ((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
>                 ls_options |= LS_SHOW_TREES;
> +       const char *object;
> +       short initialized = 0;

This project frowns on declaration-after-statement. Move these to the
top of the {...} block where other variables are declared.

Why use 'short'? Unless there's a very good reason that this needs to
be a particular size, you shouldn't deviate from project norm, which
is to use the natural word size 'int'.

'initialized' is too generic, thus isn't a great name.

>         if (argc < 1)
> -               usage_with_options(ls_tree_usage, ls_tree_options);
> -       if (get_oid(argv[0], &oid))
> -               die("Not a valid object name %s", argv[0]);
> +               object = "HEAD";
> +       else {
> +               /* taken from checkout.c;
> +                * we have a simpler case because we never create a branch */

Style: Multi-line comments are formatted like this:

     * Gobble
     * wobble.

However, this comment doesn't belong in the code, as it won't be
particularly helpful to anyone reading the code in the future. This
sort of note would be more appropriate in the commit message or, even
better, in the commentary section just below the "---" lines following
your Signed-off-by:.

> +               short dash_dash_pos = -1, i = 0;

Same question about why you used 'short' rather than 'int' (especially
as 'dash_dash_pos' is declared 'int' in checkout.c).

Is there a good reason why you initialize 'i' in the declaration
rather than in the for-loop? A _good_ reason to do so in the for-loop
is that doing so makes the for-loop more idiomatic, reduces cognitive
load on readers (since they don't have to chase down the
initialization), and safeguards against someone adding new code
between the declaration and the for-loop which might inadvertently
alter the initial value.

> +               for (; i < argc; i++) {
> +                       if (!strcmp(argv[i], "--")) {
> +                               dash_dash_pos = i;
> +                               break;
> +                       }
> +               }
> +               if (dash_dash_pos == 0) {
> +                       object = "HEAD";
> +                       argv++, argc++;

'argc' is never accessed beyond this point, so changing its value is
pointless. Same observation for the 'else' arms. (And, even if there
was a valid reason to modify 'argc', I think you'd want to be
decrementing it, not incrementing it, to show that you've consumed an

> +               } else if (dash_dash_pos == 1) {
> +                       object = argv[0];
> +                       argv += 2, argc += 2;
> +               } else if (dash_dash_pos >= 2)
> +                       die(_("only one reference expected, %d given."), dash_dash_pos);
> +               else if (get_oid(argv[0], &oid)) // not a valid object

Style: Use /* comments */ in this codebase, not // comments.

> +                       object = "HEAD";
> +               else {
> +                       argv++, argc++;
> +                       initialized = 1;
> +               }
> +       }
> +
> +       if (!initialized) // if we've already run get_oid, don't run it again
> +               if (get_oid(object, &oid))
> +                       die("Not a valid object name %s", object);

Can't you accomplish the same without the 'initialized' variable by
instead initializing 'object' to NULL and then checking it here?

    if (object && get_oid(object, &oid))
        die(_("not a valid object name: %s", object);