Web lists-archives.com

Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

Rafael Ascensão <rafa.almas@xxxxxxxxx> writes:

> When `log --decorate` is used, git will decorate commits with all
> available refs. While in most cases this the desired effect, under some
> conditions it can lead to excessively verbose output.


> Using `--exclude=<pattern>` can help mitigate that verboseness by
> removing unnecessary 'branches' from the output. However, if the tip of
> an excluded ref points to an ancestor of a non-excluded ref, git will
> decorate it regardless.

Is this even relevant?  I think the above would only serve to
confuse the readers.  --exclude, --branches, etc. are ways to
specify what starting points "git log" history traversal should
begin and has nothing to do with what set of refs are to be used to
decorate the commits that are shown.  But the paragraph makes
readers wonder if it might have any effect in some circumstances.

> With `--decorate-refs=<pattern>`, only refs that match <pattern> are
> decorated while `--decorate-refs-exclude=<pattern>` allows to do the
> reverse, remove ref decorations that match <pattern>

And "Only refs that match ... are decorated" is also confusing.  The
thing is, refs are never decorated, they are used for decorating
commits in the output from "git log".  For example, if you have 


and B is at the tip of the 'master' branch, the output from "git log
D" would decorate B with 'master', even if you do not say 'master'
on the command line as the commit to start the traversal from.

Perhaps drop the irrelevant paragraph about "--exclude" and write
something like this instead?

	When "--decorate-refs=<pattern>" is given, only the refs
	that match the pattern is used in decoration.  The refs that
	match the pattern, when "--decorate-refs-exclude=<pattern>"
	is given, are never used in decoration.

> Both can be used together but --decorate-refs-exclude patterns have
> precedence over --decorate-refs patterns.

A reasonable and an easy-to-explain way to mix zero or more positive
and zero or more negagive patterns that follows the convention used
elsewhere in the system (e.g. how negative pathspecs work) is

 (1) if there is no positive pattern given, pretend as if an
     inclusive default positive pattern was given;

 (2) for each candidate, reject it if it matches no positive
     pattern, or if it matches any one of negative patterns.

For pathspecs, we use "everything" as the inclusive default positive
pattern, I think, and for the set of refs used for decoration, a
reasonable choice would also be to use "everything", which matches
the current behaviour.

> The pattern follows similar rules as `--glob` except it doesn't assume a
> trailing '/*' if glob characters are missing.

Why should this be a special case that burdens users to remember one
more rule?  Wouldn't users find "--decorate-refs=refs/tags" useful
and it woulld be shorter and nicer than having to say "refs/tags/*"?

> diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
> index 32246fdb0..314417d89 100644
> --- a/Documentation/git-log.txt
> +++ b/Documentation/git-log.txt
> @@ -38,6 +38,18 @@ OPTIONS
>  	are shown as if 'short' were given, otherwise no ref names are
>  	shown. The default option is 'short'.
> +--decorate-refs=<pattern>::
> +	Only print ref names that match the specified pattern. Uses the same
> +	rules as `git rev-list --glob` except it doesn't assume a trailing a
> +	trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
> +	`--decorate-refs-exlclude` has precedence.
> +
> +--decorate-refs-exclude=<pattern>::
> +	Do not print ref names that match the specified pattern. Uses the same
> +	rules as `git rev-list --glob` except it doesn't assume a trailing a
> +	trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
> +	Has precedence over `--decorate-refs`.

These two may be technically correct, but I wonder if we can make it
easier to understand (I found "precedence" bit hard to follow, as in
my mind, these are ANDed conditions and between (A & ~B), there is
no "precedence").  Also we'd want to clarify what happens when only
"--decorate-refs-exclude"s are given, which in turn necessitates us
to describe what happens when only "--decorate-refs"s are given.

> diff --git a/log-tree.c b/log-tree.c
> index cea056234..8efc7ac3d 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -94,9 +94,33 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
>  {
>  	struct object *obj;
>  	enum decoration_type type = DECORATION_NONE;
> +	struct ref_include_exclude_list *filter = (struct ref_include_exclude_list *)cb_data;
> +	struct string_list_item *item;
> +	struct strbuf real_pattern = STRBUF_INIT;
> +
> +	if(filter && filter->exclude->nr > 0) {

Have SP before '('.

> +		/* if current ref is on the exclude list skip */
> +		for_each_string_list_item(item, filter->exclude) {
> +			strbuf_reset(&real_pattern);
> +			normalize_glob_ref(&real_pattern, NULL, item->string, 0);
> +			if (!wildmatch(real_pattern.buf, refname, 0))
> +				goto finish;
> +		}
> +	}
> -	assert(cb_data == NULL);
> +	if (filter && filter->include->nr > 0) {
> +		/* if current ref is present on the include jump to decorate */
> +		for_each_string_list_item(item, filter->include) {
> +			strbuf_reset(&real_pattern);
> +			normalize_glob_ref(&real_pattern, NULL, item->string, 0);
> +			if (!wildmatch(real_pattern.buf, refname, 0))
> +				goto decorate;
> +		}
> +		/* Filter was given, but no match was found, skip */
> +		goto finish;
> +	}

The above seems to implement the natural mixing of negative and
positive patterns, which is good.

Unless I am missing something, I think these normalize_grob_ref()
calls should be removed from this function; add_ref_decoration() is
called once for EVERY ref the repository has, so you are normalizing
a handful of patterns you got from the user over and over to get the
same normalization, possibly thousands of times in a repository of a
project with long history.

You have finished collecting patterns on filter->{exclude,include}
list from the user by the time "for_each_ref(add_ref_decoration)" is
called in load_ref_decorations(), and these patterns never changes
after that.  

Perhaps normalize the patterns inside load_ref_decorations() only
once and have the normalized patterns in the filter lists?

> +decorate:
>  	if (starts_with(refname, git_replace_ref_base)) {
>  		struct object_id original_oid;
>  		if (!check_replace_refs)
> @@ -136,6 +160,9 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
>  			parse_object(&obj->oid);
>  		add_name_decoration(DECORATION_REF_TAG, refname, obj);
>  	}
> +
> +finish:
> +	strbuf_release(&real_pattern);
>  	return 0;
>  }
> @@ -148,15 +175,15 @@ static int add_graft_decoration(const struct commit_graft *graft, void *cb_data)
>  	return 0;
>  }
> -void load_ref_decorations(int flags)
> +void load_ref_decorations(int flags, struct ref_include_exclude_list *data)
>  {
>  	if (!decoration_loaded) {
>  		decoration_loaded = 1;
>  		decoration_flags = flags;
> -		for_each_ref(add_ref_decoration, NULL);
> -		head_ref(add_ref_decoration, NULL);
> -		for_each_commit_graft(add_graft_decoration, NULL);
> +		for_each_ref(add_ref_decoration, data);
> +		head_ref(add_ref_decoration, data);
> +		for_each_commit_graft(add_graft_decoration, data);

Don't name that variable "data".

for_each_*() and friends that take a callback with callback specific
data MUST call the callback specific data as generic, e.g. cb_data,
because they do not know what they are passing.  The callers of
these functions, like this one, however, know what they are passing.
Also load_ref_decorations() itself knows what its second parameter

    void load_ref_decorations(int flags, struct decoration_filter *filter)

or something (see below).

>  	}
>  }
> diff --git a/log-tree.h b/log-tree.h
> index 48f11fb74..66563af88 100644
> --- a/log-tree.h
> +++ b/log-tree.h
> @@ -7,6 +7,10 @@ struct log_info {
>  	struct commit *commit, *parent;
>  };
> +struct ref_include_exclude_list {
> +	struct string_list *include, *exclude;
> +};

The "decoration" is not the only thing related to "ref" in the
log-tree API; calling this structure that filters what refs to be
used for decoration with the above name without saying that this is
about "decoration" is too selfish and unmaintainable.

How about "struct decoration_filter" and rename the fields to say
"{include,exclude}_ref_pattern" or something like that?  The
renaming of the fields to include "ref" somewhere is coming from the
same concern---it will be selfish and narrow-minded to imagine that
the ways to filter refs used for decoration will stay forever only
based on refnames and nothing else, which would be the reason not to
have "ref" somewhere in the names.