Web lists-archives.com

Re: [PATCH v1 2/2] log: add option to choose which refs to decorate

On 04/11/17 03:49, Junio C Hamano wrote:
Rafael Ascensão <rafa.almas@xxxxxxxxx> writes:

Using `--exclude=<pattern>` can help mitigate that verboseness by
removing unnecessary 'branches' from the output. However, if the tip of
an excluded ref points to an ancestor of a non-excluded ref, git will
decorate it regardless.

Is this even relevant?  I think the above would only serve to
confuse the readers.  --exclude, --branches, etc. are ways to
specify what starting points "git log" history traversal should
begin and has nothing to do with what set of refs are to be used to
decorate the commits that are shown.  But the paragraph makes
readers wonder if it might have any effect in some circumstances.

With `--decorate-refs=<pattern>`, only refs that match <pattern> are
decorated while `--decorate-refs-exclude=<pattern>` allows to do the
reverse, remove ref decorations that match <pattern>

And "Only refs that match ... are decorated" is also confusing.  The
thing is, refs are never decorated, they are used for decorating
commits in the output from "git log". For example, if you have

and B is at the tip of the 'master' branch, the output from "git log
D" would decorate B with 'master', even if you do not say 'master'
on the command line as the commit to start the traversal from. >
Perhaps drop the irrelevant paragraph about "--exclude" and write
something like this instead?

	When "--decorate-refs=<pattern>" is given, only the refs
	that match the pattern is used in decoration.  The refs that
	match the pattern, when "--decorate-refs-exclude=<pattern>"
	is given, are never used in decoration.

What you explained was the reason I mentioned that. Because some users were wrongfully trying to remove decorations by trying to exclude the starting points. But I agree this adds little value and can generate further confusion. I will remove that section.

Both can be used together but --decorate-refs-exclude patterns have
precedence over --decorate-refs patterns.

A reasonable and an easy-to-explain way to mix zero or more positive
and zero or more negagive patterns that follows the convention used
elsewhere in the system (e.g. how negative pathspecs work) is

  (1) if there is no positive pattern given, pretend as if an
      inclusive default positive pattern was given;

  (2) for each candidate, reject it if it matches no positive
      pattern, or if it matches any one of negative patterns.

For pathspecs, we use "everything" as the inclusive default positive
pattern, I think, and for the set of refs used for decoration, a
reasonable choice would also be to use "everything", which matches
the current behaviour.

That's a nice explanation that fits the current "--decorate-refs" behavior.

The pattern follows similar rules as `--glob` except it doesn't assume a
trailing '/*' if glob characters are missing.

Why should this be a special case that burdens users to remember one
more rule?  Wouldn't users find "--decorate-refs=refs/tags" useful
and it woulld be shorter and nicer than having to say "refs/tags/*"?

I wanted to allow exact patterns like:
"--decorate-refs=refs/heads/master" and for that I disabled the flag that adds the trailing '/*' if no globs are found. As a side effect, I lost the shortcut.

Is adding a yet another flag that appends '/*' only if the pattern equals "refs/{heads,remotes,tags}" a good idea?

Because changing the default behavior of that function has implications on multiple commands which I think shouldn't change. But at the same time, would be nice to have the logic that deals with glob-ref patterns all in one place.

What's the sane way to do this?

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 32246fdb0..314417d89 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -38,6 +38,18 @@ OPTIONS
  	are shown as if 'short' were given, otherwise no ref names are
  	shown. The default option is 'short'.
+	Only print ref names that match the specified pattern. Uses the same
+	rules as `git rev-list --glob` except it doesn't assume a trailing a
+	trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
+	`--decorate-refs-exlclude` has precedence.
+	Do not print ref names that match the specified pattern. Uses the same
+	rules as `git rev-list --glob` except it doesn't assume a trailing a
+	trailing '/{asterisk}' if pattern lacks '?', '{asterisk}', or '['.
+	Has precedence over `--decorate-refs`.

These two may be technically correct, but I wonder if we can make it
easier to understand (I found "precedence" bit hard to follow, as in
my mind, these are ANDed conditions and between (A & ~B), there is
no "precedence").  Also we'd want to clarify what happens when only
"--decorate-refs-exclude"s are given, which in turn necessitates us
to describe what happens when only "--decorate-refs"s are given.

I believe the same explanation mentioned earlier fits nicely here too.

diff --git a/log-tree.c b/log-tree.c
index cea056234..8efc7ac3d 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -94,9 +94,33 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
  	struct object *obj;
  	enum decoration_type type = DECORATION_NONE;
+	struct ref_include_exclude_list *filter = (struct ref_include_exclude_list *)cb_data;
+	struct string_list_item *item;
+	struct strbuf real_pattern = STRBUF_INIT;
+	if(filter && filter->exclude->nr > 0) {

Have SP before '('.

+		/* if current ref is on the exclude list skip */
+		for_each_string_list_item(item, filter->exclude) {
+			strbuf_reset(&real_pattern);
+			normalize_glob_ref(&real_pattern, NULL, item->string, 0);
+			if (!wildmatch(real_pattern.buf, refname, 0))
+				goto finish;
+		}
+	}
- assert(cb_data == NULL);
+	if (filter && filter->include->nr > 0) {
+		/* if current ref is present on the include jump to decorate */
+		for_each_string_list_item(item, filter->include) {
+			strbuf_reset(&real_pattern);
+			normalize_glob_ref(&real_pattern, NULL, item->string, 0);
+			if (!wildmatch(real_pattern.buf, refname, 0))
+				goto decorate;
+		}
+		/* Filter was given, but no match was found, skip */
+		goto finish;
+	}

The above seems to implement the natural mixing of negative and
positive patterns, which is good.

Unless I am missing something, I think these normalize_grob_ref()
calls should be removed from this function; add_ref_decoration() is
called once for EVERY ref the repository has, so you are normalizing
a handful of patterns you got from the user over and over to get the
same normalization, possibly thousands of times in a repository of a
project with long history.

You have finished collecting patterns on filter->{exclude,include}
list from the user by the time "for_each_ref(add_ref_decoration)" is
called in load_ref_decorations(), and these patterns never changes
after that.

Perhaps normalize the patterns inside load_ref_decorations() only
once and have the normalized patterns in the filter lists?

This would be what a sane person would do. This detail went over my head. Will move it to load_ref_decorations()

  	if (starts_with(refname, git_replace_ref_base)) {
  		struct object_id original_oid;
  		if (!check_replace_refs)
@@ -136,6 +160,9 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid,
  		add_name_decoration(DECORATION_REF_TAG, refname, obj);
+	strbuf_release(&real_pattern);
  	return 0;
@@ -148,15 +175,15 @@ static int add_graft_decoration(const struct commit_graft *graft, void *cb_data)
  	return 0;
-void load_ref_decorations(int flags)
+void load_ref_decorations(int flags, struct ref_include_exclude_list *data)
  	if (!decoration_loaded) {
decoration_loaded = 1;
  		decoration_flags = flags;
-		for_each_ref(add_ref_decoration, NULL);
-		head_ref(add_ref_decoration, NULL);
-		for_each_commit_graft(add_graft_decoration, NULL);
+		for_each_ref(add_ref_decoration, data);
+		head_ref(add_ref_decoration, data);
+		for_each_commit_graft(add_graft_decoration, data);

Don't name that variable "data".

for_each_*() and friends that take a callback with callback specific
data MUST call the callback specific data as generic, e.g. cb_data,
because they do not know what they are passing.  The callers of
these functions, like this one, however, know what they are passing.
Also load_ref_decorations() itself knows what its second parameter

     void load_ref_decorations(int flags, struct decoration_filter *filter)

or something (see below).

diff --git a/log-tree.h b/log-tree.h
index 48f11fb74..66563af88 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -7,6 +7,10 @@ struct log_info {
  	struct commit *commit, *parent;
+struct ref_include_exclude_list {
+	struct string_list *include, *exclude;

The "decoration" is not the only thing related to "ref" in the
log-tree API; calling this structure that filters what refs to be
used for decoration with the above name without saying that this is
about "decoration" is too selfish and unmaintainable.

How about "struct decoration_filter" and rename the fields to say
"{include,exclude}_ref_pattern" or something like that?  The
renaming of the fields to include "ref" somewhere is coming from the
same concern---it will be selfish and narrow-minded to imagine that
the ways to filter refs used for decoration will stay forever only
based on refnames and nothing else, which would be the reason not to
have "ref" somewhere in the names.

I will make the corrections. Thanks for the feedback.