Web lists-archives.com

Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic

On 4/10/2018 6:09 PM, Martin Ågren wrote:
On 10 April 2018 at 23:04, Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
The File System Excludes module is a new programmatic way to exclude files and
folders from git's traversal of the working directory.  fsexcludes_init() should
be called with a string buffer that contains a NUL separated list of path names
of the files and/or directories that should be included.  Any path not listed
will be excluded. The paths should be relative to the root of the working
directory and be separated by a single NUL.

The excludes logic in dir.c has been updated to honor the results of
fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
normal excludes logic is also checked as it could further reduce the set of
files that should be included.

Here you mention a change in dir.c...

  Makefile     |   1 +
  fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
  fsexcludes.h |  27 +++++++
  3 files changed, 238 insertions(+)

... but this patch does not seem to touch dir.c at all.

Oops! Fixed in V2.

+static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
+       struct strbuf sb = STRBUF_INIT;
+       struct fsexcludes fse;
+       char *slash;
+       /* Check straight mapping */
+       strbuf_reset(&sb);

You could drop this strbuf_reset(). Or did you intend to use a static
struct strbuf?

Good point, fixed in V2.

+       /*
+        * Check to see if it matches a directory or any path
+        * underneath it.  In other words, 'a/b/foo.txt' will match
+        * '/', 'a/', and 'a/b/'.
+        */
+       slash = strchr(sb.buf, '/');
+       while (slash) {
+               fse.pattern = sb.buf;
+               fse.patternlen = slash - sb.buf + 1;
+               hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
+               if (hashmap_get(map, &fse, NULL)) {
+                       strbuf_release(&sb);
+                       return 0;
+               }
+               slash = strchr(slash + 1, '/');
+       }

Maybe a for-loop would make this slightly more obvious:

for (slash = strchr(sb.buf, '/'); slash; slash = strchr(slash + 1, '/'))

On second thought, maybe not.

+       entry = buf = fsexcludes_data->buf;
+       len = fsexcludes_data->len;
+       for (i = 0; i < len; i++) {
+               if (buf[i] == '\0') {
+                       fsexcludes_hashmap_add(map, entry, buf + i - entry);
+                       entry = buf + i + 1;
+               }
+       }

Very minor: I would have found "buf - entry + i" clearer here and later,
but I'm sure you'll find someone of the opposing opinion (e.g.,
yourself). ;-)

+static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
+       struct strbuf sb = STRBUF_INIT;
+       struct fsexcludes fse;
+       /* Check for directory */
+       strbuf_reset(&sb);

Same comment as above about this spurious reset.

Good point, fixed in V2.

+       if (hashmap_get(map, &fse, NULL)) {
+               strbuf_release(&sb);
+               return 0;
+       }
+       strbuf_release(&sb);
+       return 1;
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int fsexcludes_is_excluded_from(struct index_state *istate,
+       const char *pathname, int pathlen, int dtype)

Will we at some point regret not being able to "return negative on
error"? I guess that would be "-2" or "negative other than -1".

This function is modeled after the other is_excluded_from* functions in dir.c so that the return value can be handled the same way. I don't anticipate any need for change but you're right, we could return some other "negative other than -1" if it was ever needed.

+void fsexcludes_init(struct strbuf *sb) {
+       fsexcludes_initialized = 1;
+       fsexcludes_data = *sb;

Grabbing the strbuf's members looks a bit odd. Is this
performance-sensitive enough that you do not want to make a copy? If a
caller releases its strbuf, which would normally be a good thing to do,
we may be in big trouble later. (Not only may .buf be stale, .len may
indicate we actually have something to read.)

I can understand that you do not want to pass a pointer+len, and that it
is not enough to pass sb.buf, since the string may contain nuls.

Maybe detach the original strbuf? That way, if a caller releases its
buffer, that is a no-op. A caller which goes on to use its buffer should
fail quickly and obviously. Right now, an incorrect caller would
probably fail more subtly and less reproducibly.

In any case, maybe document this in the .h-file?

Great suggestion! I was looking for a better way to ensure the buffer ownership transfer was robust. I'll do both strbuf_detach() and update the header file. Thank you.