Web lists-archives.com

Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic




On 10 April 2018 at 23:04, Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
> The File System Excludes module is a new programmatic way to exclude files and
> folders from git's traversal of the working directory.  fsexcludes_init() should
> be called with a string buffer that contains a NUL separated list of path names
> of the files and/or directories that should be included.  Any path not listed
> will be excluded. The paths should be relative to the root of the working
> directory and be separated by a single NUL.
>
> The excludes logic in dir.c has been updated to honor the results of
> fsexcludes_is_excluded_from().  If fsexcludes does not exclude the file, the
> normal excludes logic is also checked as it could further reduce the set of
> files that should be included.

Here you mention a change in dir.c...

>  Makefile     |   1 +
>  fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  fsexcludes.h |  27 +++++++
>  3 files changed, 238 insertions(+)

... but this patch does not seem to touch dir.c at all.

> +static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +       struct fsexcludes fse;
> +       char *slash;
> +
> +       /* Check straight mapping */
> +       strbuf_reset(&sb);

You could drop this strbuf_reset(). Or did you intend to use a static
struct strbuf?

> +       /*
> +        * Check to see if it matches a directory or any path
> +        * underneath it.  In other words, 'a/b/foo.txt' will match
> +        * '/', 'a/', and 'a/b/'.
> +        */
> +       slash = strchr(sb.buf, '/');
> +       while (slash) {
> +               fse.pattern = sb.buf;
> +               fse.patternlen = slash - sb.buf + 1;
> +               hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
> +               if (hashmap_get(map, &fse, NULL)) {
> +                       strbuf_release(&sb);
> +                       return 0;
> +               }
> +               slash = strchr(slash + 1, '/');
> +       }

Maybe a for-loop would make this slightly more obvious:

for (slash = strchr(sb.buf, '/'); slash; slash = strchr(slash + 1, '/'))

On second thought, maybe not.

> +       entry = buf = fsexcludes_data->buf;
> +       len = fsexcludes_data->len;
> +       for (i = 0; i < len; i++) {
> +               if (buf[i] == '\0') {
> +                       fsexcludes_hashmap_add(map, entry, buf + i - entry);
> +                       entry = buf + i + 1;
> +               }
> +       }
> +}

Very minor: I would have found "buf - entry + i" clearer here and later,
but I'm sure you'll find someone of the opposing opinion (e.g.,
yourself). ;-)

> +static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
> +{
> +       struct strbuf sb = STRBUF_INIT;
> +       struct fsexcludes fse;
> +
> +       /* Check for directory */
> +       strbuf_reset(&sb);

Same comment as above about this spurious reset.

> +       if (hashmap_get(map, &fse, NULL)) {
> +               strbuf_release(&sb);
> +               return 0;
> +       }
> +
> +       strbuf_release(&sb);
> +       return 1;
> +}
> +
> +/*
> + * Return 1 for exclude, 0 for include and -1 for undecided.
> + */
> +int fsexcludes_is_excluded_from(struct index_state *istate,
> +       const char *pathname, int pathlen, int dtype)
> +{

Will we at some point regret not being able to "return negative on
error"? I guess that would be "-2" or "negative other than -1".

> +void fsexcludes_init(struct strbuf *sb) {
> +       fsexcludes_initialized = 1;
> +       fsexcludes_data = *sb;
> +}

Grabbing the strbuf's members looks a bit odd. Is this
performance-sensitive enough that you do not want to make a copy? If a
caller releases its strbuf, which would normally be a good thing to do,
we may be in big trouble later. (Not only may .buf be stale, .len may
indicate we actually have something to read.)

I can understand that you do not want to pass a pointer+len, and that it
is not enough to pass sb.buf, since the string may contain nuls.

Maybe detach the original strbuf? That way, if a caller releases its
buffer, that is a no-op. A caller which goes on to use its buffer should
fail quickly and obviously. Right now, an incorrect caller would
probably fail more subtly and less reproducibly.

In any case, maybe document this in the .h-file?

Martin