Re: [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too

On Thu, Apr 05, 2018 at 10:34:43AM -0700, Elijah Newren wrote:

> Even if a directory doesn't match a pathspec, it is possible, depending
> on the precise pathspecs, that some file underneath it might.  So we
> special case and recurse into the directory for such situations.  However,
> we previously always added any untracked directory that we recursed into
> to the list of untracked paths, regardless of whether the directory
> itself matched the pathspec.
> For the case of git-clean and a set of pathspecs of "dir/file" and "more",
> this caused a problem because we'd end up with dir entries for both of
>   "dir"
>   "dir/file"
> Then correct_untracked_entries() would try to helpfully prune duplicates
> for us by removing "dir/file" since it's under "dir", leaving us with
>   "dir"
> Since the original pathspec only had "dir/file", the only entry left
> doesn't match and leaves nothing to be removed.  (Note that if only one
> pathspec was specified, e.g. only "dir/file", then the common_prefix_len
> optimizations in fill_directory would cause us to bypass this problem,
> making it appear in simple tests that we could correctly remove manually
> specified pathspecs.)

It sounds like correct_untracked_entries() is doing the wrong thing, and
it should be aware of the pathspec-matching when culling entries. In
other words, my understanding was that read_directory() does not
necessarily promise to cull fully (which is what led to cf424f5fd in the
first place), and callers are forced to apply their own pathspecs.
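
To make the failure concrete, here is a toy model of the sequence
described in the quoted text. It is only a sketch and not git's actual
C code: the helper names (matches, prune_nested, clean_naive,
clean_pathspec_aware) are hypothetical, paths are plain strings, and
pathspec matching is reduced to "equal to the pathspec or underneath it".

```python
def matches(path, pathspecs):
    """Toy matcher: path equals a pathspec or lies underneath one."""
    return any(path == spec or path.startswith(spec.rstrip("/") + "/")
               for spec in pathspecs)


def prune_nested(entries):
    """Drop every entry that lies underneath another kept entry."""
    kept = []
    for path in sorted(entries):
        if any(path.startswith(parent + "/") for parent in kept):
            continue  # already covered by a kept parent directory
        kept.append(path)
    return kept


def clean_naive(entries, pathspecs):
    """Prune first, then apply pathspecs (models the reported bug)."""
    return [p for p in prune_nested(entries) if matches(p, pathspecs)]


def clean_pathspec_aware(entries, pathspecs):
    """Apply pathspecs first, so a non-matching parent cannot hide a child."""
    return prune_nested([p for p in entries if matches(p, pathspecs)])


# The case from the patch description: both "dir" and "dir/file" were
# reported as untracked, and the user asked for "dir/file" and "more".
entries = ["dir", "dir/file"]
pathspecs = ["dir/file", "more"]

print(clean_naive(entries, pathspecs))           # []
print(clean_pathspec_aware(entries, pathspecs))  # ['dir/file']
```

In the naive order, "dir" swallows "dir/file" and then fails the pathspec
check itself, so nothing is left to remove. Filtering by pathspec before
culling keeps "dir/file".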

The distinction is academic for this particular bug, but it makes me
wonder if there are other cases where "clean" needs to be more careful
with what comes out of dir.c.