Web lists-archives.com

Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item




On Thu, Apr 5, 2018 at 12:04 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, Apr 05, 2018 at 11:36:45AM -0700, Elijah Newren wrote:
>
>> > Do we care about matching the name "foo" against the patchspec_item "foo/"?
>> >
>> > That matches now, but wouldn't after your patch.
>>
<snip>
>> So I should probably make the check handle both cases:
>>
>> @@ -383,8 +383,9 @@ static int match_pathspec_item(const struct
>> pathspec_item *item, int prefix,
>>         /* Perform checks to see if "name" is a super set of the pathspec */
>>         if (flags & DO_MATCH_LEADING_PATHSPEC) {
>>                 /* name is a literal prefix of the pathspec */
>> +               int offset = name[namelen-1] == '/' ? 1 : 0;
>>                 if ((namelen < matchlen) &&
>> -                   (match[namelen] == '/') &&
>> +                   (match[namelen-offset] == '/') &&
>>                     !ps_strncmp(item, match, name, namelen))
>>                         return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
>
> That seems reasonable to me, and your "offset" trick here should prevent
> us from getting confused. Can namelen ever be zero here? I guess
> probably not (I could see an empty pathspec, but an empty path does not
> make sense).

Right, I don't see how an empty path would make sense.

> There are other similar trailing-slash matches in that function, but I'm
> not sure of all the cases in which they're used. I don't know if any of
> those would need similar treatment (sorry for being vague; I expect I'd
> need a few hours to dig into how the pathspec code actually works, and I
> don't have that today).

If it'd only take you a few hours, then you're a lot faster than me.
It took me a while to start wrapping my head around it.

The other trailing-slash matches in the function are all correct,
according to the testsuite.  (I'm not sure I like the
DO_MATCH_DIRECTORY stuff, but it is encoded in tests and backward
compatibility is important.)  In particular, changing the earlier code
to have the same offset trick would make it claim that e.g. either
"a/b" or "a/b/" as names match unconditionally against "a/b/c" as a
pathspec.  We need it to be conditional: we only want that to be
considered a match when checking whether we want to recurse into the
directory for other matches, not when checking whether the directory
itself matches the pathspec.  Thus, it should be behind a separate
flag, in a subsequent check, which is what this series does (namely
with DO_MATCH_LEADING_PATHSPEC).

To be more precise, here is how a matrix of pathnames and pathspecs
would be treated by match_pathspec_item(), where I am abbreviating
names like MATCH_RECURSIVELY_LEADING_PATHSPEC to LEADING):

                               Pathspecs
                |    a/b    |    a/b/    |   a/b/c
          ------+-----------+------------+-----------
          a/b   |  EXACT    | RECURSIVE  |  LEADING[3]
  Names   a/b/  |  EXACT[1] |  EXACT     |  LEADING[2]
          a/b/c | RECURSIVE | RECURSIVE  |  EXACT

[1] Only if DO_MATCH_DIRECTORY is passed.  Otherwise,
    this is NOT a match at all.
[2] Only if DO_MATCH_LEADING_PATHSPEC is passed,
    after applying this series.  Otherwise, not a match
    at all.
[3] Without the fix in this thread that you highlighted,
    and assuming we apply patch 7, this would actually
    mistakenly return RECURSIVE.


Now for a separate question: How much of the above would you like
added to the commit message...or even as a comment in the code to make
it clearer to other folks trying to make sense of it?


Elijah