
Re: [PATCH 2/2] test-lib: exhaustively insert non-alnum ASCII into the TRASH_DIRECTORY name




On Mon, Apr 10, 2017 at 10:02 AM, Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
> On Mon, Apr 10, 2017 at 3:47 AM, SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote:
>>> Change the test library to insert non-alphanumeric ASCII characters
>>> into the TRASH_DIRECTORY name, that's the directory the test library
>>> creates, chdirs to and runs each individual test from.
>>>
>>> Unless test_fails_on_unusual_directory_names=1 is declared before
>>> importing test-lib.sh (or if perl isn't available on the system), the
>>> trash directory will contain every non-alphanumeric character in
>>> ASCII, in order.
>>
>> At the very least there must be an easier way to disable this, e.g. a
>> command line option.
>>
>> This change is sure effective in smoking out bugs, but it's a major
>> annoyance during development when it comes to debugging a test.  At
>> first I could not even cd into the trash directory, because TAB
>> completing the directory name with all those non-printable characters
>> didn't work (this may be a bug in the bash-completion package).  And
>> simply copy-pasting the dirname didn't work either, because 'ls'
>>
>>   trash directory.t9902-completion.??????????????????????????????? !"#$%&'()*+,-:;<=>?@[\]^_`{|}~?

Btw, it seems most of the failures in t9902-completion are triggered
by remote URL parsing.  The trash directory's new name contains '[',
']' and even "@[", all of which are treated specially by
connect.c:host_end(), a helper function of parse_connect_url(),
which breaks anything that tries to do e.g.:

  git fetch "$(pwd)/other"

What puzzles me most is that parse_connect_url() recognizes right at
its beginning that a remote URL like this is not actually a URL, so
why does it continue parsing it as if it were one?

A few other failures are triggered by the ':' in the trash directory's
name, breaking the following commonly used pattern:

  export GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY" &&
  cd subdir &&
  test-git-pretending-it's-run-outside-of-a-repository

I think ':' should therefore be excluded from the trash directory's
name, too.
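
To illustrate why the ':' hurts: GIT_CEILING_DIRECTORIES is a
colon-separated list of absolute paths, so a single directory whose
name contains ':' is read as more than one entry.  A minimal sketch,
where the helper count_ceiling_entries and the sample path are made up
for illustration and only mimic the splitting, not what git does
internally:

```shell
# Hypothetical helper (not part of git or test-lib.sh): print how many
# entries a value splits into when ':' is the list separator.
# The function body is a subshell, so IFS and 'set -f' do not leak out.
count_ceiling_entries () (
	IFS=:
	set -f		# no globbing while the value is being split
	set -- $1	# unquoted on purpose: split on ':'
	echo $#
)

# Made-up trash directory name containing a ':'.
TRASH_DIRECTORY='/tmp/trash directory.t0000:x'
count_ceiling_entries "$TRASH_DIRECTORY"
```

With the sample path above, neither of the resulting entries names the
real trash directory, so the ceiling has no effect.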

>> After some headscratching, Sunday night may be my excuse, I figured
>> out that 'cd tr*' works...  only to be greeted with the ugliest-ever
>> three-line(!) shell prompt.
>>
>> Therefore I would say that this should not even be enabled by default
>> in test-lib.sh, so at least running a test directly from the command
>> line as ./t1234-foo.sh would considerately give us an easily
>> accessible trash directory even without any command line options.  We
>> could enable it for 'make test' by default via GIT_TEST_OPTS in
>> t/Makefile, though.
>
> This definitely needs some tweaking as you and Joachim point out. E.g.
> some capabilities check in the test suite to check if we can even
> create these sorts of paths on the local filesystem.
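
Such a capability check could probably be as simple as trying to
create and enter a directory with the candidate name.  A rough sketch
only: can_use_dirname is a made-up name, nothing like it exists in
test-lib.sh, and it assumes 'mktemp -d' is available:

```shell
# Hypothetical helper, not part of test-lib.sh: succeed only if a
# directory with the given name can be created and entered on the
# filesystem that holds the temporary directory.
can_use_dirname () {
	probe_parent=$(mktemp -d) || return 1
	if mkdir "$probe_parent/$1" 2>/dev/null &&
	   (cd "$probe_parent/$1") 2>/dev/null
	then
		result=0
	else
		result=1
	fi
	rm -rf "$probe_parent"
	return $result
}

# Example: probe a name with a TAB, ':' and brackets in it.
can_use_dirname "$(printf 'a\tb:c[d]')" && echo "unusual names are usable here"
```

The test library could then fall back to the plain trash directory
name whenever the probe fails.
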
>
> A couple of comments on the above though:
>
> a) If we have something that's a more strict mode that makes tests
> fail due to buggy code in various scenarios, we gain the most from
> having it on by default

I know, and I basically agree...

> and having some optional mode to have devs
> e.g. disable it for manual inspection of the test directories.

... but this is just too gross to live as the default outside of a CI
environment.

> Most of the running of the test suite that really matters, i.e. just
> before the software is delivered to end users, is going to be running
> in some non-interactive build system preparing a package.
>
> b) I think any sort of magic like using it with 'make test', but not
> when the *.sh is manually run, will just lead to frustrating seemingly
> heisenbugs from people trying to debug the test suite when things do
> fail, i.e. you run 'make test' on some obscure platform we haven't
> fixed path bugs on, 10 fail, you manually inspect them and every one
> of them succeeds, because some --use-garbage-dirs option wasn't
> passed.

That's not really an issue.  When a test fails during 'make test' with
garbage in trash dir names, the dev who attempts to cd into the trash
dir will be instantly reminded that non-printable characters might
play a role in the failure, because that can't be done by ordinary
means.

>>> This includes all the control characters, !, [], {} etc. the "."
>>> character isn't included because it's already in the directory name,
>>> and nor is "/" for obvious reasons, although that would actually work,
>>> we'd just create a subdirectory, which would make the tests harder to
>>> inspect when they fail.i
>>
>> 1. Heh.  How an additional subdirectory would make the tests harder to
>>    inspect is nothing compared to the effect of all the other
>>    characters.
>>
>> 2. s/i$//
>>
>>> This change is inspired by the "submodule: prevent backslash expantion
>>> in submodule names" patch[1]. If we'd had backslashes in the
>>> TRASH_DIRECTORY all along that bug would have been fixed a long time
>>> ago. This will flag such issues by marking tests that currently fail
>>> with "test_fails_on_unusual_directory_names=1", ensure that new tests
>>> aren't added unless a discussion is had about why the code can't
>>> handle unusual pathnames, and prevent future regressions.
>>>
>>> 1. <20170407172306.172673-1-bmwill@xxxxxxxxxx>
>>> ---
>>>  t/README      | 12 ++++++++++++
>>>  t/test-lib.sh |  4 ++++
>>>  2 files changed, 16 insertions(+)
>>>
>>> diff --git a/t/README b/t/README
>>> index ab386c3681..314dd40221 100644
>>> --- a/t/README
>>> +++ b/t/README
>>> @@ -345,6 +345,18 @@ assignment to variable 'test_description', like this:
>>>       This test registers the following structure in the cache
>>>       and tries to run git-ls-files with option --frotz.'
>>>
>>> +By default the tests will be run from a directory with a highly
>>> +unusual filename that includes control characters, a newline, various
>>> +punctuation etc., this is done to smoke out any bugs related to path
>>> +handling. If for whatever reason the tests can't deal with such
>>> +unusual path names, set:
>>> +
>>> +    test_fails_on_unusual_directory_names=1
>>> +
>>> +Before sourcing 'test-lib.sh' as described below. This option is
>>> +mainly intended to grandfather in existing broken tests & code, and
>>> +should usually not be used in new code, instead your tests or code
>>> +probably need fixing.
>>>
>>>  Source 'test-lib.sh'
>>>  --------------------
>>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>>> index 13b5696822..089ff5ac7d 100644
>>> --- a/t/test-lib.sh
>>> +++ b/t/test-lib.sh
>>> @@ -914,6 +914,10 @@ fi
>>>
>>>  # Test repository
>>>  TRASH_DIRECTORY="trash directory.$(basename "$0" .sh)"
>>> +if test -z "$test_fails_on_unusual_directory_names" -a "$(perl -e 'print 1+1' 2>/dev/null)" = "2"
>>> +then
>>> +   TRASH_DIRECTORY="$TRASH_DIRECTORY.$(perl -e 'print join q[], grep { /[^[:alnum:]]/ and !m<[./]> } map chr, 0x01..0x7f')"
>>> +fi
>>>  test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
>>>  case "$TRASH_DIRECTORY" in
>>>  /*) ;; # absolute path is good
>>> --
>>> 2.11.0
>>
>>
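
For reference, the perl one-liner in the hunk above keeps every
character from 0x01 to 0x7f that is not alphanumeric and is neither
'.' nor '/'.  The size of that set can be counted without perl; the
helper names below are made up for this sketch and it works on
character codes only, it does not build the actual directory name:

```shell
# Hypothetical helper: true if the given ASCII code is a digit or a letter.
is_alnum_code () {
	{ test "$1" -ge 48 && test "$1" -le 57; } ||	# 0-9
	{ test "$1" -ge 65 && test "$1" -le 90; } ||	# A-Z
	{ test "$1" -ge 97 && test "$1" -le 122; }	# a-z
}

# Count the codes in 1..127 that are not alphanumeric and are neither
# '.' (46) nor '/' (47), i.e. the codes the perl expression keeps.
count_unusual_chars () {
	count=0
	code=1
	while test "$code" -le 127
	do
		if ! is_alnum_code "$code" &&
		   test "$code" -ne 46 && test "$code" -ne 47
		then
			count=$((count + 1))
		fi
		code=$((code + 1))
	done
	echo "$count"
}

count_unusual_chars
```

That is 63 characters appended to every trash directory name, 32 of
them non-printable (0x01 to 0x1f plus DEL).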