Web lists-archives.com

Re: [PATCH 11/12] grep: change the internal PCRE code & header names to be PCRE1




On Tue, Apr 11, 2017 at 12:48 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Tue, Apr 11, 2017 at 12:45:55PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Tue, Apr 11, 2017 at 12:37 PM, Jeff King <peff@xxxxxxxx> wrote:
>> > On Sat, Apr 08, 2017 at 01:25:05PM +0000, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >> diff --git a/builtin/grep.c b/builtin/grep.c
>> >> index 9478ab5dff..dffb9743b8 100644
>> >> --- a/builtin/grep.c
>> >> +++ b/builtin/grep.c
>> >> @@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt *opt,
>> >>       case GREP_PATTERN_TYPE_FIXED:
>> >>               argv_array_push(&submodule_options, "-F");
>> >>               break;
>> >> -     case GREP_PATTERN_TYPE_PCRE:
>> >> +     case GREP_PATTERN_TYPE_PCRE1:
>> >>               argv_array_push(&submodule_options, "-P");
>> >>               break;
>> >
>> > Hmm. This isn't a problem yet, but wouldn't this need to pass some
>> > pcre1-specific option instead of just "-P"?
>>
>> Yes, this is a bug. I'll need to add a git_options along with
>> submodule_options and pass -c grep.patternType=....
>
> Maybe that's an indication we should have --pcre1-regexp and
> --pcre2-regexp, so we don't have to resort to config tweaking.

I'd rather not. To reply to both your
<20170411103018.dkq5gangx3vcxhp4@xxxxxxxxxxxxxxxxxxxxx> & this, one
thing I was trying to do in this series (and I don't think I went far
enough in "grep & rev-list doc: stop promising libpcre for
--perl-regexp") was to stop promising some specific version of PCRE.

I.e. I think we should have the likes of core.patternType=perl & -P
for the user, but whether we implement that with pcre/pcre2, or even
libperl itself, or any of the other PCRE reimplementations around
(there's one for java, one for C#, cpython has one I think...), or
maybe even re2 should be an implementation detail.

I.e. as far as the user is concerned they just want perl-y regexes,
but they most likely don't care about the 1% featureset of those
regexes where the various implementations of "perl-y regex" actually
differ, because those cases tend to be really obscure syntax.

Having core.patternType=pcre2 is a neccesary evil for our own
implementation & testing, e.g. for the submodule grep it would be
really bizarro if some edge case where the implementations differ
causes us to produce different grep results, since one side would use
pcre2 & the other pcre1.

But I don't think we should take it to the level of having documented
--pcreN-regexp options.