Re: [PATCH] t4062: stop using repetition in regex




René Scharfe <l.s.r@xxxxxx> writes:

> Am 09.08.2017 um 19:47 schrieb Junio C Hamano:
>> René Scharfe <l.s.r@xxxxxx> writes:
>> 
>>> There could be any characters except NUL and LF between the 4096 zeros
>>> and "0$" for the latter to match wrongly, no?  So there are 4095
>>> opportunities for the misleading pattern in a page, with probabilities
>>> like this:
>>>
>>>    0$                          1/256 * 2/256
>>>    .0$         254/256       * 1/256 * 2/256
>>>    ..0$       (254/256)^2    * 1/256 * 2/256
>>>    .{3}0$     (254/256)^3    * 1/256 * 2/256
>>>
>>>    .{4094}0$  (254/256)^4094 * 1/256 * 2/256
>>>
>>> That sums up to ca. 1/256 (did that numerically).  Does that make
>>> sense?
>> 
>> Yes, thanks.  I think the number would be different for "^0*$" (the
>> above is for "0$") and moves it down to ~1/30000, but as I said,
>> allowing additional false success rate is unnecessary (even if it is
>> minuscule enough to be acceptable), so let's take the 64*64 patch.
>
> Ah, right, now I get your calculation in the email I replied to above.
> "^0*$" has a probability of 2/255 to produce false positives.

Yes, and that is larger than the 2/256 we would have to accept with the
original "^0{4096}$" or the updated "^(0{64}){64}$" by ~1/30000,
which is an unnecessary additional false success rate.
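
For anyone who wants to re-check the numbers in this thread, here is a
minimal sketch. It assumes every byte is uniformly random and independent,
which is a simplification, and it mirrors the quoted table term by term
rather than modelling the regex engine. The function names are made up for
illustration.

```python
from fractions import Fraction

# Assumption: each byte is uniformly random and independent of the others.
P_ZERO = Fraction(1, 256)   # the byte is "0"
P_END = Fraction(2, 256)    # the byte is NUL or LF, i.e. it ends the line
P_OTHER = Fraction(254, 256)  # the byte is neither NUL nor LF


def p_unanchored(max_prefix=4094):
    """Approximate chance that "0$" matches wrongly, following the quoted
    table: k filler bytes, then "0", then a terminator, for k = 0..max_prefix.
    Floats are used here because only the rough size of the sum matters."""
    term = float(P_ZERO * P_END)
    ratio = float(P_OTHER)
    return sum(ratio ** k * term for k in range(max_prefix + 1))


def p_exact():
    """False-positive chance for "^0{4096}$" or "^(0{64}){64}$": the byte
    right after the 4096 zeros must be a terminator."""
    return P_END


def p_star():
    """False-positive chance for "^0*$": any number of extra "0" bytes, then
    a terminator. Geometric series: sum over k >= 0 of (1/256)^k * 2/256."""
    return P_END / (1 - P_ZERO)


if __name__ == "__main__":
    print("0$        ~", p_unanchored())
    print("^0{4096}$ =", p_exact())
    print("^0*$      =", p_star())
    print("difference =", p_star() - p_exact())
```

Under these assumptions the difference between "^0*$" and the exact-count
patterns comes out as exactly 1/32640, which is the "~1/30000" above.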

Thanks.