
Re: [PATCH v2 2/2] mailinfo: support Unicode scissors




Jeff King <peff@xxxxxxxx> writes:

> In fact, I think you could then combine this with the previous
> conditional and get:
>
>   if (skip_prefix(c, ">8", &end) ||
>       skip_prefix(c, "8<", &end) ||
>       skip_prefix(c, ">%", &end) ||
>       skip_prefix(c, "%<", &end) ||
>       /* U+2702 in UTF-8 */
>       skip_prefix(c, "\xE2\x9C\x82", &end)) {
>           in_perforation = 1;
>           perforation += end - c;
>           scissors += end - c;
>           c = end - 1; /* minus one to account for loop increment */
>   }
>
> (Though I'm still on the fence regarding the whole idea, so do not take
> this as an endorsement ;) ).

I do not think we want to add more, but use of skip_prefix does
sound sensible.  I was very tempted to suggest

	static const char *scissors[] = {
		">8", "8<", ">%", "%<",
		NULL,
	};
	const char **s;

	for (s = scissors; *s; s++) {
		if (skip_prefix(c, *s, &end)) {
			in_perforation = 1;
			...
			break;
		}
	}
	if (!*s)
		... we are not looking at any of the scissors[] ...

but that would encourage adding more random entries to the array,
which we would want to avoid in order to help reduce the cognitive
load of end-users.
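
For reference, the table-driven lookup can be sketched as a standalone
function.  This is only an illustration under stated assumptions: the
skip_prefix() below is a local stand-in modelled on the helper of the same
name, and scissors_mark_len() and marks[] are hypothetical names that do
not exist in mailinfo.c.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in modelled on skip_prefix(); not the real helper. */
static int skip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Hypothetical helper: return the length of the scissors mark found at
 * the start of `c`, or 0 when none of the table entries matches.
 */
static size_t scissors_mark_len(const char *c)
{
	static const char *marks[] = {
		">8", "8<", ">%", "%<",
		NULL,
	};
	const char **s;
	const char *end;

	for (s = marks; *s; s++)
		if (skip_prefix(c, *s, &end))
			return (size_t)(end - c);
	return 0;	/* walked off the table: *s is NULL here */
}
```

A caller would add the returned length to its running counters and advance
the scan pointer by that much, treating 0 as "not looking at scissors".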

In hindsight, addition of an undocumented '%' was already a mistake.
I wonder how widely it is in use (yes, I am tempted to deprecate and
remove the two '%' forms to match the code to the docs).