Re: [PATCH v2 2/2] mailinfo: support Unicode scissors




On Mon, Apr 01, 2019 at 11:53:34PM +0200, Andrei Rybak wrote:

> diff --git a/mailinfo.c b/mailinfo.c
> index f4aaa89788..804b07cd8a 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -701,6 +701,13 @@ static int is_scissors_line(const char *line)
>  			c++;
>  			continue;
>  		}
> +		if (starts_with(c, "\xE2\x9C\x82" /* U-2702 ✂ in UTF-8 */)) {
> +			in_perforation = 1;
> +			perforation += 3;
> +			scissors += 3;
> +			c += 2;
> +			continue;
> +		}

It might be worth using skip_prefix() instead of starts_with() to
compute the size automatically. E.g.:

  const char *end;

  if (skip_prefix(c, "\xE2\x9C\x82", &end)) {
	size_t len = end - c; /* no magic number needed! */
  }

In fact, I think you could then combine this with the previous
conditional and get:

  if (skip_prefix(c, ">8", &end) ||
      skip_prefix(c, "8<", &end) ||
      skip_prefix(c, ">%", &end) ||
      skip_prefix(c, "%<", &end) ||
      /* U+2702 in UTF-8 */
      skip_prefix(c, "\xE2\x9C\x82", &end)) {
          in_perforation = 1;
          perforation += end - c;
          scissors += end - c;
          c = end - 1; /* minus one to account for loop increment */
          continue;
  }

(Though I'm still on the fence regarding the whole idea, so do not take
this as an endorsement ;) ).

-Peff