Re: [PATCH v10 7/9] convert: check for detectable errors in UTF encodings

Lars Schneider <larsxschneider@xxxxxxxxx> writes:

> I would like to advise the dashed form as this seems to be the
> canonical form and it avoids cross platform issues. My macOS
> iconv does not support the form without dashes.

Sure, that is why I said canonicalization without inserting dash
does not make much sense, hence an interim step with only upcasing
is not a good idea.  A possible interim solution would be to do
nothing (no dash insertion, no upcasing) and fix both in a later
follow-up patch, but as I said, I do not care too strongly either
way.
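
To make "canonicalization" concrete, here is a minimal sketch of doing both steps (upcasing and dash insertion) at once. The helper name canonical_utf_name and its buffer-based interface are assumptions made for illustration; nothing like it appears in the patch under discussion.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch under discussion.
 * Writes the canonical spelling of an encoding name into out:
 * upcased, with a dash after a leading "UTF" if one is missing
 * (e.g. "utf16le" becomes "UTF-16LE").  Names that do not start
 * with "UTF" are only upcased.
 * Returns 0 on success, -1 if out is too small.
 */
static int canonical_utf_name(const char *enc, char *out, size_t outlen)
{
	size_t i, o = 0;
	int is_utf = strlen(enc) >= 3 &&
		toupper((unsigned char)enc[0]) == 'U' &&
		toupper((unsigned char)enc[1]) == 'T' &&
		toupper((unsigned char)enc[2]) == 'F';

	for (i = 0; enc[i]; i++) {
		/* Insert the dash right after "UTF" unless it is already there. */
		if (is_utf && i == 3 && enc[i] != '-') {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = '-';
		}
		/* Always leave room for the terminating NUL. */
		if (o + 1 >= outlen)
			return -1;
		out[o++] = (char)toupper((unsigned char)enc[i]);
	}
	if (o >= outlen)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With this sketch, "utf16le" and "UTF-16LE" both come out as "UTF-16LE", which is the dashed form Lars reports his macOS iconv accepts.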

> Would this approach work for you?
> 			const char *advise_msg = _(
> 				"The file '%s' contains a byte order "
> 				"mark (BOM). Please use UTF-%s as "
> 				"working-tree-encoding.");
> 			const char *stripped;
> 			char *upper = xstrdup_toupper(enc);
> 			upper[strlen(upper)-2] = '\0';
> 			if (!skip_prefix(upper, "UTF-", &stripped))
> 				skip_prefix(upper, "UTF", &stripped);
> 			advise(advise_msg, path, stripped);
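
The quoted snippet relies on git-internal helpers (xstrdup_toupper, skip_prefix, advise). For readers without the tree at hand, the same idea can be sketched stand-alone: drop the "UTF" or "UTF-" prefix and the two-letter BE/LE suffix, leaving only the width to put into the "Please use UTF-%s" message. The names starts_with_icase and utf_width_for_advice are hypothetical, and the explicit suffix check is an addition the quoted code does not have.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: case-insensitive prefix test. */
static int starts_with_icase(const char *s, const char *prefix)
{
	for (; *prefix; s++, prefix++)
		if (toupper((unsigned char)*s) != toupper((unsigned char)*prefix))
			return 0;
	return 1;
}

/*
 * Hypothetical stand-alone variant of the quoted snippet.
 * For a name such as "utf-16le" or "UTF32BE", copies the part
 * between the "UTF"/"UTF-" prefix and the BE/LE suffix ("16",
 * "32") into out.  Returns 0 on success, -1 if the name does not
 * have that shape or out is too small.
 */
static int utf_width_for_advice(const char *enc, char *out, size_t outlen)
{
	size_t len = strlen(enc);
	const char *p, *suffix;
	size_t n;

	/* Need "UTF", at least one more character, and a 2-letter suffix. */
	if (len < 6 || !starts_with_icase(enc, "UTF"))
		return -1;

	suffix = enc + len - 2;
	if (!starts_with_icase(suffix, "BE") && !starts_with_icase(suffix, "LE"))
		return -1;

	p = enc + 3;
	if (*p == '-')
		p++;

	n = (size_t)(suffix - p);
	if (n == 0 || n >= outlen)
		return -1;
	memcpy(out, p, n);
	out[n] = '\0';
	return 0;
}
```

Both "utf-16le" and "UTF32BE" are accepted here, so the advice message would name UTF-16 or UTF-32 regardless of how the attribute was spelled.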

Are you now interested in not having any interim step and jumping
directly to the endgame solution?  If so, that is fine by me, too,
but as I already said earlier (i.e. not doing this BOM check for an
encoding that is not spelled in your canonical upcase-with-dash form
might be a feature that leaves an escape hatch), I am not all that
interested in enforcing policy at this point in the codepath to
begin with, so...
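
To illustrate the "escape hatch" reading, here is a rough sketch of a BOM check that fires only when the encoding is spelled in the canonical upcase-with-dash form, so that any other spelling skips the check. The function names are made up for this example, and only the UTF-16 case is shown; the code in the series may well look different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does data begin with the given BOM bytes? */
static int has_bom_bytes(const char *data, size_t len,
			 const char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical sketch.  Returns 1 if enc is spelled exactly
 * "UTF-16BE" or "UTF-16LE" and data starts with a UTF-16 BOM,
 * which is the case the advice message is about.  Any other
 * spelling (e.g. "utf16le") returns 0: that is the escape hatch.
 */
static int bom_prohibited(const char *enc, const char *data, size_t len)
{
	static const char utf16_be_bom[] = { '\xFE', '\xFF' };
	static const char utf16_le_bom[] = { '\xFF', '\xFE' };

	if (strcmp(enc, "UTF-16BE") && strcmp(enc, "UTF-16LE"))
		return 0;

	return has_bom_bytes(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
	       has_bom_bytes(data, len, utf16_le_bom, sizeof(utf16_le_bom));
}
```

If the name were canonicalized before this comparison, the lowercase and dashless spellings would be checked as well, and the escape hatch would be gone.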