Re: UTF-8 character encoding




On 6/26/18, Michael Enright wrote:
> On Mon, Jun 25, 2018 at 11:33 AM, Lee wrote:
>> I'm still trying to figure utf-8 out, but it seems to me that 0x0 -
>> 0xff is part of the utf-8 encoding.
>
> I don't see how you arrived at this.

I screwed up trying to do hex in my head.  For whatever reason I
didn't want to write 0 - 127, which is 0x00 - 0x7f and not 0x0 - 0xff.

> An initial byte of 0xFF is not
> the initial byte of any valid UTF-8 byte sequence. And it doesn't
> conform with the statement you have later:

right, I screwed up :)
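
For what it's worth, here is a quick sketch of the corrected range in
Python. It isn't taken from the RFC; it only leans on the standard
bytes.decode() call, and the helper name decodes_alone is made up for
illustration. It checks which single bytes are complete UTF-8 sequences
on their own, and whether 0xFF is accepted.

```python
def decodes_alone(value: int) -> bool:
    """Return True if the single byte `value` is a complete, valid UTF-8 sequence."""
    try:
        bytes([value]).decode("utf-8")  # strict decoding raises on invalid input
        return True
    except UnicodeDecodeError:
        return False


# Collect every byte value that is valid UTF-8 all by itself.
single_byte_ok = [b for b in range(256) if decodes_alone(b)]

print(single_byte_ok[0], single_byte_ok[-1])  # lowest and highest standalone byte
print(decodes_alone(0xFF))                    # 0xFF never starts a valid sequence
```

With a strict decoder this should report 0 and 127 as the bounds and
reject 0xFF, which matches the 0 - 127 range rather than 0x0 - 0xff.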

> The standards such as IETF RFC-3629 are easy enough to read, so I
> recommend using them and citing them to others instead of trying to
> summarize.

Thanks for the RFC reference - I hadn't come across that one yet.

Lee

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple