
Re: Invalid UTF-8 byte? (was: Re: utf)




On Wednesday, April 04, 2018 02:10:16 PM Jonathan de Boyne Pollard wrote:
> rhkramer:
> > The reason I wanted such a byte was to use it as a record separator in
> > a set of text files (that I use as an askSam "workalike" (or
> > "worksimilar")) so that I could use msort (which depends on a 1-byte
> > record separator to --separate the records ;-) while sorting. Some of
> > the files already include UTF-8, and, in the future, I anticipate all
> > will be in UTF-8.
> 
> Note that ISO 646, hence ISO 8859, hence ISO 10646, has had a
> single-byte Record Separator character since the 1960s.  (-:

OK, thanks, I see that is decimal 30, hex 1E.

A quick look at the UTF-8 table in the Wikipedia article on UTF-8 seems to 
indicate that byte is a valid UTF-8 byte, which makes it unsuitable for my use.
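
For anyone who wants to check this without reading the table, here is a
minimal Python sketch (my own illustration, not something msort provides;
the function name is made up). It encodes every Unicode code point and
collects the byte values that never show up in well-formed UTF-8. It
confirms that hex 1E is a valid byte, and lists the ones that are not:

```python
def bytes_never_in_utf8():
    """Return the byte values that occur in no well-formed UTF-8 sequence."""
    seen = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points cannot be encoded as UTF-8
        seen.update(chr(cp).encode("utf-8"))
    return sorted(set(range(256)) - seen)


invalid = bytes_never_in_utf8()

# The ASCII Record Separator decodes cleanly, so it is a valid UTF-8 byte.
rs_is_valid = b"\x1e".decode("utf-8") == "\x1e"

print("0x1E valid in UTF-8:", rs_is_valid)
print("never valid:", " ".join(f"{b:02X}" for b in invalid))
```

By this check the only bytes that can never appear are C0, C1 and F5
through FF. Whether msort accepts one of those as its separator is
something I have not verified.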