Re: Invalid UTF-8 byte? (was: Re: utf)
- Date: Wed, 4 Apr 2018 15:27:21 -0400
- From: rhkramer@xxxxxxxxx
- Subject: Re: Invalid UTF-8 byte? (was: Re: utf)
On Wednesday, April 04, 2018 02:10:16 PM Jonathan de Boyne Pollard wrote:
> > The reason I wanted such a byte was to use it as a record separator in
> > a set of text files (that I use as an askSam "workalike" (or
> > "worksimilar")) so that I could use msort (which depends on a 1-byte
> > record separator to --separate the records ;-) while sorting. Some of
> > the files already include UTF-8, and, in the future, I anticipate all
> > will be in UTF-8.
> Note that ISO 646, hence ISO 8859, hence ISO 10646, has had a
> single-byte Record Separator character since the 1960s. (-:
OK, thanks, I see that is decimal 30, hex 1E.
A quick look at the UTF-8 table in the Wikipedia article on UTF-8 seems to
indicate that byte is a valid UTF-8 byte, which makes it unsuitable for my use.
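
The distinction above, between a byte that is valid UTF-8 (such as 0x1E, which on its own encodes U+001E) and a byte that never occurs in well-formed UTF-8, can be checked with a small script. This is only a rough sketch and not part of the original message; the function names is_valid_alone and bytes_never_used are made up for illustration, and it relies on Python's built-in UTF-8 codec being strict, which it is by default.

```python
# Sketch: which byte values can show up in well-formed UTF-8?
# Helper names below are hypothetical, chosen for this illustration only.

RS = 0x1E  # Record Separator from ISO 646 / ASCII: decimal 30, hex 1E


def is_valid_alone(b: int) -> bool:
    """Return True if the single byte b decodes as UTF-8 by itself."""
    try:
        bytes([b]).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def bytes_never_used() -> list:
    """Encode every Unicode scalar value and collect the byte values
    that do not appear in any of the encodings."""
    used = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not encodable in UTF-8
        used.update(chr(cp).encode("utf-8"))
    return sorted(set(range(256)) - used)


NEVER_USED = bytes_never_used()

print(RS, hex(RS))                      # 30 0x1e
print("0x1E valid alone:", is_valid_alone(RS))
print("0xFF valid alone:", is_valid_alone(0xFF))
print("never used:", [hex(b) for b in NEVER_USED])
```

Under these assumptions the script reports 0x1E as valid by itself, and lists the byte values that no well-formed UTF-8 text contains. Whether one of those is a practical separator for msort is a separate question that the script does not address.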