Re: Invalid UTF-8 byte? (was: Re: utf)




On Tue, 3 Apr 2018 15:47:57 -0400
Greg Wooledge <wooledg@xxxxxxxxxxx> wrote:

> On Tue, Apr 03, 2018 at 09:36:42PM +0200, Michael Lange wrote:
> > From what I have understood I think the OP should certainly at least,
> > whatever the files they want to include exactly look like and
> > whichever byte they choose as delimiter, scan the file first for such
> > a byte and if it is actually found replace it with either an empty
> > string or (probably better) some sort of "tag" before applying the
> > contents to the new database. This way they could at least be sure
> > that their chosen delimiter does not split one record into halves.
> 
> Or abort the program with an error message.

Yes, or ask the user what to do with that particular file :)
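
A minimal sketch of that pre-scan, assuming Python and a single-byte
delimiter. The helper name sanitize() and the "<NUL>" tag are made up for
illustration; nothing here is taken from the OP's actual program.

```python
from typing import Optional


def sanitize(data: bytes, delimiter: bytes = b"\x00",
             tag: Optional[bytes] = None) -> bytes:
    """Return `data` in a form that is safe to join with `delimiter`.

    If the delimiter byte occurs in `data`, it is replaced with `tag`
    when a tag is given; otherwise a ValueError is raised so the caller
    can abort or ask the user what to do with that file.
    """
    if len(delimiter) != 1:
        raise ValueError("delimiter must be exactly one byte")
    if delimiter not in data:
        return data  # nothing to do, the record cannot be split by accident
    if tag is None:
        offset = data.index(delimiter)
        raise ValueError(
            f"delimiter byte {delimiter!r} found at offset {offset}")
    if delimiter in tag:
        raise ValueError("tag must not contain the delimiter itself")
    return data.replace(delimiter, tag)


# Usage: every record survives the join/split round trip intact.
records = [b"first record", b"second\x00record"]
safe = [sanitize(r, tag=b"<NUL>") for r in records]
blob = b"\x00".join(safe)
assert blob.split(b"\x00") == safe
```

Calling sanitize() without a tag gives the "abort with an error message"
behaviour; catching the ValueError is where one could ask the user instead.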

Regards

Michael

> 
> > I have no idea what these "text files" look like of course. It just
> > seemed - to me - that the fact that the null byte cannot ever be part
> > of a file name might make it slightly more appropriate for this
> > purpose than other candidate bytes. Of course, it depends...
> 
> NUL bytes are an excellent choice for delimiter in lots of situations.
> 
> Of course, we still need the OP to tell us what the actual situation is.
> 
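
To illustrate the file name case only (we still don't know the OP's real
data), here is a small sketch in Python of NUL-terminated records, the same
layout that `find -print0` emits. The function names pack_names() and
unpack_names() are hypothetical.

```python
from typing import Iterable, List


def pack_names(names: Iterable[bytes]) -> bytes:
    """Concatenate file names, terminating each one with a NUL byte."""
    out = []
    for name in names:
        if b"\x00" in name:
            # Cannot happen for real file names, but check anyway.
            raise ValueError(f"NUL byte inside name {name!r}")
        out.append(name + b"\x00")
    return b"".join(out)


def unpack_names(blob: bytes) -> List[bytes]:
    """Split a NUL-terminated blob back into the original names."""
    if not blob:
        return []
    if not blob.endswith(b"\x00"):
        raise ValueError("blob is truncated: last record has no terminator")
    # The final split element is the empty string after the last NUL.
    return blob.split(b"\x00")[:-1]


# Names containing spaces or even newlines come back unchanged.
names = [b"plain.txt", b"with space.txt", b"odd\nname.txt"]
assert unpack_names(pack_names(names)) == names
```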



.-.. .. ...- .   .-.. --- -. --.   .- -. -..   .--. .-. --- ... .--. . .-.

Peace was the way.
		-- Kirk, "The City on the Edge of Forever", stardate unknown