
Re: Invalid UTF-8 byte? (was: Re: utf)

On Tuesday, April 03, 2018 07:54:35 AM Nicolas George wrote:
> rhkramer@xxxxxxxxx (2018-04-03):
> > Next I'll have to refresh my memory on how to replace the existing From
> > with From preceded by the null character, i.e., something like:
> > 
> > Find: \n\nFrom
> > Replace with \n\n0x00\nFrom
> 
> This is a very bad idea, and you are obviously about to reproduce the
> errors of the past.
> 
> You need to change your design. 

I don't understand what the errors of the past are, nor is it feasible for me 
to change my design.   Two points:

   * the other half of the replacement above is deleting the 0x00 again after 
sorting (unless leaving it in is totally innocuous, which it might be)

   * my design is what I'd call a mashup of existing programs that use and 
require a specific file format (basically, mbox).  The programs that I use 
include kmail, nail, recol, kate, and, in the future, any editor which uses 
Scintilla; I would also hope to be able to use any email program that can 
read mbox files.
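
To make the two halves of that replacement concrete, here is a minimal Python sketch of what I have in mind. The function names are made up for illustration, and matching "From " with a trailing space (as mbox separator lines have) is my assumption; the Find pattern quoted above has no trailing space. It works on bytes so that the 0x00 is a single byte regardless of the text encoding.

```python
# Hypothetical helpers illustrating the marker round trip; not part of any
# existing tool.

def add_markers(data: bytes) -> bytes:
    """Insert a line holding one NUL byte before each "From " line that
    follows a blank line."""
    return data.replace(b"\n\nFrom ", b"\n\n\x00\nFrom ")


def strip_markers(data: bytes) -> bytes:
    """Undo add_markers() once the sorting step is done."""
    return data.replace(b"\n\n\x00\nFrom ", b"\n\nFrom ")


sample = (
    b"From a@example.com Tue Apr  3 07:54:35 2018\n"
    b"Subject: one\n\nbody one\n\n"
    b"From b@example.com Tue Apr  3 08:00:00 2018\n"
    b"Subject: two\n\nbody two\n"
)
marked = add_markers(sample)
restored = strip_markers(marked)
```

Note that the first message in the file is not preceded by a blank line, so it gets no marker, and the round trip is only lossless if the file never contained that NUL sequence to begin with.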

> You obviously have free-form text, with
> possibly some rigid syntax but not enough for your needs. Therefore, you
> cannot use a delimiter inside the text. You have to put your structure
> outside the text.
> 
> It is very typical of the "problems" people have with UTF-8: the problem
> resides not in the properties of UTF-8 but in the unwritten assumptions
> about the way they should be implementing things.
> 
> Regards,
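
One possible reading of "put your structure outside the text", sketched in Python under my own assumptions: parse the mbox once with the standard library's mailbox module, keep each message as a separate object in a list, sort the list, and write a new mbox. The message boundaries then live in the list, not in an extra delimiter byte inside the text. The function name sort_mbox and the choice of sort key are hypothetical, and this may not be what was actually meant.

```python
import mailbox


def sort_mbox(src_path, dst_path, key):
    """Read every message from src_path, sort them with key, and write
    them to a new mbox at dst_path (hypothetical helper)."""
    src = mailbox.mbox(src_path)
    try:
        # Each message is a separate object here, so no in-band marker
        # is needed to keep message boundaries intact while sorting.
        messages = sorted(src, key=key)
    finally:
        src.close()

    dst = mailbox.mbox(dst_path)  # created if it does not exist yet
    try:
        for msg in messages:
            dst.add(msg)
        dst.flush()
    finally:
        dst.close()
```

For example, sort_mbox("in.mbox", "out.mbox", key=lambda m: m["Subject"] or "") would order messages by subject. This loads all messages into memory, which may not suit very large mailboxes.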