Re: Invalid UTF-8 byte? (was: Re: utf)




On Tue, 03 Apr 2018, rhkramer@xxxxxxxxx wrote:
> I am building (have built several iterations) of a free format
> database to work something like askSam. It is a mashup of several
> applications, things like recol, kmail, nail, kate and the data is
> stored in mbox formatted files.
> 
> Each record is treated as an email.

You should consider using Maildir with notmuch, along with tools that
integrate with notmuch.[1]

> Most likely this would be only a temporary addition, and I would need
> to do things like make sure that one byte will be unique in the file.
> It sounds like there are at least a few candidates.

Maildir is the solution to this. While you *can* handle mbox and do all
the escape rules properly (From to >From and back), it's a pain. Let
your filesystem handle it for you.
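
If you want to try this on existing data, here is a minimal sketch of
moving an mbox into a Maildir with Python's standard-library mailbox
module. The function name mbox_to_maildir and the paths are made up for
illustration, and the sketch assumes nothing else is writing to the mbox
while it runs (it does no locking).

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count.

    Hypothetical helper for illustration. Assumes the mbox is not being
    modified concurrently, since no locking is done here.
    """
    src = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            # Each message becomes its own file under <maildir>/new/.
            dest.add(message)
            copied += 1
    finally:
        dest.flush()
        dest.close()
        src.close()
    return copied
```

Usage would be something like mbox_to_maildir("records.mbox", "Maildir").
Once each record is its own file, there is no message separator inside
the file to collide with, so body lines starting with "From " no longer
need escaping.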

[I'm speaking from experience; I currently maintain debbugs, which
basically stores everything in a custom mbox format. This inevitably
makes things slow, because finding any message means searching the mbox
linearly unless you also write indexes for the mailbox.]
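
To give an idea of what writing an index for an mbox involves, here is a
rough sketch in Python. It is not how debbugs does it; the function
names build_offset_index and fetch_message are hypothetical. It assumes
every message starts with a "From " line that is either at the start of
the file or preceded by a blank line, and that body lines are properly
escaped.

```python
import email


def build_offset_index(mbox_path):
    """One linear pass: map each Message-ID to its (start, end) byte span."""
    starts = []
    index = {}
    with open(mbox_path, "rb") as f:
        offset = 0
        prev_blank = True  # the start of the file counts as a boundary
        for line in f:
            if prev_blank and line.startswith(b"From "):
                starts.append(offset)
            prev_blank = line in (b"\n", b"\r\n")
            offset += len(line)
        ends = starts[1:] + [offset]
        for start, end in zip(starts, ends):
            f.seek(start)
            raw = f.read(end - start)
            # Drop the "From " separator line before parsing the headers.
            _, _, rest = raw.partition(b"\n")
            message_id = email.message_from_bytes(rest)["Message-ID"]
            if message_id:
                index[message_id.strip()] = (start, end)
    return index


def fetch_message(mbox_path, index, message_id):
    """Seek straight to one message using the index; None if unknown."""
    span = index.get(message_id)
    if span is None:
        return None
    start, end = span
    with open(mbox_path, "rb") as f:
        f.seek(start)
        raw = f.read(end - start)
    _, _, rest = raw.partition(b"\n")
    return email.message_from_bytes(rest)
```

The catch is that the index goes stale as soon as the mbox is rewritten,
so it has to be rebuilt or updated on every change.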

1: Notmuch itself uses Xapian to do the heavy lifting.
-- 
Don Armstrong                      https://www.donarmstrong.com

Cheops' Law: Nothing ever gets built on schedule or within budget.
 -- Robert Heinlein _Time Enough For Love_ p242