Re: Invalid UTF-8 byte? (was: Re: utf)
- Date: Wed, 4 Apr 2018 14:33:21 -0400
- From: rhkramer@xxxxxxxxx
- Subject: Re: Invalid UTF-8 byte? (was: Re: utf)
On Wednesday, April 04, 2018 01:36:15 PM Don Armstrong wrote:
> On Tue, 03 Apr 2018, rhkramer@xxxxxxxxx wrote:
> > I am building (have built several iterations) of a free format
> > database to work something like askSam. It is a mashup of several
> > applications, things like recoll, kmail, nail, kate and the data is
> > stored in mbox formatted files.
> > Each record is treated as an email.
> You should consider looking at using Maildir with notmuch and using
> things which integrate notmuch.
> > Most likely this would be only a temporary addition, and I would need
> > to do things like make sure that one byte will be unique in the file.
> > It sounds like there are at least a few candidates.
> Maildir is the solution to this. While you *can* handle mbox and do all
> the escape rules properly (From to >From and back), it's a pain. Let
> your filesystem handle it for you.
> [I'm speaking from experience; I currently maintain debbugs, which
> basically stores everything in a custom format mbox. This inevitably
> makes things slow, as you have to search through the mbox linearly to
> find any message in the mbox unless you also write indexes for the
> 1: Notmuch itself uses xapian to do the heavy lifting.
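For anyone following along, the "From to >From and back" rule mentioned above can be sketched roughly as below. This is only a minimal sketch, assuming the "mboxrd" flavor of the rule (a body line consisting of zero or more ">" followed by "From " gains one ">" on write and loses one on read); the function names are made up for illustration and do not come from any of the tools discussed here.

```python
import re

# Assumption: "mboxrd"-style quoting. A body line that is "From " preceded
# by zero or more ">" gets one more ">" when written, and loses one when read.
_QUOTE = re.compile(r"^(>*From )", re.MULTILINE)
_UNQUOTE = re.compile(r"^>(>*From )", re.MULTILINE)


def escape_body(body: str) -> str:
    """Quote body lines that could be mistaken for an mbox "From " separator."""
    return _QUOTE.sub(r">\1", body)


def unescape_body(body: str) -> str:
    """Undo escape_body() by removing one level of ">" quoting."""
    return _UNQUOTE.sub(r"\1", body)


if __name__ == "__main__":
    original = "Hi\nFrom here on it gets tricky\n>From a quoted reply\n"
    stored = escape_body(original)
    print(stored)
    print(unescape_body(stored) == original)
```

The simpler variant that only quotes bare "From " lines cannot be reversed reliably, because a line that already began with ">From " looks the same as an escaped one when read back.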
I'll probably look into notmuch, just for kicks.
I've considered maildir--it meets some of my requirements (that is, to make
something close to an askSam workalike), but one drawback is that it is
essentially one file per email (i.e., per "record"). One of the desirable features of
askSam is that you did not have to create a new file to add a new note /
record, you just start typing in an existing open record and then, as time or
other constraints allow, you can add more "tags" or a record separator. (It's
been so long since I've used askSam that I actually forget what had to be done (if
anything) to separate a new record from the previous record.)
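To illustrate the "no new file per note" point with the mbox approach, here is a rough sketch using Python's standard mailbox module. It is not how askSam or my own setup works; the function name append_record and the file name are hypothetical, and each note is simply wrapped as a minimal email with a Subject and a plain-text body.

```python
import mailbox
from email.message import EmailMessage


def append_record(mbox_path: str, subject: str, text: str) -> int:
    """Append one note to a single mbox file and return the key it was stored under."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(text)

    box = mailbox.mbox(mbox_path)  # the file is created if it does not exist
    box.lock()
    try:
        key = box.add(msg)  # appended after the existing records
        box.flush()
    finally:
        box.close()  # closing also releases the lock
    return key


if __name__ == "__main__":
    # "notes.mbox" is a made-up file name for this example.
    append_record("notes.mbox", "first note", "Some free-form text.\n")
    append_record("notes.mbox", "second note", "More text, same file.\n")
    print(len(mailbox.mbox("notes.mbox")))
```

Every record lands in the same file, so nothing new has to be created per note; the cost is the one already mentioned, that finding a record later means scanning the file unless an index is kept alongside it.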
askSam basically stores all its records in one file, although it is (of
course) possible to separate them.