
Plain text vs. binary (was Re: The new normal of logging)




On 2017-10-27 at 05:36, Darac Marjal wrote:

> A binary format is, really, nothing to be feared, so long as the
> format is as well defined as plain text (because, when you look at 
> it, plain text IS binary, it's just an astoundingly well-agreed 
> encoding).

The binary format underlying plain text does have one additional
noteworthy characteristic, though: it is intended to map *directly* to
something that is, in theory, human-readable. The glyphs being encoded
are simply letters, numbers, et cetera, and these are almost universally
recognized. Once the trivial translation from binary form for display is
completed, even if you don't understand what the result means, you at
least have something on which you can base searches and questions.

With any other binary format, you either need the domain-specific
knowledge necessary to understand the meaning of the glyphs which were
encoded into the binary format, or need to let something which *does*
have that understanding - generally, a specialized tool - perform an
additional layer of translation after translating from binary form into
those glyphs. Only after that can you realistically start to put
together searches and questions if you don't understand the result.
(The exception is the sort of person who is sufficiently technically
competent to reverse-engineer file formats in a clean-room environment,
which is a relatively rare skill.)

IMO that's a meaningful additional layer of burden.
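
A minimal sketch of that difference, in Python. The binary record layout
here (field order, widths, severity codes) is entirely hypothetical and
made up for illustration; it is not any real logging format.

```python
import struct

# The same event, stored two ways.
text_record = b"2017-10-27 05:36:00 WARN disk nearly full\n"

# Hypothetical binary layout (an assumption for this sketch):
#   little-endian uint32 epoch seconds, uint8 severity code,
#   uint8 message length, then the message bytes.
HEADER = "<IBB"
message = b"disk nearly full"
binary_record = struct.pack(HEADER, 1509082560, 2, len(message)) + message

# Plain text: one generic decoding step gives something you can read,
# or at least paste into a search engine.
print(text_record.decode("ascii"), end="")

# Binary: the generic view shows the raw bytes, but nothing tells you
# that the first four are a timestamp or what severity 2 means.
print(repr(binary_record))

# Only with knowledge of the layout can the fields be recovered.
header_size = struct.calcsize(HEADER)
timestamp, severity, length = struct.unpack_from(HEADER, binary_record, 0)
body = binary_record[header_size:header_size + length].decode("ascii")
print(timestamp, severity, body)
```

The last three lines stand in for the "specialized tool": without the
HEADER definition, the reader is left with reverse engineering.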


Plus, having a *single* format to use for interchange has the advantage
that you only need to have *one* set of tools on hand to translate its
encoded form into the glyphs being represented; if multiple formats
exist, you need to either have the tools for each of them on hand, or be
able to do the work of translating in your head on the fly every time,
which is nontrivial in most cases.

Very few - if any - formats other than plain text are sufficiently
general to be suitable for use for every conceivable
thing-to-be-represented, without needing to be extended for the purpose
(with a corresponding extension of the translating tools). Admittedly
that's because plain text lets you implement other encoding systems on
top of it (from the various human languages, to e.g. INI-file format, to
things like uuencoding, and so forth). Even there, though, the same
advantage applies: the immediate superficial readability lets you start
to form meaningful questions and searches. If there's any other file
format which enables something comparable, I'm not finding myself able
to think of it.
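
A rough sketch of that layering, using Python's standard library. The
INI contents and the payload bytes are invented for the example; only
the configparser and binascii calls are real.

```python
import binascii
import configparser

# An INI file is an encoding layered on plain text. Even before any
# INI-aware tool touches it, the raw text is readable and searchable.
ini_text = "[logging]\nformat = text\nlevel = info\n"
print(ini_text, end="")

# The INI-aware layer adds structure on top of the same characters.
parser = configparser.ConfigParser()
parser.read_string(ini_text)
level = parser["logging"]["level"]

# uuencoding goes the other way: arbitrary bytes carried as text.
payload = bytes([0, 255, 16, 32])
uu_line = binascii.b2a_uu(payload)    # one ASCII line, newline-terminated
roundtrip = binascii.a2b_uu(uu_line)  # back to the original bytes

print(level)
print(uu_line.decode("ascii"), end="")
```

The uuencoded line is not meaningful to read, but it is still made of
ordinary characters, so it survives any channel that carries text and
gives you something concrete to search on.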

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.         -- George Bernard Shaw
