Re: [OT] scanned files are large in size

On Wed 02 Jan 2019 at 19:17:00 (+0000), mick crane wrote:
> On 2019-01-02 10:24, tomas@xxxxxxxxxx wrote:
> > On Wed, Jan 02, 2019 at 09:40:33AM +0000, Joe wrote:
> > > On Wed, 2 Jan 2019 09:59:48 +0100 tomas@xxxxxxxxxx wrote:
> > > > And next time, try to find a scanner which provides you with a raw
> > > > image. Wrapping images in PDFs is... not elegant.
> > > 
> > > They do this to cater for multiple pages, whereas in my experience,
> > > most scanning is single-sheet. Even the Simple Scan program on Debian
> > > defaults to pdf, something which cannot be configured.
> > 
> > I get this, and offering that option seems to make sense. But forcing
> > it (and forcing an image format like JPEG) doesn't make sense. So
> > either provide the knobs or let the host software do it.
> > 
> > My scanner just transfers the raw image. The scan program is
> > responsible for the transformation to the target format, which I
> > can choose. This is how /I/ want to be treated, as a paying customer.
> Having a scanner do PDFs is weird (see the Obama birth certificate):
> how do you know it is a faithful copy?
> A piece of paper with marks on it is an image and should be treated as
> such.

If I could be bothered to look, I might be able to come across a raw
sound or video file on this Debian system. It's quite normal to wrap
such raw data in a container format of some sort.

So I can't understand your objection to wrapping a scanned image into
a PDF container, which makes a lot of data handling a lot easier than
would otherwise be the case. An obvious example was already mentioned:
put a document into the ADF, press the button, obtain one file
containing the entire document. Other examples would be postprocessing
with programs like pdftk and pdfjam. Would you really send a scanned
document to a company/institution as a multitude of image attachments
instead of a single PDF?
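
To illustrate the postprocessing point, here is a minimal sketch of
joining several scanned PDFs into one with pdftk's "cat" operation. The
helper names and the file names are hypothetical; the sketch only builds
the command line and assumes pdftk is installed if you actually run it.

```python
import subprocess


def pdftk_cat_command(inputs, output):
    """Build the argv for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *inputs, "cat", "output", output]


def run(argv):
    """Run the command; raises CalledProcessError if the tool fails."""
    subprocess.run(argv, check=True)


# Hypothetical usage (file names are placeholders):
#   run(pdftk_cat_command(["page1.pdf", "page2.pdf"], "document.pdf"))
```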

If you want the image back from a PDF, that's what the pdfimages
program is for. I would assume that tomás is pleased with the
packaging of the raw scan data into an image format of some
description, so what's the difference?
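
For anyone who wants to try it, a rough sketch of driving pdfimages
(from poppler-utils) follows. The function names and file names are made
up for illustration. With -all, pdfimages writes embedded images in
their native format where it can (JPEG stays JPEG, for instance);
without it, the default output is PPM/PBM.

```python
import subprocess


def pdfimages_command(pdf_path, prefix, native=True):
    """Build the argv for: pdfimages [-all] scan.pdf prefix"""
    argv = ["pdfimages"]
    if native:
        # Keep images in their embedded format instead of converting them.
        argv.append("-all")
    argv += [pdf_path, prefix]
    return argv


def extract_images(pdf_path, prefix, native=True):
    """Run pdfimages; assumes poppler-utils is installed."""
    subprocess.run(pdfimages_command(pdf_path, prefix, native), check=True)


# Hypothetical usage (file names are placeholders):
#   extract_images("scan.pdf", "page")
```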