I scan documents for two main reasons:
- to have backup copies of my airplane’s technical logs (a plane can lose tens of thousands of dollars of value if the logs are lost); and
- to allow me to submit expense claims to customers by e-mail, using scanned receipts.
It’s very easy to scan individual pages to just about any format in Linux using graphical frontends like XSane or GIMP, but when there’s more than one page, nothing beats PDF for ease of use at the receiver’s end (especially when you’ll be sending the file to an admin assistant running Windows and reading e-mail in Outlook). After a bit of experimentation, I found a few steps that actually work:
- In the XSane preview window, preset the area to Letter size, choosing any resolution you want (150 or 300 dpi are probably the best choices).
- Save your scans in the format of your choice.
- Use the convert utility from ImageMagick to merge all of the scanned pages into a PostScript file. It is critical to use the -density option with your scan DPI so that the pages come out the right size, e.g. “convert -density 150 *.tiff output.ps”.
- Use the ps2pdf utility from Ghostscript to convert the PostScript file to PDF, e.g. “ps2pdf output.ps output.pdf”.
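The last two steps can be wrapped in a small shell function so the DPI only has to be typed once. This is just a sketch: the function name scans_to_pdf and the DRY_RUN variable are made up for illustration, and it assumes that convert and ps2pdf are on your PATH and that the page files are passed in reading order.

```sh
#!/bin/sh
# Sketch: merge scanned pages into one PDF by way of PostScript.
# scans_to_pdf and DRY_RUN are hypothetical names, not part of
# ImageMagick or Ghostscript.

scans_to_pdf() {
    if [ "$#" -lt 3 ]; then
        echo "usage: scans_to_pdf DPI OUTPUT_BASENAME PAGE..." >&2
        return 1
    fi

    dpi=$1      # must match the resolution used when scanning
    out=$2      # output name without an extension
    shift 2     # what remains is the list of page images

    # With DRY_RUN=1, print the commands instead of running them.
    run=
    [ "${DRY_RUN:-0}" = 1 ] && run=echo

    # Step 1: merge the pages into PostScript at the scan density.
    # Step 2: convert the PostScript file to PDF.
    $run convert -density "$dpi" "$@" "$out.ps" &&
    $run ps2pdf "$out.ps" "$out.pdf"
}
```

For example, “scans_to_pdf 150 output *.tiff” runs the same two commands as above, and prefixing the call with DRY_RUN=1 shows what would be run without touching any files. Note that *.tiff expands in alphabetical order, so names like page01.tiff, page02.tiff keep the pages in sequence.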
I’ve tried many other approaches (including using the libtiff utilities with all compression options, and using convert to go straight to PDF), and they all result in either huge or malformed PDF files. This is the one approach that works for me.
There must be a tool out there, GUI or command line, that will allow me to batch scan multipage documents straight into PDF without all this messing around. I haven’t found it, but I’ll be happy to hear about such a tool in comments.