You may have had to scan parts of books before, so you know what it’s like to try to position a page on the flatbed scanner platter and scan it. It doesn’t always work out well the first time: parts of the page go missing, pages come out slanted, some end up upside down, and so on. I’ve spent a little time scanning indexes from cookbooks recently. About half of my cookbooks are indexed on the website Eat My Books, but for those that weren’t, I decided to scan the indexes as PDFs and send them to Evernote for a more complete record of all the recipes in my books. So here are a few tips I learned about flatbed scanning along the way.
Clean the scanner platter
We’re going to be doing OCR (optical character recognition) on the scans to convert PDF images to text, so the first thing is to make sure the platter of the flatbed scanner is spotlessly clean. Any specks of dust, dirt or smudges will lower the quality of the scan and may hinder the OCR and word recognition.
Prepare for scanning
If your scanner has a side-hinged lid, as mine has (see image at top), it will get in the way when you lay the pages of large books such as cookbooks on the platter, unless the lid is detachable. Check whether yours is. If not, and you’re scanning single pages from large books, you will have to turn the book through 180 degrees for each page to avoid the lid. There is no such problem with an end-hinged lid. If you’re scanning indexes (usually at the end of books) or thick books, you will probably have to support the heavy side of the book while you scan the page on the light side (again, see image above). A shoebox or some other support that is the same height as the scanner platter works well: lay half of the book on it while you scan the other page.
Know your scanner software settings
For a multi-page scan to a PDF on a flatbed scanner with a side-hinged lid, you will inevitably end up with a PDF where pages are alternately inverted and right way up. Check your scanner software settings to see if inverted text can be corrected automatically. I use a Canon Pixma MP280 multifunction printer with Canon MP Navigator software, and I found I could change the settings to correct this. Here’s the initial screen I see when I select Save as PDF file:
The scan settings at the top of the screen can be changed depending on what you want. I’ll deal with the Resolution a little later. The important part here is to select PDF (Multiple Pages) for a multi-page scan, then click Set. This brings you to this screen with some important settings:
Check Enable keyword search for OCR. Check Detect the orientation of text documents and rotate images to correct the alternate inverted pages in your file, and check Correct slanted document so you get a scan with nice horizontal text. Your scanner software may have a different layout, but dig around in the settings until you get it set up correctly.
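If your scanner software has no orientation setting, the alternating pattern is at least predictable, so it can be corrected afterwards. Here is a minimal Python sketch of the idea. The function name rotation_plan and its first_inverted parameter are my own for illustration, and it assumes every second page is upside down, which only holds if you turned the book for each and every page.

```python
def rotation_plan(page_count, first_inverted=False):
    """Return the rotation in degrees (0 or 180) needed for each page.

    Assumes pages strictly alternate between right way up and inverted.
    Set first_inverted=True if the very first scanned page came out
    upside down.
    """
    plan = []
    for index in range(page_count):
        # Pages at odd positions (2nd, 4th, ...) are inverted by default;
        # first_inverted flips the pattern.
        inverted = (index % 2 == 1) != first_inverted
        plan.append(180 if inverted else 0)
    return plan


if __name__ == "__main__":
    for number, angle in enumerate(rotation_plan(6), start=1):
        print(f"page {number}: rotate {angle} degrees")
```

The list it returns could then be handed to whatever PDF tool you prefer to do the actual rotating; that step is not shown here.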
Another couple of points. If the font size of the book text is small, try increasing the image resolution to 400 or 600 dpi to improve OCR (on the first screen above). Even after doing this, I still found that some text was not recognized when searching the PDF later using Ctrl-F. Coloured text backgrounds and the contrast between text and background also affect the quality of the OCR.
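To get a rough feel for why a higher resolution helps with small print, you can work out how many pixels tall the text ends up in the scan. A printer’s point is 1/72 of an inch, so the height in pixels is roughly the point size times the dpi divided by 72. The short Python sketch below does that sum. The function name text_height_pixels is mine, and the point size describes the full height of the font rather than each individual letter, so treat the numbers as a guide only.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def text_height_pixels(point_size, dpi):
    """Approximate height in pixels of text of a given point size
    when scanned at the given resolution (dots per inch)."""
    return point_size * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # Compare small 8 point index text at three scanner resolutions.
    for dpi in (300, 400, 600):
        height = text_height_pixels(8, dpi)
        print(f"8 pt text at {dpi} dpi is about {height:.1f} pixels tall")
```

Going from 300 to 600 dpi doubles the height of each letter in pixels, which gives the OCR more detail to work with.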
Finally, when you are struggling to position and hold a heavy book on the platter with one hand while reaching for the mouse to click Scan with the other, it is easier to just hit Enter on the keyboard. I found that this activates the scan and also restarts the scan after a new page has been selected in a multi-page scan.
The multi-page PDFs were saved to my hard drive and after that it was just a case of opening a new note in my Cookbook indexes notebook in Evernote and dragging the PDF there. I do know it’s possible to scan books directly with your smartphone straight into Evernote and I’ll tackle that in a later post and link to it here when that post has been added.
Do you have any tips for scanning books on flatbed scanners? Drop a comment below.