This is a build log about making a book scanner, as shown above. Off to the right, out of the camera's view, is a 19 inch monitor.
If you've ever tried to scan a book using a plain ole flatbed scanner, I am sure you noticed two things right away: it is really bad for the book and it is mind-crushingly sloowww. I needed to be able to make digital versions of magazines, in this case complete volume sets of aviation magazines published prior to 1923 (which means they are in the public domain) such that the text within them can be searched. The flatbed approach was just not going to work.
As usual, this build started out with some internet searching, and right away I found the diybookscanner.org website. This is a community repository of information founded by Daniel Reetz who in 2009 pioneered several designs and created a forum (with the intrepid help of Rob Baruch) whereby several others were inspired to create other designs. Some of the original players have moved on, and Daniel retired from this effort in 2015, but their webpages live on. Lots of good stuff there, well worth your time if you are interested in this subject. There were many people there who contributed much in the way of ideas, and I borrow heavily from them. So many of them that I don't think I can acknowledge all of them properly - many of them only provide a pseudonym anyway, so a general big "thank you" goes out to all of them.
So what, exactly, is this build project? Instead of laying a book flat down on the face of a slowly moving scanner, the idea is to use two cameras, placed so that when a book is opened beneath them, they are focused on the two facing pages. The cameras are triggered, the page is turned, they are triggered again, and so on until all the pages have been photographed. The cameras are chosen so that the resulting images will be a minimum of 300 DPI per page.
The diagram at the top of this page shows the main pieces of this book scanner. At the very top is a flashlight looking symbol, representing the light source to illuminate the pages. Next down are the two cameras showing their field of view in a light blue. The thin green "V" is a glass platen that is necessary to flatten the pages while pictures are taken. Next is the book (or magazine), and on the bottom is a matching "V" cradle upon which the book rests. Between taking pictures, two pages at a time, the cradle with the book is lowered, the page is turned, and the cradle is then raised, pressing the pages flat against the platen for the next two photos and so on.
When you are done taking pictures of all the pages, including the front and back cover, what you will have is a single image file for every page, placed on a flash drive that you can take with you. The images on the flash drive are then copied onto another computer for "post processing" - cropping, rotating, or otherwise processing the images to make them as nice as possible, running OCR on them to add searchable text, and then assembling them into a PDF file. This post processing, or work flow, will be covered a bit more below.
After the book is placed properly on the cradle, if we assume we can lower the cradle, turn the page, raise the cradle, and push the switch an average of once every 4 seconds (and I think that is a reasonable rate), that means we can do 15 cycles per minute of two page images per cycle, or 30 pages per minute - if kept up, that is a rate of 1800 pages per hour, or 2-3 average books per hour! Most of the automatic page turner schemes I looked at are quite complex and not foolproof - you still have to babysit the machine, so as long as you have to be there, this seems like a very workable approach without a great deal of complexity.
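The arithmetic above is easy to redo for other operator speeds. Here is a minimal Python sketch of it; the function name and the slower cycle times are just my own illustrations, and only the 4 second figure comes from the estimate above.

```python
def pages_per_hour(seconds_per_cycle, pages_per_cycle=2):
    """Pages photographed per hour for a given lower/turn/raise/trigger cycle time."""
    cycles_per_minute = 60.0 / seconds_per_cycle
    return cycles_per_minute * pages_per_cycle * 60.0


if __name__ == "__main__":
    # 4 seconds is the estimate from the text; 5 and 6 are hypothetical slower operators.
    for seconds in (4, 5, 6):
        print(f"{seconds} s per cycle: {pages_per_hour(seconds):.0f} pages per hour")
```

Even a fairly relaxed pace still gets through well over a thousand pages an hour, which is why a manual page turn did not seem like much of a handicap.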
With that basic idea in mind, we begin by making practical choices guided by the constraints of our particular goals. I want the scanner to be shared and taken into libraries, so it must be portable - i.e., at least fit inside a car and ideally be carried by one person. Portability implies it must also be somewhat rugged. We need it to be easily operated by a volunteer with a minimum amount of training and have a simple set-up procedure. We need it to provide quality images that will be good enough for our foreseeable needs, as later access to the sources may not be possible once they are scanned. The maximum size of the scanned book or magazine should be large enough to accommodate almost all of what we need to scan, but this size will be tempered by the resulting weight, size, and cost.
Cost, of course, is always an issue. By far the biggest expense will be the cameras, but we are in luck in that Canon makes some high resolution (20 megapixel) low cost point and shoot cameras that may also be hacked and controlled by a computer. Seems the camera of choice right now for this kind of thing is the Canon Powershot ELPH 160, which is currently around $100 and we need two of them, shown above.
Ease of use will necessitate using some kind of controller to manage the cameras and the resulting images. Once the cameras are in place and calibrated, taking them off to access the SD card and/or batteries and then replacing them and re-calibrating them is not a great approach. Using a single board computer to get the files off the camera and act as a scanner server is definitely the way to go.
There is much work to do in the post processing, so to relieve that as much as possible, the scanner must be made to produce the most consistent and uniform images it can. That way any later manipulation of the pages, like cropping, color, brightness, contrast, etc. can be done in batch mode, instead of having to modify each page, one at a time. If the page images are consistent, we can just let the computer do all the work.
The first step was to settle on a maximum book size, because everything else revolves around this choice. I started by thinking that using glass was the way to go, and therefore looked for inexpensive sources. I would like to use polycarbonate (Lexan) as it is tough and light, but it scratches more easily than glass and it can have static issues that attract dust. 11 x 14 inches is kind of a standard picture size, but the oversize LIFE magazine, which is 10.5 x 14 inches, will not fit this, because the frame of the platen will take up some room. 11x16 would work well, but picture frame glass would have to be custom cut to that size. Custom glass may be necessary in any case, because picture frame glass is very thin and fragile, and you would probably want double strength glass instead.
I think that 11x16 should cover about all we will need except for things like large maps which in any case would have to be scanned elsewhere. I am pretty sure that with this page size the resulting scanner profile will still fit inside a car door. Whether glass or polycarbonate is used, since they both can be scratched or broken and need to be cleaned, they should be replaceable. With luck, we should be able to make the platen to use either one and see which is best.
With a full 11x16 inch image size, a 300 dpi image of that would require a 15.8 megapixel camera, assuming a 100 percent usable image, which you almost never get. Nevertheless, with this camera/page size combination, the 20 megapixel cameras will always produce images better than 300 dpi, and as we zoom in on smaller page sizes, the dpi capture rate goes up dramatically, which is just icing on the cake.
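For anyone who wants to check these numbers against a different page size or camera, here is a small Python sketch of the calculation. The sensor size of 5152 x 3864 pixels is my assumption for a typical 20 megapixel 4:3 frame, so check your own camera's specifications, and the function names are made up for this example.

```python
def required_megapixels(width_in, height_in, dpi):
    """Megapixels needed to cover a page at the given dpi with no wasted border."""
    return (width_in * dpi) * (height_in * dpi) / 1e6


def effective_dpi(page_w_in, page_h_in, sensor_long_px, sensor_short_px):
    """Best-case dpi when the page's long side lies along the sensor's long side."""
    long_in = max(page_w_in, page_h_in)
    short_in = min(page_w_in, page_h_in)
    # The tighter of the two directions limits the usable resolution.
    return min(sensor_long_px / long_in, sensor_short_px / short_in)


if __name__ == "__main__":
    print(f"11x16 at 300 dpi needs {required_megapixels(11, 16, 300):.2f} MP")
    # Assumed sensor: 5152 x 3864 pixels (about 20 MP).
    print(f"11x16 page: about {effective_dpi(11, 16, 5152, 3864):.0f} dpi")
    print(f"8.5x11 page: about {effective_dpi(8.5, 11, 5152, 3864):.0f} dpi")
```

Under that assumed sensor size, the full 11x16 page works out to a little over 300 dpi in the best case, and any border left around the page eats into that margin.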
With these requirements in mind, I began looking at the designs on diybookscanner.org, selecting those ideas that best fit the scanner we need. I found a design that I thought would be suitable, and I began by making the platen. Having some Misumi 20x20mm aluminum extrusion available to me, I started with that because it has a slot that will accommodate 3/16 inch polycarbonate (or glass). Also, I noticed that Daniel Reetz determined experimentally that the optimum angle for the book to be open is about 100 degrees. To do the least harm to the book, the narrower this angle is the better, but we have to worry about reflections and glare in the platen glass, and opening the book just over 90 degrees produces the fewest lighting issues in the resulting images. This will still cause much less damage to the book than simply reading it by hand.
Figure 3 shows my first attempt, as far as it got. By this time, I realized that the thing was getting quite heavy and that the only way that cantilevered platen was going to work was with a substantial counterweight on the back of it, making the thing even heavier. The two sets of drawer guides (one for the left-right movement of the book cradle and the other for the platen to slide up and down the center column) added more weight, and of course the oak frame and 1/2 inch plywood aren't exactly lightweight. Even though I have seen people using something like this on YouTube, once I got this far I didn't think this design would be practical for me. Sometimes the experiment doesn't work out.
Note: If you want to see any of the images on this website in a larger format, you can. If you are using Firefox or Internet Explorer, right-click on the image and select "view image..." If you are using Chrome, right-click on it and "open image in new tab..."
Fig. 4 (photo courtesy Tenrec Builders, LLC)
Jonathon Duerig of Tenrec Builders took Daniel's most advanced plywood based design, the "Archivist," and rather cleverly converted it to aluminum extrusion construction, the Archivist Quill shown above, and now offers it for sale as a kit. It really looked like the best choice, but I already had a bunch of Misumi aluminum extrusion lying about from a special promotion they did a while back, as well as some sheet polycarbonate. That, coupled with my homemade CNC machine, meant I could come up with most of the parts on my own for little cost, but I would need Jonathon's help to make that work. I emailed him and explained my situation and, much to his credit, he was happy to help by supplying some dimensions for the extrusions and selling me some critical parts. The Quill's maximum book dimensions are 11.8 x 15.75 inches, so it is right about the size I was aiming at. I am not sure, however, about this fitting into a car seat space.
You can inspect Jonathon's assembly guide here, so I will concentrate only on my variations from that design.
Here are some of the extrusions cut to size. They consist of both 20mm and 15mm square extrusions, with some pieces cut on a 40 degree angle, and some on a 50 degree angle, due to Daniel's 100 degree finding. I cut them on my rescue bandsaw, and then ground and/or sanded them down to final size and finish. Some of them need to be a flat black in order to eliminate any reflections onto the glass, but I didn't have any black anodized sticks, so I'll be going the spray paint route for those.
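The 40 and 50 degree figures fall straight out of the 100 degree opening angle: each face of the "V" sits half of that angle away from the vertical centerline, and the remainder of a right angle away from horizontal. A tiny Python sketch of that relationship follows; the function name is my own, and which particular parts get which cut is not something this captures.

```python
def platen_cut_angles(opening_deg):
    """Return (angle from vertical, angle from horizontal) for one face of the V."""
    from_vertical = opening_deg / 2.0
    from_horizontal = 90.0 - from_vertical
    return from_vertical, from_horizontal


if __name__ == "__main__":
    from_vertical, from_horizontal = platen_cut_angles(100)
    print(f"Each face is {from_vertical:.0f} degrees from vertical "
          f"and {from_horizontal:.0f} degrees from horizontal")
```

If you decide to try a different opening angle, the same two numbers tell you where to set the saw.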
Jonathon has a list of parts as well as a part identification guide on his website that includes a scale drawing of many of the parts needed for the scanner. By scanning those images, scaling them against a known dimension, and converting them to a vector based drawing, I could then create a toolpath for the CNC router to produce them. The router is shown above cutting the polycarbonate, and the resulting pieces are shown on the right as they come off the machine. Polycarbonate is what they make aircraft windshields and bulletproof glass out of, so this is pretty tough stuff. The sheet is 3/16 inch thick, which is thicker than some of the aluminum plate that Jonathon uses for the same part, so we'll see what changes may be needed as a result of the different thickness.
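Scaling against a known dimension just means measuring one feature whose real size you know, working out millimeters per pixel from it, and applying that factor to everything else in the drawing. Here is a rough Python sketch of the idea. The numbers are invented for illustration (a 20 mm extrusion width that happens to measure 160 pixels in the scan), and this is not the actual tool chain I used, just the arithmetic behind it.

```python
def mm_per_pixel(known_mm, measured_px):
    """Scale factor from one feature of known real-world size."""
    return known_mm / measured_px


def scale_outline(points_px, factor):
    """Convert a list of (x, y) pixel coordinates into millimeters."""
    return [(x * factor, y * factor) for x, y in points_px]


if __name__ == "__main__":
    # Hypothetical reference: a 20 mm wide feature spans 160 pixels in the scan.
    factor = mm_per_pixel(20.0, 160.0)
    # Hypothetical traced outline of a bracket, in pixels.
    outline_px = [(0, 0), (160, 0), (160, 360), (0, 360)]
    for x_mm, y_mm in scale_outline(outline_px, factor):
        print(f"({x_mm:.1f} mm, {y_mm:.1f} mm)")
```

It is worth checking a second known dimension after scaling, since a scan that is stretched in one direction will not show up from a single measurement.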
The photo above is how the base goes together. The vertical pieces are V-groove extrusion to allow the wheels for the cradle assembly to ride up and down in those grooves.
The kit version uses pieces of felt for the feet, but for a couple of dollars I got some 1/4-20 threaded leveling feet and tapped that size hole into each end of the bottom extrusions. With them all the way in, some of the threads stick up above the top of the rail, so I added cap nuts to protect hand movement from those threads. This should make it easy to accommodate uneven work surfaces if need be. I made some 3D printed end caps and inserted them on each exposed open extrusion end. They needed to be tapped into place with a rubber mallet, just like you want them to fit.
Here is the cradle lifter assembly that slides up and down on the base shown in Figure 8. It went together easily. The plans called for two .5 inch spacers on the V wheel axles, but with the extra thickness of the side plates only a single flat washer was needed to replace one of the two spacers for each wheel. Other than that, everything else was the same.
Above is the cradle slider, which allows the book to center itself under the platen each time it is lifted into place. On top of this will be the cradle supports, or wings, which can be moved to accommodate a particular book, and then locked into place for scanning or during transport. This whole assembly, including the cradle supports below and the book that rests upon it, may be removed from the machine to adjust the book easily. After the book is fitted, the assembly can be placed on the guide slots, and the book is ready to scan.
If you look closely you can see I labeled each part with a marker as I made them to help in assembly. Those markings will be removed when nearing completion.
Here are the two cradle supports made from 15mm extrusion. The 3/16 Lexan fittings seem to scale well for the rest of the machine, but here they look a little clunky. They work, nevertheless. On top of these will go two Lexan sheets for the book to rest upon.
Here, the bottom of the unit is nearly complete. The lifter (Figure 10) is lowered onto the vertical rails of the base (Figure 8) and the cradle slider (Figure 11) rides back and forth on the lifter. On top of the slider are the two cradle supports (Figure 12), whose distance apart can be adjusted, then locked, to accommodate the spine for books of various widths. Locking knobs for the vertical lifters are shown at the top of the vertical rails. They are used to keep the whole cradle in place during transport.
Still needed are the two cover plates for the book to rest upon, locking knobs for the cradle supports, and a bit of padding to go between the lifter and the base. Most of what you see here will not need painting since this part is under what gets photographed.
Above shows the front, back and sides of the platen structure. The front and back Vee shapes will be painted black and the straight sides attach to the vertical V groove extrusions of the base.
These are the two upper frames that support the cameras and the lighting bar. It is the same extrusion and polycarbonate type fittings as have been used above, painted black.
I attached the two upper frames to the base assembly, as shown above left. I deviated from the manual a bit by attaching the butt joints of the extrusions not only with plates and corner brackets, but also by tapping the center hole of the extrusion and drilling a hole on the end to accommodate a flat head machine screw, as shown above right. The center hole of the 20mm extrusion works great for tapping either a 5mm or a 10-24 machine screw.
Here the platen supports have been attached along with the cross braces. The clear window Lexan pieces need to be slid into position, and the handlebar and rope attached. I am thinking that the rear vee may have a panel attached to accommodate the electronics and wiring, etc.
The handlebar is attached to the rod ends and with it the two ropes are pulled down, causing the cradle to rise up to the platen. The handlebar is the work end of the machine, so doing what is possible to make this easy will make a big difference. To help counterbalance the weight of the cradle and whatever book is on it, Jonathon added a black denim sleeve (on the left) to the handlebar which allows water bottles to be inserted with a variable amount of water until just the right balance is achieved - a slight downward force needed to raise the cradle, and the cradle should tend to fall just on its own when the handlebar is released. Although the construction of the sleeve is a very simple sewing project, it really taxed the limit of my sewing skills!
Above shows the right handlebar corner - the plans call for the handlebar to be made from three pieces across the front, but I did this as a single bar, and made up some aluminum corner plates to attach the side rails, on both the top and the bottom of each corner. The polycarbonate piece in front is the attach point of the rope.
Also, where the handlebar beams connect to the rod ends is a kind of critical point, and I didn't feel right using polycarbonate for that. I made a couple of thicker three-hole aluminum plates from some on-hand material for that attach point, shown on the right.
In the Archivist Quill shown in Figure 4, Jonathon used sheet ABS for the plates of the book rest, but I didn't have anything like that, so again I used some of the Lexan I had, and sprayed it black. Like other builders on the forum have done, I wanted more of a slide resistant surface to better keep the book in place during scanning, so I sprayed adhesive onto the back of some non-slip tool drawer liner that I had and covered the surface of the book rest with it. (One of my tool cabinets had exactly this liner in it, so I replaced this liner in one of the drawers with some Harbor Freight stuff.) It is a rubbery substance that grips things in place quite well, shown on the left above with its opposite plate upside down beneath it.
On the right is the completed cradle assembly, just waiting to get to work. It is nice that the cradle can be lifted off its track and taken to an open area to adjust it for a different book spine, and then replaced back on the track instead of trying to do everything while bending over the unit.
Jonathon also used ABS plastic sheeting to form the light box on top, which served two purposes - one was to provide a support for the LEDs at a distance to ensure even lighting, and the other was to provide a shroud to prevent outside room light from reflecting onto the platen glass. After playing around with the LED lights for the best even lighting to fill the whole platen area, etc, I came up with the above framework. The frame on the left has some 2020 risers with a 1515 frame. The corners are attached using the drill and tap method shown in Figure 19. The inside groove on the top frame accommodates a diffuser panel, of which there are many designs out there. This is not the lenticular lens that Jonathon and Daniel used, because that stuff is hard to get. I am sure the one above, which is cut from a panel used in fluorescent fixtures, doesn't do as good a job directing the light straight down onto the platen, but I am hoping the more powerful LEDs I am using will make up for that. As long as the light is evenly distributed, it will be ok.
On the right is an adapter plate for mounting the LED light fixtures, shown upside down. It is also made from Lexan, and painted flat black on the bottom. It mounts on the top of the frame shown on the left. I will add some cable stays when I see how it plays out on the unit. Should 60 watts of heat prove to be a problem for this, I can always redo this plate in aluminum.
A couple of other additions I made. I had a bit of trouble with the rope coming off the pulleys while moving the unit, so I added four keepers made from some 1/16 inch aluminum angle that slides onto the extrusion and rests just shy of touching the rope, but prevents it from slipping off the pulley, shown above left.
I also 3D printed some clips that snap onto the 2020 extrusion to help keep the wiring tidy, Thingiverse # 59745 by Wired1, shown in the center image.
On the right is a knob I also 3D printed that is long enough to be easily reached, and accommodates a 1 1/2 inch 1/4-20 bolt to hold the cameras in place. It was customized from "Parameterized Star Knob for Tool" by Hank Cowdog, Thingiverse #664192.
All of the needed electronics are shown above - at the top are the two LED floodlights, 30 watts each - that's actual LED wattage, not an equivalent incandescent rating. At the far left is a plain ole surge protected white power strip to act as a single switch for the whole unit. Under the lights are the two cameras, which have a write protected SD card in each (not visible) with the CHDK software installed on them. Also inserted into each camera is a battery surrogate connected to the two 120VAC black power supplies. This way we don't have to worry about constantly removing the cameras to get to either the SD card or the batteries. Each camera also has a USB cable inserted into it that goes into one of the Raspberry Pi USB ports. The Raspberry Pi 3 is inside the Pi's gray 3D printed case, Thingiverse #922740 by 0110-M-B, with another micro SD card (not visible) inserted into it with the bootable Linux Raspbian/PiScan software installed on it.
The Pi also has a regular HDMI cable coming off of it to plug into a computer monitor, which goes off the picture. The keyboard and mouse plug into two USB ports on the monitor, and the monitor has one USB cable going to the Pi, which is not what is shown in the picture above. The fourth USB port is for the cable to accommodate the USB flash drive for the resulting images. Just under the Pi is its power wall wart. No soldering! Just cable management.
Not shown is an HDMI 19 inch computer monitor (bought at a thrift shop for $28), which will take up the sixth power outlet on the white power strip. Although the software supports the 7 inch Raspberry Pi touch screen, I will go with the cheaper monitor for now, mostly because I think it will make checking camera image focus easier, but it does mean lugging one more thing around.
The PiScan software is also due to the efforts of Jonathon Duerig, who took the core of a previous piece of software called SpreadsPi and streamlined the user interface to keep the operation as simple as possible for this particular hardware. It has been made freely available on GitHub. Thanks again, Jonathon!
Here is a shot of the back, showing how the electronics and wiring were handled. The mounting panel is just another piece of thin Lexan painted flat black on the other side and attached to the rear platen supports. In the above photo, I have yet to add some strain relief for the cables where needed and I need to finish plugging everything in. Power for the lights and the monitor is yet to be plugged into the strip. To mount the power supplies for the cameras and the power strip, I 3D printed some custom mounting brackets using Parametric Mounting Brackets, Thingiverse #1949437 by Revwarguy, who is me! I was surprised to find such a thing wasn't already out there. (OpenSCAD rocks!) From there, I just bundled up the extraneous wire and the unit was ready to scan its first book.
I wanted there to be just two external cables - one for the power strip and the other a bundle of the HDMI, power, and USB cables that go to the monitor.
Here the unit is just about done. The lights seem to evenly illuminate the platen nicely, and the cradle moves up and down smoothly. If glare from outside sources in a daylit room becomes a problem, I am thinking that I might tax my sewing skills even more and just make a lightweight black denim tent to go over the top portion of the unit. Overall, the PiScan software installed and worked without a hitch. One word, though - you have to make sure you set the time and the date on each camera before plugging a USB cable into it. Otherwise the CHDK hacked camera powers up into a "set the time and the date" mode, but since the USB cable is there, it thinks it is under remote control, so it ignores the buttons on the camera to set the time and date.
Post processing - After using the scanner, you have a USB drive with an image for each page of the book. The format of the images is whatever you have your camera set to produce, like jpgs, tiffs, etc. However, each page is oriented sideways, and you almost always have more of the surrounding areas of the page than you want.
I am using an open source program called Scan Tailor that does a wonderful job of changing the page orientations, cropping each page, setting margins for the output pages, and deskewing them if needed. It handles both text and images pretty well, and allows you to manually modify the output where necessary. Its user interface is a bit odd, and there are definitely some things I would like to see added or improved, but for open source I have to say it's just great. (I am using the experimental version.) Thank you, Joseph Artsimovich, Nate Craun, et al.!!!
From there, I also use Irfanview, which is also a free program, to batch process changes to the page images' DPI and mode (like grayscale, etc.). For the OCR and PDF generation, I use ABBYY, which does an incredible job of both image and character recognition. ABBYY is not free, but is totally worth it, IMHO. For simple books (mostly text, black and white, 300-400 pages), the post processing takes very little time, but for one with lots of images, the post processing takes about as long as scanning the book, for a total of about 4 hours from first scan to being able to do a PDF search on the book.
Comments may be directed to gary at liming daught org.
Thanks for viewing this build log!