Raspberry Pi Digitizes and Reads Books

You can build your own book reader that digitizes a book and then reads it aloud. You will need the tiny single-board computer Raspberry Pi (RBPi), a BrickPi, and some Lego motors and blocks. The finished reader works through a book one page at a time: it photographs the page, converts the picture into a text document, and then moves on to the next page.

The book reader prepares a page for turning with the help of a rotating Lego motor. Gravity does its bit by providing just enough friction on the page to let it inch forward. Finally, a Lego beam arm swings over and forces the page to turn.

Once on a new page, the RBPi camera snaps an image of the page and saves it as a JPEG file. The RBPi then uses an open source Optical Character Recognition (OCR) program to convert the image into text and saves the result. Next, free text-to-speech software reads the page aloud over the connected speakers. Finally, the BrickPi operates the Lego modules that turn to the next page of the book.
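The cycle described above can be sketched as a simple loop. This is only an illustration, not the project's own code: the function names (capture_command, digitize_book, capture_with_camera) and the file naming scheme are hypothetical, and the use of the raspistill command-line tool for the camera is an assumption.

```python
import subprocess


def capture_command(page_number):
    """Build a raspistill call that saves one page as a JPEG (assumed tool)."""
    image_path = "page_%03d.jpg" % page_number  # hypothetical naming scheme
    # -t is the delay in milliseconds before the shot, -o is the output file.
    return ["raspistill", "-t", "1000", "-o", image_path], image_path


def digitize_book(num_pages, capture, ocr, speak, turn_page):
    """Photograph, convert and read each page, turning the page in between.

    The four steps are passed in as functions so that each one can be
    tested or replaced on its own.
    """
    texts = []
    for page_number in range(1, num_pages + 1):
        image_path = capture(page_number)   # camera saves a JPEG
        text = ocr(image_path)              # OCR turns the image into text
        speak(text)                         # text-to-speech reads it aloud
        texts.append(text)
        if page_number < num_pages:
            turn_page()                     # BrickPi motors flip the page
    return texts


def capture_with_camera(page_number):
    """Take the picture with the real camera (needs an RBPi with a camera)."""
    command, image_path = capture_command(page_number)
    subprocess.check_call(command)
    return image_path
```

On the RBPi itself, capture_with_camera would be passed as the capture step, with similar small wrappers for the OCR, speech and page-turning steps.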

For this project, you will need an RBPi (Model B), an RBPi camera, the BrickPi, the BrickPi Power Pack, Raspbian Wheezy on an SD card, a Wi-Fi dongle and a Lego Mindstorms kit. The Lego kit can be either an EV3 or an NXT system.

Because the camera has to capture a clear image of the page, you will need good lighting. Mounting the RBPi and the BrickPi above the book lets the camera sit squarely over the page. Place lights on either side, angled so that they illuminate the page from both sides.

You may have to calibrate the page-turning mechanism until it runs reliably. This is done by adjusting the values of the variables in arm_test.py. The motor connects to Port A of the BrickPi. For calibration, change the values of speed_arm, speed_roller, t1 and t2 and test again until the page turns flawlessly.
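To give a feel for what these four values might control, here is a minimal sketch of a page-turn sequence. It is not the contents of arm_test.py. The variable names come from the text above, but their meanings are assumptions: speed_roller and speed_arm are taken to be motor power settings, and t1 and t2 the number of seconds each motor runs. The motor labels "roller" and "arm", the default numbers and the set_speed function are hypothetical.

```python
import time


def turn_page(set_speed, speed_roller=100, speed_arm=200,
              t1=1.0, t2=0.5, sleep=time.sleep):
    """Run one page turn: inch the page forward, then swing the arm over.

    set_speed(motor, speed) is a hypothetical function that drives one
    motor; on real hardware it would wrap the BrickPi motor calls.
    All default values are placeholders to be tuned by trial and error.
    """
    set_speed("roller", speed_roller)  # assumed: roller nudges the page
    sleep(t1)                          # assumed: t1 = roller run time (s)
    set_speed("roller", 0)
    set_speed("arm", speed_arm)        # assumed: arm pushes the page over
    sleep(t2)                          # assumed: t2 = arm run time (s)
    set_speed("arm", 0)


def print_speed(motor, speed):
    """Stand-in for real motor control, handy for a dry run at a desk."""
    print("%s motor -> %d" % (motor, speed))
```

Calling turn_page(print_speed, t1=0.1, t2=0.1) prints the sequence without any hardware attached, which makes it easy to check the order of the steps before trying new values on the real mechanism.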

The camera is placed in position and held there with two Lego Technic beams. Once the camera is fitted in place, you may have to change its focus, as the camera focus is typically set at infinity. Although the camera may give acceptable results without adjustment, focusing on the page gives better OCR results. To change the focus, read here and here for guidance.

Once the camera is adjusted, take a few images and check that the whole page is sharp. If the image is not clear, adjust the focus and angle again. If the image looks good, it is time to test whether the OCR can convert it. Set up the Tesseract OCR engine, and use it to convert an image with “tesseract image.jpg o”. The output will be o.txt, which the text-to-speech engine eSpeak should now be able to read aloud. eSpeak lets you choose the reader’s gender and accent. Once you connect a pair of headphones or speakers to the RBPi, you should be ready to go. For more details on this project, refer here.
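The OCR and speech test can also be run from a short Python script. The sketch below simply wraps the two command-line tools. The helper names (ocr_command, speak_command, read_page_aloud) are hypothetical, and the voice "en+f3" (an English voice with a female variant) and the speed of 150 words per minute are example choices rather than settings taken from the project.

```python
import subprocess


def ocr_command(image_path, output_base):
    """Build the Tesseract call; Tesseract adds ".txt" to the output name."""
    return ["tesseract", image_path, output_base]


def speak_command(text_path, voice="en", words_per_minute=150):
    """Build the eSpeak call that reads a text file aloud.

    -v selects the voice (accent, plus an optional variant such as +f3
    or +m3 for a female or male voice), -s sets the speed in words per
    minute, and -f names the text file to read.
    """
    return ["espeak", "-v", voice, "-s", str(words_per_minute),
            "-f", text_path]


def read_page_aloud(image_path, output_base="o", voice="en+f3"):
    """Convert one image to text and read it (needs both tools installed)."""
    subprocess.check_call(ocr_command(image_path, output_base))
    subprocess.check_call(speak_command(output_base + ".txt", voice))
```

With both tools installed, read_page_aloud("image.jpg") runs the same “tesseract image.jpg o” command as above and then reads o.txt aloud. Running "espeak --voices" lists the voices available on your system.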