Convert your own books: a new resource

Now here's an interesting blog. It is dedicated to showing you how to convert your own books to digital format for your own use. Nothing illegal about that, and it has a lot of advantages over buying a DRMed copy of a book you already own. The article is reprinted with their permission. Take a look:

Digitizing your own books

The Book Ripper community came together to take the difficulty out of digitizing books. Unlike music, movies, or even loose paper, books have proved surprisingly difficult to break out of their analog format. Very complicated robotic scanners, costing tens of thousands of dollars, have been built to address this problem, but their size and cost make them practical only for large institutions, leaving individuals who want digital books at the mercy of book publishers.

As it turns out, digitizing your books is not hard. Advances in small cameras make it possible to achieve high-quality results cheaply, at a rate of 600-900 pages per hour. That is what we do, and there are a number of advantages compared to getting your ebooks from publishers.

Cost

The most impressive advantage is cost. For people who own books already, getting digital copies of those books from publishers is an expensive prospect. Commercial ebooks have no standard price, and prices vary widely by outlet, but let’s assume a $10 price for each ebook. The book ripper design we use costs around $250, which includes the price of two small point-and-shoot cameras. If you own more than 25 books, building a scanner will be cheaper than buying electronic editions. For those of us who own hundreds or thousands of books, the math becomes obvious.
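The break-even arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the $250 scanner cost and the $10 ebook price are the figures assumed in the article, the function names are made up for this example, and the value of your own time is deliberately left out.

```python
import math

# Figures assumed in the article; real prices will vary.
SCANNER_COST = 250.0   # approximate cost of the book ripper, cameras included
EBOOK_PRICE = 10.0     # assumed flat price per commercial ebook


def break_even_books(scanner_cost: float, ebook_price: float) -> int:
    """Smallest number of books for which building the scanner is
    strictly cheaper than buying an ebook edition of each one."""
    if scanner_cost < 0 or ebook_price <= 0:
        raise ValueError("scanner_cost must be >= 0 and ebook_price > 0")
    return math.floor(scanner_cost / ebook_price) + 1


def savings(num_books: int,
            scanner_cost: float = SCANNER_COST,
            ebook_price: float = EBOOK_PRICE) -> float:
    """Money saved by scanning num_books instead of buying them again.
    Negative means buying the ebooks would have been cheaper."""
    return num_books * ebook_price - scanner_cost


if __name__ == "__main__":
    print(break_even_books(SCANNER_COST, EBOOK_PRICE))  # first library size where scanning wins
    for n in (10, 25, 100, 1000):
        print(n, savings(n))
```

With the article's numbers, a 25-book library exactly breaks even, and every book beyond that saves the full assumed ebook price.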


In the wake of Amazon’s memory-holing of George Orwell’s works, its retroactive disabling of the text-to-speech capabilities on new readers, and the continuing industry-wide obsession with DRM, control over your ebooks has been gaining visibility as an issue in the digitization of our vast printed catalogue. With publisher-made ebooks, the publisher controls which devices can read them, what software can do to them, where they can be stored, how many times you can download them, and how long you have access to them. Some people doubt so strongly that the closed formats publishers sell will remain readable that they suggest insurance as a way to cover your losses when your digital copies disappear.

The books that you convert, you control.


Of course, the illegal distribution channels release everything in free formats, and release it all for free, so they would seem to be ahead of publishers on both fronts, with much less effort than home book ripping. Where the illegal copy market falls short, besides the obvious issue of copyright infringement, is in the reliability of its versions.

Illegal copies are known for typos and OCR errors, lack of text and page formatting, and spotty availability of works. Unfortunately, legal ebooks are known for these same things. Neither can be relied upon as an authoritative representation of the author’s work, and neither offers any way to verify or improve the accuracy of the digital work other than by reference to the printed one.

In contrast, when you scan the books yourself, you retain high quality images of every page. Viewers and other tools will let you jump back and forth from the text to the image versions. OCR can be corrected over time or re-run with better software and formatting can be added or corrected, but only if you have the page images.

Until digital distribution becomes the original and authoritative method of book publishing, as it has for the web, having the page images will remain the only way to guarantee or improve the accuracy of your digital books.

Because you love your books

If you love your books, if you care enough about them that you need every word to be right and you want the digital copy to be as beautiful as the paper one, you should scan them yourself. If you don’t care that much about a book, the publishers’ copy or the illegal copy may be all you need, or you might be better off cutting the spines off your existing books and feeding them through a high speed USB scanner. You can always recycle the pages afterwards.

If nothing but the best will do, or no other options are available, come on over and see how easy it is.

6 Comments on Convert your own books: a new resource

  1. Now that might be the most practical and easily-used book scanner I’ve ever seen. And as someone who would love to be able to easily digitize his book collection, it looks like the first device I may actually build.

  2. A lot of things come to mind after watching the demo video. First, you have to turn each page and then replace the book ripper on top of the book. At the high-end estimate of 1000 pages per hour, it would take 35 hours to scan a 100-book library, assuming an average of 350 pages per book. At 1000 books it would be 350 hours. But my personal book library is closer to 3000 books, so that would be at least 1050 hours, or about 26 weeks if I worked at it for a standard business day. That is not going to happen. So maybe I should just pick and choose my favorite 100 books to scan.

    Second and more important: Where can these scanned books be read? A book scanned into a PDF is actually less useful than the original book. Sure, I could read it on my MacBook Pro, but that isn’t how I read ebooks. I know that whatever I did to the scanned books, the results would be next to impossible to read on a Kindle 1 or 2 without a lot of proofing, though it might be OK on the DX. The Sony isn’t magic and wouldn’t do much better with scanned books either.

    So to read the scanned books I’d need to spend $500 for the DX. Total cost per book with $250 for the book ripper comes to $7.50 for 100 books excluding labor. Is that such a good deal?

    Assuming $10.00 per book to buy, I would save only $250. Or, would I work 35 hours for $250, or $7.14 per hour, less than minimum wage in Washington state? I don’t think it is worth it. Would you work for that wage?

  3. You’re right: I wouldn’t want to do all of my library, either… probably just a few choice books. But it would be easy enough to use on a newly-purchased book.

    I think the idea behind scanning the pages is to run them through an (automated) OCR system… at least, it’s a viable option if the cameras you use provide a high-enough resolution image. That way, you can convert it to whatever format you choose… you’ll have a truly digitized document, and you won’t need the DX or any PDF-optimized reader at all.

    Of course, the idea that it might not be cost-effective isn’t really the point. The point is that it’s a way to legally digitize your books.

  4. The flip side in all of this is how silly DRM is for books. Sure, plenty of people could rip CDs to get MP3s… but in the case of CDs it was at least theoretically possible to put DRM on the media. In contrast, there is no way DRM could ever be effectively put on a paper book.

  5. DRM and comparisons to corporate versions aside, every book I own and want to own in the future is, at this point, out of print. The publishers and the ebook manufacturers aren’t converting new and old books into digital format nearly fast enough to cover even a small portion of existing literature. I for one don’t have the time and equipment to professionally convert all my books to digital format, nor do I have the option of purchasing them on Amazon. Until there is a fast user-end option for conversion, or a company that pays people to convert them, we need more options.

  6. Danni Gerada // March 13, 2013 at 1:23 pm //

    Surely a communal process would work well: there’s little sense in 100 people scanning the same books. A database would grow, possibly with a minor fee going to those who upload new reads, and surely having to scan ISBNs could verify whether you own the book… Devices would need to be near banned from inside libraries! But yes, the book scanner is a brilliant idea that needs to get its skates on and snowball!

