Mark runs the blog eBooks Just Published, and he posted a request that I’m republishing because it’s for a good cause:

DAISY is an XML-based e-book format created by the DAISY international consortium of libraries for people with print disabilities. DAISY implementations have focused on two main types: audio ebooks (digital talking books) and text ebooks. DAISY text ebooks are similar in many ways to the ePub format. DAISY uses the DTBook XML document type which provides a rich set of tags for marking up various elements of a book, making it easy to navigate and accurately convert to spoken audio using text to speech.

I’ve been working on ebook-to-audiobook conversion for the next release of Text2Go, which is now in beta. I’ve provided support for ePub and was hoping to include support for DAISY DTBook. The DAISY specification is freely available and there is a sample ebook in DTBook format. I’ve created a simple DTBook reader which will read that sample. However, I need to test it with a large range of DTBooks from multiple sources before I can be confident that I’ve provided a bullet-proof implementation.

This is where I’ve run into problems. I just can’t seem to find a good source of ebooks in DTBook format. Are there free or even paid sources of such ebooks on the Internet, or are they only available through libraries or sites catering to the visually impaired? Perhaps I haven’t hit on the right keywords to use in Google? It’s a real shame, as I would like to provide first-class support for the DAISY DTBook format because it’s been designed specifically for text-to-speech applications. If you’ve discovered any good sources of DAISY ebooks, please let me know. Thanks in advance.
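For readers wondering what a "simple DTBook reader" might involve, here is a minimal sketch in Python. It is not Text2Go's code, which isn't shown in the request. It assumes a DTBook file uses elements such as `doctitle`, `h1` and `p`, and that the namespace URI in the sample is the one used by the 2005 revision. The function name `spoken_blocks` and the inline sample document are made up for illustration. The sketch walks the document in reading order and collects the text a text-to-speech engine would be handed, skipping anything (such as `pagenum`) that is not in the list of spoken elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; real DTBooks are much richer.
# The namespace URI is an assumption based on the 2005 revision of the standard.
SAMPLE = """
<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" version="2005-3">
  <book>
    <frontmatter>
      <doctitle>A Sample Book</doctitle>
    </frontmatter>
    <bodymatter>
      <level1>
        <h1>Chapter One</h1>
        <pagenum>1</pagenum>
        <p>It was a <em>dark</em> night.</p>
      </level1>
    </bodymatter>
  </book>
</dtbook>
"""

# Elements whose text we choose to read aloud (an illustrative subset).
SPOKEN = {"doctitle", "h1", "h2", "h3", "h4", "h5", "h6", "p"}


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def spoken_blocks(xml_text):
    """Return (element name, text) pairs in document order."""
    root = ET.fromstring(xml_text)
    blocks = []
    for elem in root.iter():  # iter() walks the tree in document order
        name = local_name(elem.tag)
        if name in SPOKEN:
            # Join text from inline children (em, strong, ...) and
            # collapse runs of whitespace into single spaces.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                blocks.append((name, text))
    return blocks


if __name__ == "__main__":
    for name, text in spoken_blocks(SAMPLE):
        print(f"{name}: {text}")
```

Matching on local names rather than a hard-coded namespace keeps the sketch tolerant of files from different producers, which is exactly the kind of variation a wider pool of test books would expose. A real reader would also need to handle notes, tables, lists and page numbers nested inside paragraphs.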


  1. I don’t know how many publishers would actually put their source files out on the web, and DTBook is essentially the source code for the text that goes into a DAISY book…

    You might try talking to some of the K-12 textbook publishers. There are mandates for these publishers to provide DAISY books in some states (Texas, I think?), so you might have better luck getting some samples from them, if you can give them a business reason to use your product.

    Hope this helps.

  2. I know some DAISY-related terminology can be confusing, so here is a quick overview:
    “DAISY 3” is the short name of the ANSI/NISO Z39.86 standard, released in 2002 and revised in 2005; it is currently the most recent version of the DAISY Standard. The previous version is DAISY 2.02, and there is an ongoing revision called DAISY 4 (also known as DAISY Next or ZedNext).

    “DTBook” = “DAISY XML”: it is an XML document representing the textual content of a DAISY book. It is a custom XML grammar with book-specific markup (e.g. there are elements for frontmatter, page, footnote, poem, etc.). See the normative DTD schema for more information. The DTBook grammar is part of the DAISY 3 specification: a DAISY 3 book with textual content will contain a DTBook document; however, not all DAISY 3 books have a DTBook (an audio-only DAISY book has no textual content, hence no DTBook).

    “DTB” = “Digital Talking Book” = “DAISY Book” is the entire file set representing the book. It can be composed of textual content, audio files, synchronization files, navigation control files, etc. Its composition depends on the type of the book (e.g. audio-only, full-text full-audio, text-only, etc.) and also on the version of the standard (a DAISY 2.02 DTB is different from a DAISY 3 DTB).

    Samples of DAISY books:
    DAISY Consortium website

    gh, LLC has posted sample books in DAISY 3 format; you can also download a free trial of a DAISY Player

    Publications in DAISY and EPUB formats plus a free trial of a DAISY Player on the Dolphin Computer Access website

    The Internet Archive (text-only DAISY)

    Offering DAISY books to their qualified members:

    Bookshare (text-only DAISY)
    RFB&D (human narrated DAISY)

    Other sources:
