Greg Newby, CEO of Project Gutenberg, says he’s open to creating .epub files on the fly via the main Gutenberg site. He is also willing to consider links to sites that store IDPF-standard files in ready-to-go form.

At the same time, however, Greg writes on a Gutenberg list that he needs convincing evidence that .epub will indeed be an open, honest standard without gotchas coming in from Adobe or any other company. He’ll also need the right software tools—free and open source.

“On the fly” explained

But first, what does “on the fly” mean? It means that Gutenberg would treat .epub as it now does Plucker.

You’d type in a number to identify the e-book file, then wait while the conversion gears ground away and generated .epub from another format such as HTML or .txt. This isn’t an optimal solution, but it’s a good start, especially if Gutenberg also uses direct links to sites with ready-to-go .epub.
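
To make the idea concrete, here is a minimal sketch in Python of the "conversion gears" for the .txt case. It is an illustration under stated assumptions, not a description of how Gutenberg's servers work: `text_to_xhtml`, `convert_on_request`, the `load_text` callback and the "Book N" title are all hypothetical names, and the sketch stops at producing the XHTML document that would later be packed into an .epub.

```python
import re
from xml.sax.saxutils import escape

# Skeleton of an XHTML 1.1 document; the placeholders are filled in below.
XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
{body}
</body>
</html>
"""


def text_to_xhtml(title, text):
    """Turn plain text into XHTML, treating blank lines as paragraph breaks."""
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = [
        "<p>" + escape(" ".join(block.split())) + "</p>"  # rejoin hard-wrapped lines
        for block in blocks
        if block.strip()
    ]
    return XHTML_TEMPLATE.format(title=escape(title), body="\n".join(paragraphs))


def convert_on_request(book_number, load_text, cache):
    """Convert a book the first time it is requested, then reuse the result.

    `load_text` is a hypothetical callback that returns the plain text for a
    book number; `cache` is any dict-like store.
    """
    if book_number not in cache:
        title = f"Book {book_number}"  # placeholder title, an assumption of this sketch
        cache[book_number] = text_to_xhtml(title, load_text(book_number))
    return cache[book_number]
```

The cache is the part that softens the wait: only the first visitor who punches in a given number pays for the conversion.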

Catnip for consumers, if IDPF doesn’t play games

The benefit for Gutenberg visitors: future Sony Readers, expected to come with Digital Editions (Adobe’s software that can read .epub, not just PDF), could read the IDPF format without conversion hassles at the human readers’ end. The same could happen with Bookeen’s forthcoming Cybook Gen3 and, in fact, with an entire generation of E Ink machines with .epub-reading capabilities, whether or not those capabilities originate from Adobe software. Apparently they won’t in the case of the Cybook.

Adobe funds the IDPF, whose executive director, Nick Bogaty, is about to start a job there. While the public domain community will benefit from .epub and mustn’t walk away from the possibilities by ostracizing the IDPF just because Adobe’s involved, we also need verifiable assurances that no one will compromise the integrity of the standard. Integrity is the key to many different brands of commercial software and hardware—not to mention open source freeware and shareware programs—working with .epub from Gutenberg and other sites.

How public domain sites could help the standard

If the public domain community embraces .epub, even less than fully, such as through the on-the-fly approach, it would be a significant victory for the standard. Many more people probably read public domain books than read those in the DRM-infested proprietary formats from large publishers.

The cranes and wrecking balls would finally be at work tearing down the Tower of eBabel, or at least we’d be closer to this long-awaited event than we are now. That would help everyone, from Project Gutenberg and self-publishers to Random House, Simon & Schuster and HarperCollins. We need to make e-books as easy to use as CDs; nontechie consumers are sick, sick, sick of eBabel.

Greg’s wise conditions

While showing flexibility, Greg also insists on conditions, wise ones as I see it. For example, he doesn’t want DRM inflicted on Gutenberg, and he demands free, open source software that can assure that an .epub file is really an .epub file, as opposed, say, to something with PDFish elements hidden inside.

Amen to that! I want there to be provisions for checking the files spewed out by .epub-related tools, as well as a systematic way of guaranteeing that e-reading software is truly compatible. Otherwise such programs, both the creation and reading varieties, shouldn’t qualify for an .epub logo. Furthermore, third parties independent of Adobe and even the IDPF should be able to vet the tools, to make certain no one has compromised them.
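
As a rough illustration of the kind of check Greg is asking for, here is a short Python sketch. It is nowhere near a conformance validator, and the function name `check_epub` is my own invention. It tests only a few container basics (a zip archive whose first entry is an uncompressed `mimetype` file, plus a `META-INF/container.xml` that points at a package file actually present) and adds one crude heuristic for "PDFish elements": any member whose first bytes are `%PDF-`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

EPUB_MIMETYPE = b"application/epub+zip"
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def check_epub(source):
    """Return a list of problems; an empty list means these basic checks passed.

    `source` is a file path or a binary file-like object.
    """
    problems = []
    try:
        archive = zipfile.ZipFile(source)
    except zipfile.BadZipFile:
        return ["not a zip archive"]

    with archive:
        infos = archive.infolist()
        names = [info.filename for info in infos]

        # The container expects an uncompressed "mimetype" entry to come first.
        if not infos or infos[0].filename != "mimetype":
            problems.append("first entry is not 'mimetype'")
        else:
            if infos[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' entry is compressed")
            if archive.read("mimetype") != EPUB_MIMETYPE:
                problems.append("'mimetype' does not say application/epub+zip")

        # META-INF/container.xml must name a package file that exists.
        if "META-INF/container.xml" not in names:
            problems.append("missing META-INF/container.xml")
        else:
            try:
                root = ET.fromstring(archive.read("META-INF/container.xml"))
            except ET.ParseError:
                problems.append("META-INF/container.xml is not well-formed XML")
            else:
                rootfiles = root.findall(f".//{{{CONTAINER_NS}}}rootfile")
                if not rootfiles:
                    problems.append("container.xml names no package file")
                for rootfile in rootfiles:
                    target = rootfile.get("full-path")
                    if target not in names:
                        problems.append(f"package file {target!r} is not in the archive")

        # Crude heuristic only: flag members that start with the PDF signature.
        for name in names:
            with archive.open(name) as member:
                if member.read(5) == b"%PDF-":
                    problems.append(f"{name!r} looks like a PDF")
    return problems
```

A real validator would also have to check the package file and every content document against the specs; the point here is only that such checks can be written with free tools and inspected by anyone.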

Other open source .epub tools needed

The goal as I see it should be to allow the existence not just of commercial software for creation and reading but also the free, open source variety—a much-needed alternative to the present mess where individual corporate interests come ahead of those of e-bookdom and society at large.

In Gutenberg’s case, there’ll need to be Linux software to create .epub from another format, once someone punches in the number. Might Adobe, perchance, be willing to pay the costs or otherwise facilitate this? It’s a member of the Open Content Alliance, and Adobe’s Bill McCoy has portrayed himself as a friend of the public domain. Helping Gutenberg, without strings attached, would be a great act of goodwill. Or what if a foundation without any commercial interests, such as MacArthur or George Soros’s Open Society Institute, provided the funding? Or how about Brewster Kahle, the philanthropist behind the Internet Archive and related projects, including OCA, which could benefit mightily from a standard, reflowable format fit for cellphones and PDAs, not just desktops? Could he finance open source software useful to OCA and Gutenberg alike?

Or, as a funder for conversion software in general and not just for Gutenberg, how about the American Library Association or another large library or university group? Or maybe even a grant from one of these groups to the Digital Library Federation, as an overseer?

Volunteer programmers might also be the solution for Gutenberg and similar groups. Anyone from the MobileRead programming community care to participate?

But what about format validation and related precautions—a prerequisite for Greg to be interested? The IDPF apparently has no such animals right now. I’d welcome a roadmap with an ETA and assurances that the tools will be free and open source. Given Adobe’s participation in the Open Content Alliance, would the company be willing to spin off the job to a project overseen by someone like Brewster, who is far more trusted in the open source and public domain communities than Adobe is?

The rewards of openness: Avoidance of proprietary hell

David Moynihan’s struggle with the use of Mobipocket at Munsey.com shows the difficulties of working with the alternatives to openness, proprietary systems. I want an open approach and ready-made solutions for small guys like David. WordPress hasn’t done too bad a job getting the masses online with blogging. We need an open source WordPress equivalent for book publishers, including the “self” variety. The big boys can still sell their CMS systems and offer services, too, not just software. But WordPress is there for the masses.

By the way, I’m wondering if none other than the WordPress people or the Drupal people could do .epub-creation modules. Plus, an OpenOffice plug-in would be nice. The .epub standard is too new for powerful open-source tools to exist now.

If the IDPF uses gotchas from Adobe or other companies to make open source impossible, yes, I’ll join Moynihan in his grumpiness. But let’s give the IDPF a chance rather than coming up, as his Reg commentary did, with a pack of distortions and lies. Consumers are begging for an end to the Tower of eBabel, and the public domain community shouldn’t try to close off the options, as long as they’re not just Trojans from Adobe or elsewhere. I take it for granted that software companies will attempt Trojans; hence the need for well-vetted validation and other precautions.

Adobe bastardizing .epub already?

In that vein, let me quote from Aaron Miller of Book Glutton, a small company that is eager to use .epub while at the same time insisting on its integrity.

“It makes me nervous to have Adobe so prominent in the IDPF. They seem committed to the standard, but they’re using XSL formatting objects for their XHTML .epubs, and it would be very difficult to get a web browser to gracefully use XSL-FO.

“If they decided to lean heavily on that, to the point that .epubs without an .xpgt file (their XSL stylesheet) became unreadable in DE, then we would have a problem.”

Hello, Adobe and IDPF? What’s your response? Can you assure Aaron that you’ll keep .epub and Adobe-specific things separate? Meanwhile he also reports he can’t even get DRMed files to work in Digital Editions.

Different issue from the specs

That said, keep in mind that these are Adobe issues, not issues with the actual specs, which a bunch of other companies have vetted. Let’s not repeat the confusion that David Moynihan showed in his nutty commentary in the Register.

What will count is close monitoring of the IDPF and Adobe to make certain that we don’t get a de facto Adobe-owned standard.

This is why the open source community needs to embrace .epub and get the right tools out there, so that plenty of people will notice if proprietary gotchas creep into .epub, either as a spec or in the world of implementations.

A warning to the IDPF and Adobe

In Adobe’s place, I would take fast action now with regard to validation-related tools and other safeguards, as well as either helping Gutenberg or encouraging others to do so. I also suggest that Adobe encourage creation of free open source tools in general, plus go ahead with the phased-in logo plan I’ve suggested, so as to decouple the format and DRM issues.

Procrastination will harm the standard. I jumped out of OpenReader in part because the Consortium didn’t do implementation well (not entirely OpenReader’s fault since it lacked sufficient support). Adobe’s Bill McCoy earlier was joking about a Cargo Cult mentality—the expectation that the software would just materialize for the OpenReader standard, simply because OpenReader deserved it.

Same applies to .epub itself, Bill. If you sincerely want it to be a standard, an admirable legacy for you and Adobe alike, then you need to look beyond the current commercial solutions and nurture open source as well, for the sake of .epub’s long-term credibility and the cause of e-books as a serious, trustworthy medium.

Who knows? The open source community might well come up with concepts that would show up in commercial software; potential synergies abound. IBM hasn’t done too badly by reaching out to and encouraging the open source community, and by moving somewhat from a product-focused approach to a services-oriented one.

That just might be a rather attractive role model for Adobe.

History vs. the current and tangible: .epub-related code

Such actions would help Adobe overcome a rather understandable fear of the company in the public domain and open source communities. Some years ago a Russian programmer visiting the U.S. went to jail for circumventing Adobe’s DRM. The company has also worked to make use of its standards a legal requirement for those dealing with the U.S. government. David Moynihan raised these issues in the Register, and I certainly can understand why.

What counts, however, in the end, won’t be history but something current and tangible, at least on computer monitors: .epub-related code. I hope that Adobe and the IDPF will listen and learn from OpenReader, which was valuable in prodding the IDPF into action, but which failed as a real-life standard.

Although major publishers will be releasing .epub books with help from translation houses, let’s not forget the smaller publishers and the public domain sites whose e-books have probably found many times more readers. Their use of .epub will greatly speed its adoption, thereby benefiting the Adobes and the small fry alike.

16 COMMENTS

  1. Suggested solution:
    release Adobe Digital Editions (without DRM) under a dual-license:
    GPL and commercial.
    Keep the DRM as an add on in the commercial version.
    Adobe would make the same money because publishers and device manufacturers would opt for the commercial license.
    On the other hand the open-source version could pick up momentum, would stabilize the standard and would “lead the way” for standardization.
    DE is “free” anyway. The only reason for not opening it up is to obscure it in order to protect the content creation tool from FOSS products. But that way we cannot talk about an “open standard”, can we?

  2. It’s important to understand that one does not need Adobe InDesign to create EPUB!

    EPUB is simply a zip file that contains an OPS Publication. In turn, an OPS Publication is simply a file set containing one or more XHTML 1.1 documents (representing the ebook content), an XML file called the “Package” (which contains metadata, a table of contents, and other optional stuff), along with optional CSS style sheets and PNG/JPG images.

    An OPS Publication is not that much different from a standards-conforming web site, for gosh sakes!

    I use a plain text editor to produce OPS Publications. Others may wish to use some tool to create the XHTML 1.1 document(s) and the Package.

    On the other side of the coin, if Opera and/or Firefox so choose, they could quite quickly add a module to their browsers to render EPUB Publications. I’ve actually approached Håkon Lie at Opera about this possibility.

    My Lordy, I really do need to write the “EPUB Demystified” article that’s been burning in my mind for the last few weeks. All the misconceptions floating around about EPUB, especially those in David Moynihan’s article, definitely need to be cleared up.

    Btw, for those who wonder who I am, I’m one of the principal technical contributors to the three IDPF standards which underlie EPUB, so I do know a little bit about the EPUB open standard.

  3. “My Lordy, I really do need to write the “EPUB Demystified” article that’s been burning in my mind for the last few weeks.”

    —–

    Jon, we’re all waiting 🙂 Your article would be very helpful, although I would prefer someone wrote a simple tool to handle much of the manual formatting for me.

  4. Me, too, Jon. Even better, why not lay out your own suggestions for IDPF and PG. I’m really convinced that the IDPF should encourage open source implementations and public domain uses, not just tolerate them. Otherwise the IDPF may just be perceived as a tool of Adobe. Thanks. David

  5. At heart, .epub is about accessibility (DTBOOK) and CSS support. Really, after you get things in a relatively structured format, all you need to worry about is which CSS features work for which device/software. Perhaps if there were some way to check that your CSS conforms with OPS style guidelines, that would make things easier. If Project Gutenberg created 2 or 3 standard stylesheets and a script for creating the package/manifest and a script for zipping everything, you wouldn’t really need an editor.

    Fortunately there are a lot of people adept at composing CSS.

    What I can’t figure out is what table support means. OPS requires table support, but Sony Reader and Mobipocket offer little if any support for it.

  6. Feedbooks will have on the fly epub files this month. We’ll enable this feature on books first and then on RSS feeds too.

    In the future, we’ll work on improving the overall look of these epub files and would also like to add a “custom epub” feature on the website, where anyone will be able to easily customize the CSS and layout of the book.

  7. “It’s important to understand that one does not need Adobe InDesign to create EPUB!”

    I understand what you are saying but I disagree.
    The de facto implementation of the IDPF standard is Adobe’s Digital Editions, but to really get its useful capabilities you need to use Adobe’s creation tool. The reason is that they extended the specification in their implementation but did not document these extensions, at least not for the public.

    IDPF/epub is all about preparing Adobe to survive in a post-paper world where the electronic representation of text on paper (PDF) will no longer be meaningful. Kudos for Adobe for realizing this and taking steps… but it’s not really an “open standard” then, is it?

  8. Yes, the EPUB standard is completely open. Anyone may build reading systems based on it with no license, and anyone may build publications to that spec with no license. It is akin to building web sites and web browsers.

    Regarding Adobe’s “extensions” in Digital Editions (DE), it is my understanding that any conforming EPUB will render in DE according to the requirements of the IDPF specs, and does not need to include any “extensions” to properly render.

    Now, if DE does not properly render ordinary EPUB’s (those without any Adobe extensions), then I’d like to know (and I do plan to experiment with this soon.) Tamas, do you have reason to believe this is the case? If so, what are the specific extensions required to make an EPUB look decent in DE?

    Btw, most of the EPUB’s that the major publishers will produce will be done by commercial conversion houses, and at least for the conversion houses I’ve talked with, they will not use InDesign to generate the EPUB Publications.

  9. Hi all,

    I am writing an article on ebooks, and .epub, but one main concern is over whether the DRM could actually hold the format back from a consumer point of view.

    Could, for example, the big software and publishing houses hijack this format and use it to increase revenues from ebooks?

    Cheers

  10. Adobe Digital Editions will happily work without an .xpgt file and it will continue to work. The only downside of not using .xpgt is not being able to control the number of columns and the header location. The .xpgt file is a proper extension of the epub format; it just contains additional styling information. It can always be safely ignored by viewers that don’t understand it. If you don’t need (or want) to use it, then by all means don’t use it – no problem.

    You can quote me to Greg on this.

    Peter Sorotokin

    Digital Editions lead engineer

    Adobe Systems Inc.

  11. This question is targeted at Peter Sorotokin.

    Is there a document that spells out exactly what subset of the OPS spec Digital Editions supports? This would help quite a bit. I have been using trial and error to see what worked and what didn’t.

    I have had two problems so far. One is related to rendering and the other to the DE interface.

    First, using CSS to create drop caps works, but the positioning of the initial letter and the succeeding lines isn’t correct. The drop caps display correctly in IE6, Firefox 2 and the Lector plugin.

    Second, there needs to be a “Back” button or other means to return from a hyperlink jump. I want to hyperlink a word, which jumps to an entry in a glossary. This works, but then I have no way to return to where I was reading. Every method of implementing “Back” functionality that I have tried returns me to the start of the document, which is not good. Hyperlinking every glossary word to reverse jump is not only needlessly cumbersome, but doesn’t work if more than one word in the text is linked to the same glossary entry.

    Not exactly related to the above, but to DE overall, is how slow it is. Opening an epub ebook or paging to a new chapter (when using separate files for each chapter) takes a very noticeable time. When clicking on a hyperlink, it takes 8-10 seconds to actually jump to the target. Something is wrong here, when these things take so long on a modern P4 computer with 1GB of RAM. I have done these types of things on old, much slower PDAs, without noticeable delay.

    One last interface suggestion: give us some way to change the program colors. That dark grey on black may look cool, but it isn’t very easy to see.

    None of this is intended as criticism, but is given as customer feedback, in the hope that DE will improve.

  12. Joseph,

    Your suggestions/criticisms all make sense. In terms of drop caps, I’d need to see the code.

    Digital Editions is slower than it could be, but 8 seconds is certainly way too slow. By chance, are you trying to format your book as a single gigantic XHTML file? If so, break it up into small chapters – that should help performance a lot.

    In general, questions for Adobe should go to http://blogs.adobe.com/digitaleditions/ – you are really lucky that I looked at this entry again.

    Peter

  13. I had seen the blog link you posted before, but I see no way to post a question, other than to comment on an existing blog entry. Am I missing something? A support forum would be better suited, rather than a blog.

    I still would like to see Adobe post a document that lists exactly which parts of the OPS spec that DE supports currently and perhaps when we can expect the rest of the spec to be supported.

    The article posted at the following address mentions a few things that DE does not support, but the article doesn’t seem to be officially from Adobe. I assume the author either had some inside knowledge or used trial and error to determine what didn’t work (as I am doing).
    http://www.hxa7241.org/articles/content/epub-guide_hxa7241_2007.html

  14. Peter, here’s an invitation to contribute to the main part of the TeleRead blog to keep us up to date on DE. Write me at drNOSPAMteleread.com. We’re reaching tens of thousands of people each month, so you’d get exposure to additional opinions. And the ideas here can result in a better product. I’m all in favor of a mix of both the commercial and open source models, which can be extremely synergistic.

    In a somewhat related vein, I’m especially interested in the direction and progress of the DE version for the Sony Reader. Either directly or by leaning on Sony, you’d do well to offer heavier fonts, at least as an option, so the E Ink screen is easier to read in dim light. Your thoughts on this?

    An aside: the newer Reader is a considerable improvement over the old one in the contrast area, but it’s still not the same as an LCD in that regard.

    Thanks!
    David
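
Several comments above describe an EPUB as a zip file holding XHTML documents plus a package file, and suggest that a script could generate the package and do the zipping. What follows is a minimal Python sketch of that idea, offered as an illustration rather than anyone's actual tool. The function names, the `OEBPS/` folder, and the `content.opf` and `style.css` file names are my own choices. The sketch also leaves out the NCX table of contents, which a complete publication would need as well.

```python
import io
import zipfile
from xml.sax.saxutils import escape, quoteattr

# Tells a reading system where the package file lives inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_package(title, identifier, language, chapter_names, has_css=False):
    """Generate the package file: metadata, a manifest of files, and reading order."""
    items, refs = [], []
    for i, name in enumerate(chapter_names, start=1):
        items.append(f'    <item id="ch{i}" href={quoteattr(name)} '
                     'media-type="application/xhtml+xml"/>')
        refs.append(f'    <itemref idref="ch{i}"/>')
    if has_css:
        items.append('    <item id="css" href="style.css" media-type="text/css"/>')
    manifest = "\n".join(items)
    spine = "\n".join(refs)
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{escape(title)}</dc:title>
    <dc:identifier id="bookid">{escape(identifier)}</dc:identifier>
    <dc:language>{escape(language)}</dc:language>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""


def build_epub(title, identifier, chapters, language="en", stylesheet=None):
    """Zip everything up. `chapters` is a list of (filename, xhtml_text) pairs
    in reading order; the return value is the archive as bytes."""
    package = build_package(title, identifier, language,
                            [name for name, _ in chapters],
                            has_css=stylesheet is not None)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", package)
        if stylesheet is not None:
            archive.writestr("OEBPS/style.css", stylesheet)
        for name, xhtml in chapters:
            archive.writestr("OEBPS/" + name, xhtml)
    return buffer.getvalue()
```

With something like this, a site could keep two or three standard stylesheets on hand and pass one in as `stylesheet`, much as the fifth comment proposes.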
