Tamas Simon's challenge to .epub

September 28, 2007


(Photo: Tamas Simon)

No one’s a bigger defender of the idea of e-book standards than the TeleBlog is.

I’m not entirely happy with the IDPF’s present .epub standards, which don’t assure reliable interbook linking, for example. Even so, at least they’re a start and enjoy the endorsement of some major companies in technology and publishing. But just because .epub’s enemies can be wrong—it’s just a plain lie to say the specs support PDF simply because Adobe Digital Editions does—should we automatically accept what the format’s defenders say?

Going beyond the anti-.epub lies

What if the current .epub isn’t sufficiently useful without proprietary extensions from Adobe? Suppose that it in fact isn’t ready for prime time despite all the vetting it’s gotten from people in technology and publishing and despite the obvious lies of some .epub critics. Ahead is a far more thoughtful critique of .epub from Tamas Simon, a proudly geekish TeleBlog contributor shown in the photo above. You needn’t be an XML and tagging expert to understand what he’s driving at.

“I thought of an .epub challenge to answer Jon Noring’s question about why I am criticizing the IDPF standard saying that without Adobe’s proprietary extensions built into Digital Editions—and only Digital Editions—the standard is really… how should I say?… weak. Try to do the following page layout in epub format:

“Header: One row of text which should contain the title of the current chapter

“Body: the main text area. Some words have small subscript numbers behind them indicating that there is a related footnote

“Footer: It contains the footnotes for the text being displayed in the body area.

“Of course we want to be able to re-flow the document…

“Good luck!

“My goal is to point out that the IDPF standard is not ready for anything useful. It needs a lot of work and a more open evolutionary process.”

OK, Jon, and other .epub defenders, what’s your side?

In the other direction, I’d welcome more examples from Tamas and others about .epub’s flaws and shortcomings. I’m also curious how .epub with Adobe extensions would handle the situation described above.

Other friendly suggestions

Meanwhile, I’d suggest that the IDPF not just get heavy input from Tamas, but also seek out the involvement of other honest skeptics in the formal standards-setting process, even if they’re not working for corporate or institutional members of the organization. And please set up vettable validation tools for .epub readers and creation programs so we can make sure that .epub-related software fully meets publishing requirements without the involvement of proprietary technology from Adobe or anyone else. Let’s make sure that all the necessary features are present.

If an .epub logo is to come about, and I dearly hope it does, let’s make certain that publishers and readers can trust it.

Related: Project Gutenberg is considering the use of OpenDocument Text, among other formats, and you can bet that along the way, some PG people are questioning .epub’s independence of Adobe. The company played a major role in the drafting of .epub’s current specs. And this is all the more reason for the defenders of the current .epub to respond to skeptics like Tamas Simon.

18 COMMENTS

Hadrien GARDEUR September 28, 2007 at 9:49 am

For ePub files we haven’t supported footnotes on Feedbooks yet, but from what I’ve tried, you can’t create “real footnotes” (displayed in the footer). They only support reference notes (link to another xml file or another part of the current xml file).

I’ll be working on this question now that ePub is available (for beta testing) on Feedbooks.

David Rothman September 28, 2007 at 9:56 am

Hadrien on .epub: Merci! That’s a good catch, and I hope you’ll continue to add to your wish list. Especially for note-heavy content, yes, the goodies should appear in the footer! The reader ideally should be able to toggle this capability in and out. – David

Preston September 28, 2007 at 10:20 am

Maybe they should be renamed endnotes. 🙂

pond September 28, 2007 at 10:55 am

Like Gutenbergers (apparently) I also think ODF is a good way to go. It’s a zip-container so it can hold several files, it has linking, is set up for looking at ‘pages’ and has been extensively tested at Boeing among other places. It supports encryption. There are a bunch of folks working on the standard. It’s an open standard now.

Why re-invent it?

I know the idpf folks were at this for a long time and come from another place than the odf folks. But there is something fishy about the whole latest chapter about epub, what with the guy leading the endeavor going off to a high-paying job at Adobe…and now this about Adobe proprietary pieces of code. It doesn’t pass the smell test, though I am no expert and I’d be happy to be proven wrong. I hope I am wrong.

Lee Passey September 28, 2007 at 4:04 pm

Well, I could respond to each question that Tamas asks, but perhaps it would be easier to just say:

How would you do it in HTML?

That’s how you do it in .epub.

Don’t understand HTML+CSS?
There’s dozens, if not hundreds, of books to help you learn.

The great thing about .epub is that it’s a zip-container so it can hold several files, it has linking, and is set up so you don’t have to view the document in fixed ‘pages’. It supports encryption based on the W3C standards. There is nothing in it that is not based on XML, XHTML, CSS or ZIP. There are a bunch of folks working on the standard. It’s an open standard now.
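Editor's sketch: to make the "zip-container" point concrete, here is a minimal Python illustration of how such a container can be assembled. It assumes the usual container conventions (a "mimetype" entry stored first and uncompressed, plus a META-INF/container.xml that points at the package file). The OEBPS/ paths and the placeholder file contents are hypothetical, chosen only for illustration, and the result is not a complete or valid publication.

```python
import io
import zipfile

# Points a reading system at the package file; the OEBPS/content.opf path
# is a hypothetical choice for this sketch.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def build_container(files):
    """Return the bytes of a ZIP archive laid out like an .epub container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


EPUB_BYTES = build_container({
    "OEBPS/content.opf": "<package/>",   # placeholder, not a valid package
    "OEBPS/chapter1.xhtml": "<html/>",   # placeholder content
})
```

Everything beyond the first two entries is ordinary XML, XHTML and CSS, which is the substance of the claim above.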

And about “the guy leading the endeavor going off to a high-paying job at Adobe” — I think it is fair to say that the .epub format was developed in spite of Nick Bogaty, not because of him. Heck, I’m not sure he even understands the format any better than Mr. Rothman does.

Tamas Simon September 28, 2007 at 4:26 pm

Lee
well, you could respond as you say but you did not…
For someone with your expertise it wouldn’t take more than 2 minutes would it?

btw I think you forget that HTML was designed with no pagination in mind, it is meant to take advantage of scrolling
with today’s e-paper technology that’s a problem

pond
Well said about the smell test 🙂

Aaron Miller September 28, 2007 at 5:58 pm

CSS does in fact provide a method for pagination. It’s just not used by browsers. The DOM also provides several opportunities for pagination, as well as dynamically sized headers and footers for notes. It is very possible to do paginated, reflowable documents with links, images, rich content–everything the web offers–with the IDPF standard. The only reason Adobe uses an extension (which is itself also a standard called XSL-FO) is that current browsers have not caught up to the emerging standards on pagination.

Aaron Miller
bookglutton.com

Tamas Simon September 28, 2007 at 7:25 pm

Hi Aaron,

As far as I know, you can control the page-breaking behavior of blocks of text, and you can have repeating elements by using fixed positioning.

This would allow you to display the title of the book in the header of every page… but what I was looking for is more tricky.

The challenge is to have the contents of the header and footer depend on what is displayed inside the main body after pagination.

I guess we could break a book into chapters and then within each chapter the header could look the same… so that’s solvable.
The footnotes are more tricky though because you don’t know in advance which words will end up on the same page
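Editor's sketch: the dependency described here can be shown with a toy paginator in Python. This is not how any particular reading system works. It assumes each paragraph is an indivisible block measured in display lines, that every footnote takes exactly one line in the footer, and that pages are filled greedily. All names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A paragraph of body text, measured in display lines."""
    lines: int
    # Footnotes cited in this paragraph; each is assumed to need one line.
    footnotes: list = field(default_factory=list)


def paginate(blocks, page_lines):
    """Greedily fill pages; each page's footer holds the notes cited on it."""
    pages = []
    current = {"blocks": [], "footnotes": [], "used": 0}
    for index, block in enumerate(blocks):
        needed = block.lines + len(block.footnotes)
        # Start a new page when this block (plus its notes) no longer fits.
        if current["blocks"] and current["used"] + needed > page_lines:
            pages.append(current)
            current = {"blocks": [], "footnotes": [], "used": 0}
        current["blocks"].append(index)
        current["footnotes"].extend(block.footnotes)
        current["used"] += needed
    if current["blocks"]:
        pages.append(current)
    return pages


BOOK = [
    Block(lines=4, footnotes=["1. First note"]),
    Block(lines=5),
    Block(lines=3, footnotes=["2. Second note"]),
]


def footers(page_lines):
    """Footer contents per page for a given page height."""
    return [page["footnotes"] for page in paginate(BOOK, page_lines)]
```

With these assumptions, footers(10) puts the two notes on separate pages, while footers(20) puts both in the footer of a single page. The same source text yields different footers as soon as the page size changes, so the footer cannot be fixed in the markup ahead of time.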

Harrison Ainsworth September 30, 2007 at 1:35 pm

Most books don’t have footnotes. So weakness with footnotes amounts to little. One would need much more to justify an argument that the “standard is not ready for anything useful”.

But epub *does* support footnotes in some form. There is an extra value for the CSS display property defined (in OPS): “display: oeb-page-foot;” (http://www.idpf.org/2007/ops/OPS_2.0_0.987_draft.html#Section3.3 note [6]) It seems to allow control of footer display in relation to body content currently displayed. Is that not sufficient? (Although it ultimately depends on good reader software…)
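Editor's sketch: a small Python illustration of what content marked this way might look like, and how a hypothetical reading system could pick it out. The "oeb-page-foot" value comes from the comment above, and "oeb-page-head" is its header counterpart. The sample chapter, the use of inline style attributes, and the helper function are assumptions made for illustration. Whether and how the content is actually rendered in a footer is up to the reading system.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical chapter: a running header, body text, and one footnote.
CHAPTER = f"""<html xmlns="{XHTML_NS}">
  <head><title>Chapter 1</title></head>
  <body>
    <div style="display: oeb-page-head">Chapter 1</div>
    <p>Some body text with a note marker<sup>1</sup>.</p>
    <div style="display: oeb-page-foot">1. Text of the note.</div>
  </body>
</html>"""


def elements_with_display(xhtml, value):
    """Return the text of elements whose inline style sets display to value."""
    root = ET.fromstring(xhtml)
    found = []
    for element in root.iter():
        style = element.get("style", "")
        for declaration in style.split(";"):
            name, _, declared = declaration.partition(":")
            if name.strip() == "display" and declared.strip() == value:
                found.append("".join(element.itertext()).strip())
    return found


footer_notes = elements_with_display(CHAPTER, "oeb-page-foot")
page_headers = elements_with_display(CHAPTER, "oeb-page-head")
```

The markup only flags what belongs in the header or footer. Deciding which flagged notes accompany which page of reflowed text remains the reading system's job.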

Jon Noring September 30, 2007 at 2:09 pm

Microsoft Reader LIT is actually a good comparison for EPub, since both are containers (one proprietary, one open) for an OEBPS (now OPS) Publication.

MS Reader proves that with simple XHTML and CSS, one can display the content in page form, rather than in the web “scroll” form. There is nothing inherent in XHTML, unless one uses table markup for layout (which is a no-no), that would preclude browsers from displaying XHTML as MS Reader does: paginated.

So I find the comments about EPub and pagination mystifying. It is not a format issue, but a reading system issue.

Regarding how to handle in-content annotations (in print usually called footnotes, endnotes or sidenotes), EPub allows one to link to them as separate XML documents which are marked as out-of-spine. Reading systems which have been designed to specially handle out-of-spine content (like MS Reader was designed to do), could display that content in unique ways, such as in a popup (which for digital display is superior to footnotes or endnotes).

Again, we must not be constrained by the limitations of print. We should transcend print and present content digitally in ways better than print. Displaying annotations in popups, for example, is one such “better way”. EPub certainly provides the framework to display out-of-spine annotative content in powerful ways — it is now up to reading systems to enable it and for publishers to take advantage of it.

I think a lot of the confusion around EPub is the perceived linkage between the format and the so-far-only EPub reading system out there: Adobe Digital Editions. But EPub was designed not with Adobe Digital Editions in mind (and having served in a mostly independent capacity on the technical committee that developed EPub I know), but with how best to build the format allowing all kinds of reading system innovations. The road to Berlin is open for anyone to build an EPub reading system, and to outdo Adobe Digital Editions (and codebases such as Mozilla could be used — Opera could build a wonderful EPub reading system that blows Digital Editions out of the water if they wanted to). Whether anyone else will is a marketing thing.

But damn it, don’t blame EPub if no one else builds a better reading system that will take advantage of the innovations we now have in EPub, and others we are thinking of adding to OPS (and thereby EPub) in the future. Do not let Digital Editions limit your view on what EPub is capable of enabling.

GeneB September 30, 2007 at 2:55 pm

Pagination is an occasional accoutrement in ebooks. This is another example of term substitution: electronic copies of paper books are more and more identified with ebooks. They are as much ebooks as watermelons are berries.
The test on “ebookness” is hypertext. The more “an ebook” is hypertext, the more it is an ebook.
Technically, PDFs have some “ebookness”, but think about the power of hypertext – free linking, many dimensions, and no pagination because it is not linear.
epub should be an adequate container for hypertext.

Tamas Simon October 1, 2007 at 4:47 pm

thanks Jon

that was a great reply!

re: pop-ups
it’s personal preference I’d say. Sometimes it can be useful to let the eyes wander onto the footnotes and back to the text.

It would be worthwhile to list the “technical innovations” that you mention.

Meanwhile I’ll see if I could think of some other challenges 🙂

Jon Noring October 1, 2007 at 10:25 pm

Thanks Tamas! And of course your article that started this thread is thoughtful, and asks the type of questions that need to be asked.

One “technical innovation” we’ve discussed is a standardized way for third parties to externally address spots within OPS Publications within EPub containers. We have considered creating our own IRI. One proposal is “book:/…” (I’ll spare the details of what goes after the “/”.)

One problem we have in designing an EPub addressing system is that the current OPS uses two ways to identify resources. We use both resource identifiers (mostly in the manifest of the Package), and the web-familiar path/filename.

OpenReader resolved this issue by requiring all resource references, including hypertext links in documents, to reference by resource identifier and not by path/filename. In addition, we moved all CSS assignments out of documents into the Binder (the equivalent to the OPS Package.) The beauty of using resource identifiers, which, btw, are assigned in the Binder, is that it makes no difference if we change the path/filename of any resource (like documents, images, CSS, etc.) so long as we make the single adjustment in the Binder. Compare this with web pages where reorganizing the location of HTML documents and related resources can be a nightmare, requiring editing documents themselves.
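Editor's sketch: the indirection described above, reduced to a few lines of Python. This is not OpenReader's actual Binder format, which is not shown in this thread. The dictionary, the identifiers, the paths and the helper functions are all hypothetical and only illustrate why moving a file requires a single change.

```python
# Hypothetical binder: stable resource identifiers mapped to path/filenames.
binder = {
    "ch1": "text/chapter1.xml",
    "notes": "text/notes.xml",
    "cover": "images/cover.png",
}

# Links inside documents name identifiers, never paths: (source id, target id).
links = [("ch1", "notes"), ("ch1", "cover")]


def resolve(binder, resource_id):
    """Look up the current path/filename for a resource identifier."""
    return binder[resource_id]


def move(binder, resource_id, new_path):
    """Relocate a resource: the binder entry is the only thing that changes."""
    binder[resource_id] = new_path


def resolved_targets(binder, links):
    """Paths that the links currently point to."""
    return [resolve(binder, target) for _, target in links]


before = resolved_targets(binder, links)
move(binder, "notes", "backmatter/notes.xml")
after = resolved_targets(binder, links)
```

After the move, the links list is untouched and still resolves correctly. With path-based links, every document that pointed at the old location would have needed editing.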

(There’s also another issue I’d rather not get into deeply here, and that deals with identifiers. There are those who want to build an addressing system to the EPub Container itself (which is hard since the Container spec has no requirement for a Container level identifier!), while I believe it should be done to the OPS Publication inside. The main reason is that publishers may repackage an OPS Publication into a new EPub Package, maybe with other OPS Publications, so to maintain linkability to existing third-party addresses, it is better to point to the OPS Publication. It is a knotty issue. There are those in the OPS WG who want to link the container with the publication, and to me this is a huge mistake that will limit publishers’ options.)

Anyway, that’s one innovation we’re considering, and I’m not sure whether the next-gen OPS will follow the OpenReader system, or stick with the web-based path/filename for addressing resources within EPub.

Joseph Gray November 7, 2007 at 5:46 pm

I know that Jon Noring’s comments were posted a few months ago, but I would like to ask him about one point he brought up. Jon mentions using popups for things like footnotes. I agree that this would be a very good method and wouldn’t require jumping away from the content you are reading.

The problem is, how can you do a popup in an epub? My understanding is that CSS properties “position” and “z-index” are not supported in the epub specification. If Jon can tell me how to do a popup in an epub without these, I’d love to know.

Jeffrey Kraus-Yao November 7, 2007 at 6:05 pm

The Gemstar REB1200 supported placing content in headers and footers using the CSS display property.

Header: display: oeb-page-head
Footer: display: oeb-page-foot

The header would be placed at the top of each page and the footer at the bottom of each page.

The source XML file was then compiled into an IMP file. See message http://groups.yahoo.com/group/REB1200/message/1025 for more details on the binary format.

According to the Microsoft Reader Content SDK this display property was not supported in version 1.

Jon Noring November 12, 2007 at 7:19 pm

Joseph asks:

“The problem is, how can you do a popup in an epub? My understanding is that CSS properties ‘position’ and ‘z-index’ are not supported in the epub specification. If Jon can tell me how to do a popup in an epub without these, I’d love to know.”

To do a “popup” in an EPub reading system requires the EPub reading system to support them!

The important thing is for the publication author to “flag” the content they would want to display in a popup for EPub reading systems that support the “out-of-spine” feature of the Package specification (OPF).

This is done in the Spine of the OPF Package by setting, for each content document you want to appear in a popup when linked to, the attribute linear to the value of “no”. This flags the content document as “auxiliary” per the discussion in Section 2.4 of the OPF 2.0 Specification.
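Editor's sketch: what that flag might look like in a package file, and how a hypothetical reading system could separate primary from auxiliary content. The Python below uses a stripped-down package fragment (no metadata, so not a complete package), and the file names and identifiers are invented for illustration. It treats a missing linear attribute as "yes". What a reading system then does with the auxiliary documents (popup, separate window, or nothing special) is its own choice.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical fragment: one chapter in the reading order, one notes file
# flagged as auxiliary with linear="no".
PACKAGE = f"""<package xmlns="{OPF_NS}" version="2.0">
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="notes" href="notes.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="notes" linear="no"/>
  </spine>
</package>"""


def split_spine(opf_xml):
    """Return (primary, auxiliary) lists of hrefs in spine order."""
    ns = {"opf": OPF_NS}
    root = ET.fromstring(opf_xml)
    hrefs = {item.get("id"): item.get("href")
             for item in root.findall("opf:manifest/opf:item", ns)}
    primary, auxiliary = [], []
    for itemref in root.findall("opf:spine/opf:itemref", ns):
        href = hrefs[itemref.get("idref")]
        # A missing linear attribute is treated as "yes" in this sketch.
        if itemref.get("linear", "yes") == "no":
            auxiliary.append(href)
        else:
            primary.append(href)
    return primary, auxiliary


primary_docs, auxiliary_docs = split_spine(PACKAGE)
```

Here primary_docs holds the chapter and auxiliary_docs holds the notes file, which is the content a popup-capable reading system could show on demand when a link targets it.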

Now, I’m not happy with how “out-of-spine” content is handled and discussed in the specification. The “linear” folk won out, mainly because they had, or were developing, reading systems which essentially linearized the content in the first place. That’s a topic I may write about at a future time.

David Rothman November 12, 2007 at 7:44 pm

Jon, the sooner you write about it, the better. I’ll then pick it up in Publishers Weekly–read by lots and lots of print people who, if they must do e-books, don’t want things dumbed down. They need to understand that too linear an approach could have that effect. Thanks. David

Joseph Gray November 12, 2007 at 9:32 pm

Jon, thanks for replying. I saw the “linear = no” setting in the spine, but haven’t tried it yet. I’ll have to see how that works.

One of the biggest problems with testing epub features right now is the lack of readers. For real testing with CSS, the only two choices at the moment are DE and Lector.

Doing a popup or “tooltip” as you would do on a web page can’t be done in epub, as the “position” and “z-index” properties are not required in the spec.

As for Jeffrey Kraus-Yao’s comment about headers and footers, the “oeb-page-head” and “oeb-page-foot” display values that were in the old OEBPS spec are included in the OPS 2.0 spec as well.

