
For something like five or six years, I’ve been able to style XML elements with CSS and have the text displayed just the way I want.

That is, in the XMetaL XML editor* and in browsers.

Not in an e-reader, however. All the e-readers specify the vocabulary you’re permitted to use in your e-book**.

There’s a difference between a reader and a browser, between a reader and an editor.

The reader has library functions, bookmarks, annotations. It collects multiple files into a single package; browsers and editors don’t have the same orientation. They just won’t do.

As it happens, I’ve worked with XML since 1999 and I have lots and lots of XML files I need to look at. I’m not particularly happy with my reading choices and I hate converting a file to XHTML just so I can view it in the distraction-free confines of FBReader or MS Reader.

Last week I ran across OpenBerg Lector again, and I took a fresh look at it. Lector has separated from the original effort to make YAEBVBOX (yet another e-book vocabulary based on XHTML), and it has a fairly clear goal: to enable a (human) reader to open an e-book package in Firefox and read the e-book there.

It took a day or two for me to realize that. In the past, I’ve envisioned browsers being used to treat texts on the web more like an e-book, and Lector’s being a Firefox add-on clouded my perception.

However, Lector won’t just read XHTML files. In fact, its main format is OEBPS: Lector will read OEB package files in single-file publications (stored in .obz zip files) and handily display the content files therein.***

Once I realized that Lector was built to be an e-reader that lets Mozilla’s Gecko do its rendering, it dawned on me that at last an e-reader had arrived that would accept arbitrary XML.

I grabbed some XML files (structured, as any XML ought to be, by the nature of the content and not by some pre-ordained presentation of a scientific article that spawned HTML), wrote some CSS for the elements used, and created a package file, tossing it all into a .obz archive.
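As a rough illustration of that packaging step, here is a minimal Python sketch that zips one XML file, its CSS and a package file into an .obz-style archive. The file names, the element names (essay, heading, para), the media types and the exact package layout are assumptions made for the example; they are not taken from Lector's documentation, and a real package may need more metadata.

```python
import io
import zipfile

# Hypothetical content file: a made-up vocabulary, linked to its stylesheet.
CHAPTER = """<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<essay>
  <heading>On Readers</heading>
  <para>A reader is not a browser.</para>
</essay>
"""

# CSS for the made-up elements; XML elements have no default display value.
STYLE = """essay   { display: block; margin: 2em; font-family: serif; }
heading { display: block; font-size: 150%; font-weight: bold; }
para    { display: block; margin-top: 1em; }
"""

# OEB-style package (assumed layout): the manifest lists the files,
# the spine gives the reading order of the content files.
PACKAGE = """<?xml version="1.0" encoding="UTF-8"?>
<package unique-identifier="bookid">
  <metadata>
    <dc-metadata xmlns:dc="http://purl.org/dc/elements/1.0/">
      <dc:Title>Sample</dc:Title>
      <dc:Identifier id="bookid">sample-0001</dc:Identifier>
    </dc-metadata>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xml" media-type="text/xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

FILES = {
    "package.opf": PACKAGE,
    "chapter1.xml": CHAPTER,
    "style.css": STYLE,
}


def build_obz(target, files):
    """Write name -> text pairs into a ZIP archive (path or file object)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)


# Build the archive in memory and list what went into it.
buffer = io.BytesIO()
build_obz(buffer, FILES)
names = zipfile.ZipFile(buffer).namelist()
print(names)
```

Passing a path such as "sample.obz" instead of the in-memory buffer writes the archive to disk, where it can be opened like any other zip file.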

When I opened this file (yes, using File | Open File…) in Firefox, Lector’s scripts displayed my three XML files correctly, and I could move smoothly from one to the next by pressing PageDown at the bottom of a file and PageUp at the top.

I’ve just begun exploring what Lector can do. The first effort to replace scrolling with pages is underway. I believe the annotation, highlighting, bookmarking and so on will be delegated to other extensions. Because it builds on the Firefox framework, Lector surpasses other e-readers with features such as MathML, SVG and use-your-own-XML.

That’s a lot to recommend it, and of course an example of the ‘prairielight’ e-readers I projected earlier. Lector has now set the bar that other e-readers will have to meet.

* And possibly other XML editors that I haven’t owned.

** Don’t point out the unfulfilled potential of the “extended” vocabulary permitted in OEBPS 1.0; I never heard of any reading system that implemented it.

*** I’ve never encountered a book in the “Cabinet Comic Books” format (.cbz, .cbr), but that too is handled by Lector.


  1. The .cbz and .cbr formats are just renamed ZIP and RAR files typically containing comic book images. Unlike .obz, there’s nothing special about them: they include no metadata. It’s basically just a bunch of JPEGs or PNGs or whatever put into a single archive for ease of use.

  2. For that matter, there’s nothing special about a .opf file — it’s just a list of all the text and image files in a publication and the order the text files should appear in.

    So how does one specify the order of images for a comic book in .cbz/r? Is it alphabetical?

  3. Typically the files in a cbz/cbr archive will just have names like Unfab27-12.jpg and Unfab27-25.jpg, with the names proceeding in order from front cover to back. Most viewers will also pop up a thumbnail sheet so you can click on a particular page. There’s no separate file listing the order of the images. The viewer simply progresses through the pictures in alphabetical order, so if for some reason you want a different order, you rename files to get it.

  4. Actually, OpenBerg Lector accepts a super-set of cbr/cbz, which I’m also using for slide-shows. Essentially, this is cbr/cbz except you can have any kind of resources, not just images, and with provisions for hiding files and directories so as to remove them from the reading order.

    In the next version, we’ll also recognize if there’s a file named “index.html”/“index.xhtml”/“index.php”. This will give us extra-lightweight e-Book management for people who do not care too much about meta-data.

    If, at any time, people wish to add opf meta-data to an existing cbr/cbz, they can just put it inside the archive, and Lector will recognize it the next time the book is opened.
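The alphabetical reading order described in the comments above can be sketched in a few lines of Python. This is only an illustration, not how Lector or any particular comic viewer is implemented: the list of image extensions and the case-insensitive sort are assumptions, and the file names are made up in the style of the example given earlier.

```python
import io
import zipfile

# Assumed set of image types; real viewers may accept more or fewer.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def reading_order(archive):
    """Return the image names in a ZIP archive, sorted alphabetically."""
    names = [
        name for name in archive.namelist()
        if name.lower().endswith(IMAGE_EXTENSIONS)
    ]
    # Case-insensitive sort (an assumption about viewer behaviour).
    return sorted(names, key=str.lower)


# Build a tiny in-memory .cbz with placeholder bytes instead of real images.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as cbz:
    for name in ("Unfab27-2.jpg", "Unfab27-12.jpg", "Unfab27-01.jpg", "notes.txt"):
        cbz.writestr(name, b"placeholder")

with zipfile.ZipFile(buffer) as cbz:
    order = reading_order(cbz)

print(order)
# prints ['Unfab27-01.jpg', 'Unfab27-12.jpg', 'Unfab27-2.jpg']
```

Note that the unpadded Unfab27-2.jpg lands after Unfab27-12.jpg, which is why page numbers in such archives are usually zero-padded, and why renaming files is the way to change the order.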
