We are making progress on the BookX Project (pronounced “books”), previously known as “SimpleBook.”

The purpose and goals of BookX are outlined in the last TeleRead update from last November.

But to briefly summarize, the BookX system is envisioned to enable “almost push button” conversion of a single, fairly simple master XML document into most, if not all, e-book formats in use today and tomorrow. The goal is to build a simple system (most, if not all, of which will be open source) to master simpler types of books, such as fiction. We believe BookX will be able to master most of the e-books published by the smaller, independent e-book publishers, who really need a system like this to save time and effort, guarantee more uniform results, and make it easy to edit and republish their books.

Here’s the current status:

  1. Latest draft DTD of the BookX XML markup vocabulary

  2. The public domain book My Ántonia mastered in the latest draft version of BookX

    (Note that the CSS style sheet associated with the XML document is pretty cheesy, but the style sheet at least helps to visualize the BookX document. I will be working to improve the quality of presentation. Any assistance?)

  3. BookX to BookXHTML Transformation Description (plain text)

What we need in the short term is for someone to build a script (in a language such as PHP or Perl; XSLT is also possible) to convert a BookX document to what we call BookXHTML (see the transformation document for full information). Once we get that converter written, it is a very short jump from there to a number of XHTML-derived formats such as Microsoft LIT and Mobipocket, and even a directly viewable web document, among many others. Also, there are several paths to go from BookX to PDF, some of which can go through BookXHTML.

We also need to further explore BookX authoring tools, including leveraging existing authoring tools such as Word, to build BookX documents.

And of course, your feedback on the current BookX DTD is welcome.

Come join the BookX community!

7 COMMENTS

  1. Jon, I still would like to write this for you, time permitting.
    I think Perl would be a better option than PHP. It is very easy to slap a Win32 GUI on a Perl script and turn it into a Win32 exe, or slap wx or Tk on it for Mac and Linux users. Basically all you need is a way to select the file you want to convert and maybe a few options.

    Remove the GUI and you have a script or exe suitable for batch conversions from a command line.

    An XSLT stylesheet would also be nice since most XML authoring environments can transform documents, and this would allow the author to do everything from start to finish in their favorite XML editor.

    So I think the best course of action is to
    1. Create a really good XSLT
    2. In Perl apply XSLT and validate against DTD
    3. Guify and make into exe

    and that my friend is poetry!
    or at least a really bad Haiku.

  2. First, all this stuff looks sweet. This is good work (and hey, I might try to mess around with my own transformations this weekend).

    I was wondering: is there any aspect of BookXHTML that couldn’t be transformed back into BookX? My CMS’s rich text editor lets me create widgets for adding div classes. (Of course, this kind of solution wouldn’t be able to validate.)

    From my standpoint as someone who writes a lot of stuff first for the web, roundtripping is on the top of my list of features.

  3. Sorry everyone for not replying sooner to this comment thread, but my DSL ISP has been flaky of late. I am switching to a new and hopefully better ISP in the next few days.

    Anyway, I greatly appreciate the comments, and am glad that others see potential in BookX. Clearly we need some special forum to discuss BookX development among ourselves, including such topics as:

    • The BookX vocabulary (still a few miscellaneous things left to work on that should only minimally affect the BookX to BookXHTML work).

    • Conversion scripts/tools, like BookX to BookXHTML, and from there to the various e-book formats, including native dotReader, OpenReader, OEBPS, XHTML for direct web presentation (which is what the full BookXHTML is capable of!), MobiPocket, MS LIT, PDF (several pathways to get to PDF), RTF, Plucker, etc.

    • Building a repository of CSS style sheets for both native BookX (CSS will not be able to handle links or images in BookX, but handle everything else — great for “visualization” purposes), and for the BookXHTML equivalent.

    • Conformance checking tool: it is a good idea to build a conformance checker to verify that a purported BookX document is fully conforming, which makes things easier on the conversion side. Note that there are requirements that go beyond simple validation of a BookX document against the BookX DTD/schema.

    • Authoring options. This includes possible plug-ins for Word, Open Office, etc. BookX is designed so there are multiple options, from simple Unicode-capable text editors to something heavy duty like XMetaL. Lots of options and pathways.

    Any ideas about such a forum? SourceForge? YahooGroups?

    Now to answer some of the comments so far…

    I agree with regx (thanks for the haiku!) that XSLT would be preferable if it is reasonably doable (it should be, since XSLT is supposedly Turing complete). The difficulty with BookX to BookXHTML conversion is that stuff has to be moved around, and the implied structural division hierarchy in BookX has to be converted into a real hierarchy with nested <div> elements for BookXHTML. As the transformation document describes, it’s not just a simple mapping of tags and attributes.

    (To note: there are design reasons, primarily to make life easier on the BookX authoring side, why we chose to use implied division hierarchies rather than requiring actual hierarchical nesting with tags, but clearly we want BookXHTML to include <div> for actually nesting the structural division hierarchies when present.)

    Regarding BookXHTML back to BookX, brought up by Robert: definitely! But this requires that BookXHTML be exactly formatted (including the allowed attributes and attribute values) per the organization given in the transformation document. I assume it is possible to build a RELAX NG or XML Schema for enforcing the BookXHTML structure, because not just any XHTML 1.1 document will work for “roundtrip” conversion. (Writing the BookXHTML schema may be difficult since a lot of the required stuff is at the attribute level.)

    It has always been my focus that BookX is the “master,” and everything else is derived from that. I currently see no advantages to authoring a BookXHTML document from scratch when it is actually easier to author BookX from scratch — BookX was designed to be fairly easy to author with a simple text editor.

    To stress again, the reason for the strictness and rigidity in the format of BookXHTML (which conforms to XHTML 1.1) is that this allows standardization of CSS style sheets, meaning a repository of CSS style sheets for BookXHTML can be built. This allows BookX system users to search the repository for the styling they want (for both direct rendering in web browsers and for conversion to other e-book format tools), and not have to become CSS gurus to customize CSS for their purpose (although they could build their own branded CSS), or for adapting to their “flavor” of BookXHTML (they can’t, we only allow one flavor of BookXHTML!)

    Lots more I could discuss, but best to wait for the proper BookX community forum to be created. Again, ideas?

    And thanks, too, to Ken for his comment which I did not address in this reply.

  4. Jon, is this project going anywhere, or has it died? It looks like nothing has been done with it in over a year.

    So, what is everyone using now for creating master documents that can be turned into other formats? I looked at DocBook, but it doesn’t seem suitable for fiction, poetry and such.

  5. BookX is not dead, and in fact I’m now working with another person to “finalize” the first BookX schema to allow for a developer with some time this summer to work on an authoring and conversion application.

    Of course, if you are interested in helping out, let me know!

  6. Jon writes,

    BookX is not dead…if you are interested in helping out, let me know!

    This is the first I’ve heard of the BookX Project, which seems to track very closely with some development work I’ve been doing myself since purchasing an iRex iLiad recently.

    I’ve been working on some XSL-based tools for my own personal use for the last month or so. Using two primitive DTDs of my own design, one for novels and another for plays, I have written some simple XSLT transformations to convert my XML sources to HTML (I plan to spend some time exploring XSL-FO this summer so that I can produce PDF output as well.) Although the build environment does not yet support them, I plan to incorporate other existing tools that use the HTML output as an intermediate format en route to device-specific formats (e.g. Plucker to create PDB files.)

    I’m a relative novice when it comes to XSL. As a retired software engineer looking for an excuse to dabble in technologies that have emerged since I retired, I developed a web-based cookbook application for an online forum with which I am associated. The recipes themselves are encoded as XML files, as are the accordion lists used to navigate the cookbook, which are constructed dynamically from data in the recipe files.

    We should talk. If you’d like to drop me a note at tjonz42@yahoo.com, my publishable, spam-besotted “public” address, I’ll reply from my “real” address so we can compare notes and see what interests and goals we have in common.
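
As a rough illustration of the hierarchy problem raised in the thread above (BookX implies its division hierarchy, while BookXHTML needs real nested <div> elements), here is a minimal Python sketch. The element names (<book>, <heading level="...">, <p>) and the "division-N" class names are hypothetical stand-ins and are not taken from the BookX DTD or the transformation document; only the stack-based nesting idea is the point, and a real converter would follow the transformation document instead.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical flat input: headings carry a level, but nothing is nested.
FLAT_SOURCE = """
<book>
  <heading level="1">Book I</heading>
  <p>Intro.</p>
  <heading level="2">Chapter 1</heading>
  <p>Text.</p>
  <heading level="2">Chapter 2</heading>
  <p>More.</p>
  <heading level="1">Book II</heading>
  <p>End.</p>
</book>
"""


def nest_divisions(flat_root):
    """Turn a flat heading/paragraph sequence into nested <div> elements."""
    body = ET.Element("body")
    # Stack of (level, element); the body acts as level 0 and is never popped.
    stack = [(0, body)]
    for child in flat_root:
        if child.tag == "heading":
            level = int(child.get("level"))
            if level < 1:
                raise ValueError("heading level must be 1 or greater")
            # Close every open division at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            div = ET.SubElement(
                stack[-1][1], "div", {"class": f"division-{level}"}
            )
            title = ET.SubElement(div, f"h{level}")
            title.text = child.text
            stack.append((level, div))
        else:
            # Any other element goes into the innermost open division.
            node = copy.deepcopy(child)
            node.tail = None
            stack[-1][1].append(node)
    return body


if __name__ == "__main__":
    nested = nest_divisions(ET.fromstring(FLAT_SOURCE))
    print(ET.tostring(nested, encoding="unicode"))
```

The same walk could be written in XSLT or Perl; the stack simply records which divisions are still "open" when the next heading arrives, which is the information the implied hierarchy leaves unstated.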
