Moderator’s note: Great timing, Jon. I’ve just posted The Triumph of social sites: Publishers, listen up! Annotation-style capabilities, of course, will make in-book communities possible. – D.R.

David Rothman recently called on IDPF to develop an open standard, third-party annotation and linking format. I’ve previously written about the need for such a standard in two TeleRead articles [1, 2]. Hopefully the third time will be a charm!

The need for such a standard is pretty obvious. Various companies are already implementing their own proprietary formats for third-party annotation of, and linking between, digital media such as books, music and video. Annotation and linking of content, whatever its type, are rapidly becoming a vital and fundamental component of interactivity with content, and are of great value to business, academia, education, libraries and archives, social networking and more.

Thus it is important for interoperability (that is, to prevent another Tower of eBabel) to have a single, well-designed, open standard format for third-party annotation and linking. From my research in this area, I have not yet found a developed standard suitable for this purpose (but if one exists, let me know, please!).

“Real-World” example: Annotating an e-book

Because the above introduction is a tad theoretical, let me give a fun “real-world” example to better illustrate what I’m discussing:

Mary is sitting on the beach reading a steamy romance novel on her e-book reading device (e.g., a laptop computer or a dedicated e-book reader). In a particular scene of the story, she is introduced to a character named “Charles,” about whom she would really like to share her thoughts with others. For example, she might want to share something relatively academic like “Charles reminds me of a character right out of a 19th-century English novel,” or maybe something a little more earthy and personal like “Wow, Charles is a real hunk!” (I’m not sure if “Charles” can be both!)

Mary’s reading system fortunately includes the capability to create an annotation object in the standard format. So Mary highlights the portion of text she wants to annotate, and composes her note in an editing window. When she’s finished, she pushes the “Done” button. The reading system then creates a file (in fancier jargon, a “digital object”) which contains her note, plus whatever pointers, identifiers, and metadata are needed, all encoded in the standard annotation format. Mary did not have to learn anything complicated to compose the note—the reading system took care of all the “under the hood” stuff.

(Importantly, note that the e-book itself is not changed in any way.)
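To make the idea concrete, here is a minimal sketch in Python of what such a standalone annotation object might contain. No standard exists yet, so every field name (target_id, start, end, quote, note, creator), the character-offset pointer scheme, the example identifier and the choice of JSON are purely illustrative assumptions, not a proposed format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation object; not part of any published standard."""
    target_id: str  # identifier of the e-book being annotated (assumed form)
    start: int      # character offset where the highlight begins (assumed scheme)
    end: int        # character offset just past the end of the highlight
    quote: str      # copy of the highlighted text, so the spot can be double-checked
    note: str       # the note itself
    creator: str    # who wrote the note

    def to_json(self) -> str:
        """Encode the annotation as a string that can be stored or sent."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "Annotation":
        """Rebuild an annotation from the string produced by to_json."""
        return Annotation(**json.loads(text))


def annotate(book_text, target_id, start, end, note, creator):
    """Build an annotation for book_text[start:end]; the book is only read."""
    if not 0 <= start < end <= len(book_text):
        raise ValueError("highlight is outside the book text")
    return Annotation(target_id, start, end, book_text[start:end], note, creator)


book = "Then Charles strode into the room."
annotation = annotate(book, "urn:example:steamy-romance", 5, 12,
                      "Wow, Charles is a real hunk!", "Mary")
shared = annotation.to_json()              # what Mary would upload or e-mail
received = Annotation.from_json(shared)    # what a friend's reading system decodes
```

The annotation lives entirely in its own object: the book text is read to capture the highlighted words but is never written to, which is the property the note above stresses.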

Mary can now share her digital annotation object with others, for example by uploading it to a repository designed for sharing annotations, emailing it directly to one of her friends, or offering it from her web site. Others who read the same e-book title may then download her note and read it applied to the same spot in the e-book where Mary applied it, provided of course that their reading systems, which need not be the same as Mary’s, support the same annotation standard.

(I’d like to note that OSoft’s innovative dotReader annotates texts pretty much as described above, along with other kinds of powerful interactivity. Of course, OSoft is currently using an “in-house” format for their annotations since there is not yet an industry standard.)

Shall we call it PAL?

Marketing is everything, so a catchy name to describe the system and files is probably a good idea. So I tentatively propose calling this annotation/linking standard “PAL”, short for “portable annotation/linking”. The digital object produced in a PAL system could be called “PALO” where the “O” is for “object”. A PALO might even have a file type suffix of “.pal”.

I’m not wedded to PAL as the name, but unless someone comes up with something catchier, I do kind of like PAL. At least I have a name to call this system for the remainder of this article. (Hmmm, there is a long-time PAL standard for color encoding in television broadcasting. Room for another PAL spec?)

What about linking in PAL?

Up to this point I’ve mostly talked about annotation and said very little about linking. So why have one standard for both? As you will see, they are really variations on the same thing.

First I have to define linking. What I mean by “linking” is essentially “connecting” two or more “spots” together. A spot could be either an entire object (e.g., an e-book) or some particular portion “inside” of an object (e.g., a particular sentence in an e-book). Furthermore, the spots could be in the same object (e.g., in the same e-book), in different objects (e.g., an e-book and a video), or a combination of both. (All combinations of any number and type of objects, and spots within objects, are possible.)

For example, say I have two different e-books (two different titles) and would like to connect a spot or location in one to a spot in the other. With a tool (hopefully in the reading system), I create a PALO which provides pointers to the two spots, describes the nature of the connection between them, and could optionally include an annotation describing, for human consumption, more details on the connection.
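A linking PALO of this kind could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class names Spot and Link, the locator strings, the relation vocabulary and the example identifiers are all hypothetical, since no PAL specification exists.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Spot:
    """A pointer to a whole object, or to a portion inside it (hypothetical)."""
    object_id: str                 # identifier of the e-book, video, etc.
    locator: Optional[str] = None  # None means "the entire object"


@dataclass(frozen=True)
class Link:
    """A hypothetical PALO that connects two or more spots."""
    spots: Tuple[Spot, ...]
    relation: str                  # the nature of the connection
    note: Optional[str] = None     # optional annotation for human readers

    def __post_init__(self):
        # A connection needs at least two ends.
        if len(self.spots) < 2:
            raise ValueError("a link must connect at least two spots")

    def objects(self):
        """Return the set of distinct objects this link touches."""
        return {spot.object_id for spot in self.spots}

    def is_internal(self):
        """True when every spot lies inside one and the same object."""
        return len(self.objects()) == 1


link = Link(
    spots=(Spot("urn:example:book-a", "chapter-3/para-12"),
           Spot("urn:example:book-b", "chapter-1/para-4")),
    relation="alludes-to",
    note="The scene in book A echoes the opening of book B.",
)
```

Seen this way, an annotation looks like the special case of a link whose human-readable note carries most of the value, which is one reason a single standard for both seems plausible.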

PAL may be useful for things other than just e-books

Since TeleRead primarily focuses on digital publications such as e-books, I’ve given mostly e-book examples. But as we think about it, the PAL standard will work for all kinds of target resources, both digital and, interestingly, non-digital. (A PALO could even reference another PALO!) The potential of universality, of “annotating everything,” is the focus of one of the two prior articles I wrote on this idea.

PAL thus seems to be quite powerful in a variety of ways which I think are obvious to anyone who begins thinking about its many potential uses. As I have noted in private to a few others about PAL (you know who you are!), I believe PAL is potentially a billion-dollar business. What I really mean by this statement is that PAL could play a significant role in a number of major commercial products and services: in publishing, and in many other industries completely outside of the digital content arena.

Of personal interest, one of the more intriguing opportunities is a public PALO repository to allow universal storing, sharing and finding of PALOs (imagine it holding trillions of PALOs which essentially annotate all of human activity, and not just digital content).

Thoughts on the PAL framework

It is premature to even propose the details of the PAL general framework since we need to put together a comprehensive set of general requirements from a representative cross-section of industries that would apply the general framework for their particular needs.

Stating this in a less formal way, I think that the PAL specification should be general in form, hence my use of the word “framework” to describe it. Particular industries, such as the e-book industry, would specify exactly how the PAL framework would be used for their own purposes. For example, IDPF may specify how PAL should point to spots or locations within EPub e-books, along with other specifics.

Nevertheless, I will suggest a number of technologies that are likely to be leveraged in developing the PAL framework. Those that come to mind include W3C specs such as XML, RDF and OWL. An interesting alternative to RDF is the Uniform Resource Framework (URF). And of course, many here will notice a distant relationship between PAL and RSS and its relatives (RSS 1.0, for instance, is an application of RDF). Certainly PAL intersects the worlds of the Semantic Web and Web 2.0. PAL appears to be a powerful way by which the world’s digital and non-digital information may be interconnected by third parties, both human and machine. (In addition, this standard might also be used in an indexing capacity, since an index is simply a topically organized list of pointers.)
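As one illustration of how XML might be put to work here, the following Python sketch writes a PALO as a small XML document and reads it back using the standard library's xml.etree.ElementTree module. The element and attribute names (palo, spot, note, relation, object, locator) and the function names are invented for this example and are assumptions only; a real PAL framework would define its own vocabulary, quite possibly on top of RDF.

```python
import xml.etree.ElementTree as ET


def palo_to_xml(targets, relation, note=None):
    """Serialize a hypothetical PALO to an XML string.

    targets is a list of (object_id, locator) pairs; a locator of None
    means the whole object is the target.
    """
    if not targets:
        raise ValueError("a PALO needs at least one target")
    root = ET.Element("palo", {"relation": relation})
    for object_id, locator in targets:
        attrs = {"object": object_id}
        if locator is not None:
            attrs["locator"] = locator
        ET.SubElement(root, "spot", attrs)
    if note is not None:
        ET.SubElement(root, "note").text = note
    return ET.tostring(root, encoding="unicode")


def xml_to_palo(xml_text):
    """Parse the XML produced by palo_to_xml back into plain Python values."""
    root = ET.fromstring(xml_text)
    targets = [(s.get("object"), s.get("locator")) for s in root.findall("spot")]
    note_element = root.find("note")
    note = note_element.text if note_element is not None else None
    return targets, root.get("relation"), note


document = palo_to_xml(
    [("urn:example:book", "chapter-2"), ("urn:example:video", None)],
    "illustrates",
    "The video dramatizes this chapter.",
)
targets, relation, note = xml_to_palo(document)
```

Because the object is an ordinary XML document that merely points at its targets, any reading system with an XML parser could consume it without needing access to the tool that created it.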

What’s the next step?

David is suggesting that IDPF start and/or lead the effort to create PAL. I support this provided that IDPF does not restrict the effort solely to EPub without consideration of its broader use. Any effort should bring together a critical mass of representative standards bodies, organizations and companies to assure that the PAL framework is properly designed for broad-based application.

What use is an “in-house” format that may become incompatible with a generalized version later developed for, and embraced by, a number of industries? There is a time to go it alone, and a time to work with others. I believe PAL falls squarely into the latter category because of its potential universal use. Even if PAL ends up being restricted to digital content, the world of digital content is a whole lot bigger than just e-books: we have other types of text content including web pages, and of course audio, video, and other multimedia.

One purpose of this article is to introduce PAL to the IDPF Board and interested IDPF member organizations, so they may decide whether IDPF should be involved in developing the PAL framework and its specific application to EPub.

I offer to organize and/or lead any IDPF-chartered special interest group to study the feasibility of PAL, form the necessary alliances, assemble the needed “brain power,” maybe generate a preliminary list of general requirements, and put together a plan of action for the actual authoring and publication of the PAL specification.

Interested?

If you are reading this and are interested in being involved in the authoring of the PAL specification, let me know!

2 COMMENTS

  1. Calling it PAL isn’t a great idea in my opinion. As well as being the term for the aforementioned European and Australian television standard, it is also used colloquially in order to refer to those markets collectively, especially in the context of the entertainment industry (as in “will this title ever get released in the PAL region”).

  2. Dan, thanks for your feedback.

    As my article notes, I definitely am open to other names. For example, I explored calling each annotation object a “PEA”, for “portable extensible annotation”. So the repository of PEAs could be called a “pod” if we wanted. <laugh/>

    Anyway, the request is out to the creativity of the Internet collective to come up with a better name to call this standard and each annotation/linking object.

    Anyone?
