PDF, as I see it, is toxic to e-books, given all the scrolling and other hassles it creates for me when I use it on small screens. I can envision PDF for forms and for documents that you may want to print out, as well as for books that are intended more to be admired than read. But as an everyday format for text-intensive reading? No, especially in textbooks, where PDF all too often will slow you down. Via popups, links, you name it, there are smarter ways to handle illustrations than to use PDF’s paper-centric approach.

Not everyone agrees. Without further comment, I’ll present The Other Side: 10 reasons why PDF is the right format for ebooks in education, from the Keybookshop.com blog. Have at it, gang; speak up however you feel about this.

“1. Comfort – Teachers and students are already comfortable with PDF. The programs to view the files are ubiquitous. Virtually every computer in an educational setting today can open and view them. This comfort level also means that there is little or no training needed for teachers and students to utilize a PDF file to teach and learn.

“2. Equipment – Schools have the equipment to view PDFs. As mentioned earlier, it would be difficult to find a computer that can’t read a PDF. This means that money for new equipment is not a problem.

“3. Cost – PDFs can be created easily and inexpensively. While the professional level programs to create PDFs can be relatively expensive, there are options on most platforms to create them for free.

“4. Fonts – PDFs accurately display and preserve the function and beauty of fonts. There are definitely times when plain text just doesn’t cut it.

“5. Pictures – PDFs do a great job of displaying pictures in color or black and white. A picture can be worth a thousand words. This is particularly true in the teaching and learning process. Sometimes, a picture is worth far more than a thousand words. It is nice to have a format that can handle text and pictures seamlessly.

“6. Layout – all the richness of desktop publishing documents in a format that can be shared with anyone. As with pictures and fonts, there are times when the effort and talent used to lay out a document need to be preserved. A well-designed document can focus attention and guide the flow of interaction.

“7. External Links – bring the resources of the internet to any document. Being able to bring outside resources to a document can add breadth and depth that may be critical in the instruction process.

“8. Internal Links – quickly and easily navigate even complex documents. As documents grow in size, the ability to navigate and locate specific locations becomes more than just handy, it is a necessity.

“9. Printing – PDFs can be printed when necessary. While in many cases the ideal method for reading a document would be from the computer screen, there are times when a printed copy is still the best way to interact with it. Worksheets and activities are good examples of documents that work best when printed.

“10. Cross Platform – Windows, Mac, or Linux, it doesn’t matter, it just works. Educators can use the equipment they already have to bring the flexibility and convenience of ebooks to the place of learning.”

(Via MobileRead post.)


22 COMMENTS

  1. aside from the whole reflowing text issue, i’ve never understood the PDF hate that i read in all the various ebook sites i read, including this one. i LOVE PDF, and if someone comes out with a reader that allows me to read PDF books on my mac in something approaching 8.5″x11″ proportions and size, i’d be happy as a clam.

  2. Preview can display PDFs in the way you want, if I understand what you are looking for.

    As to PDF hatred, I am part of it. I don’t read books on my computer, I read them on a handheld device such as a Palm TX or Sony 505. For this market PDF is just a pain, and that’s where the hatred is targeted.

  3. What about all the page-based metadata in the book itself? Educational or academic books can have a fair amount of content that talks about itself in terms of physical book pages, and that can make very little sense once the book gets reflowed into HTML/ASCII.

    For instance, in PDF, a back-of-the-book index still makes sense — if an item was on p. 45 in the original book, it’s still on the page that calls itself ’45’ in the footer (even if no one thoughtfully renumbered the PDF pages to match folio numbering).

    And in PDF, a footnote referring to something on p. 118 can still be found on the page labelled p. 118.

    When these texts get reflowed to HTML, suddenly a 300-page physical book can turn into, say, 612 screens of content….or, on my cellphone, 1198 screens. How do I find p. 118 or p. 45..?

    For already-published scholarly books getting converted to ebooks, staying in PDF is kind of handy for these reasons. Unless someone wants to do a very careful conversion to text, this kind of scholarly information gets lost in translation.

  4. I do a LOT of reading on my computer & PDF is fine with me. Scanned texts, esp. from Internet Archive, can be blurred etc. but I’m just grateful someone’s doing them. Cross-system support is VERY important. At home we run Macs, Vista & XP with Linux hovering on the horizon. Proprietary formats are a no-no, as is DRM…

  5. I can understand the frustrations with using PDF on small devices, though I think the format could actually be quite usable given a good implementation of a PDF reader. I find nearly all PDFs to be quite readable on the iPhone, for example, which has a very flexible zooming capability and thus makes the small text acceptable without requiring a horizontal scroll.

    As for PDFs on the Mac, I find that using Adobe Reader and its “full screen” mode gets it pretty close to normal size on a 15″ MacBook Pro. Since all menus and toolbars are turned off, the full height of the display is used for the document — about 8.25″. When I really want it to be full size (or bigger), I’ll use the rotate feature in Reader and hold the laptop with the display oriented vertically. This admittedly feels a little weird to me and I don’t do it very often, but it’s an option.

  6. I would not say students and teachers are ‘comfortable’ with PDF. In our school, it crashes the browser every time you try to load one from within Firefox, and the kids don’t always know/remember to save it to the hard drive before they try to open it. It’s a pain in the you know what. Also, most teachers (myself included) are notorious for their love of tweaking, and PDF documents are very hard to edit. Sometimes you can copy the text (but you have to reformat it once you get it into the word processor) but sometimes you can’t. Huge pain. I often have to tweak things for my students’ needs. For example, the French program I use with them does not introduce the past tense until the third level, so with nearly all of my classes, I have to change the verb tenses before I print it out. What would be a five-second find and replace in a text editor off an open-source document is a much longer process when you’re working off of PDF…

  7. The advantage of PDF comes from programming libraries which a CMS can use to do conversions. Also, if the PDF is created correctly, the conversions into formats readable by Sony Reader and Mobipocket won’t look too bad.

    My hatred stems from 1) the way my browser freezes when I push on a link and 2) how some web pages don’t warn me that I’m going to download a 10MB PDF. Fortunately the PDF download addon for Firefox helps. Note that I don’t mention DRM, if only because I haven’t needed to buy any DRM-ed content so far.

  8. Also, it’s cumbersome to navigate from one place to another in the browser or even in Adobe Reader. I’m always resizing the Zoom and manually pushing that scrollbar down. I would love to have the ability to use the page up/page down buttons.

    I’m not familiar with annotation features on PDF. Perhaps that is available only for people who bought the converter?

  9. PDF seems to be the most popular format at BooksForABuck.com and, frankly, I’m mystified. The whole point of PDF is to allow the creator to handle formatting exactly, without regard to the target machine. But I sometimes read on my PC, sometimes on my eBookWise, and sometimes on my Palm. Some PDF documents can be viewed on my Palm (but only by breaking the whole PDF formatting thing) but the result is slow and clunky (on a higher-powered Palm, this might not be as big a problem). When I get PDF documents that I want to put on my eBookWise, I have to cut and paste into a text document, which wreaks havoc on the line breaks and paragraphing (defeating the purpose of PDF). On my PC, I just have to restart after every few PDFs because something hangs and doesn’t release memory.

    I offer PDF because customers seem to like it. As a big eBook reader myself, I prefer other formats.

    Rob Preece
    Publisher, http://www.BooksForABuck.com

  10. David writes:

    PDF, as I see it, is toxic to e-books, given all the scrolling and other hassles it creates for me when I use it on small screens.

    What on earth does this mean, David? If a book contains 500 pages, you still have to turn 499 pages regardless of whether the book is published in PDF or another format, don’t you?

    If you’re trying to read a PDF document formatted for an 8.5 x 11 page on a handheld device, that’s another matter — the wrong tool for the job, if you will. If, on the other hand, the PDF file has been properly formatted for the device on which you read it, I don’t understand your objection.

    PDF’s page orientation can be both a strength and a weakness. As my first effort to produce custom content for my iRex iLiad, I tediously formatted the text of a play using a word processor to produce a PDF file. Next I marked up the text of the play as an XML file and used XML-based tools to produce an HTML version of the same play.

    I like to have a header on each page that contains the name of the work and a chapter/act/scene number just to help me keep my bearings in the book as I read it. This was easily accomplished in the PDF version, but not the HTML version. There’s also the matter of the blank line between blocks of text (paragraphs, lines in a play, etc.) The manual production of the PDF version allowed me to make sure that these never appeared at the top of a page, but I have no control over this in the HTML version. Widows are a problem in HTML, too; I put the name of the character who speaks a line of dialog in a play on the line preceding the dialog, and these are often separated when viewing the play in HTML format.

    My XML-based tools depend on a library of CSS stylesheets to handle the vagaries of various specific devices, so matters like page and font sizes aren’t a problem. The visual formatting of the automatically flowed HTML text, however, remains a problem that I have yet to resolve. Some of these will be resolved by HTML 5.0, but I’m not going to hold my breath waiting for browser support.

    My future plans for this toolkit are to make it possible to convert an XML file into multiple PDF files (rather than HTML files) optimized for specific devices, but this will require me to dip my toes into the murky waters of something called XSL-FO, an adventure I’m saving for the summer, when I’ll have more time to devote to the project.

    Bottom line: from where I sit, PDF is superior to HTML for this type of application as well as the best format currently available.

  11. All: Great PDF discussion. Since I regularly gripe about PDF here, I thought it would be cool to encourage defenders, not just my fellow PDF-haters, to speak out. I’m grateful for the thoughtful comments from all sides.

    That said, I hope the defenders will pay attention to Ficbot’s complaints about Adobe software crashing browsers. I doubt that publishers will agree with her wish to be able to modify files. But look, what if there were a way of changing things within PDF while flagging these changes and allowing people to revert to the original version and even tracking the changes? I might do a separate post on that. With HTML-style flexibility and more, clearly, the content would be much more useful in K-12.

    As for the problems with screen size, those remain. Gerry, even on my desktop machine, I have trouble figuring out the right size for Wowio books. And this is true not just with my regular Adobe Reader but also within Digital Editions. With alternatives like Mobipocket and .epub, by contrast, I fire up the files and they just work. I don’t have to worry about optimizing the Mobi and .epub files for a particular machine, the way Todd has been doing with PDF. Hey, each to his/her own. That’s just my take.

    In her reply to my post, Kate raises excellent questions about page numbering and locations of specific items, but I would hope these could be addressed through workarounds within HTML and, longer term, through .epub standards. If there could be ways to coordinate online page-numbering with the paper variety, that would be wonderful.

    Also let me ask yet another question. Is it possible that PDF works out better on Macs than on PCs? Some of the most passionate PDF defenders are Mac people. Of course, the real reason could be that the Mac has more than its share of fans among publishers, who enjoy the amount of control that PDF gives them over a document’s appearance. My own priorities, by contrast, even though I’m a writer, are those of a reader, a user.

    Answering Rob, I wonder if PDF is popular among many users because they’re printing out files, or because that’s what their systems come with.

    Finally I’ll conclude with a reminder that the TeleBlog is for PDF haters and lovers. We need both/all viewpoints here, and I’m really glad that pro-PDF people are around, so that we HTML and .epub types can be more aware of the deficiencies of our favorites and address them, which is one reason why I think Sophie is great for .epub, in terms of reminding the IDPF of features missing from the standard. Similarly, I hope that we skeptics’ complaints about PDF will be helpful to those who believe in it. We have visitors from Adobe and I’m sure they’ll be following this debate very closely.

    Thanks,
    David

  12. As a student, I send PDF documents straight to the printer, as I absolutely hate reading them on my computer (or anywhere else). But even at the printers they cause a lot of problems. The printer in the main computer lab at Ballantine Hall (the biggest academic building in the western hemisphere, or at least at IU) gets so backed up with students’ print jobs whenever somebody has to print out scans of books or something. The printer processes them so horribly slowly. Usually a crowd amasses at the printer and we glare at whoever felt they had to print out their 80-page scanned reading assignment in the middle of the day when everybody else is trying to get their papers printed out for class.

    But I guess that’s really more of a scanning-versus-OCRing issue. But if nothing else, Acrobat Reader has been the clunkiest, slowest program to open up on the school computers. I replaced it with Foxit Reader on my own compy, and while that has made dealing with PDFs less painful, I still would prefer to have a more flexible format for dealing with everything.

    [Moderator: Ballantine Hall, shown in photo below from bloomingpedia, is at Indiana University. That’s the “IU” mentioned above. – DR]

    [Photo: Ballantine Hall]

  13. Bob, regarding page up/down, I agree it works, but my mouse never is focused on the right window, plus there’s latency and I can never keep track of what section is from the previous page down. Plus, I can never get the Zoom dimensions right. This probably sounds petty.

  14. The issue about being unable to copy/paste out of Acrobat Reader isn’t an issue with the PDF format per se. By default, a document produced by Acrobat Professional allows copy/paste, and the producer of the document has to explicitly set the preference to disable that functionality.

    I imagine that a DRM’d .epub file could be set to have the same limitation (can someone who has produced such a file confirm?).

  15. David writes:

    I don’t have to worry about optimizing the Mobi and .epub files for a particular machine, the way Todd has been doing with PDF.

    I assume you mean that you don’t have to worry about optimization as a *consumer* of e-books — and, of course, you shouldn’t have to. The *publisher* should worry about this for you, as, for example, the nice folks at “manybooks.net” have done; just go to the page for the title in which you’re interested, select the appropriate device from a pulldown menu, et voilà. This is the sort of publishing environment in which the toolkit I’m developing would be potentially useful; these are not end user tools.

    Is it possible that PDF works out better on Macs than on PCs?

    Quite possibly. Mac OS is very PostScript oriented and uses Display PostScript extensively in the UI (the dock icons, for example, are PDF images, not bit-mapped images, which explains why they scale so nicely when you roll the mouse over them.)

    Mac OS also provides native ability to print to a PDF file from any application (a feature which many Windows users envy), and the Preview utility provides a perfectly acceptable, lightweight image browser with PDF support that most Mac users prefer to Acrobat Reader.

    I hope the defenders will pay attention to Ficbot’s complaints about Adobe software crashing browsers.

    This has absolutely nothing to do with PDF as a file format, and has everything to do with poor quality application software. IMHO Ficbot would be wise to explore alternative PDF browser plug-ins, or to consider configuring the machines in her classroom to use an external PDF browser.

    I can’t comment on Acrobat Reader (I’m not a big fan of Adobe software myself, and I use it only when absolutely necessary), but I can attest to the fact that Adobe is capable of writing perfectly atrocious software. The absolute, rock-bottom, worst application software I have ever used in my life was a video editing abomination called Adobe Premiere. Even though I paid close to $1K for it, I replaced it with a competing product in less than a year.

    Ficbot writes:

    PDF documents are very hard to edit.

    Once again, this is a case of the wrong tool for the job. PostScript and PDF were designed for page-oriented, printed output and were never intended as editable formats. Since the specs for these formats are published, it’s *possible* to write an application that allows them to be edited, but just because it’s possible doesn’t necessarily mean it’s a good idea. If editable content is the goal, PDF is certainly not the best tool for the job.

  16. Gerry Manacsa Says:

    I imagine that a DRM’d .epub file could be set to have the same limitation (can someone who has produced such a file confirm?).

    ————-

    I doubt you’re going to get anyone to confirm this, as DRM is not in the current epub spec. Apparently, Adobe has implemented some form of DRM to use with epub in their Digital Editions reader software. I don’t know any more about it, however.

  17. Speaking of PDF and DRM, in my experience, it seems quite common to have the printing feature disabled, as well as the ability to copy text and annotate text. Considering that PDFs are so widely used for academic ebooks, this is incredibly stupid IMO.

    If I can’t print a passage of text, can’t copy a passage of text, or even annotate the PDF I’m reading, this makes the ebook almost useless for study purposes. I really believe that the people making these decisions either don’t have a clue about how people need to use their ebook, or they just don’t care, as long as they have your money.

  18. Joe Clark:

    Alas, PUBLISHERS need to learn about tagged PDFs, not readers. There was a recent thread on rec.arts.sf.written about the free e-book version of Scott Sigler’s “Infected”. It was only offered in untagged PDF formatted for hardcover-sized pages. This made it unusable on a lot of e-book reading devices, which tend to be PDAs or dedicated readers with smaller screens. This happens a lot.

    As it happens, Mobipocket’s latest version will convert the file into a useable prc file, rendering it readable in other formats. The conversion did, however, introduce some font and formatting problems.

    In general, I wouldn’t bother to take the time to discover the conversion at all, and Mr. Sigler would lose my attention, falling back into the hordes of authors who suffer from “too many books, too little time.” In this case, since my interest was piqued for other reasons, I did a little bit of legwork.

    PDFs, in my experience, are a warning flag for the general run of e-book readers. Until authors and publishers learn how to cope with this, PDFs will discourage many people from reading e-books in that format. Readers, like myself, generally won’t bother with the extra time and effort. We’ll just read something else.

    Regards,
    Jack Tingle

  19. Fence sitting time! Simply, I love the various formats for what they are good at, and nothing more. I shan’t mention DRM because all formats can and have been DRM shackled.

    I love PDF for the advanced layout options and beautiful font handling. Design-heavy media (magazines, sheet music, some poetry, comics, maps, some novels) often have no other option for any remotely faithful representation. The cute thing is Adobe’s tight control means there is always a reference implementation to follow if the spec is inconsistent or vague. Just “do what Adobe Reader does”. PDF is Virtual Paper and it is pretty darn good at it, coming from a fine Postscript heritage.

    Of course, that’s part of the problem. PDF sucks for portability and usability on any device that doesn’t have a huge, high-res screen. Even then almost all PDF readers are obnoxious. Some aren’t stable. Many non-Adobe products have missing features. Almost all can’t allow for reformatting, converting, or even printing the contents under certain conditions. Oh, and I know about tagged PDFs. This is one of the more clever things Adobe bolted onto the spec. Guess what? Generating this metadata requires careful oversight and editing to get RIGHT. We might as well be using XML! Oh, wait….

    I love XML+CSS formats for heritage and usefulness. For pure text content it cannot be beaten. Not to mention that it is designed to be infinitely extended, so any kind of craziness can be slid in. Since it is based on open standards, no single company’s whims are law. Careful users of XML formats can even create self-documenting documents. Structure is the core of good XML and whatever metadata isn’t THERE can be inferred easily. And for integration with Content Management Systems and databases? They luv XML. Simply, PDF specifies EXACTLY what and where everything is rendered, while XML only describes the content and lets the clients (and users) determine how to process it. Which leads to the flaws.

    The problems are mainly with design. Precise layout adjustments using CSS are a series of approximations, with every engine having different interpretations of the standards. Assuming they actually follow them at all. Font and layout support have improved greatly (at least in web browsers) but many ebook readers have yet to leverage these improvements. Converting paper books takes time and care to do right, because XML thrives on structure and organization. Without it, there is only bland tag soup. Standards bodies have tried their best to improve things (SVG, MathML, CSS3), but bolting Postscript/PDF onto XML makes about as much sense as bolting XML onto PDF.

    In a perfect world, both formats would not compete. They would do what they are good at (PDF as virtual paper, XML as Content und Structure über alles) until some super format could absorb them both. Wait, no, that could be bad. I imagine some crazy serialization of PDF into an XML schema that could take precise layout instructions in an external file. Now I imagine these instructions to be in some magical Dr. Frankenstein language combining CSS, XPath and/or bits of Postscript and JavaScript. This is either the Best or Worst thing ever.
