The Internet Archive’s Open Library is violating authors’ copyrights

Open Library is a project of Brewster Kahle’s Internet Archive. We’ve written about Kahle, the Archive, and Open Library a few times, including some times I’d forgotten about. Kahle’s Internet Archive was first founded as a way to keep a historical record of the ever-changing Internet for the benefit of future sociological and cultural researchers; it later expanded into archiving other media as well. More recently, Kahle started collecting print books, and scanning them as well as archiving them; it was his intention to collect and save one of every print book ever published.

These scanned books would also go into the Open Library, which was a collaborative venture with some public libraries, in which he would scan those libraries’ books and make them available for checkouts at participating libraries’ physical locations only, at least at one point. But even earlier than that, Kahle’s plans for Open Library seemed likely to draw controversy. According to an article in the Wall Street Journal, Open Library would be treating the paper book and its scanned copy as if they were one and the same item:

With its latest project, the organization is making inroads into the idea of loaning in-copyright books to the masses. Only one person at a time will be allowed to check out a digital copy of an in-copyright book for two weeks. While on loan, the physical copy of the book won’t be loaned, due to copyright restrictions.

At the time, the article acknowledged there could be legal challenges from the Authors’ Guild or others, but they don’t seem to have materialized. Which is curious, given that Open Library doesn’t seem to be restricting its checkouts to in-library locations anymore.

I had forgotten all about Open Library and its plans until the other day. I was searching the Internet to see what I could find out about Diane Duane’s tie-in novel for the X-COM computer game, when I happened across this page at Open Library offering it for checkout in PDF, EPUB, or screen-reader formats.

Curious, I "checked out" the EPUB version of the book. It turned out to be an apparently unproofed scan of the book–right down to including the page number/author/title footers, and a zillion errors per page. The PDF version was a photographic scan of a yellowed-page paperback edition of the book. Both were protected with Adobe Digital Editions DRM, much like a book you might check out through the Overdrive public library program, limited to one two-week check-out at a time, as with any digital library.

Further examination turned up quite a few of Duane’s other books, as well as even more books by another author I know, Mercedes Lackey. Not all, or even most, were available for checkout, but quite a few of them were.

I checked with Lackey and Duane about these books appearing on the site. Lackey confirmed that she had never been asked for permission to host her books, and she or her publishers would be filing DMCA takedown notices. Duane wrote, via Twitter direct messages:

I get real cranky when books are scanned / sold / borrowed without seeking permission. DMCAs routinely follow. I am not a monolithic corporate entity. I am a writer who might give you my stuff for free if for a good cause. I am seriously unlikely to roll over without comment if my creative output is exploited without asking me first. I make this stuff out of nothing. It costs me effort and rent and grocery money and taxes to make it. I would appreciate it if those who consume my creative output would contribute. Yes, "ideas should be free". But should art? And why are those who think so routinely not artists? ;)

(I also reached out to Open Library and Prima, publisher of the X-COM novel, via contact forms but have not yet received any response. I will update the article if I do.)

In a recent post on The Digital Reader, Nate Hoffelder quoted a letter from Robert Miller, Global Director for eBooks at the Internet Archive to “Archive Sponsors, Content Contributors, and Partners”. Miller wrote:

Overall, we now have 4.4 million eBooks online, the difference of 4.4 and 2.0 million is from uploads of items digitized elsewhere. We also have 500,000 modern eBooks for the print disabled and 250,000 modern eBooks. Our next goal is to reach 10 million public domain and modern eBooks on line. We are looking for partners that wish to fund more modern books to grow our contemporary global library. By working together, I am confident we will reach this goal in the near future.

The public domain books are, of course, not the problem. Since they’re in the public domain, Open Library can do anything it wants to with them, including checking them out.

And some of those modern e-books, for the print disabled, are only available in protected DAISY format. (For example, Lackey and Piers Anthony’s If I Pay Thee Not in Gold.) These books aren’t the problem either.

DAISY is a digital talking books reader, and its protected format uses a key issued by the Library of Congress, available only to those who have been medically certified to be disabled enough to need it. There is an exemption built into US copyright law, 17 USC § 121, that permits authorized entities to reproduce copyrighted works for consumption by the blind, as long as they are distributed in protected formats that can only be used by the visually-impaired. There is also a movement to codify this sort of exception into international treaty, though some rights-holders object.

The problem lies in those 250,000 modern e-books. In a podcast interview with Kahle by TeleRead contributor Sue Polanka, he said he had been trying to get permission from publishers wherever possible, because that would allow checkout of “more than one copy” of a given book, but that he’d only been able to get permission for about a hundred books at the time. Which meant the vast majority then, and probably now, are scanned from library books under the principle of treating the paper and e-book editions as one and the same. And, as Miller wrote above, they’re looking for people to fund scanning even more of these modern books.

The problem is, it seems unlikely that this print-plus-scan-equals-single-copy theory is legally sound, which would mean Open Library is committing copyright violation on a massive scale.

Here, look at this, taken right off the copyright page of that X-COM PDF:

[Image: screen capture of the copyright notice from the X-COM PDF]

I’m not sure if that quite qualifies as irony, but it seems pretty plain. Open Library is violating copyright by hosting these books. Technically, the Internet Archive is violating it by scanning them—just as Google did for its Google Books program that got the Authors’ Guild so up in arms. The weird thing is, all Google wanted to do, originally at least, was scan, index, search, and serve up snippets—and perhaps make bank by snagging a referral fee for directing people to where they could buy the book from an online store. Not make them available in their entirety (at least, except under the terms of the proposed-and-shot-down settlement that would have let them become a de-facto e-book store for orphan works). But the Authors’ Guild jumped all over them.

So why isn’t it jumping all over Open Library for actually making books available?

Most commercial e-books are not sold but licensed, whether publishers and stores say so up front or not, but library e-books are more licensed than most. Publishers set specific and often restrictive terms for libraries to be able to check out e-books, fearing that library e-books lead to cannibalization of sales. Sometimes this has been the source of controversy, as publishers elected to impose limits on how often books could be checked out, or even refused to allow them to be checked out at all.

And publishers have long resisted treating book formats interchangeably—insisting, for example, that each separate format of e-book Fictionwise offered had to be treated as its own separate and distinct edition. Buying a MOBI format book didn’t entitle someone to a copy of the EPUB too, and vice versa, even though they were the exact same book with the exact same text. Are they going to let libraries treat a paper book as an e-book surrogate willy-nilly? I don’t think so.

Now, Open Library might be able to make a case for the fact that they should be able to treat a paper and electronic copy made from it as interchangeable, under fair use. There is case law supporting fair use interpretations for the ability to time-shift and format-shift media for one’s own personal use after all. (Though what Open Library is doing with these files goes a bit beyond mere “personal use.”)

But the thing is, that’s all Open Library can do: make a case. Fair use is a defense, saying “this one copyright infringement is okay because…” and like all such defenses, it has to be decided by a judge. (And frankly, I have my doubts that what Open Library is doing will pass the four-factor test. It would be nice if it did, but it seems to be stepping on too much of publishers’ own prerogatives for a court to let it pass. Though I could be wrong!) Until a court passes it, it’s still a copyright violation and, as such, is illegal and subject to prosecution.

Or maybe it’s just subject to having individual works taken down under the DMCA’s “safe harbor” provisions. But I don’t know about that. I’m not a lawyer, but it seems to me safe harbor’s meant to protect you when it’s your users uploading the copyrighted content. Do you really get to benefit from that when you’re the one who’s doing it, and you know the work is copyright-protected?

As I said before, I’m surprised the Authors’ Guild hasn’t called foul on this. When I was researching this article, I looked for any signs that anyone had complained about all these still-in-copyright books being available for checkout, and couldn’t find any. Has it just passed under their radar?

Whatever the reason, barring explicit permission from the rights-holders, these books shouldn’t be available. Even in DRM-protected and time-limited format. (Thanks to Apprentice Alf and Calibre, the Adobe Digital Editions DRM it uses is basically a joke anyway.) Open Library’s goals of increasing literacy and making books more widely available are laudable, but in doing this without permission, it is violating the authors’ rights. Even if no money is changing hands, the rights-holders have the right to decide how their books are presented.

About Chris Meadows (4158 Articles)
TeleRead Editor Chris Meadows has been writing for us--except for a brief interruption--since 2006. Son of two librarians, he has worked on a third-party help line for Best Buy and holds degrees in computer science and communications. He clearly personifies TeleRead's motto: "For geeks who love books--and book-lovers who love gadgets." Chris lives in Indianapolis and is active in the gamer community.

22 Comments on The Internet Archive’s Open Library is violating authors’ copyrights

  1. It’s not just the Open Library; the Internet Archive itself is a copyright infringement lawsuit waiting to happen.

  2. “Until a court passes it, it’s still a copyright violation”

    I’d love to hear someone with more legal knowledge than me comment on this. It sounds to me like you’re saying that, any time anything new comes along, it is by definition illegal because it hasn’t been done in the past.

    I agree with you that it’s odd that little has been said about Open Library; I went searching for such controversy before you did and also found nothing. But perhaps part of the reason it’s less controversial is that the Internet Archive and the public libraries aren’t making money off this. The idea of Google selling access to my books without my permission made my blood boil, whereas if I found that the Internet Library offered access to my books via my local public library (in the manner you describe), it would bother me no more than learning that my public library had bought a used copy of my book and was loaning it out. Either way, the organization didn’t ask my permission or give me any money, but I think what it did was in the public interest.

    From my perspective, “what’s in the public interest” is what should govern copyright law, not “what the publisher or author personally feels they should control.”

    “And publishers have long resisted treating book formats interchangeably”

    I have long since given up on the idea of major publishers acting in the public interest. I dearly hope your comments in this article weren’t intended as praise of the status quo.

  3. You and the perspectives expressed here represent most of what needs to be fixed about existing Copyright law. I have the feeling from reading this piece that you’re philosophically also against the very idea of libraries, and possibly the First Sale Doctrine. The solution isn’t to try to shut down the libraries, but to repair the laws.

  4. “Fair use is a defense, saying “this one copyright infringement is okay because…””

    While I understand that this is the common interpretation of the law, it is not quite true. The law actually states that fair use “is not infringement of copyright”. That means that if someone is within fair use, they are not guilty of copyright infringement at all. I am aware that whether something is fair use can only be firmly established by a judge, but it is wrong to state categorically that someone is guilty of copyright infringement, when it is a border case and they might be innocent of any infringement entirely. The US justice system has this principle of “innocent until proven guilty” after all.

  5. If Chris Meadows wants to play the part of copyright enforcer, maybe he should get a law degree first. Has anyone been harmed here? No.

  6. It’s worth highlighting that lending via the OpenLibrary is not to anyone, anywhere; it is itself a California accredited library and its collections are available in various places only through the participation of local libraries (which join the OpenLibrary lending network by contributing a book).

    E.g. in California, the State Librarian decided that e-lending of books within its own and partner collections should be available anywhere, and not limited by the archaic fig leaf of being physically present at a library.

    In other words, the OpenLibrary is just a very good example of how libraries should continue their traditional mission in the modern world: by lending their collections to those in their jurisdictions.

    That many local libraries have failed to execute this mission is the problem, not that the OpenLibrary is picking up the slack.

    As with other media, from a moral standpoint, it’s immaterial whether a physical or digital copy is lent, so long as we all agree upon the long-standing convention that only one such loan occurs from a library at a given time.

    If you have a problem with that convention, you have a problem with the existence of libraries, of course. Which some indeed do — or sure seem to.

  7. Rowena Cherry // July 11, 2013 at 4:25 pm //

    My lawyers told me that First Sale Doctrine allows anyone who purchases a paper book to do as they wish with the paper book, including renting it (one at a time) or lending it (one at a time).

    There are probably judges who would rule that, if the author did not make an ebook easily available, then scanning a paperback to create one would be allowable.

  8. Authors Guild isn’t made of money so they are choosing their battles in the appallingly expensive world of litigation. The advantage of going after Google is that Google is rich, and winning would be very profitable for AG and its members. Open Library isn’t rich.

    Sure, that’s cold, but it’s also very smart.

    On First Sale Doctrine and the concept of copyright. When you buy a paper book, you are buying the paper, ink, and binding. You AREN’T BUYING THE CONTENTS. The contents remain the property of the copyright owner. That means you can’t reprint the book, copy the book’s contents, put the contents online, or sell the various rights of the contents. Scanning a book’s contents is illegal for that reason.

    By the definition of copyright, what Open Library is doing is illegal whether they own a paper book or not.

    The only time it is allowable to scan a book is when a digital copy doesn’t exist that is available for the blind or others covered by the Library of Congress’ exemptions. No one else, including libraries, has that exemption.

  9. There’s a lot of confusion here about what is legal and not legal, and how copyright law works. I appreciate the comments that highlighted (as Rowena Cherry did) that a court might indeed rule this was ok, and that people asserting fair use are not guilty until proved innocent in court. Plus, restrictive language in a printed book, like that quoted from the copyright page of the X-COM book, is a useless gesture: it asserts power to stop activities that are legal and where the copyright holder would almost certainly fail in court if it tried to stop them, for example, using a short quotation for a purpose other than a review! Or operating the Bookshare library that I founded. Now, ebooks and their licensing terms are another kettle of fish, which is correctly identified as one of the big fair use and library issues of this age. Expect more lawsuits…

    I suggest interested parties read the Authors Guild v. HathiTrust recent decision, which gave the libraries involved in the Google Book scanning project a gigantic fair use victory around the act of turning physical books into ebooks, for the purposes of searching the full text and accessibility (but not the digital lending of a single copy like the Internet Archive). If you’re surprised the Authors Guild hasn’t tackled Brewster Kahle and the Internet Archive yet, you should check out how badly they got clobbered in a court on a similar topic. If they lose a case against the Archive, that would set another strong fair use precedent. And, that’s a precedent Brewster Kahle would love to set. Now, the HathiTrust case is on appeal, so it’s not settled law. However, I doubt anyone would pick a fresh fight in this area until HathiTrust (and the Georgia State e-reserves case) get settled.
    Full disclosure: I was an expert witness for the National Federation of the Blind, which intervened in the original HathiTrust case, and I have filed, along with Learning Ally, an amicus brief in the appeal, focusing specifically on the disability accessibility issues raised in the case.

  10. I perhaps wasn’t as clear in my post as I could have been.

    Whether you’re guilty or not, you’re still liable for violating copyright and can thus be sued to prove whether you are or aren’t. And if you’re not a big organization able to hire many lawyers to defend yourself, you might just as well be guilty because you’re being punished every bit as effectively by either having to pay huge legal bills or having to pay fines when you can’t afford to defend yourself.

    If I were to try to set up an e-lending library, claiming that I would lend out scanned copies of books I owned and I wouldn’t give the physical books out to anyone else while the e-books were checked out, I’d still be sued to within an inch of my life, be unable to afford to defend myself, and probably be hugely fined. Doesn’t seem fair to me that anyone else should be able to get away with something like this if I can’t…so if it should be legal, let the organization with deep pockets prove it is in court so the rest of us could do it too.

  11. “Doesn’t seem fair to me that anyone else should be able to get away with something like this if I can’t…”

    That’s an interesting perspective to have. Another way to look at it would be that the Internet Archive (and Bookshare and Hathi Trust and other organizations that have fought fiercely for digital rights) are paving the way for greater freedom for the rest of us.

    But yes, I agree that it’s all too likely that the Internet Archive will eventually be sued. The good guys usually are.

  12. Chris, you’re pointing out a very real point about how the law works in the USA. In general, that which is not prohibited by law is theoretically permitted. It’s in contrast with civil code countries (oversimplifying here) where that which is not permitted by law is illegal. In the U.S., what’s legal is defined by court cases over the gray parts (and often the black and white parts). And, you point out the consequences.

    Section 121 has been the law of the land since 1996 (the exception for serving the blind and print disabled). It’s vague and ambiguous, and it affects hundreds of thousands of Americans at a minimum, a few million more likely. And, the odds that the questions about it are going to get settled through legislation or regulation are low. However, nobody wants to go to court to settle it otherwise. Instead, we spend time in conversation with the big stakeholders. In our case at Bookshare, that was the Association of American Publishers and then the Science Fiction and Fantasy Writers of America (SFWA, as the authors’ group most tuned into digital issues). If we can sell them that we’re doing something good without overmuch bad overtones, we’re probably good. In the case of SFWA, we committed ourselves to be against digital piracy (already our approach: since we’re an example of legal copying without permission, we didn’t want to support illegal copying without permission) and to increased respect for authors’ rights in the quality of the accessible versions of their work. But if an evil publisher (think Pearson) decides to sue us, we’ll have to do battle. However, for all their threats and nasty behavior, Pearson never sues us. Why? Maybe because they’d lose and set a bad precedent. Yet, threats and intimidation frequently result in good people backing off from (probably) legal behavior because, like you, they can’t afford to fight the deep pocket plaintiff.

    Not defending the status quo. However, it’s reality. Appreciate Dusk’s support for fighting the good fights!

  13. Reminds me a bit of Console Classix. They purchased a large number of video game cartridges, copied ROMs from them, and started renting the ROMs one at a time through a specialized client app that includes an emulator to play them. Each ROM represents one physical cartridge that was purchased, and is only rented to one person at a time. Nintendo originally challenged them with a DMCA letter, but after Console Classix replied, they weren’t heard from again. Twelve years later, CC is still around, featuring online digital rentals of games from many systems.

  14. So basically, all physical libraries with physical books need to be shut down as well then? So that no one is reading without having paid something first… you’re a dummy! PS. most people hate grassers (tattle-tales in your country).

  15. Copyright protects you from commercial exploitation, but it does not protect you from people lending. The Internet has made distribution a zero cost activity. The fact that people have previously been able to argue that because a work is ‘on a computer’ it deserves radically different terms and laws is absurd. It is even more absurd that these same people feel that they should have any say in how first sale rights are used.

    The notion that any control over an individual work after sale is retained by the author is ludicrous. Deal with it.

  16. imagine a world without copyright.
    imagine a world with all of the world’s knowledge in one place.
    All of the medical records, all of the history of the world for any person to see.
    That technology is available today.

  17. Of course the anti-copyright hipsters just had to flood in their usual bullcrap.

    Imagine a world without copyright.
    Imagine a world where writers and artists get screwed over by internet nerds who just want their stuff for free so they can make meme gifs out of them.

    Imagine a world without property.
    Imagine a world without laws.
    Imagine all the people…oh sorry.

  18. Addressing Aaron’s point, I’m not sure if his claim that one has to be affiliated with a library to use OpenLibrary was true when he wrote his post but it’s not true now. None of my local or interlibrary loan partners or bookstores where I live had a certain older book in a series and I was really happy when I found I could get it on loan through OpenLibrary. I was able to sign up with just an email address; no library affiliation required.

    One other point to be made is that the lack of controversy surrounding the site and the fact that the site itself seems so solid, well established and non-sleazy, create an impression that the site has the approval of authors and publishers. I am careful not to touch sites that post unauthorized works because I have many writer friends and don’t want to patronize sites that cost them sales. It was only when my belief that “things that look too good to be true generally aren’t” finally kicked in that it occurred to me to research the terms under which OpenLibrary operates, which brought me here.

  19. I was reading something similar about this issue. There are copyright exceptions for public and private libraries. The U.S. Library of Congress has even more copyright exceptions. I think the Internet Archive is technically considered a library. Wouldn’t the Internet Archive have some or all of the same copyright exceptions?

  20. Well, copyright expires 70 years after the death of the author, so I guess most of it would be out of copyright, but there may be some that are still in copyright. Then again, some of the books that are in copyright are available for lending only, so when the two weeks run out, the book is no longer in your digital reader. Apart from all of this, it is a great project, as it will really help to preserve the intellectual works of our ancestors, and that in itself should deserve some credit, not criticism.

  21. Just out of curiosity, as I reviewed old Diane Duane articles to get reminded of potential questions to ask in the podcast this weekend, I ran across this one. I went back and checked the XCOM book and found it is, indeed, still available for checkout.

    I suppose Diane Duane doesn’t actually have the rights to this book, tie-ins generally being works for hire; it’s with Prima, and they don’t seem to be interested in enforcing their copyright. Weird. Maybe I should revisit this story and check for more unauthorized book postings. I’m still amazed nobody’s made a big deal of this.

  22. Similarly, copies and posts practically everything ever posted on a visible web page not specifically protected by robots.txt. All that stuff is automatically copyrighted unless specifically placed by the author in the public domain. I have been wondering what to do with my own web content that links to remote content that disappears (deleted, web site closed, etc.). Is it OK to scrape the old content from and rehost it (to make my web pages whole again)? Hey, if can rehost everything old or current, why can’t I?
