Oh, those pesky librarians and Netfolks. The E-Book Museum solution touted by Google and Amazon isn’t a complete hit. Judging from a report from a recent SXSWi conference, some weirdos actually want the right to download individual files of public domain works and remix them.

Why are the Mensa-level brains at Google still struggling with this basic question and saying that the policy hasn’t been worked out? Isn’t Google the employer of Vint Cerf, one of the fathers of the Internet?

‘Do no evil’–except to the public domain?

Granted, the “Do no evil” gang needs to make a profit. But do they have any idea of the harm they could be doing by pre-empting other efforts without the same restrictions on use of public domain content? I’m a very small Google shareholder, involved with an unrelated online library project, and I’m mightily disappointed in the company’s quasi-library efforts. I hope that others with Google connections, however remote, will join me in feeling the same and speaking up. Look, Google, you guys are supposed to favor open approaches. But actually you’re getting in the way.

The world doesn’t understand the nuances, unfortunately, and I can recall one nonprofit veteran telling our library project, “What’s the point if you’ll be competing against Google?” Although we’ll offer a different and far more imaginative approach, the damage has still been done. Google in the public domain arena is in some ways like Nestle selling baby formula in the jungles and actually harming infants. The public domain will survive only if we use this valuable concept in practice. Does Google want to wean us off the public domain in favor of its corporate formula? Maybe.

Arrogance Central

In arrogance, Google is actually surpassing Microsoft. People at MSFT will return phone calls. Google, by contrast, often reminds me of an old poem:

“So this is good old Boston. The home of the bean and the cod.
Where the Lowells talk only to the Cabots. And the Cabots talk only to God.”

Oh, yes, Google will go to conferences. But true dialog on a working level just isn’t happening to the extent it should, as shown by Google’s lack of participation in the Open Content Alliance. Google is a little like Sony in its increasingly Not Invented Here culture. Granted, APIs and the rest are nice; but everything has to happen on Google’s terms.

Now back to details from the SXSWi conference. Here’s an excerpt from Medialoper, via LISNews:

Starting with the ideas of what happens after books are digitized and what the impact of a shrinking pool of knowledge might be, the panel started by discussing the elephant in the room (let me say that it was refreshing to see open back-and-forth dialogue between the panelists, unlike the normal nicey-nice stuff you see): Google’s book-related programs — Microsoft’s project isn’t online yet, so escaped detailed scrutiny. Dan Clancy, of Google, explained the various components of the initiative.

The goal for Google and Microsoft (other than making money, and that’s what corporations do) is to build indexes of authoritative works that will provide resources during search. To do this effectively, they need to have a lot of books digitized. This is an expensive and time-consuming process.

While Microsoft is working largely through the Open Content Alliance initiative, Google is digitizing books in the public domain, in association with publishers, and under non-exclusive deals with various major academic libraries. Public domain works are the most accessible while the works that remain under copyright continue to provide limited views. A major challenge under the library agreements is finding copyright holders. As with the motion picture industry, there isn’t always a clear trail; until the copyright owner can be found, much of what Google digitizes remains unseen. This, I think, is a big discussion that needs to be held sooner rather than later.

Members of the audience raised an issue about lack of remixability for the public domain works in Google’s library. Clancy noted that while the books can be printed, Google hasn’t fully determined its policies on re-use of the books by other sources. They continue to seek a balance between return on investment and community needs. Microsoft hopes to address this balance by realizing that the books are available in multiple places and the Microsoft advantage comes from better user experience.

4 COMMENTS

  1. while the bureaucrats and the technoids
    are doing their best to confuse the issues,
    and drown the users in a sea of complexity,
    meanwhile some of us are busy developing
    _simple_yet_powerful_ ways of organizing
    an infrastructure lending itself to easy access
    and remixing of contents in the cyberlibrary…

    my goal is a system a fourth-grader understands.
    (_any_ fourth-grader, even the “slower” ones…)

    i myself have already liberated one book
    in the public domain scanned by google —
    “books and culture”, by hamilton wright mabie
    — which is available _right_now_ for your
    viewing, reading, and downloading pleasure.

    the scans are there — still branded with
    the “google print” stamp on them — and
    so is the full text, in the plain-text format
    proven to be maximally valuable to remixers.

    a version aimed at “error-reporting” can be found at:
    > http://www.greatamericannovel.com/mabie/mabiep001.html

    (i put “error-reporting” in quotes because this text has been
    cleaned by super-proofer jose menendez, so it’s ultra-clean.
    nonetheless, we still need infrastructure for public proofing.)

    another version displays the scans 2-up, for easy reading:
    > http://www.greatamericannovel.com/mabie/mabied000001.html

    (many people enjoy this facing-pages spread since it
    closely resembles the “look” of an open paper-book.
    this interface can be slow for the dial-up users, but it
    is downright snappy for people who have a fat pipe…)

    keep in mind that these are some of the first demos, so
    the u.r.l.s are still volatile, and will almost certainly change
    as i get more experience in scaling the infrastructure up to
    _millions_ of books, but an info page should always be at:
    > http://www.greatamericannovel.com/mabie/mabie.html

    each of the scans can be grabbed at will, and i will be
    putting up a zip file that contains the whole lot of them
    — as well as the .html files that administer their display —
    for the one-button ultimate in downloading convenience…

    currently, the scans are housed on another site:
    > http://snowy.arsc.alaska.edu/bowerbird/mabie

    (if you want these files before i get that zip file posted,
    you can f.t.p. to that location now, and fetch all the files.)

    who cares what google does with their copy of this book?
    even if they lock theirs up, we have our own liberated copy…

    now what we need to do is to liberate all the other books
    google scans from the public domain, which belongs to _us_.

    but _please_ don’t just start scraping scans rogue-style.
    uncoordinated efforts will just mean that some books
    will be downloaded multiple times (which is impolite to
    google’s bandwidth, not to mention a waste of our time),
    while other books will be passed over (which is a crime)…

    and google _will_ ban your ass if you hit them with
    too many page-image requests in too short a time,
    so make sure you coordinate your efforts with others,
    and that you know what you’re doing when you do it.

    -bowerbird

  2. david said:
    > I haven’t researched the legalities

    a straightforward scan of a public-domain page
    is a public-domain scan, and it would be suicide
    for google to try to challenge that in any way…

    > Shame on Google for locking them up!
    um, no. and i mean that quite emphatically.

    try this instead:

    google, thank you, for scanning books for us!
    and for including them in your search engine!
    you’re doing a wonderful service for the world!
    and we bow deeply to you in full appreciation!
    you did what the library of congress should have
    started doing many years ago! so thank you!

    just because a book is in the public-domain
    doesn’t mean you have to give it to us, and
    if your business model indicates you shouldn’t,
    then feel free to pursue that course of action.

    and if you’d rather be _just_ the “card catalog”,
    and you want people to use your service just to
    “look something up”, get their results, and then
    be off on their way, then that’s _fine_ with us…

    you shouldn’t be required to be “the reading room”
    as well, and serve up whole books to anybody who
    wants to read them. not if you decide that that is
    not your business. you want people to “stop by”
    for information, not “camp out” on your service…

    we understand that, and make zero demands on you.
    indeed, we’ll gladly fill the “reading room” role for you.
    (our interface is better than yours anyway.) ;+)

    -bowerbird
