Risky Gamble With Google

From the issue dated December 2, 2005
Wouldn't it be cool if we didn't have to tell students that a Web search
is insufficient for serious scholarly research? Wouldn't it be great if we
could use a single, simple portal to find the most-significant Web pages,
images, scholarly articles, and books dealing with a particular subject or
keyword? Wouldn't it be wonderful if we could do full-text searches of
millions of books?

The dream of a perfect research machine seems almost within our reach.
Google, the Mountain View, Calif., company flying high off a huge initial
public offering of stock and astounding quarterly revenues, announced late
last year that it would digitize millions of bound books from five major
English-language libraries. It plans to make available online the full
text of public-domain books (generally those published before 1923, plus
government works and others never under copyright) and excerpts from works
still in copyright.

Harvard University will allow Google to scan 40,000 books during the pilot
phase of the project, and the number may grow. The library has more than
15 million volumes. The University of Michigan at Ann Arbor has agreed to
let Google scan its entire collection (some 7.8 million works), and
Stanford University says it is keeping open the possibility of including
"potentially millions" of its more than eight million volumes. The
Bodleian Library at the University of Oxford will allow Google to scan
public-domain books, which it says are principally those published before
1920. The main library alone holds 6.5 million books in its collection.
And the New York Public Library will put in from 10,000 to 100,000
public-domain volumes. It holds 20 million volumes. Even if the project
included only Michigan's collection, it would be astounding.

Google is doing all the scanning and optical-character recognition with a
secret proprietary machine and promises not to damage the pages or
bindings. According to Google's contract with Michigan (the only contract
released to the public), the university will be offered a digital copy as

I have to confess, I am thrilled and dazzled by the potential of such a
machine and the research and distribution opportunities it presents. I
sincerely wish every Internet user had access to a full-text search of
every book in the Google libraries.

But, as we all know, we should be careful what we wish for. This
particular project, I fear, opens up more problems than it solves. It will
certainly fail to live up to its utopian promise. And it dangerously
elevates Google's role and responsibility as the steward, with no
accountability, of our information ecosystem. That's why I, an avowed
open-source, open-access advocate, have serious reservations about it.

It pains me to declare this: Google's Library Project is a risky deal for
libraries, researchers, academics, and the public in general. However,
it's actually not a bad deal for publishers and authors, despite their


I share another of my concerns with those librarians who haven't been as
supportive of Google as the five repositories that joined its project. It
goes beyond the intricacies of copyright and fair use to the fear that
Google's power to link files to people will displace the library from our
lives. Wayne A. Wiegand, a professor of library studies at Florida State
University, uses a phrase to describe his scholarly mission, studying "the
library in the life of the user." That means getting beyond the functional
ways people employ library services and collections. It means making sense
of what a library signifies to a community and the individuals in that
community. Libraries are more than resources. They are both places and
functions. They are people and institutions, budgets and books,
conversations and collections. They are greater than the sum of their

The presumption that Google's powers of indexing and access come close to
working as a library ignores all that libraries mean to the lives of their
users. All the proprietary algorithms in the world are not going to
replace them. There was a reason why Franklin, Jefferson, Madison, and
others of their generation believed the republic could not survive without
libraries. They are embodiments of republican ideals. They pump the blood
of a democratic culture: information.

So I worry. We need services like those provided by Google Library. But
they should be "Library Library" projects. Libraries should not be
relinquishing their core duties to private corporations for the sake of
expediency. Whichever side wins in court, we as a culture have lost sight
of the ways that human beings, archives, indexes, and institutions
interact to generate, preserve, revise, and distribute knowledge. We have
become obsessed with seeing everything in the universe as "information" to
be linked and ranked. We have focused on quantity and convenience at the
expense of the richness and serendipity of the full library experience. We
are making a tremendous mistake.

Siva Vaidhyanathan is an assistant professor of culture and communication
at New York University. He is the author of Copyrights and Copywrongs: The
Rise of Intellectual Property and How It Threatens Creativity (New York
University Press, 2001) and The Anarchist in the Library: How the Clash
Between Freedom and Control Is Hacking the Real World and Crashing the
System (Basic Books, 2004).