Google archivesearch in NYT

To: "Ann Okerson" <ann.okerson@yale.edu>, liblicense-l@lists.yale.edu
Subject: Google archivesearch in NYT
From: "Jim O'Donnell" <provost@georgetown.edu>
Date: Wed, 6 Sep 2006 21:08:17 EDT
Reply-to: liblicense-l@lists.yale.edu
Sender: owner-liblicense-l@lists.yale.edu

September 6, 2006
Google to Offer Print-Archives Searches
By JOHN MARKOFF

SAN FRANCISCO, Sept. 5 -- Google plans to announce on Wednesday
that it is offering a service that will permit Internet users to
search through the archives of newspapers, magazines and other
publications and uncover material that in some cases dates back
more than 200 years.

The new feature, to be named Google News Archive Search, will
direct Google searchers to both paid and free digital content on
publishers' Web sites, but will not directly generate revenue for
Google.

Google would not state how many publishers were taking part in
the new service, for which Google has independently indexed
material from online databases and will display the results both
as part of standard searches and through a new archive search
page (news.google.com/archivesearch). However, it announced a
number of partners including The Wall Street Journal, The New
York Times, The Washington Post, Time, Guardian Unlimited,
Factiva, Lexis-Nexis, HighBeam Research and Thomson Gale.

In contrast to Google's book scanning project, which has led to
legal skirmishes with some publishers over copyright issues, some
of the partners involved with the new service said they had been
pressing Google to offer access to their archives for several
years.

The databases included in the service are part of what some have
called the "dark Web" because they cannot be "spidered," or
indexed, by standard search engines and so have not been
accessible through them.

"We have been asking Google and other search engines to please
spider our content for some time," said Patrick Spain, chief
executive of HighBeam Research, a digital content library based
in Chicago.

Some of HighBeam's 3,300 publications and 40 million documents
will be available free, while in other cases users will see just
the headline and the first 600 characters of a document. To see
the whole thing, users must be subscribers to the firm's service,
which costs either $20 a month or $100 a year.

"This symbolizes a major moment," said Allen Weiner, a research
director at Gartner, a market research firm. Google has reached
an accommodation with the content companies that will benefit
both sides, he said.

In a number of cases the entire archive of publications like Time
and The Washington Post will be reachable via a Google search.
Time's entire database is already freely available and supported
by advertising. The magazine made its archive, consisting of
4,300 issues and 300,000 articles dating back to 1923, available
free through www.time.com last month.

With some publications, including The New York Times and The
Washington Post, searchers will be sent to Web sites where they
will be able to buy individual articles.

Google executives said that the archive service would not
generate revenue directly and that the company did not yet know
how it would make money from it.

"We're not focusing on monetization yet," said Anurag Acharya, a
distinguished engineer at Google who helped develop the service.
"This is new territory for us."

The new service is not encyclopedic, Mr. Acharya said, but
instead presents users with a representative list of relevant
articles that are arranged in a timeline fashion. The service
tries to offer a pointer to the time period that is most relevant
to the search query. For example, in the case of the search
phrase "moon landing," an arrow points the user to 1969.

Mr. Weiner of Gartner said he expected Google to link the archive
service to its Google Checkout payment system. In the future, he
said, video archives are almost certain to be added.

"They have to convince CBS News to make Edward R. Murrow
available," he said.

Copyright 2006 The New York Times