from CHE: libraries and search engines



Of possible interest; excerpted from the Chronicle Online
___________________________

Libraries Aim to Widen Google's Eyes
Search engines want to make scholarly work more visible on the Web
 
By JEFFREY R. YOUNG

Google and other commercial search engines are often the first source that
students and professors turn to when doing research, but search engines
sometimes fail to include material contained in free scholarly archives.

That may be about to change.

If students searched Google today for George McTurnan Kahin's Governments
and Politics of Southeast Asia, for example, they would come up with no
more than a few mentions of the work. The full text of the book, however,
is available free in an online collection run by the University of
Michigan at Ann Arbor. It just doesn't seem to have been indexed by
Google.

Many of the 22,000 volumes in the university's digital collection, in
fact, are off the radar of Google and other commercial search engines,
says John P. Wilkin, an associate university librarian. The reason: The
electronic books are in Web databases that are difficult for many search
engines to "crawl." Crawling is the process by which a search engine's
servers regularly scan and index the Internet.

Other universities use similar methods to store their digital collections,
meaning that many academic collections are out of the range of popular
search engines.

No one knows how many free scholarly materials are invisible to search
engines, whose creators are usually secretive about their indexing
methods. Experts guess that the total reaches tens of millions of pages.

But search-engine providers suddenly seem interested in covering that
territory. In recent months, representatives of both Google and Yahoo, two
of the most widely used search engines, have begun working with librarians
and colleges to develop ways to shine light on the deep Web.

An engineer from Google attended a meeting of the Digital Library
Federation in New Orleans last month to talk with academic-library leaders
about how the company's search engine could better reach their content.
Google is in the midst of a pilot project with DSpace, a digital library
at the Massachusetts Institute of Technology, and 16 other institutions
(The Chronicle, April 23). And Google is working with the OCLC Online
Computer Library Center, a nonprofit library organization, on a pilot
project to point Google's users to printed books in local and academic
libraries.

In March Yahoo announced its Content Acquisition Program, in which the
company's search-engine division is working with nine institutions,
including Michigan, to bring more scholarly content into its online-search
tool.

Experts say the reason for the sudden interest in academic content by
search-engine makers is simple: competition. Google's addition of features
to its search tool coincides with the initial public stock offering for
which it filed last month. Microsoft has announced plans to build a new
search engine of its own, and Yahoo has renewed its focus on searching,
making search engines the latest tech-industry battleground.

"These search engines are now in competition for quality content" rather
than just quantity in their search results, says Herbert Van de Sompel, a
researcher at Los Alamos National Laboratory who has designed software to
help search engines find academic materials.

[SNIP]