mining and rights



Joe Esposito's inquiry about the licensing issues raised by wanting to use large databases of journal articles for data mining (on which I would be very interested to hear comment from publishers) connects with something in an interview with Cliff Lynch in the May/June Educause Review. Excerpts:

"We now have about fifty years of investment in text analysis
and text mining. The intelligence community is still spending
heavily on these technologies, and industry is getting very
interested for lots of reasons. For example, I'm told that the
pharmaceutical industry is very interested in computational
mining of the biomedical literature base. This is an important
part of what is at stake in these massive digitization programs.
Are we going to be able simply to read the digitized works, or
are we going to be able to compute on them at scale as well?
(Presumably, Google will be able to compute on everything it
digitizes, even the in-copyright works. Almost nobody seems to
have figured this out yet! What an amazing and unique resource.
It's not clear what the academy broadly will be able to compute
on.) The answer will make a big difference for the future of
scholarship. This move to computation on text corpora is going
to have vast implications that we haven't even thought about yet
-- implications for copyright, implications for publishers,
implications for research groups. In fact, it may represent the
point of ultimate meltdown for copyright as we know it today."

Leaving aside the undoubtedly substantial potential of text mining: are there any indications that mining issues are affecting the way publishers are granting or withholding access to material?

Jim O'Donnell
Georgetown U.