Re: a preservation experience



The lesson of this example is that authors should always also deposit a
copy of their published papers in an institutional archive. This is known
as author *self*-archiving: the copy stays under the author's and the
institution's control. I would not expect any properly
conceived, properly managed institutional archive, with full institutional
backing, to delete or lose any paper once accepted into the archive. By
doing this the author gets all the benefits of OAI search as well as
Google and the Wayback Machine, etc., and is effectively participating in
a mini-LOCKSS scheme (multiple copies).

Steve Hitchcock
IAM Group, School of Electronics and Computer Science
University of Southampton SO17 1BJ,  UK
Email: sh94r@ecs.soton.ac.uk
Tel:  +44 (0)23 8059 3256     Fax: +44 (0)23 8059 2865

At 17:37 21/10/03 -0400, you wrote:
It might not be irrelevant to this list's consideration of issues
surrounding digital resources and their preservation to hear a little
story of discovery.

A colleague had 'published' an article in the proceedings of an
international conference about three years ago.  The proceedings were only
published on-line, and she had linked from her own home page to the
official version.  On looking for that article a couple of days ago (to
verify some quotations and figures), she discovered that the original
publisher had either moved or deleted the original file.  A moderately
thorough search of the site showed that it was advertising *next* year's
conference in the same series, but the publication itself was gone.  A
Google search was no help.

Consulted on this, I wondered what would happen if . . .  So I went to the
Internet Archive site (www.archive.org) and used their "Wayback Machine":
type in the URL of the desired resource and see what happens.  In a few
seconds (good DSL), I had the list.  Hits are listed by Wayback by date of
archiving sweep -- thus, if the same file was modified over time, captures
at different dates will capture different versions.  There were 6 hits for
the year 2001 and 1 for February 2002, none since (suggesting when the
original was lost).  The first hit proved a null set -- file not found.
The second through seventh were all gold:  the original file in its
original 'published' form, complete with all graphics and links.
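
For anyone who would rather script that lookup than type the URL into the
web form, here is a minimal sketch. It assumes the Internet Archive's CDX
query endpoint at web.archive.org/cdx/search/cdx, which returns JSON rows
with a header row first; the helper names and the example.org address are
hypothetical, not taken from the story above. The sketch only builds the
query URL and interprets an already-decoded response, so fetching it (for
instance with urllib.request) is left to the reader.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Wayback Machine's CDX capture index.
WAYBACK_CDX = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url, year_from=None, year_to=None):
    """Build a query URL that lists the captures of page_url."""
    params = {"url": page_url, "output": "json"}
    if year_from is not None:
        params["from"] = str(year_from)
    if year_to is not None:
        params["to"] = str(year_to)
    return WAYBACK_CDX + "?" + urlencode(params)


def parse_captures(rows):
    """Turn decoded JSON rows (header row first) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def usable_captures(captures):
    """Keep captures where the archived fetch succeeded (HTTP 200)."""
    return [c for c in captures if c.get("statuscode") == "200"]


def snapshot_url(capture):
    """Address of one archived copy, by capture timestamp."""
    return "https://web.archive.org/web/{}/{}".format(
        capture["timestamp"], capture["original"]
    )
```

Filtering on the status code matters because, as in the account above, a
capture can record a "file not found" page rather than the document itself.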

I was gobsmacked!  It left me feeling as I do when I try some improbable
keystroke combination deep in the bowels of Microsoft Word, and something
I thought impossible suddenly happens.  I feel equally sure that the
achievement might be hard to reproduce.  (Naturally we made a copy to hold
onto.)

Does this model suggest the value of a comprehensive Internet archive?
Does it exemplify the "Lots of Copies Keep Stuff Safe" principle?  Or was
it gross dumb luck?  I leave these questions to others to discuss.

Jim O'Donnell
Georgetown University