
Re: archiving thoughts

To me there is very little difference among the scenarios Kimberly
Parker lists. The real problem is the technical one -- keeping the
material accessible to current browsers (or tools). But, let's
blithely assume that this will be done -- although I am sure we will lose
a lot of material before we realize what it takes to plan for really long
term preservation and access in the electronic world.

One statement is a problem -- and will be for most publishers. She
assumes that most material will eventually be available in the public domain --
which implies that everyone can make whatever use of the material that
they wish. This will not happen before the copyright runs out.

However, and maybe this is what Kimberly meant to say, most material
probably will (and should) be made available for free, but the
publisher will certainly want to retain the copyright -- if only to make
different versions and derivative works.

The rest of the scenario makes reasonable sense. For academic material (in
the sciences), only the latest information will serve the researcher. So
the incentive to purchase the latest version is a strong one. Making the
older material available for free (but not for commercial re-use) serves
as advertising for the latest edition. It makes great sense for a
publisher to do that.

Whether the material is in serial form, or integrated edition form (which
we will soon get away from in the interconnected Web world), or the loose
leaf form (which will come to dominate in the Web world) is relatively
unimportant. The driving force is that the latest edition is important to
the researcher.

But, this discussion bothers me because it is based on treating the
electronic material as if it were simple static information on paper. The
real evolution will be away from simple words on paper (or on the screen)  
and into interactive information: live math, equations into which a user
can plug her data, 3-D visualizations with which the user can interact, up
to the minute databases of small chunks of information which are assembled
upon request for each user from the latest information from a whole variety
of Web sources -- information not just from one publisher, but from many.

This is the world we should be preparing for -- and it is a lot harder to
understand what to do. But, let's not take up too much time discussing how
to handle a situation which will not be relevant in five years.
Peter B. Boyce    -   Senior Consultant for Electronic Publishing, AAS
email: pboyce@aas.org
Summer: 33 York St., Nantucket, MA 02554          Phone: 508-228-9062
Winter: 4109 Emery Place, Washington, DC 20016    Phone: 202-244-2473

At 07:42 PM 1/31/00 -0500, you wrote:
>Recently on the e-collections list there's been a thread going about
>archiving/perpetual access requirements for web subscriptions to
>databases. I contributed something to that discussion that David Goodman
>suggested would also interest the liblicense community, since archiving
>has been a topic we discussed in the past.  So here's a revised version of
>what got posted on e-collections.  Maybe it will start another lively
>discussion on this list...
>--Kimberly Parker
>Since I think about the "archiving" question on and off regularly, I
>occasionally come up with ideas to which I like to seek reactions.
>Please keep in mind as you read that I'm not trying to say the technical
>questions have been solved.  However, I think there are ways that we can
>think about the issue that can inform the technical and economic
>directions that companies and libraries will choose to pursue.  And I hope
>this opens debate!
>There are at least two aspects of the "archiving of e-resources" issue
>that are different but interrelated.  One aspect is the
>"archive/preservation/historical record." The other aspect is the
>"perpetual access/ownership of subscribed-to content."
>The approach to these questions is different for different types of
>e-resources.  There are serial publications that have distinct and
>discretely available chunks of content that become available sequentially.
>There are also coherent publications that, while they may be updated over
>time, have integrated content.
>In the former case ("e-serials" for short), we already see many solutions
>in place. Companies and institutions are tackling the "e-serials" archive
>and perpetual access problem by stating some level of commitment to
>maintaining the content of the data over time.  If enough people do this
>and in enough places, the archive question is pretty much solved. (I say
>that so blithely.)  If there is commitment to maintaining *subscription
>records* (who subscribed to which years) for as long as the maintaining
>company cares to restrict subscribers' access to *only* those years (I'm
>assuming here that at some point, the whole older record might end up in
>the public domain), then perpetual access becomes available in an
>acceptable form.
>The latter case ("e-databases" for short) can be thought of in two ways.
>One way is as regular "editions" of a work like an encyclopedia or a
>directory.  Another way is as a loose-leaf publication which has parts
>that are regularly edited or appended.  Let's take the "editions" model
>first.
>If the producing company is willing to take regular (once a year, once
>every two years, once every five years?) snapshots of their product and
>maintain those, they have content to which they can provide perpetual
>access while still providing an economic incentive for people to keep
>subscribing.  When an institution stops subscribing, they'd get access to
>the next "oldest" snapshot until the "new" snapshot (containing, at least
>partially, data they subscribed to) is available, at which point they'd
>get access to that newer snapshot, and there their access would remain. In
>this model, there's a tension between producing snapshots more often,
>costing more money to maintain more versions, and waiting longer to
>produce snapshots, thereby giving some subscribers better (more complete,
>more up-to-date) data than they originally paid for.  Of course, a
>producing company might choose to roll through snapshots -- deleting older
>ones, thereby reducing their load of versions to maintain, and constantly
>providing perpetual access to all subscribers to the oldest version.
>This idea, however, would only solve perpetual access and would not solve
>the archive question.
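The snapshot access rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not anything a publisher actually runs; the function name is mine, and years stand in for snapshot dates:

```python
def accessible_snapshot(snapshot_dates, sub_end, today):
    """Return the snapshot (by year) a lapsed subscriber may see.

    snapshot_dates: sorted years in which snapshots were taken.
    sub_end: year the subscription lapsed.
    today: current year (only snapshots taken on/before today exist yet).
    """
    available = [d for d in snapshot_dates if d <= today]
    if not available:
        return None
    # The first snapshot taken at/after the lapse contains (at least in
    # part) data the institution subscribed to; once it exists, access
    # settles there and remains.
    later = [d for d in available if d >= sub_end]
    if later:
        return later[0]
    # Until then, the subscriber falls back to the newest existing snapshot.
    return available[-1]
```

For example, an institution that cancelled in 1999 would see the 1998 snapshot until the 2000 snapshot is produced, and then the 2000 snapshot forever after.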
>In the "editions" model, the archive is intended to preserve the
>historical record, and it is important to have all the different
>"editions" available for a scholar to review.  Just as some libraries
>collect every edition of Encyclopedia Britannica, or every fifth edition
>of the CRC Handbook of Chemistry and Physics, the goal would be to have a
>sequence of snapshots available for review.
>The "loose-leaf" model requires more technical capability.  Here I am
>assuming that every bit of data added into the coherent "e-database" would
>have an invisible (or visible?) time-stamp/datestamp.  With appropriate
>filters in the interface, what can be seen by any specified subscriber is
>only that data that passes certain time/date-signature filters.  For
>perpetual access, subscriber information needs to be maintained to record
>what date filtered version is appropriate.  For the archive, the whole
>database including these time/date signatures is maintained indefinitely
>and future scholars can filter the data however they need to see what
>information was available at any moment in time.
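The loose-leaf filtering idea can be sketched with time-stamped records. The record shape and field names here are illustrative assumptions, not anything from a real product: each chunk of data carries the date it entered the database and, if it was later edited out, the date it left.

```python
from datetime import date

# Illustrative record shape: every chunk of data carries an "added"
# date-stamp, and a "removed" date-stamp if it was later edited out.
records = [
    {"text": "entry A", "added": date(1998, 3, 1),  "removed": None},
    {"text": "entry B", "added": date(1999, 7, 15), "removed": date(2000, 1, 10)},
    {"text": "entry C", "added": date(2000, 2, 2),  "removed": None},
]

def visible_on(records, as_of):
    """Return the records a reader filtered to `as_of` would have seen."""
    return [r["text"] for r in records
            if r["added"] <= as_of
            and (r["removed"] is None or r["removed"] > as_of)]
```

The same filter serves both purposes: a lapsed subscriber's interface would pin `as_of` to the subscription's end date, while a future scholar could move `as_of` freely to reconstruct the database as it stood at any moment.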
>So...are there gaping holes in my logic?  Besides the economic ones, that
>is.
>Kimberly Parker
>Electronic Publishing and Collections Librarian
>Yale University Library
>130 Wall Street              Voice (203) 432-0067
>P.O. Box 208240              Fax (203) 432-8527
>New Haven, CT  06520-8240