
Methodology for estimating cost per article at an institution



Scott Plutchak, Ann Okerson and others have started a constructive
dialog by comparing subscription vs. producer (author) payments at their
own institutions.  So that librarians can begin to calculate these figures
for their own institutions, I have provided a methodology for estimating
them below.  The two most challenging tasks are 1) estimating the total
article output of an institution in the absence of a single comprehensive
index, and 2) correcting for articles that have authors from multiple
institutions.  I can imagine, even before posting this, receiving a number
of responses that pick holes in the methodology.  I ask those respondents
to propose a correction or a simpler model (Occam's Razor).


ESTIMATING THE ARTICLE COVERAGE OF ISI

Number of titles indexed by ISI (SCI, SSCI, AHCI) = 8,769
Estimate of the number of scholarly journals = 20,000

This means that ISI is only indexing 44% of journals; however, not all
journals publish the same number of articles per year and ISI attempts to
index the most prolific core journals.  If we assume a logarithmic
(skewed)  distribution, these 44% of indexed journals represent nearly 92%
of the number of articles published (i.e. log 8,769 / log 20,000).
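The coverage arithmetic above can be reproduced with a few lines of
Python.  This is only a sketch of the log-ratio assumption stated in the
paragraph; the function name coverage_share is mine, and the 20,000 figure
is the same rough estimate used above, not a measured count.

```python
import math

def coverage_share(indexed_titles: int, total_titles: int) -> float:
    """Share of articles covered, under the log-ratio assumption in the text."""
    return math.log(indexed_titles) / math.log(total_titles)

# Figures quoted in the post.
title_share = 8769 / 20000                    # share of journal titles indexed
article_share = coverage_share(8769, 20000)   # assumed share of articles indexed

print(f"{title_share:.1%} of titles, {article_share:.1%} of articles")
# prints: 43.8% of titles, 91.7% of articles
```

Substitute a different estimate of the total number of scholarly journals
to see how sensitive the coverage figure is to that assumption.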


CORRECTING FOR MULTIPLE-AUTHOR ARTICLES

Number of Cornell author "hits" in ISI for 2003 = 5,465

However, many of these articles have multiple authors from multiple
institutions.  If we assume that the first author would pay all publishing
charges (this is the BMC model, and it seems more reasonable than
splitting $525 among 100 authors of a high-energy physics article), then
how many of the above "hits" represent first-author articles?

ISI displays only the first 500 records and makes it difficult to get an
absolute count.  We need to resort to sampling.  I took a distributed
sample of 100 articles (the first and last article on each page of
results), and found that 61 (of the 100) included a Cornellian as a first
author.  Therefore the % first author hits = 61% (do this calculation for
your own institution).

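The 5,465 hits cover only ISI-indexed journals, so the first-author count
is also divided by the ISI coverage estimated above (log 8,769 / log
20,000, about 0.917).

The estimate of article output by first authors at Cornell = 5,465 x 0.61
/ (log 8,769 / log 20,000) = 3,636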
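The figure of 3,636 is reproduced when the first-author hits are divided
by the ISI coverage ratio from the previous section.  The sketch below
walks through that arithmetic; the variable names are mine, and applying
the coverage correction at this step is my reading of how the number was
reached.  Replace the inputs with the counts for your own institution.

```python
import math

# Figures quoted in the post (Cornell, 2003).
hits = 5465                    # ISI records with a Cornell author
sample_size = 100              # records sampled from the result pages
first_author_in_sample = 61    # sampled records with a Cornell first author

first_author_share = first_author_in_sample / sample_size

# ISI coverage under the log-ratio assumption from the previous section.
isi_coverage = math.log(8769) / math.log(20000)

# First-author articles in ISI-indexed journals only.
indexed_first_author = hits * first_author_share

# Scale up to include journals that ISI does not index.
estimated_output = indexed_first_author / isi_coverage

print(round(indexed_first_author), round(estimated_output))
# prints: 3334 3636
```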


ESTIMATING THE COST PER ARTICLE IF AN INSTITUTION MOVED TO AN OPEN ACCESS
MODEL

Scenario 1.  The entire industry swaps from a subscription model to an
open-access model overnight, and all authors publish in open-access
journals.  The breakeven cost/article = total amount spent on purchasing
journals / total article output of an institution.  For Cornell, this
number is about $1,100/article.
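As a sketch of the Scenario 1 calculation, the breakeven figure is journal
spending divided by first-author article output.  The spending total is
not given in this post, so the value below is a hypothetical placeholder
chosen only so that the result lands near $1,100; the function name is
also mine.  Use your own library's journal expenditure.

```python
def breakeven_cost_per_article(journal_spend: float, article_output: float) -> float:
    """Journal spending divided by the number of first-author articles."""
    if article_output <= 0:
        raise ValueError("article_output must be positive")
    return journal_spend / article_output

# HYPOTHETICAL spending figure, not stated in the post.
HYPOTHETICAL_JOURNAL_SPEND = 4_000_000
ARTICLE_OUTPUT = 3636   # first-author estimate from the previous section

cost = breakeven_cost_per_article(HYPOTHETICAL_JOURNAL_SPEND, ARTICLE_OUTPUT)
print(f"${cost:,.0f} per article")
# prints: $1,100 per article
```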

Scenario 2.  The entire industry except Elsevier moves from a subscription
to OA model.  The library still purchases Elsevier journals and authors
still publish in them.  Calculate the number of articles published in
Elsevier journals in 2003 by searching for your institution in Science
Direct advanced journal search for affiliation.  Correct for multiple
authors.  Remove money spent on Elsevier journals and number of articles
published in Elsevier journals and recalculate the cost/article.  For
Cornell, this number is under $800/article.  In other words, OA has to
cost less than $800/article for the model to start saving money.

Scenario 3.  Same as 2, but assume that none of the large publishers
participate (Kluwer, Wiley, Springer).  For Cornell, the cost/article is
less than $400/article.  The reason that the cost per article would need
to be much lower in Scenarios 2 and 3 is that the large commercial
publishers take up a disproportionate share of a library's funds compared
to the number of articles published in their journals.  For example, the
Elsevier ratio is 43% / 16% (share of funds / share of articles).
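
The Scenario 2 recalculation can be sketched as follows.  I am assuming
here that the Elsevier ratio means 43% of journal spending and 16% of
first-author articles; the spending total is a hypothetical placeholder
(it is not stated in this post), and the function name is mine.

```python
def breakeven_excluding(journal_spend: float, article_output: float,
                        spend_share: float, article_share: float) -> float:
    """Breakeven cost/article after removing one publisher's spending and articles."""
    remaining_spend = journal_spend * (1 - spend_share)
    remaining_articles = article_output * (1 - article_share)
    return remaining_spend / remaining_articles

# HYPOTHETICAL spending figure, not stated in the post.
HYPOTHETICAL_JOURNAL_SPEND = 4_000_000
ARTICLE_OUTPUT = 3636   # first-author estimate from the earlier section

# Assumed reading of the Elsevier ratio: 43% of funds, 16% of articles.
scenario_2 = breakeven_excluding(HYPOTHETICAL_JOURNAL_SPEND, ARTICLE_OUTPUT,
                                 spend_share=0.43, article_share=0.16)
print(f"${scenario_2:,.0f} per article")
```

With these inputs the result falls under $800, in line with the Scenario 2
figure.  Because the removed publisher takes a larger share of spending
than of articles, the breakeven cost drops.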

Philip Davis, Life Sciences Bibliographer
Mann Library, Cornell University, Ithaca, NY 14853
(607) 255-7192 ;  (607) 255-0318 fax
pmd8@cornell.edu
http://people.cornell.edu/pages/pmd8/