
Re: Role of arXiv

>> On Thu, 7 Oct 2010, Joseph Esposito wrote (in liblicense):
>> >
>> >JE:
>> >Finally, once again taking the centrality of arXiv to the
>> >community it serves into consideration, what would happen if a
>> >modest deposit fee were assessed--say, $50 per article?
>> SH:
>> The IR cost per paper deposited will be closer to 50c than $50, once all
>> universities are hosting their own output, and mandating that it be
>> deposited.
> SW:
> I do not think the 50c number is supported by fact or by trend. I know
> that for Cornell's IR the number is much closer to $50 than to 50c if
> one divides cost to operate by the number of new submissions in the
> same period. (I would love to see data for other IRs.)

Simeon, I can only repeat:

*"once all universities are hosting their own output, and 
mandating that it be deposited."*

Cornell has not mandated deposit, and it is far from hosting all 
of its annual output. Ditto for all but about 100 universities so 
far worldwide: http://www.eprints.org/openaccess/policysignup/

> SW:
> For arXiv the number is <$7. We have the benefit of significant scale
> (65k submissions/year) and a user community that require very little
> hand-holding.

Yes, you have significant scale. But, for Arxiv, Cornell -- and 
the other subsidisers, including some universities -- are paying 
for all deposits, from all universities, in one central 

To repeat: The sensible solution (and probably the only 
practical, affordable one) is for Arxiv -- and any other central 
archives like it, in other fields -- to harvest their content 
automatically from Institutional Repositories that host their own 
research output.
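
A minimal sketch of what such automatic harvesting could look like, assuming the Institutional Repository exposes an OAI-PMH interface (the standard protocol for repository metadata harvesting; this thread does not name a protocol). The base URL is hypothetical, and the sketch only builds the request, it does not fetch or parse anything:

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for incremental harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Only records added or changed since this date (YYYY-MM-DD)
        params["from"] = from_date
    if set_spec:
        # Optional subset of the repository, if the IR defines sets
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# HYPOTHETICAL repository endpoint, for illustration only
print(list_records_url("https://ir.example.edu/oai", from_date="2010-10-01"))
```

A central archive would issue a request like this to each IR on a schedule, passing the date of its previous harvest as `from_date`.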

The annual cost per paper deposited will be far less for an 
Institutional Repository -- hosting only its own research output 
-- *once the institutions are indeed hosting all of their annual 
research output* -- and not a small fragment of it, as now.
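
The arithmetic behind these per-deposit figures can be sketched as follows. The arXiv numbers are the ones quoted above; the IR budget and deposit counts are invented purely for illustration, and the sketch assumes (which may not hold in practice) that an IR's annual operating cost stays roughly fixed as deposits increase:

```python
def cost_per_deposit(annual_cost, deposits_per_year):
    """Annual operating cost divided by new deposits in the same period."""
    if deposits_per_year <= 0:
        raise ValueError("deposits_per_year must be positive")
    return annual_cost / deposits_per_year


# From the thread: about 65,000 arXiv submissions/year at under $7 each,
# which implies an annual operating cost below 65,000 * 7 dollars.
ARXIV_SUBMISSIONS = 65_000
ARXIV_COST_CEILING = ARXIV_SUBMISSIONS * 7

# HYPOTHETICAL IR budget and deposit counts, not taken from the thread
IR_ANNUAL_COST = 20_000.0
for deposits in (400, 4_000):
    per_paper = cost_per_deposit(IR_ANNUAL_COST, deposits)
    print(f"{deposits} deposits/year: ${per_paper:.2f} per deposit")

print(f"Implied arXiv cost ceiling: ${ARXIV_COST_CEILING:,} per year")
```

Under the fixed-cost assumption, a tenfold increase in deposits cuts the per-deposit cost tenfold; how close that gets to 50c depends on the real budget and the real annual output.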

Most institutions have IRs that are near-empty rather than at 
capacity (as far as OA's target output is concerned). (The 
cost/benefit of hosting their grey literature and other kinds of 
content is another matter, but not to be reckoned into this 
comparison with Arxiv regarding per-paper cost. IRs can archive 
lots of kinds of things, including family photo albums, if 

And Cornell, of course, has the double burden of hosting a 
near-empty, unmandated IR for its own refereed research output, 
plus the (partial) expense of hosting Arxiv for the world!


Annual Costs Per Deposit of Hosting Refereed Research Output 
Centrally Versus Institutionally 

Why Cornell's Institutional Repository Is Near-Empty 

More: http://bit.ly/MoreOnCornellPolicy

> SW:
> This is not to say that IRs aren't worth the support from their local
> institution! Compared with the cost of doing research resulting in an
> article, $50 is pocket change. I think that a key driver for IRs is
> that they align well funding with mission. At Cornell we consider it a
> worthwhile service for our faculty to provide considerably more
> support for the IR than arXiv could provide its users.

There are many valid reasons for institutions creating and 
supporting their IRs -- but only if they mandate that they be 
filled with their target content.

Among those many valid reasons are economic ones:

"Among the many important implications of Houghton et al's 
(2009) timely and illuminating JISC analysis of the costs and 
benefits of providing free online access (Open Access, OA) to 
peer-reviewed scholarly and scientific journal articles one 
stands out as particularly compelling.  It would yield a 
forty-fold benefit/cost ratio if the world's peer-reviewed 
research were all self-archived by its authors so as to make it 
OA.  There are many assumptions and estimates underlying Houghton 
et al's modeling and analyses, but they are for the most part 
very reasonable and even conservative. This makes their strongest 
practical implication particularly striking: The 40-fold 
benefit/cost ratio of providing Green OA is an order of magnitude 
greater than all the other potential combinations of alternatives 
to the status quo analyzed and compared by Houghton et al. This 
outcome is all the more significant in light of the fact that 
self-archiving already rests entirely in the hands of the 
research community (researchers, their institutions and their 
funders), whereas OA publishing depends on the publishing 
community. Perhaps most remarkable is the fact that this outcome 
emerged from studies that approached the problem primarily from 
the standpoint of the economics of publication rather than the 
economics of research."

Harnad, S. (2010) The Immediate Practical Implication of the 
Houghton Report: Provide Green Open Access Now. Prometheus 28 
(1). pp. 55-59. http://eprints.ecs.soton.ac.uk/18514/

> SW:
> (As a side note I mention that at arXiv we consider free access and
> free submission to be foundational and thus did not consider an
> author-pays model. See http://arxiv.org/help/support/whitepaper for
> more details of our business planning process.)

Arxiv is a repository for articles that have been or will be 
refereed and published by *journals*. There is an "author pays" 
model for paying for that refereeing and publishing through 
author/institution publication fees (for OA journals, and a 
subscription model for non-OA journals, still the vast majority).

But there is not, never was, and never need be an "author pays" 
model merely for the *deposit* of the author's draft of those 
same articles.

Arxiv is a repository, providing access, not a publisher of 
refereed research. The journals are still doing that. And they 
need to be paid for it, either via subscriptions or via "author 

>> > JE:
>> >I am not
>> >suggesting that this should or should not happen; I am simply
>> >wondering what the outcome would be. (BioMed Central, PLoS, and
>> >Hindawi all charge more than this, though they provide additional
>> >services.) Would the number of deposits remain about the same?
>> >Would the number drop? And if it dropped, how precipitously?
>> SH:
>> Guess again! Once the burden of hosting, access-provision and archiving is
>> offloaded onto each author's institution, the only service that journals
>> will need to provide is peer review, and hence journals will be charging
>> institutions a lot less than they are charging now. (Print editions as
>> well as online editions and their costs will be gone too.)
> SW:
> Overlay journals are also very interesting and I hope will grow in
> number. This does not seem to be happening yet though. A trend we see
> right now is a rather problematic increase in the number of low
> quality author-pays website-and-little-else online journals. They
> aggressively promote their articles through open-access services such
> as arXiv while established journals wrestle with the transition.

On this you are entirely right, Simeon (though I think the term 
"overlay journals" is a misdescription of what may eventually 
come to pass, once all refereed, published articles are being 
self-archived in their authors' IRs).

(And Cornell is aiding and abetting this trend, by agreeing 
pre-emptively to subsidize "author pays" costs for (some of) 
their authors' articles while failing to mandate self-archiving 
of all of their authors' articles, cost-free!)

See: http://bit.ly/PreemptiveCOPEandSCOAP3

Harnad, S. (2009) The PostGutenberg Open Access Journal. In: 
Cope, B. & Phillips, A. (Eds.) The Future of the Academic Journal. 
Chandos. http://eprints.ecs.soton.ac.uk/15617/

> SW:
> In all of this the tools necessary to use IR content effectively still
> lag well behind the facilities offered by subject repositories.

Many of the necessary tools are not needed at the individual IR 
level, because search occurs at the harvester level.

What IRs lack is not tools, but content. Once we have the 
content, developing the tools is a piece of cake.

> SW:
> One should also not underestimate the cost of building effective
> collections over harvested data (see, for example, the NSDL experience
> http://arxiv.org/abs/cs/0601125 ).

We can cross that bridge when we get to it -- if Google Scholar 
does not cross it for us -- once the target content is indeed 
being deposited in the IRs, globally, because deposit has been 

Stevan Harnad