Re: Sub-sidy/cription for ArXiv



On 26-Jan-10, at 7:15 PM, Nat Gustafson-Sundell wrote:

> I don't expect local repositories to ever offer quality 
> control.

Of course not. They are merely offering a locus for authors to 
provide free access to their preprint drafts before submitting 
them to journals for peer review, and to their final drafts 
(postprints) after they have been peer-reviewed and accepted for 
publication by a journal.

Individual institutions cannot peer-review their own research 
output (that would be in-house vanity-publishing).

And global repositories like arxiv or pubmedcentral or citeseerx 
or google scholar cannot assume the peer-review functions of the 
thousands and thousands of journals that are actually doing the 
peer review today. That would add billions to their costs 
(making each into one monstrous (generic?) megajournal: near 
impossible, practically, if it weren't also totally unnecessary 
-- and irrelevant to OA and its costs).

> Also, users have said again and again that they prefer 
> discovery by subject, which will be possible for semantic docs 
> in local repositories or better indexes (probably built through 
> better collaborations), but not now.

Search should of course be central and subject-tagged, over a 
harvested central collection from the distributed local IRs, not 
local, IR by IR.

(My point was that central *deposit* is no longer necessary or 
desirable, either for content-provision or for search. The 
optimal system is institutional deposit (mandated by institutions 
as well as funders) and then central harvesting for search. 
http://bit.ly/62M14a)
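
A minimal sketch of the deposit-locally, harvest-centrally arrangement described above, using in-memory data only. The repository names, the record fields and the `harvest` function are all hypothetical, chosen for illustration; a real harvester would collect metadata over a protocol such as OAI-PMH rather than from Python dictionaries.

```python
from collections import defaultdict

# Hypothetical metadata records, one list per institutional repository (IR).
LOCAL_REPOSITORIES = {
    "ir.university-a.example": [
        {"id": "a:1", "title": "Paper on optics", "subjects": ["physics"]},
        {"id": "a:2", "title": "Paper on syntax", "subjects": ["linguistics"]},
    ],
    "ir.university-b.example": [
        {"id": "b:1", "title": "Paper on lasers", "subjects": ["physics"]},
    ],
}


def harvest(repositories):
    """Build one central subject index from many local repositories.

    Each paper is deposited once, in its own institution's IR. The
    harvester only collects metadata and records which IR holds the paper.
    """
    index = defaultdict(list)
    for source, records in repositories.items():
        for record in records:
            for subject in record["subjects"]:
                index[subject].append((source, record["id"], record["title"]))
    return dict(index)


central_index = harvest(LOCAL_REPOSITORIES)
for subject in sorted(central_index):
    print(subject, len(central_index[subject]))
```

The point of the sketch is only that search by subject works across institutions even though no paper was ever deposited centrally.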

> I agree that it would be great if local repositories were more 
> used, and eventually, the systems will be in place to make it 
> possible, but every study I've seen still shows local 
> repository use to remain disappointingly low, although some 
> universities are doing better than others.

"Use" is ambiguous, as it can refer both to author use (for 
deposit) and user use (for search and retrieval). We agree that 
the latter makes no sense: users search at the harvester level, 
not the IR level.

But for the former (low author "use," i.e., low levels of 
deposit), the solution is already known: Unmandated IRs (i.e., 
most of the existing c. 1500 IRs) http://roar.eprints.org/ are 
near empty (of OA's target content, which is preprints and 
postprints of peer-reviewed journal articles) whereas mandated 
IRs (c. 150, i.e., 1%!) 
http://www.eprints.org/openaccess/policysignup/ are capturing, or 
on the way to capturing, their full annual postprint output.

So the solution is mandates. And the locus of deposit for both 
institutional and funder mandates should be institutional, not 
central, so the two kinds of mandates converge rather than 
compete (requiring multiple deposit of the same paper). 
http://openaccess.eprints.org/index.php?/archives/369-guid.html

For the special case of arxiv, with its long history of 
unmandated deposit, a university's IR could import its own remote 
arxiv deposits (or export its local deposits to arxiv) with 
software like SWORD, but eventually it is clear that 
institution-external deposit makes no sense:

Institutions are the universal providers of all peer-reviewed 
research, funded and unfunded, across all fields. 
One-stop/one-step local deposit (followed by automatic import, 
export, and harvesting to/from whatever central services are 
needed) is the only sensible, scalable and sustainable system, 
and also the one that is most conducive to the growth of 
universal OA deposit mandates from institutions, reinforced by 
funder mandates likewise requiring institutional deposit rather 
than discouraged by gratuitously requiring institution-external 
deposit.

> Inter-institutional repositories by subject area (however 
> broadly defined) simply work better, such as arXiv or even the 
> Princeton-Stanford repository for working papers in the 
> classics.

"Work better" for what? Deposit or search? You are conflating the 
locus of search (which should, of course, be cross-institutional) 
with the locus of deposit, which should be institutional, in 
order to accelerate institutional deposit mandates and in order 
to avoid discouraging their adoption and compliance by raising 
the prospect of having to deposit the same paper in more than one 
place.

(Yes, automatic import/export/harvesting software is indifferent 
to whether it is transferring from local IRs to central CRs or 
from central CRs to local IRs, but the logistics and pragmatics of 
deposit and deposit mandates, since the institution is always the 
source of the content, make it obvious that one-time deposit 
institutionally fits all output, systematically and tractably, 
whereas willy-nilly IR/CR deposit, depending on fields' prior 
deposit habits or funder preferences, is a recipe for many more 
years of the confusion, inaction, absence of mandates, and 
near-absence of OA content that we have now.)

> Currently, universities are paying external middlemen an 
> outsized fee for validation and packaging services.  These 
> services can and should be brought "in-house" (at least as an 
> ideal/goal to develop toward whenever the opportunities can be 
> seized) except in cases where prices align with value, which 
> occurs still with some society and commercial publications.

I completely agree that along with hosting their own 
peer-reviewed research output, and mandating its deposit in their 
own IRs, institutions can also use their IRs (along with 
specially developed software for this purpose) to showcase, 
manage, monitor, and measure their own research output. That is 
what OA metrics (local and global) will make possible. 
http://www.openscholarship.org/jcms/c_6162/repositories

But not until the problem of getting the content into OA IRs is 
solved. And the solution is institutional and funder mandates -- 
for *institutional* (not institution-external) deposit.

> To the extent that an arXiv or the inter-institutional 
> repository for humanities research which will be showing up in 
> 3-7 years moves toward offering these services, they are 
> clearly preferable to old fashioned subscription models (since 
> the financial support is for actual services) and current local 
> repositories which do not offer everything needed in the value 
> chain (as listed in Van de Sompel et al. 2004).

(1) The reason 99% of IRs offer no value is that 99% of IRs 
are at least 85% empty. Only the 1% that are mandated are 
providing the full institutional OA content -- funded and 
unfunded, across all disciplines -- that all this depends on.

(2) The central collections, as noted, are indispensable for the 
services they provide, but that does *not* include locus of 
deposit and hosting: There, central deposit is counterproductive, 
a disservice.

(3) With local hosting of all their research output, plus central 
harvesting services, institutions can get all they need by way of 
search and metrics, partly through local statistics, partly from 
central ones.

>  I remember when I first read an article quoting a researcher 
> in an arXiv covered field who essentially said that journals in 
> his field were just for vanity and advancement, since all the 
> "action" was in arXiv (Ober et al. 2007 quoting Manuel 2001 
> quoting McGinty 1999) -- now think about the value of a 
> repository that doesn't just store content and offer access.

This familiar slogan, often voiced by longstanding arxiv users, 
that "Journals are obsolete: They're only for tenure committees. 
We [researchers] only use the arxiv" is as false, empirically, as 
it is incoherent, logically: It is just another instance of the 
"Simon Says" phenomenon: (Pay attention to what Simon actually 
*does*, not to what he says.) http://bit.ly/cYwed6

Although it is perfectly true that most arxiv users don't bother 
to consult journals any more, using the OA version in arxiv only, 
and referring to the journal's canonical version-of-record only 
in citing, it is equally, and far more relevantly, true that they 
all continue to submit all those papers to peer-reviewed 
journals, and to revise them according to the feedback from the 
referees, until they are accepted and published.

That is precisely the same thing that all other researchers are 
doing, including the vast majority that do not self-archive their 
peer-reviewed postprints (or, even more rarely, their unrefereed 
preprints) at all.

So journals are not just for vanity and advancement; they are for 
peer review. And arxiv users are just as dependent on that as all 
other researchers. (No one has ever done the experiment of trying 
to base all research usage on nothing but unrefereed preprints 
and spontaneous user feedback.)

So the only thing that is true in what "Simon says" is that when 
all papers are available OA as peer-reviewed final drafts (and 
sometimes also supplemented earlier by the prerefereeing drafts) 
there is no longer any need for users or authors to consult the 
journal's proprietary version of record. (They can just cite it, 
sight unseen.)

But what follows from that is that journals will eventually have 
to scale down to becoming just peer-review service-providers and 
certifiers (rather than continuing also to be access-providers or 
document producers, either on-paper or online).

Nothing follows from that about the value of repositories, except 
that they are useless if they do not contain the target content 
(at least after peer review, and where possible and desired by 
authors, also before peer review).

Harnad, S. (1998/2000/2004) The invisible hand of peer review. 
Nature [online] (5 Nov. 1998), Exploit Interactive 5 (2000); and 
in Shatz, D. (2004) (ed.) Peer Review: A Critical Inquiry. 
Rowman & Littlefield. Pp. 235-242. http://cogprints.org/1646/

> Do I think the financial backing will remain in place?  It 
> depends on the services actually offered and to what extent 
> subject repositories could replace a patchwork system of single 
> titles offered by a patchwork of publishers.

At the moment the issue is whether arxiv, such as it is (a 
central locus for institution-external *deposit* of institutional 
research content in some fields, mostly physics, plus a search 
and alerting service), can be sustained by voluntary 
sub-sidy/scription -- not whether, if arxiv also somehow "took 
over" the function of journals (peer review), that *too* could be 
paid for by voluntary sub-sidy/scription...

> Universities could save a great deal by refusing to pay the 
> same overhead over and over again to maintain complete 
> collections in single subject areas (not to mention paying for 
> other people's profits).

I can't quite follow this: You mean universities cancel journal 
subscriptions? How do those universities' users then get access 
to those cancelled journals' contents, unless they are all being 
systematically made OA? Apart from those areas of physics where 
it has already been happening since 1991, that isn't going to 
happen in most other fields till OA is mandated by the universal 
providers of that content, the universities (reinforced by 
mandates from their funders).

Then (but only then) can universities cancel their journal 
subscriptions and use (part of) their windfall saving to pay 
(journals!) for the peer-review of their own research output, 
article by article (instead of buying in other universities' 
output, journal by journal): 
http://www.nature.com/nature/debates/e-access/Articles/harnad.html#B1

> More importantly, more could be done to make articles useful 
> and discoverable in a collaborative environment, from metadata 
> to preservation, so that the value chain is extended and 
> improved (my sci-fi includes semantic docs, not just cataloged 
> texts, and improved, or multi-stage, peer review, or peer 
> review on top of a working papers repository).

All fine, and desirable -- but not until all the OA content is 
being provided, and (outside of physics), it isn't being provided 
-- except when mandated...

So let's not build castles in Spain before we have their contents 
safely in hand.

> I think there's been plenty of 'chatter' to indicate that the 
> basic assumptions in conversations between universities are 
> changing (see recent conference agendas), so that we can expect 
> to see more and more practical plans to collaborate on 
> metadata, preservation, and, yes, publications.

I'll believe the "chatter" when it has been cashed into action 
(deposit mandates). Till then it's just distraction and 
time-wasting.

> My head spins to think of the amount of money to be saved on 
> the development of more shared platforms, although, the money 
> will only be saved if other expenditures are slowly turned off.

All this talk about money, while the target content -- which 
could be provided at no cost -- is still not being provided (or 
mandated)...

> Sandy mentioned in another post that she [he] would hope for 
> arXiv like support for university monographs...

Monographs (not even a clearcut case, like peer-reviewed 
articles, which are all, already, author give-aways, written only 
for usage and impact) are moot, while not even peer-reviewed 
articles are being deposited, or mandated...

> Open access and NFP publications which do offer the full value 
> chain have been proven to have much lower production costs per 
> page than FP publishers and they do not suffer any impact 
> disadvantages -- and these are still operated on a largely 
> stand-alone basis, without the advantages that can be gained by 
> sharing overhead.

Cash castles in Spain again, while the free content is not yet 
being provided or mandated...

> Maybe local repositories really are the way to go, since then 
> each institution has more control over its own contribution, 
> but the collaboration and the support will still need to occur 
> to support discovery (implying metadata, both in production and 
> development of standards and tools) and preservation.

No, search and preservation are not the problem: content is.

> I suppose another problem with local repositories, however, is 
> that a consensus is far less likely to unite around local 
> repositories as a practical option at this juncture -- the case 
> can't just be made with words, you need the numbers and arXiv 
> has them -- and while I am interested to see strong local 
> repositories emerge, there is greater sense in supporting what 
> can be achieved, since we need more steps in the right 
> direction.

"The numbers" say the following:

Physicists have been depositing their preprints and postprints 
spontaneously (unmandated) in arxiv since 1991, but in the 
ensuing 20 years this commendable practice has not been taken up 
by other disciplines. The numbers, in other words, are static, 
and stagnant. The only cases in which they have grown are those 
where deposit was mandated (by institutions and funders).

And for that, it no longer makes sense (indeed it goes contrary 
to sense) to deposit them institution-externally, instead of 
mandating institutional deposit and then harvesting centrally.

And the virtue of that is that it distributes the costs of 
managing deposits sustainably, by offloading them onto each 
institution, for its own output, instead of depending on 
voluntary institutional sub-sidy/scription for obsolete and 
unnecessary central deposit.

Stevan Harnad

PS (See also the "denominator fallacy" http://bit.ly/brhkMD , 
which arises when you compare the size of central 
repositories with the size of institutional repositories: The 
world's 25,000 peer-reviewed journals publish about 2.5 million 
articles annually, across all fields. A repository's success rate 
is the proportion of its annual target contents that are being 
deposited annually. For an institution, the denominator is its 
own total annual peer-reviewed journal article output across all 
fields. For a central repository, it is the total annual article 
output -- in the field(s) it covers -- from all the institutions 
in the world. Of course the central repository's numerator is 
greater than any single institutional repository's numerator. But 
its denominator is far greater still. Arxiv has famously been 
doing extremely well for certain areas of physics, unmandated, 
for two decades. But in other areas arxiv is not doing so 
well, relative to the field's true denominator; and most other 
central repositories are likewise not doing well. In fact, it is 
pretty certain that -- apart from physics, with its 2-decade 
tradition of deposit, plus a few other fields such as economics 
(preprints) and computer science -- unmandated central repositories 
are doing exactly as badly as unmandated institutional repositories 
are doing, namely about 15%.)
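
The denominator point can be made concrete with a few lines of arithmetic. This is only a sketch: the function name `deposit_rate` and all four figures below are hypothetical, picked to illustrate the comparison, not measurements of any actual repository.

```python
def deposit_rate(deposited, target_output):
    """Proportion of a repository's annual target content actually deposited."""
    if target_output <= 0:
        raise ValueError("target_output must be positive")
    return deposited / target_output


# Hypothetical figures for illustration only; they are not measurements.
# One institution: its own annual article output across all fields.
institutional = deposit_rate(deposited=300, target_output=2_000)
# One central repository: worldwide annual output in the field(s) it covers.
central = deposit_rate(deposited=15_000, target_output=100_000)

print(f"institutional: {institutional:.0%}")
print(f"central:       {central:.0%}")
```

With these assumed figures the central repository holds fifty times as many papers as the institutional one, yet both are capturing the same share of their respective target content.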