
Re: Does the arXiv lead to higher citations and reduced publisher downloads?

I am unsure what type of "wishful thinking" I am alleged to have 
engaged in. Our journals Diabetes Care and Diabetes are freely 
available after 3 months, and papers accepted in the journals may 
be posted on acceptance in any institutional repository--making 
them, at least by Stevan's own criteria, open access. Thus I have 
no interest in disproving -- and, in fact, an interest in proving 
-- an open access advantage. As I said, I think it exists, but 
doubt that some of the data supporting it are of sufficient rigor 
to measure its magnitude accurately.

For example, Stevan suggests that Antelman's data and that of his 
colleagues "show the same thing." Actually, they don't. The 
Antelman data show an OA advantage that is quite modest, if it 
exists at all; the Harnad data show one that is quite large in 
some disciplines. When I was in graduate school, it was expected 
that one would try to explain a magnitude of order difference 
between one's own data and those of another investigator, not to 
paper over the differences and call it a day simply because they 
trended the same way.

It would help to see not only the relative increase in citation 
through OA, but also the absolute increase. What does a 100% or 
200% increase represent? If it's an increase from 0.1 average 
citations per paper to 0.2 or 0.3, the effect on the 
dissemination of knowledge is much less significant than if one 
is speaking of an increase from 2 citations to 4 or 6. Because 
very few papers have even one citation, I suspect we're talking 
much more about the former case than the latter.
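The distinction between relative and absolute increase can be made concrete with a small sketch. The figures below are purely hypothetical, chosen only to mirror the two cases described above; they are not drawn from any study discussed in this thread:

```python
def citation_increase(baseline, boosted):
    """Return the (relative_percent, absolute) increase in mean
    citations per paper between a baseline and a boosted value."""
    relative_percent = 100.0 * (boosted - baseline) / baseline
    absolute = boosted - baseline
    return relative_percent, absolute

# The same 100% relative increase, very different absolute gains:
low = citation_increase(0.1, 0.2)   # barely-cited papers
high = citation_increase(2.0, 4.0)  # well-cited papers

# Both cases show a ~100% relative increase, but the absolute gain
# in the second case is twenty times larger.
```

The point is that a headline percentage alone cannot distinguish these two situations; only the absolute baseline reveals which one we are in.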

Peter Banks
American Diabetes Association
Email: pbanks@diabetes.org

>>> harnad@ecs.soton.ac.uk 03/22/06 8:14 PM >>>
On Tue, 21 Mar 2006, Peter Banks wrote:

> [Re: Kristin Antelman's findings] I... suspect that there is a 
> small OA citation advantage, I am not convinced by these 
> data... I doubt that most of the results reach statistical 
> significance...

Based on past postings from Peter, I think there may be an 
element of wishful thinking here (ex officio)! Peter, if you are 
not convinced by KA's data alone, look at all the other data that 
shows the same thing. For example, see Figure 4 in:

      Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year
      Cross-Disciplinary Comparison of the Growth of Open Access and How
      it Increases Research Citation Impact. IEEE Data Engineering Bulletin
      28(4) pp. 39-47. http://eprints.ecs.soton.ac.uk/11688/

You will see that the ratio of the proportion of OA articles to 
non-OA articles peaks in the 4-7 citation range, and falls off 
for higher and lower citation (quality) ranges. But it is always 
greater than one (i.e., an OA Advantage) except for articles with 
zero citations (where the ratio reverses); that of course is also 
the largest number of articles.
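The ratio being described can be sketched as follows. The counts below are hypothetical, invented only to illustrate the shape of the curve; the real figures are in Figure 4 of Hajjem, Harnad and Gingras (2005):

```python
def oa_ratio(oa_counts, non_oa_counts):
    """Ratio of the proportion of OA articles to the proportion of
    non-OA articles, per citation bin. A ratio > 1 in a bin means OA
    articles are over-represented there (an OA advantage)."""
    oa_total = sum(oa_counts.values())
    non_oa_total = sum(non_oa_counts.values())
    return {b: (oa_counts[b] / oa_total) / (non_oa_counts[b] / non_oa_total)
            for b in oa_counts}

# Hypothetical article counts by citation bin (not the published data):
oa     = {"0": 280, "1-3": 300, "4-7": 270, "8+": 150}
non_oa = {"0": 420, "1-3": 280, "4-7": 180, "8+": 120}

ratios = oa_ratio(oa, non_oa)
# With these made-up counts, the ratio peaks in the 4-7 bin, falls off
# on either side, and dips below 1 only for zero-citation articles --
# the qualitative pattern described above.
```

Note that the zero-citation bin also holds the largest number of articles in this toy example, matching the observation that the reversal occurs precisely where most papers sit.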

But this effect is again just a correlation, and is just as 
compatible with a Quality self-selection Bias (QB) as with a 
Quality Advantage (QA) (except that it is hard to see why 
self-selection QB should peak at the 4-7 range, whereas it is 
perhaps less difficult to see how a QA advantage could have an 
inverted U-shape, absent for the duds and trivial for the gems -- 
but this awaits more confirmatory data and ways of testing 
causality more directly).

> I also don't understand how these data exclude Phil's 
> hypothesis. Since Kristin seems to define quality in terms of 
> citations, then the logic seems self-referential: how would one 
> detect a difference in citation due to intrinsic quality when 
> one has defined quality as number of citations?

You're quite right, except that that argument cuts in both 
directions: no data to date can decide directly between QA and QB.

Stevan Harnad