Re: McCabe and Snyder respond to criticism on their paper
- To: liblicense-l@lists.yale.edu
- Subject: Re: McCabe and Snyder respond to criticism on their paper
- From: Stevan Harnad <harnad@ecs.soton.ac.uk>
- Date: Sat, 12 Feb 2011 11:40:29 EST
- Reply-to: liblicense-l@lists.yale.edu
- Sender: owner-liblicense-l@lists.yale.edu
On 2011-02-10, at 5:26 PM, Philip Davis wrote:

> "After reading the original post regarding our paper, and the
> subsequent comments, I thought it would be appropriate to address
> the issue that is generating some heat here, namely whether our
> results can be extrapolated to the OA environment..."
>
> read full response here:
> http://j.mp/hGSY6Z

MCCABE: . . . I thought it would be appropriate to address the issue that is generating some heat here, namely whether our results can be extrapolated to the OA environment . . .

(1) Selection bias and other empirical modeling errors are likely to have generated overinflated estimates of the benefits of online access (whether free or paid) on journal article citations in most if not all of the recent literature.

If "selection bias" refers to author bias toward selectively making their better (hence more citeable) articles OA, then this was controlled for in the comparison of self-selected vs. mandated OA by Gargouri et al (uncited in the M & S article, but known to the authors -- indeed, the first author requested, and received, the entire dataset for further analysis: we are all eager to hear the results).

If "selection bias" refers to the selection of the journals for analysis, I cannot speak for studies that compare OA journals with non-OA journals, since we only compare OA articles with non-OA articles within the same journal. And it is only a few studies, like Evans and Reimer's, that compare citation rates for journals before and after they are made accessible online (or, in some cases, freely accessible online). Our principal interest is in the effects of immediate OA rather than delayed or embargoed OA (although the latter may be of interest to the publishing community).

MCCABE: (2) There are at least two "flavors" found in this literature: (a) papers that use cross-section type data, i.e., a single observation for each article (see, for example, Lawrence (2001), Harnad and Brody (2004), Gargouri et al. (2010)), and (b) papers that use panel data, i.e., multiple observations over time for each article (e.g., Evans (2008), Evans and Reimer (2009)).

We cannot detect any mention or analysis of the Gargouri et al. paper in the M & S paper.

MCCABE: (3) In our paper we reproduce the results for both of these approaches and then, using panel data and a robust econometric specification (that accounts for selection bias, important secular trends in the data, etc.), we show that these results vanish.

We do not see our results cited or reproduced. Does "reproduced" mean "simulated according to an econometric model"? If so, that is regrettably too far from actual empirical findings to be anything but speculation about what would be found if one were actually to do the empirical studies.

MCCABE: (4) Yes, we only test online versus print, and not OA versus online, for example, but the empirical flaws in the online-versus-print and the OA-versus-online literatures are fundamentally the same: the failure to properly account for selection bias. So, using the same technique in both cases should produce similar results.

Unfortunately this is not very convincing. Flaws there may well be in the methodology of studies comparing citation counts before and after the year in which a journal goes online. But these are not the flaws of studies comparing citation counts of articles that are and are not made OA within the same journal and year.
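To make that within-journal design concrete, here is a minimal sketch in Python. Everything in it is illustrative: the file and column names are hypothetical, not those of the Gargouri et al dataset or of anyone else's.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical input: one row per article, with columns
    # citations (count), is_oa (0/1), journal, pub_year.
    df = pd.read_csv("articles.csv")

    # A journal-by-year dummy absorbs everything shared by all articles
    # in the same journal and year (prestige, field, article age), so the
    # is_oa coefficient is the within-journal, within-year OA difference
    # in log citations.
    df["cell"] = df["journal"].astype(str) + "_" + df["pub_year"].astype(str)
    df["log_cites"] = np.log1p(df["citations"])

    fit = smf.ols("log_cites ~ is_oa + C(cell)", data=df).fit()
    print(fit.params["is_oa"])  # positive = OA citation advantage

The journal-by-year dummies do the work: whatever OA difference survives them cannot be an artifact of journal prestige or of when the journal went online.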
Nor is the vague attribution of "failure to properly account for selection bias" very convincing, particularly when the most recent study controlling for selection bias (by comparing self-selected OA with mandated OA) has not even been taken into consideration.

Conceptually, the question of whether online access increases citations over offline access is entirely different from the question of whether OA increases citations over non-OA, because (as the authors note) the online/offline effect concerns *ease* of access: institutional users have either offline access or online access, and, according to M & S's results in economics, the increased ease of accessing articles online does not increase citations. This could be true (although the growth across those same years of the tendency in economics to make prepublication preprints OA through author self-archiving [harvested by RepEc], much as the physicists had started doing a decade earlier in Arxiv, and computer scientists even earlier [later harvested by Citeseerx], could be producing a huge background effect not taken into account at all in M & S's painstaking temporal analysis).

But any way one looks at it, comparing easy vs. hard access (online vs. offline) is hardly the same thing as comparing access vs. no access -- which is what we are comparing when we compare OA vs. non-OA for all those potential users at institutions that cannot afford subscriptions (whether offline or online) to the journal in which an article appears. The barrier, in other words (though one should hardly have to point this out to economists), is not an ease barrier but a price barrier: non-OA articles are not just harder to access for users at nonsubscribing institutions: they are *impossible* to access unless a price is paid.

(I certainly hope M & S will not reply with "let them use interlibrary loan (ILL)"! A study analogous to M & S's online/offline study, comparing citations for offline vs. online vs. ILL access in the click-through age, would not only strain belief if it too found no difference, but it would also fail to address OA, since OA is about access once one has reached the limits of one's institution's subscription/license/pay-per-view budget. Hence it would again miss all the citations that an article would have gained had it been accessible to all its potential users, and not just to those whose institutions could afford access, by whatever means.)

It is ironic that M & S draw their conclusions about OA (predictably, as their interest is in modelling publication economics) in terms of the author's cost/benefit of paying to publish in an OA journal, concluding that since they have shown it will not generate more citations, it is not worth the money. But the most compelling findings on the OA citation advantage come from OA author self-archiving (of articles published in non-OA journals), not from OA journal publishing. Those are the studies that show the OA citation advantage, and the advantage does not cost the author a penny! And the extra citations are almost certainly coming from users for whom access to the article would otherwise have been financially prohibitive. (Perhaps it's time for econometric modeling from the user's point of view too.)
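Even a toy user's-eye calculation (mine, with invented numbers, not anything from M & S) makes the asymmetry plain:

    # Toy caricature: expected citations = (pool of potential citers
    # with any access at all) x (their citing rate). All numbers invented.
    def expected_citations(pool_with_access, citing_rate):
        return pool_with_access * citing_rate

    subscribers = 400      # hypothetical users at subscribing institutions
    nonsubscribers = 600   # hypothetical users priced out entirely

    print(expected_citations(subscribers, 0.010))                   # print only: 4.0
    print(expected_citations(subscribers, 0.012))                   # online: same pool, easier access: 4.8
    print(expected_citations(subscribers + nonsubscribers, 0.012))  # OA: the pool itself grows: 12.0

Going online changes the citing rate for the same pool of users; OA changes the pool.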
I recommend that M & S look at the studies of Michael Kurtz in astrophysics. Those too were sophisticated long-term studies of the effect of the wholesale switch from offline to online, and Kurtz found that total citations were in fact slightly reduced overall! But astrophysics, too, is a field in which OA self-archiving is widespread; hence whether and when journals go online is moot insofar as citations are concerned. (The likely hypothesis for the reduced citations -- compatible also with our own findings in Gargouri et al -- is that OA levels the playing field for users: OA articles are accessible to everyone, not just to those whose institutions can afford toll access. As a result, users can *self-selectively* decide to cite only the best and most relevant articles of all, rather than having to make do with a selection among only the ones to which their institutions can afford toll access. A corollary of this [though probably also a spinoff of the Seglen/Pareto effect] is that the biggest beneficiaries of the OA citation advantage will be the best articles.)

MCCABE: (5) At least in the case of economics and business titles, it is not even possible to properly test for an independent OA effect by specifically looking at OA journals in these fields, since there are almost no titles that *switched* from print/online to OA (I can think of only one such title in our sample that actually permitted backfiles to be placed in an OA repository). Indeed, almost all of the OA titles in econ/business have always been OA, and so no statistically meaningful before-and-after comparisons can be performed.

The multiple conflation here is so flagrant that it is almost laughable: online does not equal OA, and OA does not equal OA journal. First, the method of comparing the effect on citations before vs. after the offline/online *switch* will have to make do with its limitations. (We don't think it's of much use for studying OA effects at all.) The method of comparing the effect on citations of OA vs. non-OA within the same (economics/business, toll-access) journals can certainly proceed apace in those disciplines; the studies have been done, and the results are much the same as in other disciplines. M & S have our latest dataset: perhaps they would care to test whether the economics/business subset of it is an exception to our finding that (a) there is a significant OA advantage in all disciplines, and (b) it is just as big when the OA is mandated as when it is self-selected.

MCCABE: (6) One alternative, in the case of cross-section type data, is to construct field experiments in which articles are randomly assigned OA status (e.g., Davis (2008) employs this approach and reports no OA benefit).

And another one -- based on an incomparably larger N, across far more fields -- is the Gargouri et al study that M & S fail to mention in their article, and for which they have the full dataset in hand, as requested.

MCCABE: (7) Another option is to examine articles before and after they were placed in OA repositories, so that the likely selection bias effects, important secular trends, etc. can be accounted for (or, in economics jargon, "differenced out"). Evans and Reimer attempt to do this in their 2009 paper but only meet part of the econometric challenge.
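(For the non-economists on the list: "differenced out" refers to a difference-in-differences design, roughly of the following shape. The file and variable names below are illustrative only, not Evans and Reimer's actual specification.)

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical panel: one row per article per year, with columns
    # cites (citations received that year), deposited (0/1: has the
    # article been placed in an OA repository by this year?),
    # article_id, year.
    panel = pd.read_csv("article_years.csv")

    # Article fixed effects difference out each article's own
    # (self-selected) quality; year fixed effects difference out
    # secular trends. What remains in the deposited coefficient is
    # the before/after OA effect.
    fit = smf.ols("cites ~ deposited + C(article_id) + C(year)",
                  data=panel).fit()
    print(fit.params["deposited"])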
M & S are rather too wedded to their before/after method and thinking! The sensible time for authors to self-archive their papers is immediately upon acceptance for publication -- before the published version has even appeared. Otherwise one is not studying OA but OA embargo effects.

(But let me agree on one point: unlike journal publication dates, OA self-archiving dates are not always known or taken into account, so there may be some drift there, depending on when the author self-archives. The solution is not to study the before/after watershed, but to focus on the articles that are self-archived immediately rather than later.)

Stevan Harnad

Gargouri, Y., Hajjem, C., Larivière, V., Gingras, Y., Brody, T., Carr, L. and Harnad, S. (2010) Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLoS ONE. http://eprints.ecs.soton.ac.uk/18493/