
Re: RECENT MANUAL MEASUREMENTS OF OA AND OAA



David,

Your work on the validity of using automated robots to detect OA articles is very important if we are to better understand the effect of Open Access on article impact. Many of us appreciate your work on this topic.

What troubles me is that the term "Open Access Advantage" is both loaded
and vague. There are many different types of open access (full, partial,
delayed, open to developing countries, publisher-archived, self-archived,
institutionally archived, etc.), so talking about a single Open Access
Advantage credits many different publishing and access models at once.

Because studies of this sort are not controlled experiments, the best a
researcher can do is propose likely causal explanations and hope to rule
others out. In the case of higher citations for articles found as OA,
article quality and early notification are possible explanations, as are
editor selection bias, self-selection bias, and the Matthew Effect. The
hundreds of Emerald articles republished in two or more journals
demonstrate that simple article duplication can also explain increased
impact. All of these may be partial causes of higher impact (the "Open
Access Advantage"), yet none of them is limited to Open Access.

As a consequence, I am worried about people making unqualified statements like "OA can increase the impact of an article by 50-250%!" The answer may be more like, "author republishing (online and in print) may increase citation impact in some fields, especially among highly prestigious journals and authors". Although this is not as simple as declaring that Open Access increases citation impact, it may be much more precise.

--Phil Davis


At 05:34 PM 1/11/2006, you wrote:
Within the last few months, Stevan Harnad and his group, and we in our
group, have together carried out several manual measurements of OA (and
sometimes OAA, the Open Access Advantage). The intent has been to
independently evaluate the accuracy of Chawki Hajjem's robot program, which
has been widely used by Harnad's group to carry out similar measurements by
computer.

The results from these measurements were first reported in a joint posting
on Amsci,* referring for specifics to a simultaneously posted detailed
technical report,** in which the results of each of several manual
analyses were separately reported.

From these data, both groups agreed that "In conclusion, the robot is not
yet performing at a desirable level and future work may be needed to
determine the causes, and improve the algorithm."

Our group has now prepared an overall meta-analysis of the manual results
from both groups.*** We are able to combine the results because we were all
careful to examine the same sample base, using identical protocols for both
the counting and the analysis. Upon testing, we found a within-group
inter-rater agreement of 93% and a between-groups agreement of 92%.
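
For clarity, percent agreement here means the fraction of articles on which
two raters assigned the same OA/non-OA label. A minimal sketch in Python of
that calculation (the labels below are hypothetical, for illustration only;
this is not the scoring code used for the report):

    def percent_agreement(labels_a, labels_b):
        # Fraction of items on which the two raters assign the same label.
        matches = sum(a == b for a, b in zip(labels_a, labels_b))
        return 100.0 * matches / len(labels_a)

    # Hypothetical labels; the actual data are in the technical report.**
    rater_1 = ["OA", "non-OA", "OA",     "OA"]
    rater_2 = ["OA", "non-OA", "non-OA", "OA"]
    print(f"{percent_agreement(rater_1, rater_2):.0f}% agreement")  # 75%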

Between us, we analyzed a combined sample of 1198 articles in biology and
sociology, 599 of which the robot had identified as OA and 599 of which it
had reported as non-OA.

Of the 599 robot-identified OA articles, only 224 actually were OA (37%).

Of the 599 robot-identified non-OA articles, 533 were truly non-OA (89%).

The discriminability index, a commonly used figure of merit, was only 0.97.
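
Readers who wish to check the arithmetic can reproduce the 0.97 from the
counts above. A minimal sketch in Python (an illustration only, assuming
the standard signal-detection definition d' = z(hit rate) - z(false-alarm
rate); it is not the code used for the analysis):

    # Reconstruct the 2x2 confusion matrix from the counts reported above,
    # then compute the discriminability index
    # d' = z(hit rate) - z(false-alarm rate).
    from statistics import NormalDist

    true_pos  = 224        # robot said OA, truly OA
    false_pos = 599 - 224  # robot said OA, truly non-OA (375)
    true_neg  = 533        # robot said non-OA, truly non-OA
    false_neg = 599 - 533  # robot said non-OA, truly OA (66)

    hit_rate = true_pos / (true_pos + false_neg)   # 224/290, about 0.77
    fa_rate  = false_pos / (false_pos + true_neg)  # 375/908, about 0.41

    z = NormalDist().inv_cdf
    print(f"d' = {z(hit_rate) - z(fa_rate):.2f}")  # d' = 0.97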

(We wish to emphasize that our group's results find true OAA in biology at
a substantial level, and we all consider OAA one of the many reasons that
authors should publish OA.)

In the many separate postings and papers from the SH group done without our
group's involvement, such as **** and *****, the authors refer only to the
SH portion of the small manual inter-rater reliability test. Because that
was a small and nonrandom sample, it yields an anomalous discriminability
index of 2.45, unlike the values found for the larger individual tests or
for the combined sample. They then use that partial result by itself to
prove the robot's accuracy.

None of the SH group's postings or publications refer to the joint report
from the two groups, of which they could not have been ignorant, as the
report was concurrently being evaluated and reviewed by SH.

Considering that the joint ECS technical report** and the separate SH group
report***** were both posted on Dec. 16, 2005, we have here perhaps the
first known instance of an author posting findings on the same subject, on
the same day, as adjacent postings on the same list, but with opposite
conclusions.

In view of these joint results, there is good reason to consider all
current and earlier automated results obtained using the CH algorithm to be
of doubtful validity. The reader may judge: merely examine the graphs in
the original joint Technical Report.** They speak for themselves.

Dr. David Goodman
Palmer School of Library and Information Science
Long Island University     <dgoodman@liu.edu>