Re: RECENT MANUAL MEASUREMENTS OF OA AND OAA



On Mon, 23 Jan 2006, Phil Davis wrote:

> It would be much more constructive if Stevan spent time trying 
> to find problems in their methodology and analysis...

As I said, the discrepancies between our test of the robot's 
accuracy and Goodman et al's prompted us to try to find the basis 
for the discrepancy, and we think we have found it:

The robot sends ISI reference queries to several search engines 
and then tests up to the first 60 hits to see if any of them is 
OA, stopping and returning "OA" as soon as its algorithm judges 
that a hit is OA, and returning "NOA" if none of the (up to) 60 
hits is OA.
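
The stopping rule just described can be sketched in Python. This is 
only an illustration: the names classify_reference, is_oa and 
MAX_HITS are hypothetical, hits are represented as plain strings, 
and the robot's actual OA test and search-engine calls are not 
shown here.

```python
from typing import Callable, Iterable, List, Tuple

MAX_HITS = 60  # the robot tests up to the first 60 hits


def classify_reference(
    hits: Iterable[str],
    is_oa: Callable[[str], bool],  # hypothetical stand-in for the robot's OA test
    max_hits: int = MAX_HITS,
) -> Tuple[str, List[str]]:
    """Return ("OA" or "NOA", the hits actually examined)."""
    examined: List[str] = []
    for i, hit in enumerate(hits):
        if i >= max_hits:
            break
        examined.append(hit)
        # Stop and return "OA" as soon as one hit is judged OA.
        if is_oa(hit):
            return "OA", examined
    # None of the (up to) max_hits hits was judged OA.
    return "NOA", examined
```

Returning the examined hits alongside the verdict is a choice made 
for this sketch: it keeps the exact sample the verdict was based on.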

The right way to check the robot's accuracy is to save all the 
hits, and then hand-check those saved hits for a subset of 
references that the robot judged "OA" and a subset that it 
judged "NOA". What 
we instead did in our own small test sample was to do a search by 
hand for a subsample of 100 references that the robot had judged 
to be OA and 100 references it had judged NOA (in Biology). 
Goodman et al. did the same for a sample about three times as big 
in Biology, as well as in Sociology.

All three tests found very different accuracies. The reason now 
seems clear: When one hand-checks the accuracy of a device, this 
has to be on the *device*'s sample, not a different sample. All 
of us had used a different sample (and even different search 
engines). The right test of the robot's accuracy requires 
hand-checking the (up to) 60 hits that the robot actually sampled 
and processed and judged OA or NOA. We are now re-doing both the 
searches and the tests, saving the hits for doing this 
hand-checking.
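
Once the saved hits have been hand-checked, the accuracy figure for 
each verdict could be computed along the following lines. This is a 
sketch under assumptions: the function name robot_accuracy and the 
(robot_label, hand_label) record layout are hypothetical, and the 
hand label is taken to come from checking the robot's own saved 
hits for that reference.

```python
from typing import Dict, List, Tuple


def robot_accuracy(records: List[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of robot verdicts confirmed by hand, per verdict.

    Each record is (robot_label, hand_label), both "OA" or "NOA".
    """
    result: Dict[str, float] = {}
    for label in ("OA", "NOA"):
        # Hand labels for the references the robot gave this verdict.
        judged = [hand for robot, hand in records if robot == label]
        if judged:  # skip a verdict with no hand-checked references
            result[label] = sum(hand == label for hand in judged) / len(judged)
    return result
```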

In other words, all three tests were biased against the robot -- 
being based on different samples, from different sources, united 
only by whether or not the robot had judged the reference item to 
have an OA version somewhere among the (up to) 60 hits in the 
*first* sample. We had not noticed the bias earlier, because our 
test had yielded such a strong accuracy despite the (unnoticed) 
bias.

As I said before, I am glad Goodman et al. did the further test, 
whose much weaker result alerted us to the fact that something 
was amiss. We think we have found what was amiss, and it was not 
in the robot's accuracy but in our test of the robot's accuracy.

Stay tuned for the results for both Biology and Sociology, which 
are being completely re-done by the robot, but this time saving 
all the hits; the robot accuracy test will be available soon for 
a still larger subsample of these same data. We are also saving 
all the hits (for all of Biology and Sociology, not just this 
larger sample), so anyone else can hand-check them if they wish.

Stevan Harnad

> At 08:41 PM 1/22/2006, you wrote:
>>Before anyone gets too excited about the tiny Goodman et al. test
>>result, may I suggest waiting a couple of weeks, when we will be
>>reporting the results of a far bigger and more accurate test of
>>the robot's accuracy?
>>
>>Those who (for some reason) were hoping that the robot would
>>prove too inaccurate and that the findings on the OA advantage
>>would prove invalid may be disappointed with the outcome. I can
>>already say that overinterpretations of the tiny Goodman et al.
>>test as showing that the OA/OAA findings to date are "worthless"
>>are rather overstated even on the meagre evidence to date,
>>especially since two thirds of the published findings on the OA
>>citation advantage are not even robot-based!
>>
>>(This shrillness also seems to me to be trying to make rather
>>much out of having actually done rather little!)
>>
>>As to the separate issue of how to treat the OA journal article
>>counts (as opposed to the counts for the self-archived non-OA
>>journal articles): We count it all, of course, but only use the
>>non-OA journal article counts in calculating the OA advantage,
>>because those are (necessarily) within-journal ratios, and
>>citation ratios of zero and infinity are meaningless. Think about
>>it.
>
> [SNIP]
>
>>Stevan Harnad