
Size of repositories (fwd)

---------- Forwarded message ----------
Date: Tue, 20 Mar 2007 15:42:35 +0100
From: Isidro F. Aguillo <isidro@CINDOC.CSIC.ES>
Subject: Size of repositories

While preparing the new edition of the Webometrics Ranking of 
World Universities (http://www.webometrics.info/), we ran an 
experiment to calculate the number of documents a web repository 
needs in order to reach a given level. We defined Premier League 
as the universities in the Top 200, World Class as those in the 
Top 500, and Regional Class as those in the Top 1000.

We collected data for rich file formats, including Adobe Acrobat 
(pdf), MS Word (doc), MS PowerPoint (ppt) and PostScript (ps) 
files, and for the Google Scholar database.

The thresholds are as follows:

                   PDF     DOC    PPT      PS    SCHOLAR

PREMIER LEAGUE   19000    4000    2000    1000     3300

WORLD CLASS       7000    2000    1000     300     1200

REGIONAL CLASS    3000     500     300      50      400

These figures could be used as a reference in repository 
planning. All the data refer to publicly accessible documents on 
the Web that are indexed by the major search engines.
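
For planning purposes, the thresholds can be treated as a simple 
lookup table. The following is only a minimal sketch, not the 
method used for the ranking: the function names, the sample 
counts and the reading of "reaching a level" as "count greater 
than or equal to the threshold, format by format" are all 
assumptions made for illustration.

```python
# Thresholds copied from the table above (documents per format).
# Tiers are listed from highest to lowest.
THRESHOLDS = {
    "Premier League": {"pdf": 19000, "doc": 4000, "ppt": 2000, "ps": 1000, "scholar": 3300},
    "World Class":    {"pdf": 7000,  "doc": 2000, "ppt": 1000, "ps": 300,  "scholar": 1200},
    "Regional Class": {"pdf": 3000,  "doc": 500,  "ppt": 300,  "ps": 50,   "scholar": 400},
}


def tier_per_format(counts):
    """For each format, return the highest tier whose threshold is met.

    Assumption: a threshold is "met" when count >= threshold.
    Formats below the Regional Class threshold map to None.
    """
    result = {}
    for fmt, n in counts.items():
        result[fmt] = None
        for tier, limits in THRESHOLDS.items():  # highest tier first
            if n >= limits[fmt]:
                result[fmt] = tier
                break
    return result


def shortfall(counts, tier):
    """Return how many more documents per format are needed for a tier."""
    return {
        fmt: max(0, limit - counts.get(fmt, 0))
        for fmt, limit in THRESHOLDS[tier].items()
    }


# Hypothetical repository counts, invented for this example only.
sample = {"pdf": 8000, "doc": 600, "ppt": 1000, "ps": 20, "scholar": 3300}

print(tier_per_format(sample))
print(shortfall(sample, "World Class"))
```

With the hypothetical counts above, the repository would sit at 
World Class for pdf and ppt, Regional Class for doc and Premier 
League for Scholar, and would still need 1400 doc and 280 ps 
files to meet every World Class threshold.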

Isidro F. Aguillo
Ph:(+34) 91-5635482 ext. 313
Cybermetrics Lab
Joaquin Costa, 22
28002 Madrid. SPAIN