Re: FW: Crawling publishers' sites
- To: liblicense-l@lists.yale.edu
- Subject: Re: FW: Crawling publishers' sites
- From: Daniel Feenberg <feenberg@nber.org>
- Date: Sun, 18 Apr 1999 18:58:58 EDT
- Reply-To: liblicense-l@lists.yale.edu
- Sender: owner-liblicense-l@lists.yale.edu
Scott Mellon <scott.mellon@nrc.ca> writes:

> or ethics involved in sending a web crawler to visit and
> index the sites of publishers for whom we have site licences...

I think the key is to respect the 'robots.txt' file. Longstanding custom requires that sites wishing not to be crawled say so in the 'robots.txt' file on the site. See "A Standard for Robot Exclusion" at http://info.webcrawler.com/mak/projects/robots/robots.html and no doubt many other places. A publisher has no valid cause for complaint if you follow those guidelines; they are well established.

Daniel Feenberg
National Bureau of Economic Research
feenberg@nber.org
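P.S. For anyone who wants to check this mechanically rather than by eye, here is a minimal sketch using Python's standard urllib.robotparser module. The publisher host and crawler name are made up for illustration; substitute your own.

    # A typical robots.txt, placed at the root of a site, might read:
    #
    #     User-agent: *
    #     Disallow: /cgi-bin/
    #
    # meaning: all robots, please stay out of /cgi-bin/.

    from urllib.robotparser import RobotFileParser

    # Hypothetical publisher host and crawler name, for illustration only.
    SITE = "http://www.example-publisher.com"
    AGENT = "LibraryCrawler/1.0"

    rp = RobotFileParser()
    rp.set_url(SITE + "/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt

    # Ask whether our crawler is allowed to fetch a given page.
    if rp.can_fetch(AGENT, SITE + "/journals/index.html"):
        print("robots.txt permits crawling this page")
    else:
        print("robots.txt asks robots to keep out; skip it")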