Re: FW: Crawling publishers' sites
- To: liblicense-l@lists.yale.edu
- Subject: Re: FW: Crawling publishers' sites
- From: Daniel Feenberg <feenberg@nber.org>
- Date: Sun, 18 Apr 1999 18:58:58 EDT
- Reply-To: liblicense-l@lists.yale.edu
- Sender: owner-liblicense-l@lists.yale.edu
Scott Mellon <scott.mellon@nrc.ca> writes:

> or ethics involved in sending a web crawler to visit and
> index the sites of publishers for whom we have site licences...

I think the key is to respect the 'robots.txt' file. Longstanding custom requires that sites wishing not to be crawled say so in the 'robots.txt' file on the site. See "A Standard for Robot Exclusion" at http://info.webcrawler.com/mak/projects/robots/robots.html and no doubt many other places. A publisher has no valid cause for complaint if you follow those guidelines; they are well established.

Daniel Feenberg
National Bureau of Economic Research
feenberg@nber.org
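P.S. For anyone who wants to check this mechanically rather than by eye, here is a minimal sketch using Python's standard urllib.robotparser module. The publisher host and crawler name are made up for illustration; substitute your own.

    # A typical robots.txt, placed at the root of a site, might read:
    #
    #     User-agent: *
    #     Disallow: /cgi-bin/
    #
    # meaning: all robots, please stay out of /cgi-bin/.

    from urllib.robotparser import RobotFileParser

    # Hypothetical publisher host and crawler name, for illustration only.
    SITE = "http://www.example-publisher.com"
    AGENT = "LibraryCrawler/1.0"

    rp = RobotFileParser()
    rp.set_url(SITE + "/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt

    # Ask whether our crawler is allowed to fetch a given page.
    if rp.can_fetch(AGENT, SITE + "/journals/index.html"):
        print("robots.txt permits crawling this page")
    else:
        print("robots.txt asks robots to keep out; skip it")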