Difference between revisions of "Robots"

From EPrints Documentation
Revision as of 15:49, 9 August 2016

Web crawling robots are a fact of life. There are many out there on the web, and many of them do a good job of indexing our content.

However, there is also an increasing number of robots that cause problems for repository owners.

These robots cause unnecessary load on the repository servers, as well as skewing the download statistics for the published data.

We at EPrints Services and IRUS have observed a number of harmful robots which can be identified either by their IP address or their user agent.

We are working to produce and maintain a simple list of these, so that they can be more easily filtered or blocked by repository systems administrators.
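As a rough illustration of how such a list could be used, here is a minimal Python sketch that checks an incoming request against a blocklist of user-agent substrings and IP ranges. The entries shown (and the function name <code>is_bad_robot</code>) are hypothetical examples, not entries from the actual list, and the real list's file format may differ.

```python
import ipaddress

# Hypothetical blocklist entries for illustration only; the maintained
# list is distributed separately (see the link below this section).
BAD_USER_AGENTS = ["badbot", "evilcrawler"]            # case-insensitive substrings
BAD_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # example documentation range

def is_bad_robot(ip, user_agent):
    """Return True if a request matches a known bad robot by user agent or IP."""
    # Match by user-agent substring first, since it needs no parsing.
    if any(pattern in user_agent.lower() for pattern in BAD_USER_AGENTS):
        return True
    # Otherwise check whether the client IP falls in a blocked network.
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BAD_NETWORKS)
```

A repository could apply such a check either at the web-server level (rejecting the request) or when generating download statistics (excluding the hit from counts).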

The first version of this list can be found below.

[[Media:bad_robots.txt]]