IRStats2
What is IRStats2?
IRStats2 is a statistical framework for EPrints. It comes with a set of default tools and reports, and it can be customised, for instance to add new metrics or data sets. It also has a JavaScript API for including stats on any page you want.
IRStats2 is developed against EPrints 3.3 but it was written to also work on EPrints 3.2. Older versions of EPrints are, however, not supported.
What's new in version 1.1?
This new version includes a number of improvements to existing features such as easier deployment, faster database code, tool tips and improved browser detection, as well as a number of smaller tweaks and fixes.
It now also includes filtering to allow the blocking of web crawling robots as standard.
Changes in 1.1, since 1.0.x
Updates:
- Feature: IP based robot filtering and default values
- Feature: Adding an option to only show live items in the stats
- Enhancement: Avoid using experimental Perl code (i.e. the ~~ smartmatch operator)
- Enhancement: restructure to make epm deployment easier
- Enhancement: tooltip help text for KeyFigures
- Enhancement: Optimisation for InnoDB
- Enhancement: CSV, JSON and XML exports are saved as files instead of opening directly in the browser
- Enhancement: Added a check for missing libraries (Date::Calc and Geo::IP) on the Bazaar installation page. Resolves #10*
- Bugfix: Stats::View::google::Graph loses first statistics #69
- Bugfix for Browser identification issue #66. Browser ID should be further improved
Merged pull requests:
- Enhancement: %IGNORE_LIST of words (stopwords) are very few and only in "en"
- Enhancement: Add support for transactions
- Bugfix: The title of Screen::IRStats2::Report should not change according to report you chose
- Bugfix: Avoid XSS vulnerability in some CGI output
Installation
Dependencies
- Geo::IP or Geo::IP::PurePerl
- Date::Calc
Both can usually be installed via your Linux package managers (apt-get, yum, ...) or via CPAN.
EPrints 3.3
IRStats2 can be installed directly via the Bazaar on EPrints 3.3.
EPrints 3.3.11 onwards
Installing IRStats2 from the Bazaar is all you need to do. It is recommended that you restart Apache after doing so.
EPrints 3.3.1 to 3.3.10
You need to install IRStats2 from the Bazaar as above, but you also need to apply a few patches to enable the Google map showing the "Origins of downloads".
The patches relate to an incompatibility between the Prototype JS library (used by EPrints) and Google Charts (used by IRStats2). The two patches you need to apply are:
EPrints 3.2.x
On EPrints 3.2 you will have to manually copy the required files to your EPrints installation path. This is a low-risk operation, since IRStats2 is a true add-on to EPrints and does not interact with the core software. You may want to back up your EPrints files and your database first, but this should not be necessary.
1. Get the files from GitHub, or download the tarball directly: https://github.com/eprints/irstats2/tarball/master (tar.gz)
2. Copy the modules and various configuration files to your local archive (create the bin and cgi directories if they don't exist):
cp -R bin/* /opt/eprints3/archives//bin/
cp -R cfg/* /opt/eprints3/archives//cfg/
cp -R cgi/* /opt/eprints3/archives//cgi/
3. Test everything is OK:
/opt/eprints3/bin/epadmin test
4. Add the following to the <head> section of your template files (usually located in /opt/eprints3/archives//cfg/lang/en/templates/):
<script type="text/javascript" src="http://www.google.com/jsapi">// <!-- No script --></script>
<script type="text/javascript">
google.load("visualization", "1", {packages:["corechart", "geochart"]});
</script>
5. Restart the web server
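The copy in step 2 can be sketched as a small shell function. This is only an illustration under stated assumptions: `install_irstats2_files` is a hypothetical helper, the archive directory is passed in by the caller, and the sketch uses `cp -R` so that subdirectories such as bin/stats/ are copied as well.

```shell
#!/bin/sh
# Hypothetical helper: copy the IRStats2 bin, cfg and cgi trees into an archive.
# Usage: install_irstats2_files SOURCE_DIR ARCHIVE_DIR
install_irstats2_files() {
    src=$1
    dest=$2
    for sub in bin cfg cgi; do
        # Skip anything the unpacked source tree does not provide.
        [ -d "$src/$sub" ] || continue
        # Create the target directory if it does not exist yet.
        mkdir -p "$dest/$sub"
        # "/." copies the directory contents, including subdirectories.
        cp -R "$src/$sub/." "$dest/$sub/"
    done
}
```

It would be called from the unpacked IRStats2 directory with the path of your own archive as the second argument, followed by the epadmin test from step 3.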
Processing
Processing works in two steps: an initial run, followed by daily incremental runs. Because the initial run processes all your legacy "download" data, it can take a (very) long time: most likely a few hours, but possibly a few days if your repository is very large.
For the initial processing, run the command below as the "eprints" user (and remember that it may take a long time to complete). If you are running it from an SSH session, you may want to use the "screen" Linux utility so that the process keeps running even if your SSH session is disconnected.
/opt/eprints3/archives/REPO_ID/bin/stats/process_stats REPO_ID --setup --verbose
For the daily incremental processing, add the line below to cron. It is a good idea to let this run overnight, when there is less traffic to your repository.
perl /opt/eprints3/archives/REPO_ID/bin/stats/process_stats REPO_ID 1>/dev/null 2>/dev/null
The two redirections to /dev/null discard the standard output and standard error of the process, so that it prints nothing.
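A complete crontab entry also needs the five schedule fields (minute, hour, day of month, month, day of week) in front of the command. The sketch below prints such an entry; the helper name `irstats2_cron_line` and the default time of 02:30 are assumptions chosen for illustration, so pick whatever overnight slot suits your repository.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry for the daily incremental run.
# Usage: irstats2_cron_line REPO_ID [HOUR] [MINUTE]
irstats2_cron_line() {
    repo_id=$1
    hour=${2:-2}      # assumed default: 02:30, i.e. overnight
    minute=${3:-30}
    printf '%s %s * * * perl /opt/eprints3/archives/%s/bin/stats/process_stats %s 1>/dev/null 2>/dev/null\n' \
        "$minute" "$hour" "$repo_id" "$repo_id"
}

# Print the entry for the placeholder repository ID used above.
irstats2_cron_line REPO_ID
```

The printed line can be pasted into the crontab of the user that runs the processing (for example via `crontab -e`).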
When the initial processing has completed, you may point your browser to http://yourrepo.url/cgi/stats/report to look at some stats!