IRStats Technical Documentation

From EPrints Documentation
Revision as of 18:23, 29 March 2007 by Gobfrey

This document is intended as guidance for the last stage of development of EPstats.

Directory Structure

/opt/epstats

Contains data files for GeoIP. I did not have root access, so I could not put them in the correct place; instead, they are linked to from the correct place. They need regular updating, which hasn't been implemented.

/opt/epstats/bin

Contains the scripts needed to update the table.

  • daily_update.sh - Runs all the scripts in the right order.
  • extract_metadata_from_archive.pl - Extracts eprint, author and group metadata from the repository by iterating over every eprint.
  • update_table.pl - Filters and processes new entries in the accesslog to update the epstats_true_acesses_table. Uses 'SearchParser.pm' and 'repeatscache'.
  • convert_ip_to_host.pl - Attempts to convert the IP addresses of the new entries in epstats_true_acesses_table to hostnames. Uses 'host_updated' to keep track of where it got to last time.
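
As an illustration of what a wrapper like daily_update.sh might do, here is a minimal sketch. The run order is an assumption taken from the order of the list above, and the function name run_daily_update is hypothetical. It also assumes each script is executable and can be run directly. The actual daily_update.sh may differ.

```shell
#!/bin/sh
# Hypothetical sketch of a daily update wrapper (not the real daily_update.sh).
# ASSUMPTION: the scripts run in the order they are listed in this document.

run_daily_update() {
    # Directory holding the update scripts; defaults to the documented location.
    bin_dir="${1:-/opt/epstats/bin}"

    for script in \
        extract_metadata_from_archive.pl \
        update_table.pl \
        convert_ip_to_host.pl
    do
        # Stop at the first failure so later steps never run on partial data.
        "$bin_dir/$script" || {
            echo "$script failed" >&2
            return 1
        }
    done
}
```

Calling run_daily_update with no argument would use /opt/epstats/bin. Passing a different directory is convenient when trying the wrapper against stub scripts.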

Note that most of these scripts probably need to be tidied up. They were written in a hurry and were never polished.
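
The way convert_ip_to_host.pl uses 'host_updated' to remember where it got to is a checkpoint pattern. The sketch below shows one way such a checkpoint could be read and written. It assumes the file holds a single integer (for example, the id of the last processed row), which this document does not confirm. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical checkpoint helpers, in the spirit of 'host_updated'.
# ASSUMPTION: the state file contains one integer on a single line.

read_checkpoint() {
    # Print the saved position, or 0 if no state file exists yet.
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo 0
    fi
}

write_checkpoint() {
    # Write to a temporary file first, then rename it into place, so an
    # interrupted run does not leave a half-written state file behind.
    printf '%s\n' "$2" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

A script using these helpers would read the checkpoint at start-up, process only entries after that position, and write the new position once the batch has succeeded.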

/opt/epstats/cache

Contains cache files. Feel free to delete these whenever you like.
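
Since the cache files can be deleted at any time, clearing them could look like the sketch below. The function name clear_cache is hypothetical, and the sketch assumes the directory itself should be kept and only the files inside it removed.

```shell
#!/bin/sh
# Hypothetical helper for emptying the EPstats cache directory.
# ASSUMPTION: only regular files are removed; the directory stays in place.

clear_cache() {
    # Cache directory; defaults to the documented location.
    cache_dir="${1:-/opt/epstats/cache}"

    # Delete every regular file below the cache directory.
    find "$cache_dir" -type f -exec rm -f {} +
}
```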

/opt/epstats/cgi

Contains two scripts, 'get_view' and 'stats'.

  • get_view returns the output of an EPstats::View (see below), which is currently a chunk of HTML or CSV, but could be almost anything.
  • stats is a handy CGI form that passes arguments to get_view.

/opt/epstats/img

Intended as the place where any images would be kept (e.g. national flags). At the moment, only the img/graphs directory is used, which is where generated graphs are stored.

/opt/epstats/perl_lib

Contains all the EPstats classes.

EPstats Classes