IRStats Technical Documentation
This document is intended as guidance for the final stage of development of EPstats.
Directory Structure
/opt/epstats
Contains the data files for GeoIP. They are kept here only because I did not have root access; they are linked to from the correct place. These files need regular updating, which has not been implemented.
/opt/epstats/bin
Contains the scripts needed to update the table.
- daily_update.sh - Runs all the scripts in the right order.
- extract_metadata_from_archive.pl - Extracts eprint, author and group metadata from the repository by iterating over every eprint.
- update_table.pl - Filters and processes new entries in the accesslog to update the epstats_true_acesses_table. Uses 'SearchParser.pm' and 'repeatscache'.
- convert_ip_to_host.pl - Attempts to convert the IP addresses of the new entries in epstats_true_acesses_table to hostnames. Uses 'host_updated' to keep track of where it got to on the previous run.
Note that most of these scripts probably need to be tidied up. They were written in a hurry and were never polished.
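The driver script described above could look roughly like the following. This is a minimal sketch, not the actual daily_update.sh: the step order simply follows the order in which the scripts are listed above, and the function name run_update_steps and the EPSTATS_RUNNER variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a driver in the spirit of daily_update.sh.
# ASSUMPTION: the real script may run the steps in a different order
# or handle failures differently.

# Run each named script from the given bin directory, in order,
# and stop at the first one that exits with a non-zero status.
# EPSTATS_RUNNER (hypothetical) selects the interpreter; it defaults to perl.
run_update_steps() {
    bin_dir=$1
    shift
    for step in "$@"; do
        echo "running $step"
        "${EPSTATS_RUNNER:-perl}" "$bin_dir/$step" || {
            echo "step failed: $step" >&2
            return 1
        }
    done
}

# Example invocation, left commented out so that sourcing this file
# has no side effects. The order shown is an assumption.
# run_update_steps /opt/epstats/bin \
#     extract_metadata_from_archive.pl \
#     update_table.pl \
#     convert_ip_to_host.pl
```

Stopping at the first failure is a design choice of this sketch: update_table.pl and convert_ip_to_host.pl both build on earlier results, so carrying on after a failed step seems more likely to hide a problem than to help.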
/opt/epstats/cache
Contains cache files. Feel free to delete these whenever you like.
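Clearing the cache can be done by hand, or with something like the sketch below. The function name clear_epstats_cache is hypothetical; the default path is the cache directory named above. Removing only regular files and leaving any subdirectories in place is an assumption about what is safest, not a documented requirement.

```shell
#!/bin/sh
# Hypothetical helper: remove every regular file under the cache directory.
# ASSUMPTION: subdirectories are kept so that scripts expecting them
# do not have to recreate them.
clear_epstats_cache() {
    cache_dir=${1:-/opt/epstats/cache}
    if [ ! -d "$cache_dir" ]; then
        echo "no such directory: $cache_dir" >&2
        return 1
    fi
    find "$cache_dir" -type f -exec rm -f {} +
}
```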
/opt/epstats/cgi
Contains two scripts, 'get_view' and 'stats'.
- get_view returns the output of an EPstats::View (see below), which is currently a chunk of HTML or CSV, but could be almost anything.
- stats is a handy CGI form that passes arguments to get_view.
/opt/epstats/img
Intended to hold any images (e.g. national flags). At the moment, only the img/graphs directory is used; this is where generated graphs are stored.
/opt/epstats/perl_lib
Contains all the EPstats classes.