This page describes what Kieker is and how to integrate it with EPrints (tested on v3.3). The Kieker framework is developed and maintained at the University of Kiel in Germany. It is a neat tool for analysing a system, whether in development or in production.
What is Kieker
Concretely, Kieker allows you to profile and monitor all the internal module calls made within EPrints. Any function called is trapped and sent to a queue for further processing.
Kieker comes with many post-processing graphs and diagrams that show, for instance, all the calls made, the execution times, etc. This gives you a global picture of EPrints' internals in order to, e.g., optimise certain parts of the system, find coupled modules, or detect unwanted loops between modules.
Read more about it here: http://kieker-monitoring.net/
How does it work?
For EPrints, Kieker uses a Perl module called Sub::WrapPackages, which allows you to define global wrappers for internal calls. You can restrict which modules are wrapped: for instance, to profile only the database layer, you would probably wrap EPrints::Database and EPrints::Database::MySQL. The EPrints/Kieker extensions (https://github.com/eprints/epkieker) make it easy to set this up.
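To make the idea of a "global wrapper" concrete, here is a minimal sketch in plain Perl that replaces every sub in a package with a timing wrapper. It only illustrates the general technique; it is not the code of Sub::WrapPackages or of the epkieker extensions. The package Demo::Database, its two subs and the helper wrap_package are all hypothetical names invented for this example.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical stand-in for a module such as EPrints::Database.
package Demo::Database;
sub fetch_row  { my ($id) = @_; return "row $id" }
sub count_rows { return 42 }

package main;

my @calls;    # one record per trapped call

# Replace every sub defined in $package with a wrapper that times the call.
# Return-context handling is deliberately simplified.
sub wrap_package {
    my ($package) = @_;
    no strict 'refs';
    no warnings 'redefine';
    for my $name (sort keys %{"${package}::"}) {
        my $original = *{"${package}::${name}"}{CODE} or next;
        *{"${package}::${name}"} = sub {
            my $start  = Time::HiRes::time();
            my @result = $original->(@_);
            push @calls, {
                sub     => "${package}::${name}",
                seconds => Time::HiRes::time() - $start,
            };
            return wantarray ? @result : $result[0];
        };
    }
}

wrap_package('Demo::Database');
my $row   = Demo::Database::fetch_row(7);
my $count = Demo::Database::count_rows();
printf "%s took %.6f s\n", $_->{sub}, $_->{seconds} for @calls;
```

The callers are unchanged: they still call Demo::Database::fetch_row, and the wrapper records the call before handing back the original return value.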
When Kieker is enabled, it traps the selected internal calls and sends some information about each one to a queueing system. The data sent contains a high-precision timestamp, the module name, the function name, etc. Traditionally Kieker uses JMS as its queueing system, but we integrated it with memcached, which is a popular, easy-to-install server-wide caching system.
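As a rough illustration of what such a record might look like, the sketch below builds a string from a microsecond timestamp, a module name and a function name, pushes it onto a queue, and decodes it again later. The field layout, the helper names make_record and enqueue, and the function names example_query and example_render are assumptions made for this example, not Kieker's actual record format. A plain Perl array stands in for memcached so that the code runs without a server.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Build one record as a semicolon-separated string.
# The field order here is illustrative, not Kieker's actual record format.
sub make_record {
    my ($package, $function) = @_;
    my ($seconds, $microseconds) = gettimeofday();
    my $timestamp = sprintf('%d%06d', $seconds, $microseconds);
    return join(';', $timestamp, $package, $function);
}

# In-memory stand-in for the queue (memcached in the setup described above).
my @queue;
sub enqueue {
    my ($record) = @_;
    push @queue, $record;
    return scalar @queue;
}

enqueue(make_record('EPrints::Database', 'example_query'));
enqueue(make_record('EPrints::XML',      'example_render'));

# Later, once profiling is disabled: drain the queue and decode each record.
my @decoded;
while (defined(my $record = shift @queue)) {
    my ($timestamp, $package, $function) = split /;/, $record;
    push @decoded, {
        timestamp => $timestamp,
        package   => $package,
        function  => $function,
    };
}
printf "%s %s::%s\n", @{$_}{qw(timestamp package function)} for @decoded;
```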
Once you're happy with the profiling, you can disable Kieker and retrieve the data from the queue. You may then use Kieker's built-in tools to generate all sorts of graphs.
We used Kieker v1.9 during our tests. You may get it from: http://kieker-monitoring.net/download/
Note that Kieker requires Java (but this can easily be installed under Ubuntu).
If you're using Ubuntu, you can install the modules as follows:
sudo apt-get install libclass-accessor-perl libnet-stomp-perl libdevel-caller-ignorenamespaces-perl libsub-install-perl libparams-util-perl libdata-optlist-perl libsub-exporter-perl libsub-prototype-perl libsub-wrappackages-perl
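If you'd rather install them via CPAN, the corresponding module names are:
Class::Accessor Net::Stomp Devel::Caller::IgnoreNamespaces Sub::Install Params::Util Data::OptList Sub::Exporter Sub::Prototype Sub::WrapPackages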
You must also install memcached, if not already present on your server:
sudo apt-get install memcached libcache-memcached-fast-perl
To finalise the installation, you need to copy a few modules from https://github.com/eprints/epkieker so that Kieker can run with EPrints.
If EPrints is installed under its default path /opt/eprints3 (otherwise adjust the paths):
cp -rf perl_lib/Kieker* /opt/eprints3/perl_lib
cp perl_lib/EPrints/Apache/KiekerHandler.pm /opt/eprints3/perl_lib/EPrints/Apache
cp -rf bin/kieker /opt/eprints3/archives/<id>/bin/
And you're done.
There are only a few options you need to edit to configure Kieker, which are in /opt/eprints3/perl_lib/EPrints/Apache/KiekerHandler.pm.
The $MONIT_URI option allows you to restrict which URIs capture profiling data. For instance, if you have a problem with searching, you may want to set this to "/cgi/search". To do the same with the browse views, use "/view/".
my $MONIT_URI = "/"; # monitors "/" ie. /index.html
my $MONIT_URI = "/cgi/search"; # monitors the search
my $MONIT_URI = "/view/year/"; # monitors the browse view "by year"
my $MONIT_URI = "/cgi/users/home"; # monitors the user (logged-in) area
It can be dangerous to allow monitoring for every client using your repository (if you run Kieker on a production server): it slows down the system for everyone and generates lots of useless data.
This option restricts which IP address enables monitoring. At the moment it can only be a single IP address; you cannot specify network blocks. If you want to monitor EPrints yourself, this should probably be set to your own IP (v4) address.
my $MONIT_URI = "126.96.36.199";
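The URI and IP restrictions can be pictured as a single gate in front of the profiler. The matching rules that KiekerHandler.pm actually applies are not spelled out on this page, so the following is only a sketch. It assumes an exact string comparison for the IP address and a simple prefix match for the URI; under that assumption "/" would match every request, so the real handler may well treat that value differently. The function should_monitor and its parameters are hypothetical.

```perl
use strict;
use warnings;

# Decide whether a request should be monitored.
# Assumptions: the client IP must equal the configured address exactly,
# and the request URI must start with the configured URI.
sub should_monitor {
    my ($request_uri, $client_ip, $monit_uri, $monit_ip) = @_;
    return 0 unless $client_ip eq $monit_ip;
    return (index($request_uri, $monit_uri) == 0) ? 1 : 0;
}

# 192.0.2.x is a documentation-only address range, used here as a placeholder.
my $search_hit  = should_monitor('/cgi/search/simple', '192.0.2.10', '/cgi/search', '192.0.2.10');
my $other_uri   = should_monitor('/view/year/',        '192.0.2.10', '/cgi/search', '192.0.2.10');
my $other_client = should_monitor('/cgi/search/simple', '192.0.2.99', '/cgi/search', '192.0.2.10');

printf "search from my IP: %d\n", $search_hit;
printf "other URI:         %d\n", $other_uri;
printf "other client:      %d\n", $other_client;
```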
Remember that, with Perl, Kieker works by wrapping modules? This last option allows you to specify which modules are wrapped, and hence which modules are monitored. Anything outside the selected scope is not monitored and does not generate any data.
Some common examples:
my $MONIT_PACKAGES = "EPrints EPrints::*"; # any EPrints call
my $MONIT_PACKAGES = "EPrints::Database EPrints::Database::*"; # anything relating to EPrints' Database layer
my $MONIT_PACKAGES = "EPrints::Plugin::Screen::Items"; # the "Manage deposits" page
my $MONIT_PACKAGES = "EPrints::MetaField::* EPrints::XML"; # EPrints' metafield layer, in relation to the XML module
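To see which packages a given value would select, the sketch below interprets the string as a space-separated list in which "Name::*" means "any package below Name". This reading is an assumption based on the examples above (which list "EPrints" and "EPrints::*" separately), not a description of how Sub::WrapPackages or epkieker resolves the patterns internally. The helper package_is_monitored is a hypothetical name.

```perl
use strict;
use warnings;

# Return 1 if $package is selected by the space-separated pattern list $spec.
# "Name::*" is assumed to select packages below Name, but not Name itself.
sub package_is_monitored {
    my ($package, $spec) = @_;
    for my $pattern (split ' ', $spec) {
        if ($pattern =~ /^(.+)::\*$/) {
            my $prefix = $1 . '::';
            return 1 if index($package, $prefix) == 0;
        }
        elsif ($package eq $pattern) {
            return 1;
        }
    }
    return 0;
}

my $spec = 'EPrints::Database EPrints::Database::*';
for my $package ('EPrints::Database', 'EPrints::Database::MySQL', 'EPrints::XML') {
    printf "%-26s %s\n", $package,
        package_is_monitored($package, $spec) ? 'monitored' : 'ignored';
}
```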
- Separation of the Core API and UI (rendering) components: this mainly means that core objects (Repository, MetaField*, DataObj*) won't have any render_* methods. It could also mean that the core and the UI could be upgraded separately, and it would make it easy to re-use the EPrints core for other purposes (e.g. data repository, OER etc.)