Files/OAI Harvester for EPrints 3.2+

From EPrints Documentation
Revision as of 09:38, 6 July 2011

Keeps your repository in sync with another repository via OAI-PMH.

Note that these modules contain only the abstract classes. You will need to write your own module that translates whatever XML format you are harvesting into EPrints data structures. An example is provided under cfg/plugins/EPrints/Plugin/Import/OAIPMH/
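When writing the translation module, it can help to look at the XML the remote repository actually returns. The following is a minimal sketch, assuming the remote service speaks standard OAI-PMH (the ListRecords verb with the metadataPrefix and set arguments). The helper name oai_list_records_url and the example base URL are hypothetical and not part of this package.

```shell
# Hypothetical helper: build an OAI-PMH ListRecords request URL.
#   $1 = base URL of the remote OAI-PMH service (assumed, not from this package)
#   $2 = metadataPrefix, e.g. oai_dc
#   $3 = optional set name
oai_list_records_url() {
    url="$1?verb=ListRecords&metadataPrefix=$2"
    # Only append the set argument when one was given
    if [ -n "${3:-}" ]; then
        url="$url&set=$3"
    fi
    printf '%s\n' "$url"
}
```

For example, `curl -s "$(oai_list_records_url http://example.org/oai oai_dc that_set)"` would fetch the first page of records so you can see which elements your module has to map. The sketch does no URL-encoding, so set names containing special characters would need to be encoded first.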


  • As root user:

cpan HTTP::OAI

  • As eprints user:

cp -rf bin/ cfg/ ./archives/ARCHIVEID/

Check the perl declaration in the file ./archives/ARCHIVEID/bin/oai/harvest and change it to match your system if necessary.

./bin/epadmin update_database_structure ARCHIVEID
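For the perl declaration check in the steps above, a quick comparison can be sketched as follows. This assumes that "perl declaration" refers to the #! line at the top of the script. The helper name show_perl_declaration is hypothetical.

```shell
# Hypothetical helper: print a script's first line next to the perl found on PATH,
# so the two can be compared by eye.
show_perl_declaration() {
    script="$1"
    printf 'declared:     %s\n' "$(head -n 1 "$script")"
    printf 'perl on PATH: %s\n' "$(command -v perl || echo 'not found')"
}
```

Running `show_perl_declaration ./archives/ARCHIVEID/bin/oai/harvest` prints both values. If they disagree, edit the first line of the script by hand.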

  • As root user:

Restart the web server.


  • Create a module that transforms the harvested XML, as explained in the introduction. The following steps assume you created a Dublin Core importer called OAIPMH::OAI_DC.
  • Edit the configuration file cfg/cfg.d/ and create a new configuration for the service you want to harvest (an example is provided in that file), e.g.:
$c->{oai_harvester}->{service_name} = {
     url => '',	# URL of the remote OAI-PMH service
     set => 'that_set',
     default_values => sub {
          my( $session, $epdata, $header ) = @_;
          $epdata->{userid} = 1234;	# user '1234' will own all imported publications
          $epdata->{eprint_status} = 'archive';	# imported publications will go straight to the live archive
          # etc...
     },
};
  • Run bin/oai/harvest periodically (or set it up in cron):

bin/oai/harvest ARCHIVEID --plugin=OAIPMH::OAI_DC --conf=service_name
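If the harvest is run from cron, one concern is a new run starting while the previous one is still going. The following is a minimal sketch of a wrapper that skips a run in that case. The function name run_locked and the LOCK_DIR default are assumptions for illustration, not something shipped with the harvester.

```shell
# Hypothetical cron wrapper: run a command unless a previous run still holds the lock.
# LOCK_DIR is an assumed location; override it to suit your system.
LOCK_DIR="${LOCK_DIR:-/tmp/oai_harvest.lock}"

run_locked() {
    # mkdir is atomic: it fails if the directory already exists,
    # which is taken to mean another harvest is still running.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "harvest already running, skipping" >&2
        return 1
    fi
    # Run the given command and remember its exit status
    status=0
    "$@" || status=$?
    # Release the lock whether or not the command succeeded
    rmdir "$LOCK_DIR"
    return "$status"
}
```

A script called from cron could then end with `run_locked bin/oai/harvest ARCHIVEID --plugin=OAIPMH::OAI_DC --conf=service_name`. If a run is killed before it releases the lock, the directory has to be removed by hand.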

That's it!