From EPrints Documentation
Revision as of 15:38, 11 October 2013 by Kiz (talk | contribs) (Writing your own Importer)

This page is about SWORD, a lightweight protocol for remotely depositing content into repositories.

The SWORD project was funded by JISC and more information can be found on the official website.

SWORD made easy

SWORD is basically an HTTP PUT (or POST) to a defined URL, where the body of the request is the thing being deposited.

SWORD 1.3 uses an HTTP header field (X-Packaging) to define how the thing has been wrapped up (packaged).

SWORD 2.0 uses the Content-Type to deduce how to understand the thing.

By default, SWORD is enabled on all EPrints 3.2 & 3.3 servers, and access is available to all registered users.

EPrints 3.2 uses SWORD 1.3

EPrints 3.3 uses SWORD 2.0

This document covers EPrints 3.2 & SWORD 1.3. For information on SWORD 2 see API:EPrints/Apache/CRUD.
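
To make the "HTTP POST plus a packaging header" idea concrete, here is a minimal sketch of how a client might assemble a SWORD 1.3 deposit. The host name, user name, password, MIME type, file name and body are hypothetical placeholders, and build_deposit_request is a helper invented for this example; the header names (X-Packaging, X-On-Behalf-Of, Content-Disposition) come from SWORD 1.3. The request is only built and printed, not sent.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Assemble the URL and options for a SWORD 1.3 deposit without sending it.
# build_deposit_request is a hypothetical helper, not part of EPrints.
sub build_deposit_request {
    my (%args) = @_;

    my %headers = (
        'Content-Type'        => $args{mime_type},
        'Content-Disposition' => "filename=$args{filename}",
        # SWORD 1.3: the packaging format is declared in a header
        'X-Packaging'         => $args{packaging},
        # HTTP Basic authentication as a registered repository user
        'Authorization'       => 'Basic '
          . encode_base64( "$args{user}:$args{password}", '' ),
    );

    # Mediated deposit: name the user the item is deposited for
    $headers{'X-On-Behalf-Of'} = $args{on_behalf_of}
      if defined $args{on_behalf_of};

    return ( $args{url}, { headers => \%headers, content => $args{content} } );
}

# All values below are placeholders for illustration only.
my ( $url, $options ) = build_deposit_request(
    url       => 'http://repository.example.org/sword-app/deposit/inbox',
    user      => 'depositor',
    password  => 'secret',
    packaging => 'http://eprints.org/ep2/data/2.0',
    mime_type => 'text/xml',
    filename  => 'record.xml',
    content   => '<eprints/>',
);

print "POST $url\n";
print "$_: $options->{headers}{$_}\n" for sort keys %{ $options->{headers} };

# To send it for real (HTTP::Tiny is a core module):
#   require HTTP::Tiny;
#   my $response = HTTP::Tiny->new->post( $url, $options );
#   print "$response->{status} $response->{reason}\n";
```

Keeping the request construction separate from the network call makes it easy to inspect the headers before pointing the client at a live repository.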


SWORD 1.3 uses some specific terms for specific meanings:

  • collection: the specific URL within the server that the data goes into. For EPrints this generally means inbox, review, archive or deleted. DSpace has its own Collection concept, and Fedora has a similar RDF tag for defining collective groupings.
  • package: the URI that identifies how a particular deposit has been wrapped up.
  • mediation: where one user can deposit on behalf of another user.
  • servicedocument: the document that the SWORD server can return to inform clients of which collections and which packages are understood by the service.

Protocol implementation

EPrints implements the optional verbose and no-op features of the protocol.
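
In SWORD 1.3 these two options are requested through the X-Verbose and X-No-Op request headers. The sketch below adds them to an existing set of deposit headers; with_sword_options and its verbose / no_op option names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Return a copy of the given headers with the optional SWORD 1.3 flags added.
# with_sword_options and its option names are hypothetical.
sub with_sword_options {
    my ( $headers, %opts ) = @_;
    my %out = %{$headers};
    $out{'X-Verbose'} = 'true' if $opts{verbose};    # ask for a detailed report
    $out{'X-No-Op'}   = 'true' if $opts{no_op};      # dry run: nothing is stored
    return \%out;
}

my $headers = with_sword_options(
    { 'X-Packaging' => 'http://eprints.org/ep2/data/2.0' },
    verbose => 1,
    no_op   => 1,
);

print "$_: $headers->{$_}\n" for sort keys %{$headers};
```

A no-op deposit is a convenient way to test a new client or importer, since the server processes the request without keeping the result.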

Configuring SWORD

The default location for SWORD configuration is

 archives/<your repo>/cfg/cfg.d/sword.pl

This is where you enable and disable access to the various collections, and add or remove packages.


The servicedocument is an XML listing of which "collections" are available, and which "packages" can be used with each one.

  • a "collection" in EPrints terms is inbox, review or archive, which correspond to the user's workspace, the administration review buffer, and the live repository (the review collection appears as buffer in the deposit URLs)
  • a "package" is an agreed method for wrapping up the data being sent over: as XML, as formatted text, in a zip file, etc.

The default location for the servicedocument is /sword-app/servicedocument


Below is an example framework of the servicedocument:

  <service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/" xmlns:dcterms="http://purl.org/dc/terms/">
    <workspace>
      <collection href="http://opendepot.org/sword-app/deposit/buffer"> ... </collection>
      <collection href="http://opendepot.org/sword-app/deposit/inbox"> ... </collection>
    </workspace>
  </service>

and each collection lists the package formats it understands:

  <collection href="http://opendepot.org/sword-app/deposit/buffer">
    <atom:title>Repository Review</atom:title>
    <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
    <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
    <sword:acceptPackaging q="0.2">http://www.imsglobal.org/xsd/imscp_v1p1</sword:acceptPackaging>
    <sword:acceptPackaging q="0.2">http://purl.org/net/sword-types/METSDSpaceSIP</sword:acceptPackaging>
    <sword:treatment>Deposited items will undergo the review process. Upon approval, items will appear in the live repository.</sword:treatment>
    <dcterms:abstract>This is the repository review.</dcterms:abstract>
  </collection>
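
A client can read this listing to discover what each collection accepts. The following is a minimal sketch that parses a cut-down servicedocument held in a string and collects the accepted packages and their q values per collection. It assumes the XML::LibXML module is installed (it is not a core Perl module); the workspace wrapper in the sample XML and the %packages_for variable are choices made for this example.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: installed separately, not core Perl

# A cut-down servicedocument, based on the example above.
my $xml = <<'XML';
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom"
         xmlns:sword="http://purl.org/net/sword/">
  <workspace>
    <collection href="http://opendepot.org/sword-app/deposit/buffer">
      <atom:title>Repository Review</atom:title>
      <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
      <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
    </collection>
  </workspace>
</service>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# The document uses namespaces, so XPath needs registered prefixes.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( app   => 'http://www.w3.org/2007/app' );
$xpc->registerNs( sword => 'http://purl.org/net/sword/' );

# collection URL => { packaging URI => q value }
my %packages_for;
for my $collection ( $xpc->findnodes('//app:collection') ) {
    my $href = $collection->getAttribute('href');
    for my $node ( $xpc->findnodes( 'sword:acceptPackaging', $collection ) ) {
        $packages_for{$href}{ $node->textContent } = $node->getAttribute('q');
    }
}

for my $href ( sort keys %packages_for ) {
    print "$href\n";
    print "  $_ (q=$packages_for{$href}{$_})\n"
      for sort keys %{ $packages_for{$href} };
}
```

In a real client the XML would come from an authenticated GET of /sword-app/servicedocument rather than from a string.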

Writing your own Importer

In EPrints 3.2, you need to create two things to enable a new importer:

  1. Configure the repository to recognise the new package format, and associate that format with the code that handles it.
    • This lives in ~eprints/archives/<ID>/cfg/cfg.d/
  2. Write the actual Perl package that handles the file being deposited.
    • This lives in ~eprints/archives/<ID>/cfg/plugins/EPrints/Plugin/Sword/Import/
    • For complex importers, you may end up writing multiple packages. It's Perl, so TMTOWTDI (there's more than one way to do it).

Configuration file

This is a relatively easy file to write, for example:

  # Add in the RJ_Broker acceptance type
  $c->{sword}->{supported_packages}->{"http://opendepot.org/broker/1.0"} =
  {
    name => "Repository Junction Broker",
    plugin => "Sword::Import::RJ_Broker",
    qvalue => "0.8"
  };

  • The name is the string that's shown in the servicedocument
  • The plugin is the Perl package used to handle the file deposited, named relative to EPrints::Plugin
  • The qvalue is what's known as the "Quality Value" - how closely the importer matches all the metadata wanted by the repository.
    • The theory is that clients have a list of packages they can export as, and servers have a list of packages they understand - therefore a client can determine the best package for that transfer, based on relative QValues.
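
The selection described in the last bullet can be sketched as follows. This is a simplified illustration that ranks only by the server's q values (a real client might also weigh its own preferences); best_package and the two example data structures are hypothetical, and the URIs are the ones that appear on this page.

```perl
use strict;
use warnings;

# Pick the packaging URI with the highest server q value among those
# the client can produce. Returns undef if there is no overlap.
sub best_package {
    my ( $server_accepts, $client_can_export ) = @_;
    my $best;
    for my $uri ( @{$client_can_export} ) {
        next unless exists $server_accepts->{$uri};
        $best = $uri
          if !defined $best
          || $server_accepts->{$uri} > $server_accepts->{$best};
    }
    return $best;
}

# What the server advertises in its servicedocument (URI => q value).
my %server_accepts = (
    'http://www.loc.gov/METS/'        => 0.2,
    'http://eprints.org/ep2/data/2.0' => 1.0,
    'http://opendepot.org/broker/1.0' => 0.8,
);

# What this (hypothetical) client is able to export.
my @client_can_export = (
    'http://www.loc.gov/METS/',
    'http://opendepot.org/broker/1.0',
);

my $choice = best_package( \%server_accepts, \@client_can_export );
print "Deposit using: $choice\n";    # Deposit using: http://opendepot.org/broker/1.0
```

Here the client cannot produce EPrints XML, so the broker format (q=0.8) wins over METS (q=0.2).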

Importer Plugin

The importer plugin is a Perl package that handles the deposited file and creates a new record in the repository from it.

In practice, the best way to learn how importers are written is to look at the existing importers (~eprints/perl_lib/EPrints/Plugin/Sword/Import/...)

Basic framework

The very basic framework for an importer is to subclass from the generic "import" class:

package EPrints::Plugin::Sword::Import::MyImporter;

use strict;

use EPrints::Plugin::Sword::Import;
our @ISA = qw/ EPrints::Plugin::Sword::Import /;

sub new {
  my ( $class, %params ) = @_;
  my $self = $class->SUPER::new(%params);
  $self->{name} = "My Sword Importer";
  return $self;
} ## end sub new