|   |   | 
| Line 14: | Line 14: | 
|  | By default, SWORD is enabled in all EPrints 3.2 & 3.3 servers, and access is available to all registered users. |  | By default, SWORD is enabled in all EPrints 3.2 & 3.3 servers, and access is available to all registered users. | 
|  |  |  |  | 
| − | EPrints 3.2 uses SWORD 1.3 | + | EPrints 3.2 uses [[SWORD 1.3]] | 
|  |  |  |  | 
| − | EPrints 3.3 uses SWORD 2.0 | + | EPrints 3.3 uses [[SWORD 2.0]] | 
|  |  |  |  | 
| − | This document covers EPrints 3.2 & SWORD 1.3
 | + | For further information on SWORD 2 see [[API:EPrints/Apache/CRUD]]. | 
| − | For information on SWORD 2 see [[API:EPrints/Apache/CRUD]]. |  | 
| − |   |  | 
| − | == Terminology ==
 |  | 
| − |   |  | 
| − | SWORD 1.3 uses some specific terms for specific meanings
 |  | 
| − | * '''collection''' The specific URL within the server for the data to go into. For EPrints this generally means inbox, review, archive, deleted - however for DSpace, there is a Collection concept; and Fedora has a similar RDF tag for defining collective groupings.
 |  | 
| − | * '''package''' The URI that identifies how a particular deposit has been wrapped up.
 |  | 
| − | * '''mediation''' This is where one user can deposit ''on behalf of'' another user.
 |  | 
| − | * '''servicedocument''' The document that the SWORD server can return to inform clients of what collections and what packages are understood by the service
 |  | 
| − |   |  | 
| − | == Protocol implementation ==
 |  | 
| − |   |  | 
| − | verbose
 |  | 
| − | no-op
 |  | 
| − |   |  | 
| − | == Configuring SWORD ==
 |  | 
| − |   |  | 
| − |   |  | 
| − |   |  | 
| − | The default location for SWORD configuration is 
 |  | 
| − |   |  | 
| − |   archives/<your repo>/cfg/cfg.d/sword.pl
 |  | 
| − |   |  | 
| − | This is where you enable and disable access to various ''collections'', and add/remove ''packages''
 |  | 
| − |   |  | 
| − | == servicedocument ==
 |  | 
| − |   |  | 
| − | The servicedocument is an XML listing of which "collections" are available, and what "packages" can be used with each one.
 |  | 
| − |   |  | 
| − | * a "collection" in EPrints terms is <tt>inbox</tt>, <tt>review</tt>, <tt>archive</tt> - which correspond to the user's workspace, the administration review buffer, and the live repository
 |  | 
| − | * a "package" is an agreed method for wrapping up the data being sent over - as XML, as formatted text, in a zip file, etc...
 |  | 
| − |   |  | 
| − | The default location for the servicedocument is <tt>/sword-app/servicedocument</tt>
 |  | 
| − |   |  | 
| − | === example ===
 |  | 
| − | Below is an example framework of the servicedocument
 |  | 
| − | <pre>
 |  | 
| − |   <service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/"
 |  | 
| − |    xmlns:dcterms="http://purl.org/dc/terms/">
 |  | 
| − |   <workspace>
 |  | 
| − |     <atom:title>OpenDepot.org</atom:title>
 |  | 
| − |     <collection href="http://opendepot.org/sword-app/deposit/buffer">
 |  | 
| − |     ....
 |  | 
| − |     </collection>
 |  | 
| − |     <collection href="http://opendepot.org/sword-app/deposit/inbox">
 |  | 
| − |     .... 
 |  | 
| − |     </collection>
 |  | 
| − |   </workspace>
 |  | 
| − | </pre>
 |  | 
| − |   |  | 
| − | and each collection lists the package formats it understands:
 |  | 
| − | <pre>
 |  | 
| − |   <collection href="http://opendepot.org/sword-app/deposit/buffer">
 |  | 
| − |     <atom:title>Repository Review</atom:title>
 |  | 
| − |     <accept>*/*</accept>
 |  | 
| − |     <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
 |  | 
| − |     <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
 |  | 
| − |     <sword:acceptPackaging q="0.2">http://www.imsglobal.org/xsd/imscp_v1p1</sword:acceptPackaging>
 |  | 
| − |     <sword:acceptPackaging q="0.2">http://purl.org/net/sword-types/METSDSpaceSIP</sword:acceptPackaging>
 |  | 
| − |     <sword:collectionPolicy/>
 |  | 
| − |     <sword:treatment>
 |  | 
| − |       Deposited items will undergo the review process. Upon approval, items will appear in the live repository.
 |  | 
| − |     </sword:treatment>
 |  | 
| − |     <sword:mediation>true</sword:mediation>
 |  | 
| − |     <dcterms:abstract>This is the repository review.</dcterms:abstract>
 |  | 
| − |   </collection>
 |  | 
| − | </pre>
 |  | 
| − |   |  | 
| − | == Writing your own Importer ==
 |  | 
| − |   |  | 
| − | In EPrints 3.2, you need to create two things to enable a new importer
 |  | 
| − |   |  | 
| − | # You need to configure the repository to recognise a new package format, and associate that format with the code that handles it
 |  | 
| − | ** This lives in <tt>~~eprints/archives/<ID>/cfg/cfg.d/</tt>
 |  | 
| − | # You need to write the actual package that handles the file being deposited
 |  | 
| − | ** This lives in <tt>~~eprints/archives/<ID>/cfg/plugins/EPrints/Plugin/Sword/Import/</tt>
 |  | 
| − | ** For complex importers, you may end up writing multiple packages - it's Perl.... TMTOWTDI
 |  | 
| − |   |  | 
| − | === Configuration file ===
 |  | 
| − | This is a relatively easy file to write - for example:
 |  | 
| − |   |  | 
| − | <pre>
 |  | 
| − |   # Add in the RJ_Broker acceptance type
 |  | 
| − |   $c->{sword}->{supported_packages}->{"http://opendepot.org/broker/1.0"} = 
 |  | 
| − |   {
 |  | 
| − |     name => "Repository Junction Broker",
 |  | 
| − |     plugin => "Sword::Import::RJ_Broker",
 |  | 
| − |     qvalue => "0.8"
 |  | 
| − |   };
 |  | 
| − | </pre>
 |  | 
| − |   |  | 
| − | * The <tt>name</tt> is the string that's shown in the servicedocument
 |  | 
| − | * The <code>plugin</code> is the package used to handle the file deposited
 |  | 
| − | * The <tt>qvalue</tt> is what's known as the "Quality Value" - how closely the importer matches all the metadata wanted by the repository.
 |  | 
| − | ** The theory is that clients have a list of packages they can export as, and servers have a list of packages they understand - therefore a client can determine the best package for that transfer, based on relative QValues.
 |  | 
| − |   |  | 
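The q-value negotiation described above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: <code>best_package</code> is a hypothetical helper, not part of EPrints or of any SWORD client library, and the q-values are copied by hand from the example servicedocument and configuration on this page rather than parsed from a live server.

```perl
use strict;
use warnings;

# Hypothetical helper (not an EPrints function): given the formats a client
# can export and the q-values a server advertises, return the shared format
# with the highest q-value, or undef if there is no format in common.
sub best_package
{
    my ( $client_formats, $server_qvalues ) = @_;

    my ( $best, $best_q );
    foreach my $format ( @{$client_formats} )
    {
        my $q = $server_qvalues->{$format};
        next unless defined $q;    # the server does not accept this format
        if( !defined $best_q || $q > $best_q )
        {
            ( $best, $best_q ) = ( $format, $q );
        }
    }
    return $best;
}

# q-values as they would appear in sword:acceptPackaging elements
my %server = (
    "http://www.loc.gov/METS/"        => 0.2,
    "http://eprints.org/ep2/data/2.0" => 1.0,
    "http://opendepot.org/broker/1.0" => 0.8,
);

# formats this (imaginary) client is able to produce
my @client = ( "http://www.loc.gov/METS/", "http://opendepot.org/broker/1.0" );

print best_package( \@client, \%server ), "\n";
```

With these values the client would choose the Repository Junction Broker format, since 0.8 is the highest q-value among the formats both sides support. If two formats tie, this sketch keeps whichever the client listed first.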
| − | === Importer Plugin ===
 |  | 
| − | The importer plugin is a Perl package that handles the deposited file, and creates a new record in the repository for the item deposited.
 |  | 
| − |   |  | 
| − | The Importer system is, as with all EPrints code, deeply hierarchical:
 |  | 
| − |   |  | 
| − | # <code>EPrints::Plugin::Sword::Import::MyImporter</code> will inherit most of its code from <code>EPrints::Plugin::Sword::Import</code>
 |  | 
| − | # <code>EPrints::Plugin::Sword::Import</code> is never run directly, and inherits most of its functions from <code>EPrints::Plugin::Import</code>
 |  | 
| − | # <code>EPrints::Plugin::Import</code> is another ''base class'' (one that is there to provide central functions), and inherits from <code>EPrints::Plugin</code>
 |  | 
| − | # <code>EPrints::Plugin</code> is the ''base class'' for all plugins.
 |  | 
| − |   |  | 
| − | In practice, the best way to learn how importers are written is to look at existing importers (<tt>~~eprints/perl-lib/EPrints/Plugin/Sword/Import/...</tt>)
 |  | 
| − |   |  | 
| − | ==== Basic framework ====
 |  | 
| − | The very basic framework for an importer is just to register the importer with EPrints, and leave all functions to be handled by inheritance:
 |  | 
| − |   |  | 
| − | <pre>
 |  | 
| − | package EPrints::Plugin::Sword::Import::MyImporter;
 |  | 
| − |   |  | 
| − | use strict;
 |  | 
| − |   |  | 
| − | use EPrints::Plugin::Sword::Import;
 |  | 
| − | our @ISA = qw/ EPrints::Plugin::Sword::Import /;
 |  | 
| − |   |  | 
| − |   |  | 
| − | sub new {
 |  | 
| − |   my ( $class, %params ) = @_;
 |  | 
| − |   my $self = $class->SUPER::new(%params);
 |  | 
| − |   $self->{name} = "My Sword Importer";
 |  | 
| − |   return $self;
 |  | 
| − | } ## end sub new
 |  | 
| − |   |  | 
| − | 1;
 |  | 
| − |   |  | 
| − | </pre>
 |  | 
| − |   |  | 
| − | This is actually pretty useless as all it will do is create a blank record, with the deposited file attached as a document. To be useful, it needs to somehow read in some metadata and maybe attach some files.
 |  | 
| − |   |  | 
| − | The first important task will be to handle the file deposited. For this, you need to create a function called <code>input_file</code>, and it will have a framework something like:
 |  | 
| − |   |  | 
| − | <pre>
 |  | 
| − | ##        $opts{file} = $file;
 |  | 
| − | ##        $opts{mime_type} = $headers->{content_type};
 |  | 
| − | ##        $opts{dataset_id} = $target_collection;
 |  | 
| − | ##        $opts{owner_id} = $owner->get_id;
 |  | 
| − | ##        $opts{depositor_id} = $depositor->get_id if(defined $depositor);
 |  | 
| − | ##        $opts{no_op}   = is this a No-op?
 |  | 
| − | ##        $opts{verbose} = is this verbose?
 |  | 
| − | sub input_file
 |  | 
| − | {
 |  | 
| − |     my ( $plugin, %opts) = @_;
 |  | 
| − |   |  | 
| − |     my $session = $plugin->{session};
 |  | 
| − |   |  | 
| − |     # needs to read the xml from the file:
 |  | 
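| − |     my $file = $opts{file};
 |  | 
| − |     open( my $fh, '<', $file ) or return undef;
 |  | 
| − |     my $xml = do { local $/; <$fh> };    # slurp the whole file
 |  | 
| − |     close $fh;
 |  | 
| − |   |  | 
| − |     my $epdata = $plugin->parse_xml($xml);
 |  | 
| − |     # assumption: the target dataset is looked up from the dataset_id option
 |  | 
| − |     my $dataset = $session->get_repository->get_dataset( $opts{dataset_id} );
 |  | 
| − |     my $eprint = $dataset->create_object( $session, $epdata );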
 |  | 
| − |     return $eprint;
 |  | 
| − | }
 |  | 
| − |   |  | 
| − | sub parse_xml
 |  | 
| − | {
 |  | 
| − |     my ($plugin, $xml) = @_;
 |  | 
| − |     my $epdata = {};
 |  | 
| − |   |  | 
| − |     #### do stuff
 |  | 
| − |   |  | 
| − |     return $epdata;
 |  | 
| − | }
 |  | 
| − | </pre>
 |  |