|
|
Line 14: |
Line 14: |
| By default, SWORD is enabled in all EPrints 3.2 & 3.3 servers, and access is available to all registered users. | | By default, SWORD is enabled in all EPrints 3.2 & 3.3 servers, and access is available to all registered users. |
| | | |
− | EPrints 3.2 uses SWORD 1.3 | + | EPrints 3.2 uses [[SWORD 1.3]] |
| | | |
− | EPrints 3.3 uses SWORD 2.0 | + | EPrints 3.3 uses [[SWORD 2.0]] |
| | | |
− | This document covers EPrints 3.2 & SWORD 1.3
| + | For further information on SWORD 2 see [[API:EPrints/Apache/CRUD]]. |
− | For information on SWORD 2 see [[API:EPrints/Apache/CRUD]]. | |
− | | |
− | == Terminology ==
| |
− | | |
− | SWORD 1.3 uses the following terms with specific meanings:
| |
− | * '''collection''' The specific URL within the server for the data to go into. For EPrints this generally means inbox, review, archive, deleted - however for DSpace, there is a Collection concept; and Fedora has a similar RDF tag for defining collective groupings.
| |
− | * '''package''' The URI that identifies how a particular deposit has been wrapped up.
| |
− | * '''mediation''' This is where one user can deposit ''on behalf of'' another user.
| |
− | * '''servicedocument''' The document that the SWORD server returns to tell clients which collections are available and which packages the service understands.
| |
− | | |
− | == Protocol implementation ==
| |
− | | |
− | verbose
| |
− | no-op
| |
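As a rough client-side illustration of how verbose, no-op and mediation are requested, the sketch below builds the HTTP headers for a deposit. It assumes the header names from the SWORD 1.3 profile (<tt>X-Packaging</tt>, <tt>X-No-Op</tt>, <tt>X-Verbose</tt>, <tt>X-On-Behalf-Of</tt>); the function name <tt>deposit_headers</tt>, the file name and the user name are hypothetical, and nothing here is EPrints code.

```python
from typing import Dict, Optional


def deposit_headers(
    packaging: str,
    filename: str,
    mime_type: str = "application/zip",
    on_behalf_of: Optional[str] = None,
    no_op: bool = False,
    verbose: bool = False,
) -> Dict[str, str]:
    """Build the request headers for a SWORD 1.3 deposit (hypothetical helper)."""
    headers = {
        "Content-Type": mime_type,
        "Content-Disposition": f"filename={filename}",
        # The package URI tells the server how the deposit has been wrapped up.
        "X-Packaging": packaging,
        # no-op: ask the server to process the deposit without keeping it.
        "X-No-Op": "true" if no_op else "false",
        # verbose: ask the server for extra detail in its response.
        "X-Verbose": "true" if verbose else "false",
    }
    if on_behalf_of is not None:
        # mediation: deposit on behalf of another user.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers


# A dry-run deposit of an EPrints XML file, with verbose output requested.
headers = deposit_headers(
    "http://eprints.org/ep2/data/2.0",
    "deposit.xml",
    mime_type="text/xml",
    no_op=True,
    verbose=True,
)
```

These headers would accompany an HTTP POST of the file to a collection URL such as <tt>/sword-app/deposit/inbox</tt>.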
− | | |
− | == Configuring SWORD ==
| |
− | | |
− | | |
− | | |
− | The default location for SWORD configuration is
| |
− | | |
− | archives/<your repo>/cfg/cfg.d/sword.pl
| |
− | | |
− | This is where you enable and disable access to various ''collections'', and add/remove ''packages''
| |
− | | |
− | == servicedocument ==
| |
− | | |
− | The servicedocument is an XML listing of which "collections" are available, and what "packages" can be used with each one.
| |
− | | |
− | * a "collection" in EPrints terms is <tt>inbox</tt>, <tt>review</tt> or <tt>archive</tt>, which correspond to the user's workspace, the administration review buffer, and the live repository respectively
| |
− | * a "package" is an agreed method for wrapping up the data being sent over - as XML, as formatted text, in a zip file, etc...
| |
− | | |
− | The default location for the servicedocument is <tt>/sword-app/servicedocument</tt>
| |
− | | |
− | === example ===
| |
− | Below is an example framework of the servicedocument
| |
− | <pre>
| |
− | <service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/"
| |
− | xmlns:dcterms="http://purl.org/dc/terms/">
| |
− | <workspace>
| |
− | <atom:title>OpenDepot.org</atom:title>
| |
− | <collection href="http://opendepot.org/sword-app/deposit/buffer">
| |
− | ....
| |
− | </collection>
| |
− | <collection href="http://opendepot.org/sword-app/deposit/inbox">
| |
− | ....
| |
− | </collection>
| |
− |   </workspace>
| |
− | </service>
| |
− | </pre>
| |
− | | |
− | and each collection lists the package formats it understands:
| |
− | <pre>
| |
− | <collection href="http://opendepot.org/sword-app/deposit/buffer">
| |
− | <atom:title>Repository Review</atom:title>
| |
− | <accept>*/*</accept>
| |
− | <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
| |
− | <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
| |
− | <sword:acceptPackaging q="0.2">http://www.imsglobal.org/xsd/imscp_v1p1</sword:acceptPackaging>
| |
− | <sword:acceptPackaging q="0.2">http://purl.org/net/sword-types/METSDSpaceSIP</sword:acceptPackaging>
| |
− | <sword:collectionPolicy/>
| |
− | <sword:treatment>
| |
− | Deposited items will undergo the review process. Upon approval, items will appear in the live repository.
| |
− | </sword:treatment>
| |
− | <sword:mediation>true</sword:mediation>
| |
− | <dcterms:abstract>This is the repository review.</dcterms:abstract>
| |
− | </collection>
| |
− | </pre>
| |
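A client can read this listing mechanically. The following is a minimal sketch, not part of EPrints, that parses a servicedocument with Python's standard library and returns the accepted package formats and their q-values for each collection. The embedded XML is a trimmed copy of the example above, and the function name <tt>list_collections</tt> is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the servicedocument example.
NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
    "sword": "http://purl.org/net/sword/",
}

# Trimmed copy of the example servicedocument, for illustration only.
SERVICE_DOCUMENT = """<service xmlns="http://www.w3.org/2007/app"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:sword="http://purl.org/net/sword/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <workspace>
    <atom:title>OpenDepot.org</atom:title>
    <collection href="http://opendepot.org/sword-app/deposit/buffer">
      <atom:title>Repository Review</atom:title>
      <accept>*/*</accept>
      <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
      <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
      <sword:mediation>true</sword:mediation>
    </collection>
  </workspace>
</service>"""


def list_collections(xml_text):
    """Map each collection href to a {package URI: q-value} dictionary."""
    root = ET.fromstring(xml_text)
    collections = {}
    for coll in root.findall("app:workspace/app:collection", NS):
        packages = {}
        for pkg in coll.findall("sword:acceptPackaging", NS):
            # Treat a missing q attribute as 1.0 (an assumption of this sketch).
            packages[(pkg.text or "").strip()] = float(pkg.get("q", "1.0"))
        collections[coll.get("href")] = packages
    return collections


collections = list_collections(SERVICE_DOCUMENT)
for href, packages in collections.items():
    print(href, packages)
```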
− | | |
− | == Writing your own Importer ==
| |
− | | |
− | In EPrints 3.2, you need to create two things to enable a new importer
| |
− | | |
− | # You need to configure the repository to recognise a new package format, and associate that format with the code that handles it
| |
− | ** This lives in <tt>~~eprints/archives/<ID>/cfg/cfg.d/</tt>
| |
− | # You need to write the actual package that handles the file being deposited
| |
− | ** This lives in <tt>~~eprints/archives/<ID>/cfg/plugins/EPrints/Plugin/Sword/Import/</tt>
| |
− | ** For complex importers, you may end up writing multiple packages - it's Perl... TMTOWTDI
| |
− | | |
− | === Configuration file ===
| |
− | This is a relatively easy file to write - for example:
| |
− | | |
− | <pre>
| |
− | # Add in the RJ_Broker acceptance type
| |
− | $c->{sword}->{supported_packages}->{"http://opendepot.org/broker/1.0"} =
| |
− | {
| |
− | name => "Repository Junction Broker",
| |
− | plugin => "Sword::Import::RJ_Broker",
| |
− | qvalue => "0.8"
| |
− | };
| |
− | </pre>
| |
− | | |
− | * The <tt>name</tt> is the string that's shown in the servicedocument
| |
− | * The <code>plugin</code> is the package used to handle the file deposited
| |
− | * The <tt>qvalue</tt> is what's known as the "Quality Value" - how closely the importer matches all the metadata wanted by the repository.
| |
− | ** The theory is that clients have a list of packages they can export as, and servers have a list of packages they understand - therefore a client can determine the best package for that transfer, based on relative QValues.
| |
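The selection described above can be sketched from the client's side as follows. This is only an illustration of the idea: the function name <tt>best_package</tt> is hypothetical, the q-values are those from the servicedocument example, and a real client may weigh its own preferences as well.

```python
def best_package(client_formats, server_qvalues):
    """Return the shared package format with the highest server q-value.

    client_formats: package URIs the client can export.
    server_qvalues: {package URI: q-value} as advertised by the server.
    Returns None when client and server have no format in common.
    """
    shared = [fmt for fmt in client_formats if fmt in server_qvalues]
    if not shared:
        return None
    return max(shared, key=lambda fmt: server_qvalues[fmt])


# q-values as advertised for one collection in the servicedocument.
server_qvalues = {
    "http://www.loc.gov/METS/": 0.2,
    "http://eprints.org/ep2/data/2.0": 1.0,
    "http://opendepot.org/broker/1.0": 0.8,
}

# A client able to export METS or the broker format picks the broker format.
choice = best_package(
    ["http://www.loc.gov/METS/", "http://opendepot.org/broker/1.0"],
    server_qvalues,
)
print(choice)
```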
− | | |
− | === Importer Plugin ===
| |
− | The importer plugin is a Perl package that handles the deposited file, and creates a new record in the repository for the item deposited.
| |
− | | |
− | The Importer system is, as with all EPrints code, deeply hierarchical:
| |
− | | |
− | # <code>EPrints::Plugin::Sword::Import::MyImporter</code> will inherit most of its code from <code>EPrints::Plugin::Sword::Import</code>
| |
− | # <code>EPrints::Plugin::Sword::Import</code> is never run directly, and inherits most of its functions from <code>EPrints::Plugin::Import</code>
| |
− | # <code>EPrints::Plugin::Import</code> is another ''base class'' (one that is there to provide central functions), and inherits from <code>EPrints::Plugin</code>
| |
− | # <code>EPrints::Plugin</code> is the ''base class'' for all plugins.
| |
− | | |
− | In practice, the best way to learn how importers are written is to look at existing importers (<tt>~~eprints/perl-lib/EPrints/Plugin/Sword/Import/...</tt>)
| |
− | | |
− | ==== Basic framework ====
| |
− | The most basic framework for an importer simply registers the importer with EPrints, and leaves all functions to be handled by inheritance:
| |
− | | |
− | <pre>
| |
− | package EPrints::Plugin::Sword::Import::MyImporter;
| |
− | | |
− | use strict;
| |
− | | |
− | use EPrints::Plugin::Sword::Import;
| |
− | our @ISA = qw/ EPrints::Plugin::Sword::Import /;
| |
− | | |
− | | |
− | sub new {
| |
− | my ( $class, %params ) = @_;
| |
− | my $self = $class->SUPER::new(%params);
| |
− | $self->{name} = "My Sword Importer";
| |
− | return $self;
| |
− | } ## end sub new
| |
− | | |
− | 1;
| |
− | | |
− | </pre>
| |
− | | |
− | This is actually pretty useless as all it will do is create a blank record, with the deposited file attached as a document. To be useful, it needs to somehow read in some metadata and maybe attach some files.
| |
− | | |
− | The first important task will be to handle the file deposited. For this, you need to create a function called <code>input_file</code>, and it will have a framework something like:
| |
− | | |
− | <pre>
| |
− | ## $opts{file} = $file;
| |
− | ## $opts{mime_type} = $headers->{content_type};
| |
− | ## $opts{dataset_id} = $target_collection;
| |
− | ## $opts{owner_id} = $owner->get_id;
| |
− | ## $opts{depositor_id} = $depositor->get_id if(defined $depositor);
| |
− | ## $opts{no_op} = is this a No-op?
| |
− | ## $opts{verbose} = is verbose output requested?
| |
− | sub input_file
| |
− | {
| |
− | my ( $plugin, %opts) = @_;
| |
− | | |
− | my $session = $plugin->{session};
| |
− | | |
− | # needs to read the xml from the file:
| |
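− |   # the deposited file is passed in via $opts{file}; give up if it cannot be read
| |
− |   open( my $fh, '<', $opts{file} ) or return;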
| |
− | my @xml = <$fh>;
| |
− | close $fh;
| |
− | my $xml = join '', @xml;
| |
− | | |
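− |   # Look up the target dataset (inbox, buffer, ...) named in the options.
| |
− |   # Assumption: this lookup call is illustrative; compare it with the existing importers.
| |
− |   my $dataset = $session->get_repository->get_dataset( $opts{dataset_id} );
| |
− |   my $epdata = $plugin->parse_xml($xml);
| |
− |   my $eprint = $dataset->create_object( $session, $epdata );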
| |
− | return $eprint;
| |
− | }
| |
− | | |
− | sub parse_xml
| |
− | {
| |
− | my ($plugin, $xml) = @_;
| |
− | my $epdata = {};
| |
− | | |
− | #### do stuff
| |
− | | |
− | return $epdata;
| |
− | }
| |
− | </pre>
| |