Difference between revisions of "SWORD 1.3"
Latest revision as of 23:49, 11 September 2018
SWORD 1.3 is the only SWORD version available in EPrints 3.2, and is available via an EPrints Bazaar plugin in EPrints 3.3.
Terminology
SWORD 1.3 uses some terms with specific meanings:
- collection: The specific URL within the server that the data goes into. For EPrints this generally means inbox, review, archive or deleted. DSpace has its own Collection concept, and Fedora has a similar RDF tag for defining collective groupings.
- package: The URI that identifies how a particular deposit has been wrapped up.
- mediation: Where one user deposits on behalf of another user.
- servicedocument: The document that the SWORD server returns to tell clients which collections and which packages the service understands.
Protocol implementation
verbose no-op
Configuring SWORD
The default location for SWORD configuration is:
archives/<your repo>/cfg/cfg.d/sword.pl
This is where you enable and disable access to the various collections, and add or remove packages.
servicedocument
The servicedocument is an XML listing of which "collections" are available, and what "packages" can be used with each one.
- a "collection" in EPrints terms is inbox, review or archive, which correspond to the user's workspace, the administration review buffer, and the live repository respectively
- a "package" is an agreed method for wrapping up the data being sent over: as XML, as formatted text, in a zip file, etc.
The default location for the servicedocument is /sword-app/servicedocument
example
Below is an example framework of the servicedocument
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom"
         xmlns:sword="http://purl.org/net/sword/"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <workspace>
    <atom:title>OpenDepot.org</atom:title>
    <collection href="http://opendepot.org/sword-app/deposit/buffer">
      ....
    </collection>
    <collection href="http://opendepot.org/sword-app/deposit/inbox">
      ....
    </collection>
  </workspace>
</service>
Each collection lists the package formats it understands:
<collection href="http://opendepot.org/sword-app/deposit/buffer">
  <atom:title>Repository Review</atom:title>
  <accept>*/*</accept>
  <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
  <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
  <sword:acceptPackaging q="0.2">http://www.imsglobal.org/xsd/imscp_v1p1</sword:acceptPackaging>
  <sword:acceptPackaging q="0.2">http://purl.org/net/sword-types/METSDSpaceSIP</sword:acceptPackaging>
  <sword:collectionPolicy/>
  <sword:treatment>
    Deposited items will undergo the review process. Upon approval, items will appear in the live repository.
  </sword:treatment>
  <sword:mediation>true</sword:mediation>
  <dcterms:abstract>This is the repository review.</dcterms:abstract>
</collection>
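As a rough illustration of how a client might read these listings, the sketch below pulls each collection URL and its accepted packagings out of a trimmed servicedocument. The function name list_collections and the embedded sample document are hypothetical, and the regular expressions only cope with a document shaped like the example above; a real client would more likely use a proper XML parser.

```perl
use strict;
use warnings;

# Hypothetical, trimmed servicedocument modelled on the example above.
my $servicedocument = <<'XML';
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sword="http://purl.org/net/sword/">
  <workspace>
    <collection href="http://opendepot.org/sword-app/deposit/buffer">
      <sword:acceptPackaging q="0.2">http://www.loc.gov/METS/</sword:acceptPackaging>
      <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
    </collection>
    <collection href="http://opendepot.org/sword-app/deposit/inbox">
      <sword:acceptPackaging q="1.0">http://eprints.org/ep2/data/2.0</sword:acceptPackaging>
    </collection>
  </workspace>
</service>
XML

# Returns a hash ref: { collection URL => { package URI => q value } }.
# Regex matching is only adequate for input shaped like the sample above.
sub list_collections {
    my ($xml) = @_;
    my %collections;
    while ( $xml =~ m{<collection\s+href="([^"]+)">(.*?)</collection>}gs ) {
        my ( $href, $body ) = ( $1, $2 );
        $collections{$href} = {};
        while ( $body =~ m{<sword:acceptPackaging\s+q="([^"]+)">([^<]+)</sword:acceptPackaging>}g ) {
            $collections{$href}{$2} = $1;
        }
    }
    return \%collections;
}

my $collections = list_collections($servicedocument);
for my $href ( sort keys %$collections ) {
    print "$href\n";
    for my $package ( sort keys %{ $collections->{$href} } ) {
        print "  $package (q=$collections->{$href}{$package})\n";
    }
}
```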
Writing your own Importer
In EPrints 3.2, you need to create two things to enable a new importer:
- You need to configure the repository to recognise a new package format, and associate that format with the code that handles it
- This lives in ~~eprints/archives/<ID>/cfg/cfg.d/
- You need to write the actual package that handles the file being deposited
- This lives in ~~eprints/archives/<ID>/cfg/plugins/EPrints/Plugin/Sword/Import/
- For complex importers, you may end up writing multiple packages. It's Perl, so TMTOWTDI (there's more than one way to do it)
Configuration file
This is a relatively easy file to write. For example:
# Add in the RJ_Broker acceptance type
$c->{sword}->{supported_packages}->{"http://opendepot.org/broker/1.0"} = {
    name   => "Repository Junction Broker",
    plugin => "Sword::Import::RJ_Broker",
    qvalue => "0.8"
};
- The name is the string that's shown in the servicedocument
- The plugin is the package used to handle the file deposited
- The qvalue is what's known as the "Quality Value": how closely the importer matches all the metadata wanted by the repository.
- The theory is that clients have a list of packages they can export as, and servers have a list of packages they understand, so a client can determine the best package for a transfer based on the relative qvalues.
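The selection described in the last point could be sketched roughly as follows. This is only an illustration: best_package is a hypothetical helper, the client list is invented, and the sketch considers only the server-side qvalues (a client could equally weight its own preferences).

```perl
use strict;
use warnings;

# Pick the packaging URI that both sides support with the highest server qvalue.
# $client_packages: array ref of URIs the client can export (hypothetical list).
# $server_packages: hash ref of URI => qvalue, as advertised by the server.
sub best_package {
    my ( $client_packages, $server_packages ) = @_;
    my ( $best, $best_q );
    for my $uri (@$client_packages) {
        my $q = $server_packages->{$uri};
        next unless defined $q;    # server does not understand this package
        if ( !defined $best_q || $q > $best_q ) {
            ( $best, $best_q ) = ( $uri, $q );
        }
    }
    return $best;    # undef when there is no common package
}

my %server = (
    "http://www.loc.gov/METS/"        => 0.2,
    "http://eprints.org/ep2/data/2.0" => 1.0,
    "http://opendepot.org/broker/1.0" => 0.8,
);
my @client = ( "http://www.loc.gov/METS/", "http://opendepot.org/broker/1.0" );

my $choice = best_package( \@client, \%server );
print defined $choice ? "Use $choice\n" : "No common package\n";
```

With the values above, the client would settle on the broker package, since 0.8 beats the 0.2 advertised for METS.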
Importer Plugin
The importer plugin is a Perl package that handles the deposited file and creates a new record in the repository for the deposited item.
The Importer system is, as with all EPrints code, deeply hierarchical:
- EPrints::Plugin::Sword::Import::MyImporter will inherit most of its code from EPrints::Plugin::Sword::Import
- EPrints::Plugin::Sword::Import is never run directly, and inherits most of its functions from EPrints::Plugin::Import
- EPrints::Plugin::Import is another base class (one that is there to provide central functions), and inherits from EPrints::Plugin
- EPrints::Plugin is the base class for all plugins.
In practice, the best way to learn how importers are written is to look at existing importers (~~eprints/perl-lib/EPrints/Plugin/Sword/Import/...)
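To see how such a chain resolves method calls, the standalone sketch below uses stand-in packages under a hypothetical My:: prefix that mimic the shape of the hierarchy; none of this is EPrints code. It prints the order in which Perl searches the classes and shows a method defined only in the base class being called on the importer.

```perl
use strict;
use warnings;
use mro;

# Stand-in packages (My:: prefix) that mimic the shape of the EPrints hierarchy.
package My::Plugin;
sub new      { my ( $class, %params ) = @_; return bless {%params}, $class; }
sub describe { my ($self) = @_; return ref($self) . ": " . $self->{name}; }

package My::Plugin::Import;
our @ISA = ('My::Plugin');

package My::Plugin::Sword::Import;
our @ISA = ('My::Plugin::Import');

package My::Plugin::Sword::Import::MyImporter;
our @ISA = ('My::Plugin::Sword::Import');

sub new {
    my ( $class, %params ) = @_;
    my $self = $class->SUPER::new(%params);    # ends up in My::Plugin::new
    $self->{name} = "My Sword Importer";
    return $self;
}

package main;

my $importer = My::Plugin::Sword::Import::MyImporter->new;

# The order in which Perl searches the classes for a method
print join( " -> ", @{ mro::get_linear_isa( ref $importer ) } ), "\n";

# describe() is only defined in the base class, but is found by inheritance
print $importer->describe, "\n";
```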
Basic framework
The very basic framework for an importer is just to register the importer with EPrints, and leave all functions to be handled by inheritance:
package EPrints::Plugin::Sword::Import::MyImporter;

use strict;

use EPrints::Plugin::Sword::Import;
our @ISA = qw/ EPrints::Plugin::Sword::Import /;

sub new
{
    my ( $class, %params ) = @_;
    my $self = $class->SUPER::new(%params);
    $self->{name} = "My Sword Importer";
    return $self;
} ## end sub new

1;
This is actually pretty useless as all it will do is create a blank record, with the deposited file attached as a document. To be useful, it needs to somehow read in some metadata and maybe attach some files.
The first important task will be to handle the file deposited. For this, you need to create a function called input_file, and it will have a framework something like:
## $opts{file} = $file;
## $opts{mime_type} = $headers->{content_type};
## $opts{dataset_id} = $target_collection;
## $opts{owner_id} = $owner->get_id;
## $opts{depositor_id} = $depositor->get_id if(defined $depositor);
## $opts{no_op} = is this a No-op?
## $opts{verbose} = is verbose output requested?
sub input_file
{
    my ( $plugin, %opts ) = @_;
    my $session = $plugin->{session};
    my $file    = $opts{file};

    # Assumption: the target collection name maps directly to a dataset id
    my $dataset = $session->get_repository->get_dataset( $opts{dataset_id} );

    # needs to read the xml from the file:
    open( my $fh, '<', $file ) or return undef;
    my $xml = do { local $/; <$fh> };
    close $fh;

    my $epdata = $plugin->parse_xml($xml);
    my $eprint = $dataset->create_object( $session, $epdata );
    return $eprint;
}

sub parse_xml
{
    my ( $plugin, $xml ) = @_;
    my $epdata = {};
    #### do stuff
    return $epdata;
}
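What the "do stuff" part of parse_xml might look like depends entirely on the package format. The standalone sketch below assumes a hypothetical flat format with title, abstract and repeated creator elements, and assumes the repository uses the usual title, abstract and creators fields. It takes only the XML string (the plugin version would also receive $plugin), uses regular expressions rather than an XML parser, and does not decode XML entities.

```perl
use strict;
use warnings;

# Standalone sketch of a parse_xml-style function for a hypothetical flat format.
sub parse_xml {
    my ($xml) = @_;
    my $epdata = {};

    # Single-valued fields: copy the element text into the field of the same name
    for my $field (qw/ title abstract /) {
        if ( $xml =~ m{<$field>\s*(.*?)\s*</$field>}s ) {
            $epdata->{$field} = $1;
        }
    }

    # Repeated <creator>Family, Given</creator> elements
    while ( $xml =~ m{<creator>\s*([^,<]+?)\s*,\s*([^<]+?)\s*</creator>}g ) {
        push @{ $epdata->{creators} }, { name => { family => $1, given => $2 } };
    }

    return $epdata;
}

my $epdata = parse_xml(<<'XML');
<record>
  <title>An example deposit</title>
  <abstract>Short abstract.</abstract>
  <creator>Smith, John</creator>
  <creator>Jones, Ann</creator>
</record>
XML

print "$epdata->{title}\n";
print "$_->{name}{family}\n" for @{ $epdata->{creators} };
```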
Depositing (from Perl)
A SWORD deposit is, at its most basic level, just an HTTP POST request, so can be scripted fairly easily:
use LWP::UserAgent;
use MIME::Base64;

# $ep is the eprint to transfer
my $ua = LWP::UserAgent->new;
my $auth = "Basic " . encode_base64( "$username:$password", '' ); # eg: sworduser:mySecretPassword
my %headers = (
    'X-Packaging'         => $package,              # eg: http://opendepot.org/broker/1.0
    'X-No-Op'             => 'false',
    'X-Verbose'           => 'true',
    'Content-Disposition' => "filename=$filename",  # The name of the "file" the importer thinks it is reading
    'Content-Type'        => $mime,                 # eg: application/zip
    'User-Agent'          => 'OA-RJ Broker v0.2',
    'Authorization'       => $auth,
);
my $url = "${host}${collection}"; # eg: http://eprints.example.com/sword-app/deposit/review
my $buffer = $ep->export($exporter); # eg BibTeX (as in EPrints::Export::BibTeX)
my $r = $ua->post( $url, %headers, Content => $buffer );
if ( $r->is_success )
{
    # Transferred
    my $content = $r->content;
    my $return_id;
    if ( $content =~ m#<atom:id>([^<]+)</atom:id># )
    {
        $return_id = $1 if $1;
    }
}
else
{
    # fail
}