Contribute: Plugins/StoragePluginsArkivum

From EPrints Documentation
Revision as of 16:50, 12 January 2015 by Gareth.brown@arkivum.com (talk | contribs) (Storage Plugin (Version 1): Arkivum)

Storage Plugin (Version 1): Arkivum

In this tutorial we will look at installing version 1 of the Arkivum Storage Plugin. This plugin allows you to use an alternative storage location for your data. The tutorial also covers how to mount the Samba (CIFS/Windows) share that stores that data.

The plugin uses EPrints event tasks to copy data to, and remove data from, the A-Stor service as a background task. This process is driven by two new screens. The first allows a user to request that a document be copied to the A-Stor service (or that a copied document be removed from it). The second allows these requests to be approved or rejected, and then displays the status of the approved requests as they are processed by the background event tasks.

This plugin was designed to mount the Arkivum storage system, but could equally be used to store data on any CIFS/Samba system.

Background

The Arkivum EPrints storage plugin allows content in an EPrints repository to be copied or moved into Arkivum’s data archiving service.

Arkivum provides online data archiving as a service to organisations that need to keep data for long-term compliance or reuse.

The company is a spin out of the University of Southampton and is based on a decade of expertise and technology created from working with large archives on digital preservation and long-term data retention and access. As part of the Arkivum service, three copies of customer data are held in three UK locations: two online in secure data centres and one offline in escrow with a third-party. Arkivum actively manages data integrity through regular checks and migrates media and infrastructure to counter obsolescence and to ensure prices remain low.

Good preservation practice, trained staff and very carefully controlled processes mean that Arkivum can offer a guarantee of data integrity.

All data returned from our service is always bit-for-bit identical to the data the customer supplied, with no restrictions on time or volume. The guarantee is backed by worldwide insurance and is included in our contract and SLA. Arkivum is certified to ISO27001 and is regularly audited for the integrity, confidentiality and availability of data assets in our possession. This approach makes the Arkivum service ideal for long-term storage of research data, for example to meet research council retention requirements and to ensure it is available for easy access in the future.

For more information see www.arkivum.com including our series of articles and webinars on Research Data Management.

Installation

The A-Stor service is accessible using the SMB protocol (also known as CIFS). It does this through an appliance, installed at the customer site, that uses Samba to expose a network file share. This means that the A-Stor network share needs to be mounted on the server running EPrints so that the plugin can copy files to the share and read files back from it.

The installation process therefore consists of mounting the SMB network share on the EPrints server, making sure permissions are set so that EPrints can read and write files on that share, installing the plugin, and then testing that the plugin is visible and functioning correctly within EPrints.

Before You Start

Install smbfs on the EPrints server:

apt-get install smbfs       # Debian
yum install cifs-utils      # Red Hat

Note: Red Hat uses CIFS mounting rather than smbfs, so replace smbfs with cifs if you are using Red Hat. On newer Debian releases the smbfs package has likewise been replaced by cifs-utils.

Make sure you have:

  • Access to: /etc/fstab
  • The UID of the eprints user and the GID of the web server group (likely to be www-data or apache)
  • A directory somewhere on the system to which the above user has read/write access
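
The numeric IDs in the checklist above can be looked up from the shell. The following is a minimal sketch, assuming the EPrints account is called eprints and the web server group is www-data or apache; both names are assumptions and may differ on your system. The helper function names are illustrative only.

```shell
#!/bin/sh
# Sketch: find the numeric IDs needed for the uid= and gid= mount options.
# The account and group names passed to these helpers are assumptions.

# Print the third colon-separated field (the numeric ID) of a
# passwd- or group-style line such as "www-data:x:33:".
numeric_id() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Print the UID of the named user; fails if the user does not exist.
lookup_uid() {
    id -u "$1" 2>/dev/null
}

# Print the GID of the named group; fails if the group does not exist.
lookup_gid() {
    line=$(getent group "$1") || return 1
    numeric_id "$line"
}
```

For example, `lookup_uid eprints` and `lookup_gid www-data` print the two numbers to substitute into the fstab entry.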

Preparing the directory

On the EPrints server (mounting the share exported by the Arkivum unit or Samba server)

  • Create the directory you want to mount the A-Stor archive on (e.g. /mnt/Archive):
    mkdir /YOURDIR_ON_EPRINTS
  • Edit /etc/fstab and add to the bottom:
//ASTOR_SERVER/astor /YOURDIR_ON_EPRINTS cifs defaults,guest,file_mode=0666,dir_mode=0777,uid=EPRINTS_UID,gid=APACHE_GID,forcegid,forceuid,rw 0 0

Should you want to connect as a specific domain user, use this line instead:

//ASTOR_SERVER/astor /YOURDIR_ON_EPRINTS cifs defaults,sec=none,rw,soft,username=DOMAIN_USERNAME,password=DOMAIN_PASSWORD,dom=DOMAIN,uid=EPRINTS_UID,gid=APACHE_GID,forcegid,forceuid 0 0
  • At the command line, run: mount /YOURDIR_ON_EPRINTS
  • Run df -h to check that the mount succeeded.
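
The mount steps above can be sketched as two small shell helpers: one that assembles the guest fstab entry from its parts, and one that checks whether a mount point appears in the kernel's mount table. This is an illustrative sketch, not part of the plugin. The function names and the example values (astor-server, /mnt/arkivum, 1001, 33) are assumptions. The final fstab field is written as 0 here because network shares are not normally checked by fsck.

```shell
#!/bin/sh
# Sketch: build the guest fstab entry and verify the mount afterwards.
# All names and values used with these helpers are placeholders.

# fstab_entry SERVER SHARE MOUNTPOINT UID GID
# Prints one /etc/fstab line using the guest options from this tutorial.
fstab_entry() {
    printf '//%s/%s %s cifs defaults,guest,file_mode=0666,dir_mode=0777,uid=%s,gid=%s,forcegid,forceuid,rw 0 0\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is listed as a mount point in MOUNTS_FILE
# (defaults to /proc/mounts, which is Linux-specific).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

A typical use would be to review the output of `fstab_entry astor-server astor /mnt/arkivum 1001 33` before appending it to /etc/fstab by hand, and then to run `mount /mnt/arkivum && is_mounted /mnt/arkivum && echo mounted`.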

Configure Apache

On the EPrints server

Remember to add the directory you mounted as the archive to the Apache configuration file.

It should look like this:

<Directory "YOURDIR_ON_EPRINTS">
  Options FollowSymLinks
  AllowOverride none
  Order allow,deny
  Allow from all
</Directory>
<Location "YOURDIR_ON_EPRINTS">
  PerlSetVar EPrints_ArchiveID fluffy

  Options +ExecCGI
  Order allow,deny
  Allow from all
</Location>

The Storage Plugin

You can download the storage plugin from here: http://bazaar.eprints.org/313/

Then edit the configuration file to set the mount point of your A-Stor directory:

/opt/eprints3/lib/epm/arkivum/cfg/cfg.d/x_arkivum.xml

$c->{plugins}->{"Storage::ArkivumStorage"}->{params}->{mount_path} = "/mnt/arkivum";

This example configures the Arkivum Storage Plugin to place the files you want to archive securely in the mounted directory /mnt/arkivum.
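
Before enabling the plugin, it can be worth confirming that the configured mount_path is actually writable. The following is a minimal sketch; the function name and the probe file name are hypothetical, and /mnt/arkivum is simply the example path used above.

```shell
#!/bin/sh
# Sketch: check that a directory exists and that a file can be created in it.
# The probe file name is arbitrary and is removed again after the check.

# check_mount_path DIR
# Succeeds only if DIR is a directory and a test file can be created there.
check_mount_path() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    probe="$dir/.eprints_write_test.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    echo "cannot write to: $dir" >&2
    return 1
}
```

Run the check as the account that EPrints runs under (assumed here to be eprints), so that the result reflects the permissions the repository really has, for example with `check_mount_path /mnt/arkivum` from a shell started as that user.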

The Event Plugin

Before you can enable the plugin you will also need to configure the URL for the A-Stor appliance:

/opt/eprints3/lib/epm/arkivum/cfg/cfg.d/x_arkivum.xml

$c->{plugins}->{"Storage::ArkivumStorage"}->{params}->{server_url} = "https://astor-server:8443";
$c->{plugins}->{"Event::Arkivum"}->{params}->{server_url} = "https://astor-server:8443";

This example specifies that the A-Stor appliance is available at the URL "https://astor-server:8443". This URL is used by both the storage plugin and the event plugin to communicate with the A-Stor appliance through its REST API.

You can also optionally change the cron schedule used by the background astor-checker event task. To do this you will need to edit the following control screen:

/opt/eprints3/lib/plugins/EPrints/Plugin/Screen/EPMC/Arkivum.pm

and change the cron schedule in each of the entries marked:

EPrints::DataObj::EventQueue->create_unique( $repo, {
    pluginid => "Event",
    action => "cron",
    params => [
        "0,15,30,45,60 * * * *",
        "Event::Arkivum",
        "astor_checker",
    ],
});

With the default cron schedule shown above, the checker event runs every 15 minutes. The event is set up when the plugin is enabled and removed when the plugin is disabled.
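
To see what a minute list such as the one in the schedule above means in practice, the sketch below computes the next minute of the hour at which a job would fire. It is an illustration of standard cron semantics only, not EPrints code. The function name is hypothetical, and the list is assumed to be in ascending order. In standard cron the minute field ranges from 0 to 59, so the sketch ignores a value of 60; how EPrints itself treats that value has not been verified here.

```shell
#!/bin/sh
# Sketch: given a comma-separated minute list and the current minute,
# print the next minute of the hour at which the job would run.
# Assumes the list is sorted in ascending order.

# next_run MINUTE_LIST CURRENT_MINUTE
next_run() {
    list=$1
    now=$2
    first=""
    for m in $(printf '%s' "$list" | tr ',' ' '); do
        # Standard cron minutes are 0-59; skip anything larger.
        [ "$m" -le 59 ] || continue
        # Remember the earliest valid minute for wrapping into the next hour.
        [ -n "$first" ] || first=$m
        if [ "$m" -gt "$now" ]; then
            echo "$m"
            return 0
        fi
    done
    # No later minute in this hour: the job next runs at the first
    # listed minute of the following hour.
    echo "$first"
}
```

For instance, `next_run "0,15,30,45,60" 7` prints 15, and at minute 50 the same list yields 0, meaning the top of the next hour. Changing the list to "0,30" would halve how often the checker runs.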

Check in EPrints

If the installation is successful and the plugin has been enabled correctly, you should see the main astor-checker event task here:

EPrints web page --> LOGIN --> ADMIN --> SYSTEM TOOLS --> STATUS --> Background Task Queue

You will also see two additional screens. The first is reached from the document management screen via a new A-Stor icon. The second is reached from the manage records screen via a new menu option called A-Stor Requests.

For more information on A-Stor, see www.arkivum.com, including our series of articles and webinars on Research Data Management.