Getting Started with EPrints 3

From EPrints Documentation
Revision as of 11:35, 19 August 2007 by XnxXbz (talk | contribs)

Creating an Archive

EPrints 3 can run multiple archives under one install. Multiple archives require additional DNS aliases for the machine running EPrints; EPrints can then create all the parts of the Apache configuration file needed to run the virtual hosts.

Running epadmin

Make sure MySQL is actually running.

Change to your eprints user (probably "eprints").

Change directory to the eprints directory (/opt/eprints3 by default) and run

bin/epadmin create
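The preparation steps above can be checked with a small pre-flight sketch before running the command. check_eprints_env is a hypothetical helper, not part of EPrints; it assumes only that the expected user name and install directory are passed in, and it does not test MySQL, since how to do that depends on your system.

```shell
# Pre-flight sketch (hypothetical helper, not shipped with EPrints):
# confirm we are the expected user and that epadmin is present and executable.
check_eprints_env() {
    expected_user=$1    # e.g. eprints
    eprints_root=$2     # e.g. /opt/eprints3

    if [ "$(id -un)" != "$expected_user" ]; then
        echo "run this as $expected_user, not $(id -un)" >&2
        return 1
    fi
    if [ ! -x "$eprints_root/bin/epadmin" ]; then
        echo "no executable epadmin under $eprints_root/bin" >&2
        return 1
    fi
    return 0
}

# Example:
#   check_eprints_env eprints /opt/eprints3 && cd /opt/eprints3 && bin/epadmin create
```

If both checks pass, bin/epadmin create can be run as described.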

You will get the following prompts. Anything shown in [square brackets] is the default value and can be accepted by simply hitting enter.

  • Archive ID - the system name for your archive. It's probably a good idea to think of something short and memorable. Once entered, an archives/<archive_id> directory will be created, and the standard configuration files will be copied in.
  • Configure vital settings - Hit enter to say 'yes'. This will lead to more prompting about core settings:
    • Hostname - What someone will type into a web browser to get to your archive. Make sure that your systems team have a DNS alias pointing to your server for this.
    • Webserver Port - Which port do you want to serve the archive on? The default is 80, so unless you can think of a good reason not to, just hit enter to accept the default.
    • Alias - You can enter any number of aliases that will take users to this archive. Enter a '#' when you don't want to enter any more. You could have your archive served on eprints.myorganisation.org and eprints.myorg.org. As with the Hostname, your systems team need to be informed about these aliases too.
    • Administrator Email - Enter the email address of the repository administrator. This will allow your repository users to send email to the right person.
    • Archive Name - The full name of your archive. By default, this will be used on many of the pages, and in the title bar of the browser.
    • Write these core settings - If you don't say 'yes', then you entered all that data for nothing.
  • Configure database - EPrints makes extensive use of a MySQL database. Enter 'yes' to configure this.
    • Database Name - The internal name of your database. It makes sense to use the Archive ID for this, but you don't have to. You don't need to create this database, epadmin will do it for you.
    • MySQL Host - The address of the server that the database is running on. If the database is on the same machine as the EPrints installation, enter 'localhost'.
    • MySQL Port - You probably don't need to enter a value. If you have problems connecting to the database, talk to your systems team.
    • MySQL Socket - As with MySQL Port, it's unlikely that you need to enter anything.
    • Database User - The username with which to log into the MySQL Database. You don't need to create this user, epadmin will do it for you. If you enter a MySQL username that already exists, it will be overwritten by epadmin.
    • Database Password - The password for the Database User.
    • Write these database settings - You should write them, or you'll lose them.
    • Create database <Database Name> - Say yes, and epadmin can create the database and populate it with all the right tables. If you've already created a database and a user for this archive, say no.
    • MySQL Root Password - To create the database and the user, epadmin needs the MySQL Root Password. This is not saved anywhere. It is used to log into mysql, create the database and create the user with the right access rights. The password is then forgotten.
    • Create database tables - say yes to have epadmin create all the database tables.
  • Create an initial user - It's a good idea to create a user account for yourself at this point.
    • Enter a username - The username you will use to log into EPrints in your browser.
    • Select a user type (user|editor|admin) - There are three levels of user in EPrints. You probably want to be an administrator, so enter 'admin'.
    • Enter Password - A password for this user. Remember to choose a password that will be hard for someone else to guess.
    • Email - Enter your email address so that administrators can get in contact with you.
    • Do you want to build the static web pages - There are a number of pages in EPrints which change very rarely. These are the static pages. The Home page and the About page are examples of static pages. Stylesheets are also static. These pages need to be built, so say 'yes'.
    • Do you want to import the LOC subjects - If you will be using the Library Of Congress subject hierarchy, say 'yes'. Otherwise you will need to create your own subject hierarchy.
  • Do you want to update the apache config files? (you still need to add the 'Include' line) - Your archive has a number of files which it uses to configure the web server. These should be updated, so say 'yes'.
  • Before exiting, epadmin will display information about configuring the webserver.

Open a browser, and enter the hostname in the address bar. You should see your new archive, ready to be branded.

If you want to add some more users, use the command bin/epadmin create_user <archive id>

Modifying the Default

The default document types and metadata in EPrints have been optimised for a repository of research output. It is up to you to decide whether this is sufficient. Adding new document types and modifying the metadata are discussed elsewhere.

Running a Live Archive

Creating a crontab

When you create an archive it will start out as a development system while you learn how to set it up (and your manager keeps changing his mind) but at some point (hopefully) you will declare your archive open for business.

At this point you should schedule certain scripts to run periodically. The best way to do this is to use "cron" which is an integral part of most UNIX systems.

To set up cron, run (as the eprints user):

% crontab -e

Exactly what to add to the cron table is described in the following sections - "Browse Views" and "Subscriptions".

There should be one set of crontab entries per archive.
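crontab -e opens an editor. When entries need to be added from a script instead, one possible approach is to filter the existing table and append a line only if it is missing. add_cron_line is a hypothetical helper for illustration; it assumes your crontab implementation accepts a new table on standard input via "crontab -", which is common but worth checking in your man page.

```shell
# add_cron_line (hypothetical helper): read an existing cron table on stdin,
# print it back, and append the given entry only if that exact line is absent.
add_cron_line() {
    entry=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example (as the eprints user):
#   crontab -l 2>/dev/null \
#     | add_cron_line '23 * * * * /opt/eprints3/bin/generate_views <archiveid>' \
#     | crontab -
```

Because the entry is only added when absent, running the example twice leaves the cron table unchanged the second time.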

Backups

You should also make sure that the system is being properly backed up. This is covered in more detail elsewhere in the documentation.

OAI

We would also encourage you to configure the OAI support for your archive and register it.

Configuring

The settings for OAI are held in the oai.pl file, in the eprints3/archives/<archive id>/cfg/cfg.d/ directory. This is a Perl file, but don't let that daunt you. Some of the settings are set to sensible defaults. This guide covers the essentials. Feel free to use your favourite text editor instead of pico.

At the command prompt, back up the file and then open it:

>cd /opt/eprints3/archives/<archive id>/cfg/cfg.d
>cp oai.pl oai.backup
>pico oai.pl
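A plain cp to oai.backup overwrites the previous backup each time. If you expect to edit the file more than once, a timestamped copy is one alternative. The sketch below is an assumption-laden illustration: backup_config is a hypothetical helper, and the naming scheme simply extends the oai.backup name used above.

```shell
# backup_config (hypothetical helper): copy FILE to FILE-without-.pl + ".backup.<timestamp>"
# and print the name of the copy. Refuses to overwrite an existing backup.
backup_config() {
    file=$1
    stamp=$(date +%Y%m%d-%H%M%S)
    target="${file%.pl}.backup.$stamp"   # oai.pl -> oai.backup.20070819-113500
    if [ -e "$target" ]; then
        echo "backup $target already exists" >&2
        return 1
    fi
    # -p keeps the original permissions and modification time
    cp -p -- "$file" "$target" && printf '%s\n' "$target"
}

# Example:
#   cd /opt/eprints3/archives/<archive id>/cfg/cfg.d
#   backup_config oai.pl
```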

The following need to be changed:

The archive ID. This needs to be unique, so check that it doesn't already exist at http://www.openarchives.org/.

Find the following line in oai.pl:

$oai->{v2}->{archive_id} = "generic.eprints.org";

And change generic.eprints.org to something which identifies your repository.
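To confirm the change was saved, a quick check along these lines may help. archive_id_is_default is a hypothetical helper; it only looks for the default value quoted above on lines mentioning archive_id and knows nothing else about the file.

```shell
# archive_id_is_default (hypothetical helper): succeeds (exit 0) if a line
# mentioning archive_id in the given file still contains the shipped default.
archive_id_is_default() {
    grep -F 'archive_id' "$1" | grep -Fq 'generic.eprints.org'
}

# Example:
#   if archive_id_is_default oai.pl; then
#       echo "archive_id is still the default, please change it" >&2
#   fi
```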

Content Description. What does your repository contain? Write a description, then find the lines:

$oai->{content}->{"text"} = latin1( <<END );
OAI Site description has not been configured.
END

Do not modify the first or last line in any way. Simply put your new text in the place of the middle line. This text can be as many lines as you wish, but it must not contain the word "END" at the start of a line.
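If you draft the description in a separate text file first, the rule about "END" can be checked before pasting. This is a minimal sketch: heredoc_text_is_safe is a hypothetical helper that applies exactly the rule stated above (no line may start with END) and nothing more.

```shell
# heredoc_text_is_safe (hypothetical helper): fail if any line of the draft
# text starts with END; offending lines are reported on stderr with line numbers.
heredoc_text_is_safe() {
    if grep -n '^END' "$1" >&2; then
        return 1
    fi
    return 0
}

# Example:
#   heredoc_text_is_safe description.txt && echo "safe to paste into oai.pl"
```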

Policies. Next you need to define a number of policies which describe how your repository may be used. It may be helpful to visit http://www.opendoar.org/tools/en/policies.php, which has a step-by-step process to create these policies. It will even output EPrints 3 configuration code, which you can then copy and paste into the oai.pl file. These policies are:

  • Metadata Policy
  • Data Policy
  • Submission Policy

These are updated in exactly the same way as the Content Description section. Just look for the following lines:

  • $oai->{metadata_policy}->{"text"} = latin1( <<END );
  • $oai->{data_policy}->{"text"} = latin1( <<END );
  • $oai->{submission_policy}->{"text"} = latin1( <<END );

Registering

Once you register your archive (at http://www.openarchives.org) various search systems will be able to collect the metadata (titles, authors, abstract etc.) and allow more people to find records in your archive.

See http://www.openarchives.org/ for more information on the OAI protocol. For more information on setting up the OAI interface, see the section in this documentation about Configuring an Archive.

Browse Views

Every so often you should run the generate_views script on each archive in your system to regenerate the browse views section of the site.

This is a set of static pages: by default, one per subject and one per year (only years that actually contain papers, not every year ever). Some users prefer to browse the system rather than search it. This also gives search engines a way to reach, and index, the abstract pages.

See the views.pl config notes on how to edit the views it generates.

But I don't want this feature...

If you don't want to use this feature: don't, it's your archive. Remove the link from the template and front page. Don't run the generate_views script.

Setting it up

This is best done by using the UNIX "cron" command (as user "eprints"). Cron will email "eprints" on that machine with the output, so it is best to use the --quiet option so it only bothers you with errors.

How often you want to run this depends on the size of your archive, and how quickly the contents change. This feature is roughly order "n", which means that if you double the number of items in your archive then you (roughly) double the time it takes to run.

Once an hour would seem a good starting point. If your archive gets really big, say more than 10000 records, then maybe once a day is more realistic. The one thing that you don't want to happen is for a new generate_views to start before the old one finishes, as they will mess up each other's output.
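One way to guard against overlapping runs is to wrap the command in a simple lock. The sketch below uses a lock directory, because mkdir either creates it or fails in one step. run_exclusive, the lock path and the exit code 75 are all assumptions made for illustration; none of them come from EPrints.

```shell
# run_exclusive (hypothetical helper): run a command only if no earlier run
# still holds the lock directory. mkdir is atomic, so two runs cannot both win.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still holds $lockdir, skipping" >&2
        return 75   # arbitrary "try again later" code chosen for this sketch
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return $status
}

# Example, in a wrapper script called from cron (lock path is hypothetical):
#   run_exclusive "$HOME/generate_views.lock" \
#       /opt/eprints3/bin/generate_views <archiveid> --quiet
```

If the wrapped script is killed before it finishes, the lock directory is left behind and has to be removed by hand before the next run can start.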

Run generate_views on the command line to find out how long it takes, then add a line like this to the crontab:

23 * * * * /opt/eprints3/bin/generate_views <archiveid> --quiet

This runs at 23 minutes past each hour. If you have more than one archive, don't make them all start rebuilding stuff at the same time, stagger it. Otherwise once an hour everything will slow down as it fights to run several intensive scripts at once.
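For several archives, the staggered entries can be generated rather than typed by hand. This is only a sketch: stagger_views is a hypothetical helper, the archive names in the example are made up, and the /opt/eprints3 path assumes the default install location.

```shell
# stagger_views (hypothetical helper): print one hourly generate_views entry
# per archive, each starting STEP minutes after the previous one.
stagger_views() {
    step=$1
    shift
    minute=0
    for archive in "$@"; do
        printf '%d * * * * /opt/eprints3/bin/generate_views %s --quiet\n' \
            "$minute" "$archive"
        # wrap around so the minute field always stays within 0-59
        minute=$(( (minute + step) % 60 ))
    done
}

# Example: three made-up archives, 20 minutes apart
#   stagger_views 20 alpha beta gamma
```

Because the minute wraps around at 60, choose a step small enough that the number of archives times the step stays under 60; otherwise two archives end up on the same minute again.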

See the crontab man page for more information on using cron.

>man 5 crontab

Alerts

Alerts provide a way in which users of your system can receive regular updates, via email, when new items are added which match a search they specified.

To automate sending out these alerts you must add some entries in the crontab (as for views). You need one set of these per archive.

For example (with dookuprints being the name of the archive):

   # 00:15 every morning
   15 0 * * * /opt/eprints3/bin/send_alerts dookuprints daily
   # 00:30 every Sunday morning
   30 0 * * 0 /opt/eprints3/bin/send_alerts dookuprints weekly
   # 00:45 every first of the month
   45 0 1 * * /opt/eprints3/bin/send_alerts dookuprints monthly

Note the spacing out so that all 3 don't start at once and hammer the database. You may wish to change the times, but we recommend early morning as the best time to send them (midnight-6am).