Autocompletion

From EPrints Documentation
Revision as of 23:54, 23 August 2007 by AvuPhu (talk | contribs)

For a how-to, please see Autocompletion and Authority Files (Romeo Autocomplete)

Autocompletion in EPrints 3 consists of several stages.

  • A field in the workflow is configured with the autocompletion URL to use, plus any additional parameters to pass to the script. This URL must be on the same server (e.g. foo.eprints.org) but does not have to be part of the EPrints system.
  • The autocomplete script takes the text typed so far (and possibly the additional parameters) and returns a chunk of XML describing possible autocomplete options. This XML consists of a number of rows (how many is up to the script).
  • Each row contains some HTML to show to the user, plus a magic <ul> block which is hidden from display but is used by the autocomplete JavaScript to fill in the fields on the page.

Autocomplete Scripts

EPrints autocomplete scripts live in /opt/eprints3/cgi/users/lookup/. You can add your own here, or elsewhere if, for example, you need to use PHP.

There are several kinds of autocomplete scripts:

  • those that just use the existing data in your repository (these are dead easy as they work out of the box)
  • ones which use a file which you place in your repository's cfg/autocomplete/ directory.
  • more clever ones.

You may be able to find new autocomplete scripts and authority files on http://files.eprints.org/

Scripts are in (rough) order of complexity to use...

journal_by_name

Can only be used on the "publication" field. Looks up the publication in the existing publications in the repository and autocompletes the publication. If ISSN and/or publisher exist in the same input component as the journal field they will also be completed if data is available.

journal_by_issn

As above, but attached to the ISSN field.

event_by_name

Similar to journal_by_name. Is attached to the event_title field and autocompletes from existing repository data. If they are in the same (multi) input component it will also try and autocomplete event_location, event_dates and event_type.

name

Attached to a multiple compound name/id field (e.g. creators), this looks up the name in the existing list in the repository. It can match on any of id, given name or family name, and populates all the parts of the current row that it can.

title_duplicates

This is a slightly odd script as it doesn't actually provide any autocomplete data. What it does is search the list of existing titles to see if there is a match. It only searches if there are 5 or more characters entered so far.

If it finds any matches it lists them with a warning that they might be a problem, but does not assist autocompletion. If many matches are found then only a short title is shown; if the list has 4 or fewer entries then a full citation is shown.
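
The thresholds described above can be sketched in a few lines of Perl. This is only an illustration: the function name duplicate_display and its return values are made up for this example and are not taken from the shipped title_duplicates script.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the thresholds described in the text.
# It is NOT the code of the real title_duplicates script.
sub duplicate_display {
    my ( $typed, $match_count ) = @_;
    return 'skip'     if length($typed) < 5;    # fewer than 5 characters: no search
    return 'none'     if $match_count == 0;     # nothing found, nothing to warn about
    return 'citation' if $match_count <= 4;     # 4 or fewer matches: full citation
    return 'title';                             # many matches: short title only
}

print duplicate_display( 'Badg',            0 ), "\n";
print duplicate_display( 'Badgers of Kent', 3 ), "\n";
print duplicate_display( 'Badgers of Kent', 9 ), "\n";
```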

This is set to "on" by default in the hope that it will reduce duplicate submissions.

simple_file

simple_file needs an additional parameter to be passed to it, which is configured in the workflow. This parameter is the name of a file in the cfg/autocomplete directory. The file contains a list of values which are searched (case insensitively) and the matches are returned. A second parameter of "mode=prefix" can be set to match only values which start with the text being typed, rather than values which contain it.
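
The difference between the default matching and "mode=prefix" can be shown with a small sketch. The helper match_values and the sample values below are hypothetical and exist only to illustrate the matching rule; the real simple_file script may be implemented quite differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not the shipped simple_file script: return the
# values that match the text typed so far, ignoring case.
#   $mode eq 'prefix' -> the value must START with the typed text
#   any other $mode   -> the value may CONTAIN the typed text anywhere
sub match_values {
    my ( $typed, $mode, @values ) = @_;
    my $needle = lc($typed);
    return grep {
        my $pos = index( lc($_), $needle );
        $mode eq 'prefix' ? $pos == 0 : $pos >= 0;
    } @values;
}

# Made-up list standing in for the lines of the values file.
my @values = ( 'Badger Studies', 'Journal of Badgers', 'Otter Review' );

print "contains: $_\n" for match_values( 'badg', 'contains', @values );
print "prefix:   $_\n" for match_values( 'badg', 'prefix',   @values );
```

With the sample list, typing "badg" matches both badger titles in the default mode but only "Badger Studies" in prefix mode.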

simple_sql

Similar to simple_file but gets its values from a database table.

The table must be in the EPrints database used by this repository and its name must start with "ac_". The script needs a parameter, passed from the workflow, giving the name of the table WITHOUT the ac_ prefix. E.g. if the table is "ac_badgers" the parameter is "table=badgers". The only column used is "value", which works like the lines in the text file. If you want this to be blindingly fast, make sure "value" is indexed and set mode=prefix. With those set, autocompleting from a dictionary of half a million words worked cheerfully.
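
As a rough illustration of why the index plus mode=prefix combination is fast, here is a sketch of how such a lookup query could be built. The function build_lookup_query is hypothetical, and the SQL is an assumption about one reasonable way to query an ac_ table in MySQL, not the query the shipped script runs. A pattern like 'bad%' lets the database use an ordinary index on "value", whereas '%bad%' forces it to examine every row.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the shipped simple_sql script: build the SQL
# and the bind value for a lookup in an "ac_" table.
sub build_lookup_query {
    my ( $table, $typed, $mode ) = @_;

    # The table name is interpolated into the SQL, so accept only
    # plain word characters.
    die "bad table name\n" unless $table =~ /\A\w+\z/;

    # Escape LIKE wildcards (and the escape character itself) so the
    # typed text is matched literally. Backslash is assumed to be the
    # LIKE escape character, as it is by default in MySQL.
    ( my $escaped = $typed ) =~ s/([\\%_])/\\$1/g;

    # Prefix mode anchors the pattern at the start of the value.
    my $pattern = $mode eq 'prefix' ? "$escaped%" : "%$escaped%";

    my $sql = "SELECT value FROM ac_$table WHERE value LIKE ? ORDER BY value";
    return ( $sql, $pattern );
}

my ( $sql, $pattern ) = build_lookup_query( 'badgers', 'bad', 'prefix' );
print "$sql\n$pattern\n";

# With a DBI handle in $dbh (an assumption for this example) the query
# could then be run with:
#   my $values = $dbh->selectcol_arrayref( $sql, undef, $pattern );
```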

romeo

(not included in 3.0, expected in 3.1) This script uses the EPrints/Romeo data to provide journal autocomplete data. Should be attached to the publication field. This is almost identical to file, but inserts the required Powered by Sherpa note.

url_name_value

This works like simple_sql except that it uses three columns: url, name and value. It searches and autocompletes using value, but the human-readable description is supplied by "name", and if url is set then a (more info) link is shown. The link opens in a new window to avoid mid-form trauma.

file

This is for more complex autocompletion authority files. It works like simple_file except that the file format is more complicated.

The file consists of lines which contain:

  • a value to search, (eg. "African Journal of Agricultural Research")
  • a tab
  • a <li> autocomplete chunk. (with no line breaks) eg.
 <li style='border-right: solid 50px #30FF30' >
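
To make the line format concrete, here is a minimal sketch of splitting one such line into its two parts. The function parse_authority_line is hypothetical, and the <li> chunk in the sample line is invented for illustration; it is not a complete or correct autocomplete chunk.

```perl
use strict;
use warnings;

# Hypothetical parser for one line of a "file" authority file:
# the search value, a tab, then the <li> chunk on the same line.
sub parse_authority_line {
    my ($line) = @_;
    chomp $line;

    # Split on the FIRST tab only, in case the chunk itself contains tabs.
    my ( $value, $chunk ) = split /\t/, $line, 2;

    # Reject lines with no tab or with an empty search value.
    return unless defined $chunk && length $value;
    return { value => $value, chunk => $chunk };
}

# Made-up sample line; the <li> content is illustrative only.
my $line = "African Journal of Agricultural Research\t"
         . "<li>African Journal of Agricultural Research</li>\n";

my $entry = parse_authority_line($line);
print "search value: $entry->{value}\n";
print "chunk:        $entry->{chunk}\n";
```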

external source

This takes all the ideas above, and extends them to make an API call to an external data source. This has the advantage that you are always referring to the authoritative source, but the disadvantage that you are reliant on both the network being up and the external source being available.

It breaks down into two parts:

  • the autocompleter call in the web page
  • the script being called

For an example, here is one way to query the RoMEO data directly:

First, set the autocompleter in the eprints workflow:

     <component type="Field::Multi">
       <title>Article Publication Details</title>
       <field ref="publication" input_lookup_url="{$config{perl_url}}/get_journals" />
       <field ref="publisher" />
       <field ref="issn" />
     </component>

Next, create the script that this URL calls:

use strict;
use HTTP::Request;
use LWP::UserAgent;
use XML::Twig;

use Data::Dumper;
use EPrints;

my $journal_data = {};

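# Decode a URL-encoded string: '+' becomes a space, %XX becomes the byte it encodes
sub urldecode{
  my ($url) = @_;
  $url =~ s/\x2B/ /g; # swap every '+' for ' ' first, so an encoded %2B survives as '+'
  $url =~ s/%([0-9a-f][0-9a-f])/pack("C",hex($1))/egi;
  return $url;
}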

# XML::Twig's routine for dealing with a journal entry
sub process_journal {
  my ( $twig, $journal ) = @_;

  # get the components
  my $title = urldecode( $journal->first_child('jtitle')->text );

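  # optional elements: each variable stays undefined when its element is absent
  # (declared separately, because "my $x = ... if ..." has unreliable behaviour in Perl)
  my ( $zetoc, $romeo, $issn );
  $zetoc = urldecode( $journal->first_child('zetocpub')->text )
                  if $journal->first_child('zetocpub');
  $romeo = urldecode( $journal->first_child('romeopub')->text )
                  if $journal->first_child('romeopub');
  $issn  = urldecode( $journal->first_child('issn')->text )
                  if $journal->first_child('issn');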

  my $publisher = $romeo;
  $publisher = $zetoc if (not $publisher