Triggers

Revision as of 17:13, 25 July 2024 by Drn@ecs.soton.ac.uk

Triggers are a means of defining a function that will be called at a particular execution point (often referred to as a hook). EPrints supports them because a repository often needs to carry out a particular task when a specific event occurs (or is about to occur). EPrints has many different types of trigger: some are generic (repository triggers) and some apply to a specific dataset (dataset triggers). The most commonly implemented triggers are listed below.
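As a rough orientation, the examples on this page all follow the same registration pattern, which can be sketched as follows. This is a minimal sketch rather than a definitive template: the handler bodies are placeholders, and the file name, the choice of the "eprint" dataset and the priority value are illustrative assumptions only.

```perl
# Placed in a repository configuration file, e.g. cfg.d/z_my_triggers.pl
# (hypothetical file name).

# Repository trigger: not tied to any dataset.
$c->add_trigger( EP_TRIGGER_BEGIN_REQUEST, sub {
    my( %params ) = @_;                # parameters arrive as a hash
    my $repo = $params{repository};    # the repository object, as in the examples on this page

    # ... task to carry out at this execution point ...

    return EP_TRIGGER_OK;
}, priority => 100 );                  # optional; several examples on this page omit it

# Dataset trigger: applies only to data objects in the named dataset.
$c->add_dataset_trigger( "eprint", EP_TRIGGER_BEFORE_COMMIT, sub {
    my( %params ) = @_;
    my( $repo, $dataobj ) = @params{qw( repository dataobj )};

    # ... task to carry out for this data object ...

    return EP_TRIGGER_OK;
} );
```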

Repository Triggers

EP_TRIGGER_LOG

When logging messages to standard error. (Not currently implemented. You can use the $c->{log} setting in log.pl instead.)
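Since this trigger is not available, logging is customised through the $c->{log} setting. A minimal sketch, assuming the callback receives the repository object followed by the message (check the log.pl shipped with your EPrints version); the timestamp prefix is only an illustration.

```perl
# In cfg.d/log.pl
$c->{log} = sub
{
    my( $repository, $message ) = @_;   # ASSUMED argument order

    # prefix each message with the local time before writing to standard error
    my $stamp = scalar localtime;
    print STDERR "[$stamp] $message\n";
};
```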

EP_TRIGGER_BOILERPLATE_RDF

When generating the boilerplate RDF for the repository.

Example

Adds boilerplate triples (e.g. licence and attribution) to the RDF output.

$c->add_trigger( EP_TRIGGER_BOILERPLATE_RDF, sub {
    my( %o ) = @_;

    my $license_uri = $o{repository}->config( "rdf", "license" );
    if( defined $license_uri )
    {
        $o{graph}->add(
            subject => "<>",
            predicate => "cc:license",
            object => "<$license_uri>" );
    }
    else
    {
        $o{graph}->add(
            subject => "<>",
            predicate => "rdfs:comment",
            object => "The repository administrator has not yet configured an RDF license.",
            type => "xsd:string" );
    }

    my $attributionName = $o{repository}->config( "rdf", "attributionName" );
    if( defined $attributionName ) # only add the triple when a name is configured
    {
        $o{graph}->add(
            subject => "<>",
            predicate => "cc:attributionName",
            object => $attributionName,
            type => "xsd:string" );
    }

    my $attributionURL = $o{repository}->config( "rdf", "attributionURL" );
    if( defined $attributionURL )
    {
        $o{graph}->add(
            subject => "<>",
            predicate => "cc:attributionURL",
            object => "<$attributionURL>" );
    }
});

EP_TRIGGER_REPOSITORY_RDF

When generating repository-specific RDF.

Example

Adds repository-specific triples to the RDF output.

$c->add_trigger( EP_TRIGGER_REPOSITORY_RDF, sub {
    my( %o ) = @_;

    my $repository_uri = "<".$o{repository}->config( "base_url" )."/id/repository>";

    my @eprint_ids = sort @{$o{repository}->dataset("archive")->get_item_ids( $o{repository} )};

    my $oai_config = $o{repository}->config( "oai" );

    $o{graph}->add(
        subject => $repository_uri,
        predicate => "rdf:type",
        object => "ep:Repository" );
    $o{graph}->add(
        subject => $repository_uri,
        predicate => "dct:title",
        object => $o{repository}->phrase( "archive_name" ),
        type => "xsd:string" );
    $o{graph}->add(
        subject => $repository_uri,
        predicate => "foaf:homepage",
        object => "<".$o{repository}->config( "base_url" )."/>" );
    $o{graph}->add(
        subject => $repository_uri,
        predicate => "ep:OAIPMH2",
        object => "<".$o{repository}->config( "base_url" )."/cgi/oai2>" );

...

    my $root_subject = $o{repository}->dataset("subject")->dataobj("ROOT");
    foreach my $top_subject ( $root_subject->get_children )
    {
        $o{graph}->add(
            subject => $repository_uri,
            predicate => "ep:hasConceptScheme",
            object => "<".$top_subject->uri."#scheme>" );
    }
});

EP_TRIGGER_BEGIN

When an EPrints::Repository object is created. (Not currently implemented).

EP_TRIGGER_BEGIN_REQUEST

At the start of an HTTP request being processed.

EP_TRIGGER_END_REQUEST

At the end of processing an HTTP request.
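Taken together, EP_TRIGGER_BEGIN_REQUEST and EP_TRIGGER_END_REQUEST can bracket a request. The following is a minimal sketch, not taken from the EPrints distribution, that logs how long each request took. It assumes that both handlers receive the repository object under the repository key (as the other examples on this page do), that the same repository object is seen by both handlers, and that the hash key _request_started (a hypothetical name) is not used elsewhere.

```perl
use Time::HiRes ();

$c->add_trigger( EP_TRIGGER_BEGIN_REQUEST, sub {
    my( %params ) = @_;
    my $repo = $params{repository};
    return EP_TRIGGER_OK unless defined $repo;

    # remember when processing started (hypothetical key name)
    $repo->{_request_started} = Time::HiRes::time();

    return EP_TRIGGER_OK;
} );

$c->add_trigger( EP_TRIGGER_END_REQUEST, sub {
    my( %params ) = @_;
    my $repo = $params{repository};
    return EP_TRIGGER_OK unless defined $repo;

    # remove the marker so it cannot leak into the next request
    my $started = delete $repo->{_request_started};
    return EP_TRIGGER_OK unless defined $started;

    $repo->log( sprintf( "Request processed in %.3f seconds", Time::HiRes::time() - $started ) );

    return EP_TRIGGER_OK;
} );
```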

EP_TRIGGER_END

Just before an EPrints::Repository object is destroyed. (Not currently implemented).

EP_TRIGGER_DOC_URL_REWRITE

When rewriting a URL for a document so it can be responded to with the correct resource.

Example

Send the coversheeted version of the document rather than the original.

$c->add_trigger( EP_TRIGGER_DOC_URL_REWRITE, sub
{
    my( %args ) = @_;

    my( $request, $doc, $relations, $filename ) = @args{qw( request document relations filename )};
    return EP_TRIGGER_OK unless defined $doc && ref $doc eq "EPrints::DataObj::Document"; # To avoid missing field error with dark documents that do not get coversheeted.

    # check document is a pdf
    my $format = $doc->value( "format" ); # back compatibility
    my $mime_type = defined $doc->value( "mime_type" ) ? $doc->value( "mime_type" ) : "";
    return EP_TRIGGER_OK unless( $format eq "application/pdf" || $mime_type eq "application/pdf" || $filename =~ /\.pdf$/i );

    # ignore thumbnails e.g. http://.../8381/1.haspreviewThumbnailVersion/jacqueline-lane.pdf
    foreach my $rel ( @{$relations || []} )
    {
        return EP_TRIGGER_OK if( $rel =~ /^is\w+ThumbnailVersionOf$/ );
    }

    # ignore volatile documents
    return EP_TRIGGER_OK if $doc->has_relation( undef, "isVolatileVersionOf" );

    my $session = $doc->get_session;
    my $eprint = $doc->get_eprint;

    # search for a coversheet that can be applied to this document
    my $coversheet = EPrints::DataObj::Coversheet->search_by_eprint( $session, $eprint );
    return EP_TRIGGER_OK unless( defined $coversheet );

    # check whether there is an existing covered version and whether it needs to be regenerated
    my $current_cs_id = $doc->get_value( 'coversheetid' ) || -1; # coversheet used to cover document
    my $coverdoc; # existing covered version

    if( $coversheet->get_id == $current_cs_id )
    {
        # get the covered version of the document
        $coverdoc = $coversheet->get_coversheet_doc( $doc );
    }

    if( defined $coverdoc )
    {
        # return the covered version
        $coverdoc->set_value( "security", $doc->get_value( "security" ) );
        $request->pnotes( document => $coverdoc );
        $request->pnotes( dataobj => $coverdoc );

        # Only update request filename to coverdoc's filename if it is defined and the document filename matches that used in the request.  
        # If the document filename does not match that used in the request, then not updating the filename will mean requests with a
        # spurious filename will get a 404 error rather than returning the document under a spurious filename.
        $request->pnotes( filename => $coverdoc->get_main ) if defined $coverdoc->get_main && defined $doc->get_main && $filename eq $doc->get_main;
    }
 
    # return the uncovered document

    return EP_TRIGGER_DONE;

}, priority => 100 );

EP_TRIGGER_MEDIA_INFO

When extracting media information about a file (e.g. MIME type, resolution if image/video, duration if audio/video, etc.).

Example

Set the mime_type key of $epdata to the value looked up from the file extension in the mimemap configuration, unless a MIME type has already been set.

$c->add_trigger( EP_TRIGGER_MEDIA_INFO, sub {
    my( %params ) = @_;

    my $epdata = $params{epdata};
    my $filename = $params{filename};
    my $repo = $params{repository};

    return 0 if defined $epdata->{mime_type};

    if( $filename=~m/\.([^.]+)$/ )
    {
        my $suffix = "\L$1";
        $epdata->{mime_type} = $repo->config( "mimemap", $suffix );
    }

    return 0;
}, priority => 5000);

EP_TRIGGER_INDEX_FIELDS

When fields are being indexed.

Example

Also index fields in the Xapian index.

$c->add_trigger( EP_TRIGGER_INDEX_FIELDS, sub {
    my( %params ) = @_;

    my $repo = $params{repository};
    my $dataobj = $params{dataobj};
    my $dataset = $dataobj->dataset;

    if( !$repo->config( 'xapian', 'enabled' ) )
    {
        return;
    }

    if( !$dataset->indexable )
    {
        $repo->log( sprintf( "Xapian::Index: dataset %s is not indexable", $dataset->id ) );
        return;
    }

    # caches the Xapian indexing module in the current EPrints::Repository object
    # only relevant when a full dataset re-indexing is performed (to avoid re-connecting
    # to the Xapian DB too often)
    my $indexer = $repo->{_xapian_indexer};
    if( !defined $indexer )
    {
        $indexer = $repo->{_xapian_indexer} = EPrints::Xapian::Index->new(
            repository => $repo,
            dataset => $dataset,
        );
    }

    if( !defined $indexer )
    {
        $repo->log( "Failed to load EPrints::Xapian::Index" );
        return;
    }

    $indexer->index_dataobj( $dataobj );

    return EP_TRIGGER_OK;
} );

EP_TRIGGER_INDEX_REMOVED

When fields are being removed from the index.

Example

Remove indexing from Xapian index.

$c->add_trigger( EP_TRIGGER_INDEX_REMOVED, sub {
    my( %params ) = @_;

    my $repo = $params{repository};
    my $dataset = $params{dataset};
    my $id = $params{id};

    if( !exists $repo->{_xapian} )
    {
        $repo->{_xapian} = undef;
        $repo->{_xapian_limit} = 0;

        # if plugin disabled, don't continue
        return if !defined $repo->plugin( "Search::Xapian" );

        my $path = $repo->config( "variables_path" ) . "/xapian";
        EPrints->system->mkdir( $path ) if !-d $path;

        $repo->{_xapian} = eval { Search::Xapian::WritableDatabase->new(
            $path,
            Search::Xapian::DB_CREATE_OR_OPEN()
        ) };
        $repo->log( $@ ), return if $@;
    }

    my $db = $repo->{_xapian};
    return if !defined $db;

    my $key = "_id:/id/" . $dataset->base_id . "/" . $id;
    my $enq = $db->enquire( Search::Xapian::Query->new( $key ) );
    my @matches = $enq->matches( 0, 1 );
    if( @matches )
    {
        $db->delete_document( $matches[0]->get_docid );
    }
});

EP_TRIGGER_URL_REWRITE

When rewriting a URL so it can be responded to with the correct resource.

Example

Use the MePrints handler for serving profile pages.

$c->add_trigger( EP_TRIGGER_URL_REWRITE, sub
{
    my( %args ) = @_;

    my( $uri, $rc, $request ) = @args{ qw( uri return_code request ) };

    if( defined $uri && ($uri =~ m#^/profile/(.*)$# ) )
    {
        $request->handler('perl-script');
        $request->set_handlers(PerlResponseHandler => [ 'EPrints::Plugin::MePrints::MePrintsHandler' ] );
        ${$rc} = EPrints::Const::OK;
    }

    return EP_TRIGGER_OK;

}, priority => 100 );

EP_TRIGGER_VALIDATE_FIELD

When a metadata field is being validated.

Example

Validates that a submitted password is no longer than the maximum length specified in $c->{password_maxlength}. This protects against a Denial-of-Service attack, as password hashing is very expensive if the input is excessively long.

$c->add_trigger( EPrints::Const::EP_TRIGGER_VALIDATE_FIELD, sub
{
    my( %args ) = @_;
    my( $repo, $field, $user, $value, $problems ) = @args{qw( repository field dataobj value problems )};

    return unless defined $user && $user->isa( "EPrints::DataObj::User" ) && $field->type eq "secret";

    my $password;
    foreach my $key ( keys %{ $repo->{query}->{param} } )
    {
        my $fieldname = $field->name;
        if ( $key =~ m/$fieldname$/ )
        {
            $password = $repo->{query}->{param}->{$key}->[0];
            last;
        }
    }

    if ( defined $password && defined $repo->config( "password_maxlength" ) && length( $password ) > $repo->config( "password_maxlength" ) )
    {
        my $fieldname = $repo->xml->create_element( "span", class=>"ep_problem_field:".$field->get_name );
        $fieldname->appendChild( $field->render_name( $repo ) );
        my $maxlength_el = $repo->make_text( $repo->config( "password_maxlength" ) );
        push @$problems, $repo->html_phrase( "validate:password_too_long",
            fieldname => $fieldname,
            maxlength => $maxlength_el,
        );
    }
}, priority => 1000 );

EP_TRIGGER_LOCAL_SITEMAP_URLS

When generating URLs to add to the repository sitemap (i.e. sitemap.xml).

Example

Add creators browse view pages to sitemap.

$c->add_trigger( EP_TRIGGER_LOCAL_SITEMAP_URLS, sub
{
    my( %args ) = @_;

    my( $repository, $urlset ) = @args{qw( repository urlset )};

    $urlset->appendChild( EPrints::Utils::make_sitemap_url( $repository, {
        loc => $repository->config( "base_url" ).'/view/creators/',
        changefreq => 'monthly'
    } ) );

    return EP_TRIGGER_OK;
});

EP_TRIGGER_DYNAMIC_TEMPLATE

When the dynamic template for a web page response to a request is being generated.

Example

Add the Google Charts loader JavaScript to the head of the page template.

$c->add_trigger( $EPrints::Plugin::Stats::EP_TRIGGER_DYNAMIC_TEMPLATE, sub
{
    my( %args ) = @_;

    my( $repo, $pins ) = @args{qw/ repository pins/};

    my $protocol = $repo->get_secure ? 'https':'http';

    my $head = $repo->make_doc_fragment;

    $head->appendChild( $repo->make_javascript( undef,
        src => "$protocol://www.google.com/jsapi"
    ) );

    $head->appendChild( $repo->make_javascript( 'google.charts.load("current", {packages:["corechart", "geochart"]});' ) );

    if( defined $pins->{'utf-8.head'} )
    {
        $pins->{'utf-8.head'} .= $repo->xhtml->to_xhtml( $head );
    }

    if( defined $pins->{head} )
    {
        $head->appendChild( $pins->{head} );
        $pins->{head} = $head;
    }
    else
    {
        $pins->{head} = $head;
    }

    return EP_TRIGGER_OK;
} );

EP_TRIGGER_THUMBNAIL_TYPES

When determining which thumbnail types to generate for the file(s) associated with a document.
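A minimal sketch of a handler for this trigger. It assumes the handler is passed the document as dataobj and an array reference of thumbnail type names as list; these parameter names are an assumption and should be checked against EPrints::DataObj::Document in your version. The type name "poster" is hypothetical, and a matching thumbnail size or conversion would also need to be configured for it to have any effect.

```perl
$c->add_trigger( EP_TRIGGER_THUMBNAIL_TYPES, sub {
    my( %params ) = @_;
    my( $doc, $list ) = @params{qw( dataobj list )};   # ASSUMED parameter names

    return EP_TRIGGER_OK unless defined $doc && ref $list eq "ARRAY";

    # only request the extra thumbnail type for image documents
    my $mime_type = $doc->value( "mime_type" );
    return EP_TRIGGER_OK unless defined $mime_type && $mime_type =~ m{^image/};

    # "poster" is a hypothetical thumbnail type; avoid adding it twice
    push @$list, "poster" unless grep { $_ eq "poster" } @$list;

    return EP_TRIGGER_OK;
} );
```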

Dataset Triggers

EP_TRIGGER_CREATED

When a new data object is being created.

Example

Creates a default list object for a user.

$c->add_dataset_trigger(
    'user',
    EPrints::Const::EP_TRIGGER_CREATED,
    sub {
        my (%args) = @_;
        my ( $session, $user ) = @args{qw( repository dataobj )};
        EPrints::Lists::Utils::create_default_list( $session, $user );
    }
);

EP_TRIGGER_RDF

When RDF for a data object is being generated.

Example

Adds certain SKOS triples to RDF output.

$c->add_dataset_trigger( "eprint", EP_TRIGGER_RDF, sub {
    my( %o ) = @_;
    my $eprint = $o{"dataobj"};
    my $eprint_uri = "<".$eprint->uri.">";

    return if ! $eprint->dataset->has_field( "subjects" );
    return if ! $eprint->is_set( "subjects" );

    foreach my $subject_id ( @{$eprint->get_value( "subjects" )} )
    {
        my $subject = $o{repository}->dataset( "subject" )->dataobj( $subject_id );
        if( $subject )
        {
            my $subject_uri = "<".$subject->uri.">";
            $o{graph}->add(
                subject => $subject_uri,
                predicate => "rdf:type",
                object => "skos:Concept" );
            foreach my $name ( @{$subject->get_value( "name" )} )
            {
                $o{graph}->add(
                    subject => $subject_uri,
                    predicate => "skos:prefLabel",
                    object => $name->{name},
                    type => "literal",
                    lang => $name->{lang} );
            }
            $o{graph}->add(
                subject => $eprint_uri,
                predicate => "dct:subject",
                object => $subject_uri );
        }
    }
});

EP_TRIGGER_DEFAULTS

When a new data object is setting its default values. (Not currently implemented. You can use the $c->{set_<DATASET>_defaults} setting in document_fields_default.pl, eprint_fields_default.pl or user_fields_default.pl instead.)
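Because this trigger is not implemented, defaults are set through the configuration callback instead. A minimal sketch for the eprint dataset, assuming the callback is passed a hash reference of the new object's data followed by the repository object (check the eprint_fields_default.pl shipped with your EPrints version); the chosen default value is illustrative.

```perl
# In cfg.d/eprint_fields_default.pl
$c->{set_eprint_defaults} = sub
{
    my( $data, $repository ) = @_;   # ASSUMED argument order

    # give new eprints a default type unless one has already been supplied
    $data->{type} = "article" unless defined $data->{type};
};
```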

EP_TRIGGER_STATUS_CHANGE

When the status of a data object changes (e.g. when an eprint is moved to the live archive).

Example

Add an indexer task to coin/mint a DataCite DOI for the eprint.

$c->add_dataset_trigger( "eprint", EP_TRIGGER_STATUS_CHANGE , sub {
    my ( %params ) = @_;

    my $repository = $params{repository};

    return undef if (!defined $repository);

    if (defined $params{dataobj}) {
        my $dataobj = $params{dataobj};
        my $eprint_id = $dataobj->id;
        $repository->dataset( "event_queue" )->create_dataobj({
            pluginid => "Event::DataCiteEvent",
            action => "datacite_doi",
            params => [$dataobj->internal_uri],
        });
    }

});

EP_TRIGGER_BEFORE_COMMIT

Just before the changes to a data object are saved to the database.

Example

Set or otherwise tidy up the path field for the list object.

$c->add_dataset_trigger( "list", EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub {
    my (%args) = @_;
    my ( $session, $list ) = @args{qw( repository dataobj )};
    if ( !$list->is_set("path") ) {
        $list->set_value( "path", EPrints::DataObj::List::tidy_path( $list->get_value("title") ) );
    }
    else {
        my $path = $list->get_value("path");
        my $tidy_path = EPrints::DataObj::List::tidy_path($path);
        $list->set_value( "path", $tidy_path ) if $path ne $tidy_path;
    }
});

EP_TRIGGER_AFTER_COMMIT

Just after the changes to a data object have been saved to the database.

Example

Remove the cached static page for the user's profile, so that it can be re-generated and re-cached.

$c->add_dataset_trigger( "user", EPrints::Const::EP_TRIGGER_AFTER_COMMIT, sub {
    my( %params ) = @_;

    my $repo = $params{repository};
    my $user = $params{dataobj};
    my $changed = $params{changed};

    if( scalar( keys %{$changed||{}} ) )
    {
        $user->remove_static();
    }
});

EP_TRIGGER_VALIDATE

When the data object fields are being checked for their validity.
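A minimal sketch of a whole-object validation handler. It assumes, by analogy with the EP_TRIGGER_VALIDATE_FIELD example above, that the handler receives the data object as dataobj and an array reference as problems, to which rendered problem descriptions are pushed; check this against your EPrints version before relying on it. The phrase ID validate:missing_abstract is hypothetical and would need to be defined in a phrase file.

```perl
$c->add_dataset_trigger( "eprint", EP_TRIGGER_VALIDATE, sub
{
    my( %args ) = @_;
    my( $repo, $eprint, $problems ) = @args{qw( repository dataobj problems )}; # ASSUMED parameter names

    return EP_TRIGGER_OK unless defined $eprint && ref $problems eq "ARRAY";

    # skip repositories whose eprint dataset has no abstract field
    return EP_TRIGGER_OK unless $eprint->dataset->has_field( "abstract" );

    if( !$eprint->is_set( "abstract" ) )
    {
        # hypothetical phrase ID
        push @$problems, $repo->html_phrase( "validate:missing_abstract" );
    }

    return EP_TRIGGER_OK;
} );
```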

EP_TRIGGER_WARNINGS

When a data object is generating warnings about its state (e.g. missing or inappropriate metadata field values).

EP_TRIGGER_FILES_MODIFIED

When a file associated with a data object has been modified (e.g. new thumbnails will need to be generated if the file they are based on has been changed or added).

Example

Generate special preview versions for certain documents.

use File::Path qw( remove_tree ); # provides remove_tree(), used below to clear old previews

$c->add_dataset_trigger( "document", EPrints::Const::EP_TRIGGER_FILES_MODIFIED, sub
{
    my( %args ) = @_;
    my( $session, $doc ) = @args{qw( repository dataobj )};

    my $eprint = $doc->get_parent;

    # abridged from bin/generate_previews
    my $dir = $session->get_repository->get_conf("archiveroot" )."/cfg/static/previews/".$eprint->get_id;

    return unless $doc->get_value("main") =~ /\.(pdf|doc|docx|xls|xlsx|ppt|pptx|md)$/i or $doc->get_value( "format" ) eq "code";

    $dir .= "/" . $doc->get_id;
    my $preview_source = $session->get_repository->get_conf("archiveroot" ) . "/cfg/static/previews/" . $eprint->get_id . "/" . $doc->get_id;

    # clear any old versions
    if( -e $preview_source ) 
    {
        remove_tree $preview_source;
    }

    my $file_path = $doc->local_path . '/' . $doc->get_main;

    my $script = $session->get_repository->get_conf("base_path") . "/ingredients/preview_generate/bin/previews/make_pages.sh";
    if( $doc->get_value( "format" ) eq "code" )
    {
        $script = $session->get_repository->get_conf("base_path") . "/ingredients/preview_generate/bin/previews/make_code_preview.sh";
    }

    my $docid = $doc->get_id;
    `$script "$file_path" "$dir" "$docid"`;

    $eprint->generate_static();
});

EP_TRIGGER_REMOVED

When a data object is destroyed.

Example

Remove indexing from Xapian if eprint is removed.

$c->add_dataset_trigger( "eprint", EP_TRIGGER_REMOVED, sub {
    my( %params ) = @_;

    my $repo = $params{repository};
    my $dataobj = $params{dataobj};
    my $dataset = $dataobj->dataset;

    if( !$repo->config( 'xapian', 'enabled' ) )
    {
        return;
    }
    EPrints::DataObj::EventQueue->create_unique( $repo, {
        pluginid => "Event::Xapian",
        action => "remove_index",
        params => [$dataobj->get_id],
    });
    return EP_TRIGGER_OK;
});