Talk:API:EPrints

From EPrints Documentation
Revision as of 00:19, 17 September 2009 by Cjg (talk | contribs) (dataset)

I'm going to use this page to get my thoughts in order. Cjg 16:58, 2 September 2009 (BST)

Current 3.1 System

Unsessioned Classes

These classes don't store a session internally, so their methods look like $foo->render( $session, ARGS ).

  • Repository
  • DataSet
  • MetaField
  • Language

Sessioned Classes

These classes store a session internally, so their methods look like $foo->render( ARGS ).

  • Session
  • DataObj
  • Plugin
  • List
  • Search
  • Database
  • Workflow
  • ScreenProcessor
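
The difference between the two groups can be sketched with toy classes. The packages Toy::Session, Toy::Unsessioned and Toy::Sessioned are hypothetical and are not part of EPrints; only the calling convention is taken from the notes above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a session; not the real EPrints::Session.
package Toy::Session;
sub new
{
	my( $class, %opts ) = @_;
	return bless { lang => $opts{lang} }, $class;
}
sub lang { return $_[0]->{lang}; }

# Unsessioned style: the caller passes the session on every call.
package Toy::Unsessioned;
sub new { return bless {}, $_[0]; }
sub render
{
	my( $self, $session, $text ) = @_;
	return "[" . $session->lang . "] " . $text;
}

# Sessioned style: the session is stored once, at construction.
package Toy::Sessioned;
sub new
{
	my( $class, %opts ) = @_;
	return bless { session => $opts{session} }, $class;
}
sub render
{
	my( $self, $text ) = @_;
	return "[" . $self->{session}->lang . "] " . $text;
}

package main;

my $session = Toy::Session->new( lang => "en" );
my $plain = Toy::Unsessioned->new;
my $stored = Toy::Sessioned->new( session => $session );

print $plain->render( $session, "hello" ), "\n";
print $stored->render( "hello" ), "\n";
```

Both calls produce the same string; the only difference is who carries the session around.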

API

New Plan(!)

  • Merge Session and Repository into a single class.
  • Move XML functions into their own class.
  • Move Page functions into their own class.
  • Add a link to repository for dataset and metafield.
  • Ensure cleanup when repository object goes out of scope.
  • Make EPrints->new() return an eprints object which can pass out repository objects.
    • repository objects don't have a link to the EPrints object EVER.
    • When the eprints object is DESTROY'd it takes out the repositories, datasets and metafields etc.
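
A minimal sketch of the ownership rule in the last bullet, assuming plain reference counting does the work. Toy::EPrints and Toy::Repository are hypothetical stand-ins, and cleanup() and is_open() are invented for the illustration; the real classes may well end up different.

```perl
use strict;
use warnings;

# Hypothetical toy classes; the real EPrints internals may differ.
package Toy::Repository;
sub new
{
	my( $class, $id ) = @_;
	return bless { id => $id, open => 1 }, $class;
}
sub id { return $_[0]->{id}; }
sub is_open { return $_[0]->{open}; }
# Release whatever the repository holds (database handles, caches, ...).
sub cleanup { $_[0]->{open} = 0; return; }

package Toy::EPrints;
sub new { return bless { repositories => {} }, $_[0]; }
# Hand out a repository and remember it. The repository gets no
# reference back to this object, so no reference cycle is created.
sub repository
{
	my( $self, $id ) = @_;
	$self->{repositories}->{$id} ||= Toy::Repository->new( $id );
	return $self->{repositories}->{$id};
}
# When the owner is destroyed, take the repositories down with it.
sub DESTROY
{
	my( $self ) = @_;
	$_->cleanup for values %{ $self->{repositories} };
	return;
}

package main;

my $ep = Toy::EPrints->new;
my $repo = $ep->repository( "devel" );
print $repo->id, " open: ", $repo->is_open, "\n";

undef $ep; # last reference gone, so DESTROY runs here
print $repo->id, " open: ", $repo->is_open, "\n";
```

Because the link only runs from the owner to the repositories, the owner can be freed while callers still hold repository objects, and those objects are left in a cleaned-up state.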

EPrints

$ep = EPrints->new();
$repo = $ep->repository( "devel", noise=>1 );
$repo = $ep->current_repository(); # from Apache::Request URI
EPrints->abort( $message );

repo

Yes:

$xml = $repo->xml;
$dataset = $repo->dataset( "user" );
$user = $repo->current_user;
$query = $repo->query;
$current_page_url = $repo->current_url; 
$config_element = $repo->config( $key, [@subkeys] );
$repo->log( $message );
$string = $repo->query->param( "X" );
$repo->redirect( $url );

Maybe:

  • tdb: language accessors
    • cjg: I think these don't belong in the API - why do people need access to them? If I'm wrong, we add them in API 1.1?
$lang = $repo->language( $langid );
$lang = $repo->default_language; # needed? nee ->language( undef )
$lang = $repo->current_language;
  • cjg: I think these would be very useful but they are inconsistent.
$eprint = $repo->eprint( 23 );
$user = $repo->user( 23 );
$user = $repo->user_by_username( "cjg" );
$user = $repo->user_by_email( 'cjg@ecs.soton.ac.uk' );

dataset

$string = $dataset->base_id; # eprint
$string = $dataset->id; # inbox
$dataobj = $dataset->create_dataobj( $data );
$user = $dataset->dataobj( 23 );

$search = $dataset->prepare_search( %options );
$list = $dataset->search( %options ); # prepare_search( %options )->execute
$list = $dataset->search; # match ALL

$metafield = $dataset->field( $fieldname );
$metafield = $dataset->key_field;
@metafields = $dataset->fields; 

$dataset->search->map( sub {}, $ctx );
$n = $dataset->search->count; 
$ids = $dataset->search->ids;

Maybe:

  • tdb: is this common enough to replace EPrints::List->new()?
    • cjg: part of the API goal IMO is to get away from classnames if at all possible, so this would become the proper way to create a list object.
$list = $dataset->list( \@ids ); 

Maybe not:

  • pm5: why not this one? I think that's handy.
    • cjg: because we could use (defined $dataset->field( $fieldname ))
$bool = $dataset->has_field( $fieldname );
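
cjg's alternative can be sketched like this, assuming field() returns undef for an unknown field name. Toy::DataSet is a hypothetical stand-in, not the real DataSet class.

```perl
use strict;
use warnings;

# Hypothetical toy dataset; only field() mirrors the listing above.
package Toy::DataSet;
sub new
{
	my( $class, @fieldnames ) = @_;
	my %fields = map { ( $_ => { name => $_ } ) } @fieldnames;
	return bless { fields => \%fields }, $class;
}
# Assumption: an unknown field name yields undef rather than an error.
sub field
{
	my( $self, $fieldname ) = @_;
	return $self->{fields}->{$fieldname};
}

package main;

my $dataset = Toy::DataSet->new( qw( title abstract ) );
for my $fieldname ( qw( title creators ) )
{
	my $bool = defined $dataset->field( $fieldname );
	print "$fieldname: ", ( $bool ? "yes" : "no" ), "\n";
}
```

This only works as a replacement for has_field if field() stays silent on a miss; if it logs or aborts on unknown names, the separate method earns its keep.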

list

$n = $list->count;
$list->map( sub {}, $ctx );
$dataobj = $list->item( $offset );
@dataobjs = $list->slice( $offset, $length ); 
@ids = $list->ids;
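
A rough, in-memory sketch of how these list calls might behave. Toy::List is hypothetical, and the callback signature used by map here (item first, then the caller's context) is an assumption; the listing above does not say what arguments the callback receives.

```perl
use strict;
use warnings;

# Hypothetical in-memory list; a real list would be backed by the database.
package Toy::List;
sub new
{
	my( $class, @items ) = @_;
	return bless { items => [ @items ] }, $class;
}
sub count { return scalar @{ $_[0]->{items} }; }
sub item
{
	my( $self, $offset ) = @_;
	return $self->{items}->[$offset];
}
# Return up to $length items starting at $offset, clipped to the list end.
sub slice
{
	my( $self, $offset, $length ) = @_;
	my $last = $offset + $length - 1;
	$last = $self->count - 1 if $last > $self->count - 1;
	return @{ $self->{items} }[ $offset .. $last ];
}
# Call $fn once per item, passing the caller's context through untouched.
sub map
{
	my( $self, $fn, $ctx ) = @_;
	$fn->( $_, $ctx ) for @{ $self->{items} };
	return;
}

package main;

my $list = Toy::List->new( qw( a b c d ) );
my $ctx = { seen => 0 };
$list->map( sub { my( $item, $info ) = @_; $info->{seen}++; }, $ctx );
print $list->count, " items, ", $ctx->{seen}, " visited\n";
print join( ",", $list->slice( 1, 2 ) ), "\n";
```

Passing $ctx explicitly lets the callback accumulate results without closing over outer variables.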

search

$search = $dataset->prepare_search( order => "-date", satisfy_all=>1 );
$search = $dataset->prepare_search( query => [], filters => [], ... );

$list = $search->execute;

# the below call needs replacing with something less sucky.
$search->add_term( %opts ); # better name needed?
$search->add_term( $ds->field( "type" ), qw/ article book /, "EQ", "ANY" );

Alternatives

I would like to make the value argument not automagically-split on ANY but explicitly use an array ref of values to match.

my $s = $dataset->prepare_search( order => "-date", satisfy_all => 1, query => [
 { fields => ["type"], values => [qw/article book/], merge => EPrints::Search::ANY },
 { fields => [qw( title abstract )], values => ["dogbert dilbert"], match => EPrints::Search::INDEX, merge => EPrints::Search::ALL },
 { fields => ["userid"], values => ["52"], match => EPrints::Search::EXACT },
] );

my $s = $dataset->prepare_search( order => "-date", satisfy_all => 1 );
$s->add_query( ["type"], [qw/article book/], merge => EPrints::Search::ANY );
$s->add_query( [qw( title abstract )], ["dogbert dilbert"], match => EPrints::Search::INDEX, merge => EPrints::Search::ALL );
$s->add_query( ["userid"], ["52"], match => EPrints::Search::EXACT );
my $list = $s->execute;

$s->add_query( [$dataset->field( "type" )], [qw/ article book/], merge => EPrints::Search::ANY );
print $list->count, "\n";
  • Cjg 18:27, 16 September 2009 (BST) : I kinda like where this is going but don't like the ugly constants. Also, it'd be much cleaner if you could allow array ref or scalar as the field list and value list.
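
Cjg's suggestion of accepting either an array ref or a scalar could be sketched as below. as_list() and this add_query() are hypothetical helpers that only record the normalised condition; the string values "ANY" and "ALL" stand in for the constants and are likewise an assumption, not a decided API.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a scalar or an array ref and
# always hand back an array ref.
sub as_list
{
	my( $arg ) = @_;
	return [] if !defined $arg;
	return ref( $arg ) eq "ARRAY" ? $arg : [ $arg ];
}

# Hypothetical stand-in for add_query; it only records the
# normalised condition so the effect can be inspected.
sub add_query
{
	my( $conditions, $fields, $values, %opts ) = @_;
	push @{$conditions}, {
		fields => as_list( $fields ),
		values => as_list( $values ),
		%opts,
	};
	return;
}

my @conditions;
add_query( \@conditions, "type", [qw/ article book /], merge => "ANY" );
add_query( \@conditions, [qw( title abstract )], "dogbert dilbert", merge => "ALL" );

for my $c ( @conditions )
{
	print join( ",", @{ $c->{fields} } ), " => ", join( "|", @{ $c->{values} } ), "\n";
}
```

Normalising at the boundary keeps the rest of the search code working with array refs only, whichever form the caller used.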

XML

$doc = $xml->parse_string( $string );
$doc = $xml->parse_file( $filename );
$doc = $xml->parse_url( $url );
$utf8_string = $xml->to_string( $dom_node, %opts ); 
$utf8_string = $xml->to_plain_text( $dom_node, %opts ); # nee tree_to_utf8
$utf8_string = $xml->to_xhtml( $dom_node, %opts ); # remove NS prefixes, fix <script> etc.
$utf8_string = $xml->text_content( $dom_node ); # textual content?
$dom_node = $xml->clone( $dom_node ); # deep
$dom_node = $xml->clone_node( $dom_node ); # shallow
$dom_node = $xml->create_element( $name, %attr );
$dom_node = $xml->create_text_node( $value );
$dom_node = $xml->create_comment_node( $value );
$dom_node = $xml->create_document_fragment;
$xml->dispose( $dom_node );
$page = $xml->build_page( %opts );
$xhtml_dom_node = $xml->html_phrase( $phrase_id );
$utf8string = $xml->text_phrase( $phrase_id );
$xhtml_dom_node = $xml->render_ruler;
$xhtml_dom_node = $xml->render_nbsp;
$xhtml_dom_node = $xml->render_link( $url, %opts ); #nb will require clever hack if scalar @opts = 1;
#pm5: this method makes me sad. I don't think it's very well named, and I'm not sure it's any less typing than doing it the hard way.
$xhtml_dom_node = $xml->render_name( $namehash, render_order=>"gf" ); # or fg 
$xhtml_dom_node = $xml->render_input_field( %opts ); # nb. noenter & hidden are now options.
$xhtml_dom_node = $xml->render_form( $method, $url );
  • Cjg: Could we shorten this to $xml->ruler $xml->text etc...?
    • TDB: move these to EPrints::XHTML and strip ^render_?
      • ah, in which case XHTML would also have: build_page, to_xhtml and text_content, html_phrase should move there and become "phrase" , in which case maybe $xml->text_phrase() should move back to $repository? Cjg 18:31, 16 September 2009 (BST)
        • pm5: firstly, most of this looks good, so well done. Is there a reason why we have create_element but appendChild? I know the camel casing is a hand-me-down from the various XML libraries; might it be worth wrapping those, or camel-casing everything in this class (just for consistency)? Also, I'm not sure about render_; what about create_, the result being createLink, createInputField etc.?
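
On the render_link note above about "scalar @opts = 1": one possible version of the hack is to treat a lone trailing argument as the link target and anything else as key/value options. link_options() is a hypothetical helper, and reading the single argument as a target is an assumption.

```perl
use strict;
use warnings;

# Hypothetical argument parser for a render_link-style method.
# Assumption: a single trailing argument is a bare link target.
sub link_options
{
	my( $url, @opts ) = @_;
	my %opts = @opts == 1 ? ( target => $opts[0] ) : @opts;
	return ( href => $url, %opts );
}

my %old_style = link_options( "http://example.org/", "_blank" );
my %new_style = link_options( "http://example.org/", target => "_blank" );
print "$old_style{target} $new_style{target}\n";
```

The catch is that a caller who passes one key and forgets its value gets it silently treated as a target, which may be reason enough to drop the old form.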

Page

$page->send( %options ); 
$page->write_to_file( $filename );

DataObj

$dataobj = $dataset->dataobj( $id );
$dataobj->remove;
$dataobj->commit;
$dataobj->set_value( $fieldname, $value );
$value = $dataobj->value( $fieldname );
$boolean = $dataobj->is_set( $fieldname );
$id = $dataobj->id;
$xhtml = $dataobj->render_value( $fieldname );
$xhtml = $dataobj->render_citation( $style, %opts );
($xhtml, $title, $head_elements) = $dataobj->render;
$uri = $dataobj->uri;
$url = $dataobj->url;
$string = $dataobj->export( $plugin_id, %opts );

EPrint

@documents = $eprint->documents;
$eprint->set_status( "inbox" );
$document = $eprint->add_document( $doc_data );

User

$user = $repository->dataset( "user" )->dataobj( 23 );
$user->email( .... ) ??

Subject

$subject = $repository->dataset("subject")->dataobj( "FOO" );
@subjects = $subject->children;
@subjects = $subject->parents; 
# issue with confusion with subobjects like document, or are these really subobjects?

Document

my $document = $repo->dataset( "document" )->dataobj( 23 );
foreach my $document ( $eprint->documents ) { .. }

# create a new document on $eprint 
my $doc_data = { .. };
my $document = $eprint->add_document( $doc_data );
# CJG: Could this be more generic as a way of adding subobjects? Should it?

# Add files to the document  
$success = $doc->add_file( $file, $filename );
$success = $doc->upload( $filehandle, $filename [, $preserve_path [, $filesize ] ] );
$success = $doc->upload_archive( $filehandle, $filename, $archive_format );
$success = $doc->add_archive( $file, $archive_format );
$success = $doc->add_directory( $directory );
$success = $doc->upload_url( $url );

# eprint to which this document belongs
$eprint = $doc->parent;

# delete a document object *forever*:
# pm5: this method name isn't severe enough; can we have something more "be careful" sounding,
# like $doc->destroy or $doc->permanently_delete_forever
$success = $doc->remove;

$url = $doc->url( [$file] );
$path = $doc->local_path;
%files = $doc->files;

# delete a file
$success = $doc->remove_file( $filename );
# delete all files
$success = $doc->remove_all_files;

# change the file which is used as the URL for the document.
$doc->set_value( "main", $main_file );

# icons and previews???? These need work!
$xhtml = $doc->render_icon_link( %opts );
$xhtml = $doc->render_preview_link( %opts );

File

MetaField

Now has a handle on both its repository/session AND dataset.

my $field = $dataset->field( $fieldname );
$field->set_property( $property, $value );

$name = $field->name;
$type = $field->type;
$value = $field->property( $property );
$xhtml = $field->render_name;
$xhtml = $field->render_help;
# render_value should be called via dataobj
$xhtml = $field->render_value_label( $value );
  • cjg: inconsistency with dataset, which is currently returning lists as @ not \@
$values = $field->values( %opts );
$sorted_list = $field->sort_values( $unsorted_list );