Talk:API:EPrints

I'm going to use this page to get my thoughts in order. Cjg 16:58, 2 September 2009 (BST)

Current 3.1 System

Unsessioned Classes

These classes don't store a session internally resulting in methods like $foo->render( $session, ARGS ).

Repository
DataSet
MetaField
Language

Sessioned Classes

These classes store a session internally resulting in methods like $foo->render( ARGS ).

Session
DataObj
Plugin
List
Search
Database
Workflow
ScreenProcessor

API

New Plan(!)

Merge Session and Repository into a single class.
Move XML functions into their own class.
Move Page functions into their own class.
Add a link to repository for dataset and metafield.
Ensure cleanup when repository object goes out of scope.
Make EPrints->new() return an eprints object which can pass out repository objects.
- repository objects don't have a link to the EPrints object EVER.
- When the eprints object is DESTROY'd it takes out the repositories, datasets and metafields etc.

Stage 0.9 API

We aim to finish these modules in the next few days, the 2nd list below require some more thought and are less central.

EPrints

$ep = EPrints->new();
$repo = $ep->repository( "devel", noise=>1 );
$repo = $ep->current_repository(); # from Apache::Request URI
EPrints->abort( $message );

repo

Yes:

$xml = $repo->xml;
$dataset = $repo->dataset( "user" );
$user = $repo->current_user;
$query = $repo->query;
$current_page_url = $repo->current_url; 
$config_element = $repo->config( $key, [@subkeys] );
$repository->log( $message ); 
$string = $repo->query->param( "X" );
$repo->redirect( $url );

Maybe:

tdb: language accessors
- cjg: I think thses don't belong in the API - why do people need access to them? If I'm wrong we add them in API 1.1?

$lang = $repo->language( $langid );
$lang = $repo->default_language; # needed? nee ->language( undef )
$lang = $repo->current_language;

cjg: I think these would be very useful but they are inconsistent.

$eprint = $repo->eprint( 23 );
$user = $repo->user( 23 );
$user = $repo->user_by_username( "cjg" );
$user = $repo->user_by_email( 'cjg@ecs.soton.ac.uk' );

dataset

$string = $dataset->base_id; # eprint
$string = $dataset->id; # inbox
$dataobj = $datasset->create_dataobj( $data );
$user = $dataset->dataobj( 23 );

$search = $dataset->prepare_search( %options );
$list = $dataset->search( %options ); # prepare_search( %options )->execute
$list = $dataset->search; # match ALL

$metafield = $dataset->field( $fieldname );
$metafield = $dataset->key_field;
@metafields = $dataset->fields; 

$dataset->search->map( sub {}, $ctx );
$n = $dataset->search->count; 
$ids = $dataset->search->ids;

Maybe:

tdb: is this common enough to replace EPrints::List->new()?
- cjg: part of the API goal IMO is to get away from classnames if at all possible, so this would become the proper way to create a list object.

$list = $dataset->list( \@ids );

Maybe not:

pm5 why not this one? i think thats handy
- cjg: because we could use (defined $dataset->field( $fieldname ))

$bool = $dataset->has_field( $fieldname );

list

$n = $list->count;
$list->map( sub {}, $ctx );
$dataobj = $list->item( offset );
@dataobjs = $list->slice( offset, length ); 
# tdb: consistent with creating a list: unbounded returns should use array ref?
\@ids = $list->ids;

XML

$doc = $xml->parse_string( $string );
$doc = $xml->parse_file( $filename );
$doc = $xml->parse_url( $url );

$utf8_string = $xml->to_string( $dom_node, %opts );
# tdb: to_text needs a better name
$utf8_string = $xml->to_text( $dom_node, %opts ); # nee tree_to_utf8

$dom_node = $xml->clone( $dom_node ); # deep
$dom_node = $xml->clone_node( $dom_node ); # shallow

$dom_node = $xml->contents_of( $dom_node ); # clone and return child nodes
$utf8_string = $xml->text_contents_of( $dom_node ); # clone and return text child nodes

$dom_node = $xml->create_element( $name, %attr );
$dom_node = $xml->create_text_node( $value );
$dom_node = $xml->create_comment_node( $value );
$dom_node = $xml->create_document_fragment;

$xml->dispose( $dom_node );

- - - pm5 firstly most of this looks good so well done. is there a reason why we have create_element but appendChild? i know the camel casing is sort of a hand me down from the various xml libraries might it be worth wrapping those or camel casing everything in this class (just for consistancy). Also im not sure about render_ what about create_ the result being createLink, createInputField etc
      - cjg: because appendChild is a DOM method, not part of the EPrints API. All EPrints API methods use the "foo_bar" style of method name. DOM is a defined standard we can't alter.

XHTML

$xhtml = $repo->xhtml;

$utf8_string = $xhtml->to_xhtml( $dom_node, %opts ); # remove NS prefixes, fix <script> etc.

$xhtml_dom_node = $xhtml->input_field( $name, $value, %opts ); # nb. type & noenter are now options.
$xhtml_dom_node = $xhtml->hidden_field( $name, $value, %opts ); # tdb: this is used *a lot* and is well defined
$xhtml_dom_node = $xhtml->text_area_field( $name, $value, %opts ); # tdb: value becomes a child text node 
$xhtml_dom_node = $xhtml->form( $method, $url );

$page = $xhtml->build_page( %opts );

# $repo->xhtml->phrase( $phrase_id )
$xhtml_dom_node = $xhtml->phrase( $phrase_id, %pins );

move this to repository? Cjg 20:03, 17 September 2009 (BST)

# is this an XHTML function or move to Repository?
$utf8string = $xhtml->text_phrase( $phrase_id, %pins );

# Maybe not (due to making Patrick sad, and just as easy to do with create_element( "a", href=>... )
$xhtml_dom_node = $xhtml->link( $url, %opts ); #nb will require clever hack if scalar @opts = 1;

tdb: these are really not useful - nbsp is a kludge, ruler should be done by CSS
- Cjg: nbsp is just a convenient way of inserting a unicode character which is frequently used, so I vote to keep it in, name() is also useful IMO. Ruler can be easily replaced with a phrase call, and isn't used much in EP3+ so I can happily see that go into the west and diminish.

$xhtml_dom_node = $xhtml->ruler;
$xhtml_dom_node = $xhtml->nbsp;
$xhtml_dom_node = $xhtml->name( $namehash, render_order=>"gf" ); # or fg

Page

$page->send( %options ); 
$page->write_to_file( $filename );

DataObj

$dataobj = $dataset->dataobj( $id );
$dataobj->delete;
$dataobj->commit( $force );
$dataobj->set_value( $fieldname, $value );
$value = $dataobj->value( $fieldname );
\@value = $dataobj->value( $fieldname ); # multiple
$boolean = $dataobj->is_set( $fieldname );
$id = $dataobj->id;
$xhtml = $dataobj->render_value( $fieldname );
$xhtml = $dataobj->render_citation( $style, %opts );
($xhtml, $title, $head_elements) = $dataobj->render;
$uri = $dataobj->uri;
$url = $dataobj->url;
$string = $dataobj->export( $plugin_id, %opts );
$dataobj = $dataobj->create_subobject( $fieldname, $epdata );

cjg: would render() be better replaced with something which returns a hash not an array?

MetaField

Now has a handle on both it's repository/session AND dataset.

my $field = $dataset->field( $fieldname );
$field->set_property( $property, $value );

$name = $field->name;
$type = $field->type;
$value = $field->property( $property );

$xhtml = $field->render_name;
$xhtml = $field->render_help;
$xhtml = $field->render_value_label( $value );

$values = $field->values( %opts );
$sorted_list = $field->sort_values( $unsorted_list );

Stage 1.0 API

This is still required to make the basic API complete, but we'll worry about once we've got the stuff above finalisedish.

search

TERM_EQUALS = "EQ"
TERM_INDEX = "IN"
TERM_EXACT = "EX"

MATCH_ALL = "ALL"
MATCH_ANY = "ANY"

$search = $dataset->prepare_search( order => "-date", satisfy_all=>1 );
$search = $dataset->prepare_search( terms => [], filters => [], ... );

$list = $search->execute;

# constructor interface
$s = $dataset->prepare_search( order => "-date", satisfy_all => 1, terms => [
 { fields => "type", values => [qw( article book )], op => TERM_EQUALS, match => MATCH_ANY },
 { fields => [qw( title abstract )], values => ["dogbert dilbert"], op => TERM_INDEX, match => MATCH_ALL },
 { fields => "userid", values => "52", op => TERM_EXACT },
] );

tdb: meta_fields/fields/field, value/values?
- cjg: could have choice of fields=>["",""] or field=>""

# method interface
$s = $dataset->prepare_search( order => "-date", satisfy_all => 1 );
$s->add_term( "type", [qw( article book )], op => TERM_EQUALS, match => MATCH_ANY );
$s->add_term( [qw( title abstract )], ["dogbert dilbert"], op => TERM_INDEX, match => MATCH_ALL );
$s->add_term( "userid", "52", op => TERM_EXACT );
$list = $s->execute;

print $list->count, "\n";

tdb: I would like to make the value argument not automagically-split on ANY but explicitly use an array ref of values to match.
- cjg: Agreed.

Maybe

$s->add_term( $dataset->field( "type" ), [qw/ article book/], merge => MERGE_ANY );

cjg: I'm not convinced we need this.

SubObject

tdb: superclass of Document, File, History, SavedSearch

$dataobj = $dataobj->parent;

EPrint

$eprint->set_status( "inbox" );

tdb: consistency - ->documents ~ ->value( "documents" ) and ->value() returns array ref for multiple

my $docs = $eprint->documents( %opts ); # isVolatileVersionOf => 0 ?
foreach my $document ( @$docs ) { .. }

# create a new document on $eprint 
$doc_data = { ... };
$doc = $eprint->create_subobject( "documents", $doc_data ); 
# eprint to which this document belongs
$eprint = $doc->parent;

User

$user = $repository->dataset( "user" )->dataobj( 23 );

tdb: is this accessor or action? c.f. send_page()
- It's an action. Cjg 20:04, 17 September 2009 (BST)

$user->send_email( .... )

Subject

$subject = $repository->dataset("subject")->dataobj( "FOO" );
\@subjects = $subject->children;
\@subjects = $subject->parents;

CJG: issue with confusion with subobjects like document, or are these really subobjects? Maybe child_subjects parent_subjects to avoid overlapping names?
tdb: subobject relation = if you remove parent, children are removed - is this true of subjects? Could extend subobject concept to be M:N
- $subject->value( "children" ) = ->children
- $subject->value( "parents" ) = ->parents

Document

$doc = $repo->dataset( "document" )->dataobj( 23 );

# delete a document object *forever*:
$ok = $doc->delete;

$url = $doc->url( [$file] );

tdb: is $file ^^ a file object or filename?
- Cjg 20:05, 17 September 2009 (BST): Duh, we should just use $doc->file_by_filename( $file )->get_url;

# change the file which is used as the URL for the document.
$doc->set_value( "main", "foo.html" );

# dealing with files
$file = $doc->file_by_filename( $filename ); # nee get_stored_file
\@files = $doc->value( "files" );
$file->value( "filename" );
$file->value( "filesize" );

# delete a file
$doc->file_by_filename( $filename )->delete;

# delete all files
$files = $doc->value( "files" );
foreach my $file (@$files) { $file->delete; } 

# icons and previews???? These need work!
$xhtml = $doc->render_icon_link( %opts );
$xhtml = $doc->render_preview_link( %opts );

Maybe

# Add files to the document
# tdb: these should be deprecated from Document and moved to Import/Screen plugins
$success = $doc->upload( $filehandle, $filename [, $preserve_path [, $filesize ] ] );
$success = $doc->upload_archive( $filehandle, $filename, $archive_format );
$success = $doc->add_archive( $file, $archive_format );
$success = $doc->add_directory( $directory );
$success = $doc->upload_url( $url );

File

$file = $history->file_by_filename( "dataobj.xml" );
$file->retrieve( sub { ... }, $info ); # callback-based file content retrieval

I don't know enough about this method to know what we *must* have in. Less is better IMO, though. Cjg 01:23, 17 September 2009 (BST)
tdb: files should probably be only created via a parent object

Talk:API:EPrints

Contents

Current 3.1 System

Unsessioned Classes

Sessioned Classes

API

Stage 0.9 API

EPrints

repo

dataset

list

XML

XHTML

Page

DataObj

MetaField

Stage 1.0 API

search

Maybe

SubObject

EPrint

User

Subject

Document

Maybe

File

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Wiki

Tools