API:EPrints/DataSet

Revision as of 15:12, 25 February 2010 by Tdb01r (talk | contribs)

NAME

EPrints::DataSet - a dataset is a set of records in the eprints system with the same metadata.


SYNOPSIS

 my $dataset = $repository->get_dataset( "inbox" );
 
 print sprintf("There are %d records in the inbox\n",
   $dataset->count);
 


DESCRIPTION

This module describes an EPrints dataset.

A repository has several datasets that make up the repository's database. The list of dataset ids can be obtained from the repository object (see EPrints::Repository).

A normal dataset (e.g. "user") has an associated package (e.g. EPrints::DataObj::User), which must be a subclass of EPrints::DataObj, and a number of SQL tables whose names are prefixed with the dataset name. Most datasets also have a set of associated EPrints::MetaField objects, which may be optional or compulsory depending on the type of record. For example, books have editors and posters do not, yet both are eprints.

The fields contained in a dataset are defined by the data object and by any additional fields defined in cfg.d. Some datasets don't have any fields while others may just be "virtual" datasets made from others.


cachemap, counter

These datasets have no associated package or metadata fields.


archive, buffer, inbox, deletion

All have the same package and metadata fields as eprints, but are filtered by eprint_status.
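
The relationship between these datasets and eprint can be pictured with a small model. The sketch below is plain Perl and does not call the EPrints API: the record list and the records_in helper are hypothetical stand-ins, assumed only to illustrate filtering by eprint_status.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rows of the "eprint" dataset.
my @eprints = (
    { eprintid => 1, eprint_status => "archive" },
    { eprintid => 2, eprint_status => "inbox" },
    { eprintid => 3, eprint_status => "archive" },
    { eprintid => 4, eprint_status => "buffer" },
);

# Each status dataset is modelled as a filter on eprint_status;
# "eprint" itself applies no filter.
sub records_in
{
    my( $dataset_id ) = @_;

    return @eprints if $dataset_id eq "eprint";
    return grep { $_->{eprint_status} eq $dataset_id } @eprints;
}

for my $id ( qw( eprint archive buffer inbox deletion ) )
{
    my @records = records_in( $id );
    printf( "%-8s %d record(s)\n", $id, scalar @records );
}
```

In this model every record lives in one underlying collection, and the status datasets are only different views of it.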


METHODS


Class Methods


new

$ds = EPrints::DataSet->new( %properties )

Creates and returns a new dataset based on %properties.

Requires at least repository and name properties.

Available properties:


repository

repository OBJ

Reference to the repository object.


name

name STRING

Name of the dataset.


confid

confid STRING

Name of the dataset this dataset is a subset of (e.g. 'archive' is a subset of 'eprint'). If defined, dataset_id_field is also required.


dataset_id_field

dataset_id_field STRING

Name of the text field that contains the subset dataset id.


sql_name

sql_name STRING

Name of the primary database table.


virtual

virtual BOOL

Set to 1 if this dataset doesn't require its own database tables.


type

type STRING

Type of data object the dataset contains, e.g. for EPrints::DataObj::EPrint specify "EPrint".


class

class STRING

Explicit class to use for data objects. To use the default object specify EPrints::DataObj.


filters

filters ARRAYREF

Filters to apply to this dataset before searching (see EPrints::Search).


datestamp

datestamp STRING

The field name that contains a datestamp to order this dataset by.


index

index BOOL

Whether this dataset should be indexed.


import

import BOOL

Whether you can import into this dataset.


get_system_dataset_info

$info = EPrints::DataSet::get_system_dataset_info()

Returns a hash reference of core system datasets.


Object Methods


base_id

$id = $ds->base_id

 $ds = $repo->dataset( "inbox" );
 $id = $ds->base_id; # returns "eprint"

Returns the identifier of the base dataset for this dataset (the same as id unless this dataset is virtual).


process_field

$field = $ds->process_field( $data [, $system ] )

Creates a new field in this dataset based on $data. If $system is true, the new field is defined as a "core" field.


register_field

$ds->register_field( $field [, $system ] )

Register a new field with this dataset.


unregister_field

$ds->unregister_field( $field )

Unregister a field from this dataset.


field

$metafield = $ds->field( $fieldname )

Returns the EPrints::MetaField from this dataset with the given name, or undef.


has_field

$bool = $ds->has_field( $fieldname )

True if the dataset has a field of that name.


default_order

$ordertype = $ds->default_order

Return the id string of the default order for this dataset.

For example "bytitle" for eprints.


id

$id = $ds->id

Return the id of this dataset.


count

$n = $ds->count( $session )

Return the number of records in this dataset.


get_sql_table_name

$tablename = $ds->get_sql_table_name

Return the name of the main SQL table containing this dataset. The names of the other SQL tables are based on this name.


get_sql_index_table_name

$tablename = $ds->get_sql_index_table_name

Return the name of the SQL table which contains the free text indexing information.


get_sql_grep_table_name

$tablename = $ds->get_sql_grep_table_name

Return the name of the SQL table which contains the strings to be used with LIKE in a final pass of a search.


get_sql_rindex_table_name

$tablename = $ds->get_sql_rindex_table_name

Return the name of the SQL table which contains the reverse text indexing information (used for deleting free-text indexes when removing a record).


get_ordervalues_table_name

$tablename = $ds->get_ordervalues_table_name( $langid )

Return the name of the SQL table containing values used for ordering this dataset.


get_sql_sub_table_name

$tablename = $ds->get_sql_sub_table_name( $field )

Returns the name of the SQL table which contains the information on the "multiple" field. $field is an EPrints::MetaField belonging to this dataset.


fields

@fields = $ds->fields

Returns a list of the EPrints::MetaField objects belonging to this dataset.


key_field

$field = $ds->key_field

Return the EPrints::MetaField representing the primary key field.

Always the first field.


make_object

$obj = $ds->make_object( $session, $data )

Return an object of the class associated with this dataset, always a subclass of EPrints::DataObj.

$data is a hash of values for fields of a record in this dataset.

Return $data itself if no class is associated with this dataset.
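
The two outcomes described above can be modelled in a few lines. This is only a sketch in plain Perl and not the EPrints implementation: My::DataObj, its new_from_data constructor as written here, and make_object_sketch are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical data object class standing in for an EPrints::DataObj subclass.
package My::DataObj;

sub new_from_data
{
    my( $class, $data ) = @_;
    return bless { data => $data }, $class;
}

sub value
{
    my( $self, $name ) = @_;
    return $self->{data}->{$name};
}

package main;

# Model of the dispatch: build an object of the class if there is one,
# otherwise hand $data back untouched.
sub make_object_sketch
{
    my( $class, $data ) = @_;

    return $data if !defined $class;
    return $class->new_from_data( $data );
}

my $user = make_object_sketch( "My::DataObj", { username => "alice" } );
my $raw  = make_object_sketch( undef, { counterid => 7 } );

print ref( $user ), "\n";   # an object of the associated class
print ref( $raw ), "\n";    # a plain hash reference
```

Callers that may receive a dataset without a class therefore need to be prepared for a plain hash reference as well as an object.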


create_dataobj

$obj = $ds->create_dataobj( $data )

Returns a new object in this dataset based on $data or undef on failure.

If $data describes sub-objects then those will also be created.


get_object_class

$class = $ds->get_object_class;

Return the perl class to which objects in this dataset belong.


get_object

$obj = $ds->get_object( $session, $id );

Return the object from this dataset with the given id, or undef.


dataobj

$dataobj = $ds->dataobj( $id )

Returns the object from this dataset with the given id, or undef.


get_object_from_uri

$dataobj = EPrints::DataSet->get_object_from_uri( $session, $uri )

Returns the dataobj identified by internal URI $uri.

Returns undef if $uri isn't an internal URI or the object is no longer available.


render_name

$xhtml = $ds->render_name( $session )

Return a piece of XHTML describing this dataset, in the language of the current session.


map

$ds->map( $session, $fn, $info )

Maps the function $fn onto every record in this dataset. See Search for a full explanation.
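
As a rough illustration of the calling pattern, the sketch below uses a hypothetical My::DataSet class in plain Perl and not the real EPrints::DataSet. The callback argument order (session, dataset, record, info) is an assumption made for this example and should be checked against the Search documentation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a dataset holding a few records.
package My::DataSet;

sub new
{
    my( $class, @records ) = @_;
    return bless { records => [ @records ] }, $class;
}

# Calls $fn once per record, passing the same $info on every call.
sub map
{
    my( $self, $session, $fn, $info ) = @_;

    for my $record ( @{ $self->{records} } )
    {
        $fn->( $session, $self, $record, $info );
    }

    return;
}

package main;

my $ds = My::DataSet->new(
    { title => "A", pages => 10 },
    { title => "B", pages => 20 },
    { title => "C", pages => 5 },
);

# $totals plays the role of $info: shared state the callback can update.
my $totals = { count => 0, pages => 0 };

$ds->map( undef, sub {
    my( $session, $dataset, $record, $info ) = @_;

    $info->{count}++;
    $info->{pages} += $record->{pages};
}, $totals );

print "$totals->{count} records, $totals->{pages} pages\n";
```

Passing a hash reference as $info lets the callback accumulate results without relying on global variables.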


repository

$repository = $ds->repository

Returns the EPrints::Repository to which this dataset belongs.


reindex

$ds->reindex( $session )

Recommits all the items in this dataset. This could take a very long time on a large set of records.

The name is historical: this method no longer reindexes anything and should arguably be renamed.


get_dataset_ids

@ids = EPrints::DataSet::get_dataset_ids()

Deprecated, use $repository->get_dataset_ids().


get_sql_dataset_ids

@ids = EPrints::DataSet::get_sql_dataset_ids()

Deprecated, use $repository->get_sql_dataset_ids().


count_indexes

$n = $ds->count_indexes

Return the number of indexes required for the main SQL table of this dataset. Used to check that it is not over 32 (the current maximum allowed by MySQL).

Assumes things either have 1 or 0 indexes which might not always be true.


get_item_ids

@ids = $dataset->get_item_ids( $session )

Return a list of the ids of all items in this set.


is_virtual

$bool = $dataset->is_virtual()

Returns whether this dataset is virtual (i.e. has no database tables).


get_datestamp_field

$field = $dataset->get_datestamp_field()

Returns the datestamp field for this dataset which may be used for incremental harvesting. Returns undef if no such field is available.


prepare_search

$searchexp = $ds->prepare_search( %options )

Returns an EPrints::Search for this dataset with %options.


search

$list = $ds->search( %options )

Shortcut for $ds->prepare_search( %options )->execute.


list

$list = $ds->list( $ids )

Returns an EPrints::List for this dataset for the given $ids list.


columns

$fields = $dataset->columns()

Returns the default list of fields to show the user when browsing this dataset in a table. Returns an array ref of EPrints::MetaField objects.
