API:EPrints/DataSet
Contents
- 1 NAME
- 2 SYNOPSIS
- 3 DESCRIPTION
- 4 METHODS
- 4.1 Class Methods
- 4.2 Object Methods
- 4.2.1 base_id
- 4.2.2 process_field
- 4.2.3 register_field
- 4.2.4 unregister_field
- 4.2.5 field
- 4.2.6 has_field
- 4.2.7 default_order
- 4.2.8 id
- 4.2.9 count
- 4.2.10 get_sql_table_name
- 4.2.11 get_sql_index_table_name
- 4.2.12 get_sql_grep_table_name
- 4.2.13 get_sql_rindex_table_name
- 4.2.14 get_ordervalues_table_name
- 4.2.15 get_sql_sub_table_name
- 4.2.16 fields
- 4.2.17 key_field
- 4.2.18 make_object
- 4.2.19 create_dataobj
- 4.2.20 get_object_class
- 4.2.21 get_object
- 4.2.22 dataobj
- 4.2.23 get_object_from_uri
- 4.2.24 render_name
- 4.2.25 map
- 4.2.26 repository
- 4.2.27 reindex
- 4.2.28 get_dataset_ids
- 4.2.29 get_sql_dataset_ids
- 4.2.30 count_indexes
- 4.2.31 get_item_ids
- 4.2.32 is_virtual
- 4.2.33 get_datestamp_field
- 4.2.34 prepare_search
- 4.2.35 search
- 4.2.36 list
- 4.2.37 columns
NAME
EPrints::DataSet - a dataset is a set of records in the eprints system with the same metadata.
SYNOPSIS
my $dataset = $repository->get_dataset( "inbox" );
print sprintf("There are %d records in the inbox\n", $dataset->count);
DESCRIPTION
This module describes an EPrint dataset.
A repository has several datasets that make up the repository's database. The list of dataset ids can be obtained from the repository object (see EPrints::Repository).
A normal dataset (e.g. "user") has an associated package (e.g. EPrints::DataObj::User), which must be a subclass of EPrints::DataObj, and a number of SQL tables whose names are prefixed with the dataset name. Most datasets also have a set of associated EPrints::MetaField objects, which may be optional or compulsory depending on the type: for example, books have editors and posters do not, yet both are EPrints.
The fields contained in a dataset are defined by the data object and by any additional fields defined in cfg.d. Some datasets don't have any fields while others may just be "virtual" datasets made from others.
cachemap, counter
Don't have a package or metadata fields associated.
archive, buffer, inbox, deletion
All have the same package and metadata fields as eprints, but are filtered by eprint_status.
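The relationship between a filtered dataset and its base can be inspected with the id, base_id and is_virtual methods documented under Object Methods. The following is a minimal sketch, not part of the module: the helper name describe_dataset is hypothetical, and it accepts any object that provides those three methods.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise how a dataset relates to its base dataset.
# $ds is expected to be an EPrints::DataSet (anything offering id, base_id
# and is_virtual will do).
sub describe_dataset
{
    my( $ds ) = @_;

    my $id   = $ds->id;
    my $base = $ds->base_id;

    return $ds->is_virtual
        ? "$id (virtual, based on $base)"
        : "$id (has its own tables)";
}
```

Given a repository object, describe_dataset( $repository->dataset( "inbox" ) ) would be expected to report a virtual dataset based on "eprint", assuming the inbox dataset is flagged as virtual in your installation.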
METHODS
Class Methods
new
$ds = EPrints::DataSet->new( %properties )
Creates and returns a new dataset based on %properties.
Requires at least repository and name properties.
Available properties:
repository
repository OBJ
Reference to the repository object.
name
name STRING
Name of the dataset.
confid
confid STRING
Name of the dataset that this dataset is a subset of (e.g. 'archive' is a subset of 'eprint'). If defined, dataset_id_field is also required.
dataset_id_field
dataset_id_field
Name of the text field that contains the subset dataset id.
sql_name
sql_name STRING
Name of the primary database table.
virtual
virtual BOOL
Set to 1 if this dataset doesn't require its own database tables.
type
type STRING
Type of data object the dataset contains e.g. for EPrints::DataObj::EPrint specify "EPrint".
class
class STRING
Explicit class to use for data objects. To use the default object specify EPrints::DataObj.
filters
filters ARRAYREF
Filters to apply to this dataset before searching (see EPrints::Search).
datestamp
datestamp STRING
The field name that contains a datestamp to order this dataset by.
index
index BOOL
Whether this dataset should be indexed.
import
import BOOL
Whether you can import into this dataset.
get_system_dataset_info
$info = EPrints::DataSet::get_system_dataset_info()
Returns a hash reference of core system datasets.
Object Methods
base_id
$id = $ds->base_id
$ds = $repo->dataset( "inbox" );
$id = $ds->base_id; # returns "eprint"
Returns the identifier of the base dataset for this dataset (same as id unless this dataset is virtual).
process_field
$field = $ds->process_field( $data [, $system ] )
Creates a new field in this dataset based on $data. If $system is true, the new field is defined as a "core" field.
register_field
$ds->register_field( $field [, $system ] )
Register a new field with this dataset.
unregister_field
$ds->unregister_field( $field )
Unregister a field from this dataset.
field
$metafield = $ds->field( $fieldname )
Returns the EPrints::MetaField from this dataset with the given name, or undef.
has_field
$bool = $ds->has_field( $fieldname )
True if the dataset has a field of that name.
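Because field returns undef for an unknown name, it can help to check names with has_field first. A minimal sketch of that pattern follows; the helper name existing_fields is hypothetical and only calls has_field as documented above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the subset of @names that exist as fields
# in the dataset, so a later $ds->field( $name ) call will not return undef.
sub existing_fields
{
    my( $ds, @names ) = @_;

    return grep { $ds->has_field( $_ ) } @names;
}
```

A caller could then write my @metafields = map { $ds->field( $_ ) } existing_fields( $ds, @wanted ); where @wanted holds whatever field names the caller is interested in.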
default_order
$ordertype = $ds->default_order
Return the id string of the default order for this dataset.
For example "bytitle" for eprints.
id
$id = $ds->id
Return the id of this dataset.
count
$n = $ds->count( $session )
Return the number of records in this dataset.
get_sql_table_name
$tablename = $ds->get_sql_table_name
Return the name of the main SQL table containing this dataset. The names of the other SQL tables are based on this name.
get_sql_index_table_name
$tablename = $ds->get_sql_index_table_name
Return the name of the SQL table which contains the free text indexing information.
get_sql_grep_table_name
$tablename = $ds->get_sql_grep_table_name
Return the name of the SQL table which contains the strings to be used with LIKE in a final pass of a search.
get_sql_rindex_table_name
$tablename = $ds->get_sql_rindex_table_name
Return the name of the SQL table which contains the reverse text indexing information. (Used for deleting freetext indexes when removing a record.)
get_ordervalues_table_name
$tablename = $ds->get_ordervalues_table_name( $langid )
Return the name of the SQL table containing values used for ordering this dataset.
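The table-name accessors described so far can be gathered in one place, for instance when debugging a schema. This is only a sketch: the helper name dataset_table_names and the hash keys are hypothetical, and each call simply forwards to the method of the same name documented above.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the SQL table names a dataset reports.
# $langid is passed straight through to get_ordervalues_table_name.
sub dataset_table_names
{
    my( $ds, $langid ) = @_;

    return {
        "main"        => $ds->get_sql_table_name,
        "index"       => $ds->get_sql_index_table_name,
        "grep"        => $ds->get_sql_grep_table_name,
        "rindex"      => $ds->get_sql_rindex_table_name,
        "ordervalues" => $ds->get_ordervalues_table_name( $langid ),
    };
}
```

The returned hash reference can be printed or compared against the tables actually present in the database.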
get_sql_sub_table_name
$tablename = $ds->get_sql_sub_table_name( $field )
Returns the name of the SQL table which contains the information on the "multiple" field. $field is an EPrints::MetaField belonging to this dataset.
fields
@fields = $ds->fields
Returns a list of the EPrints::MetaField objects belonging to this dataset.
key_field
$field = $ds->key_field
Return the EPrints::MetaField representing the primary key field.
Always the first field.
make_object
$obj = $ds->make_object( $session, $data )
Return an object of the class associated with this dataset, always a subclass of EPrints::DataObj.
$data is a hash of values for fields of a record in this dataset.
Returns $data itself if no class is associated with this dataset.
create_dataobj
$obj = $ds->create_dataobj( $data )
Returns a new object in this dataset based on $data or undef on failure.
If $data describes sub-objects then those will also be created.
get_object_class
$class = $ds->get_object_class;
Return the perl class to which objects in this dataset belong.
get_object
$obj = $ds->get_object( $session, $id );
Return the object from this dataset with the given id, or undefined.
dataobj
$dataobj = $ds->dataobj( $id )
Returns the object from this dataset with the given id, or undefined.
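Since dataobj returns undef when no record matches, callers usually need to check the result before using it. A minimal sketch of one way to do that follows; the helper name require_dataobj and the error message are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a record by id and fail loudly if it is
# missing, because dataobj() returns undef for unknown ids.
sub require_dataobj
{
    my( $ds, $id ) = @_;

    my $dataobj = $ds->dataobj( $id );
    if( !defined $dataobj )
    {
        die sprintf( "No record with id %s in dataset %s\n", $id, $ds->id );
    }

    return $dataobj;
}
```

Wrapping the call in eval { ... } lets the caller decide whether a missing record is fatal.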
get_object_from_uri
$dataobj = EPrints::DataSet->get_object_from_uri( $session, $uri )
Returns the dataobj identified by internal URI $uri.
Returns undef if $uri isn't an internal URI or the object is no longer available.
render_name
$xhtml = $ds->render_name( $session )
Return a piece of XHTML describing this dataset, in the language of the current session.
map
$ds->map( $session, $fn, $info )
Maps the function $fn onto every record in this dataset. See Search for a full explanation.
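As an illustration of map, the sketch below counts records by visiting each one. It assumes the callback is invoked as $fn->( $session, $dataset, $item, $info ), which this page does not confirm; check the map documentation for your EPrints version. The helper name count_by_map and the "seen" key are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records in a dataset by visiting each
# one with map(). ASSUMPTION: the callback receives
# ( $session, $dataset, $item, $info ) in that order.
sub count_by_map
{
    my( $ds, $session ) = @_;

    my $info = { seen => 0 };
    $ds->map( $session, sub {
        my( $cb_session, $cb_dataset, $item, $state ) = @_;
        $state->{seen}++;
    }, $info );

    return $info->{seen};
}
```

For a plain record count the count method is simpler; the point here is only to show how $info can carry state between callback invocations.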
repository
$repository = $ds->repository
Returns the EPrints::Repository to which this dataset belongs.
reindex
$ds->reindex( $session )
Recommits all the items in this dataset. This could take a very long time on a large set of records.
The method name is misleading, as it no longer performs a reindex.
get_dataset_ids
@ids = EPrints::DataSet::get_dataset_ids()
Deprecated, use $repository->get_dataset_ids().
get_sql_dataset_ids
@ids = EPrints::DataSet::get_sql_dataset_ids()
Deprecated, use $repository->get_sql_dataset_ids().
count_indexes
$n = $ds->count_indexes
Return the number of indexes required for the main SQL table of this dataset. Used to check it's not over 32 (the current maximum allowed by MySQL).
Assumes things either have 1 or 0 indexes which might not always be true.
get_item_ids
@ids = $dataset->get_item_ids( $session )
Return a list of the ids of all items in this set.
is_virtual
$bool = $dataset->is_virtual()
Returns whether this dataset is virtual (i.e. has no database tables).
get_datestamp_field
$field = $dataset->get_datestamp_field()
Returns the datestamp field for this dataset which may be used for incremental harvesting. Returns undef if no such field is available.
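Code that supports incremental harvesting has to cope with datasets that have no datestamp field. A minimal sketch follows; the helper name datestamp_field_name is hypothetical, and it assumes the returned EPrints::MetaField object offers a get_name method.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the dataset's datestamp field,
# or undef if the dataset has none.
# ASSUMPTION: the field object provides get_name.
sub datestamp_field_name
{
    my( $ds ) = @_;

    my $field = $ds->get_datestamp_field;

    return defined $field ? $field->get_name : undef;
}
```

A harvester could skip any dataset for which this returns undef.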
prepare_search
$searchexp = $ds->prepare_search( %options )
Returns an EPrints::Search for this dataset with %options.
search
$list = $ds->search( %options )
Shortcut for prepare_search( %options )->execute.
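A sketch of how search might be used to count records matching a single field value follows. The shape of the filters option (an array of hashes with meta_fields and value keys) and the count method on the returned list are assumptions drawn from EPrints::Search and EPrints::List, not from this page; the helper name count_matching is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count records whose $fieldname matches $value.
# ASSUMPTIONS: search() accepts a "filters" option shaped as below, and
# the returned list object provides count().
sub count_matching
{
    my( $ds, $fieldname, $value ) = @_;

    my $list = $ds->search(
        filters => [
            { meta_fields => [ $fieldname ], value => $value },
        ],
    );

    return $list->count;
}
```

The same options could be passed to prepare_search instead when the search expression needs further adjustment before it is executed.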
list
$list = $ds->list( $ids )
Returns an EPrints::List for this dataset for the given $ids list.
columns
$fields = $dataset->columns()
Returns the default list of fields to show the user when browsing this dataset in a table. Returns an array ref of EPrints::MetaField objects.