API:EPrints/DataSet

From EPrints Documentation
Revision as of 18:29, 11 August 2009 by Tdb01r

NAME

EPrints::DataSet - a dataset is a set of records in the eprints system with the same metadata.

DESCRIPTION

This module describes an EPrints dataset.

A repository has one of each type of dataset:

cachemap, counter, user, archive, buffer, inbox, document, subject, saved_search, deletion, eprint, access.

A normal dataset (e.g. "user") has an associated package (e.g. EPrints::DataObj::User), which must be a subclass of EPrints::DataObj, and a number of SQL tables whose names are prefixed with the dataset name. Most datasets also have a set of associated EPrints::MetaField objects, which may be optional or compulsory depending on the type. For example, books have editors but posters do not, yet both are EPrints.

Datasets have some default fields plus additional ones configured in Fields.pm.

But there are some exceptions:

 cachemap, counter

These do not have an associated package or metadata fields.

 archive, buffer, inbox, deletion

All have the same package and metadata fields as eprints, but are filtered by eprint_status.

new_stub

 $ds = EPrints::DataSet->new_stub( $id )

Creates a dataset object without any fields. This is useful for avoiding problems where something needed while setting up a dataset would itself depend on that dataset having been loaded. The stub can still be queried about other things, such as SQL table names.

new

 $ds = EPrints::DataSet->new( $repository, $id )

Return the dataset specified by $id.

Note that a dataset knows its $repository and vice versa, which means they will not get garbage collected.

get_field

 $metafield = $ds->get_field( $fieldname )

Return a MetaField object describing the requested field in this dataset, or undef if there is no such field.

has_field

 $bool = $ds->has_field( $fieldname )

True if the dataset has a field of that name.
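As a rough illustration of combining has_field and get_field, the sketch below picks the first field that exists from a list of candidate names. The helper first_available_field and the MockDataSet stand-in are hypothetical and only exist so the example runs without an EPrints installation; with a real dataset only the two documented method calls would be used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EPrints): return the field object for the
# first candidate name the dataset has, or undef if it has none of them.
sub first_available_field
{
    my( $ds, @candidates ) = @_;

    foreach my $fieldname ( @candidates )
    {
        return $ds->get_field( $fieldname ) if $ds->has_field( $fieldname );
    }

    return undef;
}

# Stand-in for an EPrints::DataSet so the sketch is self-contained.
# It imitates only the two documented methods used above.
package MockDataSet;

sub new
{
    my( $class, %fields ) = @_;
    my $self = { fields => \%fields };
    return bless $self, $class;
}

sub has_field { my( $self, $name ) = @_; return exists $self->{fields}->{$name}; }
sub get_field { my( $self, $name ) = @_; return $self->{fields}->{$name}; }

package main;

# In the mock, a plain string stands in for an EPrints::MetaField object.
my $ds = MockDataSet->new( datestamp => "datestamp-field" );
my $field = first_available_field( $ds, "lastmod", "datestamp" );
print defined $field ? "using $field\n" : "no candidate field\n";
```

Checking has_field first keeps the lookup of a possibly missing field explicit at the call site.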

default_order

 $ordertype = $ds->default_order

Return the id string of the default order for this dataset.

For example "bytitle" for eprints.

confid

 $confid = $ds->confid

Return the string to use when getting configuration for this dataset.

archive, buffer, inbox and deletion all return "eprint" as they must have the same configuration.

id

 $id = $ds->id

Return the id of this dataset.

count

 $n = $ds->count( $session )

Return the number of records in this dataset.

get_sql_table_name

 $tablename = $ds->get_sql_table_name

Return the name of the main SQL table containing this dataset. The names of the other SQL tables are based on this name.

get_sql_index_table_name

 $tablename = $ds->get_sql_index_table_name

Return the name of the SQL table which contains the free text indexing information.

get_sql_grep_table_name

 $tablename = $ds->get_sql_grep_table_name

Return the name of the SQL table which contains the strings to be used with LIKE in a final pass of a search.

get_sql_rindex_table_name

 $tablename = $ds->get_sql_rindex_table_name

Return the name of the SQL table which contains the reverse text indexing information. (Used for deleting freetext indexes when removing a record.)

get_ordervalues_table_name

 $tablename = $ds->get_ordervalues_table_name( $langid )

Return the name of the SQL table containing values used for ordering this dataset.

get_sql_sub_table_name

 $tablename = $ds->get_sql_sub_table_name( $field )

Returns the name of the SQL table which contains the information on the "multiple" field. $field is an EPrints::MetaField belonging to this dataset.

get_fields

 $fields = $ds->get_fields

Returns a list of the EPrints::MetaField objects belonging to this dataset.

get_key_field

 $field = $ds->get_key_field

Return the EPrints::MetaField representing the primary key field. Always the first field.

make_object

 $obj = $ds->make_object( $session, $data )

Return an object of the class associated with this dataset, always a subclass of EPrints::DataObj.

$data is a hash of values for fields of a record in this dataset.

Return $data unchanged if no class is associated with this dataset.

create_object

 $obj = $ds->create_object( $session, $data )

Create a new object in the given dataset. Return the new object.

Return undef if the object could not be created.

If $data describes sub-objects too then those will also be created.

get_object_class

 $class = $ds->get_object_class;

Return the perl class to which objects in this dataset belong.

get_object

 $obj = $ds->get_object( $session, $id );

Return the object from this dataset with the given id, or undef if there is no such object.

render_name

 $xhtml = $ds->render_name( $session )

Return a piece of XHTML describing this dataset, in the language of the current session.

map

 $ds->map( $session, $fn, $info )

Maps the function $fn onto every record in this dataset. See Search for a full explanation.
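A minimal sketch of the map pattern follows. The callback argument order ( $session, $dataset, $item, $info ) is an assumption not stated on this page, so check the Search documentation before relying on it. MockDataSet is a hypothetical stand-in that imitates only map, so the example runs without EPrints.

```perl
use strict;
use warnings;

# Callback in the shape assumed here: ( $session, $dataset, $item, $info ).
# $info is a caller-supplied reference used to accumulate results.
sub count_item
{
    my( $session, $dataset, $item, $info ) = @_;

    $info->{count}++;
    return;
}

# Stand-in for an EPrints::DataSet; it calls $fn once per stored item.
package MockDataSet;

sub new
{
    my( $class, @items ) = @_;
    my $self = { items => \@items };
    return bless $self, $class;
}

sub map
{
    my( $self, $session, $fn, $info ) = @_;

    foreach my $item ( @{ $self->{items} } )
    {
        $fn->( $session, $self, $item, $info );
    }
    return;
}

package main;

my $ds = MockDataSet->new( "a", "b", "c" );
my $info = { count => 0 };

# No real session is needed by the mock, so undef is passed in its place.
$ds->map( undef, \&count_item, $info );
print "records visited: $info->{count}\n";
```

Passing state through $info avoids global variables and lets the same callback be reused across datasets.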

get_repository

 $repository = $ds->get_repository

Returns the EPrints::Repository to which this dataset belongs.

reindex

 $ds->reindex( $session )

Recommits all the items in this dataset. This could take a very long time on a large set of records.

This method should no longer be called reindex, as it no longer reindexes.

get_dataset_ids

 @ids = EPrints::DataSet::get_dataset_ids()

Return a list of all dataset ids.

get_sql_dataset_ids

 @ids = EPrints::DataSet::get_sql_dataset_ids

Return a list of all dataset ids of datasets which are directly mapped into SQL (not counters or cache which work a bit differently).

count_indexes

 $n = $ds->count_indexes

Return the number of indexes required for the main SQL table of this dataset. Used to check that it is not over 32 (the current maximum allowed by MySQL).

Assumes things either have 1 or 0 indexes which might not always be true.
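The check described above might look like the following sketch. The limit of 32 is taken from the text and should be verified for the MySQL version in use. The helper check_index_budget and the MockDataSet stand-in are hypothetical; only count_indexes is a documented method.

```perl
use strict;
use warnings;

# Limit quoted in the text above; an assumption to confirm for your database.
use constant MAX_INDEXES => 32;

# Hypothetical helper: returns ( $ok, $n ), where $ok is true when the
# dataset's main table needs no more than MAX_INDEXES indexes.
sub check_index_budget
{
    my( $ds ) = @_;

    my $n = $ds->count_indexes;
    return( ( $n <= MAX_INDEXES ? 1 : 0 ), $n );
}

# Stand-in for an EPrints::DataSet that reports a fixed index count.
package MockDataSet;

sub new
{
    my( $class, $n ) = @_;
    return bless { n => $n }, $class;
}

sub count_indexes { my( $self ) = @_; return $self->{n}; }

package main;

foreach my $count ( 12, 40 )
{
    my( $ok, $n ) = check_index_budget( MockDataSet->new( $count ) );
    print "$n indexes: ", ( $ok ? "within limit" : "over limit" ), "\n";
}
```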

get_item_ids

 @ids = $dataset->get_item_ids( $session )

Return a list of the ids of all items in this set.

get_datestamp_field

 $field = $dataset->get_datestamp_field()

Returns the datestamp field for this dataset which may be used for incremental harvesting. Returns undef if no such field is available.
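Because get_datestamp_field may return undef, callers need to handle the missing case before using the field. The sketch below shows one way to do that. The helper harvest_field_name and both mock packages are hypothetical, and calling get_name on the returned field is an assumption about EPrints::MetaField that is not stated on this page.

```perl
use strict;
use warnings;

# Hypothetical helper: name of the field usable for incremental harvesting,
# or undef when the dataset has no datestamp field.
sub harvest_field_name
{
    my( $ds ) = @_;

    my $field = $ds->get_datestamp_field();
    return undef unless defined $field;

    # Assumption: the field object offers a get_name method.
    return $field->get_name;
}

# Stand-in for an EPrints::MetaField.
package MockField;

sub new { my( $class, $name ) = @_; return bless { name => $name }, $class; }
sub get_name { my( $self ) = @_; return $self->{name}; }

# Stand-in for an EPrints::DataSet with an optional datestamp field.
package MockDataSet;

sub new { my( $class, $field ) = @_; return bless { field => $field }, $class; }
sub get_datestamp_field { my( $self ) = @_; return $self->{field}; }

package main;

my $with = MockDataSet->new( MockField->new( "lastmod" ) );
my $without = MockDataSet->new( undef );

foreach my $ds ( $with, $without )
{
    my $name = harvest_field_name( $ds );
    print defined $name ? "harvest by $name\n" : "no datestamp field\n";
}
```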

UNDOCUMENTED METHODS

Warning: These methods were found in the source code but have no POD associated with them. This may be because they have not been documented yet, or because they are internal to the API and not intended for use by other parts of EPrints.

get_archive

get_dataset_id_field

get_filters

get_page_fields

get_required_type_fields

get_type_fields

get_type_name

get_type_names

get_type_pages

get_types

indexable

is_valid_type

load_workflows

process_field

render_type_name