Difference between revisions of "API:EPrints/DataSet"

From EPrints Documentation
(New page: <!-- Pod2Wiki=_preamble_ This page has been automatically generated from the EPrints source. Any wiki changes made between the 'Pod2Wiki=*' and 'End of Pod2Wiki' comments will be lost. -...)
 
 
(14 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
<!-- Pod2Wiki=_preamble_  
 
<!-- Pod2Wiki=_preamble_  
This page has been automatically generated from the EPrints source. Any wiki changes made between the 'Pod2Wiki=*' and 'End of Pod2Wiki' comments will be lost.
+
This page has been automatically generated from the EPrints 3.2 source. Any wiki changes made between the 'Pod2Wiki=*' and 'Edit below this comment' comments will be lost.
  -->{{Pod2Wiki}}{{API:Source|file=EPrints/DataSet.pm|package_name=EPrints::DataSet}}[[Category:API|DataSet]]<!-- End of Pod2Wiki -->
+
  -->{{API}}{{Pod2Wiki}}{{API:Source|file=perl_lib/EPrints/DataSet.pm|package_name=EPrints::DataSet}}[[Category:API|DATASET]][[Category:API:EPrints/DataSet|DATASET]]<div><!-- Edit below this comment -->
<!-- Pod2Wiki=head_name -->=NAME=
 
'''EPrints::DataSet''' - a dataset is a set of records in the eprints system with the same metadata.
 
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=head_description -->=DESCRIPTION=
 
This module describes an EPrint dataset.
 
  
A repository has one of each type of dataset:
+
<!-- Pod2Wiki=_private_ --><!-- Pod2Wiki=head_name -->
 +
==NAME==
 +
'''EPrints::DataSet''' - a set of records with the same metadata scheme
  
cachemap, counter, user, archive, buffer, inbox, document, subject, saved_search, deletion, eprint, access.
+
<!-- Edit below this comment -->
  
A normal dataset (e.g. "user") has an associated package (e.g. EPrints::DataObj::User), which must be a subclass of EPrints::DataObj, and a number of SQL tables which are prefixed with the dataset name. Most datasets also have a set of associated EPrints::MetaFields, which may be optional or compulsory depending on the type: for example, books have editors but posters do not, yet both are EPrints.
 
  
Datasets have some default fields plus additional ones configured in Fields.pm.
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_synopsis -->
 +
==SYNOPSIS==
 +
<source lang="perl">my $dataset = $repository->dataset( "inbox" );
  
But there are some exceptions:
+
print sprintf("There are %d records in the inbox\n",
 +
$dataset->count);
  
<!-- End of Pod2Wiki -->
+
$string = $dataset->base_id; # eprint
<!-- Pod2Wiki=item_cachemap_counter -->==cachemap_counter==
+
$string = $dataset->id; # inbox
  
  cachemap, counter
+
$dataobj = $dataset->create_dataobj( $data );
 +
$user = $dataset->dataobj( 23 );
  
Don't have a package or metadata fields associated.
+
$search = $dataset->prepare_search( %options );
 +
$list = $dataset->search( %options ); # prepare_search( %options )->execute
 +
$list = $dataset->search; # match ALL
  
<!-- End of Pod2Wiki -->
+
$metafield = $dataset->field( $fieldname );
<!-- Pod2Wiki=item_archive_buffer_inbox_deletion -->==archive_buffer_inbox_deletion==
+
$metafield = $dataset->key_field;
 +
@metafields = $dataset->fields;
  
  archive, buffer, inbox, deletion
+
$dataset->search->map( sub {}, $ctx );
 +
$n = $dataset->search->count;
 +
$ids = $dataset->search->ids;
 +
$list = $dataset->list( \@ids );</source>
  
All have the same package and metadata fields as eprints, but are filtered by eprint_status.
+
<!-- Edit below this comment -->
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_new_stub -->==new_stub==
 
  
  $ds = EPrints::DataSet-&gt;new_stub( $id )
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_description -->
 +
==DESCRIPTION==
 +
This module describes a dataset.
  
Creates a dataset object without any fields. Useful to avoid problems with something a dataset does depending on loading the dataset. It can still be queried about other things, such as SQL table names.  
+
A repository has several datasets that make up the repository's metadata schema. The list of dataset ids can be obtained from the repository object (see [[API:EPrints/Repository|EPrints::Repository]]).
  
<!-- End of Pod2Wiki -->
+
A normal dataset (e.g. "user") has an associated package (e.g. [[API:EPrints/DataObj/User|EPrints::DataObj::User]]), which must be a subclass of [[API:EPrints/DataObj|EPrints::DataObj]], and a number of SQL tables which are prefixed with the dataset name. Most datasets also have a set of associated [[API:EPrints/MetaField|EPrints::MetaField]]s, which may be optional or required depending on the type: for example, books have editors but posters do not, yet both are EPrints.
<!-- Pod2Wiki=item_new -->==new==
 
  
  $ds = EPrints::DataSet-&gt;new( $repository, $id )
+
The fields contained in a dataset are defined by the data object and by any additional fields defined in cfg.d. Some datasets don't have any fields.
  
Return the dataset specified by $id.
+
Some datasets are "virtual" datasets made from others. Examples include "inbox", "archive", "buffer" and "deletion", which are all virtual datasets of the "eprint" dataset. That is to say, "inbox" is a subset of "eprint" and therefore contains [[API:EPrints/DataObj/EPrint|EPrints::DataObj::EPrint]] objects. You can define your own virtual datasets which operate on existing datasets.
  
Note that the dataset holds a reference to $repository and vice versa, which means they will not be garbage collected.
+
<!-- Edit below this comment -->
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_field -->==get_field==
 
  
  $metafield = $ds-&gt;get_field( $fieldname )
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_creating_custom_datasets -->
 +
==CREATING CUSTOM DATASETS==
 +
New datasets can be defined in a configuration file, e.g.
  
Return a MetaField object describing the asked for field in this dataset, or undef if there is no such field.
+
<pre>  $c-&gt;{datasets}-&gt;{bread} = {
 +
    class =&gt; "EPrints::DataObj::Bread",
 +
    sqlname =&gt; "bread",
 +
  };</pre>
  
<!-- End of Pod2Wiki -->
+
This defines a dataset with the id <code>bread</code> (must be unique). The dataobj package (class) to instantiate objects with is <code>EPrints::DataObj::Bread</code>, which must be a sub-class of [[API:EPrints/DataObj|EPrints::DataObj]]. Lastly, the database tables used by the dataset will be called 'bread' or prefixed 'bread_'.
<!-- Pod2Wiki=item_has_field -->==has_field==
 
  
  $bool = $ds-&gt;has_field( $fieldname )
+
Other optional properties:
  
True if the dataset has a field of that name.
+
<pre>  columns - an array ref of field ids to default the user view to
 +
  datestamp - field id to use to sort this dataset
 +
  import - is the dataset importable?
 +
  index - is the dataset text-indexed?
 +
  order - is the dataset orderable?
 +
  virtual - completely virtual dataset (no database tables)</pre>
  
<!-- End of Pod2Wiki -->
+
To make one dataset a virtual dataset of another (as 'inbox' is to 'eprint') use the following properties:
<!-- Pod2Wiki=item_default_order -->==default_order==
 
  
   $ordertype = $ds-&gt;default_order
+
<pre>  confid - the super-dataset this is a virtual sub-dataset of
 +
  dataset_id_field - the field containing the sub-dataset id
 +
   filters - an array ref of filters to apply when retrieving records</pre>
  
Return the id string of the default order for this dataset.
+
As with system datasets, the [[API:EPrints/MetaField|EPrints::MetaField]]s can be defined via [[API:EPrints/DataObj#get_system_field_info|EPrints::DataObj/get_system_field_info]] or via configuration:
  
For example "bytitle" for eprints.
+
<pre>  $c-&gt;add_dataset_field(
 +
    "bread",
 +
    { name =&gt; "breadid", type =&gt; "counter", sql_counter =&gt; "bread" }
 +
  );
 +
  $c-&gt;add_dataset_field(
 +
    "bread",
 +
    { name =&gt; "toasted", type =&gt; "bool", }
 +
  );
 +
  $c-&gt;add_dataset_field(
 +
    "bread",
 +
    { name =&gt; "description", type =&gt; "text", }
 +
  );</pre>
  
<!-- End of Pod2Wiki -->
+
See [[API:EPrints/RepositoryConfig#add_dataset_field|EPrints::RepositoryConfig/add_dataset_field]] for details on <code>add_dataset_field</code>.
<!-- Pod2Wiki=item_confid -->==confid==
 
  
  $confid = $ds-&gt;confid
+
Creating a fully-operational dataset will require more configuration files. You will probably want at least a [[API:EPrints/Workflow|workflow]], [[API:EPrints/Citation|citations]] for the summary page, search results etc, and permissions and searching settings:
  
Return the string to use when getting configuration for this dataset.
+
<pre>  push @{$c-&gt;{user_roles}-&gt;{admin}}, qw(
 +
    +bread/create
 +
    +bread/edit
 +
    +bread/view
 +
    +bread/destroy
 +
    +bread/details
 +
  );
 +
  push @{$c-&gt;{plugins}-&gt;{"Export::SummaryPage"}-&gt;{params}-&gt;{accept}}, qw(
 +
    dataobj/bread
 +
  );
 +
  $c-&gt;{datasets}-&gt;{bread}-&gt;{search}-&gt;{simple} = {
 +
    search_fields =&gt; {
 +
      id =&gt; "q",
 +
      meta_fields =&gt; [qw(
 +
        breadid
 +
        description
 +
      )],
 +
    },
 +
  };</pre>
  
archive, buffer, inbox and deletion all return "eprint" as they must have the same configuration.
+
<!-- Edit below this comment -->
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_id -->==id==
 
  
  $id = $ds-&gt;id
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_methods -->
 +
==METHODS==
 +
<!-- Edit below this comment -->
  
Return the id of this dataset.
 
  
<!-- End of Pod2Wiki -->
+
<!-- Pod2Wiki= -->
<!-- Pod2Wiki=item_count -->==count==
+
<!-- Pod2Wiki=head_class_methods -->
 +
===Class Methods===
 +
<!-- Edit below this comment -->
  
  $n = $ds-&gt;count( $session )
 
  
Return the number of records in this dataset.
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_object_methods -->
 +
===Object Methods===
 +
$id = $ds-&gt;base_id
 +
<pre>  $ds = $repo-&gt;dataset( "inbox" );
 +
  $id = $ds-&gt;base_id; # returns "eprint"</pre>
  
<!-- End of Pod2Wiki -->
+
Returns the identifier of the base dataset for this dataset (same as [[API:EPrints/DataSet#id|id]] unless this dataset is virtual).
<!-- Pod2Wiki=item_get_sql_table_name -->==get_sql_table_name==
 
  
  $tablename = $ds-&gt;get_sql_table_name
+
$metafield = $ds-&gt;field( $fieldname )
 +
Returns the [[API:EPrints/MetaField|EPrints::MetaField]] from this dataset with the given name, or undef.
  
Return the name of the main SQL table containing this dataset. The names of the other SQL tables are based on this name.
+
$id = $ds-&gt;id
 +
Return the id of this dataset.
  
<!-- End of Pod2Wiki -->
+
$n = $ds-&gt;count( $session )
<!-- Pod2Wiki=item_get_sql_index_table_name -->==get_sql_index_table_name==
+
Return the number of records in this dataset.
 
 
  $tablename = $ds-&gt;get_sql_index_table_name
 
 
 
Return the name of the SQL table which contains the free text indexing information.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_sql_grep_table_name -->==get_sql_grep_table_name==
 
 
 
  $tablename = $ds-&gt;get_sql_grep_table_name
 
 
 
Return the name of the SQL table which contains the strings to be used with LIKE in a final pass of a search.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_sql_rindex_table_name -->==get_sql_rindex_table_name==
 
 
 
  $tablename = $ds-&gt;get_sql_rindex_table_name
 
 
 
Return the name of the SQL table which contains the reverse text indexing information. (Used for deleting free-text indexes when removing a record.)
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_ordervalues_table_name -->==get_ordervalues_table_name==
 
 
 
  $tablename = $ds-&gt;get_ordervalues_table_name( $langid )
 
 
 
Return the name of the SQL table containing values used for ordering this dataset.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_sql_sub_table_name -->==get_sql_sub_table_name==
 
 
 
  $tablename = $ds-&gt;get_sql_sub_table_name( $field )
 
 
 
Returns the name of the SQL table which contains the information on the "multiple" field. $field is an EPrints::MetaField belonging to this dataset.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_fields -->==get_fields==
 
 
 
  $fields = $ds-&gt;get_fields
 
 
 
Returns a list of the EPrints::MetaFields belonging to this dataset.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_key_field -->==get_key_field==
 
 
 
  $field = $ds-&gt;get_key_field
 
 
 
Return the EPrints::MetaField representing the primary key field. Always the first field.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_make_object -->==make_object==
 
 
 
  $obj = $ds-&gt;make_object( $session, $data )
 
 
 
Return an object of the class associated with this dataset, always a subclass of EPrints::DataObj.
 
 
 
$data is a hash of values for fields of a record in this dataset.
 
 
 
Return $data if no class associated with this dataset.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_create_object -->==create_object==
 
 
 
  $obj = $ds-&gt;create_object( $session, $data )
 
 
 
Create a new object in the given dataset. Return the new object.
 
 
 
Return undef if the object could not be created.
 
 
 
If $data describes sub-objects too then those will also be created.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_object_class -->==get_object_class==
 
 
 
  $class = $ds-&gt;get_object_class;
 
 
 
Return the perl class to which objects in this dataset belong.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_object -->==get_object==
 
 
 
  $obj = $ds-&gt;get_object( $session, $id );
 
 
 
Return the object from this dataset with the given id, or undefined.
 
 
 
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_render_name -->==render_name==
 
 
 
  $xhtml = $ds-&gt;render_name( $session )
 
  
Return a piece of XHTML describing this dataset, in the language of the current session.
+
@fields = $ds-&gt;fields
 +
Returns a list of the [[API:EPrints/MetaField|EPrints::MetaField]]s belonging to this dataset.
  
<!-- End of Pod2Wiki -->
+
$field = $ds-&gt;key_field
<!-- Pod2Wiki=item_map -->==map==
+
Return the [[API:EPrints/MetaField|EPrints::MetaField]] representing the primary key field.
  
  $ds-&gt;map( $session, $fn, $info )
+
Always the first field.
  
Maps the function $fn onto every record in this dataset. See  Search for a full explanation.
+
$dataobj = $ds-&gt;make_dataobj( $epdata )
 +
Return an object of the class associated with this dataset, always a subclass of [[API:EPrints/DataObj|EPrints::DataObj]].
  
<!-- End of Pod2Wiki -->
+
$epdata is a hash of values for fields in this dataset.
<!-- Pod2Wiki=item_get_repository -->==get_repository==
 
  
  $repository = $ds-&gt;get_repository
+
Returns $epdata if no class is associated with this dataset.
  
Returns the EPrints::Repository to which this dataset belongs.
+
$obj = $ds-&gt;create_dataobj( $data )
 +
Returns a new object in this dataset based on $data or undef on failure.
  
<!-- End of Pod2Wiki -->
+
If $data describes sub-objects then those will also be created.
<!-- Pod2Wiki=item_reindex -->==reindex==
 
  
  $ds-&gt;reindex( $session )
+
$dataobj = $ds-&gt;dataobj( $id )
 +
Returns the object from this dataset with the given id, or undefined.
  
Recommits all the items in this dataset. This could take a very long time on a large set of records.
+
$repository = $ds-&gt;repository
 +
Returns the [[API:EPrints/Repository|EPrints::Repository]] to which this dataset belongs.
  
Really should not be called reindex anymore as it doesn't.
+
$searchexp = $ds-&gt;prepare_search( %options )
 +
Returns an [[API:EPrints/Search|EPrints::Search]] for this dataset with %options.
  
<!-- End of Pod2Wiki -->
+
$list = $ds-&gt;search( %options )
<!-- Pod2Wiki=item_get_dataset_ids -->==get_dataset_ids==
+
Short-cut to [[API:EPrints/DataSet#prepare_search|prepare_search]]( %options )-&gt;execute.
  
   @ids = EPrints::DataSet::get_dataset_ids( get_dataset_ids )
+
* satisfy_all
 +
<pre>  "satisfy_all"=&gt;1</pre>
  
Return a list of all dataset ids.
+
: Satisfy all of the conditions specified. 0 means satisfy any of the conditions specified. The default is 1.
  
<!-- End of Pod2Wiki -->
+
* staff
<!-- Pod2Wiki=item_@ids -->==@ids==
+
<pre> "staff"=&gt;1</pre>
  
  @ids = EPrints::DataSet::get_sql_dataset_ids
+
: Perform the search as an administrator, which means you get everything back.
  
Return a list of all dataset ids of datasets which are directly mapped into SQL (not counters or cache which work a bit differently).
+
* custom_order
 +
<pre>  "custom_order" =&gt; "field1/-field2/field3"</pre>
  
<!-- End of Pod2Wiki -->
+
: Order the search results by the listed fields. Prefixing a field name with "-" reverses the ordering for that field.
<!-- Pod2Wiki=item_count_indexes -->==count_indexes==
 
  
  $n = $ds-&gt;count_indexes
+
* filters
 +
<pre>  "filters" =&gt; [
 +
                        { meta_fields=&gt;[ "field1", "field2", "document.field3" ],
 +
                          merge=&gt;"ANY", match=&gt;"EX",
 +
                          value=&gt;"bees"
 +
                        },
 +
                        { meta_fields=&gt;[ "field4" ],
 +
                          value=&gt;"honey",
 +
                          match=&gt;"IN"
 +
                        }
 +
                      ]</pre>
  
Return the number of indexes required for the main SQL table of this dataset. Used to check it's not over 32 (the current maximum allowed by MySQL)
+
: This searches for 'bees' in <code>field1</code> or <code>field2</code> or <code>document.field3</code>, and 'honey' in <code>field4</code>.
  
Assumes things either have 1 or 0 indexes which might not always be true.
+
: For details on the <code>merge</code> and <code>match</code> parameters, refer to [[API:EPrints/Search/Field|EPrints::Search::Field]]
  
<!-- End of Pod2Wiki -->
+
<pre> "limit" =&gt; 10</pre>
<!-- Pod2Wiki=item_get_item_ids -->==get_item_ids==
 
  
  @ids = $dataset-&gt;get_item_ids( $session )
+
Only return 10 results
  
Return a list of the id's of all items in this set.
+
<!-- Edit below this comment -->
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_datestamp_field -->==get_datestamp_field==
 
  
  $field = $dataset-&gt;get_datestamp_field()
+
<!-- Pod2Wiki= -->
 +
<!-- Pod2Wiki=head_list -->
 +
===list===
  
Returns the datestamp field for this dataset which may be used for incremental harvesting. Returns undef if no such field is available.
+
<source lang="perl">$list = $ds->list( $ids )
  
<!-- End of Pod2Wiki -->
+
</source>
<!-- Pod2Wiki=head_undocumented_methods -->=UNDOCUMENTED METHODS=
+
Returns an [[API:EPrints/List|EPrints::List]] for this dataset for the given $ids list.
{{API:Undocumented Methods}}<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_archive -->==get_archive==
 
  
<!-- End of Pod2Wiki -->
+
<!-- Edit below this comment -->
<!-- Pod2Wiki=item_get_dataset_id_field -->==get_dataset_id_field==
 
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_filters -->==get_filters==
 
  
<!-- End of Pod2Wiki -->
+
<!-- Pod2Wiki= -->
<!-- Pod2Wiki=item_get_page_fields -->==get_page_fields==
+
<!-- Pod2Wiki=head_search_config -->
 +
===search_config===
  
<!-- End of Pod2Wiki -->
+
<source lang="perl">$sconf = $dataset->search_config( $searchid )
<!-- Pod2Wiki=item_get_required_type_fields -->==get_required_type_fields==
 
  
<!-- End of Pod2Wiki -->
+
</source>
<!-- Pod2Wiki=item_get_type_fields -->==get_type_fields==
+
Retrieve the search configuration $searchid for this dataset. This typically contains a set of fields to search over, order values and rendering parameters.
  
<!-- End of Pod2Wiki -->
+
<!-- Edit below this comment -->
<!-- Pod2Wiki=item_get_type_name -->==get_type_name==
 
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_get_type_names -->==get_type_names==
 
  
<!-- End of Pod2Wiki -->
+
<!-- Pod2Wiki= -->
<!-- Pod2Wiki=item_get_type_pages -->==get_type_pages==
+
<!-- Pod2Wiki=head_copyright -->
 +
==COPYRIGHT==
 +
Copyright 2000-2011 University of Southampton.
  
<!-- End of Pod2Wiki -->
+
This file is part of EPrints http://www.eprints.org/.
<!-- Pod2Wiki=item_get_types -->==get_types==
 
  
<!-- End of Pod2Wiki -->
+
EPrints is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
<!-- Pod2Wiki=item_indexable -->==indexable==
 
  
<!-- End of Pod2Wiki -->
+
EPrints is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more details.
<!-- Pod2Wiki=item_is_valid_type -->==is_valid_type==
 
  
<!-- End of Pod2Wiki -->
+
You should have received a copy of the GNU Lesser General Public License along with EPrints.  If not, see http://www.gnu.org/licenses/.
<!-- Pod2Wiki=item_load_workflows -->==load_workflows==
 
  
<!-- End of Pod2Wiki -->
+
<!-- Edit below this comment -->
<!-- Pod2Wiki=item_process_field -->==process_field==
 
  
<!-- End of Pod2Wiki -->
 
<!-- Pod2Wiki=item_render_type_name -->==render_type_name==
 
  
<!-- End of Pod2Wiki -->
+
<!-- Pod2Wiki= -->
<!-- Pod2Wiki=_postamble_ --><!-- End of Pod2Wiki -->
+
<!-- Pod2Wiki=_postamble_ -->
 +
<!-- Edit below this comment -->

Latest revision as of 09:56, 22 January 2013



NAME

EPrints::DataSet - a set of records with the same metadata scheme


SYNOPSIS

my $dataset = $repository->dataset( "inbox" );

print sprintf("There are %d records in the inbox\n",
	$dataset->count);

$string = $dataset->base_id; # eprint
$string = $dataset->id; # inbox

$dataobj = $dataset->create_dataobj( $data );
$user = $dataset->dataobj( 23 );

$search = $dataset->prepare_search( %options );
$list = $dataset->search( %options ); # prepare_search( %options )->execute
$list = $dataset->search; # match ALL

$metafield = $dataset->field( $fieldname );
$metafield = $dataset->key_field;
@metafields = $dataset->fields;

$dataset->search->map( sub {}, $ctx );
$n = $dataset->search->count; 
$ids = $dataset->search->ids;
$list = $dataset->list( \@ids );


DESCRIPTION

This module describes a dataset.

A repository has several datasets that make up the repository's metadata schema. The list of dataset ids can be obtained from the repository object (see EPrints::Repository).

A normal dataset (e.g. "user") has an associated package (e.g. EPrints::DataObj::User), which must be a subclass of EPrints::DataObj, and a number of SQL tables which are prefixed with the dataset name. Most datasets also have a set of associated EPrints::MetaFields, which may be optional or required depending on the type: for example, books have editors but posters do not, yet both are EPrints.

The fields contained in a dataset are defined by the data object and by any additional fields defined in cfg.d. Some datasets don't have any fields.

Some datasets are "virtual" datasets made from others. Examples include "inbox", "archive", "buffer" and "deletion", which are all virtual datasets of the "eprint" dataset. That is to say, "inbox" is a subset of "eprint" and therefore contains EPrints::DataObj::EPrint objects. You can define your own virtual datasets which operate on existing datasets.


CREATING CUSTOM DATASETS

New datasets can be defined in a configuration file, e.g.

  $c->{datasets}->{bread} = {
    class => "EPrints::DataObj::Bread",
    sqlname => "bread",
  };

This defines a dataset with the id bread (must be unique). The dataobj package (class) to instantiate objects with is EPrints::DataObj::Bread, which must be a sub-class of EPrints::DataObj. Lastly, the database tables used by the dataset will be called 'bread' or prefixed 'bread_'.

Other optional properties:

  columns - an array ref of field ids to default the user view to
  datestamp - field id to use to sort this dataset
  import - is the dataset importable?
  index - is the dataset text-indexed?
  order - is the dataset orderable?
  virtual - completely virtual dataset (no database tables)

To make one dataset a virtual dataset of another (as 'inbox' is to 'eprint') use the following properties:

  confid - the super-dataset this is a virtual sub-dataset of
  dataset_id_field - the field containing the sub-dataset id
  filters - an array ref of filters to apply when retrieving records
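
As a rough illustration of these three properties, a virtual sub-dataset of bread might be configured along the following lines. This is a sketch under assumptions, not a confirmed recipe: the dataset id stale_bread and the field bread_status are hypothetical, the filter structure is borrowed from the "filters" search option described under METHODS, and sharing class and sqlname with the super-dataset (and setting virtual) is assumed by analogy with how 'inbox' relates to 'eprint'.

```perl
# Hypothetical cfg.d fragment: "stale_bread" as a virtual subset of "bread".
# Assumes "bread" has a field "bread_status" (not defined in this document)
# whose value names the sub-dataset each record belongs to.
$c->{datasets}->{stale_bread} = {
  class => "EPrints::DataObj::Bread",  # assumption: same class as the super-dataset
  sqlname => "bread",                  # assumption: reuse the super-dataset's tables
  virtual => 1,                        # assumption: no database tables of its own
  confid => "bread",                   # the super-dataset
  dataset_id_field => "bread_status",  # field containing the sub-dataset id
  filters => [
    # assumption: same structure as the "filters" search option
    { meta_fields => [ "bread_status" ], value => "stale_bread" },
  ],
};
```

With a configuration like this, $ds->base_id on the virtual dataset would be expected to return "bread" while $ds->id returns "stale_bread", mirroring the "inbox"/"eprint" example under METHODS.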

As with system datasets, the EPrints::MetaFields can be defined via EPrints::DataObj/get_system_field_info or via configuration:

  $c->add_dataset_field(
    "bread",
    { name => "breadid", type => "counter", sql_counter => "bread" }
  );
  $c->add_dataset_field(
    "bread",
    { name => "toasted", type => "bool", }
  );
  $c->add_dataset_field(
    "bread",
    { name => "description", type => "text", }
  );

See EPrints::RepositoryConfig/add_dataset_field for details on add_dataset_field.

Creating a fully operational dataset will require more configuration files. You will probably want at least a workflow, citations (for the summary page, search results, etc.), and permissions and search settings:

  push @{$c->{user_roles}->{admin}}, qw(
    +bread/create
    +bread/edit
    +bread/view
    +bread/destroy
    +bread/details
  );
  push @{$c->{plugins}->{"Export::SummaryPage"}->{params}->{accept}}, qw(
    dataobj/bread
  );
  $c->{datasets}->{bread}->{search}->{simple} = {
    search_fields => {
      id => "q",
      meta_fields => [qw(
        breadid
        description
      )],
    },
  };


METHODS

Class Methods

Object Methods

$id = $ds->base_id
  $ds = $repo->dataset( "inbox" );
  $id = $ds->base_id; # returns "eprint"

Returns the identifier of the base dataset for this dataset (same as id unless this dataset is virtual).

$metafield = $ds->field( $fieldname )

Returns the EPrints::MetaField from this dataset with the given name, or undef.

$id = $ds->id

Return the id of this dataset.

$n = $ds->count( $session )

Return the number of records in this dataset.

@fields = $ds->fields

Returns a list of the EPrints::MetaFields belonging to this dataset.
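
The field accessors above can be combined to inspect a dataset's schema. The following is a minimal sketch, assuming a repository with the hypothetical id "myrepo" and assuming the get_name and get_type accessors of EPrints::MetaField; it needs a working EPrints installation to run.

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";
my $ds = $repo->dataset( "user" );

# key_field is documented as always being the first field
my $key = $ds->key_field;
printf "key field: %s\n", $key->get_name;

# list every field in the dataset with its type
foreach my $field ( $ds->fields )
{
  printf "%s (%s)\n", $field->get_name, $field->get_type;
}
```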

$field = $ds->key_field

Return the EPrints::MetaField representing the primary key field.

Always the first field.

$dataobj = $ds->make_dataobj( $epdata )

Return an object of the class associated with this dataset, always a subclass of EPrints::DataObj.

$epdata is a hash of values for fields in this dataset.

Returns $epdata if no class is associated with this dataset.

$obj = $ds->create_dataobj( $data )

Returns a new object in this dataset based on $data or undef on failure.

If $data describes sub-objects then those will also be created.
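
For example, a record in the bread dataset from CREATING CUSTOM DATASETS might be created as follows. This is a sketch only: it assumes the bread dataset and its fields have been configured as shown earlier, that "myrepo" is a (hypothetical) repository id, and that boolean fields take the strings "TRUE" and "FALSE".

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";
my $bread_ds = $repo->dataset( "bread" );

# $data is a hash ref keyed by field name
my $bread = $bread_ds->create_dataobj( {
  toasted => "TRUE",                 # assumption: bool values are "TRUE"/"FALSE"
  description => "Wholemeal loaf",
} );

# create_dataobj returns undef on failure
die "could not create bread record\n" if !defined $bread;
printf "created bread record %s\n", $bread->get_id;
```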

$dataobj = $ds->dataobj( $id )

Returns the object from this dataset with the given id, or undef.
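
Because the lookup can fail, callers will usually want to check the result before using it. A minimal sketch, assuming a hypothetical repository id "myrepo" and the get_value accessor of EPrints::DataObj:

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";

# look up user 23, as in the SYNOPSIS
my $user = $repo->dataset( "user" )->dataobj( 23 );
if( defined $user )
{
  print "found user: ", $user->get_value( "username" ), "\n";
}
else
{
  print "no user with id 23\n";
}
```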

$repository = $ds->repository

Returns the EPrints::Repository to which this dataset belongs.

$searchexp = $ds->prepare_search( %options )

Returns an EPrints::Search for this dataset with %options.

$list = $ds->search( %options )

Short-cut to prepare_search( %options )->execute.

  • satisfy_all
  "satisfy_all" => 1
Satisfy all of the conditions specified. 0 means satisfy any of the conditions specified. The default is 1.
  • staff
  "staff" => 1
Perform the search as an administrator, which means you get everything back.
  • custom_order
  "custom_order" => "field1/-field2/field3"
Order the search results by the listed fields. Prefixing a field name with "-" reverses the ordering for that field.
  • filters
  "filters" => [
    { meta_fields => [ "field1", "field2", "document.field3" ],
      merge => "ANY", match => "EX",
      value => "bees"
    },
    { meta_fields => [ "field4" ],
      value => "honey",
      match => "IN"
    }
  ]
This searches for 'bees' in field1 or field2 or document.field3, and 'honey' in field4.
For details on the merge and match parameters, refer to EPrints::Search::Field.
  • limit
  "limit" => 10
Only return 10 results.
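
Several of these options can be combined in one call. The sketch below is illustrative rather than authoritative: "myrepo" is a hypothetical repository id, the title and datestamp fields are assumed to exist on eprints, and the callback argument order ( $session, $dataset, $item, $ctx ) is assumed from EPrints::List.

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";
my $ds = $repo->dataset( "archive" );

# up to 10 records whose title matches "bees", newest first
my $list = $ds->search(
  filters => [
    { meta_fields => [ "title" ], value => "bees", match => "IN" },
  ],
  custom_order => "-datestamp/title",
  limit => 10,
);

printf "%d matching records\n", $list->count;

# visit each matching record
$list->map( sub {
  my( $session, $dataset, $item, $ctx ) = @_;
  my $title = $item->get_value( "title" );
  $title = "(untitled)" if !defined $title;
  printf "%s: %s\n", $item->get_id, $title;
}, undef );
```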


list

$list = $ds->list( $ids )

Returns an EPrints::List for this dataset for the given $ids list.
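
This is handy when the ids are already known, for instance from an earlier search. A minimal sketch, assuming a hypothetical repository id "myrepo", hypothetical record ids, and the same callback argument order as for a search result:

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";
my $ds = $repo->dataset( "eprint" );

# hypothetical record ids; list() takes an array ref
my @ids = ( 1, 2, 3 );
my $list = $ds->list( \@ids );

# visit each record in the list
$list->map( sub {
  my( $session, $dataset, $item ) = @_;
  print $item->get_id, "\n";
} );
```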


search_config

$sconf = $dataset->search_config( $searchid )

Retrieve the search configuration $searchid for this dataset. This typically contains a set of fields to search over, order values and rendering parameters.
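
As a sketch of how this might be used, the "simple" search defined for bread in CREATING CUSTOM DATASETS could be retrieved and inspected as follows. It is assumed here, not confirmed by this page, that the return value is a hash reference; "myrepo" is a hypothetical repository id.

```perl
use strict;
use warnings;
use EPrints;

# "myrepo" is a hypothetical repository id; substitute your own.
my $repo = EPrints->new->repository( "myrepo" )
  or die "could not load repository\n";

my $sconf = $repo->dataset( "bread" )->search_config( "simple" );

# assumption: the configuration comes back as a hash ref
if( ref( $sconf ) eq "HASH" )
{
  print "search config keys: ", join( ", ", sort keys %$sconf ), "\n";
}
else
{
  print "no usable 'simple' search configuration found\n";
}
```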


COPYRIGHT

Copyright 2000-2011 University of Southampton.

This file is part of EPrints http://www.eprints.org/.

EPrints is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

EPrints is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with EPrints. If not, see http://www.gnu.org/licenses/.