Accessing Metadata Fields

From EPrints Documentation

Revision as of 18:38, 12 October 2011

This page provides an overview of the API calls you can use to access the data in a DataObj. The examples are framed around an export plugin.

The Plugin

Below is a very simple export plugin, which outputs a single eprint or a list of eprints as text citations.

package EPrints::Plugin::Export::Text;

use EPrints::Plugin::Export::TextFile;

@ISA = ( "EPrints::Plugin::Export::TextFile" );

use strict;

sub new
{
        my( $class, %opts ) = @_;

        my $self = $class->SUPER::new( %opts );

        $self->{name} = "ASCII Citation";
        $self->{accept} = [ 'dataobj/eprint', 'list/eprint' ];
        $self->{visible} = "all";

        return $self;
}


sub output_dataobj
{
        my( $plugin, $dataobj ) = @_;

        my $cite = $dataobj->render_citation;

        return EPrints::Utils::tree_to_utf8( $cite )."\n\n";
}

1;

Note the output_dataobj function. In an export plugin, this is called on every item in the list being exported, and the results for all items are aggregated and output.

There are two function calls of particular interest that aid in retrieving and managing data:

my $cite = $dataobj->render_citation;

This returns an HTML DOM object containing the citation of the dataobj as specified in the configuration files (see cfg/citations/eprint/default.xml). Given an HTML DOM object, the following call will convert it into a string:

my $text = EPrints::Utils::tree_to_utf8( $html_dom );

Accessing Metadata

A number of functions exist to aid in accessing and rendering values in a dataobj.

my $title = $dataobj->value('title');

$title will now be a scalar containing the value stored in the title field of the dataobj. The is_set function lets you test whether a field has a value before reading it:

if ($dataobj->is_set('title'))
{
     $title = $dataobj->value('title');
}

It is also possible to find out which fields an item has set by querying the item's dataset:

my $ds = $dataobj->dataset;
my @fields = $ds->fields;
my %fieldvalues;
foreach my $field (@fields)
{
     my $fieldname = $field->name;
     if ($dataobj->is_set($fieldname))
     {
          $fieldvalues{$fieldname} = $dataobj->value($fieldname);
     }
}
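
The values collected in a hash like %fieldvalues are not all the same shape: some are plain scalars, others are array or hash references. The following is a minimal sketch, not part of the EPrints API, that reports the shape of each value using Perl's built-in ref function. The sample data is hand-written to stand in for a real dataobj, and value_shape is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a %fieldvalues hash
# built from a real dataobj (assumption: not taken from a live repository).
my %fieldvalues = (
    title    => 'An Example Title',
    creators => [
        { name => { family => 'Abbas', given => 'Noura' }, id => '10363' },
    ],
);

# Hypothetical helper: describe the shape of a value using ref().
sub value_shape
{
    my( $value ) = @_;

    return 'undefined' if !defined $value;
    return 'scalar'    if ref( $value ) eq '';
    return 'list'      if ref( $value ) eq 'ARRAY';
    return 'hash'      if ref( $value ) eq 'HASH';
    return 'other';
}

# Print one line per field, in alphabetical order of field name.
foreach my $fieldname ( sort keys %fieldvalues )
{
    print $fieldname, ': ', value_shape( $fieldvalues{$fieldname} ), "\n";
}
```

With the sample data above this prints "creators: list" followed by "title: scalar". Checking the shape before using a value avoids treating an array reference as a string.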

The Structure of Values

On an eprint, the title is generally a simple metadata field. When $dataobj->value is called, it returns a scalar value.

EPrints has two types of metadata field:

  • Simple
  • Compound

Both types can hold either a single value or multiple values. An example of a multiple compound field is the creators field:

my $creators = $dataobj->value('creators');
use Data::Dumper;
print Dumper $creators;

Data::Dumper is a very useful library that will output a Perl data structure. In the above case, the output may look something like this:

$VAR1 = [
          {
            'name' => {
                        'lineage' => '',
                        'given' => 'Noura',
                        'honourific' => '',
                        'family' => 'Abbas'
                      },
            'id' => '10363'
          },
          {
            'name' => {
                        'lineage' => '',
                        'given' => 'Andrew',
                        'honourific' => '',
                        'family' => 'Gravell'
                      },
            'id' => '22'
          },
          {
            'name' => {
                        'lineage' => '',
                        'given' => 'Gary',
                        'honourific' => '',
                        'family' => 'Wills'
                      },
            'id' => '395'
          }
        ];

The above output shows an array of hashes. The structure can be compared to the configuration of the creators field (see cfg/cfg.d/eprint_fields.pl):

          {
            'name' => 'creators',
            'type' => 'compound',
            'multiple' => 1,
            'fields' => [
                          {
                            'sub_name' => 'name',
                            'type' => 'name',
                            'hide_honourific' => 1,
                            'hide_lineage' => 1,
                            'family_first' => 1,
                          },
                          {
                            'sub_name' => 'id',
                            'type' => 'text',
                            'input_cols' => 20,
                            'allow_null' => 1,
                          }
                        ],
            'input_boxes' => 4,
          },

Each element in the array is a hash that has a key for each subfield of the creators field. Note the structure of the name part of the data dump: the name data type is itself a hash.
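
As an illustration of walking this structure, the sketch below turns the creators value into a list of "Family, Given" strings. It uses a copy of the data from the dump above so that it runs without an EPrints installation; in a plugin the array reference would come from $dataobj->value('creators'). The helper format_creator is a hypothetical name, not an EPrints function.

```perl
use strict;
use warnings;

# Sample data copied from the Data::Dumper output above.
# In a plugin this would be: my $creators = $dataobj->value('creators');
my $creators = [
    { name => { lineage => '', given => 'Noura',  honourific => '', family => 'Abbas'   }, id => '10363' },
    { name => { lineage => '', given => 'Andrew', honourific => '', family => 'Gravell' }, id => '22'    },
    { name => { lineage => '', given => 'Gary',   honourific => '', family => 'Wills'   }, id => '395'   },
];

# Hypothetical helper: build "Family, Given" from one element of the array.
# Empty or missing name parts are skipped.
sub format_creator
{
    my( $creator ) = @_;

    my $name  = $creator->{name} || {};
    my @parts = grep { defined $_ && $_ ne '' } ( $name->{family}, $name->{given} );

    return join( ', ', @parts );
}

# Each array element is a hash; its 'name' value is itself a hash.
my @names = map { format_creator( $_ ) } @{$creators};

print join( '; ', @names ), "\n";
# prints: Abbas, Noura; Gravell, Andrew; Wills, Gary
```

The id subfield is reached in the same way, for example $creators->[0]->{id}.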