Difference between revisions of "Metadata"
Latest revision as of 13:11, 8 September 2017
Metadata Field Types
There are many different types of metadata field. The type controls how a field is rendered, indexed, searched and so forth. A field always has a type and a name property, and usually has several more. Most properties are documented on this page, but some properties are only available to certain types of field, and they are listed on the page for that field.
Some of these field types provide very rich features, others very simple ones. For example, the url field works just like the text field except that it is only valid if it looks like a URL, and when rendered it is a hyperlink.
A metadata field describes one field of data in one type of Data Object. For example the "title" field of an EPrint Object or the "email" field in a User Object.
Every Data Object has system fields (which are set by the system, and not alterable), but the User Object and EPrint Object have additional fields which are configured on a per-repository basis.
These can be customised in the user_fields.pl and eprint_fields.pl files. Note that changing these files does not automatically modify the underlying database, so this should (generally) only be done before the database is created. Some metadata properties do not affect the database, and are marked as such.
If you add or remove fields, or modify a property which affects the database then you'll need to alter the database to match. In 3.0 this must be done by hand, but we have plans to build a tool to do this for you.
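As a rough illustration of what a per-repository field definition looks like, the sketch below appends one field to the eprint field list. It assumes the usual EPrints convention that the list lives in `$c->{fields}->{eprint}`; the `$c` created here is only a stand-in for the configuration object that EPrints supplies, and the field name `funder_reference` is hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the repository configuration object. In a real
# repository $c already exists when eprint_fields.pl is loaded
# (assumption based on the usual EPrints layout).
my $c = { fields => { eprint => [] } };

# Hypothetical extra field: single-valued, short text.
push @{ $c->{fields}->{eprint} }, {
    name      => 'funder_reference',  # illustrative name only
    type      => 'text',              # must be one of the field types
    multiple  => 0,                   # affects the database structure
    maxlength => 64,
};
```

Because name, type and multiple all affect the database structure, a change like this on a live repository would also need the matching database change described above.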
Inheritance
This is the list of useful field types. Below it are listed the other field types, which are included only for completeness and are not intended to be used as part of the configuration.
Some field types inherit the properties of another, and then modify them in some way. For example the namedset field works like a set field except that it gets its options from a namedsets file not from the options=>[] in the field properties.
- Basic metadata field - this is abstract, fields must be one of the types listed below...
- Boolean - TRUE or FALSE (or can be unset, of course).
- Compound - virtual field, joins together several "multiple" fields, e.g. author_name and author_email.
- Dataobjref - references another data object.
- Multilang - allows language variants of a field, e.g. titles in French, German and/or English.
- Relation - stores a typed relationship with something represented by a URI.
- Date - stores a date
- Float - stores a floating-point value
- Decimal - stores a decimal number, specifying the number of digits before and after the decimal point.
- Id - like basic text field but search only finds exact matches
- Id (case-insensitive) - like Id field but search finds exact matches ignoring case (use for usernames, email addresses, etc.)
- Email - an email address
- Keywords - stores as longtext but searchable as exact individual keyword phrases
- Recaptcha - virtual field to display a reCAPTCHA to prevent spamming of public input forms.
- Text - the basic text field. Maximum 255 bytes. N.B. in UTF-8 some characters take more than one byte.
- Url - stores a URL.
- Uuid - stores a UUID.
- Int - an integer value
- Bigint - a large integer value (can be greater than 2,147,483,647 or less than -2,147,483,648).
- Counter - an auto-incrementing integer value.
- Itemref - a reference to another Data Object (e.g. a user or other eprint)
- Pagerange - a pagerange, e.g. 122-130
- Multipart - Stores multiple sub-fields, like a person's name.
- Name - Stores a person's name broken up into logical parts.
- Subobject - Stores another data object under a parent data object.
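To make the set versus namedset inheritance described above concrete, here is a minimal sketch of the two definitions side by side. The `options` property is the one mentioned above; `set_name` is, to the best of my knowledge, the property a namedset field uses to name its namedsets file, but treat it as an assumption and check the Namedset page. The field names and option values are illustrative.

```perl
use strict;
use warnings;

my @example_fields = (
    # A set field carries its options inline in the field properties.
    {
        name    => 'ispublished',
        type    => 'set',
        options => [qw( pub inpress submitted unpub )],
    },
    # A namedset field behaves like a set field but reads its options
    # from a namedsets file (set_name is an assumed property name).
    {
        name     => 'type',
        type     => 'namedset',
        set_name => 'eprint',
    },
);
```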
Internal-use and Deprecated Field Types
- Basic metadata field
- File - DEPRECATED. Use Subobject field instead.
- Id
- Text
- Set
- Arclanguage - as for set, but the options are the valid languages of this repository. Probably better to use Multilang field.
- Fields - as for set, but the options are the fields in a dataset.
- Langid - used internally by Multilang fields to store the language ID.
- Longtext
- Search - a serialised search
- Set
- Text
- Int
- Year - DEPRECATED. Use Date field instead.
- Storable - Stores a serialization of a Perl data structure.
Properties
Note that true/false properties use 1 and 0 to indicate their setting.
Some properties can be temporarily set or overridden by the Workflow Format and Citation Format files.
Core Properties
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
name | n/a | YES | This is the internal name of the field. It should only contain alphanumeric characters and underscores. It will be used to identify this field in scripts, other configuration files, in the database, and in the XML export/import system, etc. | This property is not required when defining sub-fields of Compound fields where sub_name should be used. This property affects the database structure. It must be unique within the Data Object (so the EPrint Object cannot have two fields called email, but the EPrint Object and User Object can each have a field with the same name). |
type | n/a | YES | This sets the type of the metafield, which in turn affects what other properties it may have. | This property affects the database structure. The value must be one of the metafield types listed above. |
multiple | 0 | NO | This indicates if this field is a single value or a list of values. E.g. title is only a single Longtext field but creators is a multiple Compound field. | This property affects the database structure. In the database a non-multiple field is stored in one (or more) columns in the main object table, but a multiple field gets its own table. |
readonly | 0 | NO | Whether to prevent this field from being edited in the workflow. | This is useful if you want to display the pre-generated value(s) for this field for reference whilst other fields are being edited. |
sql_index | 1 | NO | When the database is created this property indicates that an SQL index should be created to speed searching. | This property affects the database structure. Different field types override the default value with the sensible option for that type of field. It is not worth putting an SQL index on a field that is only ever searched for words in it (like title or abstract) but it is worth indexing fields whose values are explicitly searched for, or where ranges are searched (e.g. Date fields, Set fields etc.). It is unlikely you will need to set this by hand. You could change it after the database has been created but this will not update the database nor have any other effect. |
sub_name | undef | YES | This is a special property which is required instead of the name property for the sub-fields inside Compound fields. | This property affects the database structure. The actual name of these fields is then forced to be parent field name + '_' + sub_name. E.g. the Compound field creators has a sub-field with sub_name => 'name'. In this case the actual name of the name field in the system, database etc. is creators_name. |
virtual | 0 | NO | Whether this field calculates a value rather than storing it in the database. | Compound fields are virtual fields, whereas their sub-fields are not, as they store values in the database. Other types of virtual field will require a render_value to be specified, as with no value stored the default render method will have nothing to display. |
volatile | 0 | NO | Whether the field is liable to change frequently. | Setting volatile => 1 will prevent new revisions being created and avoid other post-commit events from being triggered, such as re-indexing. |
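The naming rule for Compound sub-fields (parent field name + '_' + sub_name) can be sketched as follows. The `fields` property holding the list of sub-fields is assumed from common EPrints configurations rather than taken from the table above, and the sub-field choices are illustrative.

```perl
use strict;
use warnings;

# A multiple Compound field with two sub-fields. Sub-fields use
# sub_name instead of name.
my $creators = {
    name     => 'creators',
    type     => 'compound',
    multiple => 1,
    fields   => [                       # assumed property name
        { sub_name => 'name', type => 'name' },
        { sub_name => 'id',   type => 'text', input_cols => 20 },
    ],
};

# Derive the actual field names the way the sub_name notes describe:
# parent field name + '_' + sub_name.
my @actual_names =
    map { $creators->{name} . '_' . $_->{sub_name} } @{ $creators->{fields} };
```

Here `@actual_names` holds `creators_name` and `creators_id`, which are the names to use in scripts, searches and citations.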
Rendering Properties
These properties affect how values of the metadata in this field are rendered.
Certain of these properties can be turned on temporarily by the Citation Format files - render_magicstop for example.
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
as_list | undef | NO | Whether to display a collection of sub-field values as a table row or as a separate list in the input form. | This is only applicable for Compound fields that have multiple => 1. It is useful where the length of the table row would exceed the width of a typical user's window. |
browse_link | undef | NO | This is the name of a view which values of this field should be linked to. | E.g. if there was a Browse by Publishers view configured named pubs, then adding browse_link => 'pubs' to the publisher field would cause it to be linked into the browse view page for the named publisher whenever it is rendered.
|
render_custom | undef | NO | Whether to use a pre-defined way of rendering the value for this field. | E.g. for Name fields by default the name will link to the creators browse view for that name. This property can be re-used within bespoke render functions to specify whether some custom way of rendering this field's value (e.g. with a link) should be used. |
render_dont_link | 0 | NO | Whether to render the field without wrapping it in a hyperlink. | Currently only affects Url fields and Email fields. |
render_dynamic | 0 | NO | Whether the rendering of this field can use JavaScript to make it dynamic. | limit_names_shown.pl uses this property to determine if the list of hidden creators/editors can be expanded to show all creators/editors. |
render_limit | undef | NO | How many values for this field should be displayed. | limit_names_shown.pl uses this property to determine how many creators/editors to display. If undef just render all values.
|
render_magicstop | 0 | NO | Whether to render a full stop at the end of this field, unless the last character is a dot, question mark or exclamation mark. | This helps avoid the ugly World without Cheese?. effect you get when titles end in ? or !. |
render_noreturn | 0 | NO | Whether CR (Carriage Return) and LF (Line Feed) characters are turned into normal spaces. | |
render_quiet | 0 | NO | Whether to prevent a big ugly UNSPECIFIED being rendered if field is unset. | E.g. setting render_quiet => 1 on a field means it just gets rendered as nothing if it is unset.
|
render_single_value | undef | NO | The value of this property is the name of a function to call to render individual values from this field. | For a multiple field this is called once per value in the list of values. The function should take the following parameters: ( $session, $field, $value, $object ). It should return an XHTML DOM object of the rendered value. |
render_value | undef | NO | The value of this property is the name of a function to call to render the field as a whole. | As with render_single_value, but this gets passed the entire list of values (an array reference) if it is a multiple field. Parameters passed are: ( $session, $self, $value, $all_langs, $no_link, $object ). $all_langs indicates that all language variants should be shown and is only really useful for Multilang fields. $no_link being true is a request to place no hyperlinks in the resulting HTML. The function should return an XHTML DOM object of the rendered value. |
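The render_magicstop rule from the table above is easy to misread, so here is a stand-alone sketch of the logic it describes. This is not the EPrints implementation: the function name `with_magicstop` is hypothetical, it works on plain strings rather than XHTML DOM objects, and leaving empty values untouched is a choice made for this sketch.

```perl
use strict;
use warnings;

# Append a full stop unless the text already ends in a dot,
# question mark or exclamation mark.
sub with_magicstop {
    my ($text) = @_;
    return $text unless defined $text && length $text;  # sketch-only choice
    return $text if $text =~ /[.?!]\z/;
    return $text . '.';
}
```

So a title ending in a question mark is left as it is, while a title with no closing punctuation gains a full stop.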
Input and Validation Properties
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
default_value | undef | NO | The default value to set for this field. | This is mainly used for system fields. For custom fields it is better to use eprint_fields_default.pl or similar. |
expanded_subjects | [] | NO | Subjects to show un-collapsed in the subject tree field in the workflow. | This is only applicable for Subject fields. All the fields listed will have their paths in the subject tree expanded so they can be seen, making them easier to find. |
false_first | 0 | NO | Display the false option before the true option in the input form. | This is only applicable to Boolean fields. By default true option is always displayed before the false option. |
fromform | undef | NO | The inverse of toform. This takes the value from the form and converts it into the value that will be stored in the database. | This function is passed the parameters: $value, $session, $object, $basename where $value is the value entered on the HTML form, and the return value is the value to be stored in the database. This function is not called when editing the eprint is cancelled. |
get_item | undef | NO | A bespoke function for how to lookup the Data object based on the stored value. | Only applicable to Itemref fields. |
help_xhtml | undef | NO | An XHTML DOM object to use for the help text for this field. | This can only be set via the Workflow Format configuration not via the metadata field directly. This is so that the workflow can conditionally change the help on a field. If you need to change the help text based on the eprint type, then you can just create a bespoke phrase with the format "eprint_fieldhelp_" + fieldname + "." + eprint_type (e.g. eprint_fieldhelp_id_number.article ).
|
input_add_boxes | 2 | NO | The number of rows to add when clicking the More input rows button for a field that sets multiple => 1. | The default value for this property is taken from cfg.d/field_property_defaults.pl. |
input_boxes | 3 | NO | The number of input rows to initially show for a field that sets multiple => 1. | The default value for this property is taken from cfg.d/field_property_defaults.pl. |
input_cols | 60 | NO | The number of columns in an HTML form field. | The default value for this property is taken from cfg.d/field_property_defaults.pl. This in combination with the maxlength property determines the value for size attribute for <input> HTML fields or the cols attribute for <textarea> HTML fields (used by Longtext fields).
|
input_lookup_params | undef | NO | Additional parameters to pass to the input_lookup_url. | E.g. an indication of which autocomplete file to use. |
input_lookup_url | undef | NO | The URL to use for autocompletion. | This is generally set using the workflow configuration rather than directly in the field configuration. The URL must be on the same server hostname as the repository. |
input_ordered | 1 | NO | Whether the ordering of values needs to be captured. | In some multiple => 1 fields, such as creators, the order of the values is important and by default numbers are shown to the left of input rows and to the right are move up and move down arrows. However, with some multiple => 1 fields the order is not important, in which case you can set this to 0 to stop the arrows and numbers being shown.
|
input_rows | 10 | NO | The number of rows in an HTML form field. | The default value for this property is taken from cfg.d/field_property_defaults.pl. This property determines the value for the size attribute for <select> HTML fields (used by Set fields) or the rows attribute for <textarea> HTML fields (used by Longtext fields). |
maxlength | 255 | NO | The maximum allowed length in characters for a value. | This can be a very simple validation check. Also, it may confuse users to be allowed to type in 255 characters in a field intended for something like a postcode/zipcode. |
maxwords | undef | NO | The maximum number of words that should be entered for this field. | This field is only applicable to Longtext_counter fields. It does not restrict the number of words, it just displays this limit next to a dynamic counter of the number of words already entered. |
render_input | undef | NO | The name of a function which will render the input for this field. | This can be difficult to use as it must return the same CGI parameters as the default input form would have. It is easiest on simple fields. The subroutine is passed the following parameters: ( $field, $session, $current_value, $dataset, $staff, $hidden_fields, $object, $basename ). It should return the XHTML DOM object of the chunk of HTML form. |
required | 0 | NO | Whether the field must have a value set. | If this is set to 1 then the field is always marked as required, no matter what the Workflow Format configuration says. |
separator | undef | NO | What character to use to separate elements of the value for the field for purposes of search | Only used by default for Keywords fields. |
show_help | undef | NO | How to display the help text for this field in the input form. | Can be one of three values: always, never or toggle. Toggle (allow to expand or collapse) is used if not explicitly defined. This can only be overridden in Workflow Format configuration. |
title_xhtml | undef | NO | An XHTML DOM object to use for the title for this field. | This can only be set via the Workflow Format configuration not via the metadata field directly. This is so that the workflow can conditionally change the title of a field. If you need to change the title based on the eprint type, then you can just create a bespoke phrase with the format "eprint_fieldname_" + fieldname + "." + eprint_type (e.g. eprint_fieldname_id_number.article ).
|
toform | undef | NO | This function is allowed to modify the current value which appears in the form. | E.g. if your database stores userids in a field, but you want to allow people to edit them as usernames, then this function can be used to take the current value (a userid) and return the associated username. This value is what appears in the field in the search form. It is passed: $value, $session, $object, $basename and returns the user-facing version of $value .
|
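The toform and fromform pair from the table above can be sketched with the userid and username example used in the toform notes. The parameter order follows the table; everything else is hypothetical. The two lookup hashes stand in for a real user lookup, the function names are made up, and the sketch assumes these properties accept code references.

```perl
use strict;
use warnings;

# Stand-in lookup tables; a real repository would look users up
# through the session instead (not shown here).
my %username_for = ( 1 => 'alice', 2 => 'bob' );
my %userid_for   = reverse %username_for;

# toform: stored value (userid) -> value shown in the form (username).
sub userid_toform {
    my ( $value, $session, $object, $basename ) = @_;
    return defined $value ? $username_for{$value} : undef;
}

# fromform: value typed into the form (username) -> stored value (userid).
sub username_fromform {
    my ( $value, $session, $object, $basename ) = @_;
    return defined $value ? $userid_for{$value} : undef;
}

# Hypothetical field definition wiring the two together.
my $owner_field = {
    name     => 'owner',
    type     => 'int',
    toform   => \&userid_toform,
    fromform => \&username_fromform,
};
```

The two functions should be exact inverses of each other, otherwise a value can change simply by opening and saving the form.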
Ordering, Indexing and Searching
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
make_single_value_orderkey | undef | NO | The orderkey function (potentially language specific) used to order by this field. | This property allows you to define a function or refer to a predefined function to override the default EPrints' orderkey generation. This property is passed each value from multiple fields, in turn. It is passed: $field, $value, $dataset and returns an ordervalue string.
|
make_value_orderkey | undef | NO | Like make_single_value_orderkey but this is passed the array reference for a multiple field rather than just single values. | It should return the orderkey string for the entire value. It is passed: $field, $value, $session, $langid, $dataset .
|
match | EQ | NO | How to match the value(s) of this field against search terms. | This property can be EQ, EX, IN or SET. The default, EQ, means treat the search term as a single string: a match occurs only if the whole search term matches the field value (or one of its values if multiple => 1). This can be modified for the field in the search form configuration. |
merge | ALL | NO | Whether this field's value(s) has to match any or all of the search terms. | This property can be ALL or ANY. The default, ALL, means all search terms have to match the value(s) in this field. This can be modified for the field in the search form configuration. For certain values of match this can also be changed by the user in the search form itself. |
text_index | 0 | NO | Whether the indexer considers this field for full-text indexing. | Some types of metadata field have a default of 1, e.g. Text fields and Longtext fields. |
search_cols | 40 | NO | How many characters (columns) wide the input field for searching this field should be. | The default value for this property is taken from cfg.d/field_property_defaults.pl. If one search field searches more than one field, then the properties from the first field listed are used. |
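A make_single_value_orderkey function receives $field, $value and $dataset and returns an ordervalue string, as described in the table above. The sketch below shows one possible rule for English titles: lower-case the value, drop a leading article and strip punctuation. The function name and the rule itself are illustrative only, not the default EPrints orderkey generation.

```perl
use strict;
use warnings;

# Build an ordering key for a single title value.
sub title_orderkey {
    my ( $field, $value, $dataset ) = @_;
    return '' unless defined $value;

    my $key = lc $value;
    $key =~ s/^(?:the|an|a)\s+//;   # ignore a leading English article
    $key =~ s/[[:punct:]]//g;       # punctuation should not affect order
    return $key;
}
```

With this rule "The Cat's Cradle" is ordered under "cats cradle", so it files under C rather than T.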
Other Properties
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
allow_null | 1 | NO | Whether the value(s) stored in the database when no input is entered should be NULL or an appropriate default value based on its database field type. | This should generally never be set to 0 and certainly should not be changed to 0 after the field has been added to the database. You are much better off configuring a default value using eprint_fields_default.pl or similar. |
can_clone | 1 | NO | Whether the value(s) for this field should be cloned if a new record is created. | This property is mostly used by system fields such as dir or datestamp. It is applicable when an eprint is cloned using the Use as template or New version action buttons. |
export_as_xml | 1 | NO | Whether the field value(s) should be exported in an XML export. | This is handy to suppress either confidential or confusing fields, like the fileinfo system field. |
import | 1 | NO | Whether new data objects created from an import can set values for the field. | E.g. eprintid, dir are determined when the eprint record is created. Imported metadata would not choose an appropriate value for such fields. |
replace_core | 0 | NO | Whether the field configuration should replace the existing core (system) field with the same name. | This is useful for particularly bespoke requirements. It should be used with great care, as system fields are usually hard-coded because they should not be changed. |
show_in_fieldlist | 1 | NO | Whether to allow this field to appear in fields lists. | If set to 0 will prevent this field appearing in Fields field lists. This is primarily to allow you to remove it from the list of fields in the user configuration which are used to control which fields appear as columns in the Items and Review screens. |
show_in_html | 1 | NO | Whether this field is shown in the Details tab of the eprint control page. | Setting this to 0 is mostly used to hide confusing internal system fields like dir. |
Internal Properties
These are set by the system. Editing them by hand will do strange things.
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
confid | n/a | YES | The ID of the dataset to which this field belongs. | The value for this property is automatically set when the field is loaded. It is used to work out what phrase ids etc. it uses. |
join_path | undef | NO | How to join a field that references a different Data Object. | This should never be defined within a field's configuration. This is built by search to support building a database query to perform a user's search. |
parent | undef | NO | A reference to the actual parent Compound field object. | This is set automatically as a reference to the parent field object. |
parent_name | undef | NO | The name of the parent Compound field to which a sub-field belongs. | This is set automatically to be the name of the parent field. |
provenance | undef | NO | Where this field configuration was generated. | Typically any field configuration either defined within a Data object or in a configuration file (in a cfg.d directory) will leave this as undef by not specifying this property. However, fields created using Manage Metadata Fields will set this to user so it is clear from where this field was created. |
Deprecated Properties
Do not use these!
Name | Default Value | Required | Description | Notes |
---|---|---|---|---|
input_advice_below | undef | NO | Help text to put directly below the form input field | Deprecated. Defined but no longer functional. |
input_advice_right | undef | NO | Help text to put directly to the right of the form input field | Deprecated. Defined but no longer functional. |
input_assist | undef | NO | Provides input assistance. | Deprecated. Defined but no longer functional. |
requiredlangs | [] | NO | The natural languages that should be used in the value(s) for this field | Deprecated. Defined but no longer functional. |
sql_langid | undef | NO | The language ID for SQL. | Deprecated. Defined but no longer functional. |
sql_sorted | 0 | NO | Whether SQL should be sorted. | Deprecated. Defined but no longer functional. |
Required Phrases
These are phrases which you need to define in the local repository phrases file to control how this field renders. Some types of field (e.g. set fields) require further phrases in addition to the ones listed below.
The actual name of the field, as it will appear to users is stored in
datasetid + "_fieldname_" + fieldname
The default help to display, when the field is being input, is stored in
datasetid + "_fieldhelp_" + fieldname
For example:
<epp:phrase id="eprint_fieldname_abstract">Abstract</epp:phrase>
<epp:phrase id="eprint_fieldhelp_abstract">A summary of the item's content. If the item has a formal abstract then that is what should be entered here. No complicated text formatting is possible.</epp:phrase>
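The two phrase id patterns above can be written as a small helper, which is handy when checking that every field in a dataset has its phrases defined. The function name `field_phrase_ids` is hypothetical and is not part of the EPrints API.

```perl
use strict;
use warnings;

# Build the required phrase ids for one field, following
# datasetid + "_fieldname_" + fieldname and
# datasetid + "_fieldhelp_" + fieldname.
sub field_phrase_ids {
    my ( $datasetid, $fieldname ) = @_;
    return (
        name => $datasetid . '_fieldname_' . $fieldname,
        help => $datasetid . '_fieldhelp_' . $fieldname,
    );
}

my %ids = field_phrase_ids( 'eprint', 'abstract' );
```

For the abstract field of the eprint dataset this gives the two ids used in the phrase example above.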
Database
Most fields have a representation in the SQL database using one or more columns. The sub-pages for each field type give the details.
API
When you request (or set) a value of a metadata field, it is usually handled as a Perl scalar (which is a string or number).
ALL values passed around in the API should be encoded in utf-8 or BAD THINGS may happen.
For example,
$eprint->set_value( "title", "For Us, The Living" );
Sets the title to the given string.
my $foo = $eprint->get_value( "title" );
Sets $foo to the string of the title, eg. "For Us, The Living".
Multiple Fields
If a field is set to multiple, then instead of a single value, a reference to an array of values is used. E.g.
$eprint->set_value( "corp_creators", [ "Jims Research", "Jones Research" ] );
Other Exceptions
See the specific page for the full details.
- name fields are represented as a hash of the parts.
- compound fields do something a bit clever.
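As a sketch of what "a hash of the parts" means for name fields, the value of a multiple name field can be pictured as a reference to an array of hash references. The part keys `family` and `given` used here are assumptions based on common EPrints usage (the Name field page is the authority), and the data is made up.

```perl
use strict;
use warnings;

# The shape a multiple name field value is assumed to take:
# an array reference of hashes, one hash per person.
my $creators_name = [
    { family => 'Heinlein', given => 'Robert A.' },
    { family => 'Smith',    given => 'Jane' },
];

# Turn each hash of parts into a "Family, Given" display string.
my @display = map { "$_->{family}, $_->{given}" } @{$creators_name};
```

A single-valued name field would hold just one such hash reference instead of an array of them.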
Examples
Example field definitions that can be copied into a configuration file and edited as appropriate.