Difference between revisions of "Metadata"

Latest revision as of 13:11, 8 September 2017


Metadata Field Types

There are many different types of metadata field. The type controls how a field is rendered, indexed, searched and so forth. A field always has a type and a name property, and usually has several more. Most properties are documented on this page, but some properties are only available to certain types of field, and they are listed on the page for that field.

Some of these field types provide very rich features, others very simple ones. For example, the url field works just like the text field, except that a value is only valid if it looks like a URL, and it is rendered as a hyperlink.

A metadata field describes one field of data in one type of Data Object. For example the "title" field of an EPrint Object or the "email" field in a User Object.

Every Data Object has system fields (which are set by the system, and not alterable), but the User Object and EPrint Object have additional fields which are configured on a per-repository basis.

These can be customised in the user_fields.pl and eprint_fields.pl files. Note that changing these files does not automatically modify the underlying database, so this should generally only be done before the database is created. Some metadata properties do not affect the database, and are marked as such.

If you add or remove fields, or modify a property which affects the database, then you'll need to alter the database to match. In 3.0 this must be done by hand, but we have plans to build a tool to do this for you.

Inheritance

This is the list of useful field types. Below it are the other field types, which are included only for completeness and are not intended to be used as part of the configuration.

Some field types inherit the properties of another, and then modify them in some way. For example the namedset field works like a set field except that it gets its options from a namedsets file not from the options=>[] in the field properties.

Basic metadata field - this is abstract, fields must be one of the types listed below...
- Boolean - TRUE or FALSE (or can be unset, of course).
- Compound - virtual field, joins together several "multiple" fields, e.g. author_name and author_email.
  - Dataobjref - references another data object.
  - Multilang - allows language variants of a field, e.g. titles in French, German and/or English.
  - Relation - stores a typed relationship with something represented by a URI.
- Date - stores a date
  - Time - stores a date and time
    - Timestamp - stores current date and time
- Float - stores a floating-point value
  - Decimal - stores a decimal number, specifying the number of digits before and after the decimal point.
- Id - like basic text field but search only finds exact matches
  - Id (case-insensitive) - like Id field but search finds exact matches ignoring case (use for usernames, email addresses, etc.)
    - Email - an email address
  - Keywords - stores as longtext but searchable as exact individual keyword phrases
  - Recaptcha - virtual field to display a reCAPTCHA to prevent spamming of public input forms.
  - Text - the basic text field. Maximum 255 bytes. N.B. UTF-8 means some characters take more than one byte.
    - Longtext - like text but allows much longer text (65,000 bytes).
      - Longtext counter - like longtext but with a word counter (requires jQuery)
    - Pagerange - a range of pages from one number to another.
    - Secret - used to store passwords and other secrets.
    - Set - a limited set of options
      - Namedset - like a normal set, but takes its options from a namedset configuration file.
      - Subject - possible values are taken from the Subject hierarchy.
      - Base64 - stores Base64 encoded data.
        - Image - stores an image as Base64-encoded data.
  - Url - stores a URL.
  - Uuid - stores a UUID.
- Int - an integer value
  - Bigint - a large integer value (can be greater than 2,147,483,647 or less than -2,147,483,648).
  - Counter - an auto-incrementing integer value.
  - Itemref - a reference to another Data Object (e.g. a user or other eprint)
  - Pagerange - a pagerange, e.g. 122-130
- Multipart - Stores multiple sub-fields, like the parts of a person's name.
  - Name - Stores a person's name broken up into logical parts.
- Subobject - Stores another data object under a parent data object.
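
To make the set versus namedset distinction concrete, here is a minimal sketch of the two kinds of definition side by side. The field names are made up, `$c` is a local stand-in for the configuration hash that a file such as eprint_fields.pl would normally be handed, and `set_name` is assumed to be the property that names the namedsets file; check the page for each field type before relying on it.

```perl
use strict;
use warnings;

# Stand-in for the repository configuration hash; created here only so
# that the sketch can be run on its own.
my $c = { fields => { eprint => [] } };

push @{ $c->{fields}->{eprint} },

    # A set field carries its options inline, in the field properties.
    {
        name    => 'review_status',    # hypothetical field name
        type    => 'set',
        options => [ 'pending', 'accepted', 'rejected' ],
    },

    # A namedset field refers to a namedsets file instead of listing options.
    {
        name     => 'review_kind',     # hypothetical field name
        type     => 'namedset',
        set_name => 'review_kind',     # assumed property naming the namedsets file
    };
```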

Internal-use and Deprecated Field Types

Basic metadata field
- File - DEPRECATED. Use Subobject field instead.
- Id
  - Text
    - Set
      - Arclanguage - as for set, but the options are the valid languages of this repository. Probably better to use Multilang field.
      - Fields - as for set, but the options are the fields in a dataset.
      - Langid - used internally by Multilang fields to store the language ID.
    - Longtext
      - Search - a serialised search
- Int
  - Year - DEPRECATED. Use Date field instead.
- Storable - Stores a serialization of a Perl data structure.

Properties

Note that true/false properties use 1 and 0 to indicate their setting.

Some properties can be temporarily set or overridden by the Workflow Format and Citation Format files.

Core Properties

Name	Default Value	Required	Description	Notes
name	n/a	YES	This is the internal name of the field. It should only contain alphanumeric characters and underscores. It will be used to identify this field in scripts, other configuration files, in the database, and in the XML export/import system, etc.	This property is not required when defining sub-fields of Compound fields where `sub_name` should be used. This property affects the database structure. It must be unique within the Data Object (so the EPrint Object cannot have two fields called `email`, but the EPrint Object and User Object can each have a field with the same name).
type	n/a	YES	This sets the type of the metafield, which in turn affects what other properties it may have.	This property affects the database structure. The value must be one of the metafield types listed above.
multiple	`0`	NO	This indicates if this field is a single value or a list of values. E.g. `title` is only a single Longtext field but `creators` is a multiple Compound field.	This property affects the database structure. In the database a non-multiple field is stored in one (or more) columns in the main object table, but a multiple field gets its own table.
readonly	`0`	NO	Whether to prevent this field from being edited in the workflow.	This is useful if you want to display the pre-generated value(s) for this field for reference whilst other fields are being edited.
sql_index	`1`	NO	When the database is created this field indicates that an SQL index should be created to speed searching.	This property affects the database structure. Different field types override the default value with the sensible option for that type of field. It is not worth putting an SQL index on a field that is only ever searched for words in it (like title or abstract) but it is worth indexing fields whose values are explicitly searched for, or where ranges are searched (e.g. Date fields, Set fields etc.). It is unlikely you will need to set this by hand. You could change it after the database has been created but this will not update the database nor have any other effect.
sub_name	`undef`	YES	This is a special property which is required instead of the `name` property for the sub-fields inside Compound fields.	This property affects the database structure. The actual name of these fields is then forced to be parent field `name + '_' + sub_name`. E.g. Compound field `creators` has a sub-field with `sub_name => 'name'`. In this case the actual name of the name field in the system, database etc. is `creators_name`.
virtual	`0`	NO	Whether this field calculates a value rather than storing it in the database.	Compound fields are virtual fields, whereas their sub-fields are not, as they store values in the database. Other types of virtual field will require a `render_value` to be specified, as with no value stored the default render method will have nothing to display.
volatile	`0`	NO	Whether the field is liable to change frequently.	Setting `volatile => 1` will prevent new revisions being created and avoid other post-commit events from being triggered, such as re-indexing.
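
The `sub_name` rule in the table can be illustrated with a small sketch. It assumes that a Compound field lists its sub-fields under a `fields` property (not described on this page), and `subfield_names` is a hypothetical helper written for this example, not part of EPrints.

```perl
use strict;
use warnings;

# A compound field definition in the style of eprint_fields.pl.
# Each sub-field uses sub_name rather than name.
my $creators = {
    name     => 'creators',
    type     => 'compound',
    multiple => 1,
    fields   => [                      # assumed property holding the sub-fields
        { sub_name => 'name', type => 'name' },
        { sub_name => 'id',   type => 'text' },
    ],
};

# Hypothetical helper: derive the actual sub-field names using the rule
# parent name + '_' + sub_name.
sub subfield_names {
    my ($field) = @_;
    return map { $field->{name} . '_' . $_->{sub_name} } @{ $field->{fields} };
}

print join( ', ', subfield_names($creators) ), "\n";
# prints: creators_name, creators_id
```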

Rendering Properties

These properties affect how values of the metadata in this field are rendered.

Certain of these properties can be turned on temporarily by the Citation Format files - render_magicstop for example.

Name	Default Value	Required	Description	Notes
as_list	`undef`	NO	Whether to display a collection of sub-field values as a table row or as a separate list in the input form.	This is only applicable for Compound fields that have `multiple => 1`. It is useful where the length of the table row would exceed the width of a typical user's window.
browse_link	`undef`	NO	This is the name of a view which values of this field should be linked to.	E.g. if there was a `Browse by Publishers` view configured named `pubs`, then adding `browse_link => 'pubs'` to the publisher field would cause it to be linked into the browse view page for the named publisher whenever it is rendered.
render_custom	`undef`	NO	Whether to use a pre-defined way of rendering the value for this field.	E.g. for Name fields by default the name will link to the creators browse view for that name. This property can be re-used within bespoke render functions to specify whether some custom way of rendering this field's value (e.g. with a link) should be used.
render_dont_link	`0`	NO	Whether rendered field is not encapsulated in a hyperlink.	Currently only affects Url fields and Email fields.
render_dynamic	`0`	NO	Whether the rendering of this field can use JavaScript to make it dynamic.	limit_names_shown.pl uses this property to determine if the list of hidden creators/editors can be expanded to show all creators/editors.
render_limit	`undef`	NO	How many values for this field should be displayed.	limit_names_shown.pl uses this property to determine how many creators/editors to display. If `undef` just render all values.
render_magicstop	`0`	NO	Whether to render a full stop at the end of this field, unless the last character is a dot, question mark or exclamation mark.	This helps avoid the ugly `World without Cheese?.` effect you get when titles end in `?` or `!`.
render_noreturn	`0`	NO	Whether CR (Carriage Return) and LF (Line Feed) characters are turned into normal spaces.
render_quiet	`0`	NO	Whether to prevent a big ugly `UNSPECIFIED` being rendered if field is unset.	E.g. setting `render_quiet => 1` on a field means it just gets rendered as nothing if it is unset.
render_single_value	`undef`	NO	The value of this property is the name of a function to call to render individual values from this field.	For a multiple field this is called once per value in the list of values. The function should take the following parameters: (`$session, $field, $value, $object`). It should return a XHTML DOM object of the rendered value.
render_value	`undef`	NO	The value of this property is the name of a function to call to render the field as a whole.	As with `render_single_value`, but this gets passed the entire list of values (an array reference) if it is a multiple field. Parameters passed are: (`$session, $self, $value, $all_langs, $no_link, $object`). `$all_langs` indicates that all language variants should be shown and is only really useful for Multilang fields. `$no_link` being true is a request to place no hyperlinks in the resulting HTML. The function should return an XHTML DOM object of the rendered value.
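
As an illustration of the `render_magicstop` rule described in the table, here is a minimal plain-Perl sketch of the string logic only. `with_magicstop` is a hypothetical function written for this example; it works on strings and is not the code EPrints itself uses, which renders XHTML DOM objects.

```perl
use strict;
use warnings;

# Hypothetical helper: append a full stop unless the text already ends
# in a dot, question mark or exclamation mark.
sub with_magicstop {
    my ($text) = @_;
    return $text if $text =~ /[.?!]\z/;
    return $text . '.';
}

print with_magicstop('For Us, The Living'),    "\n";
print with_magicstop('World without Cheese?'), "\n";
```

The second title is left alone, which is how the `World without Cheese?.` effect is avoided.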

Input and Validation Properties

Name	Default Value	Required	Description	Notes
default_value	`undef`	NO	The default value to set for this field.	This is mainly used for system fields. For custom fields it is better to use eprint_fields_default.pl or similar.
expanded_subjects	`[]`	NO	Subjects to show un-collapsed in the subject tree field in the workflow.	This is only applicable for Subject fields. All the fields listed will have their paths in the subject tree expanded so they can be seen, making them easier to find.
false_first	`0`	NO	Display the false option before the true option in the input form.	This is only applicable to Boolean fields. By default true option is always displayed before the false option.
fromform	`undef`	NO	The inverse of `toform`. This takes the value from the form and converts it into the value that will be stored in the database.	This function is passed the parameters: `$value, $session, $object, $basename` where `$value` is the value entered on the HTML form, and the return value is the value to be stored in the database. This function is not called when editing the eprint is cancelled.
get_item	`undef`	NO	A bespoke function for how to lookup the Data object based on the stored value.	Only applicable to Itemref fields.
help_xhtml	`undef`	NO	An XHTML DOM object to use for the help text for this field.	This can only be set via the Workflow Format configuration not via the metadata field directly. This is so that the workflow can conditionally change the help on a field. If you need to change the help text based on the eprint type, then you can just create a bespoke phrase with the format `"eprint_fieldhelp_" + fieldname + "." + eprint_type` (e.g. `eprint_fieldhelp_id_number.article`).
input_add_boxes	`2`	NO	The number of rows to add when clicking the `More input rows` button for a field that sets `multiple => 1`.	The default value for this property is taken from `cfg.d/field_property_defaults.pl`.
input_boxes	`3`	NO	The number of input rows to initially show for a field that sets `multiple => 1`.	The default value for this property is taken from `cfg.d/field_property_defaults.pl`.
input_cols	`60`	NO	The number of columns in an HTML form field.	The default value for this property is taken from `cfg.d/field_property_defaults.pl`. This in combination with the `maxlength` property determines the value for `size` attribute for `<input>` HTML fields or the `cols` attribute for `<textarea>` HTML fields (used by Longtext fields).
input_lookup_params	`undef`	NO	Additional parameters to pass to the `input_lookup_url`.	E.g. an indication of which autocomplete file to use.
input_lookup_url	`undef`	NO	The URL to use for autocompletion.	This is generally set using the workflow configuration rather than directly in the field configuration. The URL must be on the same server hostname as the repository.
input_ordered	`1`	NO	Whether the ordering of values needs to be captured.	In some `multiple => 1` fields, such as creators, the order of the values is important and by default numbers are shown to the left of input rows and to the right are move up and move down arrows. However, with some `multiple => 1` fields the order is not important, in which case you can set this to `0` to stop the arrows and numbers being shown.
input_rows	`10`	NO	The number of rows in an HTML form field.	The default value for this property is taken from `cfg.d/field_property_defaults.pl`. This property determines the value for `size` attribute for `<select>` HTML fields (used by Set fields) or the `rows` attribute for `<textarea>` HTML fields (used by Longtext fields).
maxlength	`255`	NO	The maximum allowed length in characters for a value.	This can be a very simple validation check. Also, it may confuse users to be allowed to type in 255 characters in a field intended for something like a postcode/zipcode.
maxwords	`undef`	NO	The maximum number of words that should be entered for this field.	This field is only applicable to Longtext_counter fields. It does not restrict the number of words, it just displays this limit next to a dynamic counter of the number of words already entered.
render_input	`undef`	NO	The name of a function which will render the input for this field.	This can be difficult to use as it must return the same CGI parameters as the default input form would have. It is easiest on simple fields. The subroutine is passed the following parameters: (`$field, $session, $current_value, $dataset, $staff, $hidden_fields, $object, $basename`). It should return the XHTML DOM object of the chunk of HTML form.
required	`0`	NO	Whether the field must have a value set.	If this is set to `1` then the field is always marked as required, no matter what the Workflow Format configuration says.
separator	`undef`	NO	What character to use to separate elements of the value for the field for purposes of search	Only used by default for Keywords fields.
show_help	`undef`	NO	How to display the help text for this field in the input form.	Can be one of three values: `always`, `never` or `toggle`. Toggle (allow to expand or collapse) is used if not explicitly defined. This can only be overridden in Workflow Format configuration.
title_xhtml	`undef`	NO	An XHTML DOM object to use for the title for this field.	This can only be set via the Workflow Format configuration not via the metadata field directly. This is so that the workflow can conditionally change the title of a field. If you need to change the title based on the eprint type, then you can just create a bespoke phrase with the format `"eprint_fieldname_" + fieldname + "." + eprint_type` (e.g. `eprint_fieldname_id_number.article`).
toform	`undef`	NO	This function is allowed to modify the current value which appears in the form.	E.g. if your database stores `userid`s in a field, but you want to allow people to edit them as `username`s, then this function can be used to take the current value (a `userid`) and return the associated `username`. This value is what appears in the field in the search form. It is passed: `$value, $session, $object, $basename` and returns the user-facing version of `$value`.
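
The `toform` and `fromform` pair can be sketched using the userid/username example from the table. The two lookup hashes are hypothetical stand-ins for a real user lookup, the field name `owner` is made up, and the functions are attached as code references, which is an assumption about how these properties are supplied.

```perl
use strict;
use warnings;

# Hypothetical lookup tables standing in for a real user database query.
my %USERNAME_BY_ID = ( 1 => 'admin', 42 => 'rheinlein' );
my %ID_BY_USERNAME = reverse %USERNAME_BY_ID;

# toform: stored value (a userid) -> value shown in the form (a username).
sub userid_toform {
    my ( $value, $session, $object, $basename ) = @_;
    return undef unless defined $value;
    return $USERNAME_BY_ID{$value};
}

# fromform: value typed into the form (a username) -> value to store (a userid).
sub username_fromform {
    my ( $value, $session, $object, $basename ) = @_;
    return undef unless defined $value;
    return $ID_BY_USERNAME{$value};
}

# Hypothetical field definition using the pair.
my $owner_field = {
    name     => 'owner',
    type     => 'int',
    toform   => \&userid_toform,
    fromform => \&username_fromform,
};
```

The two functions should be exact inverses of each other; otherwise a value can change simply by being opened in the form and saved again.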

Ordering, Indexing and Searching

Name	Default Value	Required	Description	Notes
make_single_value_orderkey	`undef`	NO	The `orderkey` function (potentially language specific) used to order by this field.	This property allows you to define a function or refer to a predefined function to override the default EPrints' orderkey generation. This property is passed each value from multiple fields, in turn. It is passed: `$field, $value, $dataset` and returns an `ordervalue` string.
make_value_orderkey	`undef`	NO	Like `make_single_value_orderkey` but this is passed the array reference for a multiple field rather than just single values.	It should return the `orderkey` string for the entire value. It is passed: `$field, $value, $session, $langid, $dataset`.
match	`EQ`	NO	How to match the value(s) of this field against search terms.	This property can be `EQ`, `EX`, `IN` or `SET`. The default `EQ` treats the search term as a single string, which matches only if the whole term equals the field value (or one of its values if `multiple => 1`). This can be modified for the field in the search form configuration.
merge	`ALL`	NO	Whether this field's value(s) has to match any or all of the search terms.	This property can be `ALL` or `ANY`. The default `ALL` means all search terms have to match the value(s) in this field. This can be modified for the field in the search form configuration. For certain values of `match` this can also be changed by the user in the search form itself.
text_index	`0`	NO	Whether the indexer considers this field for full-text indexing.	Some types of metadata field have a default of `1`, e.g. Text fields and Longtext fields.
search_cols	`40`	NO	How many characters (columns) wide the input field for searching this field is.	The default value for this property is taken from `cfg.d/field_property_defaults.pl`. If one search field searches more than one field, then the properties from the first field listed are used.
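
A `make_single_value_orderkey` function might look something like the sketch below, which takes the three parameters listed in the table and returns a string to sort by. The function name and the choice to ignore leading English articles are illustrative assumptions, not EPrints defaults.

```perl
use strict;
use warnings;

# Hypothetical orderkey function: lower-case the value and drop a leading
# English article so that "The Moon ..." sorts under M rather than T.
sub title_orderkey {
    my ( $field, $value, $dataset ) = @_;
    return '' unless defined $value;

    my $key = lc $value;
    $key =~ s/^\s+//;                  # ignore leading whitespace
    $key =~ s/^(?:the|an|a)\s+//;      # drop one leading article
    return $key;
}

print title_orderkey( undef, 'The Moon is a Harsh Mistress', undef ), "\n";
# prints: moon is a harsh mistress
```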

Other Properties

This may be applicable to text

Name	Default Value	Required	Description	Notes
allow_null	`1`	NO	Whether the value(s) stored in the database when no input is entered should be `NULL` or an appropriate default value based on its database field type.	This should generally never be set to `0` and certainly should not be changed to `0` after the field has been added to the database. You are much better off configuring a default value using `eprint_fields_default.pl` or similar.
can_clone	`1`	NO	Whether the value(s) for this field should be cloned if a new record is created.	This property is mostly used by system fields such as `dir` or `datestamp`. It is applicable when an eprint is cloned using the `Use as template` or `New version` action buttons.
export_as_xml	`1`	NO	Whether the field value(s) should be exported in an XML export.	This is handy to suppress either confidential or confusing fields, like the `fileinfo` system field.
import	`1`	NO	Whether new data objects created from an import can set values for the field.	E.g. `eprintid` and `dir` are determined when the eprint record is created. Imported metadata would not choose an appropriate value for such fields.
replace_core	`0`	NO	Whether the field configuration should replace the existing core (system) field with the same `name`.	This is useful for particularly bespoke requirements. It should be used with great care, as system fields are usually hard-coded because they should not be changed.
show_in_fieldlist	`1`	NO	Whether to allow this field to appear in fields lists.	If set to `0` will prevent this field appearing in Fields field lists. This is primarily to allow you to remove it from the list of fields in the user configuration which are used to control which fields appear as columns in the Items and Review screens.
show_in_html	`1`	NO	Whether this field is shown in the Details tab of the eprint control page.	This is mostly used to hide confusing internal system fields like `dir`.

Internal Properties

These are set by the system. Editing them by hand will do strange things.

Name	Default Value	Required	Description	Notes
confid	n/a	YES	The ID of the dataset to which this field belongs.	The value for this property is automatically set when the field is loaded. It is used to work out what phrase ids etc. it uses.
join_path	`undef`	NO	How to join a field that references a different Data Object.	This should never be defined within a field's configuration. This is built by search to support building a database query to perform a user's search.
parent	`undef`	NO	A reference to the actual parent Compound field object.	This is set automatically as a reference to the parent field object.
parent_name	`undef`	NO	The name of the parent Compound field to which a sub-field belongs.	This is set automatically to be the name of the parent field.
provenance	`undef`	NO	Where this field configuration was generated.	Typically any field configuration defined either within a Data object or in a configuration file (in a `cfg.d` directory) will leave this as `undef` by not specifying this property. However, fields created using Manage Metadata Fields will set this to `user` so it is clear from where this field was created.

Deprecated Properties

Do not use these!

Name	Default Value	Required	Description	Notes
input_advice_below	`undef`	NO	Help text to put directly below the form input field	Deprecated. Defined but no longer functional.
input_advice_right	`undef`	NO	Help text to put directly to the right of the form input field	Deprecated. Defined but no longer functional.
input_assist	`undef`	NO	Provides input assistance.	Deprecated. Defined but no longer functional.
requiredlangs	`[]`	NO	The natural languages that should be used in the value(s) for this field	Deprecated. Defined but no longer functional.
sql_langid	`undef`	NO	The language ID for SQL.	Deprecated. Defined but no longer functional.
sql_sorted	`0`	NO	Whether SQL should be sorted.	Deprecated. Defined but no longer functional.

Required Phrases

These are phrases which you need to define in the local repository phrases file to control how this field renders. Some types of field (e.g. set fields) need further phrases in addition to the ones listed below.

The actual name of the field, as it will appear to users is stored in

datasetid + "_fieldname_" + fieldname

The default help to display, when the field is being input, is stored in

datasetid + "_fieldhelp_" + fieldname

For example:

   <epp:phrase id="eprint_fieldname_abstract">Abstract</epp:phrase>
   <epp:phrase id="eprint_fieldhelp_abstract">A summary of the items content. 
      If the item has a formal abstract then that is what should be entered 
      here. No complicated text formatting is possible.</epp:phrase>
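
The phrase ids follow directly from the two patterns given above. As a small sketch, the helpers below build them by plain string concatenation; both function names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers that build the phrase ids from the patterns
# datasetid + "_fieldname_" + fieldname and
# datasetid + "_fieldhelp_" + fieldname.
sub fieldname_phrase_id {
    my ( $datasetid, $fieldname ) = @_;
    return $datasetid . '_fieldname_' . $fieldname;
}

sub fieldhelp_phrase_id {
    my ( $datasetid, $fieldname ) = @_;
    return $datasetid . '_fieldhelp_' . $fieldname;
}

print fieldname_phrase_id( 'eprint', 'abstract' ), "\n";
print fieldhelp_phrase_id( 'eprint', 'abstract' ), "\n";
# prints: eprint_fieldname_abstract
#         eprint_fieldhelp_abstract
```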

Database

Most fields have a representation in the SQL database using one or more columns. The sub-pages for each field type give the details.

API

When you request (or set) a value of a metadata field, it is usually handled as a Perl scalar (which is a string or number).

ALL values passed around in the API should be encoded in utf-8 or BAD THINGS may happen.

For example,

$eprint->set_value( "title", "For Us, The Living" );

Sets the title to the given string.

my $foo = $eprint->get_value( "title" );

Sets $foo to the string of the title, eg. "For Us, The Living".

Multiple Fields

If a field is set to multiple, then instead of a single value, a reference to an array of values is used. Eg.

$eprint->set_value( "corp_creators", [ "Jims Research", "Jones Research" ] );
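
Reading a multiple field back works the same way in reverse: you get an array reference and dereference it. The sketch below uses `Local::MockObject`, a made-up stand-in that only imitates `set_value` and `get_value` so the example runs without EPrints; with a real data object only the last few lines would be needed.

```perl
use strict;
use warnings;

# Minimal stand-in for a data object. It keeps values in memory only
# and imitates nothing beyond set_value and get_value.
package Local::MockObject;

sub new {
    my ($class) = @_;
    my $self = { data => {} };
    return bless $self, $class;
}

sub set_value {
    my ( $self, $name, $value ) = @_;
    $self->{data}->{$name} = $value;
    return;
}

sub get_value {
    my ( $self, $name ) = @_;
    return $self->{data}->{$name};
}

package main;

my $eprint = Local::MockObject->new;
$eprint->set_value( "corp_creators", [ "Jims Research", "Jones Research" ] );

# The value comes back as an array reference; dereference it to loop.
my $corp_creators = $eprint->get_value("corp_creators");
for my $creator ( @{$corp_creators} ) {
    print "$creator\n";
}
```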

Other Exceptions

See the specific page for the full details.

name fields are represented as a hash of the parts.

compound fields do something a bit clever.
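
As a rough sketch of what these values look like in Perl: a name is a hash of its parts, and a multiple compound value is assumed here to be a list of hashes keyed by each sub-field's `sub_name`. The key names `family` and `given`, the sample data, and the `format_name` helper are assumptions for illustration; the pages for the Name and Compound fields have the authoritative details.

```perl
use strict;
use warnings;

# A single name value: a hash of the parts (assumed key names).
my $name = { family => 'Heinlein', given => 'Robert A.' };

# A multiple compound value: one hash per row, keyed by sub_name.
my $creators = [
    { name => { family => 'Heinlein', given => 'Robert A.' }, id => 'rah@example.org' },
    { name => { family => 'Jones' },                          id => undef },
];

# Hypothetical helper: join whichever of family and given are present.
sub format_name {
    my ($n) = @_;
    return join ', ', grep { defined($_) && length($_) } $n->{family}, $n->{given};
}

for my $creator ( @{$creators} ) {
    print format_name( $creator->{name} ), "\n";
}
# prints: Heinlein, Robert A.
#         Jones
```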

Examples

Example field definitions that can be copied into a configuration file and edited as appropriate.
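
The definitions below are a hedged sketch rather than a tested configuration. All the field names are invented, `$c` is a local stand-in for the configuration hash that eprint_fields.pl is normally given, and only properties described on this page are used (plus `options`, mentioned under Inheritance).

```perl
use strict;
use warnings;

# Stand-in for the repository configuration hash, so the sketch runs alone.
my $c = { fields => { eprint => [] } };

push @{ $c->{fields}->{eprint} },

    # A short single-value text field, e.g. for a postcode.
    {
        name      => 'contact_postcode',
        type      => 'text',
        maxlength => 10,
    },

    # A long free-text field that renders as nothing when unset.
    {
        name         => 'internal_notes',
        type         => 'longtext',
        input_rows   => 5,
        render_quiet => 1,
    },

    # A yes/no field which must always be given a value.
    {
        name     => 'peer_checked',
        type     => 'boolean',
        required => 1,
    },

    # A list of dates; multiple fields get their own database table.
    {
        name     => 'review_dates',
        type     => 'date',
        multiple => 1,
    },

    # A set field with its options listed inline.
    {
        name    => 'review_status',
        type    => 'set',
        options => [ 'pending', 'accepted', 'rejected' ],
    };
```

Remember that `name`, `type` and `multiple` affect the database structure, so adding definitions like these to an existing repository also means altering the database to match.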




@@ Line 1: / Line 1: @@
-==EPrints Metadata==
+{{Manual}}
-===Introduction===
+{{EPrints Metadata Fields Content}}
-Metadata is data about data. In this case information about the documents we are storing.
-This section describes how to configure the metadata of an archive, and gives information on the various properties of metadata in GNU EPrints.
-From the point of view of the database, all eprint records have the same metadata fields. Although each eprint type will only expose a subset of those fields to the user interface.
-===Modifying the Metadata fields in an Archive===
-To add metadata fields you must edit <tt>ArchiveMetadataFieldsConfig.pm</tt> and then erase and recreate the database tables. This will destroy all your data. If you want to add or modify fields to an archive without destroying your data, you will need to go into the SQL.
-All types of record in GNU EPrints have metadata. All have a core set of fields which may not be modified as these are required by the software. The "eprints" and "users" table also have a list of metadata fields which are used for storing the information about users and eprints.
-In addition to the metadata types there are functions in <tt>ArchiveMetadataFieldsConfig.pm</tt> which can be modified to set default values of fields and to set certain fields based on values from other fields (automatics).
-Automatics are useful as they allow you store the simple answer to a more complex question. Eg. "Does this record have a document which has a security level public?" which a user may want to make a search criteria. If this is stored as a simple field in the record then it is easy to search on it.
-===Default Configuration===
-The default configuration of metadata fields, as of v2.3.0, has been designed as part of a collaboration between EPrints and library staff as a good starting setup for an institutional archive. It will still probably want some change or other depending on your requirements.
-There are also so default values for some fields and some automatics.
-The default value for the "hideemail" of user records is "TRUE" because we don't want to show peoples emails unless we were given explicit permission.
-If the "frequency" field of a user record is not set, it automaticall gets set to "never". This is because undefined means pretty much the same as "never" but it makes it clearer not to have it listed as "unspecified". "frequency" only applies to editors. It's how often they get sent updates on what needs approving.
-The default security level for a document is "" (public). The default language for a document is the language of the current session (whatever requested by the user doing the deposit). This is not important unless you have an archive which cares what languages individual documents are in.
-The default type for an eprint is "article". This means that eprints can never have an undefined type.
-There are several automatics for eprint records. If the eprint is a "monograph" or "thesis" and it does not have the "institution" field set, then nothing happens... but there is some code you can uncomment to make it set "institution" to the name of your University (or whatever).
-If the eprint is of type "patent" then it is automatically set to "published" as we should never be storing unpublished patents.
-If the eprint is of type "thesis" then it is automatically set to "unpublished" as thesis' do not get published. If this is incorrect for your archive it is easily disabled.
-The "effective date" field is set to the same value as the "date of issue". Unless it's undefined in which case it's set to the value of the "date of submission". If they are ''both'' undefined then it is set to the datestamp of the metadata record. This means that "effective date" contains the "best" date for this record, to use when searching and ordering. It's not actually rendered anywhere.
-"full_text_status" is set to a value indicating the status of the full text. One of "none" - no full text, "public" - full text is available, "restricted" - full text is available but is restricted in some way (security setting is not public).
-There is also a rather ugly hack to set the value of "fileinfo" to info about the documents so that it can be rendered as icons in citations.
-==Fields Configuration==
-Fields have a number of properties. The only required properties are "name" and "type". Name is the name of the field. This is used to identify this throughout the system. The other properties depend on what type the field is.
-When you add a field you need to add the "human readable" version in the phrase file, this seperation allows you to change the description without changing the field itself. When you add a field named "foo" to the "eprint" metadata you should add "eprint_typename_foo" to the phrases. You may also wish to add "eprint_typehelp_foo" which is the explanation given to the user on the metadata input page.
-The following types of field are supported, along with their special property options.
-(there are some internal types not mentioned here. There use is not recommended.)
-==Metadata Types==
-There are a number of different types which are stored, input, rendered and searched differently.
-Some types extend more simple types. Eg. "Year" extends "int", but forces a limitation of 4 digits and all the descriptive text is different.
-It is theoretically possible to add your own types which inherit from the inbuilt ones. This should be approached with caution. Pagerange is a good example to look at when considering making your own types. You must make sure your ArchiveConfig.pm has a "use" for the module for your new type, as it won't be loaded otherwise. The module name is the same as for the type, except that the first letter is capitalised. Field type "latlong" would be described by module <tt>EPrints::MetaField::Latlong</tt>.
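The naming rule above (type name with its first letter capitalised) can be expressed as a one-line mapping. The helper name <tt>module_for_type</tt> is made up for this sketch and is not an EPrints function.

```perl
use strict;
use warnings;

# Hypothetical helper: build the module name for a field type by
# capitalising the first letter of the type.
sub module_for_type
{
    my( $type ) = @_;
    return "EPrints::MetaField::" . ucfirst( $type );
}

print module_for_type( "latlong" ), "\n"; # prints EPrints::MetaField::Latlong
```

For a custom "latlong" type, the corresponding line in ArchiveConfig.pm would then be <tt>use EPrints::MetaField::Latlong;</tt>.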
-; int : Optional properties: digits
-This type describes a positive integer. Stored as an <tt>INT</tt> in the database.
-; year : Where possible use a "date" field with a minimum resolution of "year" instead of the "year" type. That way the field can be treated as a date in the searches rather than an int.
-This type describes a year. It works pretty much like "int" but is always 4 digits long. Stored as an <tt>INT</tt> in the database.
-; longtext : Optional properties: input_rows, input_cols, search_cols
-This type describes an unlimited length text field. Used for things like titles and abstracts. It can't be efficiently searched as a single value, so the system indexes the words. See "free text indexing" section. Stored in MySQL as a <tt>TEXT</tt> field.
-; date : Optional properties: min_resolution
-This type describes a date, always expressed as YYYY-MM-DD, eg. 1969-05-23. It is stored as a <tt>DATE</tt> in the database.
-; boolean : Optional properties: input_style
-This is a simple yes/no field which is stored in the database as <tt>SET( 'TRUE','FALSE' )</tt>. It can be rendered as a menu, a check box or radio buttons. (See input_style)
-; name : Optional properties: input_name_cols, search_cols, hide_honourific, hide_lineage, family_first
-This type is used to store names of people (eg. authors). It is split into 4 parts: honourific, given names, family name and lineage. This may seem over fussy but it avoids people putting "Reverend" in the given names or "Junior" in the family name. If you dislike this you can hide honourific and lineage (See ArchiveConfig.pm).
-We use "family name" rather than "last name" in the hope of avoiding international confusion (some countries list the family name first, so their last name is what I would call their "christian", or "first", name).
-Names are stored using 4 SQL fields. The name field "supervisor" would be stored as supervisor_honourific, supervisor_given, supervisor_family, supervisor_lineage. Each is a <tt>VARCHAR(255)</tt>.
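As a small illustration of the column naming just described, the four SQL column names can be derived from the field name. The helper <tt>name_columns</tt> is hypothetical and exists only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: list the four SQL columns used to store a "name"
# field, following the fieldname_part pattern described above.
sub name_columns
{
    my( $fieldname ) = @_;
    return map { $fieldname . "_" . $_ } qw( honourific given family lineage );
}

print join( ", ", name_columns( "supervisor" ) ), "\n";
```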
-; set : Required properties: options
-Optional properties: input_rows, search_rows
-This type is a limited set of options. The list of options must be specified. Each option must also be added to the phrase file. Option "foo" of field "bar" in the "user" dataset will have the phrase id "user_fieldopt_bar_foo".
-Stored in the database as a <tt>VARCHAR(255)</tt>, containing the id of the option.
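The phrase id pattern for set options can be sketched as follows. The helper name <tt>fieldopt_phrase_id</tt> is an assumption made for illustration. It simply joins the dataset id, the literal "fieldopt", the field name and the option id.

```perl
use strict;
use warnings;

# Hypothetical helper: build the phrase id for one option of a "set" field.
sub fieldopt_phrase_id
{
    my( $datasetid, $fieldname, $option ) = @_;
    return join( "_", $datasetid, "fieldopt", $fieldname, $option );
}

# Option "foo" of field "bar" in the "user" dataset.
print fieldopt_phrase_id( "user", "bar", "foo" ), "\n"; # prints user_fieldopt_bar_foo
```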
-; text : Optional properties: input_cols, maxlength, search_cols
-This is a simple text field. It normally has a maximum length of 255 ASCII characters, less if non-ASCII characters are used as these are UTF-8 encoded.
-Stored in the database as a <tt>VARCHAR(255)</tt>.
-; secret : Identical to "text" except that the input field is a starred-out password input field, and it is only ever written to the database, it can't be read back. Writing an empty value will NOT change the previous value.
-; url : Identical to "text" except it is rendered and validated differently.
-; email : Identical to "text" except it is rendered and validated differently.
-; subject : Optional properties: top, showtop, showall, input_rows, search_rows
-This is a hierarchical subject tree. At first glance it works like sets, but it can be searched for all items in or below a given subject. Subjects may be added to the live system.
-The subject tree starts at a subject with the id "ROOT" but a subject ''field'' only offers the items below the subject with the id "subjects". This can be changed using the "top" property, so that you can have two fields whose options are different parts of the same tree.
-Subjects may have more than one parent. eg. ''biophysics'' can appear under both ''physics'' and ''biology'', while still being the same subject.
-See the bin/import_subjects manpage for more information on setting up the initial subjects.
-You may have more than one "subject" field, eg. Subject and Department, with unrelated parts of the subject tree as their "top".
-A later version of eprints2 will have a feature which allows an admin user to limit an editor user to a certain subject (and things below it). So that in the above example you can declare an editor of either a Subject (capital-S) or a Department.
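A minimal sketch of two subject fields drawing on different parts of the same tree might look like this. The field names and the subject id "departments" are assumptions for illustration, and the array name is arbitrary. Where the definitions actually live depends on your archive's configuration files.

```perl
use strict;
use warnings;

# Illustrative field definitions only: "departments" is a hypothetical
# subject id that would have to exist in your subject tree.
my @eprint_fields = (
    { name => "subjects", type => "subject", top => "subjects" },
    { name => "department", type => "subject", top => "departments" },
);
```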
-; pagerange : A range of pages, eg 1-44. Currently not searchable.
-Stored in the database as a <tt>VARCHAR(255)</tt>.
-; datatype : Required properties: datasetid
-Optional properties: input_rows, search_rows
-This field works like a set, but gets its options from the types of the dataset specified.
-For example, if you specified the datasetid "user" then, unless you've changed the defaults, it would give the options "user", "editor" and "admin", which are the types of user specified in '''metadata-types.xml'''.
-Options are:
-;; ''user'' : The types of user.
-;; ''document'' : The types of document.
-;; ''eprint'' : The types of eprint.
-;; ''security'' : Security levels of a document (probably not very useful).
-;; ''language'' : All the languages specified in '''languages.xml'''
-;; ''arclanguage'' : The languages supported by this archive. Configured in ArchiveConfig.pm. Stored in the database as a <tt>VARCHAR(255)</tt>.
-;; langid : This is used internally, it contains an ISO language ID. You probably don't want to use it. Stored as a CHAR(16).
-;; id : This is also used internally, it contains the ID part of a field with the hasid property. Don't use it! Stored in the database as a <tt>VARCHAR(255)</tt>.
-; search (Since EPrints v2.1) : Required properties: datasetid, fieldnames
-Optional properties: allow_set_order
-This type describes a stored search acting on the named dataset. The fields that can be searched are described by fieldnames.
-This field type is quite unusual and you are not really expected to use it. It was created for use in the systems field of the Subscription dataset.
-This field is stored in MySQL as a <tt>TEXT</tt> field.
-Field Properties:
-"status" indicates either "system" or "cosmetic" or "other". "system" properties cannot be changed without erasing and recreating your archive. "cosmetic" properties only affect the display of data and can be safely changed. "other" is explained in the description.
-; name : Status: system
-Required by: all
-Default: NO DEFAULT
-The name of the field. It is strongly recommended that this contain lowercase a-z only.
-; type : Status: system
-Required by: all
-Default: NO DEFAULT
-The type of field. One of the list described above.
-; browse_link : Status: cosmetic
-Optional on: all
-Default: undef
-This is the id of a "browse" view. This will hyperlink this value to the browse for that value when rendering it.
-; confid : Status: cosmetic
-Internal use only. Sets the confid if a field is being created without a dataset. The confid is used as a fake dataset for generating phrase ids.
-; datasetid : Status: other
-Required by: datatype
-Default: NO DEFAULT
-Used to set which dataset's types are this field's options.
-Changing this on a live system could cause some confusion, as values in the old dataset may exist.
-; digits : Status: cosmetic
-Optional on: int
-Default: 20
-Maximum number of digits for this number.
-; input_rows : Status: cosmetic
-Optional on: longtext, set, subject, datatype
-Default: set in ArchiveConfig.pm
-The number of input rows in a text area, or options to display at once in a menu. Setting to 1 will make a pull down menu (unless this is a "multiple" field).
-; search_cols : Status: cosmetic
-Optional on: text, longtext, url, email, name, id
-Default: set in ArchiveConfig.pm
-The width of the search field. If searching multiple fields at once then the value is taken from the first field in the list.
-; search_rows : Status: cosmetic
-Optional on: datatype, set, subject
-Default: set in ArchiveConfig.pm
-The number of items to display in a search field list. If searching multiple fields at once then the value is taken from the first field in the list.
-; input_cols : Status: cosmetic
-Optional on: text, longtext, url, email
-Default: set in ArchiveConfig.pm
-The width of the input field.
-; input_name_cols : Status: cosmetic
-Optional on: name
-Default: set in ArchiveConfig.pm
-The width of the input fields of a "name" field.
-; input_id_cols : Status: cosmetic
-Optional on: fields with "hasid" set.
-Sets the width of the ID input field on a field with an ID.
-Default: set in ArchiveConfig.pm
-; input_add_boxes : Status: cosmetic
-Optional on: fields with "multiple" or "multilang" set.
-Default: set in ArchiveConfig.pm
-How many boxes to add when the user hits the "more spaces" button.
-; input_boxes : Status: cosmetic
-Optional on: fields with "multiple" set.
-Default: set in ArchiveConfig.pm
-How many boxes to initially show on a multiple field.
-; input_style : Status: cosmetic
-Optional on: boolean
-Default: undef
-By default booleans render as a check box. These other formats look a bit clearer on the input field:
-;; menu : Display as a pull-down menu. You will need to set the phrases ''dataset''_fieldopt_''fieldname''_TRUE and ''dataset''_fieldopt_''fieldname''_FALSE (where dataset & fieldname are the ids of the dataset and field). These are the menu options.
-;; radio : Display as radio buttons (ones which deselect when you select another one). You will need to set the phrase ''dataset''_radio_''fieldname''. This phrase should have two "pin" elements: true and false, which are the positions to place the radio buttons.
-; input_assist : Status: cosmetic
-Optional to: all
-Default: undef
-Add an internal button which reloads the page, with a "#" jump to make the page load at the current input field. The assist button does not do anything except cause the page to be reloaded. This is intended to work with the input_advice fields.
-; input_advice_right : Status: cosmetic
-Optional to: all
-Default: undef
-If defined this should be a function pointer which takes params ( $session, $field, $value )
-value is the current value of the field.
-The return value of this should be an XHTML chunk based on the value. This XHTML will appear to the right of the input fields for the value.
-This is intended to give useful advice. For example, if the field is an int holding the eprintid of another eprint, this feature could render the name of that eprint, and a link to it, next to the integer input box.
-; input_advice_below : Status: cosmetic
-Optional to: all
-Default: undef
-As with input_advice_right, only the results appear below, not to the right. Both _right and _below may be used.
-; fromform : Status: cosmetic
-Optional to: all
-Default: undef
-A reference to a perl function which will process the value from the form before storing it. The function will be passed ($value, $session) where value is the value from the form and session is the current EPrints::Session. It should return the processed value.
-This could be used, for example, to turn a username "moj199" into a userid "312" for internal use.
-; toform : Status: cosmetic
-Optional to: all
-Default: undef
-A reference to a perl function which will process the value just before it is displayed in the form. The function will be passed ($value, $session) where value is the value from the database and session is the current EPrints::Session. It should return the processed value.
-This could be used, for example, to turn a userid "312" being used internally by your systems into more human-friendly username "moj199".
-If you use toform then you should probably set fromform to change your values back again.
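A paired fromform/toform could be sketched as below, using the username and userid from the example above. The lookup hashes stand in for a real user database, and the function names and the field name "owner" are hypothetical. Only the ( $value, $session ) parameter order is taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical lookup tables standing in for a real user database.
my %userid_for = ( moj199 => 312 );
my %username_for = reverse %userid_for;

# fromform: turn the username typed into the form into the internal userid.
# Unknown or undefined values are passed through unchanged.
sub username_to_userid
{
    my( $value, $session ) = @_;
    return $value unless defined $value;
    return exists $userid_for{$value} ? $userid_for{$value} : $value;
}

# toform: turn the stored userid back into a username for display.
sub userid_to_username
{
    my( $value, $session ) = @_;
    return $value unless defined $value;
    return exists $username_for{$value} ? $username_for{$value} : $value;
}

# Illustrative field definition wiring both functions in.
my $field = {
    name => "owner",
    type => "text",
    fromform => \&username_to_userid,
    toform => \&userid_to_username,
};

print $field->{fromform}->( "moj199", undef ), "\n"; # prints 312
print $field->{toform}->( 312, undef ), "\n"; # prints moj199
```

Keeping the two functions as exact inverses means a value survives a round trip through the form unchanged.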
-; maxlength : Status: cosmetic
-Optional to: text, email, url, secret
-Default: 255
-The maximum length of the value.
-; hasid : Status: system
-Optional to: all
-Default: 0
-This adds an additional "ID" property to the field. This is most useful on a "name" field which is "multiple". It associates an additional value with the name, for example a username or email address, which can be used to ''uniquely'' identify that person. If you want to get an accurate list of all of someone's papers then their name is NOT good enough.
-You might also wish to make a "publication" text field have an ID which is an optional ISSN, but it makes more sense in "multiple" fields.
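As a sketch, the two uses of hasid mentioned above could be declared like this. The field names and the array name are illustrative assumptions, not taken from a shipped configuration.

```perl
use strict;
use warnings;

# Illustrative field definitions only.
my @eprint_fields = (
    # A list of author names, each with an ID such as a username or email.
    { name => "authors", type => "name", multiple => 1, hasid => 1 },
    # A publication title whose ID could hold an optional ISSN.
    { name => "publication", type => "text", hasid => 1 },
);
```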
-; multilang : Status: system
-Optional to: all (but silly for date, year, int, boolean)
-Default: 0
-If set this makes the field "multilingual". That is to say it can have more than one value, one value per language.
-For example, the "canadian stuff" archive may wish to make its title and abstract multilang so that authors can enter them in both French and English.
-This is more useful than having title_en and title_fr as eprints ''understands'' it and can render the version of the field appropriate to the viewer (if they set a language preference).
-; multiple : Status: system
-Optional to: all (but silly for date, year, int, boolean)
-Default: 0
-If set this property makes the field a LIST rather than one value, and handles rendering and inputting it as a list. The input field will appear with a default of 3 inputs and a "more spaces" button which will reload the page with more if you need more than 3.
-This causes the field to be stored in a separate SQL table.
-; options : Status: other
-Required by: set
-Default: NO DEFAULT
-This should be an array of options. eg.
-<code>
- [ "blue", "green", "red" ]
-</code>
-Removing options on a live system could leave invalid values floating around. Adding options is fine. Don't forget to add them to the phrase file too.
-; required : Status: system
-Optional to: all
-Default: 0
-This indicates that this field is ''always'' required. It is not recommended to set this, but rather to indicate the requiredness of fields by type in the metadata-types.xml file.
-Either way you set it, required fields will cause the item they are in to fail to validate unless the field has a value.
-; requiredlangs : Status: other
-Optional to: fields with "multilang" property
-Default: []
-A list of languages which are required for this multilang field. eg. you can force an "en" (english) entry, while allowing them to optionally add others.
-eg. [ "en", "fr" ]
-A list of codes can be found in languages.xml
-Adding more requiredlangs does not magically give you values for these languages in existing data.
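A multilang field with required languages might be declared along these lines. This is a sketch only: the field names and the array name are assumptions, and the language codes must match those in languages.xml.

```perl
use strict;
use warnings;

# Illustrative field definitions only.
my @eprint_fields = (
    # English is compulsory, other languages may be added optionally.
    { name => "title", type => "longtext", multilang => 1, requiredlangs => [ "en" ] },
    # Both English and French are compulsory.
    { name => "abstract", type => "longtext", multilang => 1, requiredlangs => [ "en", "fr" ] },
);
```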
-; showall : Status: cosmetic
-Optional to: subject
-Default: 0
-By default subjects are only shown if they are "depositable". This option makes all subjects, depositable or not, options.
-; showtop : Status: cosmetic
-Optional to: subject
-Default: 0
-If set then the topmost item in the subject is shown. Usually this is a container, eg. "subjects", and should remain hidden.
-; top : Status: cosmetic
-Optional to: subject
-Default: "subjects"
-Sets the top node in the tree. The options are all the children (and their children).
-; idpart : Used internally.
-; mainpart : Used internally.
-; render_single_value : Status: cosmetic
-Optional to: all
-Default: undef
-This overrides the rendering of a single item. In a multiple, multilang field it will be called on each value of the language to display.
-This is a reference to a function which takes ( $session, $field, $value ) and returns a XHTML DOM fragment.
-Set this to \&EPrints::Latex::render_string to make eprints try to spot latex in this field's values and render it as images instead!
-(Since EPrints v2.1) Set this to \&EPrints::Utils::render_xhtml_field to make eprints read this field as XML and place that XML right in the XHTML web page. (Normally the system would escape all the greater-than and less-than characters.)
-; render_value : Status: cosmetic
-Optional to: all
-Default: undef
-This is a reference to a function which will render the entire value of the field, overriding eprints own renderer. It should take as parameters: ( $session, $field, $value, $alllangs, $nolink )
-The function should return an XHTML DOM fragment.
-If $alllangs is set then the function should render all values on a multilang field, rather than just the "best" one.
-If $nolink is set then no HTML anchor links should be used, eg. to link a URL.
-; render_opts (v2.3.0) : Status: cosmetic
-Optional to: all
-Default: undef
-This allows you to specify certain minor tweaks in how this field's values are displayed without going to all the trouble of creating a custom render_value subroutine.
-See the section on "Metadata Field Render Options" for details.
-; export_as_xml : Status: cosmetic
-Optional to: all
-Default: 1
-If this attribute is set to zero then this field will be omitted from the output of the XML export script.
-; make_value_orderkey : Status: other
-Optional to: all
-Default: undef
-This may be a reference to a subroutine which returns a single string which can be used to alphabetically sort this value. It is used to order the results within the database. The function is passed the following parameters ( $field, $value, $session, $langid ). You may wish to sort certain fields differently for different languages.
-For example, suppose you want a field formatted as a single character followed by an integer ( a934 or b3 ). If you sort this alphabetically then a2 would come after a11. So you make the orderkey function do something like:
-<code>
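- my( $field, $value, $session, $langid ) = @_;
- # Values that do not match the letter+digits pattern are returned unchanged.
- return $value unless $value =~ m/^(.)([0-9]+)$/;
- return sprintf( "%s%08d", $1, $2 );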
-</code>
-This would turn a2 into a00000002 and a11 into a00000011 which will sort correctly alphabetically. Don't worry - these values are only ever used for sorting, they should never get output.
-You should probably use the bin/reindex command on the dataset in question (probably "archive" or "user") after adding or changing this property on a field. This may take a significant amount of time.
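To see the effect of such an orderkey, the following self-contained sketch sorts a few values by their keys. The function name <tt>letter_number_orderkey</tt> and the sample values are made up for illustration. In a real archive the function would be referenced from the field's make_value_orderkey property instead of being called directly.

```perl
use strict;
use warnings;

# Hypothetical orderkey function taking the parameters listed above.
# Values that do not match the letter+digits pattern are returned unchanged.
sub letter_number_orderkey
{
    my( $field, $value, $session, $langid ) = @_;
    return $value unless defined $value && $value =~ m/^(.)([0-9]+)$/;
    return sprintf( "%s%08d", $1, $2 );
}

my @values = ( "a11", "b3", "a2", "a934" );

# Sort by the generated keys rather than by the raw values.
my @sorted = sort {
    letter_number_orderkey( undef, $a ) cmp letter_number_orderkey( undef, $b )
} @values;

print join( " ", @sorted ), "\n"; # prints a2 a11 a934 b3
```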
-; make_single_value_orderkey : Status: other
-Optional to: all
-Default: undef
-This is a slightly simpler version of make_value_orderkey. It only takes ( $field, $value ) as parameters. It is only ever passed single values of $value and lets eprints take care of multiple values (or multilang values) by calling the function once per value.
-As with make_value_orderkey you should reindex after meddling with orderkeys.
-; fieldnames : Status: cosmetic(ish)
-Required by: search
-Default: NO DEFAULT
-This should be a reference to an array of field names - exactly like the ones used in ArchiveConfig.pm to configure search, advanced search and subscriptions.
-Adding fields to this will cause no problem. Removing fields will mean that those fields are ignored when turning values of this field back into searches.
-; can_clone (since v2.2) : Status: changeable (but changes functionality)
-Default: 1
-If can_clone is set to zero then this field will not be cloned when the record is cloned. This may be useful for automatically generated fields, or fields with a meaning such as "content has been spellchecked".
-; sql_index (2.2) : Status: system
-Default: 1
-If this field is set to zero then an SQL index will NOT be created for it. This means the field should never be used in a "value exactly matches" search as it may be very slow. MySQL has a limit of 32 indexes per table, which is why you should use this field if you go over that limit.
-; id_editors_only (2.2) : Status: cosmetic
-Default: 0
-Optional on: fields with "hasid" set.
-It means that the "id" part of the field only appears in the editor view, not the normal user submission form. Some archives may wish to do this to save confusing the person making the deposit.
-; allow_set_order (2.2) : Status: changeable (but changes functionality)
-Default: 1
-Optional on: search
-Prompt user for a search order in addition to the search fields.
-; min_resolution (2.3.0) : Status: changeable
-Default: day
-Optional on: date
-If this is set to "month" then the "day" part of date field will be made optional in the input form and validation.
-If this is set to "year" then both the "day" and "month" parts will be optional.
-This lets users enter just "2003" if that is all they know, without preventing them from giving the exact date if it is relevant and known.
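The effect of min_resolution on validation can be sketched as below. This is not the EPrints validation code: the function name is hypothetical, and the sketch assumes partial dates are written as "YYYY" or "YYYY-MM".

```perl
use strict;
use warnings;

# Hypothetical validator: accept a date string if it is at least as
# precise as min_resolution ("day", "month" or "year") requires.
sub date_meets_resolution
{
    my( $value, $min_resolution ) = @_;

    # A full date is always acceptable.
    return 1 if $value =~ m/^\d{4}-\d{2}-\d{2}$/;
    # Year and month only: fine unless a day is required.
    return 1 if $value =~ m/^\d{4}-\d{2}$/ && $min_resolution ne "day";
    # Year only: fine only when min_resolution is "year".
    return 1 if $value =~ m/^\d{4}$/ && $min_resolution eq "year";
    return 0;
}

print date_meets_resolution( "2003", "year" ), "\n"; # prints 1
print date_meets_resolution( "2003", "month" ), "\n"; # prints 0
```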
-; hide_honourific (2.3.0) : Status: changeable
-Default: 0
-Optional on: name
-If set to true (1) then the honourific field does not appear in the input form for this field.
-; hide_lineage (2.3.0) : Status: changeable
-Default: 0
-Optional on: name
-As for hide_honourific, but hides the lineage part of the name instead.
-; family_first (2.3.0) : Status: changeable
-Default: 0
-Optional on: name
-If set to true (1) then the input form presents the "family" field before the "given" field. This seems to make librarians happy.