Difference between revisions of "Metadata"

From EPrints Documentation

Revision as of 16:33, 8 January 2007

Warning This page is under development as part of the EPrints 3.4 manual. It may still contain content specific to earlier versions.


Metadata Field Types

There are many different types of metadata field. The type controls how a field is rendered, indexed, searched and so forth. A field always has a type and a name property, and usually has several more. Most properties are documented on this page, but some properties are only available to certain types of field, and they are listed on the page for that field.

Some of these subclasses provide very rich features, others very simple ones. For example, the url field works just like the text field, except that it is only valid if it looks like a URL, and when rendered it is a hyperlink.

  • Basic metadata field
    • Boolean - TRUE or FALSE (or can be unset, of course)
    • Compound - virtual field, joins together several "multiple" fields. eg. author_name and author_email
      • Multilang - allows language variants of a field. eg. titles in French, German and/or English.
    • Date - stores a date.
      • Time - stores a date and time.
    • File - virtual field representing the files in a document
    • Float - stores a floating point value
    • Id - deprecated (do not use)
    • Int - a positive integer value
    • Search - a serialised search
    • Set - a limited set of options
      • Arclanguage - as for set, but the options are the valid languages of this repository
      • Fields - as for set, but the options are the fields in a dataset.
      • Langid - used internally by multilang fields to store the language id.
      • Namedset - like a normal set, but takes its options from a namedset configuration file.
      • Subject - possible values are taken from the Subject hierarchy.
    • Subobject - a virtual field, similar to itemref, but representing an object or objects which are sub-parts of the current object (as opposed to just related in some way)
    • Text - the basic text field. Maximum 255 bytes. N.B. UTF-8 means some characters take more than one byte.
      • Email - an email address
      • Fulltext - virtual field used to represent the full text of an eprint
      • Longtext - like text but allows much longer text (65,000 bytes)
      • Name - stores a person's name broken up into logical parts.
      • Secret - used to store passwords.
      • Url - stores a URL

Description

A metadata field describes one field of data in one type of Data Object. For example the "title" field of an EPrint Object or the "email" field in a User Object.

Every Data Object has system fields (which are set by the system, and not alterable), but the User Object and EPrint Object have additional fields which are configured on a per-repository basis.

These can be customised in the user_fields.pl and eprint_fields.pl files. Note that changing these files does not automatically modify the underlying database, so this should (generally) only be done before the database is created. Some metadata properties do not affect the database, and are marked as such.

If you add or remove fields, or modify a property which affects the database then you'll need to alter the database to match. In 3.0 this must be done by hand, but we have plans to build a tool to do this for you.

Default values marked *config indicate that the default value for the repository may be modified in the configuration file field_property_defaults.pl

Properties

Note that true/false properties use 1 and 0 to indicate their setting.

Core Properties

name default description
name n/a This property is always required. This property affects the database structure. This is the internal name of the field. It should only contain a-z and underscores. It will be used to identify this field in scripts, other configuration files, in the database, and in the XML export/import system, etc. It must be unique within the object (so the EPrint Object can't have two fields called "email", but the EPrint Object and User Object could each have a field of the same name).
type n/a This property is always required. This property affects the database structure. This sets the type of the metafield, which in turn affects what other properties it may have. The value must be one of the metafield types listed above.
multiple 0 This property affects the database structure. This indicates if this field is a single value or a list of values. eg. "title" is only a single longtext field but "creators" is a multiple name field. In the database a non-multiple field is stored in one (or more) columns in the main object table, but a multiple field gets its own table.
sql_index 1 When the database is created, this property indicates that an SQL index should be created to speed up searching. Different field types override the default value with the sensible option for that type of field. It's not worth putting an SQL index on a field that is only ever searched for words in it (like title or abstract), but it is worth indexing fields whose values are explicitly searched for, or where ranges are searched: date fields, set fields etc. It's unlikely you'll need to set this by hand. You could change it after the database has been created; it won't break anything. In fact, it won't do anything at all.
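
As an illustration of the core properties, here is a minimal sketch of two field definitions, using the "title" (single longtext) and "creators" (multiple name) examples from the table above. The `$c->{fields}->{eprint}` layout is an assumption about how eprint_fields.pl is organised in your version, and `$c` is created locally only so the fragment runs on its own; check your repository's own configuration files before copying it.

```perl
use strict;
use warnings;

# Assumption: in a real repository $c is supplied by the configuration
# loader. It is created here only so this fragment runs standalone.
my $c = {};

# Assumption: eprint fields are configured as a list of property hashes.
$c->{fields}->{eprint} = [
    # "name" and "type" are always required.
    { name => 'title', type => 'longtext' },

    # "multiple" defaults to 0; set it to 1 for a list of values.
    { name => 'creators', type => 'name', multiple => 1 },
];
```

Because name, type and multiple all affect the database structure, a fragment like this would normally only be changed before the database is created.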

Rendering Properties

These properties affect how values of the metadata in this field are rendered.

Certain of these properties can be turned on temporarily by the Citation Format files - render_magicstop for example.

name default description
browse_link undef This is the name of a view which values of this field should be linked to. For example, if there was a browse-by-publisher view configured named "pubs", then adding browse_link=>"pubs" to the publisher field would cause it to be linked to the page for the named publisher whenever it is rendered.
render_quiet 0 Normally if a field is rendered and it isn't set, it is rendered as a big ugly "UNSPECIFIED". Setting render_quiet on a field means it just gets rendered as nothing if it's empty.
render_magicstop 0 If true then this renders a full stop at the end of this field, unless the last character is a dot, question mark or exclamation mark. This helps avoid the ugly "World without Cheese?." effect you get when titles end in ? or !.
render_noreturn 0 If true then all CR and LF characters are turned into normal spaces.
render_dont_link 0 Set this to true to stop this field hyperlinking itself when rendered. Currently only affects url fields and email fields.
render_single_value undef The value of this property is the name of a subroutine to call to render values from this field. For a multiple field this is called once per value in the list of values. The function should take the following parameters: ( $session, $field, $value, $object). It should return an XHTML DOM object of the rendered value.
render_value undef As with render_single_value, but this gets passed the entire list of values (an array reference) if it's a multiple field. Parameters passed are: ( $session, $self, $value, $all_langs, $no_link, $object ). $all_langs indicates that all language variants should be shown, which is only really useful for multilang fields. $no_link being true is a request to place no hyperlinks in the resulting HTML. It should return an XHTML DOM object of the rendered value.
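
The behaviour described for render_magicstop and render_noreturn can be sketched in plain Perl. This is not the EPrints implementation: the function names apply_magicstop and apply_noreturn are hypothetical, and the functions work on plain strings rather than XHTML DOM objects. They only illustrate the rules stated in the table.

```perl
use strict;
use warnings;

# Hypothetical helper: append a full stop unless the text already ends
# in a dot, question mark or exclamation mark.
sub apply_magicstop {
    my ($text) = @_;
    return $text if $text =~ /[.?!]\z/;
    return $text . '.';
}

# Hypothetical helper: turn every CR and LF character into a space.
sub apply_noreturn {
    my ($text) = @_;
    ( my $flat = $text ) =~ s/[\r\n]/ /g;
    return $flat;
}

print apply_magicstop('World without Cheese?'), "\n";
print apply_magicstop('A Study of Cheese'), "\n";
```

With these rules a title ending in "?" is left untouched, while a title with no closing punctuation gains a single full stop.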

Input and Validation Properties

name default description
required 0 If this is set to true then the field is always marked as required, no matter what the workflow says.
input_add_boxes 2 *config The number of rows to add when clicking the "more rows" button in a multiple or multilang field.
input_boxes 3 *config The number of input rows to initially show in a multiple field.
input_cols 60 *config The number of columns in a text input field.
input_rows 10 *config For longtext input fields, the number of rows of input to show. For set fields this is the number of items to show in a select menu.
input_lookup_url undef The URL to use for autocompletion. This is generally set using the workflow configuration rather than directly in the field configuration. The URL must be on the same server hostname as the repository.
input_lookup_params undef Additional parameters to pass to the input_lookup_url. For example an indication of which autocomplete file to use.
input_ordered 1 This is true by default. In some multiple fields, such as creators, the order of the values is important and by default numbers are shown to the left of input rows and to the right are "move up" and "move down" arrows. However, with some multiple fields the order is not important in which case you can set this to zero to stop the arrows and numbers being shown.
render_input undef The name of a subroutine which will render the input for this field. This is a bit tricky to use as it must return the same CGI parameters as the default input form would have. It's easiest on simple fields. The subroutine is passed the following parameters ( $field, $session, $current_value, $dataset, $staff, $hidden_fields, $object, $basename ). It should return the XHTML DOM object of the chunk of HTML form.
maxlength 255 This is a limit to the maximum allowed size of a value. It may be useful, for example, as a very simple validation check. Also it may confuse users to be allowed to type in 255 characters in a "postcode/zipcode" field.
toform undef This function is allowed to modify the current value which appears in the form. For example, if your database stores userids in a field, but you want to allow people to edit them as usernames, then this function can be used to take the current value (a userid) and return the associated username. This value is what appears in the field in the search form. It is passed ( $value, $session, $object, $basename ) and returns the user-facing version of $value.
fromform undef The inverse of toform. This takes the value from the form and converts it into the value that will be stored in the database. It is passed the parameters ( $value, $session ), where $value is the value entered on the web form, and the return value is the value to be stored in the database.

can_clone 1 If this is set to false then this field is not copied when the object is cloned. This is mostly used by system fields such as "dir" or "datestamp".
input_advice_right undef Do not use.
input_assist undef Do not use.
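
The toform and fromform pair described above can be sketched as two inverse subroutines, following the userid/username example. This is a minimal sketch under stated assumptions: the lookup hashes stand in for a real user lookup, the subroutine names are hypothetical, and how the subroutines are attached to the field's toform and fromform properties should be checked against your EPrints version. Only the parameter lists come from the table.

```perl
use strict;
use warnings;

# Hypothetical lookup tables standing in for a real user database query.
my %USERNAME_BY_ID = ( 1 => 'alice', 2 => 'bob' );
my %ID_BY_USERNAME = reverse %USERNAME_BY_ID;

# toform: stored value (a userid) -> value shown in the form (a username).
# $session, $object and $basename are accepted but unused in this sketch.
sub userid_toform {
    my ( $value, $session, $object, $basename ) = @_;
    return undef unless defined $value;
    return $USERNAME_BY_ID{$value};
}

# fromform: value typed into the form (a username) -> value to store (a userid).
sub username_fromform {
    my ( $value, $session ) = @_;
    return undef unless defined $value;
    return $ID_BY_USERNAME{$value};
}

# Round trip: what the form shows converts back to what was stored.
my $shown  = userid_toform(2);
my $stored = username_fromform($shown);
print "$shown -> $stored\n";
```

The two functions need to be exact inverses of each other; otherwise a value that is displayed and then saved unchanged would be stored as something different.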