Metadata

From EPrints Documentation
Revision as of 15:41, 8 January 2007 by WikiSysop (talk | contribs)
Warning: This page is under development as part of the EPrints 3.4 manual. It may still contain content specific to earlier versions.


Metadata Field Types

There are many different types of metadata field. The type controls how a field is rendered, indexed, searched and so forth. A field always has a type and a name property, and usually has several more. Most properties are documented on this page, but some are only available to certain types of field; those are listed on the page for that field type.

Some of these types provide very rich features, others are very simple. For example, the url field works just like the text field, except that a value is only valid if it looks like a URL, and it is rendered as a hyperlink.

  • Basic metadata field
    • Boolean - TRUE or FALSE (or can be unset, of course)
    • Compound - virtual field, joins together several "multiple" fields, e.g. author_name and author_email
      • Multilang - allows language variants of a field, e.g. titles in French, German and/or English.
    • Date - stores a date.
      • Time - stores a date and time.
    • File - virtual field representing the files in a document
    • Float - stores a floating point value
    • Id - deprecated (do not use)
    • Int - a positive integer value
    • Search - a serialised search
    • Set - a limited set of options
      • Arclanguage - as for set, but the options are the valid languages of this repository
      • Fields - as for set, but the options are the fields in a dataset.
      • Langid - used internally by multilang fields to store the language id.
      • Namedset - like a normal set, but takes its options from a namedset configuration file.
      • Subject - possible values are taken from the Subject hierarchy.
    • Subobject - a virtual field, similar to itemref, but representing an object or objects which are sub-parts of the current object (as opposed to just related in some way)
    • Text - the basic text field. Maximum 255 bytes. NB: in UTF-8 some characters take more than one byte.
      • Email - an email address
      • Fulltext - virtual field used to represent the full text of an eprint
      • Longtext - like text but allows much longer text (65,000 bytes)
      • Name - stores a person's name broken up into logical parts.
      • Secret - used to store passwords.
      • Url - stores a URL

Description

A metadata field describes one field of data in one type of Data Object. For example, the "title" field of an EPrint Object or the "email" field of a User Object.

Every Data Object has system fields (which are set by the system, and not alterable), but the User Object and EPrint Object have additional fields which are configured on a per-repository basis.

These can be customised in the user_fields.pl and eprint_fields.pl files. Note that changing these files does not automatically modify the underlying database, so changes should generally be made only before the database is created. Some metadata properties do not affect the database, and are marked as such.

If you add or remove fields, or modify a property which affects the database, then you'll need to alter the database to match. In EPrints 3.0 this must be done by hand, but we have plans to build a tool to do this for you.
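
As a rough illustration, a pair of field definitions in eprint_fields.pl might look like the sketch below. It assumes the layout used by stock EPrints 3 configurations, where each field is a hash listed under `$c->{fields}->{eprint}`; check your own repository's files before copying it. The field names `project_code` and `reviewers` are hypothetical, and the `my $c = {};` line only stands in for the configuration object that EPrints normally supplies.

```perl
use strict;
use warnings;

# In a real repository EPrints supplies $c to each configuration file;
# it is declared here only so that the sketch runs on its own.
my $c = {};

# Assumed layout: one hash per field, listed under the dataset name.
$c->{fields}->{eprint} = [
    # Hypothetical single-value text field.
    { name => 'project_code', type => 'text' },

    # Hypothetical list of names; multiple => 1 marks it as a list of values.
    { name => 'reviewers', type => 'name', multiple => 1 },
];

# Print a summary of what was declared.
for my $field ( @{ $c->{fields}->{eprint} } ) {
    printf "%s (%s)\n", $field->{name}, $field->{type};
}
```

Both name and type affect the database, so adding fields like these to a repository whose database already exists would also require a matching change to the database.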

Properties

Note that true/false properties use 1 and 0 to indicate their setting.

| name | default | required? | affects db? | category | description |
|------|---------|-----------|-------------|----------|-------------|
| name | n/a | yes | yes | core | This is the internal name of the field. It should only contain a-z and underscores. It will be used to identify this field in scripts, other configuration files, in the database, and in the XML export/import system, etc. It must be unique within the object (so the EPrint Object can't have two fields called "email", but the EPrint Object and the User Object could each have a field with that name). |
| type | n/a | yes | yes | core | This sets the type of the metafield, which in turn affects what other properties it may have. The value must be one of the metafield types listed above. |
| multiple | 0 | no | yes | core | This indicates if this field is a single value or a list of values, e.g. "title" is only a single longtext field but "creators" is a multiple name field. In the database a non-multiple field is stored in one (or more) columns in the main object table, but a multiple field gets its own table. |
| can_clone | 1 | no | no | core | If this is set to false then this field is not copied when the object is cloned. This is mostly used by system fields such as "dir" or "datestamp". |
| sql_index | 1 | no | ish... | core | When the database is created this property indicates that an SQL index should be created to speed searching. Different field types override the default value with the sensible option for that type of field. It's not worth putting an SQL index on a field that is only ever searched for words in it (like title or abstract), but it is worth indexing fields whose values are explicitly searched for, or where ranges are searched: date fields, set fields etc. It's unlikely you'll need to set this by hand. The "affects db?" column is listed as "ish..." because this property only affects the database when the tables are created. You could change it later; it won't break anything. In fact, it won't do anything at all. |
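
The rules in the table above can be sketched as a small standalone check. The helper `check_fields` is hypothetical and not part of EPrints; it only mirrors what the table says: name and type are required, a name may contain only a-z and underscores and must be unique within an object, and multiple, can_clone and sql_index fall back to the listed defaults.

```perl
use strict;
use warnings;

# Defaults taken from the properties table. Individual field types may
# override sql_index; that is not modelled here.
my %DEFAULTS = ( multiple => 0, can_clone => 1, sql_index => 1 );

# Hypothetical helper (not an EPrints API): checks the field list of one
# object and returns a list of error messages. As a side effect it fills
# in the default values on each field that has a name and a type.
sub check_fields {
    my ($fields) = @_;
    my ( %seen, @errors );
    for my $field (@$fields) {
        my $name = $field->{name};
        if ( !defined $name || !defined $field->{type} ) {
            push @errors, 'every field needs a name and a type';
            next;
        }
        push @errors, "bad name: $name"       if $name !~ /\A[a-z_]+\z/;
        push @errors, "duplicate name: $name" if $seen{$name}++;
        for my $prop ( keys %DEFAULTS ) {
            $field->{$prop} = $DEFAULTS{$prop} unless defined $field->{$prop};
        }
    }
    return @errors;
}

my @eprint_fields = (
    { name => 'title',    type => 'longtext' },
    { name => 'creators', type => 'name', multiple => 1 },
    { name => 'Title2',   type => 'text' },    # invalid: capital letter and digit
    { name => 'title',    type => 'text' },    # invalid: duplicate of the first field
);

print "$_\n" for check_fields( \@eprint_fields );
```

Run as written, this should report the bad name Title2 and the duplicate name title. Uniqueness is checked per list, which matches the rule that two different objects may each have a field of the same name.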