API:EPrints/DataObj/Document
Contents
- 1 NAME
- 2 DESCRIPTION
- 3 CORE METADATA FIELDS
- 3.1 docid (int)
- 3.2 rev_number (int)
- 3.3 files (subobject, multiple)
- 3.4 eprintid (itemref)
- 3.5 pos (int)
- 3.6 placement (int)
- 3.7 format (namedset)
- 3.8 formatdesc (text)
- 3.9 language (namedset)
- 3.10 security (namedset)
- 3.11 license (namedset)
- 3.12 main (text)
- 3.13 date_embargo (date)
- 3.14 date_embargo_retained (date)
- 3.15 relation (relation, multiple)
- 3.16 media (compound)
- 4 METHODS
- 4.1 get_system_field_info
- 4.2 main_input_tags
- 4.3 main_render_option
- 4.4 doc_with_eprintid_and_pos
- 4.5 get_dataset_id
- 4.6 create
- 4.7 create_from_data
- 4.8 get_defaults
- 4.9 clone
- 4.10 remove
- 4.11 get_eprint
- 4.12 get_baseurl
- 4.13 is_public
- 4.14 path
- 4.15 file_path
- 4.16 get_url
- 4.17 local_path
- 4.18 files
- 4.19 remove_file
- 4.20 set_main
- 4.21 get_main
- 4.22 set_format
- 4.23 set_format_desc
- 4.24 upload
- 4.25 add_file
- 4.26 sanitise
- 4.27 upload_archive
- 4.28 add_archive
- 4.29 add_directory
- 4.30 upload_url
- 4.31 commit
- 4.32 get_derived_versions
- 4.33 validate
- 4.34 user_can_view
- 4.35 get_type
- 4.36 queue_files_modified
- 4.37 files_modified
- 4.38 rehash
- 4.39 make_indexcodes
- 4.40 remove_indexcodes
- 4.41 cache_file
- 4.42 register_parent
- 4.43 thumbnail_url
- 4.44 icon_url
- 4.45 render_icon_link
- 4.46 new_window_1
- 4.47 preview_1
- 4.48 public_0
- 4.49 public_1
- 4.50 with_link_0
- 4.51 render_preview_link
- 4.52 caption_frag
- 4.53 set_name
- 4.54 thumbnail_plugin
- 4.55 thumbnail_path
- 4.56 thumbnail_types
- 4.57 remove_thumbnails
- 4.58 make_thumbnails
- 4.59 mime_type
- 4.60 get_parent_dataset_id
- 4.61 get_parent_id
- 4.62 add_relation
- 4.63 remove_relation
- 4.64 has_relation
- 4.65 search_related
- 4.66 render_citation_link
- 4.67 render_video_preview
- 4.68 permit
- 5 COPYRIGHT
NAME
EPrints::DataObj::Document - A single format of a record.
DESCRIPTION
A Document represents a single format of an EPrint (e.g. PDF): the actual file(s) rather than the metadata.
CORE METADATA FIELDS
docid (int)
The unique ID of the document.
rev_number (int)
The revision number of this document record.
files (subobject, multiple)
A virtual field which represents the list of files which are part of this record.
eprintid (itemref)
The ID number of the eprint to which this document belongs.
pos (int)
The position of the document record within those associated with the eprint.
placement (int)
Placement of the document: the order in which documents should be shown. This may differ from pos, as ultimate_doc_pos may lead to a different ordering.
format (namedset)
The format of this document. One of the types of the namedset document.
formatdesc (text)
An additional description of this document. For example the specific version of a format.
language (namedset)
The ISO 639-1 code of the language of this document. The default configuration of EPrints does not set this.
security (namedset)
The security type of this document - who can view it. One of the types of the namedset security.
license (namedset)
The license applied to this document. One of the types of the namedset license.
main (text)
The file which we should link to. For something like a PDF file this is the only file. For an HTML document with images it would be the name of the actual HTML file.
date_embargo (date)
The date until which the document has restricted access (set by security). At that point the embargo is lifted: security is set to public and this field is set back to undef.
Requires the bin/lift_embargos script to be deployed as a cron job.
date_embargo_retained (date)
The retained date of any embargo originally placed on this document. This is updated when a user modifies date_embargo but is not unset by the bin/lift_embargos script.
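The interplay of date_embargo, date_embargo_retained and security can be modelled in a few lines of plain Perl. This is a sketch of the behaviour described above, not the code of bin/lift_embargos: the document is a plain hash rather than a data object, lift_embargo_if_due is a hypothetical helper, dates are assumed to be full YYYY-MM-DD strings, and treating an embargo date equal to today as due is an assumption.

```perl
use strict;
use warnings;

# Hypothetical model: $doc is a plain hash reference, not an EPrints object.
# $today is assumed to be a full "YYYY-MM-DD" string.
sub lift_embargo_if_due
{
    my( $doc, $today ) = @_;

    my $embargo = $doc->{date_embargo};
    return 0 if !defined $embargo;

    # Full ISO dates of equal length compare correctly as strings.
    return 0 if $embargo gt $today;

    $doc->{security}     = "public";
    $doc->{date_embargo} = undef;
    # date_embargo_retained is deliberately left untouched.

    return 1;
}
```

After a successful call the hash still carries the original embargo date in date_embargo_retained, which matches the description of that field.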
relation (relation, multiple)
Predicated relationships between this document and other data objects within the archive.
media (compound)
A compound field containing a description of the document media - dimensions, codec etc.
METHODS
get_system_field_info
$metadata = EPrints::DataObj::Document->get_system_field_info
Returns an array describing the system metadata of the document dataset.
main_input_tags
EPrints::DataObj::Document::main_input_tags( $session, $object )
main_render_option
EPrints::DataObj::Document::main_render_option( $session, $object )
doc_with_eprintid_and_pos
EPrints::DataObj::Document::doc_with_eprintid_and_pos( $repository, $eprintid, $pos )
Find the document for an eprint based on the $eprintid and $pos values supplied matching the document's corresponding fields.
Returns the document data object matching the criteria. If no match is found, checks the dark_document dataset, if it exists, for a corresponding match.
get_dataset_id
$dataset = EPrints::DataObj::Document->get_dataset_id
Returns the ID of the EPrints::DataSet object to which this record belongs.
create
$doc = EPrints::DataObj::Document::create( $session, $eprint )
Create and return a new document belonging to the given $eprint object.
N.B. This creates the document in the database, not just in memory.
create_from_data
$dataobj = EPrints::DataObj::Document->create_from_data( $session, $data, $dataset )
Create document data object from $data provided.
Returns undef if a bad (or no) eprintid is specified in $data.
Otherwise calls the parent method in EPrints::DataObj.
get_defaults
$defaults = EPrints::DataObj::Document->get_defaults( $session, $data )
Return default values for this data object based on the starting $data.
clone
$newdoc = $doc->clone( $eprint )
Attempt to clone this document, both the document metadata and the actual files. The clone will be associated with the given $eprint.
Returns the newly cloned document.
remove
$success = $doc->remove
Attempt to completely delete this document, including derived documents such as thumbnails.
Returns a boolean indicating whether the document was deleted.
get_eprint
$eprint = $doc->get_eprint
Return the eprint this document is associated with.
Alias for:
$doc->get_parent
get_baseurl
$url = $doc->get_baseurl
Returns the base URL of the document.
is_public
$boolean = $doc->is_public
Returns true if this document has no security set and is in the live archive. Otherwise, returns false.
path
$path = $doc->path
Returns the relative path to the document without specifying any file.
file_path
$path = $doc->file_path( [ $file ] )
Returns the relative path to $file stored in this document. If $file is undefined returns the path to the main file.
This is an efficient shortcut to this:
my $file = $doc->stored_file( $filename );
my $path = $file->path;
get_url
$url = $doc->get_url( [ $file ] )
Returns the full URL of the document.
If $file is not specified then the main file is used.
local_path
$path = $doc->local_path
DEPRECATED.
Returns the full path of the directory where this document is stored in the filesystem.
files
%files = $doc->files
Return a hash, the keys of which are all the files belonging to this document (relative to local_path). The values are the sizes of the files in bytes.
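Because files returns a filename-to-size hash, totals and listings fall out of ordinary hash handling. A minimal sketch, assuming $doc is an EPrints::DataObj::Document; total_file_size is a hypothetical helper name, not part of the API:

```perl
use strict;
use warnings;

# Sum the sizes reported by $doc->files.
# total_file_size is a hypothetical helper, not an EPrints method.
sub total_file_size
{
    my( $doc ) = @_;

    my %files = $doc->files;    # filename => size in bytes
    my $total = 0;
    $total += $_ for values %files;

    return $total;
}
```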
remove_file
$success = $doc->remove_file( $filename )
Attempts to remove the file with $filename. $filename must be specified in the format that can be retrieved by get_stored_file.
set_main
$doc->set_main( $main_file )
Sets main for the document to the named $main_file and adjusts format and mime_type as necessary. Will not affect the database until the document is committed.
Unsets main if $main_file is undefined.
get_main
$filename = $doc->get_main
Return the filename of the file set as main in this document.
set_format
$doc->set_format( $format )
Set format for document to $format. Will not affect the database until document is committed.
Alias for:
$doc->set_value( "format" , $format );
set_format_desc
$doc->set_format_desc( $format_desc )
Set format description for document to $format_desc. Will not affect the database until document is committed.
Alias for:
$doc->set_value( "format_desc" , $format_desc );
upload
$success = $doc->upload( $filehandle, $filename, [ $preserve_path, $filesize ] )
DEPRECATED - Use add_file, which will automatically identify the file type.
Upload the contents of the given $filehandle into this document as the given $filename.
If $preserve_path then make any subdirectories needed, otherwise place this in the top level directory.
add_file
$fileobj = $doc->add_file( $file, $filename, [ $preserve_path ] )
$file is the full path to a file to be added to the document, with name $filename. $filename is passed through EPrints::System#sanitise before being written.
If $preserve_path is true then include path components in $filename.
Returns the file object if successfully created or undef on failure.
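The methods documented so far combine into a typical "attach one file" sequence: create the document, add the file, set main, then commit. The following is a sketch under stated assumptions rather than a recommended implementation: attach_file is a hypothetical helper, $session and $eprint are assumed to be live EPrints objects, and the EPrints modules are assumed to be loaded by the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: attach one file to $eprint as a new document.
# Uses only calls documented on this page.
sub attach_file
{
    my( $session, $eprint, $path, $filename ) = @_;

    # create() writes the new document to the database straight away.
    my $doc = EPrints::DataObj::Document::create( $session, $eprint );
    return if !defined $doc;

    my $fileobj = $doc->add_file( $path, $filename );
    if( !defined $fileobj )
    {
        $doc->remove;    # do not leave an empty document behind
        return;
    }

    # Caveat: add_file sanitises $filename, so the stored name may differ
    # from $filename if it contained invalid characters.
    $doc->set_main( $filename );

    # set_main only takes effect in the database once committed.
    $doc->commit or return;

    return $doc;
}
```

Removing the document when add_file fails is a design choice of this sketch; a caller that wants to retry the upload could keep the empty document instead.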
sanitise
$cleanfilename = sanitise( $filename )
DEPRECATED - use EPrints::System#sanitise.
Sanitises filename by replacing invalid characters.
upload_archive
$success = $doc->upload_archive( $filehandle, $filename, $archive_format )
DEPRECATED - use add_archive.
Upload the file contents provided through $filehandle using the filename from $filename. How to deal with the specified $archive_format (e.g. .zip, .tar.gz) is configured in EPrints::SystemSettings.
add_archive
$success = $doc->add_archive( $file, $archive_format )
Adds the contents of the archive $file to the document, where $archive_format is the format of the archive file (e.g. .zip, .tar.gz, etc.)
Returns a boolean dependent on whether the contents of the archive file are added to the document's subdirectory on the filesystem.
add_directory
$success = $doc->add_directory( $directory )
Upload the contents of $directory to this document. This will not set the document's main field.
This method expects $directory to have a trailing slash /.
Returns boolean depending on success of adding directory to document.
upload_url
$success = $doc->upload_url( $url )
Attempts to grab files from the given $url over HTTP. Grabbing files this way is always problematic. Therefore, by default, only relative links will be followed and only links to files in the same directory or subdirectory will be followed.
This method by default uses wget. However, you can modify this in EPrints::SystemSettings.
Returns a boolean dependent on whether file(s) were successfully uploaded.
commit
$success = $doc->commit( [ $force ] )
Commit any changes that have been made to this data object to the database.
Calls set_document_automatic_fields in the archive's configuration first to set any automatic fields that may be needed.
If $force is defined and true then still commit even if there are no non-volatile changes.
Returns boolean depending on whether commit of document data object is successful.
get_derived_versions
@derived_docs = $doc->get_derived_versions
Return an array of documents that are derived from the current document through the isVersionOf relation.
validate
$problems = $doc->validate( [ $for_archive ] )
Validates the document data object. If $for_archive is defined this will be passed through to the archive configured validate_document method, in case it is required for bespoke changes to this method.
Returns a reference to an array of XHTML DOM objects describing validation problems with the entire document, including the metadata and repository config specific requirements.
A returned reference to an empty array indicates no problems.
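Since validate signals success with a reference to an empty array, a caller can gate commit on it. A minimal sketch, assuming $doc is an EPrints::DataObj::Document; commit_if_valid is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: commit $doc only if validate reports no problems.
# Always returns the array reference from validate so the caller can
# render any problems.
sub commit_if_valid
{
    my( $doc ) = @_;

    my $problems = $doc->validate;

    # An empty array means there is nothing to report.
    if( !@{$problems} )
    {
        $doc->commit or die "commit failed\n";
    }

    return $problems;
}
```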
user_can_view
$boolean = $doc->user_can_view( $user )
Return true if this document's security settings allow the given $user access to view it.
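is_public and user_can_view are often consulted together when deciding whether to serve a file. The sketch below shows one way to combine them. may_view is a hypothetical helper, and refusing anonymous visitors (an undefined $user) for non-public documents is an assumption of this example, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper combining is_public and user_can_view.
sub may_view
{
    my( $doc, $user ) = @_;

    # Public documents need no user at all.
    return 1 if $doc->is_public;

    # Assumption: without a logged-in user, only public documents are visible.
    return 0 if !defined $user;

    return $doc->user_can_view( $user ) ? 1 : 0;
}
```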
get_type
$type = $doc->get_type
Returns the type of this document.
Alias for:
$doc->value( "format" );
queue_files_modified
$doc->queue_files_modified
Adds a files_modified task (e.g. for creating/updating thumbnails) to the event queue.
files_modified
$doc->files_modified
This method does all the things that need doing when a file has been modified.
rehash
$doc->rehash
Recalculate the hash value of the document. Uses MD5 of the files (in alphabetic order), but can use a user-specified hashing function instead.
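To illustrate what "MD5 of the files in alphabetic order" can look like, the following stand-alone sketch computes one digest per file, ordered by filename, using the core Digest::MD5 module. It is not the code behind rehash: digests_in_order is a hypothetical helper, and it works on in-memory filename-to-content pairs rather than on stored files.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# Hypothetical helper: takes filename => content pairs and returns a list
# of [ filename, md5 ] pairs ordered alphabetically by filename.
sub digests_in_order
{
    my( %contents ) = @_;

    my @digests;
    for my $name ( sort keys %contents )
    {
        push @digests, [ $name, md5_hex( $contents{$name} ) ];
    }

    return @digests;
}
```

Fixing the order makes the result reproducible regardless of the order in which the files were added.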
make_indexcodes
$indexcodes_doc = $doc->make_indexcodes
Make the index codes document for this document. Returns the generated index codes document on success or undef on failure.
remove_indexcodes
$doc = $doc->remove_indexcodes
Remove any documents containing index codes for this document. Returns the number of documents removed.
cache_file
$filename = $doc->cache_file( $suffix );
DEPRECATED
Returns a cache filename for this document with the given $suffix.
register_parent
$doc->register_parent( $parent )
Registers the $parent EPrints::DataObj::EPrint object for this document.
This may cause reference loops, but it does avoid two identical eprint data objects existing at once.
thumbnail_url
$doc->thumbnail_url( $size )
Returns the URL for the thumbnail of the document for a specified $size. If $size is unspecified, it defaults to small. Other values for $size include medium and preview.
Returns undef if the file for that type of thumbnail does not exist.
This method is called by icon_url. It is best to use that method to reliably retrieve the required URL.
icon_url
$doc->icon_url( $size )
Returns the URL for the icon of the document for a specified $size. If $size is unspecified, it defaults to small. Other values for $size include medium and preview.
render_icon_link
$frag = $doc->render_icon_link( %opts )
Render a link to the icon for this document.
Options:
new_window_1
new_window => 1
Make the link open in a new window (_blank) rather than the current window.
preview_1
preview => 1
If possible, provide a preview pop-up.
public_0
public => 0
Show thumbnail/preview only on public documents.
public_1
public => 1
Show thumbnail/preview on all documents if possible.
with_link_0
with_link => 0
Do not link.
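The options are passed as a flat list of key/value pairs. A minimal usage sketch, assuming $doc is an EPrints::DataObj::Document; icon_link_for_listing is a hypothetical wrapper and the particular combination of options is only an example:

```perl
use strict;
use warnings;

# Hypothetical wrapper: icon link that opens in a new window, offers a
# preview pop-up where possible, and shows thumbnails only for public
# documents.
sub icon_link_for_listing
{
    my( $doc ) = @_;

    return $doc->render_icon_link(
        new_window => 1,
        preview    => 1,
        public     => 0,
    );
}
```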
render_preview_link
$frag = $doc->render_preview_link( %opts )
Render a link to the preview for this document (if available) using a lightbox.
Options:
caption_frag
caption => $frag
XHTML fragment to use as the caption, defaults to empty.
set_name
set => "name"
The name of the set this document belongs to, defaults to none (preview won't be shown as part of a set).
thumbnail_plugin
$plugin = $doc->thumbnail_plugin( $size )
Return the plugin used to generate thumbnails of the specified $size.
thumbnail_path
$path = $doc->thumbnail_path
DEPRECATED
Returns the filesystem path to location of thumbnails for the document.
thumbnail_types
$doc->thumbnail_types
Returns array containing names of all the thumbnail types available for this document.
remove_thumbnails
$doc->remove_thumbnails
Removes all thumbnail files associated with this document.
make_thumbnails
$doc->make_thumbnails
Make all the thumbnail files required for this document.
mime_type
$mime_type = $doc->mime_type
DEPRECATED - use $doc->value( "mime_type" )
Returns the MIME type of this document.
get_parent_dataset_id
$dataset_id = EPrints::DataObj::Document->get_parent_dataset_id
Returns the ID of the parent dataset for a document (i.e. eprint).
get_parent_id
$eprintid = $doc->get_parent_id
Returns the ID of the parent for this document (i.e. the eprint ID).
add_relation
$doc->add_relation( $tgt, @types )
Add one or more relations with type(s) specified by @types to the document data object and pointing to the $tgt data object.
This will not update the $tgt data object even if reflexive relations exist.
remove_relation
$doc->remove_relation( $tgt, [ @types ] )
Removes the relations from the document data object to the $tgt data object. If @types is not defined, removes all relations to $tgt. If $tgt is undefined, removes all relations of the types given in @types.
If both $tgt and @types are undefined, no relations will be removed. If you want to remove all relations do:
$doc->set_value( "relation", [] );
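The four combinations of $tgt and @types can be summarised by a small stand-alone model. This is a sketch of the rules as described above, not the implementation of remove_relation: filter_relations is a hypothetical function, and representing each relation as a hash with type and uri keys, with the target identified by a URI string, is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical model of the remove_relation rules. $relations is an array
# reference of { type => ..., uri => ... } hashes (assumed layout).
# Returns a new array reference holding the relations that would be kept.
sub filter_relations
{
    my( $relations, $tgt_uri, @types ) = @_;

    # Neither a target nor types given: nothing is removed.
    return [ @{$relations} ] if !defined $tgt_uri && !@types;

    my %wanted = map { $_ => 1 } @types;

    my @kept = grep {
        my $same_target = !defined $tgt_uri || $_->{uri} eq $tgt_uri;
        my $same_type   = !@types || $wanted{ $_->{type} };
        !( $same_target && $same_type );
    } @{$relations};

    return \@kept;
}
```

A relation is dropped only when it matches every criterion that was actually supplied, which is why the "nothing supplied" case has to be handled separately to avoid removing everything.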
has_relation
$bool = $doc->has_relation( $tgt, [ @types ] )
Returns true if document data object has relations to $tgt. If @types is also given, check these relations satisfy all of the given types. If $tgt is undefined, relations that satisfy the given types may be to any data object.
search_related
$list = $doc->search_related( [ $type ] )
Returns an EPrints::List that contains all document data objects related to this document data object. If $type is defined, returns only those document data objects related by that type.
render_citation_link
$citation = $doc->render_citation_link( $style, %params )
Returns an XHTML DOM citation rendering of the document data object, using the citation $style and %params provided and setting the class of the DOM parent element to ep_document_link.
render_video_preview
$frag = $doc->render_video_preview( $css_class )
Returns an XHTML DOM fragment rendering of an HTML5 video preview with optional subtitles. If $css_class is provided, it is assigned to the parent element of the XHTML DOM fragment.
Access / security concerns should be addressed at a higher level.
permit
$boolean = $doc->permit( $priv, $user )
Returns boolean depending on whether the $user has the privilege $priv to carry out a particular action on this document data object.
COPYRIGHT
© Copyright 2000-2024 University of Southampton.
EPrints 3.4 is supplied by EPrints Services.
http://www.eprints.org/eprints-3.4/
LICENSE
This file is part of EPrints 3.4 http://www.eprints.org/.
EPrints 3.4 and this file are released under the terms of the GNU Lesser General Public License version 3 as published by the Free Software Foundation unless otherwise stated.
EPrints 3.4 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with EPrints 3.4. If not, see http://www.gnu.org/licenses/.