New Features in EPrints 3.2

Revision as of 20:05, 5 February 2010

This page now describes actual features rather than planned ones.

The Planned Features revision (http://wiki.eprints.org/index.php?title=New_Features_in_EPrints_3.2&oldid=6394) may help improve the quality of the descriptions on this page.

NOTE

  • We now recommend LibXML in preference to GDOME. It's less buggy, and easier to install.
  • Upgrading may take several hours, as it cleans up Unicode issues in the database.

Database

  • In addition to MySQL, EPrints 3.2 now supports Postgres and Oracle

API

  • This release features a formal API. Not all functionality is available via the API yet; more will be added slowly and carefully in future releases.
  • The bugbear of EPrints internals, EPrints::Session, has been merged into EPrints::Repository. All old code will still work.

Documents

  • Thumbnails are now documents in their own right
  • Built-in document-format icons, as well as those you configure yourself

Deposit Interface

  • Edit locking locks records, reducing the risk of two people editing a record at the same time.
  • Option to extract metadata and images from OpenXML files (.docx and .pptx)
  • Offers options to users and editors on the deposit screen if there are problems
  • Document upload screen has been redesigned to be clearer.
  • Split document uploading into adding a new document and editing existing documents
  • The documents inside an EPrint may now be re-ordered
  • Progress bar on file upload
  • Document upload methods (file, url, zip etc.) are now plugin-based and can be extended
  • When attempting to deposit an eprint that has problems, a Save button is shown
  • Action buttons can optionally be shown at both the top and bottom of the workflow
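
The idea behind edit locking can be sketched in a few lines. The following Python is an illustration only, not EPrints code (EPrints itself is written in Perl): the class name `EditLocks` and its methods are hypothetical, and the 600-second default simply mirrors the `timeout` value shown in the configuration options later on this page.

```python
import time
from typing import Dict, Optional, Tuple


class EditLocks:
    """In-memory edit locks that expire after `timeout` seconds (illustrative only)."""

    def __init__(self, timeout: float = 600.0) -> None:
        self.timeout = timeout
        # record id -> (user holding the lock, time at which the lock expires)
        self._locks: Dict[int, Tuple[str, float]] = {}

    def acquire(self, record_id: int, user: str, now: Optional[float] = None) -> bool:
        """Try to lock a record for `user`; return False if another user holds a live lock."""
        now = time.time() if now is None else now
        holder = self._locks.get(record_id)
        if holder is not None and holder[0] != user and holder[1] > now:
            return False
        # Either the record is free, the lock has expired, or the same user is refreshing it.
        self._locks[record_id] = (user, now + self.timeout)
        return True
```

A second editor is refused while the first editor's lock is live and can take over once it has expired. A real implementation would also need to store the lock persistently so that it survives across requests.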

Search & Indexing

  • The search library has been entirely re-written to reduce use of cache tables and to improve performance. Simple searches are now over ten times faster.
  • The indexer now uses plugins, so you can schedule other tasks, like thumbnail conversion, to be done in the background.
  • Added config option "cache_max" to limit the cachemap tables used

Unicode

  • EPrints' use of Unicode has been significantly improved.

REST

  • A "REST" style interface to objects, via /rest/eprint/23/title.txt, for example. This can also support "PUT" to alter fields!
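
To make the shape of such a path concrete, here is a small Python sketch that splits it into its parts. It assumes the layout /rest/<dataset>/<id>/<field>.<format> inferred from the single example above; the function and type names are hypothetical and are not part of EPrints.

```python
from typing import NamedTuple


class RestPath(NamedTuple):
    dataset: str
    object_id: int
    field: str
    fmt: str


def parse_rest_path(path: str) -> RestPath:
    """Split '/rest/<dataset>/<id>/<field...>.<fmt>' into its components."""
    parts = path.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "rest":
        raise ValueError(f"not a REST-style path: {path!r}")
    dataset, raw_id, *field_parts = parts[1:]
    # The format is taken from the suffix of the last path segment.
    last, sep, fmt = field_parts[-1].rpartition(".")
    if not sep:
        raise ValueError(f"missing format suffix: {path!r}")
    field_parts[-1] = last
    return RestPath(dataset, int(raw_id), "/".join(field_parts), fmt)
```

For the example path, this yields the dataset `eprint`, the id `23`, the field `title` and the format `txt`.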

WebDAV

  •  ???

SWORD2

  •  ???

Semantic Web Support

  • RDF+XML Format
  • N3 Format
  • URIs for all objects, including non-dataobjs, e.g. Authors, Events, Locations.
  • BIBO Ontology
  • Extendable
  • URIs now use content negotiation to decide which export plugin to redirect to, based on the MIME types supplied by plugins and the "Accept" header.
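
Content negotiation of this kind can be sketched as follows. This Python is a simplified illustration, not the EPrints implementation: the plugin table is hypothetical, and wildcards such as `*/*` are deliberately not handled.

```python
from typing import Dict, List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (mime_type, q) pairs from an Accept header, most preferred first."""
    prefs = []
    for item in header.split(","):
        mime, *params = [p.strip() for p in item.split(";")]
        if not mime:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((mime.lower(), q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_plugin(header: str, plugins: Dict[str, str]) -> Optional[str]:
    """Pick the plugin whose MIME type the client prefers most, or None."""
    for mime, q in parse_accept(header):
        if q > 0 and mime in plugins:
            return plugins[mime]
    return None


# Hypothetical mapping from MIME type to export plugin name.
PLUGINS = {"application/rdf+xml": "RDFXML", "text/n3": "N3"}
```

With this table, a client sending `Accept: text/html;q=0.5, text/n3;q=0.9, application/rdf+xml;q=0.8` would be redirected to the N3 export.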

Storage Layer

  • Now uses plugins to store files
    • Local Filesystem
    • Amazon S3
    • Sun Cloud Storage

Speed

  • Search & Indexing much faster
  • Import is faster
  • Other parts of the code have been audited for speed, and optimised.

Import

  • Modified Import UI to allow a per-plugin/single/bulk workflow

EPC & EPrints Script

  • New EPC tag type: epc:debug, which is like print but sends the XML to STDERR for debugging purposes.
  • New EPScript methods: citation_link, dataset, related_objects, url, doc_size, is_public, thumbnail_url, preview_link, icon, human_filesize, control_url, contact_email, property, substr, filter_compound_list, to_data_array, pretty_list, array_concat, action_list, action_button, action_title, action_description, action_icon
  • Improvements to the epc:foreach processing (better handling of multiple object types in lists)
  • New Script method: property() which takes a string and returns a property from a hash or dataobj.
  • New EPrints Script datatype: DATA_ARRAY: Represents a list of tuples of [$value, $epscript_type]

OAI

  • The OAI interface is now stateless, so there is no timing out
  • Support for multiple constraints in custom OAI sets
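
One common way to make a paged interface stateless is to carry the paging state inside the token handed back to the client, so the server has nothing to store and nothing to expire. The Python below sketches that general technique only; it is an assumption for illustration and does not describe the token format EPrints actually uses.

```python
import base64
import json
from typing import Any, Dict


def encode_token(state: Dict[str, Any]) -> str:
    """Pack the paging state into an opaque, URL-safe token."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token: str) -> Dict[str, Any]:
    """Recover the paging state from a token produced by encode_token()."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
```

Because everything needed to continue the listing travels with the request, a harvester can resume at any time without relying on server-side session data.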

Unit Tests

  • We have introduced unit-tests to improve both the short and long term quality of our code.

Metadata Types

  • Counter (incrementing value)
  • Timestamp (defaults to the current time)
  • UUID
  • MetaField::Search now has two properties:
    • "namedfields", which is an array ref of field names to search, OR
    • "namedfields_config", which is the name of a config variable

  • MetaField::Search can now be used in any workflow (not hard-coded to editpermfields)
  • A captcha pseudo-field based on http://recaptcha.net/
  • Added "repeat_secret" property to secret fields, which renders a confirmation box that is checked with validate()
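
The three new value-generating types (Counter, Timestamp, UUID) each supply a default when a record is created. The Python sketch below shows what such defaults could look like; the function names, the timestamp format and the choice of a random (version 4) UUID are all assumptions made for illustration.

```python
import itertools
import uuid
from datetime import datetime, timezone

# A process-wide counter; a repository would persist this in the database instead.
_counter = itertools.count(1)


def default_counter() -> int:
    """Return the next value of an incrementing counter."""
    return next(_counter)


def default_timestamp() -> str:
    """Return the current UTC time as 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def default_uuid() -> str:
    """Return a freshly generated random UUID as a string."""
    return str(uuid.uuid4())
```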

Administration Interface

  • Converted Admin screen into several tabs.
  • Improved the BatchEdit interface
  • Show a progress bar while records are updated (batch edit?)

Editorial Interface

  • Improved "Review" Screen
  • The "Review buffer" can now be filtered for better management of large review buffers.
  • For editors, provide the "Move to Review" button if there are problems

User Defined Datasets

  • Allows 3rd party tools to create their own additional datasets
  • Suite of interface screens to work with these new datasets

Command Line Tools

  • Allow eprint ids to be specified for redo_thumbnails

Export

  • Added support for JSONP
  • Added support for an 'n' argument to search exports
  • Added arguments to export plugins. They are passed as CGI arguments on abstract search, or via the --arg option that has been added to bin/export.
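
JSONP wraps a JSON document in a call to a function named by the client, so that the response can be loaded through a script tag. The Python sketch below illustrates the technique in general terms; the function name and the callback validation rule are assumptions and are not taken from the EPrints export plugin.

```python
import json
import re
from typing import Any

# Accept only plain (optionally dotted) JavaScript identifiers as callback names,
# so that a caller cannot inject arbitrary script through the callback parameter.
_CALLBACK_RE = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$")


def to_jsonp(data: Any, callback: str) -> str:
    """Serialise `data` as JSON and wrap it in a call to `callback`."""
    if not _CALLBACK_RE.match(callback):
        raise ValueError(f"unsafe callback name: {callback!r}")
    return f"{callback}({json.dumps(data)});"
```

Validating the callback name matters because the value comes straight from the request.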

Abstract Page

  • Now generated with a citation
  • Shows an "action list", so plugins can register to appear on this page


Phrases

  • Primary method of editing phrases is now the web interface
  • Added "ref" option to phrases, which causes the referenced phrase to be used instead (equivalent to calling the referenced phrase directly)
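
The behaviour of "ref" can be pictured as following a chain of references until a phrase with real text is reached. The Python below is a sketch under assumed data structures (a dictionary of phrase entries holding either "text" or "ref"); the phrase ids are made up, and the cycle check is an addition for safety rather than something this page describes.

```python
from typing import Dict


def resolve_phrase(phrases: Dict[str, Dict[str, str]], phrase_id: str) -> str:
    """Return the text for `phrase_id`, following any "ref" entries."""
    seen = set()
    current = phrase_id
    while True:
        if current in seen:
            raise ValueError(f"circular phrase reference at {current!r}")
        seen.add(current)
        entry = phrases[current]
        if "ref" not in entry:
            return entry["text"]
        current = entry["ref"]


# Hypothetical phrase table: the second phrase simply reuses the first.
PHRASES = {
    "page:title": {"text": "Welcome"},
    "page:heading": {"ref": "page:title"},
}
```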

Views

  • Entire rendering of item lists and menus can be overridden by a function

Misc. Changes

  • Can now disable a repository through a system configuration setting
  • Refactored DataObj::get_defaults so that you can now specify default values through a "default_value" property
  • Most of get_defaults() can now be specified through the metafield spec.
  • Can now apply multiple changes to the same field (???? I assume this means metafield?)
  • Preference field for users (to store k/v pairs in)
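
The effect of a "default_value" property can be sketched as collecting defaults straight from the field specifications instead of from hand-written code. The Python below is illustrative only: the field specs are hypothetical, and the real get_defaults is a Perl method whose details are not given on this page.

```python
from typing import Any, Dict, List


def collect_defaults(fields: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a {field name: default} mapping from specs that declare a default_value."""
    return {
        spec["name"]: spec["default_value"]
        for spec in fields
        if "default_value" in spec
    }


# Hypothetical field specifications.
FIELDS = [
    {"name": "title", "type": "text"},
    {"name": "full_text_status", "type": "set", "default_value": "none"},
    {"name": "rev_number", "type": "int", "default_value": 1},
]
```

Fields without a declared default are simply left out of the result.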

Key Bugfixes

  • Fixed login page not using phrase for title
  • Fixed spurious history objects being created on document upload
  • Fixed an HTML insertion bug in the <title> element [Brian D. Gregg]
  • Fixed schema errors in uketd_dc and METS/MODS export plugins
  • Fixed bug in Compound creation of Set types that squashed the set options
  • Fixed the order in which static directories are searched: repository -> theme -> system
  • Support long values in browse views by using the MD5 of the value
  • Subject inputform component can now be used with singular values
  • Fixed bug where the is_advertised property on export plugins was being ignored.
  • Fixed bug in indexer which meant it didn't index in a round-robin fashion.
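
The MD5 fix for long browse-view values relies on a hash having a fixed length however long the input is. The Python below sketches the idea; the function name and the 64-character threshold are assumptions, since this page says only that the MD5 of the value is used for long values.

```python
import hashlib


def view_file_id(value: str, max_len: int = 64) -> str:
    """Return `value` itself if short enough, otherwise its 32-character MD5 hex digest."""
    if len(value) <= max_len:
        return value
    return hashlib.md5(value.encode("utf-8")).hexdigest()
```

Here MD5 is used only to obtain a short, stable identifier, not for security.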

New and altered Config options

  • Breaks up SystemSettings into logically named files
  • Can now disable a repository through a system configuration setting
  • Moved most of eprint_render.pl into a citation file: summary_page.xml
  • Updated the default views.pl to show the current config.
  • Improved document_upload.pl layout to make it easier to add/remove suffix-to-MIME-type mappings.
  • Added URI to EPrint Summary Page
  • Added RDF+XML and N3 Document formats
  • New metafield option: $defaults{render_max_search_values} = 5;
  • Added "show_help" option to workflow component to disable collapsing
    • Usage: show_help="no"
    • show_help={always,toggle,never}
  • Added config option "cache_max" to limit the cachemap tables used
  • User-defined datasets
  • Action buttons can optionally be shown at both the top and bottom of the workflow
  • Edit locking settings:
    • $c->{locking}->{eprint}->{enable} = 1;
    • $c->{locking}->{eprint}->{timeout} = 600;
  • REST privileges
  • Check registration email callback
  • epc:debug
  • Many new EPrints Script functions
  • views.pl
    • "DEFAULT;render_fn=render_view_items_3col_boxes",
    • render_menu => "render_view_menu_3col_boxes"
    • Ranges & variations were introduced in 3.1.? but still need documenting.