EPrints 3.4.3
Release Notes
Contents
- 1 Release Notes
- 1.1 New Dependencies
- 1.2 Changes Since 3.4.2
- 1.3 Known Issues
- 1.3.1 Multiple values not supported by EPrints::MetaField::Date validate function
- 1.3.2 Bespoke validations no longer supported for Date MetaField
- 1.3.3 Requesting /cgi/set_lang causes an internal server error
- 1.3.4 Non-multiple Subject field causes an internal server error
- 1.3.5 Multiple Subject field using default rendering shows selected options twice
- 1.3.6 Non-compulsory boolean radiogroup fields cannot deselect UNSPECIFIED
- 1.3.7 RIOXX2 plugin field validation leads to internal server error
- 1.3.8 EPrints::System's sanitise cannot be called by bin scripts
- 1.3.9 $c->{userhome} and $c->{urlpath} Auto-set Incorrectly
 
- 1.4 Configuration for URLs and Paths
 
- 2 Planned Development
Release Notes
EPrints 3.4.3 is now available on files.eprints.org and GitHub.
- Zero codename: Snickerdoodle Sandstorm
- Publications flavour codename: Banana Bread Rainbow (1.3)
New Dependencies
Dependencies can be installed as RPMs (yum install PACKAGE), DEBs (apt-get install PACKAGE) or CPAN (cpan MODULE). Perl's IO::String module is now needed to ensure BibTeX import from file still works with the latest version of BibTeX::Parser packaged with EPrints 3.4.3.
- Perl IO::String module
  - RPM: perl-IO-String
  - DEB: libio-string-perl
  - CPAN: IO::String
  - This dependency is incorporated into the EPrints 3.4.3 DEB and RPM packages.
Also, check earlier dependencies for EPrints 3.4.2 and before.
Changes Since 3.4.2
New Functionality
- Provides function (EPrints::Utils::compare_version) for comparing EPrints software versions so plug-ins can choose to behave differently.
- Adds jquery EPrints 3.4 ingredient to allow JQuery resources to be incorporated if required by non-core EPrints functionality. Ingredient added but commented out in flavours/pub_lib/inc.
- Provides picker for date fields to reduce potential for human error.
Security and Privacy Improvements
- Makes JSON export respect value set for export_as_xml to avoid exporting unintended fields.
- Sets contact_email to not export by default for better GDPR compliance.
- Rate limits failed local login attempts for a particular user. Defaults to 10 failed login attempts in 10 minutes (using $c->{max_login_attempts} and $c->{lockout_minutes} configuration options).
- Rate limits new account requests. Defaults to 100 requests in an hour long period to avoid causing problems with planned mass sign up events. Configured using $c->{max_account_requests} and $c->{max_account_requests_minutes} options.
- Limits the maximum length of a password (default 200 characters) to prevent specific Denial-of-Service attacks.
- Modifies latex invocation to make it more secure and ensure it writes to the correct output directory.
- Removes legacy /cgi/latex2png script to prevent Remote Code Execution (RCE) from CVE-2021-3342.
- Validates parameters passed to /cgi/cal script to protect against RCE and Cross Site Scripting (XSS) from CVE-2021-26475 and CVE-2021-26476.
- Validates dataset parameter passed to /cgi/dataset_dictionary to protect against XSS from CVE-2021-26702.
- Validates parameters passed to /cgi/history_search to protect against the possibility of XSS and MySQL injection vulnerabilities, although none are currently exploitable.
- Validates verb passed to /cgi/toolbox/toolbox to protect against RCE from CVE-2021-26704.
- Allows EPrints::XML::parse_string to temporarily modify parser configuration to disable expanding of XML entities by the /cgi/ajax/phrase script to protect against CVE-2021-26703.
- Ensures eprints_lang cookie explicitly sets SameSite attribute to Lax if HTTPS is not enabled or sets secure attribute to 1 if it is.
- Doubles (to 16) number of characters used for randomly generated password suggestions (e.g. for database user created by epadmin).
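As an illustration of the rate-limiting options named above, the fragment below sets them explicitly to the defaults stated in these notes. It is a sketch only: it assumes the settings are placed in a file under the archive's cfg/cfg.d/ directory (the file name is up to the administrator), and the values shown simply repeat the documented defaults rather than recommending new ones.

```perl
# Sketch of an archive configuration fragment (e.g. in cfg/cfg.d/).
# Values repeat the defaults described in these release notes.

# Failed local login attempts allowed for a user within the lockout window.
$c->{max_login_attempts} = 10;
$c->{lockout_minutes} = 10;

# New account requests allowed within the given number of minutes
# (100 requests in an hour-long period).
$c->{max_account_requests} = 100;
$c->{max_account_requests_minutes} = 60;
```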
General Improvements
- Resolves all accessibility errors and most alerts (as reported by the WAVE Web Accessibility Evaluation Tool) for backend admin pages as listed in the Accessibility report.
- Tidies up the use of configuration URLs and Paths
- Uses HTML class for Longtext_counter counter line and provides CSS that, by default, makes this red and bold if the maxwords limit is exceeded.
- Allows full path or just subject name to be displayed for values set in subjects field by setting render_path attribute (true by default).
- Allows time between password reset requests to be configured (to a number of hours) rather than hard-coded to 24 hours.
- Adds option to set status of EPrints that should be checked by check_xapian script.
- Adds date_embargo_retained field to documents to retain the embargo date after the embargo is lifted.
- Adds deprecation warning to indicate only LibXML library will be supported in future versions of EPrints. (Version for removal yet to be confirmed).
- Adds basic default citation for files.
- Adds validation for date fields to prevent invalid dates being set.
- Allows files to be stored to disk with generic filenames (i.e. <fileid>.bin) rather than the upload filename, which can cause retrieval problems, by setting the $c->{generic_filenames} configuration setting.
- Adds code to lib and pub_lib versions of security.pl to deal with request a copy access issue if repository has coversheets plug-in enabled.
- Allows sub-fields to have help texts configured.
- Generates log message for views that have reached their max_items limit.
- Allows document fields displayed in an eprint's details tab to be configured.
- Adds data-context attributes to aid CSS styling of HTML <div> elements within search forms.
- Improves scaling of Lightbox popups on narrow screens.
- Allows a pin to set the id or class attribute for a template HTML element to facilitate different CSS styling on different pages.
- Prompts for organisation name when creating a new archive using epadmin create.
- Allows classes to be assigned to parts (i.e. top, left, main, after, etc.) of abstract/summary pages to facilitate better CSS styling.
- Allows search results forms to be configured to automatically reload on ordering change.
- Allows classes for EPrints' user menu (i.e. by default ep_tm_key_tools) to be configured to better integrate with institutional branding.
- Adds commented advice on fixing issues with PDF thumbnail generation using ImageMagick.
- Tidies up citations by removing identical flavours/pub_lib/citations that appear in lib/.
- Allows max_files that can be expanded from an uploaded zip file to be configured. (Previously hardcoded to 100 file limit).
- Allows customisable overriding of type-mapping based on ispublished value for BibTeX and RIS export. (I.e. not using unpublished type if ispublished set to unpub).
- Provides has_role EPC function to allow more granular control of workflows that can be edited by a specific group of users who have this role added to their profile.
- Allows positioning of hovering document preview to be set to left or right and implements this in summary_page and result eprint citations to improve preview positioning.
- Improves verbosity of get_input and get_input_hidden to explain better why input is invalid, assisting epadmin data entry.
- Makes id_number (i.e. for DOIs, etc.) field default type id rather than text to facilitate search for exact IDs, especially those containing non-alphanumeric characters (e.g. 10.1016/j.ejor.2018.02.047).
- Restricts characters that can be used in new user-added subject IDs to alphanumeric characters, underscores and hyphens.
- Ensures subject-type field behaves the same in default (i.e. multiple select form element) as it does in subject tree format. I.e. subject tree mode lets previously selected subjects stay selected, even if the subject itself has been set as undepositable in the intervening period. Default mode silently unselects these.
- Switches to using real bazaar ingredient rather than bazaar_stub for zero flavour in EPrints::SystemSettings as the former is now available in core codebase.
- Prevents removing items that are or have once been in the live archive, without an extra permission: eprints/remove_once_archived. (Retiring items is still permitted without extra permission).
Bug Fixes
- Prevents access code from changing if request a copy is approved multiple times (and warns if request has already been approved).
- Fixes bug with logged in users not being able to access approved request a copy documents that they would not normally have access to.
- Fixes formatting of results from user search introduced by accessibility changes in 3.4.2.
- Updates issue citation to use <div> rather than and tags so it works with accessibility changes introduced in 3.4.2.
- Deals more gracefully if History DataObj's parent does not exist when checking if it is an EPrints::List.
- Fixes internal server error when trying to save values to a non-multiple compound field.
- Fixes issue with XML import setting unset compound subfields to NULL rather than empty string.
- Fixes errors on import pages caused by accessibility improvements and fixes other accessibility issues for import pages.
- Fixes broken link of default 401 error page.
- Sets STDOUT and STDERR binmode to utf8 to avoid wide character errors.
- Fixes user history layout as a result of earlier accessibility improvements.
- Fixes missing phrases reported in error logs. Typically fieldhelp phrases for sub-fields of compound fields, which are generally not rendered but still report missing phrase warnings.
- Sets SameSite property to None for secure login and request copy cookies as Google Chrome is now quite strict about this.
- Provides handling for XML parsing of revision files to show warning for particular revision in the history tab rather than an internal server error for the whole page.
- Allows use of same pin multiple times within a phrase.
- Prevents undefined $c->{aliases} from causing epadmin config_core to fail.
- Ensures saved searches indicate email notifications cannot be sent when the Xapianv2 (in addition to Xapian) plug-in has been used for the original search.
- Adds missing event_queue phrases.
- Fixes inappropriate mapping of monograph_type in BibTeX import.
- Fixes erroneous post-login redirection when login required to access eprint summary/abstract pages and long URLs (e.g. /id/eprint/1234) enabled.
- Ensures readonly attribute works for MetaField::Set in compound-multiple fields.
- Adds missing reason phrase for RejectWithEmail screen plug-in.
- Prevents HEAD requests (e.g. email link checkers) for password resets from actioning the reset. (Otherwise, when the user clicks the link it will say the password has already been reset, which may concern the user).
- Forces full day-month-year to be set for document embargo expiry to avoid any ambiguity.
- Improves epadmin by removing unusable functionality and reporting if database tables do not get created as expected.
- Adds symlinks between brief and default citations for various data objects to prevent log warnings.
- Better handles DOI not being set when parsing it.
- Provides more useful logging when field has invalid parameters.
- Fixes missing pin for error message phrase when restarting indexer.
- Prevents MetaField::Listing from logging an error that no datasetid is defined when on its index page (listing datasets).
- Updates positions for plug-ins that appear in key tools menu to ensure deterministic ordering.
- Fixes bug to allow REST calls for subjects with IDs that contain full stops (.).
- Better handles finding parent eprint when this has been deleted.
- Fixes tools/map_config.pl to use correct directory for EPrints 3.4 and allow flavour to be specified as a parameter.
- Fixes tools/update_phrase/file to ensure EPrints libraries can be used and allows specified base path, current working directory or full path to be used for phrase file parameters.
- Fixes BibTeX import using uploaded files broken by update to third-party BibTeX::Parser in EPrints 3.4.2.
- Fixes RDF export by adding doi namespace for when id_number field is set to a DOI and is then translated to an owl:sameAs in an RDF export.
- Fixes bug preventing exporting file data objects in various metadata formats (e.g. Atom, EP3XML, CSV, etc.).
Known Issues
Multiple values not supported by EPrints::MetaField::Date validate function
The validate function of EPrints::MetaField::Date does not support multiple values, as used by the Dates, Dates, Dates Bazaar plugin. This patch fixes this issue and also improves the warning message generated if an invalid date is entered.
Bespoke validations no longer supported for Date MetaField
The validate function of EPrints::MetaField::Date does not call the SUPER function where validation triggers for specific Date MetaFields are called. This patch can be used to reintroduce the SUPER function call and allow bespoke validations again.
Requesting /cgi/set_lang causes an internal server error
This is caused by an underlying error where there is no scheme method for the Apache2::RequestRec object. Instead, EPrints::Repository's get_secure method should be called and tested for whether the return value is the string "on". This patch can be used to fix this issue. However, the original intent for this change in 3.4.3 is to ensure EPrints' set_lang continues to work in future versions of Firefox and other browsers (see https://github.com/eprints/eprints3.4/issues/118).
Even with the above patch applied, EPrints' set_lang cookie may not be correctly defined to be supported in future web browsers. To make sure it is, the installed version of Perl's CGI::Cookie package (https://metacpan.org/pod/CGI::Cookie) must be at least version 4.31. This is the case for Ubuntu (18.04 and 20.04 LTS) and RHEL/CentOS 8, all of which (as of 2nd June 2021) install version 4.38. However, RHEL/CentOS 7 only installs version 3.63. This means you will need to install the latest version through CPAN (as the root user):
cpan CGI::Cookie
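To confirm whether the installed CGI::Cookie meets the 4.31 minimum mentioned above, a small check along the following lines can be run with the same Perl that EPrints uses. This is a sketch, not part of EPrints: the function name cgi_cookie_at_least is hypothetical, and it relies only on Perl's built-in require and the universal VERSION method, which dies when the loaded module is older than the version requested.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EPrints): returns 1 if CGI::Cookie is
# installed at $minimum or later, 0 if it is missing or older.
sub cgi_cookie_at_least {
    my( $minimum ) = @_;
    # require fails if the module is missing;
    # VERSION( $minimum ) dies if the installed version is too old.
    my $ok = eval { require CGI::Cookie; CGI::Cookie->VERSION( $minimum ); 1 };
    return $ok ? 1 : 0;
}

print cgi_cookie_at_least( '4.31' )
    ? "CGI::Cookie is recent enough\n"
    : "CGI::Cookie is missing or older than 4.31\n";
```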
Non-multiple Subject field causes an internal server error
This problem does not affect any fields in a vanilla installation of EPrints 3.4.3 but will affect any non-multiple Subject fields unless they use the Field::Subject typed component in the workflow. From the web browser the page will just show an internal server error but the underlying error message in the webserver logs will be something like:
[Fri Sep 10 10:54:00.439169 2021] [:error] [pid 12763] Can't use an undefined value as an ARRAY reference at /opt/eprints3/perl_lib/EPrints/MetaField/Subject.pm line 128.\n
This issue was caused by some recently introduced code that attempts to harmonise the behaviour of Subject fields independent of whether they use the Field::Subject workflow component. Specifically, any subject that has been made undepositable but which had already been selected for an item should still appear and remain selected. This prevents a user deselecting this option without realising, whilst editing the item for another reason. The divisions field is a good example, where an institution has changed its structure and does not want users to deposit against old divisions but wants to ensure these old divisions remain associated with historical items (see https://github.com/eprints/eprints3.4/issues/144 for more details).
This patch fixes this issue by checking to see whether the value returned by the get_value for the specific field, is already an ARRAY reference and if not making sure it is converted to one before the foreach code block that iterates over this value.
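The general idiom described above, normalising a value to an ARRAY reference before iterating, can be sketched as follows. This is not the patch itself: the helper name as_array_ref and the sample subject ID are hypothetical, and the real fix lives in EPrints::MetaField::Subject.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the idiom: always hand back an ARRAY
# reference, whether given undef, a single value or an existing ARRAY ref.
sub as_array_ref {
    my( $value ) = @_;
    return [] unless defined $value;            # unset field: nothing to iterate
    return $value if ref( $value ) eq 'ARRAY';  # multiple field: already a list
    return [ $value ];                          # non-multiple field: wrap the value
}

# A non-multiple field returns a plain scalar; wrapping it makes the
# dereference in the foreach safe. The subject ID is made up for the example.
foreach my $subject_id ( @{ as_array_ref( 'old_division' ) } ) {
    print "previously selected: $subject_id\n";
}
```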
Multiple Subject field using default rendering shows selected options twice
This problem does not affect the subjects stage of the workflow because the subjects field is rendered using the special Field::Subject workflow component. However, it will affect the divisions field in the core (Details) stage of the workflow. In and of itself this does not cause selected options to be saved to the database twice but it is liable to be concerning to users.
This issue can be resolved by the same patch that fixes the bug with non-multiple Subject fields. It makes sure that only previously selected items that are now undepositable are added to the array of selected options, as the ones that are still depositable will be added to this array by other means. In making this change, both undepositable and depositable options will only be selected once in the select input for this field.
Non-compulsory boolean radiogroup fields cannot deselect UNSPECIFIED
If a field is set to type boolean and then in the workflow it is not set as required, then the UNSPECIFIED option will be displayed and selected.  This is due to a typo in the code when enhancing accessibility for various MetaFields.  This patch fixes the typo so that the name of the input field is set correctly, whilst the id for the input field is set to a unique value.
RIOXX2 plugin field validation leads to internal server error
If you have the RIOXX2 plugin installed when upgrading, this can lead to an internal server error when trying to edit an eprint item. This is due to a combination of issues:
- The RIOXX2 plugin has its own MetaField type that has a validate method that calls the validate method of the underlying MetaField but without providing the data object as a parameter.
- When this underlying validate method calls the EP_TRIGGER_VALIDATE_FIELD trigger, this initiates the trigger method in lib/cfg.d/user_password_maxlength.pl. This trigger expects the data object to be set under the variable $user but it is not, and it therefore fails when trying to call the isa method on it, which is not possible when it is undefined.
Both the RIOXX2 plugin and lib/cfg.d/user_password_maxlength.pl should be patched: the former to avoid any other trigger methods from causing an internal server error in a similar way and the latter to work round the bug in the RIOXX2 field's validate method. This is the patch for the latter. Any future versions of the RIOXX2 plugin should hopefully have a patch to ensure the data object is sent as a parameter to the underlying field's validate method.
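The failure mode described above, calling isa on an undefined value, can be guarded against in plain Perl as sketched below. This is an illustration of the defensive check only, not the content of the patch: the helper name is_dataobj_of_class is hypothetical, and Scalar::Util's blessed is used because it safely returns undef for anything that is not an object.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Hypothetical helper: true only when $dataobj is a blessed object of
# (or derived from) $class. Safe to call with undef, unlike $dataobj->isa.
sub is_dataobj_of_class {
    my( $dataobj, $class ) = @_;
    return ( blessed( $dataobj ) && $dataobj->isa( $class ) ) ? 1 : 0;
}

# With no data object passed (the situation described above) the check
# simply returns false instead of dying.
my $user;
print "skipping user-specific validation\n"
    unless is_dataobj_of_class( $user, 'EPrints::DataObj::User' );
```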
EPrints::System's sanitise cannot be called by bin scripts
If you call the sanitise function of EPrints::System, this works fine under CGI because the EPrints::Repository object can be determined, but this is not currently possible if the function is called by a standalone bin script. This patch allows the function to be passed an optional EPrints::Repository object. The bin script in question may need to be modified so it passes this additional parameter.
$c->{userhome} and $c->{urlpath} Auto-set Incorrectly
As a result of changes to the configuration for URLs and paths, a bug was introduced which made $c->{userhome} and $c->{urlpath} full URLs rather than just paths. This has been identified in a GitHub issue. The resolution is either to apply this patch or to update your archive's cfg/cfg.d/10_core.pl to include:
$c->{urlpath} = "/";
$c->{userhome} = "/cgi/users/home";
Configuration for URLs and Paths
EPrints has various configuration settings for URLs and paths that are automatically configured based on the settings in an archive's cfg/cfg.d/10_core.pl.  The six most significant are the following:
- http_url
- The HTTP version of the URL (e.g. http://example.eprints.org). Since EPrints 3.4.1, this will be set to the same as https_url if $c->{host} is undefined in the archive's cfg/cfg.d/10_core.pl.
- https_url
- The HTTPS version of the URL (e.g. https://example.eprints.org). If $c->{securehost} is not defined in the archive's cfg/cfg.d/ directory (typically in 10_core.pl or https.pl) then the URL will be set to HTTP (e.g. http://example.eprints.org).
- base_url
- This is set to http_url or https_url depending on whether $c->{securehost} is defined.
- http_root
- The HTTP version of the root path. Typically this will be an empty string unless EPrints needs to run under its own path (e.g. http://example.eprints.org/eprints will need the http_root set to /eprints).
- https_root
- The HTTPS version of the root path. Typically this will be an empty string unless EPrints needs to run under its own path (e.g. https://example.eprints.org/eprints will need the https_root set to /eprints).
- rel_path
- Like base_url, this is the same as the http_root or https_root dependent on whether $c->{securehost} is defined.
Each of the six configuration settings above has a sister configuration setting for its CGI URL or path:
| Setting | Example | CGI Setting | CGI Example |
|---|---|---|---|
| http_url | http://example.eprints.org | http_cgiurl | http://example.eprints.org/cgi |
| https_url | https://example.eprints.org | https_cgiurl | https://example.eprints.org/cgi |
| base_url | https://example.eprints.org | perl_url | https://example.eprints.org/cgi |
| http_root | | http_cgiroot | /cgi |
| https_root | | https_cgiroot | /cgi |
| rel_path | | rel_cgipath | /cgi |
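As a rough illustration of the inputs from which the settings above are derived, the fragment below shows the kind of values an archive's cfg/cfg.d/10_core.pl might contain for a repository served under its own path. It is a sketch under stated assumptions: the hostname and the /eprints path are the examples used in the descriptions above, and whether http_root and https_root need setting explicitly depends on the individual installation.

```perl
# Sketch of settings in an archive's cfg/cfg.d/10_core.pl (example values only).

# Hostnames for the HTTP and HTTPS versions of the repository.
# Defining securehost is what makes base_url and rel_path follow
# the HTTPS settings.
$c->{host} = 'example.eprints.org';
$c->{securehost} = 'example.eprints.org';

# Root paths: empty strings unless EPrints runs under its own path,
# as in the https://example.eprints.org/eprints example.
$c->{http_root} = '/eprints';
$c->{https_root} = '/eprints';
```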
Over time there has been a move from EPrints repositories being HTTP-only to HTTP and HTTPS and further towards HTTPS only (with HTTP redirects). EPrints configuration has had to evolve over time to support all these use cases, so those running EPrints can (continue to) run the HTTP/HTTPS setup that works for them. (Although, the recommendation would be to use the most secure configuration available). However, this has led to inconsistent use of the twelve configuration settings for URLs and paths described above. Therefore, EPrints 3.4.3 has attempted to tidy up much of this.
The use of http_url, https_url, http_cgiurl and https_cgiurl is unnecessary when base_url and perl_url will, by default, consistently provide the most secure available option. (The only case where there may be an issue is for URIs representing eprint items, where changing the repository setup should not change permanent URIs. For this, the recently introduced uri_url configuration option can be used.) Using base_url and perl_url consistently also ensures that there is no switching between HTTPS and HTTP. In particular, this helps avoid problems with the web browser reporting mixed content warnings (e.g. HTTP images on HTTPS pages) and warnings when forms are submitted to an HTTP URL while the current page is HTTPS.
Using a full URL (with a protocol and hostname) is excessive in most cases and therefore rel_path and rel_cgipath should be the preferred options. However, there are a number of situations where this would not be appropriate, as the URL may be used outside the scope of the EPrints repository:
- In export formats for eprint items
- In citations for eprint items
- In generated email messages
- Links in the web page head to related alternate resources or services independent of the web page (i.e. to export formats, API services but not CSS, JavaScript, favicons, etc.).
For all these scenarios, not using a full URL might prevent the resource being accessed once the content has been exported, embedded, emailed or consumed by a user or third-party application.
The use of http_root, https_root, http_cgiroot and https_cgiroot is also unnecessary when rel_path and rel_cgipath are available, except when they are used to define other URL and path configuration settings or (with http_url) to generate Apache configuration (using bin/generate_apacheconf).
Summary of Changes
- All existing URL and path configuration settings have been retained to ensure existing plugins and bespoke code continue to function.
- Eliminated use of http_url (except in Apache configuration generation), https_url, http_cgiurl and https_cgiurl.
- Used base_url and perl_url for links related to user-generated content (i.e. eprint summary/abstract pages, document download URLs, document preview images/thumbnails, etc.), links defined in the head of a web page that are not directly related to the generation of the page (i.e. everything but CSS stylesheets, JavaScript and favicons) and links that are rendered as text on a web page.
- Used rel_path and rel_cgipath for all other purposes where it is unnecessary to use base_url or perl_url.
- Eliminated use of http_cgiroot and https_cgiroot and limited use of http_root and https_root to Apache configuration generation and in defining other configuration settings.
- Ensured rel_path and rel_cgipath are used for links in templates, static pages and other places where previously no URL/path configuration setting was used, to ensure these links work when EPrints is not running on the root path (i.e. rel_path is not an empty string).
Planned Development
See EPrints 3.4.4
