EPrints 3.4.3

Revision as of 12:13, 6 April 2021 by Drn@ecs.soton.ac.uk (talk | contribs) (Added info about cgi and legacy settings)

Provisional Release Notes

EPrints 3.4.3 release candidate 1 is now available on GitHub. Full release still planned for end of March 2021.

  • Zero codename: Snickerdoodle Sandstorm
  • Publications flavour codename: Banana Bread Rainbow


New Dependencies

None. Check earlier dependencies for EPrints 3.4.2 and before.


Changes Since 3.4.2

New Functionality

  • Provides function (EPrints::Utils::compare_version) for comparing EPrints software versions so plugins can choose to behave differently.
  • Adds a jquery EPrints 3.4 ingredient to allow jQuery resources to be incorporated if required by non-core EPrints functionality. The ingredient is added but commented out in flavours/pub_lib/inc.
  • Provides picker for date fields to reduce potential for human error.


Security and Privacy Improvements

  • Makes JSON export respect value set for export_as_xml to avoid exporting unintended fields.
  • Sets contact_email to not export by default for better GDPR compliance.
  • Rate limits failed local login attempts for a particular user. Defaults to 10 failed login attempts in 10 minutes (using $c->{max_login_attempts} and $c->{lockout_minutes} configuration options).
  • Rate limits new account requests. Defaults to 100 requests in an hour-long period to avoid causing problems with planned mass sign-up events. Configured using the $c->{max_account_requests} and $c->{max_account_requests_minutes} options.
  • Limits the maximum length of a password (default 200 characters) to prevent specific Denial-of-Service attacks.
  • Modifies LaTeX invocation to make it more secure and ensure it writes to the correct output directory.
  • Removes legacy /cgi/latex2png script to prevent Remote Code Execution (RCE) from CVE-2021-3342.
  • Validates parameters passed to /cgi/cal script to protect against RCE and Cross Site Scripting (XSS) from CVE-2021-26475 and CVE-2021-26476.
  • Validates dataset parameter passed to /cgi/dataset_dictionary to protect against XSS from CVE-2021-26702.
  • Validates parameters passed to /cgi/history_search to protect against the possibility of XSS and MySQL injection vulnerabilities, although none currently exploitable.
  • Validates verb passed to /cgi/toolbox/toolbox to protect against RCE from CVE-2021-26704.
  • Allows EPrints::XML::parse_string to temporarily modify parser configuration to disable expanding of XML entities by the /cgi/ajax/phrase script to protect against CVE-2021-26703.
  • Ensures the eprints_lang cookie explicitly sets the SameSite attribute to Lax if HTTPS is not enabled, or sets the secure attribute to 1 if it is.
  • Doubles (to 16) number of characters used for randomly generated password suggestions (e.g. for database user created by epadmin).
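
The rate-limiting options named in the list above could be set in an archive's configuration along the following lines. This is a sketch only: the option names and default values are those quoted above, while the file name is a hypothetical choice (any file under the archive's cfg/cfg.d/ directory would be the usual place for such settings), and the 60-minute value is inferred from the stated "hour-long period".

```perl
# cfg/cfg.d/zz_rate_limits.pl  (hypothetical file name)

# Lock out local logins for a user after 10 failed attempts within
# 10 minutes (the defaults stated in the release notes).
$c->{max_login_attempts} = 10;
$c->{lockout_minutes} = 10;

# Allow up to 100 new account requests per 60 minutes (the stated default).
# A planned mass sign-up event might justify temporarily raising the
# first value.
$c->{max_account_requests} = 100;
$c->{max_account_requests_minutes} = 60;
```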


General Improvements

  • Resolves all accessibility errors and most alerts (as reported by the WAVE Web Accessibility Evaluation Tool) for backend admin pages as listed in the Accessibility report.
  • Appropriately uses protocol-relative, preferred (i.e. https where available) or absolute path URLs instead of http URLs to avoid mixed content warnings and other similar issues.
  • Uses an HTML class for the Longtext_counter counter line and provides CSS that, by default, makes this red and bold if the maxwords limit is exceeded.
  • Allows full path or just subject name to be displayed for values set in subjects field by setting render_path attribute (true by default).
  • Allows time between password reset requests to be configured (to a number of hours) rather than hard-coded to 24 hours.
  • Adds option to set status of EPrints that should be checked by check_xapian script.
  • Adds date_embargo_retained field to Documents to retain the embargo date after the embargo is lifted.
  • Adds deprecation warning to indicate only LibXML library will be supported in future versions of EPrints. (Version for removal yet to be confirmed).
  • Adds basic default citation for files.
  • Adds validation for date fields to prevent invalid dates being set.
  • Allows files to be stored to disk with generic filenames (i.e. <fileid>.bin), rather than the upload filename, which can cause retrieval problems, by setting the $c->{generic_filenames} configuration option.
  • Adds code to the lib and pub_lib versions of security.pl to deal with a request a copy access issue if the repository has the coversheets plugin enabled.
  • Allows sub-fields to have help texts configured.
  • Generates log message for views that have reached their max_items limit.
  • Allows document fields displayed in an eprint's Details tab to be configured.
  • Adds data-context attributes to aid CSS styling of HTML div elements within search forms.
  • Improves scaling of Lightbox popups on narrow screens.
  • Allows a pin to set the id or class attribute for a template HTML element to facilitate different CSS styling on different pages.
  • Prompts for organisation name when creating a new archive using epadmin create.
  • Allows classes to be assigned to parts (i.e. top, left, main, after, etc.) of abstract/summary pages to facilitate better CSS styling.
  • Allows search results forms to be configured to automatically reload on ordering change.
  • Allows classes for EPrints' user menu (i.e. by default ep_tm_key_tools) to be configured to better integrate with institutional branding.
  • Adds commented advice on fixing issues with PDF thumbnail generation using ImageMagick.
  • Tidies up citations by removing identical flavours/pub_lib/ citations that appear in lib/.
  • Allows max_files that can be expanded from an uploaded zip file to be configured. (Previously hardcoded to 100 file limit).
  • Allows customisable overriding of type-mapping based on ispublished value for BibTeX and RIS export. (I.e. not using @unpublished type if ispublished set to unpub).
  • Provides has_role EPC function to allow more granular control of workflows, so that they can be edited only by a specific group of users who have this role added to their profile.
  • Allows positioning of hovering document preview to be set to left or right and implements this in summary_page and result eprint citations to improve preview positioning.
  • Improves verbosity of get_input and get_input_hidden to explain better why input is invalid, assisting epadmin data entry.
  • Makes id_number (i.e. for DOIs, etc.) field default type id rather than text to facilitate search for exact IDs, especially those containing non-alphanumeric characters (e.g. 10.1016/j.ejor.2018.02.047).
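
As an illustration of the generic filenames option mentioned in the list above, a minimal configuration sketch follows. The file name is hypothetical, and it is an assumption that a true value (1) is what enables the behaviour; only the option name itself comes from the release notes.

```perl
# cfg/cfg.d/zz_generic_filenames.pl  (hypothetical file name)

# Assumption: setting this to a true value makes EPrints store uploaded
# files on disk as <fileid>.bin instead of using the upload filename.
$c->{generic_filenames} = 1;
```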

Bug Fixes

  • Prevents access code from changing if request a copy is approved multiple times (and warns if request has already been approved).
  • Fixes formatting of results from user search introduced by accessibility changes in 3.4.2.
  • Updates issue citation to use <div> rather than and tags so it works with accessibility changes introduced in 3.4.2.
  • Deals more gracefully if History DataObj's parent does not exist when checking if it is an EPrints::List.
  • Fixes internal server error when trying to save values to a non-multiple compound field.
  • Fixes issue with XML import setting unset compound subfields to NULL rather than empty string.
  • Fixes errors on import pages caused by Accessibility improvements and fixes other Accessibility issues for import pages.
  • Fixes broken link of default 401 error page.
  • Sets STDOUT and STDERR binmode to utf8 to avoid wide character errors.
  • Fixes user history layout as a result of earlier Accessibility improvements.
  • Fixes missing phrases reported in error logs. Typically fieldhelp phrases for sub-fields of compound fields, which are generally not rendered but still report missing phrase warnings.
  • Sets SameSite property to None for secure login and request copy cookies as Google Chrome is now quite strict about this.
  • Provides handling for XML parsing of revision files to show a warning for the particular revision in the history tab rather than an internal server error for the whole page.
  • Allows use of same pin multiple times within a phrase.
  • Prevents undefined $c->{aliases} from causing epadmin config_core to fail.
  • Ensures saved searches indicate email notifications cannot be sent when the Xapianv2 (in addition to Xapian) plugin has been used for the original search.
  • Adds missing event_queue phrases.
  • Fixes inappropriate mapping of monograph_type in BibTeX import.
  • Fixes erroneous post-login redirection when login required to access eprint summary/abstract pages and long URLs (e.g. /id/eprint/1234) enabled.
  • Ensures readonly attribute works for MetaField::Set in compound-multiple fields.
  • Adds missing reason phrase for RejectWithEmail screen plugin.
  • Prevents HEAD requests (e.g. email link checkers) for password resets from actioning the reset. (Otherwise, when the user clicks the link it will say the password has already been reset, which may concern the user).
  • Forces full day-month-year to be set for document embargo expiry to avoid any ambiguity.
  • Improves epadmin by removing unusable functionality and reporting if database tables do not get created as expected.
  • Adds symlinks between brief and default citations for various data objects to prevent log warnings.
  • Better handles DOI not being set when parsing it.
  • Provides more useful logging when field has invalid parameters.
  • Fixes missing pin for error message phrase when restarting indexer.
  • Stops MetaField::Listing from logging an error that no datasetid is defined when on its index page (listing datasets).
  • Updates positions for plugins that appear in key tools menu to ensure deterministic ordering.
  • Fixes bug to allow REST calls for subjects with IDs that contain full stops (.).
  • Better handles finding parent eprint when this has been deleted.
  • Fixes tools/map_config.pl to use correct directory for EPrints 3.4 and allow flavour to be specified as a parameter.
  • Fixes tools/update_phrase/file to ensure EPrints libraries can be used and allows specified base path, current working directory or full path to be used for phrase file parameters.
  • Fixes BibTeX import using uploaded files broken by update to third-party BibTeX::Parser in EPrints 3.4.2.

Default Configuration Changes: URLs and Paths

EPrints has various configuration settings for URLs and paths that are automatically configured based on the settings in an archive's cfg/cfg.d/10_core.pl. Over time there has been a move from HTTP-only to HTTP and HTTPS, and further towards HTTPS-only (with HTTP redirects) EPrints repositories. EPrints configuration has therefore had to evolve to support all these use cases, so those running EPrints can (continue to) run the HTTP/HTTPS setup that works for them. (The recommendation, though, is to use the most secure configuration available.)

Normally, the easiest solution to this problem would be to avoid full URLs that include the hostname and use just the path (e.g. /view/year rather than http://example.eprints.org/view/year). However, EPrints needs to support exports and embedding extensively, which means that having the hostname in the URL is often essential.

Prior to EPrints 3.4.3, there have been five main configuration settings for the primary part of the URL:

http_url
The HTTP version of the URL (e.g. http://example.eprints.org). Since EPrints 3.4.1, this will be set to the same as https_url if $c->{host} is undefined in the archive's cfg/cfg.d/10_core.pl.
https_url
The HTTPS version of the URL (e.g. https://example.eprints.org). If $c->{securehost} is not defined in the archive's cfg/cfg.d/ directory (typically in 10_core.pl or https.pl), then the URL will be set to HTTP (e.g. http://example.eprints.org).
http_root
The HTTP version of the root path. Typically this will be undefined unless EPrints needs to run under its own path (e.g. http://example.eprints.org/eprints will need the http_root set to /eprints).
https_root
The HTTPS version of the root path. Typically this will be undefined unless EPrints needs to run under its own path (e.g. https://example.eprints.org/eprints will need the https_root set to /eprints).
rel_path
This is the same as the http_root or https_root dependent on whether the current page is HTTP or HTTPS. If these are not defined, this will be set to an empty string.
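
To make these descriptions concrete, the fragment below sketches an HTTPS-only setup. It assumes the root paths are set through $c->{http_root} and $c->{https_root} in the same way as $c->{host} and $c->{securehost}; the values in the trailing comments are simply those implied by the descriptions above for the example hostname, not output captured from a running repository.

```perl
# cfg/cfg.d/10_core.pl  (fragment; sketch for an HTTPS-only repository)

# Only the secure hostname is defined; $c->{host} is deliberately
# left undefined.
$c->{securehost} = 'example.eprints.org';

# Root paths stay undefined unless EPrints runs under its own path,
# e.g. '/eprints' for https://example.eprints.org/eprints.
$c->{http_root} = undef;
$c->{https_root} = undef;

# Derived settings implied by the descriptions above:
#   https_url : https://example.eprints.org
#   http_url  : https://example.eprints.org  (matches https_url, as host
#               is undefined; behaviour since EPrints 3.4.1)
#   rel_path  : ''  (empty string, as neither root path is defined)
```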

In EPrints 3.4.3, two additional URL configuration settings have been added to help make repositories more secure and to reduce issues such as pages reporting mixed content warnings or forms popping up a security warning when they are submitted.

pr_url
This is a protocol-relative URL (e.g. //example.eprints.org). If the current page is HTTP, a link will take the user to the HTTP page. Similarly if the current page is HTTPS the link will take the user to the HTTPS page. This is useful to prevent users switching between HTTP and HTTPS, which they might find concerning. However, this is not suitable for some export formats where the text for the URL can be seen, as the format of the URL is likely to be unfamiliar to users. It may also not be supported by some third-party applications that ingest EPrints exports.
preferred_url
This is the preferred URL (i.e. the most secure available URL). In essence this works the same as https_url but avoids any confusion as to whether the URL will still be set to HTTPS if HTTPS is not enabled.

Both of these new URL configuration settings have sister cgi configuration settings, i.e. pr_cgiurl and preferred_cgiurl. Like the existing http_cgiurl, https_cgiurl, http_cgipath, https_cgipath and rel_cgipath, they are the same as their non-cgi siblings but with a /cgi suffix (e.g. //example.eprints.org becomes //example.eprints.org/cgi).
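
As a rough sketch of how a local script might choose between the new settings, the example below reads them through the repository's config method. It assumes an EPrints installation under /opt/eprints3 and a hypothetical archive ID of example, so it will not run without EPrints; the paths appended to each URL are purely illustrative.

```perl
use strict;
use warnings;

# Assumption: EPrints is installed under /opt/eprints3.
use lib '/opt/eprints3/perl_lib';
use EPrints;

# 'example' is a hypothetical archive ID.
my $repo = EPrints->new->repository( 'example' )
	or die "Could not load repository\n";

# Link rendered inside a repository web page: a protocol-relative URL
# follows whichever protocol the current page was loaded with.
my $page_link = $repo->config( 'pr_url' ) . '/view/year';

# Link written into an export where the URL text is visible: use the
# full, most secure available URL instead.
my $export_link = $repo->config( 'preferred_url' ) . '/view/year';

# The cgi siblings carry the /cgi suffix already.
my $cgi_link = $repo->config( 'pr_cgiurl' ) . '/users/home';

print "$page_link\n$export_link\n$cgi_link\n";
```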

EPrints still has a number of legacy configuration settings that relate to URLs and paths. Some small changes have been made to these to ensure they have the most appropriate setting:

urlpath
This is equivalent to http_root and this has not changed for EPrints 3.4.3.
base_url
This was set to http_url with a trailing '/' but has been changed to pr_url also with a trailing '/' for EPrints 3.4.3.
perl_url
Similarly, this was set to http_cgiurl with a trailing '/' but has been changed to pr_cgiurl also with a trailing '/' for EPrints 3.4.3.
frontpage
This was set to http_url with a trailing '/'. As this URL text often appears rendered within a web page, it has been changed to preferred_url, also with a trailing '/', for EPrints 3.4.3.
userhome
This has remained unchanged as http_cgiroot trailed by '/users/home' is not affected by the HTTPS setup for the repository.