Miscellaneous Config Options
Revision as of 12:16, 25 July 2024 by Drn@ecs.soton.ac.uk (talk | contribs) (Added warning about legacy options)
Work in Progress
This page is a work in progress.
Only some config options that are not in files have been added. This list is not yet exhaustive.
This page describes configuration options that do not appear in any specific configuration file present in the main codebase (including the pub_lib flavour). A configuration option may be absent from the default configuration files because it is a legacy option that may become deprecated and removed. Some descriptions below state that an option is legacy, but others may still need updating to say so.
A
- access_logger_func - Defines a function that is called every time a document is downloaded or an abstract/summary page is viewed (i.e. an access record is created). This is used by recent versions of IRStats2 so that it can process stats from file-based access records rather than the database, which is a lot more efficient if the access database table is large.
- access_table_logger_disabled - Whether access records should still be saved to the database. (Without a setting, the database will continue to store access records).
- allow_duplicate_usernames - Whether the validation of user metadata should check for duplicate usernames. (Without a setting, user metadata validation will fail if there are duplicate usernames. This should only be set to 1 temporarily if username changes are required that may temporarily lead to duplicates).
- allow_uploaded_doc_js - A malevolent user could deliberately upload malicious JavaScript to perform a clickjacking or similar attack against logged-in users. Determines whether uploaded JavaScript is only returned with a Content Security Policy (CSP) that prevents it from being run in the client web browser. (Without a setting, the CSP that prevents JavaScript running on the client is enabled).
- auth_basic - Deprecated configuration for configuring basic rather than cookie-based user authentication.
B
- browse_views_max_items - Global setting for the maximum number of results that can be displayed on a browse view listing page. Superseded if a particular view has a max_items attribute set. (Without a setting this defaults to 2000 items).
C
- check_user_password - Allows a function to be defined to perform a bespoke check of whether the user's password is correct. It is passed the username and password and should return 0 if the password validation failed and 1 if it was successful. (Without a setting, standard local authentication of the user's password is used).
- custom_handlers - If you have a third-party application that needs to integrate with its own connector (e.g. Pure), then this allows you to define configuration for this.
D
- dbdriver - Which database driver (type of database, i.e. mysql, Pg or Oracle) to use. (Without a setting this defaults to mysql).
- dbschema - Which database schema to use. This is only applicable for Pg (PostgreSQL) databases.
- default_export_plugin - Which export plugin to pre-select on browse view and search results pages. (Without a setting, whichever export plugin appears first is pre-selected).
- deps - Defines dependencies of ingredients on other ingredients. By default this is not defined, as no ingredient packaged in an EPrints release has any dependencies.
- disable_basic_auth - Ensure authentication cannot accidentally fall back to basic (rather than cookie-based) authentication. (Without a setting it could fall back, but it is very unlikely to do this unless your EPrints repository has been significantly modified).
- disable_make_open_access - Remove the option on the Request Copy approval form to immediately make the document open access. (Without a setting the checkbox for this is present).
- does_user_own_eprint - Allows a function to augment which eprint records a user appears to own. This may be useful if a particular user is acting as a surrogate for another user. Should be defined in conjunction with get_users_owned_eprints.
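As a rough illustration of the scalar options in this section, the fragment below shows how they might be set. It assumes the usual EPrints convention of assigning to the $c configuration hash in a file under the archive's cfg/cfg.d/ directory; the schema name is purely illustrative and not taken from this page.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.

# Use PostgreSQL rather than the default mysql driver.
$c->{dbdriver} = 'Pg';

# Only applicable to Pg databases. 'public' is an illustrative schema name.
$c->{dbschema} = 'public';

# Prevent any accidental fallback to basic authentication.
$c->{disable_basic_auth} = 1;

# Hide the "make open access" checkbox on the Request Copy approval form.
$c->{disable_make_open_access} = 1;
```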
E
- email_blacklist - Array reference containing a list of email addresses that are not allowed to make requests for copies. Can be useful if some individual is making excessive requests, although ReCaptcha is probably more useful. (Without being defined, there is no restriction on the email addresses that can request copies).
- enable_file_imports - Whether local files (for the importer) can be uploaded to the repository as part of an XML (e.g. EP3XML, Atom, etc.) import. (Without a setting they cannot be uploaded).
- enable_import_fields - Specify fields for a data object that can be imported even though their field definition says they cannot.
- enable_web_imports - Whether web-based files can be uploaded to the repository as part of an XML (e.g. EP3XML, Atom, etc.) import. (Without a setting they cannot be uploaded).
- eprints_access_restrictions_callback - Allows a function to be defined to test if the current user has access to read/write (access View/Edit page) the specified eprint record. Useful if you want to give access to certain eprint records without changing a user's type.
- eprint_rss_media_doc - Allows a function to be defined to build a bespoke XML document fragment to encapsulate RSSv2 export. (Without a setting an empty XML document fragment is used).
- eprint_status_change - Allows a function to be defined that is called when the status (i.e. the eprint_status metadata field) has changed. This is a legacy function. The EP_TRIGGER_STATUS_CHANGE dataset trigger function should be used instead.
- expiry_for_doc_request - How many days an approved request copy link will last before access is revoked. (Without a setting this is 7 days).
- expiry_for_unresponded_doc_request - How many days an unapproved request can exist before it can no longer be approved. This prevents really old requests being approved without proper consideration, or confusing / annoying the requester months or years after their request. (Without a setting this is 90 days, approx. 3 months).
- export_fieldlists - What fields can be exported for a data object. Further restricts field definition settings for export_as_xml. (Without a setting, what fields can be exported is only defined by their field definitions).
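The request-copy and import options above could be combined along the following lines. This is a minimal sketch that assumes the usual $c configuration hash in a cfg/cfg.d/ file; the email address and the day counts are made-up example values.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.

# Block a (hypothetical) address from making requests for copies.
$c->{email_blacklist} = [ 'excessive.requester@example.org' ];

# Approved request copy links last 14 days instead of the default 7.
$c->{expiry_for_doc_request} = 14;

# Unapproved requests can no longer be approved after 30 days instead of the default 90.
$c->{expiry_for_unresponded_doc_request} = 30;

# Allow local files to be uploaded as part of an XML import (off without a setting).
$c->{enable_file_imports} = 1;
```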
F
- file_local_path_function - Allows a function to be defined to modify the location where the file represented by the file data object can be found on the local filesystem. Could be used if a bespoke new type of data object that has associated files needs to be added.
G
- generic_filenames - New (non-history) files will be added to the documents directory as <fileid>.bin rather than their actual filename. This can be useful if users are uploading files with names that use special characters that are not compatible with the filesystem or database encoding. EPrints already allows existing files to be manually renamed to <fileid>.bin, which provides a simple fix for existing files whose names use special characters and which consequently cannot be downloaded.
- get_current_user - Allows a bespoke function to be defined to determine who the current user is. (Without a setting the current user is determined from cookie or basic authentication attributes in the request).
- get_custom_view_header - Allows a function to be defined that adds custom HTML markup (XML DOM object) to a browse view page. By default this is after the navigation but before any listings / menus.
- get_custom_view_header_location - Allows content for get_custom_view_header (if defined) to be added either before the navigation (before_nav) or after the navigation (after_nav). (Without a setting it is added after the navigation. Only added since EPrints 3.4.6).
- get_login_url - Allows a bespoke function to be defined to generate the login URL for the repository archive. Useful for Shibboleth or other SAML-based login. (Without a setting /cgi/users/login is used).
- get_request_eprint_descriptor - Allows a bespoke function to be defined to generate the descriptor for an eprint. This may be useful if multiple eprints have the same title. (Without a setting the title of the eprint is used as the descriptor).
- get_users_owned_eprints - Allows a function to be defined that lists the set of eprints a user owns, instead of just those where their userid is set on the eprint record. Should be defined in conjunction with does_user_own_eprint.
H
- history_enable - Allows revision history to be recorded for rich data objects beyond just eprint data objects. This is a hash reference mapping each datasetid to whether history is enabled. Any defined value means enabled.
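As a hedged illustration of the hash reference described above, the fragment below enables history for the user dataset. It assumes the usual $c configuration hash in a cfg/cfg.d/ file, and the choice of the user dataset is only an example.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.
# Keys are dataset IDs; any defined value means history is enabled for that dataset.
$c->{history_enable} = {
    user => 1,    # example dataset; eprint history is recorded regardless
};
```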
I
- ignore_login_ip - Whether the IP address should be ignored in loginticket so a user can change IP address without being logged out. (Without a setting, the login IP address was not ignored up to 3.4.6; from 3.4.6 it is ignored by default).
- items_filters - What filters can be used on the EPrints::Plugin::Screen::Items (i.e. Manage Deposits) page. Useful if an additional eprint_status option has been added. (Without a setting this uses the four eprint_status options: inbox, buffer, archive, deletion).
- items_filters_order - What order filters should be displayed in on the EPrints::Plugin::Screen::Items (i.e. Manage Deposits) page. (Without a setting this uses the four eprint_status options in the following order: inbox, buffer, archive, deletion).
- import_xml_permitted_tags - When importing HTML-encoded text through an XML import, which HTML tags are allowed. Useful if a particular HTML tag that is commonly used is not included in the hardcoded default list, which is quite restrictive to avoid the use of certain HTML tags for malicious purposes. (By default the allowed HTML tags are: b big blockquote br code dd div dl dt em h1 h2 h3 h4 h5 h6 hr i li ol p pre s small span strike strong sub sup table tbody td th tr tt u ul).
J
K
L
- login_required_for_cgi
  - enable - Whether users need to be logged in to access /cgi/... pages (/cgi/users/... pages always require being logged in). (Without a setting login is not required; set to 1 to enable).
  - exceptions - Array reference of paths that are excluded from needing login. (Without a setting all CGI scripts require login if enable is set to 1).
- login_required_for_eprints
  - enable - Whether users need to be logged in to access eprint abstract/summary pages. (Without a setting login is not required; set to 1 to enable).
- login_required_for_views
  - enable - Whether users need to be logged in to access browse view pages. (Without a setting login is not required; set to 1 to enable).
- login_required_url - URL for the page to be redirected to if login is required. (Without a setting /cgi/users/login is used).
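The nested options in this section could be written as nested hash keys, as sketched below. This assumes the usual $c configuration hash in a cfg/cfg.d/ file and that sub-options map directly onto hash keys; the exact path format expected by exceptions is not stated on this page, so the entry shown is a hypothetical placeholder.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.

# Require login for /cgi/... pages, apart from listed exceptions.
$c->{login_required_for_cgi}->{enable} = 1;
# Hypothetical path; check the expected format before relying on this.
$c->{login_required_for_cgi}->{exceptions} = [ '/cgi/example_public_script' ];

# Require login for abstract/summary pages and browse views.
$c->{login_required_for_eprints}->{enable} = 1;
$c->{login_required_for_views}->{enable}   = 1;

# Where to redirect when login is required (this is the documented default).
$c->{login_required_url} = '/cgi/users/login';
```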
M
- max_history_width - The number of characters per line for XML displayed in the history tab before wrapping to a new line. (Without a setting a maximum of 120 characters are displayed before wrapping).
N
- notify_embargo_expiry - Allows a bespoke function to be defined to perform actions when an embargo is lifted on a document. As the name suggests, such actions may include sending out a notification to the eprint owner or a repository administrator. (Without a setting no additional actions are performed when an embargo is lifted).
O
- oai
  - v2
    - output_plugins - Hash reference of additional export plugins that can be used with OAI-PMH. (Without a setting, only those plugins that have an xmlns set, provide export for the appropriate dataset (typically eprint) and are at least visible to staff are used).
    - sample_identifier - An example of an identifier used by OAI-PMH for individual records. (Without a setting no sample identifier is shown on OAI-PMH pages).
- on_files_modified - Allows a function to be defined that is called when a files modified indexer task runs. This is a legacy function. The EP_TRIGGER_FILES_MODIFIED trigger function should be used instead.
- on_generate_thumbnails - Allows a function to be defined that is called when a generate thumbnails indexer task runs. This may be useful if a special type of thumbnail file is required but cannot be defined with existing configuration for thumbnail types (now using an EP_TRIGGER_THUMBNAIL_TYPES trigger function).
- on_logout - Allows a bespoke function to be defined to carry out certain actions before a user is logged out. (Without a setting no additional actions are performed).
- order_auto_submit - Whether changing the ordering for search results will automatically reload the page. (Without a setting search results pages will not automatically be reloaded on ordering change).
P
Q
R
- recaptcha - Configuration for reCAPTCHA fields.
  - ignore_countries - Array reference of ISO 3166-1 alpha-2 two-character country codes for countries whose requests should be automatically rejected. (Without a setting no countries are automatically rejected).
  - private_key - The secret key generated by registering the hostname of the EPrints repository as a reCAPTCHA v2 site.
  - public_key - The site key generated by registering the hostname of the EPrints repository as a reCAPTCHA v2 site.
  - timeout - The maximum time to wait for a response from reCAPTCHA. (Without a setting LWP's default of 180 seconds is used. reCAPTCHA is usually fairly quick to respond, so there is not really any need to set this).
- recaptcha3 - Configuration for reCAPTCHA v3 fields.
  - ignore_countries - Array reference of ISO 3166-1 alpha-2 two-character country codes for countries whose requests should be automatically rejected. (Without a setting no countries are automatically rejected).
  - min_score - The minimum score reCAPTCHA v3 needs to give the request for it to be valid. (Without a setting Google's default of 0.5 is used).
  - private_key - The secret key generated by registering the hostname of the EPrints repository as a reCAPTCHA v3 site.
  - public_key - The site key generated by registering the hostname of the EPrints repository as a reCAPTCHA v3 site.
  - timeout - The maximum time to wait for a response from reCAPTCHA. (Without a setting LWP's default of 180 seconds is used. reCAPTCHA is usually fairly quick to respond, so there is not really any need to set this).
- request_copy_cc - Array reference of email addresses that should be cc-ed on all request submissions that lead to an email being sent.
- required_formats - Legacy option for defining either an array reference or a function that will generate an array reference of required formats for documents. This can be useful if different eprint types should allow different formats of documents. However, validate_field configuration is a more appropriate way to do this in modern versions of EPrints.
- retain_embargo_dates - Whether embargo dates for documents should be retained when the embargo date has been reached. This may be useful if EPrints integrates with a third-party application that needs to retain embargo dates. (Without this setting, bin/lift_embargos will remove dates once the embargoes expire. Otherwise, validation warnings would be generated both for the dates being in the past and for documents still having embargo dates when they are public).
- rewrite_exceptions - Array reference of paths that should not be served by EPrints's Perl handler and should be served directly by Apache or some other handler.
- robotstxt
  - crawl_delay - Adds crawl delay configuration to robots.txt.
    - default_seconds - The default number of seconds all other user agents (bots) should wait between crawl requests. (Without a setting no default crawl delay will be set).
    - seconds - The number of seconds that specified (i.e. more aggressive) user agents (bots) should wait between crawl requests. (Without a setting 10 seconds will be used).
    - user_agents - Array reference of specific (more aggressive) user agents that require a (greater) crawl delay. (Without a setting no user agents will have a crawl delay, unless default_seconds is set).
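The nested recaptcha3 and robotstxt options might be set roughly as follows. This sketch assumes the usual $c configuration hash in a cfg/cfg.d/ file and that sub-options map directly onto nested hash keys; the keys, score, delays and user agent name are placeholders, not recommended values.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.

# Placeholder keys; real values come from registering the repository hostname.
$c->{recaptcha3}->{public_key}  = 'YOUR_SITE_KEY';
$c->{recaptcha3}->{private_key} = 'YOUR_SECRET_KEY';
# Stricter than the default of 0.5.
$c->{recaptcha3}->{min_score}   = 0.7;
# Wait at most 10 seconds rather than LWP's default of 180.
$c->{recaptcha3}->{timeout}     = 10;

# Crawl delay for every bot, with a longer delay for named bots.
$c->{robotstxt}->{crawl_delay}->{default_seconds} = 5;
$c->{robotstxt}->{crawl_delay}->{seconds}         = 30;
# 'ExampleBot' is a hypothetical user agent name.
$c->{robotstxt}->{crawl_delay}->{user_agents}     = [ 'ExampleBot' ];
```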
S
- saved_search_additional_recipients - Comma-separated list of email addresses that should additionally be included in saved search emails. (Without a setting no additional recipients will be added to saved search emails).
- saved_search_citation - The eprint citation to use for items listed in saved search emails.
- signup_style - The introductory message to show on the signup page. (Without a setting the default introductory message is used. If set to minimal then the minimal introductory message is used. Typically this setting is only useful if default_user_type is also set to minuser).
- skip_validation - Allows a bespoke function to be defined to determine if an eprint is allowed to skip validation. This may be useful for imported records that had different validation rules. (Without a setting no eprints are allowed to skip validation).
- STAFF_ONLY_LOCAL_callback - Allows a function to be defined to test if the current user is a local staff (editor/admin) user, in order to set STAFF_ONLY_LOCAL for use in workflows. (Without a setting STAFF_ONLY_LOCAL will be set to 0).
T
- theme - Rather than having to define templates, CSS and JavaScript branding just for the archive, use the files under the defined theme subdirectory (e.g. lib/theme/, archives/ARCHIVE_ID/cfg/theme/ or a theme sub-directory for paths listed in the flavour's inc file). (Without a setting no theme will be used).
- thumbnail_types - Allows a function to be defined to add extra thumbnail types that should be generated for a document. This is a legacy function. The EP_TRIGGER_THUMBNAIL_TYPES trigger function should be used instead.
U
- ultimate_doc_pos - Allows a bespoke function to be defined for determining the pos value (i.e. the sub-directory where files associated with this document are stored on disk) for the newly created document under an eprint. (Without a setting the value for the document's pos field is one greater than the current maximum pos value of all documents under that eprint).
- user_access_restrictions_callback - Allows a function to be defined to test if the current user has access to read/write (access View/Edit page) the specified user record.
- user_area_template - Page template to use when accessing user area pages. Superseded in EPrints 3.4.x by automatically using the default_internal template for all screen plugins. (Without a setting either the default (EPrints 3.3.x) or default_internal (EPrints 3.4.x) template is used).
- user_cookie_timeout - The amount of time (using the units h, d, w, m and y) before the user session cookie should expire (e.g. +1h, +7d). (Without a setting no cookie timeout is set).
- user_inactivity_timeout - How long to wait in seconds before logging the user out after their last activity. (Without a setting 86400 * 7 seconds, i.e. 7 days, is used).
- user_session_timeout - How long in seconds the user can stay logged in before they must log in again. (Without a setting the user session never times out. With a setting, user_inactivity_timeout will probably need to be reduced to a considerably shorter period of time than this).
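The three timeout options above interact, so a combined sketch may help. It assumes the usual $c configuration hash in a cfg/cfg.d/ file; the durations are illustrative, chosen so that the inactivity timeout is shorter than the session timeout as the description suggests.

```perl
# Sketch only: assumes $c is the repository configuration hash available in cfg.d files.

# Session cookie expires after 7 days (units: h, d, w, m, y).
$c->{user_cookie_timeout} = '+7d';

# Log the user out after 1 hour without activity (default is 86400 * 7 seconds).
$c->{user_inactivity_timeout} = 60 * 60;

# Force a fresh login after 24 hours regardless of activity.
$c->{user_session_timeout} = 24 * 60 * 60;
```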
V
- version_extra - Extra information to add about the EPrints repository software version, as it appears under /cgi/counter. Useful if the version of the codebase has been augmented. (Without a setting no extra version is included).
- view_sort_function - Redefines the sort function used by EPrints::MetaField->sort_values. (Without a setting, Unicode::Collate is used as the collator to compare values when sorting).
- virtualhost - When running bin/generate_apacheconf, what to set in the <virtualhost HOSTNAME:PORT> lines. (Without a setting, *, i.e. a wildcard, is used).
W
- workflow_datepicker - Defines a function for rendering a datepicker for date fields in a workflow. Useful if you want to include some JavaScript to help a user pick a date. (Without a setting, a simple set of year, month and day fields is rendered).