Https3
Introduction
Setting up EPrints3 to work with https is a little tricky. There seem to be a few bugs that need to be worked around. This How To considers the following scenario:
- Two repositories, repos1 and repos2, being served by virtual hosts repos1.FQDN:80 and repos2.FQDN:80
- A single https domain, at eprints.FQDN:443 (so that only one certificate is needed). Secure pages for repos1 and repos2 will be accessed at eprints.FQDN:443/repos1 and eprints.FQDN:443/repos2 respectively.
This How To should work with EPrints 3.0 or 3.0.1. It was developed on Ubuntu Server 6.06, but should work on other systems without significant changes. The instructions can be adapted for an arbitrary number of repositories.
It is assumed that EPrints is installed in /opt/eprints3/.
Getting started
Install EPrints 3.x following the appropriate instructions.
Run bin/epadmin create twice to create repos1 and repos2.
Edit /opt/eprints3/archives/repos1/cfg/cfg.d/10_core.pl to read:
$c->{host} = 'repos1.FQDN';
$c->{port} = 80;
$c->{aliases} = [];
$c->{securehost} = 'eprints.FQDN';
$c->{securepath} = '/repos1';
$c->{secureport} = 443;
Make secure versions of the templates:
cp /opt/eprints3/archives/repos1/cfg/lang/en/templates/default.xml /opt/eprints3/archives/repos1/cfg/lang/en/templates/secure.xml
Repeat these steps for repos2.
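With more than two repositories, the template copy is easy to script. The following is a minimal sketch, not part of EPrints: the function name make_secure_templates is made up for illustration, and it assumes every repository keeps its templates under cfg/lang/en/templates as above. It only covers the copy step; 10_core.pl still has to be edited by hand for each repository.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with EPrints): copy default.xml to
# secure.xml for each named repository.
# $1 is the EPrints base directory; the remaining arguments are repository ids.
make_secure_templates() {
    base="$1"
    shift
    for repo in "$@"; do
        tpl="$base/archives/$repo/cfg/lang/en/templates"
        # Stop at the first failure so a missing template is noticed.
        cp "$tpl/default.xml" "$tpl/secure.xml" || return 1
    done
}

# Example call (left commented out so that sourcing this file changes nothing):
# make_secure_templates /opt/eprints3 repos1 repos2
```

Passing the base directory as an argument keeps the helper usable if EPrints is installed somewhere other than /opt/eprints3.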
Generate the Apache configuration:
/opt/eprints3/bin/generate_apacheconf
Add 'Include /opt/eprints3/cfg/apache.conf' to the Apache configuration (on Ubuntu / Debian, you can replace everything in /etc/apache2/sites-available/default with 'Include /opt/eprints3/cfg/apache.conf'). Apache should now be correctly configured to serve the non-secure pages.
Secure Apache Configuration
Next, we want to configure Apache to serve the secure pages. However, generate_apacheconf hasn't created a secure.conf file in /opt/eprints3/cfg/ so this needs to be done manually. Some configuration has been generated for us in /opt/eprints3/archives/repos1/var/auto-secure.conf and /opt/eprints3/archives/repos2/var/auto-secure.conf, but there are some problems with this:
- Some sections of the configuration overlap;
- The EPrints_ArchiveID and EPrints_Secure variables (set with PerlSetVar) have not been set.
We'll therefore create our own configuration. Create a new file called cfg/secure.conf:
#cfg/secure.conf:
NameVirtualHost *:443
<VirtualHost *:443>
  ServerAdmin itsupport@FQDN
  ServerName eprints.FQDN
  SSLEngine On
  SSLCertificateFile /etc/apache2/ssl/apache.pem
  ErrorLog /var/log/apache2/error.log
  # Possible values include: debug, info, notice, warn, error, crit,
  # alert, emerg.
  LogLevel warn
  CustomLog /var/log/apache2/access.log combined
  ServerSignature On
  DocumentRoot "/var/www/eprints"
  <Directory "/opt/eprints3/cgi/users">
    AuthName "User Area"
    AuthType "Basic"
    PerlAuthenHandler EPrints::Apache::Auth::authen
    PerlAuthzHandler EPrints::Apache::Auth::authz
    require valid-user
    SetHandler perl-script
    PerlHandler ModPerl::Registry
    PerlSendHeader Off
    Options ExecCGI FollowSymLinks
  </Directory>
  <Directory "/opt/eprints3/cgi/users/awstats">
    PerlSendHeader On
  </Directory>
  <Directory "/opt/eprints3/cgi">
    SetHandler perl-script
    PerlHandler ModPerl::Registry
    PerlSendHeader Off
    Options ExecCGI FollowSymLinks
  </Directory>
  PerlTransHandler EPrints::Apache::Rewrite
  Include /opt/eprints3/archives/repos1/var/manual-secure.conf
  Include /opt/eprints3/archives/repos2/var/manual-secure.conf
</VirtualHost>
Note the line 'DocumentRoot "/var/www/eprints"'. Create an index.html file in /var/www/eprints/ with a welcome message and links to the home pages of the repositories. Also note that we need to create a manual-secure.conf file for each repository. The contents of this file are as follows:
#/opt/eprints3/archives/repos1/var/manual-secure.conf
<Location "/repos1">
  PerlSetVar EPrints_ArchiveID repos1
  PerlSetVar EPrints_Secure yes
  PerlSetVar EPrints_Dir_SecuredCGI /opt/eprints3/cgi/users
  PerlSetVar EPrints_Dir_Documents /opt/eprints3/archives/repos1/documents
  PerlLogHandler EPrints::Apache::LogHandler
</Location>
Alias /repos1/cgi/accounts/confirm /opt/eprints3/cgi/confirm
Alias /repos1/cgi/accounts/register /opt/eprints3/cgi/register
Alias /repos1/cgi/accounts/reset_password /opt/eprints3/cgi/reset_password
Alias /repos1/cgi/accounts/set_password /opt/eprints3/cgi/set_password
Alias /repos1/cgi/users/ /opt/eprints3/cgi/users/
Alias /repos1/ /opt/eprints3/archives/repos1/html/
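Since the file differs between repositories only in the repository id, it can be generated rather than typed out. The following is a rough sketch under stated assumptions: print_manual_secure_conf is a made-up helper, not an EPrints tool, and it assumes that the final Alias should point at each repository's own html directory and that EPrints lives in /opt/eprints3 unless a second argument says otherwise.

```shell
#!/bin/sh
# Hypothetical helper: print a manual-secure.conf for one repository.
# $1 is the repository id; $2 (optional) is the EPrints base directory.
print_manual_secure_conf() {
    repo="$1"
    base="${2:-/opt/eprints3}"
    # Unquoted here-document, so $repo and $base are expanded.
    cat <<EOF
#$base/archives/$repo/var/manual-secure.conf
<Location "/$repo">
  PerlSetVar EPrints_ArchiveID $repo
  PerlSetVar EPrints_Secure yes
  PerlSetVar EPrints_Dir_SecuredCGI $base/cgi/users
  PerlSetVar EPrints_Dir_Documents $base/archives/$repo/documents
  PerlLogHandler EPrints::Apache::LogHandler
</Location>
Alias /$repo/cgi/accounts/confirm $base/cgi/confirm
Alias /$repo/cgi/accounts/register $base/cgi/register
Alias /$repo/cgi/accounts/reset_password $base/cgi/reset_password
Alias /$repo/cgi/accounts/set_password $base/cgi/set_password
Alias /$repo/cgi/users/ $base/cgi/users/
Alias /$repo/ $base/archives/$repo/html/
EOF
}

# Example call (commented out so that sourcing this file writes nothing):
# print_manual_secure_conf repos2 > /opt/eprints3/archives/repos2/var/manual-secure.conf
```

Printing to standard output rather than writing the file directly makes it easy to review the result before it is put in place.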
For completeness, we'll also want to serve the welcome page over http. Add the following lines to /opt/eprints3/cfg/apache.conf:
<VirtualHost *:80>
  ServerName eprints.FQDN
  ServerAdmin itsupport@FQDN
  DocumentRoot "/var/www/eprints"
</VirtualHost>
Add 'Include /opt/eprints3/cfg/secure.conf' to the Apache configuration.
One thing remains. In /opt/eprints3/archives/repos1/cfg/cfg.d/misc.pl and /opt/eprints3/archives/repos2/cfg/cfg.d/misc.pl change the line
$c->{cookie_domain} = $c->{host};
to read
$c->{cookie_domain} = $c->{securehost};
Restart Apache. At this point it should be possible to access the repositories at http://repos1.FQDN and http://repos2.FQDN (on the installation this How To was developed on, http://publications.modhist.ox.ac.uk and http://oxhistonline.modhist.ox.ac.uk) and log in to the secure area.
Debian / Ubuntu specific SSL instructions
Create the file /etc/apache2/sites-available/ssl and add the line 'Include /opt/eprints3/cfg/secure.conf' to it. Run the commands:
a2ensite ssl
a2enmod ssl
apache2-ssl-certificate
echo "Listen 443" >> /etc/apache2/ports.conf
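Appending to ports.conf with echo adds another 'Listen 443' line every time the command is repeated, for example when these steps are re-run after a mistake. A small guard avoids that. This is only a sketch; the function name ensure_listen is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: add "Listen <port>" to a file only if that exact
# line is not already present.
# $1 is the port number; $2 is the file to edit.
ensure_listen() {
    port="$1"
    file="$2"
    if ! grep -q "^Listen $port\$" "$file" 2>/dev/null; then
        echo "Listen $port" >> "$file"
    fi
}

# Example call (commented out so that sourcing this file changes nothing):
# ensure_listen 443 /etc/apache2/ports.conf
```

The check matches the whole line, so an existing 'Listen 4430' would not be mistaken for 'Listen 443'.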
Bugs
Broken Actions
Links which call Perl cgi scripts are broken. For example, under Manage Deposits, click New Item, select an Item Type and then click Next: you will be returned to the Manage Deposits page, rather than taken to the next step in the workflow. This appears to be because the form action points to http://publications.modhist.ox.ac.uk/cgi/users/home#t rather than /publications/cgi/users/home. As far as I can see, this is a bug rather than a configuration mistake, though I'm happy to be advised otherwise.
The workaround I have for this bug is to install patch 252, which can be downloaded from http://files.eprints.org/252/
The patch seeks to resolve the problem by introducing a configuration variable, users_url, in 20_baseurls.pl. The use of perl_url has been replaced with users_url for all links to scripts in cgi/users. For insecure use, users_url can be set to perl_url. When https is required, it can be adjusted appropriately.
Apply the patch to the EPrints 3.x source (patch -d eprints-3.0/ -p0 < users-url.patch) and re-run configure and install.pl. Add the following lines to /opt/eprints3/archives/repos1/cfg/cfg.d/20_baseurls.pl:
$c->{secure_urlpath} = $c->{securepath};
$c->{secure_url} = "https://".$c->{securehost}.($c->{secureport}!=443?":".$c->{secureport}:"").$c->{secure_urlpath};

# Mod_perl scripts for users scripts
# If not using https, make this the same as perl_url
# Otherwise make it $c->{secure_url}."/cgi"
#$c->{users_url} = $c->{perl_url};
$c->{users_url} = $c->{secure_url}."/cgi";
Similarly for /opt/eprints3/archives/repos2/cfg/cfg.d/20_baseurls.pl. Restart Apache. It should now be possible to upload documents, step through workflows etc.
However, some bugs with image urls remain.
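To see what the secure_url expression in 20_baseurls.pl evaluates to, here is the same rule restated as a small shell function: the port is appended only when it is not the https default of 443. This is a worked illustration, not EPrints code, and build_secure_url is a made-up name.

```shell
#!/bin/sh
# Hypothetical illustration of the secure_url rule from 20_baseurls.pl.
# $1 is securehost, $2 is secureport, $3 is securepath.
build_secure_url() {
    host="$1"
    port="$2"
    path="$3"
    if [ "$port" -ne 443 ]; then
        # Non-default port: it must appear in the URL.
        printf 'https://%s:%s%s\n' "$host" "$port" "$path"
    else
        # Default https port: leave it out.
        printf 'https://%s%s\n' "$host" "$path"
    fi
}

# Example calls (commented out so that sourcing this file prints nothing):
# build_secure_url eprints.FQDN 443 /repos1
# build_secure_url eprints.FQDN 8443 /repos1
```

With the values used in this How To, the first call gives https://eprints.FQDN/repos1, so users_url becomes https://eprints.FQDN/repos1/cgi.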
Internet Explorer 'Secure and non Secure items'
We're not done yet! Internet Explorer complains if non-secure (http) and secure (https) items are displayed on the same page. This happens when a full url beginning http:// is embedded in a page that is served over https. If the securepath and url_path variables were the same, e.g. we were using http://repos1.FQDN/repos1 and https://eprints.FQDN/repos1, then we could simply make all urls relative. Because they are different, e.g. http://repos1.FQDN/ and https://eprints.FQDN/repos1, we must handle the secure and non-secure cases separately.
Edit /opt/eprints3/archives/repos1/cfg/lang/en/templates/secure.xml to use full https:// urls for included javascript and css. The head section should look something like this:
<head>
  <title><epc:pin ref="title" textonly="yes"/> - <epc:phrase ref="archive_name"/></title>
  <script src="{$config{secure_url}}/javascript/auto.js" type="text/javascript"></script>
  <style type="text/css" media="screen">@import url(<epc:print expr="$config{secure_url}"/>/style/secure_auto.css);</style>
  <style type="text/css" media="print">@import url(<epc:print expr="$config{secure_url}"/>/style/print.css);</style>
  <link rel="icon" href="/favicon.ico" type="image/x-icon"/>
  <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"/>
  <link rel="Top" href="{$config{frontpage}}"/>
  <link rel="Search" href="{$config{perl_url}}/search"/>
  <epc:pin ref="head"/>
</head>
<body bgcolor="#ffffff" text="#000000">
We now need to modify 'generate_static' to create the secure_auto.css file as well as auto.css. The relevant section should be modified like this:
# do the magic auto.js and auto.css
my $js = "";
my $css = "";
my $secure_css = "";
my $fn;
my $base_url = $session->get_repository->get_conf( "base_url" );
my $secure_url = $session->get_repository->get_conf( "secure_url" );
foreach my $target ( sort keys %{$map} )
{
	if( $target =~ m/(\/style\/auto\/.*\.css$)/ )
	{
		$css .= "\@import url($base_url$1);\n";
		$secure_css .= "\@import url($secure_url$1);\n";
	}
	if( $target =~ m/(\/javascript\/auto\/.*\.js$)/ )
	{
		$fn = $map->{$target};
		open( JS, $fn ) || EPrints::abort( "Can't read $fn: $!" );
		$js .= "\n\n\n/* From: $fn */\n\n";
		$js .= join( "", <JS> );
		close JS;
	}
}

$fn = "$base_target_dir/style/auto.css";
open( CSS, ">$fn" ) || EPrints::abort( "Can't write $fn: $!" );
$wrote_files->{$fn} = 1;
print CSS $css;
close CSS;

$fn = "$base_target_dir/style/secure_auto.css";
open( CSS, ">$fn" ) || EPrints::abort( "Can't write $fn: $!" );
$wrote_files->{$fn} = 1;
print CSS $secure_css;
close CSS;
Re-run generate_static repos1.
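To check whether any embedded http:// items remain, save the source of a secure page from the browser and search it. The following is a rough sketch: find_insecure_refs is a made-up helper, and the pattern only covers src attributes, CSS url(...) imports and link elements, so it is a starting point rather than a complete check. Ordinary hyperlinks to http pages are deliberately ignored, since they do not trigger the warning.

```shell
#!/bin/sh
# Hypothetical helper: list lines of a saved page that embed http:// items.
# $1 is the saved HTML (or CSS) file.
# Exit status is 0 if something was found and 1 if nothing was, as with grep.
find_insecure_refs() {
    grep -n -E 'src="http://|url\(http://|<link[^>]*href="http://' "$1"
}

# Example call (commented out; the file name is whatever you saved the page as):
# find_insecure_refs saved-page.html
```

Each line of output is prefixed with its line number in the saved file, which helps to trace the url back to the template or phrase that produced it.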
Missing Images
In /opt/eprints3/lib/lang/en/phrases/system.xml, change the links to images in the following phrases to use securepath:
- sys:ep_form_required
- Plugin/InputForm/Surround/Default:show_help
- Plugin/InputForm/Surround/Default:hide_help
e.g. the ep_form_required phrase becomes:
<epp:phrase id="sys:ep_form_required"><img src="{$config{securepath}}/style/images/required.png" border="0" class="ep_required" alt="Required"/> <epc:pin name="label"/></epp:phrase>
If some repositories aren't using https, you may prefer to override these phrases on a per repository basis.
Change the links to images in the following phrases to use urlpath:
- lib/session:show_help
- lib/session:hide_help
e.g. the lib/session:show_help phrase becomes:
<epp:phrase id="lib/session:show_help"><epc:pin name="link"><img alt="+" title="Show help" src="{$config{urlpath}}/style/images/help.gif" border="0"/></epc:pin></epp:phrase>
Troubleshooting
Some common problems and solutions:
File does not exist: /htdocs
In the log file you may see messages like:
File does not exist: /htdocs
This may be generated by https pages with invalid links to images (e.g. /images instead of /repos1/images) if no DocumentRoot is set for the https virtual host. Note that if we had only one repository per https address we could set DocumentRoot to /opt/eprints3/archives/ARCHIVEID/html/en/ (as suggested by Peter Schober), but this won't work on a per-repository basis as you can't put DocumentRoot in a Location block.
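To find out which paths are being requested, the 'File does not exist' entries in the error log can be summarised. This is a minimal sketch: missing_files_summary is a made-up helper, and it assumes the usual Apache error log layout, in which the message may be followed by ', referer: ...'.

```shell
#!/bin/sh
# Hypothetical helper: count "File does not exist" errors per missing path,
# most frequent first.
# $1 is the Apache error log.
missing_files_summary() {
    grep 'File does not exist: ' "$1" |
        sed -e 's/.*File does not exist: //' -e 's/, referer: .*//' |
        sort | uniq -c | sort -rn
}

# Example call (commented out; the path is the ErrorLog set in secure.conf):
# missing_files_summary /var/log/apache2/error.log
```

Paths that appear often point to a link in a template or phrase that still needs the repository prefix.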
EPrint 1 has no directory set
If you get the message “EPrint 1 has no directory set. This is very dangerous as EPrints has no idea where to write files for this eprint. This may imply a buggy import tool or some other cause of corrupt data.” it probably means that you forgot to set Apache to run as eprints:eprints (or alternatively, add the Apache user to the eprints group).
Other
Remember to check that the EPrints_Dir_SecuredCGI and EPrints_Dir_Documents variables are set and cookie_domain is securehost not host.