Non-root proxy

From EPrints Documentation
Latest revision as of 09:48, 22 October 2013

Task

To build Apache 2 and mod_perl 2 for ePrints 3, installed as a standard user with no superuser-like access to core services (including Perl and MySQL). The ePrints server will run as a normal user, and be accessed through a central proxy.

[Image: Webserver_proxy.png]

We will build the web server on a private computer called services.example.com, running on port 1234; public access will be through eprints.example.com, on port 80.
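This page does not show the central proxy's own configuration, so here is a minimal sketch of what the front-end virtual host might look like, printed by a small shell function. It assumes the proxy is also Apache with mod_proxy and mod_proxy_http loaded; the function name proxy_vhost is made up for illustration, and the host names and port are the example values used on this page.

```shell
# Print a front-end virtual host that forwards every request to the back end.
# Assumption: the central proxy is Apache with mod_proxy + mod_proxy_http.
proxy_vhost() {
    public_host=$1 backend_host=$2 backend_port=$3
    cat <<EOF
<VirtualHost *:80>
    ServerName ${public_host}
    # We only want reverse proxying, never an open forward proxy.
    ProxyRequests Off
    ProxyPass        / http://${backend_host}:${backend_port}/
    ProxyPassReverse / http://${backend_host}:${backend_port}/
</VirtualHost>
EOF
}

proxy_vhost eprints.example.com services.example.com 1234
```

Whoever runs the central proxy will have their own conventions, so treat this as a starting point for that conversation rather than a drop-in file.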

Preparation

As we are installing software as a normal user (I'll use MyUser in this example), we are not adding any additional Perl modules centrally, but into a local tree. Cleverly, the Perl people have thought of this: simply running the cpan command for the first time will configure the whole thing for you!


%> cpan

When asked which approach you want for installing Perl modules, the answer is local::lib
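For reference, the sketch below prints roughly the environment settings that local::lib arranges for a given root directory. The function name local_lib_env is made up, and the exact lines may differ between local::lib versions; the authoritative output comes from running perl -Mlocal::lib yourself, and cpan normally offers to append it to your shell profile for you.

```shell
# Print export lines similar to those local::lib generates for ROOT.
# Assumption: the default local::lib layout (ROOT/bin, ROOT/lib/perl5).
local_lib_env() {
    root=$1
    printf 'export PATH="%s/bin${PATH:+:${PATH}}"\n' "$root"
    printf 'export PERL5LIB="%s/lib/perl5${PERL5LIB:+:${PERL5LIB}}"\n' "$root"
    printf 'export PERL_LOCAL_LIB_ROOT="%s${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"\n' "$root"
    printf 'export PERL_MB_OPT="--install_base \\"%s\\""\n' "$root"
    printf 'export PERL_MM_OPT="INSTALL_BASE=%s"\n' "$root"
}

local_lib_env /home/MyUser/perl5
```

If modules you install later cannot be found, comparing your shell's PERL5LIB against these lines is a quick first check.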

Now we can start installing software.

Apache

Install a base Apache (previously downloaded into ~/distributions):


%> cd ~/distributions/
%> tar xvf httpd-2.2.x.tar
%> cd httpd-2.2.x

If you are returning to an existing source-tree, rather than a brand new untar'd bundle, clear any previous setup:


%> make distclean

Now configure and install an initial Apache server:


%> ./configure --prefix=/home/MyUser/www --disable-userdir --disable-status
%> make
%> make install

Edit httpd.conf (essentially, set the port the server listens on) and start the web server. Check the error log:


%> cat ~/www/logs/error_log
[...] Apache/2.2.x (Unix) Configured -- resuming normal operations
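Because the server runs as a normal user, the port you pick has to be one an unprivileged process can bind to, which on typical Unix systems means 1024 or above. A small sketch of that check (check_user_port is a made-up helper name):

```shell
# Succeed only if PORT is a number an unprivileged user can normally bind to.
check_user_port() {
    port=$1
    case $port in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely digits
    esac
    [ "$port" -ge 1024 ] && [ "$port" -le 65535 ]
}

if check_user_port 1234; then
    echo "port 1234 is usable by a normal user"
fi
```

Port 80 fails this check, which is exactly why the public address is handled by the central proxy.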

Mod-Perl

Stop web server and install the Mod-Perl extensions (previously downloaded into ~/distributions):


%> cd ~/distributions/
%> tar xvf mod_perl-2.0-current.tar
%> cd mod_perl-2.0.x

If you are returning to an existing source-tree, rather than a brand new untar'd bundle, clear any previous setup:


%> make clean 

Now configure and install mod-perl into the Apache tree, and (re)install Apache. In this example, I am specifying a version of Perl to use:


%> /path/to/specific/perl Makefile.PL PREFIX="/home/MyUser/perl5" MP_USE_DSO=1 \
   MP_APXS="/home/MyUser/www/bin/apxs" \
   MP_AP_CONFIGURE="--prefix=/home/MyUser/www --disable-userdir \
   --disable-status --enable-module=mod_perl"
%> make
%> make install

NOTE: To install under Debian/Ubuntu, you will need the libperl-dev package installed.

NOTE: The command above defines a PREFIX, which matches the prefix in the CPAN configuration; states that we want mod-perl as a DSO; gives the full path to the previously installed Apache "apxs" command; and passes configure parameters to the Apache rebuild that enable mod-perl.

Editing the new apache config file

We need to enable the mod-perl module, which I do using one of the Includes:

  • In ~/www/conf/httpd.conf, add:

# Mod-Perl
Include conf/extra/httpd-perl.conf

  • Create ~/www/conf/extra/httpd-perl.conf:

#
# Load the Mod_perl DSO.
#
LoadModule perl_module modules/mod_perl.so
PerlSwitches -I/home/MyUser/perl5/lib/perl5/site_perl/5.x.y/sun4-solaris/ \
             -I/home/MyUser/perl5/lib/perl5/5.x.y/sun4-solaris \
             -I/home/MyUser/perl5/lib/perl5/5.x.y \
             -I/home/MyUser/perl5/lib/perl5/site_perl/5.x.y/sun4-solaris \
             -I/home/MyUser/perl5/lib/perl5/site_perl/5.x.y \
             -I/home/MyUser/perl5/lib/perl5/site_perl

  • NOTE: the "PerlSwitches" line tells the Apache server where to look for extra libraries; the paths should match the PERL5LIB environment variable used in your shell.
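Typing those include paths by hand is error-prone, so here is a sketch that prints the PerlSwitches block from a prefix, Perl version and architecture name. The function name perl_switches is made up, and the directory layout is simply the one shown above; adjust it if your tree differs.

```shell
# Print a PerlSwitches directive for the PREFIX-style tree used on this page.
# To find the values for your Perl, something like this should work:
#   /path/to/specific/perl -MConfig -e 'print "$Config{version} $Config{archname}\n"'
perl_switches() {
    prefix=$1 version=$2 arch=$3
    printf 'PerlSwitches -I%s/lib/perl5/site_perl/%s/%s \\\n' "$prefix" "$version" "$arch"
    printf '             -I%s/lib/perl5/%s/%s \\\n' "$prefix" "$version" "$arch"
    printf '             -I%s/lib/perl5/%s \\\n' "$prefix" "$version"
    printf '             -I%s/lib/perl5/site_perl/%s \\\n' "$prefix" "$version"
    printf '             -I%s/lib/perl5/site_perl\n' "$prefix"
}

perl_switches /home/MyUser/perl5 5.8.0 sun4-solaris
```

Redirect the output into a scratch file and paste it under the LoadModule line in httpd-perl.conf.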

Start the web server. Check the error log:


%> cat ~/www/logs/error_log
[...] Apache/2.2.x (Unix) Configured -- resuming normal operations
[...] caught SIGTERM, shutting down
[...] Apache/2.2.x (Unix) mod_perl/2.0.x Perl/v5.x.y configured -- resuming normal operations

Stop the web server again.


ePrints

Before you can install ePrints, you need to check the package requirements. CGI.pm builds against the installed mod-perl modules, so the system-wide copy may well be wrong for your build. You may need to install your own version.

eg:


%> /path/to/specific/perl -MCPAN -e shell
[snip]
cpan> install CGI
[...]
cpan> quit

Now we can install the ePrints software (previously downloaded into ~/distributions):


%> cd ~/distributions/
%> tar xvf eprints-3.zzz.tar
%> cd eprints-3.zzz/

There is no option to clean a previously configured eprints tree, so keep going:


%> ./configure --prefix=/home/MyUser/ePrints --with-perl=/path/to/specific/perl --with-user=MyUser \
    --with-group=MyUserGroup --with-toolpath=/path/to/tools --disable-diskfree \
    --with-smtp-server=your.mail.server

Note: the same version of perl is being defined again, and the /path/to/tools is a directory to find various external tools (tar, wget, (g)unzip, pdftotext, lynx, etc)

... and install:


%> ./install.pl

As we do not have root access to the MySQL database, you will need to get the database administrator to add a user that provides access to the MySQL database. Note: Assuming your user is not given GRANT ALL (it's a big security risk), you will need CREATE TEMPORARY TABLES as well as CREATE privileges.
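To make the request to the database administrator concrete, the sketch below prints a GRANT statement you could send them. Only CREATE and CREATE TEMPORARY TABLES come from this page; the rest of the privilege list is my assumption about what a typical application account needs, and the database, user and function names (eprints, eprints_user, grant_sql) are hypothetical.

```shell
# Print a MySQL GRANT statement for the repository's database account.
# Assumption: the privilege list beyond CREATE / CREATE TEMPORARY TABLES
# is a guess; agree the final list with your database administrator.
grant_sql() {
    db=$1 user=$2 host=$3
    cat <<EOF
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER, INDEX,
      CREATE TEMPORARY TABLES, LOCK TABLES
   ON ${db}.* TO '${user}'@'${host}';
EOF
}

grant_sql eprints eprints_user services.example.com
```

The host part should be the machine the ePrints server connects from, which here is the private back-end host.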

Create the basic repository

The trick here is to create a basic EPrints repository that matches the web server (installed above), and then alter the core configuration to reflect the proxy settings.


%> cd /home/MyUser/ePrints

We also know that the service is going to run as a local user, so the file and directory permissions need to be altered.

These are defined in the file perl_lib/EPrints/SystemSettings.pm:

$EPrints::SystemSettings::conf = {
                                    # ... snip ...
                                    'file_perms' => 0640,
                                    # ... snip ...
                                    'dir_perms' => 0775
                                    # ... snip ...
};
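If the octal values are unfamiliar, this throwaway sketch applies the same modes to a scratch file and directory and shows what they look like in ls. Nothing here touches the EPrints tree; the scratch directory comes from mktemp and is removed afterwards.

```shell
# Show what modes 0775 (directories) and 0640 (files) look like in practice.
demo=$(mktemp -d)
mkdir "$demo/dir"
: > "$demo/file"
chmod 775 "$demo/dir"
chmod 640 "$demo/file"

# Keep only the ten type/permission characters from the ls output.
dir_mode=$(ls -ld "$demo/dir" | cut -c1-10)
file_mode=$(ls -l "$demo/file" | cut -c1-10)
echo "$dir_mode $file_mode"    # prints: drwxrwxr-x -rw-r-----

rm -rf "$demo"
```

So directories are writable by the owner and group, and files are readable by the group but closed to everyone else.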

In EPrints 3, all the configuration is done using bin/epadmin.

  • Initial configuration

NOTE: EPrints assumes a MySQL backend, and so do the configuration routines. This means that you need to have the MySQL libraries available to the Perl routines during this process. You may set up a Postgres, Oracle, or "Cloud" storage system once the basic configuration is in place, but you need MySQL to kick-start the process.

NOTE: In my install, all the bin/ files are installed with /usr/bin/perl, not whatever was defined in the ./configure statement - you may need to edit them!
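Editing every script by hand gets tedious, so here is a sketch of a helper that rewrites the first line of each file given to it. The name fix_shebang is made up; it only touches a first line of the form #!/some/path/perl, keeps any switches after it, and leaves "#!/usr/bin/env perl" style lines alone.

```shell
# Point the shebang line of each FILE at PERL_PATH.
# Usage: fix_shebang /path/to/specific/perl bin/*
fix_shebang() {
    perl_path=$1
    shift
    for file in "$@"; do
        tmp=$(mktemp) || return 1
        # Rewrite line 1 only, and only when it names a perl binary directly.
        sed "1s|^#![^ ]*perl|#!$perl_path|" "$file" > "$tmp" &&
            cat "$tmp" > "$file"    # cat (not mv) keeps the executable bit
        rm -f "$tmp"
    done
}
```

Try it on a copy of one script first, and check that the file still runs before doing the whole bin/ directory.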

The command to create the basic framework for a repository is:

bin/epadmin create
  • When asked Configure vital settings? [yes] ?, say "Yes" and fill in the details
    • Hostname? is the actual address of the web server created above, not the public address (we fix that later) - so services.example.com
    • Webserver Port [80] ? is the actual port of the web server created above, not the public port (we fix that later) - so 1234
    • Archive Name [Test Repository] ? is the name that the EPrints.org system will use when it dynamically supplies the archive name (using the <epc:pin /> coding)
  • When asked Configure database? [yes] ?, say "Yes" and fill in the details
    • Database Name is the name of the database created for you by the database admin people
    • MySQL Host is the hostname for the server
    • MySQL Port and MySQL Socket can probably be left blank, but check with the database admin people
    • Database User and Database Password are as agreed with the database admin people
  • Create database "Deposit" NO - you can't, as we don't have that level of access to the database.

Now we need to do the rest of the building-work manually:

When using a database that is not MySQL, you need to tell EPrints which driver to use.

  • Edit
     archives/<ARCHIVEID>/cfg/cfg.d/database.pl 
    and add
     $c->{dbdriver} = "xxxx"; 
    where xxxx is the appropriate Perl DBD package (eg Pg for PostgreSQL)
  • Create the database tables:
     bin/epadmin create_tables <ARCHIVEID> 
  • Create users (I suggest an admin user and a normal user):
     bin/epadmin create_user <ARCHIVEID> 
  • Build the subject tables:
     bin/import_subjects <ARCHIVEID> <path/to/subject/file>
    Either use lib/defaultcfg/subjects for the shipped Library Of Congress tree, or download a subject tree from files.eprints.org
  • Create the static web pages for a basic web site:
     bin/generate_static <ARCHIVEID> 
  • Create pages of abstracts:
     bin/generate_abstracts <ARCHIVEID> 
    (should do nothing, as there are no abstracts in the system)
  • Create the browse pages:
     bin/generate_views <ARCHIVEID> 
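The steps above can be strung together in a small wrapper, sketched below. The helper names run and build_repository and the archive ID myarchive are made up; the commands themselves are the ones listed above. With DRY_RUN set, it only prints what it would run, which is a safe way to check the order first. Note that create_user is interactive, so expect prompts in a real run.

```shell
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Build a repository step by step, stopping at the first failure.
# Run from the EPrints root (eg /home/MyUser/ePrints).
build_repository() {
    archive=$1 subjects=$2
    run bin/epadmin create_tables "$archive" &&
    run bin/epadmin create_user "$archive" &&
    run bin/import_subjects "$archive" "$subjects" &&
    run bin/generate_static "$archive" &&
    run bin/generate_abstracts "$archive" &&
    run bin/generate_views "$archive"
}

# Preview the commands without touching anything:
DRY_RUN=1
build_repository myarchive lib/defaultcfg/subjects
```

Unset DRY_RUN (or start a fresh shell) before running it for real.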

Configuring the repository:

EPrints has a global Apache configuration file, and then separate config files for each <ARCHIVEID> archive running under the web server.

To be sure of the correct config files for your particular install (we are discovering that they seem to move around depending on version), run the following commands:

 cd /home/MyUser/ePrints
 find ./ -name '*.conf' -print

Generate the apache configuration files.

 cd /home/MyUser/ePrints
 bin/generate_apacheconf 

Edit httpd.conf

  • Remove the document root and cgi-bin stuff from the httpd.conf file (the name and the <directory> section) - these are set on a per-repository basis
  • Include the generated apache.conf:

# EPrints
Include /home/MyUser/ePrints/cfg/apache.conf



EPrints produces absolute URLs for everything (http://web.host.name/), so we need to ensure that the repository uses the correct address. Edit archives/ARCHIVEID/cfg/cfg.d/10_core.pl

$c->{host} = 'eprints.example.com';
$c->{port} = '80';
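If you would rather script that edit, the sketch below rewrites the two settings in place. The function name set_public_address is made up, and it assumes each setting sits on its own line starting with $c->{host} or $c->{port}, as in the excerpt above; check your own 10_core.pl before relying on it.

```shell
# Set the public host and port in an EPrints core config file.
# Usage: set_public_address archives/ARCHIVEID/cfg/cfg.d/10_core.pl HOST PORT
set_public_address() {
    file=$1 host=$2 port=$3
    tmp=$(mktemp) || return 1
    # [$] matches a literal dollar sign; everything after "=" is replaced.
    sed -e 's|^\([$]c->{host}[[:space:]]*=\).*|\1 '"'$host';|" \
        -e 's|^\([$]c->{port}[[:space:]]*=\).*|\1 '"'$port';|" \
        "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For the example on this page the call would be set_public_address with eprints.example.com and 80 as the last two arguments.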

Fix a bug-ette

A problem has been found on a number of systems where the server goes into a loop of reporting:

 [warn] (128)Network is unreachable: connect to listener on [::]:<PORTNO>

The solution is to alter the "Listen" directive to include an IP address. Either use the IP address of the host, or cheat:

  Listen 0.0.0.0:<PORTNO>
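The same change can be scripted; the sketch below turns a bare "Listen <port>" line in a config file into the IPv4-only form. The function name bind_listen_ipv4 is made up, and lines that already carry an address are left untouched.

```shell
# Rewrite "Listen 1234" as "Listen 0.0.0.0:1234" in an Apache config file.
# Usage: bind_listen_ipv4 ~/www/conf/httpd.conf
bind_listen_ipv4() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Only lines consisting of "Listen" plus a bare port number are changed.
    sed 's|^[[:space:]]*Listen[[:space:]]\{1,\}\([0-9]\{1,\}\)[[:space:]]*$|Listen 0.0.0.0:\1|' \
        "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Keep a backup copy of httpd.conf before running anything that edits it in place.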

Start the web server. Check the error log:


%> cat ~/www/logs/error_log
[...] Apache/2.2.0 (Unix) Configured -- resuming normal operations
[...] caught SIGTERM, shutting down
[...] Apache/2.2.0 (Unix) mod_perl/2.0.2 Perl/v5.8.0 configured -- resuming normal operations
[...] [notice] caught SIGTERM, shutting down
EPrints archives loaded: <ARCHIVEID>
EPrints archives loaded: <ARCHIVEID>
[...] Apache/2.2.0 (Unix) mod_perl/2.0.2 Perl/v5.8.0 configured -- resuming normal operations

GLORY IN YOUR NEW EPRINTS SYSTEM!!!!

To modify the general layout of the page, edit ePrints/archives/<ARCHIVEID>/cfg/template-en.xml and then re-run .../bin/generate_static <ARCHIVEID>