Difference between revisions of "Required software"

From EPrints Documentation
(links added, formatting harmonized)
 
(22 intermediate revisions by 11 users not shown)
Line 1: Line 1:
 
{{manual}}
 
{{manual}}
 +
[[Category:Installation]]
  
 
==What Additional Software does EPrints Require?==
 
==What Additional Software does EPrints Require?==
In brief, EPrints requires Apache (with mod_perl), MySQL and Perl with some extra modules. Ideally you also want wget, tar and unzip.
 
  
EPrints bundles some perl modules which it uses, to save you installing them.
+
In brief, EPrints minimally requires Apache (with mod_perl), MySQL and Perl with some extra modules. Various utilities like wget, tar and unzip would also be useful.
  
===Where to get the Required Software===
+
EPrints bundles some [https://github.com/eprints/eprints3.4/blob/master/cpan_modules.pl Perl modules] which it uses, to save you installing them.
It's up to you. We have had best results with installing MySQL from RPM and apache from source.
 
  
The best place to get a software tool is the official site, but we've put a mirror of versions known to work at: http://www.eprints.org/files/tools/ - you don't need to install ''everything'' in the tools directory - just those described below.
 
  
* [[Installing MySQL]]
+
==Where to get the Required Software==
 +
Apache, MySQL, Perl and mod_perl are  all provided as operating system level packages that can be installed on EPrints' [[Recommended Platforms]].  If you wish to install on a  platform that is not recommended, then you will need to determine the best way to install these applications.  It may be possible to infer comparable packages for your platform by checking the dependencies installed on [[Installing EPrints on RHEL/Fedora/CentOS|Red Hat based]] and [[Installing EPrints on Debian/Ubuntu|Debian based]] Linux.
  
  
==Apache/mod_perl==
+
==Other Tools==
Tested on: apache 1.3.14 with mod_perl 1.25
 
  
Apache is the most commonly used webserver in the world, and it's free! EPrints requires Apache to be configured with mod_perl, as this allows Apache modules that are entirely written in perl, hence providing much improved efficiency.
+
===File uploads===
 +
<tt>wget</tt>, <tt>tar</tt>, <tt>gunzip</tt> and <tt>unzip</tt> are required to allow users to upload files as <tt>.tar.gz</tt> or <tt>.zip</tt> or to capture them from a URL.
  
Get Apache from http://httpd.apache.org/dist/httpd/
+
These all come installed with most modern Linux distributions. If you cannot get them working, you can remove the relevant option by editing "archive_formats" in <tt>SystemSettings.pm</tt>.
  
EPrints requires that the apache module '''mod_perl''' is enabled.
+
If there are problems you may need to tweak how these are invoked in <tt>SystemSettings.pm</tt>.
  
===Apache with mod_perl Installation - Step by Step===
 
; Download mod_perl and apache sources : ; Make mod_perl, I use this command (in the modperl src dir): :
 
<code>
 
% perl Makefile.PL APACHE_PREFIX=/usr/local/apache \
 
APACHE_SRC=../apache-1.3.14/src DO_HTTPD=1 USE_APACI=1 \
 
EVERYTHING=1
 
</code>
 
Remeber to change <tt>../apache-1.3.14/src</tt> to wherever your apache source is relative to this directory. The back slashes at the end of the line allow a single command to be split over multiple lines.
 
; Make and install apache. From the mod_perl src dir, I use: :
 
<code>
 
% make
 
% make install
 
</code>
 
  
( mod perl should have already run the apache ./configure script for us. )
+
===Full Text Indexing===
 +
The EPrints indexer requires various tools to extract plain (UTF-8) text from different types of document for indexing.
  
==Perl 5.6 and Perl Modules==
+
The full text indexer requires various tools to index each kind of document. These tools may or may not be already installed in your system. EPrints uses these tools to build a "words" file for each document (which contains the text of the document in UTF-8). If it can't run the tool, the "words" file will be empty and EPrints will not retry creating it unless you manually remove it.
EPrints is currently begin developed with perl 5.6.1, there are currently no plans for to make EPrints run under perl 6 on the theory of if-it-ain't-broke-don't-fix-it.
 
  
Some perl modules are bundled with the EPrints2 package, others must be installed by you.
+
====PDF====
 +
Full text indexing of PDF documents requires the <tt>pdftotext</tt> application, provided by the ''poppler-utils'' Deb or RPM package.
  
===Installing a Perl Module===
+
====Microsoft Word====
This describes the way to simple perl module, some require a bit more effort. We will use the non-existant FOO module as an example.
+
Full text indexing of Microsoft Word documents is provided by the ''antiword'' Deb or RPM package. The RPM package is available through the [https://forensics.cert.org/cert-forensics-tools-release-el7.rpm forensics] RPM repository.
  
Some archives can be installed direct from CPAN. That's great when it works. It doesn't always work, but it's the quickest and easiest, so give it a go first. To install a perl module from CPAN run:
+
====HTML====
 +
Full text indexing of HTML documents requires the <tt>lynx</tt> text-based browser provided by the ''lynx'' Deb or RPM package.
  
  
<code>
+
===LaTeX Tools===
% perl -MCPAN -e 'install Foo::Bar'
+
There is an optional feature which allows you to instruct EPrints to look in certain fields (e.g. title and abstract) for strings that look like LaTeX equations and render them as images. These tools are only required if you want to use this feature.
</code>
 
Where <tt>Foo::Bar</tt> is the module you're installing.
 
  
I would like to make a list of which modules do/don't install OK from CPAN. If you're reading this before the end of Jan 2003, send me (Christopher Gutteridge) any comments on which ones worked, and on what operating system.
+
These are provided by the ''tetex-latex'' and ''ImageMagick'' RPMs or the ''texlive-base'', ''texlive-bin'' and ''imagemagick'' Deb packages.
  
; Download the archive. : Either from cpan.org, or from the tools directory on eprints.org described at the top of this chapter. Our example archive is <tt>FOO-5.23.tar.gz</tt>.
+
This is a "cosmetic" feature, it only affects the rendering of information, so you can always add it later if you want to save time initially.
; Unpack the archive: :
 
<code>
 
% gunzip FOO-5.23.tar.gz
 
% tar xf FOO-5.23.tar
 
</code>
 
; Enter the directory this creates: :
 
<code>
 
% cd FOO-5.23
 
</code>
 
; Run the following commands: :
 
<code>
 
% perl Makefile.PL
 
% make
 
% make test
 
% make install
 
</code>
 
 
 
===Perl Modules Bundled with EPrints===
 
You don't have to install these. They are included as part of the EPrints distribution.
 
 
 
<tt>XML::DOM</tt>, <tt>XML::RegExp</tt>, <tt>Filesys::DiskSpace</tt>, <tt>URI</tt>, <tt>Apache::AuthDBI</tt>, <tt>Unicode::Normalize</tt>, <tt>Proc::Reliable</tt>.
 
 
 
Please note that these modules are not part of the EPrints system and are only included to make things easier. Please note that XML::DOM has has a few lines commented out to prevent it requiring additional modules.
 
 
 
===Required Perl Modules (Which you will probably have to install)===
 
This modules are not built into EPrints - you must install them yourself. We recommend installing them in the order they are listed.
 
 
 
; '''Data::ShowTable''' : MySQL Interface Module requires this.
 
; '''DBI''' : Tested with: v1.14
 
MySQL Interface Module requires this.
 
; '''Msql-Mysql Module''' : Tested with: v1.2215
 
This one can be tricky. It requires access to .h and library files from MySQL. I install MySQL from source first, but some installs of MySQL don't put the lib and include dirs where this module expects. The answer to the first question is that you only need MySQL support.
 
Under Red Hat's GNU/Linux distribution, the '''zlib-devel''' RPM should be installed before you install this module.
 
; '''MIME::Base64''' : Tested with: v2.11
 
Unicode::String requires this.
 
; '''Unicode::String''' : Used for Unicode support. No known problems. Tested with v2.06.
 
; '''XML::Parser''' : Tested with v2.30
 
Used to parse XML files. Requres the '''expat library'''. A .tar.gz and an RPM are available in the tools dir on eprints.org.
 
; '''Apache''' : The perl Apache.pm module is acutally part of mod_perl - installing mod_perl as part of Apache should also have installed the perl Apache module.
 
  
Since version 2.3.7 The modules "Apache::Request" and "Apache::Test" (aka. "libapreq") are no longer required. They were a pain to install and the software has been redesigned to not use them at all.
 
  
===Required Perl Modules (Which you will probably already have)===
+
==Other Platforms==
Most PERL 5.6 or later systems should already include the following modules, but you may have to install some by hand on certain platforms.
+
Often the best way to find a package for another platform is to search for its Red Hat or Ubuntu Linux package name together with the name of your platform (e.g. "antiword Arch Linux"). If your platform does not have comparable packages, then the next best option is to download the software from its official site. Below are links to the download pages for the essential components of EPrints:
 
+
* [https://httpd.apache.org/download.cgi Apache]
<tt>CGI</tt>, <tt>Carp</tt>, <tt>Cwd</tt>, <tt>Data::Dumper</tt>, <tt>Digest::MD5</tt>, <tt>File::Basename</tt>, <tt>File::Copy</tt>, <tt>File::Find</tt>, <tt>File::Path</tt>, <tt>Getopt::Long</tt>, <tt>Pod::Usage</tt>, <tt>Sys::Hostname</tt>.
+
* [https://dev.mysql.com/downloads/ MySQL] (or [https://downloads.mariadb.org/ MariaDB]) as well as [https://www.postgresql.org/download/ PostgreSQL] and even [https://www.oracle.com/de/downloads/ ORACLE]
 
+
* [https://www.perl.org/get.html Perl]
==Optional GDOME support==
+
* [https://perl.apache.org/download/ mod_perl]
Since EPrints 2.2 you may use either XML::DOM or XML::GDOME. XML::GDOME is recommended as it's faster and uses much less RAM, but it does require you to install a whole lot of extra libraries and perl modules. If you are running a pilot or demonstration service then XML::DOM is fine, and you can always switch over later by installing the required tools and setting the GDOME flag in perl_lib/EPrints/SystemSettings.pm
 
 
 
===Addional Libraries Required for GDOME support===
 
 
 
<code>
 
libxml2
 
libxml2-devel
 
</code>
 
either get the tarball from: ftp://ftp.gnome.org/pub/GNOME/sources/libxml2/
 
 
 
or the RPMs (but we have had problems with complex RPM dependencies):
 
 
 
 
 
<code>
 
http://rpmfind.net/linux/rpm2html/search.php?query=libxml2
 
http://rpmfind.net/linux/rpm2html/search.php?query=libxml2-devel
 
</code>
 
===The GDOME Library===
 
Obtain this from
 
 
 
 
 
<code>
 
http://gdome2.cs.unibo.it/#downloads
 
</code>
 
You may either use the RPMs (gdome2 and gdome2-devel) or the tarball.
 
 
 
===Additional Perl Modules Required for GDOME support===
 
 
 
<code>
 
XML-LibXML-Common
 
XML-NamespaceSupport
 
XML-GDOME
 
</code>
 
All of which are in http://www.cpan.org/modules/by-module/XML/
 
 
 
==Other Tools==
 
===File uploads===
 
'''wget''', '''tar''', '''gunzip''' and '''unzip''' are required to allow users to upload files as .tar.gz or .zip or to captures them from a URL.
 
 
 
These all come installed with most modern versions of linux. If you can get them working, you can remove the option by edditing "archive_formats" in SystemSettings.pm
 
 
 
Tested with wget 1.6.
 
 
 
If there are problems you may need to tweak how these are invoked in SystemSettings.pm
 
 
 
===Full Text Indexing===
 
The full text indexer requires various tools to index each kind of document. These tools may or may not be already installed in your system. EPrints uses these tools to build a "words" file for each document (which contains the text of the document in UTF-8). If it can't run the tool, the "words" file will be empty and EPrints will not retry creating it unless you manually remove it.
 
 
 
; PDF : To index pdfs you need "pdftotext" which is part of the "xpdf" package. RPM's are available.
 
; ASCII : To index ASCII files you don't need anything. That's easy.
 
; Microsoft Word : To index MS Word files you need a package called "wvware". It can be a bit of a bit of a pain to install.
 
; HTML : To index HTML files you need a tool called "lynx". It's a text-based web-browser.
 
 
 
===Latex Tools===
 
There is an optional feature which allows you to set eprints to look in certain fields (eg. title and abstract) for stuff which looks like latex equations and display it as an image of that equation instead. These tools are only required if you want to use this feature.
 
 
 
'''latex''', '''dvips''' and '''convert''' (convert is part of "imagemagick"). (These all ship with Red Hat's GNU/Linux distribution but you may have to install them yourself on other systems.)
 
 
 
This is a "cosmetic" feature, it only affects the rendering of information, so you can always add it later if you want to save time initially.
 

Latest revision as of 11:53, 19 February 2024


What Additional Software does EPrints Require?

In brief, EPrints minimally requires Apache (with mod_perl), MySQL and Perl with some extra modules. Various utilities like wget, tar and unzip would also be useful.

EPrints bundles some Perl modules which it uses, to save you installing them.


Where to get the Required Software

Apache, MySQL, Perl and mod_perl are all provided as operating system level packages that can be installed on EPrints' Recommended Platforms. If you wish to install on a platform that is not recommended, then you will need to determine the best way to install these applications. It may be possible to infer comparable packages for your platform by checking the dependencies installed on Red Hat based and Debian based Linux.


Other Tools

File uploads

wget, tar, gunzip and unzip are required to allow users to upload files as .tar.gz or .zip or to capture them from a URL.

These all come installed with most modern Linux distributions. If you cannot get them working, you can remove the relevant option by editing "archive_formats" in SystemSettings.pm.

If there are problems you may need to tweak how these are invoked in SystemSettings.pm.
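
Before editing SystemSettings.pm it can help to confirm which of these tools are actually missing. The following is a minimal sketch, not part of EPrints: the helper name missing_tools is hypothetical, and it only checks whether each tool can be found on the PATH of the user running it (which may differ from the PATH seen by the web server).

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH, one per line.
# (missing_tools is an illustrative helper, not an EPrints command.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four tools named in this section.
missing=$(missing_tools wget tar gunzip unzip)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All upload tools found."
fi
```

Any tool reported as missing is a candidate either for installation or for removal from "archive_formats".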


Full Text Indexing

The EPrints indexer requires various tools to extract plain (UTF-8) text from different types of document for indexing.

These tools may or may not already be installed on your system. EPrints uses them to build a "words" file for each document, which contains the text of the document in UTF-8. If EPrints cannot run the relevant tool, the "words" file will be empty, and EPrints will not retry creating it unless you manually remove it.
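
If a tool was missing when documents were first indexed, the empty "words" files have to be located before they can be removed. The sketch below lists zero-length files under a directory. It makes two assumptions that are not taken from this page: the default name pattern *words* is a guess at how the files are named, and the directory to search is whatever your installation uses for document storage. The helper name empty_words_files is hypothetical.

```shell
#!/bin/sh
# List zero-length files matching a name pattern under a directory.
# The default pattern '*words*' is an assumption; check how the files
# are actually named in your installation and pass that as argument 2.
empty_words_files() {
    dir=$1
    pattern="${2:-*words*}"
    # -size 0 matches only files that occupy zero blocks, i.e. empty files.
    find "$dir" -type f -name "$pattern" -size 0
}

# Example (path is a placeholder): review the list before deleting anything.
# empty_words_files /path/to/document/storage
```

The function only prints paths, so the list can be inspected before anything is deleted.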

PDF

Full text indexing of PDF documents requires the pdftotext application, provided by the poppler-utils Deb or RPM package.

Microsoft Word

Full text indexing of Microsoft Word documents is provided by the antiword Deb or RPM package. The RPM package is available through the forensics RPM repository.

HTML

Full text indexing of HTML documents requires the lynx text-based browser provided by the lynx Deb or RPM package.
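
To test an extraction tool by hand, it is useful to know a command line that writes a document's text to standard output. The sketch below prints one such command per file type without running it. The flags shown are standard options of pdftotext, antiword and lynx, but this is an illustration only: it is not necessarily how EPrints itself invokes these tools, and the helper name extract_cmd and the example file names are hypothetical.

```shell
#!/bin/sh
# Print a command that would write a document's text to standard output,
# chosen by file extension. Nothing is executed here.
extract_cmd() {
    case "$1" in
        *.pdf)        printf 'pdftotext -enc UTF-8 %s -\n' "$1" ;;
        *.doc)        printf 'antiword -m UTF-8.txt %s\n' "$1" ;;
        *.html|*.htm) printf 'lynx -dump -nolist %s\n' "$1" ;;
        *.txt)        printf 'cat %s\n' "$1" ;;
        *)            echo "no extractor known for $1" >&2; return 1 ;;
    esac
}

extract_cmd thesis.pdf
extract_cmd letter.doc
extract_cmd page.html
```

Running the printed command on a sample document and getting readable text back is a quick sign that the tool is installed and working.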


LaTeX Tools

There is an optional feature which allows you to instruct EPrints to look in certain fields (e.g. title and abstract) for strings that look like LaTeX equations and render them as images. These tools are only required if you want to use this feature.

These are provided by the tetex-latex and ImageMagick RPMs or the texlive-base, texlive-bin and imagemagick Deb packages.

This is a "cosmetic" feature: it only affects the rendering of information, so you can always add it later if you want to save time initially.
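
To check that the LaTeX tools work together, an equation can be rendered by hand. The sketch below assumes the usual chain of latex, dvips and ImageMagick's convert; it is not the code EPrints uses, and the helper names equation_tex and render_equation, the scratch file names and the 150 dpi density are all illustrative choices. Converting PostScript with convert usually also requires Ghostscript to be installed.

```shell
#!/bin/sh
# Wrap an equation in a minimal LaTeX document, printed to stdout.
equation_tex() {
    printf '%s\n' '\documentclass{article}' '\pagestyle{empty}' \
        '\begin{document}' "\$$1\$" '\end{document}'
}

# Render the equation to a PNG file; needs latex, dvips and convert.
# Usage: render_equation 'E = mc^2' /absolute/path/to/equation.png
render_equation() {
    out=$2
    work=$(mktemp -d) || return 1
    equation_tex "$1" > "$work/eq.tex"
    ( cd "$work" &&
      latex -interaction=nonstopmode eq.tex >/dev/null &&
      dvips -E -o eq.ps eq.dvi 2>/dev/null &&
      convert -density 150 eq.ps eq.png ) && cp "$work/eq.png" "$out"
    status=$?
    rm -rf "$work"
    return $status
}
```

If render_equation fails, running the three commands one at a time in a scratch directory shows which tool is missing or misconfigured.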


Other Platforms

Often the best way to find a package for another platform is to search for its Red Hat or Ubuntu Linux package name together with the name of your platform (e.g. "antiword Arch Linux"). If your platform does not have comparable packages, then the next best option is to download the software from its official site. Below are links to the download pages for the essential components of EPrints: