Files/CoverPDF
Revision as of 13:59, 23 December 2010
An EPrints extension to automatically generate cover pages for PDF documents.
Contents
- 1 Prerequisites
- 2 Installation (EPrints 3.1+)
- 3 Getting Started
- 4 Troubleshooting
- 4.1 GIF image not appearing on cover page
- 4.2 "Check that coverpage content is valid LaTeX" error message in log
- 4.3 "fmtutil: format directory does not exist" error message in log
- 4.4 Encrypted PDFs / "Check the PDF is not password-protected" error message in log
- 4.5 Ampersands and other characters do not render correctly on cover sheet
- 4.6 Older versions of pdflatex do not support -output-directory
Prerequisites
pdflatex
The LaTeX front end for the TeX text formatting system http://www.tug.org/texlive/
yum install texlive-latex (Fedora 10)
Note: older distributions may still use the teTeX packages http://www.tug.org/tetex/
pdftk
The PDF Tool Kit http://www.accesspdf.com/pdftk/ (as of 2010-12-23, pdftk can be found at http://www.pdflabs.com/docs/install-pdftk/)
yum install pdftk (Fedora 10)
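Before installing the extension, it can be worth confirming that both tools are on the PATH of the user EPrints runs as. A minimal sketch, assuming a POSIX shell; the helper name check_tool is made up for illustration and is not part of CoverPDF:

```sh
#!/bin/sh
# check_tool NAME
# Hypothetical helper: report whether NAME can be found on the current PATH.
check_tool() {
    if path=$(command -v "$1" 2>/dev/null); then
        echo "found: $1 ($path)"
    else
        echo "missing: $1"
        return 1
    fi
}

# Example (run as the eprints user):
#   check_tool pdflatex
#   check_tool pdftk
```

The paths reported here are the ones to compare against the pdflatex and pdftk settings in coverpage.pl.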
Installation (EPrints 3.1+)
Download the latest tarball to your local repository directory (e.g. /opt/eprints3/archives/ARCHIVEID/)
Extract files:
tar xzvf coverpdf_install_xx.tgz
The following files should be extracted:
cfg/cfg.d/coverpage.pl
Configures which cover page is applied to which document. For example, you might use a single cover page for all PDF documents, apply the cover page only to certain types of PDF document (e.g. articles), or use different cover pages for different types of PDF document.
The default is to apply a single coverpage (defined in coverpage.xml - see below) to all PDF documents.
Note: check that the pdflatex and pdftk paths are correct for your system.
cfg/lang/en/phrases/coverpage.xml
Defines the cover page template(s), which you can adjust to change the content and/or appearance of your cover page(s). Each cover page is defined as a LaTeX template which can include information about the repository (e.g. repository name, admin email address) and metadata about the document (e.g. citation, URL).
The default cover page template displays the repository logo and lists the document citation and URL. A brief "Usage Guidelines" section is also included.
Note: gif format logos are not supported by pdflatex - see Troubleshooting section below.
cfg/plugins/EPrints/Plugin/Convert/CoverPDF.pm
Conversion plugin which actually does the work of creating the cover page and prepending it to a PDF document.
Note: the original PDF document is never overwritten - the conversion plugin makes a separate copy with a cover page.
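The prepend step can be illustrated with pdftk's cat operation, which joins its input files in the order given. This is only a sketch of the idea and not necessarily how CoverPDF.pm invokes pdftk; the helper name prepend_cover and the file names are hypothetical:

```sh
#!/bin/sh
# prepend_cover COVER.pdf ORIGINAL.pdf OUTPUT.pdf
# Hypothetical helper: write COVER followed by ORIGINAL to a new file.
prepend_cover() {
    if [ "$#" -ne 3 ]; then
        echo "usage: prepend_cover COVER ORIGINAL OUTPUT" >&2
        return 2
    fi
    # Refuse to write over either input, so the original stays untouched.
    if [ "$3" = "$1" ] || [ "$3" = "$2" ]; then
        echo "refusing to overwrite an input file: $3" >&2
        return 1
    fi
    # pdftk concatenates its inputs in the order they are listed.
    pdftk "$1" "$2" cat output "$3"
}

# Example:
#   prepend_cover cover.pdf paper.pdf paper_with_cover.pdf
```

Writing to a separate output file mirrors the note above: the deposited PDF is left as it was and the covered copy lives alongside it.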
Getting Started
To activate cover pages, you need to make a small change to the EPrints/Apache/Rewrite.pm module.
vim /opt/eprints3/perl_lib/EPrints/Apache/Rewrite.pm
Around line 182 find the following code:
# let it fail if this isn't a real eprint
if( !defined $eprint )
{
    $session->terminate;
    return OK;
}
Add the following immediately after:
if( !$thumbnails && $session->get_repository->can_call( "coverpage", "process_request" ) )
{
    my $ret = $session->get_repository->call( [ "coverpage", "process_request" ], $session, $r, $eprint, $pos, $tail );
    return $ret if defined $ret;
}
This gives the coverpage extension a chance to look at the request and decide whether a cover page needs to be generated.
Save the file and restart Apache.
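After an EPrints upgrade Rewrite.pm may be replaced, so it can be handy to check that the hook is still in place. A small sketch; has_coverpage_hook is a hypothetical helper and simply searches for the call added above:

```sh
#!/bin/sh
# has_coverpage_hook FILE
# Hypothetical helper: succeed if FILE contains the call into the
# coverpage extension that was added to Rewrite.pm.
has_coverpage_hook() {
    grep -q '"coverpage", *"process_request"' "$1"
}

# Example:
#   has_coverpage_hook /opt/eprints3/perl_lib/EPrints/Apache/Rewrite.pm \
#       || echo "cover page hook is missing"
```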
You should now find that all PDF documents in your repository have a cover page. If you change the layout or content of the cover page (by editing coverpage.xml), all cover pages should be updated automatically to reflect the change. Likewise, if the metadata of a record changes, its cover page should be updated automatically to reflect the new metadata.
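One way to picture the automatic update is a timestamp comparison between the covered copy and the template. The sketch below assumes that regeneration is driven by file modification times, which the "touch" step in the Troubleshooting section suggests; the extension's actual logic may differ, and cover_is_stale is a hypothetical name. The -nt test is supported by bash and dash but is not strictly POSIX.

```sh
#!/bin/sh
# cover_is_stale COVERED_PDF TEMPLATE
# Hypothetical helper: succeed if the covered copy should be rebuilt.
cover_is_stale() {
    # Stale if no covered copy exists yet, or the template is newer than it.
    [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

# Example:
#   cover_is_stale covered.pdf cfg/lang/en/phrases/coverpage.xml \
#       && echo "cover page needs regenerating"
```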
Troubleshooting
GIF image not appearing on cover page
pdflatex does not support gif images - convert the gif image to a supported format such as png.
The default cover page uses the site_logo setting defined in cfg.d/branding.pl. By default the logo is a gif image:
$c->{site_logo} = "/images/sitelogo.gif";
To create a png version of the logo:
cd /opt/eprints3/archives/ARCHIVEID/cfg/static/images/
convert sitelogo.gif sitelogo.png
Edit branding.pl and change sitelogo.gif to sitelogo.png.
"Touch" coverpage.xml so that cover pages will be regenerated:
touch /opt/eprints3/archives/ARCHIVEID/cfg/lang/en/phrases/coverpage.xml
The logo should now appear on the cover page.
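The edit to branding.pl can also be scripted. A minimal sketch, assuming the logo is still called sitelogo.gif as in the default configuration; use_png_logo is a hypothetical helper that keeps a .bak copy of the file it changes:

```sh
#!/bin/sh
# use_png_logo BRANDING_FILE
# Hypothetical helper: replace "sitelogo.gif" with "sitelogo.png" in
# BRANDING_FILE, leaving the previous version in BRANDING_FILE.bak.
use_png_logo() {
    cp "$1" "$1.bak" &&
    sed 's/sitelogo\.gif/sitelogo.png/g' "$1.bak" > "$1"
}

# Example, after creating sitelogo.png with convert:
#   use_png_logo /opt/eprints3/archives/ARCHIVEID/cfg/cfg.d/branding.pl
#   touch /opt/eprints3/archives/ARCHIVEID/cfg/lang/en/phrases/coverpage.xml
```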
"Check that coverpage content is valid LaTeX" error message in log
This can sometimes appear after touching/editing coverpage.xml even if the LaTeX code is correct. Restart Apache and try again.
Check for any mktexfmt error messages in the log - see below.
If the problem persists, try running pdflatex against the cover page template manually:
cd /tmp
mkdir coverpage-test
cd coverpage-test
vi cover.tex    (copy LaTeX code from coverpage.xml into cover.tex)
pdflatex cover.tex
Examine the pdflatex output for errors.
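pdflatex also writes its messages to cover.log in the working directory, and TeX error lines start with an exclamation mark, so the errors can be pulled out of a long log with grep. A small sketch; latex_errors is a hypothetical helper name:

```sh
#!/bin/sh
# latex_errors LOGFILE
# Hypothetical helper: print TeX error lines (those starting with "!")
# prefixed with their line number in the log. Returns non-zero if the
# log contains no such lines.
latex_errors() {
    grep -n '^!' "$1"
}

# Example, after running pdflatex cover.tex:
#   latex_errors cover.log
```

The lines that follow each reported error in the log usually show the offending input line of the template.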
This problem can also appear because of an intermittent bug in pdflatex. The symptom is that cover pages generally work, but occasionally (and transiently) either fail to appear or, more rarely, cause an internal server error. In this case, a workaround is to use latex and dvipdf instead of pdflatex. A (clumsy) way to do this is to modify CoverPDF.pm, replacing the line
system( $pdflatex, "-interaction=nonstopmode", "-output-directory=$latex_dir", $latex_file );
with
system( "latex", "-interaction=nonstopmode", "-output-directory=$latex_dir", $latex_file );
system( "cd $latex_dir && dvipdf cover.dvi" );
"fmtutil: format directory does not exist" error message in log
Full error message:
kpathsea: Running mktexfmt pdflatex.fmt
fmtutil: format directory `/.texlive2007/texmf-var/web2c' does not exist.
Some texlive packages do not include all the necessary TeX format files to run. To generate the missing files, run the following as the "eprints" user (or the user you configured EPrints to run as):
fmtutil --missing
Under Fedora 10, the home directory seen by the EPrints web server differs from that of the eprints user, so as root you may need to do
mkdir /.texlive2007
chown eprints:eprints /.texlive2007
chmod g+w /.texlive2007
and then as the eprints user
HOME=/ fmtutil --missing
Encrypted PDFs / "Check the PDF is not password-protected" error message in log
If a PDF document is encrypted (password protected), a cover page cannot be added.
To check for encrypted PDFs during the deposit process (and display a warning message) add the following to cfg.d/eprint_warnings.pl:
foreach my $doc ( @docs )
{
    # only PDF documents can be encrypted in this sense
    if( $doc->get_type eq 'application/pdf' )
    {
        use PDF::API2;
        my $pdf = PDF::API2->open( $doc->local_path.'/'.$doc->get_main );
        if( defined $pdf && $pdf->isEncrypted )
        {
            # add a warning that points at the documents field
            my $fieldname = $session->make_element( "span", class=>"ep_problem_field:documents" );
            push @problems, $session->html_phrase( "validate:encrypted_pdf", fieldname => $fieldname );
        }
    }
}
Note: You will need to install the PDF::API2 Perl module.
Hint: If you want to prevent depositors from submitting encrypted PDFs, adapt the code to cfg.d/eprint_validate.pl instead.
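To find PDFs that are already in the repository and are likely to be encrypted, a rough command-line check is to look for an /Encrypt entry, which encrypted PDFs carry in their trailer dictionary. This is a heuristic sketch only, not a replacement for the PDF::API2 check above: it can report false positives if the same string happens to appear elsewhere in the file. The helper name pdf_looks_encrypted is hypothetical.

```sh
#!/bin/sh
# pdf_looks_encrypted FILE
# Hypothetical helper: succeed if FILE contains an /Encrypt entry.
# LC_ALL=C makes grep treat the file as plain bytes.
pdf_looks_encrypted() {
    LC_ALL=C grep -q '/Encrypt' "$1"
}

# Example:
#   for f in *.pdf; do
#       pdf_looks_encrypted "$f" && echo "possibly encrypted: $f"
#   done
```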
Ampersands and other characters do not render correctly on cover sheet
Some characters, such as ampersands and copyright symbols, need to be escaped to render correctly in LaTeX. Try adding the following local subroutine to $coverpage->{get_content}:
$coverpage->{get_content} = sub {
    [...]
    my $latex_encode = sub {
        my ($string) = @_;
        utf8::decode($string);
        return TeX::Encode::encode( "latex", $string );
    };
    [...]
};
You can then use &$latex_encode() to wrap the strings in %bits:
my %bits = (
    citation => &$latex_encode( EPrints::Utils::tree_to_utf8( $eprint->render_citation() ) ),
    [...]
);
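To see what escaping does to a string, the effect on the most common special characters can be reproduced on the command line. This is an illustration only and much cruder than TeX::Encode: the hypothetical latex_escape filter below handles just the characters & % $ # _ { } and leaves backslashes, tildes, carets and non-ASCII symbols alone.

```sh
#!/bin/sh
# latex_escape
# Hypothetical filter: read text on stdin and put a backslash in front
# of the characters & % $ # _ { and }.
latex_escape() {
    sed 's/[&%$#_{}]/\\&/g'
}

# Example:
#   printf '%s\n' 'Research & Development' | latex_escape
```

A title such as "Research & Development" would otherwise reach pdflatex with a bare ampersand, which LaTeX reads as a table column separator.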
Older versions of pdflatex do not support -output-directory
The log contains error messages like:
/usr/bin/pdflatex: unrecognized option `-output-directory=/tmp/1cS5K8nVch'
In CoverPDF.pm, try replacing:
system( $pdflatex, "-interaction=nonstopmode", "-output-directory=$latex_dir", $latex_file );
with:
use Cwd;
my $prev_dir = getcwd;
chdir( $latex_dir );
system( $pdflatex, "-interaction=nonstopmode", $latex_file );
chdir( $prev_dir );
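To find out in advance whether the installed pdflatex needs this change, its help text can be searched for the option. A minimal sketch, assuming that pdflatex lists -output-directory in its --help output when the option is supported; pdflatex_has_output_directory is a hypothetical helper name:

```sh
#!/bin/sh
# pdflatex_has_output_directory [PDFLATEX]
# Hypothetical helper: succeed if the help text of PDFLATEX (default:
# pdflatex on the PATH) mentions the -output-directory option.
pdflatex_has_output_directory() {
    "${1:-pdflatex}" --help 2>&1 | grep -q -e '-output-directory'
}

# Example:
#   pdflatex_has_output_directory /usr/bin/pdflatex \
#       || echo "apply the chdir workaround in CoverPDF.pm"
```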