Difference between revisions of "Contribute: Plugins/ExportPluginsZip"

From EPrints Documentation

Revision as of 14:25, 4 September 2007

Export Plugin Tutorial 5: Zip

In this tutorial we'll look at packaging the results of a search into a Zip file. We'll create a directory for each eprint, and a sub-directory for each document belonging to that eprint. We'll also add an HTML index file to the archive to make it easier to navigate.

To prepare for this tutorial you should install the Archive::Any::Create module. Running the following command as root, or with sudo, should work.

cpan Archive::Any::Create

Zip.pm

package EPrints::Plugin::Export::MyPlugins::Zip;

@ISA = ("EPrints::Plugin::Export");

use strict;
use Archive::Any::Create;

sub new
{
  my ($class, %opts) = @_;

  my $self = $class->SUPER::new(%opts);

  $self->{name} = "Zip";
  $self->{accept} = [ 'list/eprint' ];
  $self->{visible} = "all";
  $self->{suffix} = ".zip";
  $self->{mimetype} = "application/zip";

  return $self;
}

sub output_list
{
  my ($plugin, %opts) = @_;

  my $archive = '';
  open (my $FH, '>', \$archive) or
    die("Could not create filehandle: $!");
  my $zip = Archive::Any::Create->new;

  my $index = <<END;
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
      <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
      <title>EPrints Search Results</title>
    </head>
  <body>
END

  my $session = $plugin->{session};

  foreach my $dataobj ($opts{"list"}->get_records)
  {
    my $div = $session->make_element("div");
    my $heading = $session->make_element("h2");
    $heading->appendChild($session->make_text($dataobj->get_value("title")));
    $div->appendChild($heading);

    my $uldoc = $session->make_element("ul");
    $div->appendChild($uldoc);

    my $dirpath = "eprints-search/".$dataobj->get_id()."/";

    my $i = 1;
    foreach my $doc ($dataobj->get_all_documents)
    {
      my $subdirpath = $dirpath."doc$i/";
      my %files = $doc->files;

      my $lidoc = $session->make_element("li");
      $uldoc->appendChild($lidoc);

      my $adoc = $session->make_element("a", href=>$dataobj->get_id."/doc$i/".$doc->get_main);
      $lidoc->appendChild($adoc);

      if ($doc->exists_and_set("formatdesc"))
      {
        $adoc->appendChild($session->make_text($doc->get_value("formatdesc")));
      }
      else
      {
        $adoc->appendChild($session->make_text($doc->get_main));
      }

      foreach my $filename (sort keys %files)
      {
        my $filepath = $subdirpath.$filename;
        my $file = $doc->local_path."/".$filename;

        if (-d $file)
        {
          next;
        }

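        my $data = '';
        open (my $datafh, '>', \$data) or
          die ("Could not create filehandle: $!");

        # Three-argument open with a lexical handle; binmode keeps binary files intact.
        open (my $infh, '<', $file) or
          die ("Could not open file $file: $!");
        binmode ($infh);
        while (<$infh>)
        {
          print {$datafh} $_;
        }
        close $infh;
        close $datafh;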

        $zip->add_file($filepath, $data);
      }
      $i++;
    }
    $index .= EPrints::XML::to_string($div);
  }

  $index .= "</body></html>";
  $zip->add_file("eprints-search/index.htm",$index);

  if (defined $opts{"fh"})
  {
    $zip->write_filehandle($opts{"fh"},"zip");
    return undef;
  }
  $zip->write_filehandle($FH,"zip");
  return $archive;
}

1;


In More Detail

Modules

We need to import a module for creating Zip files.

use Archive::Any::Create;

Constructor

For the sake of simplicity this plugin will only deal with lists of eprints. This avoids some code duplication; it would be fairly easy to modify the plugin to handle both individual eprints and lists of eprints sensibly.

  $self->{accept} = [ 'list/eprint' ];

The file extension and MIME type are set to values appropriate for Zip files.

  $self->{suffix} = ".zip";
  $self->{mimetype} = "application/zip";

List Handling

Setting Up

Here we set up an in-memory file for the Zip, and create an Archive object.

  my $archive = '';
  open (my $FH, '>', \$archive) or
    die("Could not create filehandle: $!");
  my $zip = Archive::Any::Create->new;
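
The in-memory file used above can be tried on its own. The following is a minimal sketch using only core Perl; write_to_memory is a hypothetical helper written for illustration and is not part of EPrints or Archive::Any::Create.

```perl
use strict;
use warnings;

# Open a file handle onto a scalar, write to it, and return what was captured.
# This is the same technique the plugin uses to hold the archive in memory.
sub write_to_memory
{
  my (@chunks) = @_;

  my $buffer = '';
  open (my $fh, '>', \$buffer) or
    die ("Could not create filehandle: $!");
  foreach my $chunk (@chunks)
  {
    print {$fh} $chunk;
  }
  close ($fh) or die ("Could not close filehandle: $!");

  return $buffer;
}

my $captured = write_to_memory ("first part, ", "second part");
print length ($captured), " bytes held in memory\n";
```

Anything that expects a file handle can write to $fh, and the bytes end up in the scalar rather than on disk.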

Navigation

Here we begin to set up the HTML file that we'll add to our archive for navigation. First we set up a header.

  my $index = <<END;
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
      <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
      <title>EPrints Search Results</title>
    </head>
  <body>
END

Now we get the session object for later. We'll be using it to manipulate DOM objects.

  my $session = $plugin->{session};

Handling DataObjs

We loop over the DataObjs as we have done before.

This time we set up some DOM objects to be added to our index. Each eprint will have its title printed out, followed by an unordered list of documents.

    my $div = $session->make_element("div");
    my $heading = $session->make_element("h2");
    $heading->appendChild($session->make_text($dataobj->get_value("title")));
    $div->appendChild($heading);

    my $uldoc = $session->make_element("ul");
    $div->appendChild($uldoc);

We create a directory for each eprint. Note that it is not necessary to create a directory explicitly; we simply have to set the appropriate file path. However, this means that a directory only appears in the archive once a file has been added under it, so an eprint with no files will have no directory at all rather than an empty one.

    my $dirpath = "eprints-search/".$dataobj->get_id()."/";

We then loop over all the documents belonging to each DataObj. The get_all_documents method returns an array of Document objects.

    my $i = 1;
    foreach my $doc ($dataobj->get_all_documents)
    {
      my $subdirpath = $dirpath."doc$i/";
      my $lidoc = $session->make_element("li");
      $uldoc->appendChild($lidoc);

      my $adoc = $session->make_element("a", href=>$dataobj->get_id."/doc$i/".$doc->get_main);
      $lidoc->appendChild($adoc);

      if ($doc->exists_and_set("formatdesc"))
      {
        $adoc->appendChild($session->make_text($doc->get_value("formatdesc")));
      }
      else
      {
        $adoc->appendChild($session->make_text($doc->get_main));
      }
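
To see the resulting layout without a running repository, the path building can be sketched in plain Perl. The %eprints structure and the member_paths helper below are assumptions made for illustration; in the plugin the ids and file names come from the DataObj and Document objects.

```perl
use strict;
use warnings;

# Hypothetical stand-in for EPrints data: eprint id => list of documents,
# where each document is a list of file names.
my %eprints = (
  12 => [ [ "paper.pdf" ], [ "slides.html", "style.css" ] ],
  15 => [],   # an eprint with no documents
);

# Build the archive path for every file, numbering documents doc1, doc2, ...
sub member_paths
{
  my ($eprints) = @_;

  my @paths;
  foreach my $id (sort { $a <=> $b } keys %{$eprints})
  {
    my $dirpath = "eprints-search/$id/";
    my $i = 1;
    foreach my $doc (@{ $eprints->{$id} })
    {
      my $subdirpath = $dirpath."doc$i/";
      foreach my $filename (sort @{$doc})
      {
        push @paths, $subdirpath.$filename;
      }
      $i++;
    }
  }
  return @paths;
}

print "$_\n" foreach member_paths (\%eprints);
```

Eprint 15 contributes no paths, which matches the note above: a directory only exists in the archive if a file is stored under it.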

The files method of the Document object returns a hash whose keys are file names and values are file sizes.

      my %files = $doc->files;
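
As a rough illustration of working with a hash of that shape, the sketch below uses made-up file names and sizes; describe_files is a hypothetical helper, not an EPrints method. Sorting the keys gives a predictable order, since Perl does not guarantee the order in which hash keys are returned.

```perl
use strict;
use warnings;

# Made-up data in the shape described above: file name => size in bytes.
my %files = (
  "paper.pdf"    => 48213,
  "figure1.png"  => 10240,
  "appendix.txt" => 512,
);

# Return one description per file, in sorted file name order.
sub describe_files
{
  my (%files) = @_;

  my @lines;
  foreach my $filename (sort keys %files)
  {
    push @lines, sprintf ("%s (%d bytes)", $filename, $files{$filename});
  }
  return @lines;
}

print "$_\n" foreach describe_files (%files);
```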

We loop over each file belonging to the document; in most cases there will only be one file.

      foreach my $filename (sort keys %files)
      {
        my $filepath = $subdirpath.$filename;
        my $file = $doc->local_path."/".$filename;

We need to read the contents of the file and add it to a file in the zip. First we'll create another in-memory file to hold the contents.

        my $data = '';
        open (my $datafh, '>', \$data) or
          die ("Could not create filehandle: $!");

We open our file in binary mode and print it straight out to our in-memory file, then close both handles.

        open (my $infh, '<', $file) or
          die ("Could not open file $file: $!");
        binmode ($infh);
        while (<$infh>)
        {
          print {$datafh} $_;
        }
        close $infh;
        close $datafh;
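
An alternative way to get a file's contents into a scalar is to read it in one go. This is only a sketch of that idea, not what the plugin does: read_file_bytes is a hypothetical helper, and the temporary file exists purely so the example can run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into a scalar, treating the contents as raw bytes.
sub read_file_bytes
{
  my ($path) = @_;

  open (my $in, '<', $path) or
    die ("Could not open file $path: $!");
  binmode ($in);
  local $/;              # no record separator: one read returns everything
  my $data = <$in>;
  close ($in);

  return defined $data ? $data : '';
}

# Create a small temporary file containing text and non-text bytes.
my ($tmpfh, $tmpname) = tempfile (UNLINK => 1);
binmode ($tmpfh);
print {$tmpfh} "line one\nline two\x00\xFF";
close ($tmpfh) or die ("Could not close $tmpname: $!");

my $bytes = read_file_bytes ($tmpname);
print length ($bytes), " bytes read\n";
```

Either approach loads the whole file into memory before it is added to the archive, so very large documents will use a corresponding amount of memory.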

Finally we add the file data to the zip archive under the path we built.

        $zip->add_file($filepath, $data);

If a file handle has been provided we write the archive to it and return undef. Otherwise we write to the in-memory file handle created earlier and return the archive as a string.

  if (defined $opts{"fh"})
  {
    $zip->write_filehandle($opts{"fh"},"zip");
    return undef;
  }
  $zip->write_filehandle($FH,"zip");
  return $archive;
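
The same "write to the handle if given, otherwise return a string" pattern can be shown in isolation. The emit function below is a hypothetical stand-in for output_list and handles plain strings rather than a zip archive.

```perl
use strict;
use warnings;

# Write $content to $opts{"fh"} when one is supplied and return undef;
# otherwise hand the content back to the caller.
sub emit
{
  my ($content, %opts) = @_;

  if (defined $opts{"fh"})
  {
    print {$opts{"fh"}} $content;
    return undef;
  }
  return $content;
}

# Case 1: no file handle, so the content is returned.
my $returned = emit ("archive bytes");

# Case 2: a file handle (here an in-memory one) receives the content.
my $captured = '';
open (my $fh, '>', \$captured) or
  die ("Could not create filehandle: $!");
my $result = emit ("archive bytes", fh => $fh);
close ($fh);

print "returned: $returned\n";
print "captured: $captured\n";
```

A caller therefore has to check which mode it asked for: with a file handle the return value carries no data.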

Testing Your Plugin

Restart your webserver and test the plugin as in the previous tutorial.

Sample Output

Expzip.png