Difference between revisions of "Migration"

From EPrints Documentation
Redirect page
(Finishing up)
(Redirected page to Moving a repository)
 
(16 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 +
#REDIRECT [[Moving a repository]]
 +
 +
[[Category:Management]]
 
This page covers how to migrate from EPrints 2 to EPrints 3.
 
  
Line 34: Line 37:
  
 
This script exports the data from your EPrints 2 repository in a format which can be imported by EPrints 3.
 +
 +
There have been some problems with exporting non-Latin characters (e.g. letters with accents). If you run into these, they can probably be solved by editing the export3data script and adding the following line just under the first line:
 +
 +
  use encoding 'utf8';
  
 
To export the data do the following:
 
Line 42: Line 49:
  
 
 
eprints.xml references the full paths of the files in EPrints 2. If your EPrints 3 is on a different machine you'll need to either make sure they are the same on the new machine or do a big search-and-replace on eprints.xml!
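If the paths do differ between the two machines, the search-and-replace could be done with sed. This is only a minimal sketch: the function name is made up for illustration, and it assumes the path prefixes contain nothing but letters, digits, "/", "_" and "-" (so that they are safe to drop into a sed pattern).

```shell
#!/bin/sh
# Rewrite one path prefix to another throughout an exported XML file.
# Usage: rewrite_paths OLD_PREFIX NEW_PREFIX INPUT OUTPUT
# rewrite_paths is a hypothetical helper, not part of EPrints.
rewrite_paths() {
    old_prefix=$1
    new_prefix=$2
    input=$3
    output=$4
    # '|' is the sed delimiter so that '/' in the paths needs no escaping.
    # The result goes to a new file; the original export is left untouched.
    sed "s|${old_prefix}|${new_prefix}|g" "$input" > "$output"
}
```

For example, with placeholder directories, `rewrite_paths /old/documents /new/documents eprints.xml eprints_moved.xml` would write a copy of the export with every occurrence of the first prefix replaced by the second.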
 +
 +
If the script has any problems, run with the 'skiplog' argument:
 +
 +
  export3data.pl --skiplog errors.txt ARCHIVEID eprints > eprints.xml
 +
 +
Any items with problems will be skipped, but their IDs will be recorded in the 'errors.txt' file. Export these by hand if they are important.
  
 
=== Importing ===
 
Line 55: Line 68:
 
=== Import the data ===
 
  
To import the data do:
+
To import the subjects and users do:
  bin/import --verbose --force ARCHIVEID subject XML subjects.xml
+
  /opt/eprints3/bin/import_subjects --verbose --force --xml ARCHIVEID subjects.xml
  bin/import --verbose --force ARCHIVEID user XML users.xml
+
  /opt/eprints3/bin/import --verbose --migration ARCHIVEID user XML users.xml
  bin/import --verbose --force ARCHIVEID eprint XML eprints.xml
+
If something goes wrong with subjects or users, use epadmin erase_data to empty the database and start again.
+
If something goes wrong use epadmin erase_data to empty the database and start again.
+
  
= Finishing up after using mtoolkit =
+
To import the EPrints do:
 +
  /opt/eprints3/bin/import --verbose --migration ARCHIVEID eprint XML eprints.xml
 +
If something goes wrong with importing the eprints, use epadmin erase_eprints to erase just the eprints data, so you don't need to redo subjects and users.
 +
 
 +
The --migration option tells the importer to:
 +
* skip are-you-sure? messages.
 +
* use the eprintid and userid from the XML rather than assigning them.
 +
* use the "datestamp" from the XML rather than assigning it.
 +
* load files from the local file system (normally this would be a security hole).
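The import steps above could be chained in a small shell wrapper, so that the eprints are only imported once subjects and users have gone in cleanly. This is a rough sketch using the same commands and options as above. The function names, the EPRINTS_BIN variable and the DRY_RUN switch are inventions of this sketch, not part of EPrints, and it assumes the import tools exit non-zero on failure.

```shell
#!/bin/sh
# Run one command, or just print it when DRY_RUN=1.
# DRY_RUN is a convenience of this sketch, not an EPrints option.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Import subjects, then users, then eprints, stopping at the first failure.
# Expects subjects.xml, users.xml and eprints.xml in the current directory.
migrate_import() {
    archive=$1
    bin=${EPRINTS_BIN:-/opt/eprints3/bin}
    run_step "$bin/import_subjects" --verbose --force --xml "$archive" subjects.xml &&
    run_step "$bin/import" --verbose --migration "$archive" user XML users.xml &&
    run_step "$bin/import" --verbose --migration "$archive" eprint XML eprints.xml
}
```

Running `DRY_RUN=1 migrate_import ARCHIVEID` first prints the three commands without executing them, which is a cheap way to check the paths. If a real run stops part-way, erase the data with epadmin as described above before retrying.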
 +
 
 +
You may encounter some issues with badly formed XML. This is due to incorrectly encoded data creeping into your database. It should all be UTF-8, but earlier versions of EPrints didn't always check. If your EPrints 2 server is running Perl 5.8 you can install the Perl module Encode, which will clean up your data. On our system, EPrints 2 was running on a machine with an older version of Perl and we didn't want to risk upgrading.
 +
 
 +
== Finishing up after using mtoolkit ==
  
 
You will probably still want to tweak some of the following things by hand, depending on how much you customised EPrints 2:
Line 82: Line 105:
 
Feel free to add tips on the wiki, linked from this section.
 
  
= Issues =
 
  
There's going to be lots, I'm sure. Please leave both comments and tips.
+
== Known bugs in version 1.0 of toolkit / importing into EPrints 3.0.2 ==
  
== Tips ==
+
=== Documents with subdirectories fail to import ===
  
After you've got it working, you probably want to clean up the workflow to make use of the Multi components. Look at the default /opt/eprints3/lib/defaultcfg/workflow/eprints/default.xml config for some clues on how to do this, and how to add autocompleters.
+
FIX: Do them by hand at the end.
  
== Known Issues ==
+
=== Warning messages about "hideemail" ===
  
* Citations not ported
+
hideemail was introduced in a version of EPrints 2 (I forget which). Earlier repositories may not have this field. Some of the EPrints 3 default config files assume it exists (user_fields_default.pl and user_render.pl).
* Template not ported
+
* Static pages not ported
+
* ArchiveRender methods not ported
+
* Annoying hack required to import
+
* Handy workflow features like autocomplete don't get turned on by default.
+
  
== Known bugs in current version of toolkit ==
+
FIX 1: Don't worry about it.
  
=== No option to set access to 'Anyone' in document upload ===
+
FIX 2: Before importing users.xml, add the hideemail field back into user_fields.pl:
 +
          {
 +
            'name' => 'hideemail',
 +
            'input_style' => 'radio',
 +
            'type' => 'boolean',
 +
          },
  
Add 'public' to the namedset /archives/ARCHIVEID/cfg/namedsets/security.
+
=== Error missing field: X ===
  
=== Documents with subdirectories fail to import ===
+
The default EPrints 3 configuration may reference a field that was not imported. If so, you can almost always just remove the offending section of configuration. Examples: searches, citations, views.
 +
 
 +
=== Problems with bad characters in eprints.xml ===
 +
 
 +
This is not tested, but I think this should clean it up...
 +
  iconv -c eprints.xml --output=eprints_cleaned.xml -f utf-8 -t utf-8
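Before cleaning anything, it may be worth checking whether eprints.xml contains invalid UTF-8 at all. This is a minimal sketch, assuming an iconv that exits non-zero when it meets an invalid input sequence (GNU iconv does). The function name is made up for illustration.

```shell
#!/bin/sh
# Succeeds (exit 0) only if the file converts cleanly from UTF-8 to UTF-8.
# Without -c, iconv stops with an error at the first invalid byte sequence.
# is_valid_utf8 is a hypothetical helper, not part of EPrints.
is_valid_utf8() {
    iconv -f utf-8 -t utf-8 "$1" > /dev/null 2>&1
}
```

For example, `is_valid_utf8 eprints.xml || echo "eprints.xml contains invalid UTF-8"` only prints the message when the export needs cleaning. Note that this checks the byte encoding only, not whether the XML itself is well formed.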
 +
 
 +
=== Warning about Pagerange ===
 +
 
 +
Argument "" isn't numeric in addition (+) at
 +
  /opt/eprints3/perl_lib/EPrints/MetaField/Pagerange.pm line 182.
 +
 
 +
This warning is caused by non-numeric data in the pagerange field, e.g. "iii-xi".
 +
 
 +
FIX: Don't worry about it.
 +
 
 +
=== Can't import files which contain "/" ===
  
=== Current version does not properly escape & and angle brackets in document filenames. ===
+
e.g. if your document had index.html and images/dia.jpg.
  
FIX: find the <nowiki><filename></nowiki> line in export3data.pl and change it to:
+
FIX: Make a note of the offenders, and just add those documents by hand.
  
    print $fh "          <filename>".esc(latin1($filename))."</filename>\n";
+
FIX 2: Bug Chris to fix this in the final release of 3.0.2 (it's not in beta-1).

Latest revision as of 14:44, 18 May 2012