Upgrading from 3.1

From EPrints Documentation
Revision as of 13:16, 26 April 2012

One of the core parts of upgrading from 3.1 is converting all database tables and data to the utf8 charset (previous versions of EPrints used latin1). There are a couple of pitfalls in this process. The following tips are from the EPrints Services team, after carrying out many upgrades to 3.2. Please note that you follow this advice at your own risk.

Avoiding subject__index_grep key length error

When converting subject__index_grep and subject__rindex to utf8, bin/epadmin upgrade will generate an error:

DBD::mysql::db do failed: Specified key was too long; max key length is 1000 bytes at /opt/eprints3/perl_lib/EPrints/Database.pm line 3213, <STDIN> line 1.

To avoid this error drop the tables before running bin/epadmin upgrade:

DROP TABLE subject__index_grep;
DROP TABLE subject__rindex;

Then reindex when bin/epadmin upgrade has completed:

bin/epadmin reindex ARCHIVEID subject
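
The three steps above can be collected into one ordered plan. The following is a minimal sketch rather than an official script: it only prints the commands so that they can be reviewed before anything is run. The database name (DBNAME), the use of the mysql command-line client without credentials, the function name, and passing the archive ID to bin/epadmin upgrade are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: prints the commands for the subject-table workaround.
# ARCHIVEID and DBNAME are placeholders; the mysql client call is an assumption
# (add your own credentials/host options as needed).

subject_workaround_plan() {
    archive_id=$1
    db_name=$2
    # 1. Drop the two tables that trigger the key length error.
    echo "mysql $db_name -e 'DROP TABLE subject__index_grep; DROP TABLE subject__rindex;'"
    # 2. Run the upgrade (the repository is offline during this step).
    echo "bin/epadmin upgrade $archive_id"
    # 3. Rebuild the dropped subject index tables.
    echo "bin/epadmin reindex $archive_id subject"
}

subject_workaround_plan "${1:-ARCHIVEID}" "${2:-DBNAME}"
```

Printing the plan instead of executing it keeps the sketch safe to try on any machine; once the output looks right for your installation, the lines can be run by hand from the EPrints directory.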

Minimising downtime

Whilst bin/epadmin upgrade is running, your repository will be offline.

If you have a large repository, the conversion to utf8 can take a long time (12+ hours). However, some of the tables are volatile (i.e. their data can be regenerated), so you can reduce downtime by clearing these tables beforehand and then regenerating the content afterwards (whilst your repository is back online).

You only really need to consider this if your eprint__rindex table contains millions of rows.
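
To make the "millions of rows" test concrete, the row count can be read with SELECT COUNT(*) FROM eprint__rindex; and compared against a threshold. The sketch below is illustrative only: the 1000000-row default threshold is one reading of "millions" and an assumption, not an EPrints rule, and the function name and the ROWS variable are hypothetical.

```shell
#!/bin/sh
# Sketch: decide whether clearing the volatile tables is worthwhile.
# The 1000000-row default threshold is an illustrative assumption.

worth_clearing() {
    row_count=$1
    threshold=${2:-1000000}
    # Succeeds (exit status 0) when the table is at least as large as the threshold.
    [ "$row_count" -ge "$threshold" ]
}

# Obtain the count yourself, e.g. with: SELECT COUNT(*) FROM eprint__rindex;
rows=${ROWS:-0}
if worth_clearing "$rows"; then
    echo "eprint__rindex has $rows rows: consider clearing the volatile tables first"
else
    echo "eprint__rindex has $rows rows: a plain upgrade is probably fine"
fi
```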

Clear these tables before running bin/epadmin upgrade:

DELETE FROM eprint__rindex;
DELETE FROM eprint__ordervalues_en;

After running bin/epadmin upgrade and getting your repository back online:

bin/epadmin reindex ARCHIVEID eprint
bin/epadmin reorder ARCHIVEID eprint

Note that search will not return complete results until the indexes are fully regenerated, but we have found that getting the repository back online as soon as possible is worth the trade-off.
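
The sequence in this section can be sketched as an ordered plan. As before, this is a hedged illustration and not an official script: it prints the commands without running them, and DBNAME, the mysql command-line client call, the function name, and passing the archive ID to bin/epadmin upgrade are assumptions.

```shell
#!/bin/sh
# Sketch only: prints the low-downtime sequence in order.
# ARCHIVEID and DBNAME are placeholders; the mysql client call is an assumption.

low_downtime_plan() {
    archive_id=$1
    db_name=$2
    # Before the upgrade: empty the volatile tables so there is less data to convert.
    echo "mysql $db_name -e 'DELETE FROM eprint__rindex; DELETE FROM eprint__ordervalues_en;'"
    # Offline window: convert the remaining tables to utf8.
    echo "bin/epadmin upgrade $archive_id"
    # Back online: regenerate the cleared index and ordering data.
    echo "bin/epadmin reindex $archive_id eprint"
    echo "bin/epadmin reorder $archive_id eprint"
}

low_downtime_plan "${1:-ARCHIVEID}" "${2:-DBNAME}"
```

Only the second printed command needs the repository to be offline; the last two are the ones that run while users are already back on the site.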