Difference between revisions of "API:EPrints/XML"

From EPrints Documentation
 
(15 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
<!-- Pod2Wiki=_preamble_  
 
<!-- Pod2Wiki=_preamble_  
 
This page has been automatically generated from the EPrints 3.2 source. Any wiki changes made between the 'Pod2Wiki=*' and 'Edit below this comment' comments will be lost.
 
This page has been automatically generated from the EPrints 3.2 source. Any wiki changes made between the 'Pod2Wiki=*' and 'Edit below this comment' comments will be lost.
  -->
+
  -->{{API}}{{Pod2Wiki}}{{API:Source|file=perl_lib/EPrints/XML.pm|package_name=EPrints::XML}}[[Category:API|XML]][[Category:API:EPrints/XML|XML]]<div><!-- Edit below this comment -->
__NOTOC__
 
{{API}}{{Pod2Wiki}}{{API:Source|file=EPrints/XML.pm|package_name=EPrints::XML}}[[Category:API|XML]]<div><!-- Edit below this comment -->
 
  
  
<!-- Pod2Wiki=head_name --></div>
+
<!-- Pod2Wiki=_private_ --><!-- Pod2Wiki=head_name -->
 
==NAME==
 
==NAME==
 
'''EPrints::XML''' - XML Abstraction Module
 
'''EPrints::XML''' - XML Abstraction Module
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
 
 
  
 
For preference, use the methods in [[API:EPrints/Handle/XML|EPrints::Handle::XML]]
 
For preference, use the methods in [[API:EPrints/Handle/XML|EPrints::Handle::XML]]
 
+
<!-- Pod2Wiki= -->
<!-- Pod2Wiki=head_synopsis --></div>
+
<!-- Pod2Wiki=head_synopsis -->
 
==SYNOPSIS==
 
==SYNOPSIS==
  $string = EPrints::XML::to_string( $node, "utf-8", 1 ); #use this to convert DOM trees to string
+
<source lang="perl">my $xml = $repository->xml;
 
 
  $dom = EPrints::XML::parse_xml_string( $string );
 
 
 
  $dom = EPrints::XML::parse_xml( $file, $basepath, $no_expand );
 
 
 
  $boolean = is_dom( $node, @nodestrings );
 
 
 
  $newnode = EPrints::XML::clone_node( $node, $deep );
 
 
 
  EPrints::XML::write_xhtml_file( $node, $filename );
 
 
 
  $document = EPrints::XML::make_document();
 
 
 
  $dom = EPrints::XML::parse_url($url, $no_expand);
 
 
 
  
 +
$doc = $xml->parse_string( $string );
 +
$doc = $xml->parse_file( $filename );
 +
$doc = $xml->parse_url( $url );
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
+
$utf8_string = $xml->to_string( $dom_node, %opts );
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
  
 +
$dom_node = $xml->clone( $dom_node ); # deep
 +
$dom_node = $xml->clone_node( $dom_node ); # shallow
  
<!-- Pod2Wiki=head_description --></div>
+
# clone and return child nodes
==DESCRIPTION==
+
$dom_node = $xml->contents_of( $dom_node );
EPrints can use the XML::DOM, XML::LibXML or XML::GDOME modules to generate and process XML. These modules differ in some of their behaviour, so this module abstracts over the differences and keeps all the module-specific code in one place.
+
# Return text child nodes as a string
 +
$utf8_string = $xml->text_contents_of( $dom_node );
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
+
$dom_node = $xml->create_element( $name, %attr );
<h4><span style='display:none'>User Comments</span></h4>
+
$dom_node = $xml->create_text_node( $value );
<!-- Edit below this comment -->
+
$dom_node = $xml->create_comment( $value );
 +
$dom_node = $xml->create_document_fragment;
  
 +
$xml->dispose( $dom_node );</source>
  
<!-- Pod2Wiki=item_parse_xml_string --></div>
 
===$doc = EPrints::XML::parse_xml_string( $string );===
 
 
Return a DOM document describing the XML string $string.
 
 
If we are using GDOME then it will create an XML::GDOME document instead.
 
 
In the event of an error in the XML file, report to STDERR and return undef.
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_parse_xml --></div>
+
<!-- Pod2Wiki= -->
===$doc = EPrints::XML::parse_xml( $file, $basepath, $no_expand )===
+
<!-- Pod2Wiki=head_description -->
 
+
==DESCRIPTION==
Return a DOM document describing the XML file specified by $file. The optional $basepath is the root path used when looking for the DTD. If $no_expand is true then entities will not be expanded.
+
EPrints can use the XML::DOM, XML::LibXML or XML::GDOME modules to generate and process XML. These modules differ in some of their behaviour, so this module abstracts over the differences and keeps all the module-specific code in one place.
 
 
If we are using GDOME then it will create an XML::GDOME document instead.
 
 
 
In the event of an error in the XML file, report to STDERR and return undef.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
 
 
 
 
<!-- Pod2Wiki=item_event_parse --></div>
 
===event_parse( $fh, $handler )===
 
 
 
Parses the XML from filehandle $fh, calling the appropriate events in the handler where necessary.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
 
 
 
 
<!-- Pod2Wiki=item_is_dom --></div>
 
===$boolean = is_dom( $node, @nodestrings )===
 
 
 
return true if node is an object of type XML::DOM/GDOME::$nodestring
 
where $nodestring is any value in @nodestrings.
 
 
 
if $nodestring is not defined then return true if $node is any
 
XML::DOM/GDOME object.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
 
 
 
 
<!-- Pod2Wiki=item_dispose --></div>
 
===EPrints::XML::dispose( $node )===
 
  
Dispose of this node if needed. Only XML::DOM nodes need to be disposed as they have cyclic references. XML::GDOME nodes are C structs.
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_clone_node --></div>
+
<!-- Pod2Wiki= -->
===$newnode = EPrints::XML::clone_node( $node, $deep )===
+
<!-- Pod2Wiki=head_methods -->
 
+
==METHODS==
Clone the given DOM node and return the new node. Always does a deep copy.
 
 
 
This function does different things for XML::DOM &amp; XML::GDOME but the result should be the same.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_clone_and_own --></div>
+
<!-- Pod2Wiki= -->
===$newnode = EPrints::XML::clone_and_own( $doc, $node, $deep )===
+
<!-- Pod2Wiki=head_parsing -->
 +
===Parsing===
 +
$doc = $xml-&gt;parse_string( $string, %opts )
 +
Returns an XML document parsed from $string.
  
This function abstracts the different ways that XML::DOM and XML::GDOME allow objects to be moved between documents.  
+
  $doc = $xml-&gt;parse_file( $filename, %opts )
 +
Returns an XML document parsed from the file called $filename.
  
It returns a clone of $node but belonging to the document $doc no matter what document $node belongs to.
+
<pre>  base_path - base path to load DTD files from
 +
  no_expand - don't expand entities</pre>
  
If $deep is true then the clone will also clone all nodes belonging to $node, recursively.
+
$doc = $xml-&gt;parse_url( $url, %opts )
 +
Returns an XML document parsed from the content located at $url.
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_to_string --></div>
+
<!-- Pod2Wiki= -->
===$string = EPrints::XML::to_string( $node, [$enc], [$noxmlns] )===
+
<!-- Pod2Wiki=head_node_creation -->
 
+
===Node Creation===
Return the given node (and its children) as a UTF8 encoded string.
+
$node = $xml-&gt;create_element( $name [, @attrs ] )
 
+
Returns a new XML element named $name with optional attribute pairs @attrs.
$enc is only used when $node is a document.
 
 
 
If $noxmlns is true then all xmlns attributes and namespace prefixes are removed. Handy for making legal XHTML.
 
 
 
Papers over some cracks, specifically that XML::GDOME does not support toString on a DocumentFragment, and that XML::GDOME does not insert a space before the / in tags with no children, which confuses some browsers, e.g. &lt;br/&gt; vs &lt;br /&gt;.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
  
 +
$node = $xml-&gt;create_data_element( $name, $value [, @attrs ] )
 +
Returns a new XML element named $name with $value for contents and optional attribute pairs @attrs.
  
<!-- Pod2Wiki=item_make_document --></div>
+
$value may be undef, an XML tree or an array ref of children; any other value is stringified and appended as a text node. Each child entry is de-referenced and passed to [[API:EPrints/XML#create_data_element|create_data_element]].
===$document = EPrints::XML::make_document()===
 
  
Create and return an empty document.
+
<pre>  $xml-&gt;create_data_element(
 +
    "html",
 +
    [
 +
      [ "head" ],
 +
      [ "body",
 +
        [ [ "div", undef, id =&gt; "contents" ] ]
 +
      ],
 +
    ],
 +
    xmlns =&gt; "http://www.w3.org/1999/xhtml"
 +
  );</pre>
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
+
$node = $xml-&gt;create_cdata_section( $value )
<h4><span style='display:none'>User Comments</span></h4>
+
Returns a CDATA section containing $value.
<!-- Edit below this comment -->
 
  
 +
$node = $xml-&gt;create_text_node( $value )
 +
Returns a new XML text node containing $value.
  
<!-- Pod2Wiki=item_write_xml_file --></div>
+
$node = $xml-&gt;create_comment( $value )
===EPrints::XML::write_xml_file( $node, $filename )===
+
Returns a new XML comment containing $value.
  
Write the given XML node $node to file $filename.
+
$node = $xml-&gt;create_document_fragment
 +
Returns a new XML document fragment.
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_write_xhtml_file --></div>
+
<!-- Pod2Wiki= -->
===EPrints::XML::write_xhtml_file( $node, $filename )===
+
<!-- Pod2Wiki=head_other -->
 +
===Other===
 +
$bool = $xml-&gt;is( $node, $type [, $type ... ] )
 +
Returns true if $node is one of the given node types: Document, DocumentFragment, Element, Comment, Text.
  
Write the given XML node $node to file $filename with an XHTML doctype.
+
$node = $xml-&gt;clone( $node )
 +
Returns a deep clone of $node. The new node(s) will be owned by this object.
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
+
$node = $xml-&gt;clone_node( $node )
<h4><span style='display:none'>User Comments</span></h4>
+
Returns a clone of $node only (no children). The new node will be owned by this object.
<!-- Edit below this comment -->
 
  
 +
$node = $xml-&gt;contents_of( $node )
 +
Returns a document fragment containing a copy of all the children of $node.
  
<!-- Pod2Wiki=item_tidy --></div>
+
$string = $xml-&gt;text_contents_of( $node )
===EPrints::XML::tidy( $domtree, { collapse=&gt;['element','element'...] }, [$indent] )===
+
Returns the concatenated value of all text nodes in $node (or the value of $node if $node is a text node).
  
Neatly indent the DOM tree.  
+
$utf8_string = $xml-&gt;to_string( $node, %opts )
 +
Serialises $node and returns it as a UTF-8 string.
  
Note that this should not be done to XHTML, as the difference between white space and no white space sometimes matters.
+
To generate an XHTML string see [[API:EPrints/XHTML|EPrints::XHTML]].
  
This method modifies the tree it is given. Possibly there should be a version which returns a new version without modifying the tree.
+
Options:
 +
indent - if true will indent the XML tree
  
Indent is the number of levels to indent by.
+
$xml-&gt;dispose( $node )
 +
Disposes of $node and frees the memory it uses.
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
+
<!-- Pod2Wiki=head_copyright -->
<h4><span style='display:none'>User Comments</span></h4>
+
==COPYRIGHT==
<!-- Edit below this comment -->
+
: Copyright 2000-2011 University of Southampton.
  
 +
: This file is part of EPrints http://www.eprints.org/.
  
<!-- Pod2Wiki=item_namespace --></div>
+
: EPrints is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
===$namespace = EPrints::XML::namespace( $thing, $version )===
 
  
Return the namespace for the given version of the eprints xml.
+
: EPrints is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more details.
  
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce; padding: 0em 1em 0em 1em; font-size: 80%; '>
+
: You should have received a copy of the GNU Lesser General Public License along with EPrints. If not, see http://www.gnu.org/licenses/.
<h4><span style='display:none'>User Comments</span></h4>
 
<!-- Edit below this comment -->
 
  
 
<!-- Pod2Wiki=item_version --></div>
 
===$v = EPrints::XML::version()===
 
 
Returns a string description of the current XML library and version.
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
  
  
<!-- Pod2Wiki=item_parse_url --></div>
+
<!-- Pod2Wiki= -->
===$dom = EPrints::XML::parse_url($url, $no_expand)===
+
<!-- Pod2Wiki=_postamble_ -->
 
 
Return a DOM document found at the URL
 
 
 
$url - the url which resolves to the XML you wish to parse.
 
 
 
$no_expand - set to 1 if you do not want entities in the XML to be expanded.
 
 
 
<div style='background-color: #eef; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<h4><span style='display:none'>User Comments</span></h4>
 
 
<!-- Edit below this comment -->
 
<!-- Edit below this comment -->
 
 
<!-- Pod2Wiki=_postamble_ --><!-- Edit below this comment -->
 

Latest revision as of 09:56, 22 January 2013



NAME

EPrints::XML - XML Abstraction Module


For preference, use the methods in EPrints::Handle::XML

SYNOPSIS

my $xml = $repository->xml;

$doc = $xml->parse_string( $string );
$doc = $xml->parse_file( $filename );
$doc = $xml->parse_url( $url );

$utf8_string = $xml->to_string( $dom_node, %opts );

$dom_node = $xml->clone( $dom_node ); # deep
$dom_node = $xml->clone_node( $dom_node ); # shallow

# clone and return child nodes
$dom_node = $xml->contents_of( $dom_node );
# Return text child nodes as a string
$utf8_string = $xml->text_contents_of( $dom_node );

$dom_node = $xml->create_element( $name, %attr );
$dom_node = $xml->create_text_node( $value );
$dom_node = $xml->create_comment( $value );
$dom_node = $xml->create_document_fragment;

$xml->dispose( $dom_node );


DESCRIPTION

EPrints can use the XML::DOM, XML::LibXML or XML::GDOME modules to generate and process XML. These modules differ in some of their behaviour, so this module abstracts over the differences and keeps all the module-specific code in one place.


METHODS

Parsing

$doc = $xml->parse_string( $string, %opts )

Returns an XML document parsed from $string.

$doc = $xml->parse_file( $filename, %opts )

Returns an XML document parsed from the file called $filename.

Options:

  base_path - base path to load DTD files from
  no_expand - don't expand entities

$doc = $xml->parse_url( $url, %opts )

Returns an XML document parsed from the content located at $url.
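To illustrate what parsing a string into a document looks like, here is a minimal sketch that uses XML::LibXML directly so it can run without a repository. It is not the EPrints implementation: the helper name parse_string_or_undef is hypothetical, and whether $xml->parse_string returns undef or dies on malformed input is not stated above, so the undef-on-error behaviour here is an assumption of the sketch.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse $string and return a document, or undef on error.
# XML::LibXML dies on malformed input, so the call is wrapped in eval.
sub parse_string_or_undef {
    my ($string) = @_;
    my $doc = eval { XML::LibXML->load_xml( string => $string ) };
    warn "XML parse failed: $@" if $@;
    return $doc;
}

my $doc = parse_string_or_undef('<eprint><title>Example</title></eprint>');
print $doc->documentElement->nodeName, "\n";    # prints "eprint"

# Unclosed elements: the parser raises an error and the helper returns undef.
my $bad = parse_string_or_undef('<eprint><title>');
print defined $bad ? "parsed\n" : "not parsed\n";
```

Callers that prefer exceptions could drop the eval and let the parser error propagate.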


Node Creation

$node = $xml->create_element( $name [, @attrs ] )

Returns a new XML element named $name with optional attribute pairs @attrs.

$node = $xml->create_data_element( $name, $value [, @attrs ] )

Returns a new XML element named $name with $value for contents and optional attribute pairs @attrs.

$value may be undef, an XML tree or an array ref of children; any other value is stringified and appended as a text node. Each child entry is de-referenced and passed to create_data_element.

  $xml->create_data_element(
    "html",
    [
      [ "head" ],
      [ "body",
        [ [ "div", undef, id => "contents" ] ]
      ],
    ],
    xmlns => "http://www.w3.org/1999/xhtml"
  );

$node = $xml->create_cdata_section( $value )

Returns a CDATA section containing $value.

$node = $xml->create_text_node( $value )

Returns a new XML text node containing $value.

$node = $xml->create_comment( $value )

Returns a new XML comment containing $value.

$node = $xml->create_document_fragment

Returns a new XML document fragment.
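The rules given above for $value in create_data_element (undef, an XML tree, an array ref of children, or anything else stringified) can be sketched as a small recursive builder. This is an illustrative re-implementation on top of XML::LibXML, not the EPrints source; the function name data_element and the file-level $doc are assumptions of the sketch.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

# Hypothetical helper: build an element from ( $name, $value, @attrs ).
sub data_element {
    my ( $name, $value, @attrs ) = @_;
    my $el = $doc->createElement($name);

    # Attributes arrive as a flat list of name/value pairs.
    while ( my ( $k, $v ) = splice( @attrs, 0, 2 ) ) {
        $el->setAttribute( $k, $v );
    }

    if ( !defined $value ) {
        # undef: leave the element empty
    }
    elsif ( ref($value) eq 'ARRAY' ) {
        # Array ref: each entry is itself [ $name, $value, @attrs ]
        $el->appendChild( data_element(@$_) ) for @$value;
    }
    elsif ( blessed($value) && $value->isa('XML::LibXML::Node') ) {
        # An existing XML tree is appended as-is
        $el->appendChild($value);
    }
    else {
        # Anything else is stringified and added as a text node
        $el->appendChild( $doc->createTextNode("$value") );
    }
    return $el;
}

my $html = data_element(
    "html",
    [
        [ "head" ],
        [ "body", [ [ "div", undef, id => "contents" ] ] ],
    ],
    lang => "en"
);
print $html->toString, "\n";
```

The nested array refs mirror the shape of the tree, which keeps short static structures readable without a chain of create_element and appendChild calls.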


Other

$bool = $xml->is( $node, $type [, $type ... ] )

Returns true if $node is one of the given node types: Document, DocumentFragment, Element, Comment, Text.
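As a rough illustration of a type check like the one described, the sketch below compares a node's class against a list of type names. It assumes XML::LibXML as the underlying library and uses a hypothetical helper named node_is; how EPrints performs the check internally is not shown here.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: true if $node's class is XML::LibXML::<type>
# for any of the listed type names.
sub node_is {
    my ( $node, @types ) = @_;
    my $class = ref($node) or return 0;
    return scalar grep { $class eq "XML::LibXML::$_" } @types;
}

my $doc     = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $element = $doc->createElement('title');
my $text    = $doc->createTextNode('Example');
my $comment = $doc->createComment('note');

print "element or text\n" if node_is( $element, 'Element', 'Text' );
print "not a document\n"  if !node_is( $text,    'Document' );
```

An exact class comparison is used rather than isa because, in XML::LibXML, comment nodes inherit from the text node class and would otherwise also match "Text".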

$node = $xml->clone( $node )

Returns a deep clone of $node. The new node(s) will be owned by this object.

$node = $xml->clone_node( $node )

Returns a clone of $node only (no children). The new node will be owned by this object.
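The difference between a deep and a shallow clone can be seen in a small sketch written against XML::LibXML directly (an assumption made so the example runs without EPrints; it does not show how $xml->clone or $xml->clone_node are implemented).

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(
    string => '<list><item>one</item><item>two</item></list>' );
my $list = $doc->documentElement;

my $deep    = $list->cloneNode(1);    # the element and all of its descendants
my $shallow = $list->cloneNode(0);    # the element alone, no children

my @deep_kids = $deep->childNodes;
print scalar(@deep_kids), "\n";                      # prints 2
print $shallow->hasChildNodes ? "yes" : "no", "\n";  # prints "no"

# Changing the clone leaves the original tree untouched.
$deep->removeChild( $deep_kids[0] );
my @orig_kids = $list->childNodes;
print scalar(@orig_kids), "\n";                      # prints 2
```

A shallow clone is useful when rebuilding a tree selectively: copy the container, then append only the children that should be kept.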

$node = $xml->contents_of( $node )

Returns a document fragment containing a copy of all the children of $node.

$string = $xml->text_contents_of( $node )

Returns the concatenated value of all text nodes in $node (or the value of $node if $node is a text node).
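The two operations above, copying a node's children into a fragment and collecting its text, can be sketched with XML::LibXML as follows. This is an illustration under that assumption, not EPrints code. Note that textContent gathers text from all descendants, which may or may not match text_contents_of exactly.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string => '<p>Hello <b>bold</b> world</p>' );
my $p   = $doc->documentElement;

# Copy every child of $p into a new fragment; $p itself is left unchanged.
my $frag = $doc->createDocumentFragment;
$frag->appendChild( $_->cloneNode(1) ) for $p->childNodes;

print $frag->toString, "\n";    # prints "Hello <b>bold</b> world"
print $p->textContent, "\n";    # prints "Hello bold world"
```

Because the fragment holds copies, it can be appended elsewhere without emptying the original element.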

$utf8_string = $xml->to_string( $node, %opts )

Serialises $node and returns it as a UTF-8 string.

To generate an XHTML string see EPrints::XHTML.

Options: indent - if true will indent the XML tree
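For a sense of what an indent option changes, the sketch below serialises the same document with and without formatting using XML::LibXML's toString. This stands in for $xml->to_string purely as an assumption; the exact output of the EPrints method may differ.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml( string => '<a><b/><c/></a>' );

my $compact = $doc->toString(0);    # no added white space
my $pretty  = $doc->toString(1);    # child elements on their own indented lines

print $compact;
print $pretty;
```

Indentation adds white-space text to the output, so it is best kept for XML meant for people to read rather than for content where white space is significant.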

$xml->dispose( $node )

Disposes of $node and frees the memory it uses.

COPYRIGHT

Copyright 2000-2011 University of Southampton.
This file is part of EPrints http://www.eprints.org/.
EPrints is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EPrints is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with EPrints. If not, see http://www.gnu.org/licenses/.