Difference between revisions of "API:EPrints/Apache/SiteMap"

From EPrints Documentation
 
==NAME==
 
 
EPrints::Apache::SiteMap
 
 
<div style='background-color: #e8e8f; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<span style='display:none'>User Comments</span>
 
<!-- Edit below this comment -->
 
  
  
  
 
If the static sitemap XML is a sitemapindex, this handler inserts a new &lt;sitemap&gt; element into the index, which directs crawlers to a "sitemap-sc.xml" URL that contains the semantic web sitemap generated by _insert_semantic_web_extensions. This handler also implements the sitemap-sc.xml URL.
 
 
<div style='background-color: #e8e8f; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<span style='display:none'>User Comments</span>
 
<!-- Edit below this comment -->
 
  
  
 
<!-- Pod2Wiki=head_methods -->
 
 
==METHODS==
 
<div style='background-color: #e8e8f; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<span style='display:none'>User Comments</span>
 
<!-- Edit below this comment -->
 
 
  
 
<!-- Pod2Wiki= -->
 
  
 
  $rc = EPrints::Apache::SiteMap::handler( $r )
 
Handler for managing EPrints requests for dynamically generated sitemap.xml or sitemap-sc.xml (or returning static version if that exists).
 
 
  sub handler {
    my( $r ) = @_;

    my $repository = $EPrints::HANDLE->current_repository;
    my $xml = $repository->xml;
    my $sitemap;

    if ( $r->uri =~ m! sitemap-sc\.xml$ !x )
    {
      # this is a direct request for the semantic web extensions
      $sitemap = _new_urlset( $repository, $xml );
    }
    else
    {
      # get the static sitemap.xml
      my $langid = EPrints::Session::get_session_language( $repository, $r );
      my @static_dirs = $repository->get_static_dirs( $langid );
      foreach my $static_dir ( @static_dirs )
      {
        my $file = "$static_dir/sitemap.xml";
        next if( !-e $file );

        $sitemap = $xml->parse_file($file) || EPrints::abort( "Can't parse $file: $!" );
        last;
      }

      if( !defined $sitemap )
      {
        # no static sitemap file - create a new document
        $sitemap = _new_urlset( $repository, $xml );
      }
      elsif( $sitemap->documentElement->localname eq "urlset" )
      {
        # the static sitemap is a <urlset> - append the semantic web extensions to the end
        _insert_semantic_web_extensions($repository, $xml, $sitemap->documentElement);
      }
      elsif( $sitemap->documentElement->localname eq "sitemapindex" )
      {
        # the static sitemap is a <sitemapindex> - append a semantic web sitemap to the index
        my $sw_sitemap = $sitemap->createElement("sitemap");
        $sitemap->documentElement->appendChild($sw_sitemap);

        # append the location of the semantic web sitemap
        my $sw_loc = $sitemap->createElement("loc");
        $sw_sitemap->appendChild($sw_loc);
        $sw_loc->appendChild($sitemap->createTextNode($repository->config('base_url')."/sitemap-sc.xml"));
      }
    }

    # adds local sitemap URLs
    if( $sitemap->documentElement->localname eq "urlset" )
    {
      $repository->run_trigger( EPrints::Const::EP_TRIGGER_LOCAL_SITEMAP_URLS,
        urlset => $sitemap->documentElement,
      );
    }
    # TODO: else { call some other trigger, with the sitemapindex element }

    binmode( *STDOUT, ":utf8" );
    $repository->send_http_header( "content_type"=>"text/xml; charset=UTF-8" );
    print $xml->to_string( $sitemap );

    return DONE;
  }
 
 
 
  #
  # Creates a new XML document containing a urlset populated
  # by _insert_semantic_web_extensions
  #
  sub _new_urlset {
    my( $repository, $xml ) = @_;

    my $document = $xml->make_document();
    my $urlset = $xml->create_element(
        "urlset",
        "xmlns" => "http://www.sitemaps.org/schemas/sitemap/0.9"
    );
    _insert_semantic_web_extensions( $repository, $xml, $urlset );
    $document->appendChild( $urlset );

    return $document;
  }
 
 
 
  #
  # Insert the semantic web extensions as children of the element given as the
  # third argument to the function. This function contains the body of the main
  # handler shipped with EPrints 3.2.x
  #
  sub _insert_semantic_web_extensions {
    my ( $repository, $xml, $urlset ) = @_;

    $urlset->setAttribute( "xmlns:sc" , "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd" );

    my $sc_dataset = $xml->create_element( "sc:dataset" );

    $urlset->appendChild( $sc_dataset );
    $sc_dataset->appendChild( _create_data( $xml,
      "sc:linkedDataPrefix",
      $repository->config( 'base_url' )."/id/",
      slicing => "subject-object", ));
    $sc_dataset->appendChild( _create_data( $xml,
      "sc:datasetURI",
      $repository->config( 'base_url' )."/id/repository" ));

    $sc_dataset->appendChild( _create_data( $xml,
      "sc:dataDumpLocation",
      $repository->config( 'base_url' )."/id/repository" ));
    $sc_dataset->appendChild( _create_data( $xml,
      "sc:dataDumpLocation",
      $repository->config( 'base_url' )."/id/dump" ));

    my $root_subject = $repository->dataset("subject")->dataobj("ROOT");
    foreach my $top_subject ( $root_subject->get_children )
    {
      $sc_dataset->appendChild( _create_data( $xml,
        "sc:dataDumpLocation",
        $top_subject->uri ) );
    }
  }
 
 
 
  sub _create_data {
    my( $xml, $name, $data, %attr ) = @_;

    my $node = $xml->create_element( $name, %attr );
    $node->appendChild( $xml->create_text_node( $data ));

    return $node;
  }

  1;
 
 
 
<div style='background-color: #e8e8f; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<span style='display:none'>User Comments</span>
 
<!-- Edit below this comment -->
 
  
  
 
<!-- Pod2Wiki=head_copyright -->
 
 
==COPYRIGHT==
 
<div style='background-color: #e8e8f; margin: 0.5em 0em 1em 0em; border: solid 1px #cce;  padding: 0em 1em 0em 1em; font-size: 80%; '>
 
<span style='display:none'>User Comments</span>
 
<!-- Edit below this comment -->
 
 
  
 
<!-- Pod2Wiki= -->
 
 
</div>
 
</div>
 
<!-- Pod2Wiki=_postamble_ --><!-- Edit below this comment -->
 

Revision as of 16:52, 14 December 2021



NAME

EPrints::Apache::SiteMap


DESCRIPTION

This handler has been heavily modified to support a static sitemap.xml file alongside the semantic web crawling extensions provided by EPrints. If a static sitemap.xml exists, the handler inserts the extensions into it; otherwise it creates a new document. The body of the original handler, as shipped with EPrints 3.2.x, now lives in the _insert_semantic_web_extensions function.

If the static sitemap XML is a sitemapindex, this handler inserts a new <sitemap> element into the index, which directs crawlers to a "sitemap-sc.xml" URL that contains the semantic web sitemap generated by _insert_semantic_web_extensions. This handler also implements the sitemap-sc.xml URL.
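The sitemapindex case can be illustrated with a minimal standalone sketch. It assumes XML::LibXML is installed and uses it directly rather than the EPrints XML wrapper; the index content and the https://repository.example.org base URL are hypothetical placeholders standing in for a real static file and the repository's base_url setting.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical static sitemap index; the URLs are placeholders.
my $static = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://repository.example.org/sitemap-1.xml</loc></sitemap>
</sitemapindex>
XML

# Assumption: stands in for the repository's configured base_url.
my $base_url = 'https://repository.example.org';

my $doc  = XML::LibXML->load_xml( string => $static );
my $root = $doc->documentElement;

if( $root->localname eq 'sitemapindex' )
{
    # Append <sitemap><loc>BASE_URL/sitemap-sc.xml</loc></sitemap> to the index.
    my $sitemap = $doc->createElement( 'sitemap' );
    $root->appendChild( $sitemap );

    my $loc = $doc->createElement( 'loc' );
    $sitemap->appendChild( $loc );
    $loc->appendChild( $doc->createTextNode( "$base_url/sitemap-sc.xml" ) );
}

print $doc->toString;
```

The printed index keeps the original entry and gains one more <sitemap> element whose <loc> points at sitemap-sc.xml, which is the URL crawlers then follow to reach the semantic web sitemap.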


METHODS

handler

$rc = EPrints::Apache::SiteMap::handler( $r )

Handles EPrints requests for sitemap.xml and sitemap-sc.xml. A static sitemap.xml is used as the starting point if one exists; otherwise the document is generated dynamically.
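The branching described above can be summarised in a small sketch. The sitemap_action function and the action names it returns are hypothetical, introduced only for illustration; they are not part of EPrints. The function takes the request URI and the local name of the static sitemap's root element (undef when no static file was found) and names the path the handler would be expected to take.

```perl
use strict;
use warnings;

# Hypothetical helper: names the branch taken for a given request.
# $static_root is the local name of the static sitemap's root element,
# or undef when no static sitemap.xml exists.
sub sitemap_action
{
    my( $uri, $static_root ) = @_;

    # A direct request for sitemap-sc.xml always yields a fresh urlset.
    return 'new_urlset'    if $uri =~ m! sitemap-sc\.xml$ !x;
    # No static file: build the document from scratch.
    return 'new_urlset'    if !defined $static_root;
    # Static <urlset>: append the semantic web extensions to it.
    return 'extend_urlset' if $static_root eq 'urlset';
    # Static <sitemapindex>: add an entry pointing at sitemap-sc.xml.
    return 'extend_index'  if $static_root eq 'sitemapindex';
    # Any other root element is passed through unmodified.
    return 'unchanged';
}

for my $case (
    [ '/sitemap-sc.xml', undef ],
    [ '/sitemap.xml',    undef ],
    [ '/sitemap.xml',    'urlset' ],
    [ '/sitemap.xml',    'sitemapindex' ],
)
{
    my( $uri, $root ) = @$case;
    printf "%-16s %-13s => %s\n", $uri, $root // '(none)', sitemap_action( $uri, $root );
}
```

Whenever the resulting document's root is a urlset, the handler additionally runs the EP_TRIGGER_LOCAL_SITEMAP_URLS trigger so that local code can add further URLs before the XML is sent.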


COPYRIGHT