Difference between revisions of "StyleGuide"

From EPrints Documentation
Jump to: navigation, search
m (restore link broken by accidentally editing old revision)
 
(16 intermediate revisions by 4 users not shown)
Line 1: Line 1:
EPrints has been under development for many years and has some fluff about the place. For new programmers this document is intended as a 'style guide' to at least keep the code and documentation consistent across new modules.
+
[[Category:Contribute]]
 +
EPrints has been under development for many years and has some fluff about the place. For new programmers this document is intended as a 'style guide' to at least keep the code and documentation consistent across new [[modules]].
  
 
==Programming Style==
 
==Programming Style==
  
 
===Naming===
 
===Naming===
 +
 +
It helps if all things are written in a consistent style: when one looks at code, we recognise things by the format of the name:
 +
 +
====Global Variables====
 +
Global variables are variables that are ''in scope'' for the entire system, not restricted to specific user sessions. Examples of Global Variables are the AUTH_OK code returned by the Apache web server subsystem.
 +
 +
EPrints follows the general Perl convention, and such variables (and '''CONSTANTS''') should be in CAPITAL LETTERS
 +
 +
As a general rule code outside the core should not really be creating global variables, however there are situations
 +
 +
            AUTH_OK = "1";
 +
           
 +
 +
====Local Variables====
 +
Variables in code must always be pre-declared, and should be given the most appropriate level of restriction
 +
 +
'''[http://perldoc.perl.org/functions/my.html my]''', '''[http://perldoc.perl.org/functions/our.html our]''' and '''[http://perldoc.perl.org/functions/local.html local]''' variables are all available to you, however we really ''really'' suggest you use '''my''' variables wherever possible.
 +
 +
Local variables should be given names that reflect their content ($authenticated) rather than a short nmenonic ($au). Local variables should be underscore seperated ($current_password), not camelCased ($currentPassword)
 +
 +
''The appropriate length of a name is inversely proportional to the size of its scope.'' --Mark-Jason Dominus
 +
 +
''Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a problem.'' -- Larry Wall
 +
 +
Preferred technique:
 +
 +
  my $proxy_depositor;
 +
  $proxy_depositor = get_proxy_depositor($session, $eprint);
 +
 +
Acceptable technicque:
 +
 +
  my $proxy_depositor = get_proxy_depositor($session, $eprint);
 +
 +
====Subroutines====
 +
Subroutines (aka ''functions'', aka ''methods'') should also be underscore seperated
 +
 +
=====Public methods=====
 +
These are the API calls that are used to interact with your code.
 +
 +
Public methods must start with a lowe-case letter [a-z], and are required to have [http://perldoc.perl.org/perlpod.html POD] above them explaining the function and use of the method.
 
<pre>
 
<pre>
            TYPE          EXAMPLE
+
=pod
  
            module        StyleGuide
+
=item * my_func( $var_1, %var_2 ):
            subroutine    get_value
+
 
            global var    AUTH_OK
+
Description of function, with examples
            local var      $field_name
+
 
 +
=cut
 +
 
 +
sub my_func
 +
{
 +
 
 +
}
 
</pre>
 
</pre>
 +
 +
=====Private methods=====
 +
Private functions are the subroutines in your modules that people should '''not''' be using.
 +
Private functions should start with an underscore, and have simple perl comments above them, explaining the function of the subroutine.
 +
<pre>
 +
#########
 +
#
 +
# _my_subby( $var_1, %var_2 ):
 +
#
 +
# Description of function, with examples
 +
#
 +
sub _my_subby
 +
{
 +
 +
}
 +
</pre>
 +
====Modules====
 +
Modules (aka ''packages'') are the file that contains the collection of methods & private funtions that makes your code do whatever it does.
 +
 +
Modules are named in the perl style: Perl informally reserves lowercase module names for ''pragma'' modules like integer and strict. Other modules should begin with a capital letter and use mixed case
 +
 +
    package EPrints::Plugin::Sword::Import::FiddleDeeDee
 +
 +
See the [[modules]] page for full details.
 +
 +
===General layout/presentation notes===
 +
We lay out code mostly in the GNU style: curly braces are on the next line, aligned with the start of the thing that defines the block; closing braces align with the opening brace.
 +
 +
When there is a literal list (variables being passed into a subroutine; an array being created; etc) and the sequence of items extends beyond 80 characters, wrap the line after a comma.
 +
 +
  $thinggy = my_subroutine( $session, $short_variable,
 +
                            $long_variable_name,
 +
                            { 'key_1' => $value_one,
 +
                              'key_2' => $value_two
 +
                            }
 +
                          );
 +
 +
If the literal list is a more regular block of items, padding things to make it more grid-like is preferred:
 +
 +
  @months_of_the_year = ( 'January', 'February', 'March',
 +
                          'April',  'May',      'June',
 +
                          'July',    'August',  'September',
 +
                          'October', 'November', 'December'
 +
                        );
 +
 
===Subroutines===
 
===Subroutines===
 +
This applies to ''all'' subroutines - public or private.
 +
 
<pre>
 
<pre>
 
             sub get_value
 
             sub get_value
 
             {
 
             {
                    my( $self, $arg1, $arg2 ) = @_;
+
                my( $self, $arg1, $arg2 ) = @_;
  
                    return $r;
+
                return $r;
 
             }
 
             }
 
</pre>
 
</pre>
 +
 +
When writing "if/else" sets, have the short option at the top, even if this means doing an "if not" or an "unless"
 +
 +
  if (!$authenticated)
 +
  {
 +
      log_error("Not authenticated");
 +
      return
 +
  }
 +
  else
 +
  {
 +
      # long sequence of code 
 +
  }
  
 
Where possible, use "return" rather than an "if" block.
 
Where possible, use "return" rather than an "if" block.
Line 28: Line 134:
 
             sub get_value
 
             sub get_value
 
             {
 
             {
                    my( $self, $arg1 ) = @_;
+
                my( $self, $arg1 ) = @_;
  
                    my $r;
+
                my $r;
                    if( $arg1 )
+
                if( $arg1 )
                    {
+
                {
                            $r = $arg1 * 2;
+
                    $r = $arg1 * 2;
                    }
+
                }
                    else
+
                else
                    {
+
                {
                            log( "some error" );
+
                    log( "some error" );
                    }
+
                }
 
                      
 
                      
                    return $r;
+
                return $r;
 
             }
 
             }
 
</pre>
 
</pre>
Line 48: Line 154:
 
             sub get_value
 
             sub get_value
 
             {
 
             {
                    my( $self, $arg1 ) = @_;
+
                my( $self, $arg1 ) = @_;
  
                    if( !defined $arg1 )
+
                if( !defined $arg1 )
                    {
+
                {
                            log( "some error" );
+
                    log( "some error" );
                            return;
+
                    return;
                    }
+
                }
 
      
 
      
                    return $arg1 * 2;
+
                return $arg1 * 2;
 
             }
 
             }
 
</pre>
 
</pre>
  
 
===Conditionals===
 
===Conditionals===
 +
Conditionals should always preceed the code block they control.
 +
 
<pre>
 
<pre>
 
             if( ref($a) eq "ARRAY" )
 
             if( ref($a) eq "ARRAY" )
 
             {
 
             {
                    return 0;
+
                print "Dang";
 +
                return 0;
 
             }
 
             }
 
</pre>
 
</pre>
 +
 +
If the code-block is a single line, and the conditional plus code-block can fit on a single line, then the perl short-form is acceptable:
 +
<pre>
 +
            return 0 if( ref($a) eq "ARRAY" );
 +
            $a = 23 if( !defined $a );
 +
            $dom->appendChild($foo_bar) if (!$bish_bash)
 +
</pre>
 +
However this is NEVER OK:
 +
<pre>
 +
            if( !defined $a ) $a = 23;
 +
</pre>
 +
 
===Loops===
 
===Loops===
 
<pre>
 
<pre>
Line 73: Line 194:
 
                     push @foo, $field->get_name();
 
                     push @foo, $field->get_name();
 
             }
 
             }
 
+
</pre>
 
OR, when nested:
 
OR, when nested:
 
+
<pre>
 
             FIELD: foreach my $field ( @fields )
 
             FIELD: foreach my $field ( @fields )
 
             {  
 
             {  
Line 87: Line 208:
 
</pre>
 
</pre>
  
Avoid $_ where possible.
+
Avoid $_ where possible, it is just confusing to everyone else (and has a tendency to be written over by Lazy Programmers)
  
 
Try and use "next" rather than if when inside loops.
 
Try and use "next" rather than if when inside loops.
  
==Licensing==
 
 
We would like everything under the same license.... the EPrints license:
 
  
<pre>
 
######################################################################
 
#
 
#  This file is part of GNU EPrints 3.
 
 
#  Copyright (c) 2000-2007 University of Southampton, UK. SO17 1BJ.
 
 
#  EPrints 3 is free software; you can redistribute it and/or modify
 
#  it under the terms of the GNU General Public License as published by
 
#  the Free Software Foundation; either version 2 of the License, or
 
#  (at your option) any later version.
 
 
#  EPrints 3 is distributed in the hope that it will be useful,
 
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
 
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 
#  GNU General Public License for more details.
 
 
#  You should have received a copy of the GNU General Public License
 
#  along with EPrints 3; if not, write to the Free Software
 
#  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
#
 
######################################################################
 
</pre>
 
 
==Description==
 
 
Below the license block the name, description and synopsis (a synopsis is an example of usage). Lastly the METHODS title begins the section for inline subroutine documentation.
 
<pre>
 
=head1 NAME
 
 
EPrints::MyModule - A one line description of MyModule
 
 
=head1 DESCRIPTION
 
 
One or two paragraphs explaining the function of EPrints::MyModule.
 
 
=head1 SYNOPSIS
 
 
use EPrints::MyModule;
 
 
my $obj = EPrints::MyModule->new( $opts );
 
$obj->do_thing( $thingy );
 
 
=head1 METHODS
 
 
=over 4
 
 
=cut
 
</pre>
 
==Methods==
 
===Public Methods===
 
Each public subroutine should have POD documentation above it, with hashes to
 
separate it from the method above. A large module should probably be split into
 
different sections, e.g. "CONSTRUCTOR METHODS", "ACCESSOR METHODS", etc.
 
Private methods can be documented using Perl comments.
 
<pre>
 
######################################################################
 
 
=item $objname = EPrints::StyleGuide->my_sub( $arg1, [$opt_arg2], \%opts )
 
 
A description of what my_sub does and arguments it takes ($opt_arg2 is
 
shown as optional by using brackets).
 
 
A description of $arg1 if needed, along with an example:
 
 
EPrints::StyleGuide->my_sub( "eprintid" );
 
 
EPrints::StyleGuide->my_sub(
 
$arg1,
 
undef,
 
{
 
opt1 => $var1, # What is var1
 
opt2 => $var2, # What is var2
 
}
 
);
 
 
Further elaboration on the effect of $var2.
 
 
=cut
 
 
######################################################################
 
 
sub my_sub
 
{
 
...
 
}
 
</pre>
 
===Private Methods===
 
  
 
==Automatically reformatting code==
 
==Automatically reformatting code==
There is a wonderful utility called ''perltidy'' (find link)
+
There is a wonderful utility called ''perltidy''' ([http://perltidy.sourceforge.net/ website])
  
 
This reformats perl for you (and does a syntax check at the same time... lovely.
 
This reformats perl for you (and does a syntax check at the same time... lovely.

Latest revision as of 23:51, 11 September 2018

EPrints has been under development for many years and has some fluff about the place. For new programmers this document is intended as a 'style guide' to at least keep the code and documentation consistent across new modules.

Programming Style

Naming

It helps if all things are written in a consistent style: when one looks at code, we recognise things by the format of the name:

Global Variables

Global variables are variables that are in scope for the entire system, not restricted to specific user sessions. Examples of Global Variables are the AUTH_OK code returned by the Apache web server subsystem.

EPrints follows the general Perl convention, and such variables (and CONSTANTS) should be in CAPITAL LETTERS

As a general rule code outside the core should not really be creating global variables, however there are situations

            AUTH_OK = "1";
            

Local Variables

Variables in code must always be pre-declared, and should be given the most appropriate level of restriction

my, our and local variables are all available to you, however we really really suggest you use my variables wherever possible.

Local variables should be given names that reflect their content ($authenticated) rather than a short nmenonic ($au). Local variables should be underscore seperated ($current_password), not camelCased ($currentPassword)

The appropriate length of a name is inversely proportional to the size of its scope. --Mark-Jason Dominus

Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a problem. -- Larry Wall

Preferred technique:

 my $proxy_depositor;
 $proxy_depositor = get_proxy_depositor($session, $eprint);

Acceptable technicque:

 my $proxy_depositor = get_proxy_depositor($session, $eprint);

Subroutines

Subroutines (aka functions, aka methods) should also be underscore seperated

Public methods

These are the API calls that are used to interact with your code.

Public methods must start with a lowe-case letter [a-z], and are required to have POD above them explaining the function and use of the method.

 =pod

 =item * my_func( $var_1, %var_2 ):

 Description of function, with examples

=cut

 sub my_func
 {

 }
Private methods

Private functions are the subroutines in your modules that people should not be using. Private functions should start with an underscore, and have simple perl comments above them, explaining the function of the subroutine.

 #########
 #
 # _my_subby( $var_1, %var_2 ):
 #
 # Description of function, with examples
 #
 sub _my_subby
 {

 }

Modules

Modules (aka packages) are the file that contains the collection of methods & private funtions that makes your code do whatever it does.

Modules are named in the perl style: Perl informally reserves lowercase module names for pragma modules like integer and strict. Other modules should begin with a capital letter and use mixed case

    package EPrints::Plugin::Sword::Import::FiddleDeeDee

See the modules page for full details.

General layout/presentation notes

We lay out code mostly in the GNU style: curly braces are on the next line, aligned with the start of the thing that defines the block; closing braces align with the opening brace.

When there is a literal list (variables being passed into a subroutine; an array being created; etc) and the sequence of items extends beyond 80 characters, wrap the line after a comma.

  $thinggy = my_subroutine( $session, $short_variable,
                            $long_variable_name,
                            { 'key_1' => $value_one,
                              'key_2' => $value_two
                            }
                          );

If the literal list is a more regular block of items, padding things to make it more grid-like is preferred:

 @months_of_the_year = ( 'January', 'February', 'March',
                         'April',   'May',      'June',
                         'July',    'August',   'September',
                         'October', 'November', 'December'
                       );

Subroutines

This applies to all subroutines - public or private.

             sub get_value
             {
                 my( $self, $arg1, $arg2 ) = @_;

                 return $r;
             }

When writing "if/else" sets, have the short option at the top, even if this means doing an "if not" or an "unless"

  if (!$authenticated)
  {
      log_error("Not authenticated");
      return
  }
  else
  {
      # long sequence of code   
  }

Where possible, use "return" rather than an "if" block.

AVOID:

             sub get_value
             {
                 my( $self, $arg1 ) = @_;

                 my $r;
                 if( $arg1 )
                 {
                     $r = $arg1 * 2;
                 }
                 else
                 {
                     log( "some error" );
                 }
                     
                 return $r;
             }

Prefer this style instead, treating the problem like a basic exception to the normal running of the function:

             sub get_value
             {
                 my( $self, $arg1 ) = @_;

                 if( !defined $arg1 )
                 {
                     log( "some error" );
                     return;
                 }
     
                 return $arg1 * 2;
             }

Conditionals

Conditionals should always preceed the code block they control.

             if( ref($a) eq "ARRAY" )
             {
                 print "Dang";
                 return 0;
             }

If the code-block is a single line, and the conditional plus code-block can fit on a single line, then the perl short-form is acceptable:

             return 0 if( ref($a) eq "ARRAY" );
             $a = 23 if( !defined $a );
             $dom->appendChild($foo_bar) if (!$bish_bash)

However this is NEVER OK:

             if( !defined $a ) $a = 23;

Loops

             foreach my $field ( @fields )
             {
                     push @foo, $field->get_name();
             }

OR, when nested:

             FIELD: foreach my $field ( @fields )
             { 
                     VALUE: foreach my $value ( @{$field->{ "lsd" }} )
                     {
                             next VALUE if !defined $value; 
                             $values{ $value } = 1;
                     }
             }

Avoid $_ where possible, it is just confusing to everyone else (and has a tendency to be written over by Lazy Programmers)

Try and use "next" rather than if when inside loops.


Automatically reformatting code

There is a wonderful utility called perltidy' (website)

This reformats perl for you (and does a syntax check at the same time... lovely.

Use:

 perltidy -gnu -csc -b JSON.pm
 -gnu reformats into the general 'gnu' style (as opposed to the "Larry Wall/Perl" style)
 -csc adds comments to the end of longish loops
 -b edits the file in place (so you may want to leave that off initially)