StyleGuide

From EPrints Documentation
Revision as of 09:22, 12 August 2009 by Kiz (talk | contribs) (Local Variables)
Jump to: navigation, search

EPrints has been under development for many years and has some fluff about the place. For new programmers this document is intended as a 'style guide' to at least keep the code and documentation consistent across new modules.

Programming Style

Naming

It helps if all things are written in a consistent style: when one looks at code, we recognise things by the format of the name:

Global Variables

Global variables are variables that are in scope for the entire system, not restricted to specific user sessions. Examples of Global Variables are the AUTH_OK code returned by the Apache web server subsystem.

EPrints follows the general Perl convention, and such variables (and CONSTANTS) should be in CAPITAL LETTERS

As a general rule code outside the core should not really be creating global variables, however there are situations

            AUTH_OK = "1";
            

Local Variables

Variables in code must always be pre-declared, and should be given the most appropriate level of restriction

my, our and local variables are all available to you, however we really really suggest you use my variables wherever possible.

Local variables should be given names that reflect their content ($authenticated) rather than a short nmenonic ($au). Local variables should be underscore seperated ($current_password), not camelCased ($currentPassword)

The appropriate length of a name is inversely proportional to the size of its scope. --Mark-Jason Dominus

Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a problem. -- Larry Wall

Preferred technique:

 my $proxy_depositor;
 $proxy_depositor = get_proxy_depositor($session, $eprint);

Acceptable technicque:

 my $proxy_depositor = get_proxy_depositor($session, $eprint);

Subroutines

Subroutines (aka functions, aka methods) should also be underscore seperated

Public methods

These are the API calls that are used to interact with your code.

Public methods must start with a lowe-case letter [a-z], and are required to have POD above them explaining the function and use of the method.

 =pod

 =item * my_func( $var_1, %var_2 ):

 Description of function, with examples

=cut

 sub my_func
 {

 }
Private methods

Private functions are the subroutines in your modules that people should not be using. Private functions should start with an underscore, and have simple perl comments above them, explaining the function of the subroutine.

 #########
 #
 # _my_subby( $var_1, %var_2 ):
 #
 # Description of function, with examples
 #
 sub _my_subby
 {

 }

Modules

Modules (aka packages) are the file that contains the collection of methods & private funtions that makes your code do whatever it does.

Modules are named in the perl style: Perl informally reserves lowercase module names for pragma modules like integer and strict. Other modules should begin with a capital letter and use mixed case

    package EPrints::Plugin::Sword::Import::FiddleDeeDee

See the modules page for full details.

General layout/presentation notes

We lay out code mostly in the GNU style: curly braces are on the next line, aligned with the start of the thing that defines the block; closing braces align with the opening brace.

When there is a literal list (variables being passed into a subroutine; an array being created; etc) and the sequence of items extends beyond 80 characters, wrap the line after a comma.

  $thinggy = my_subroutine( $session, $short_variable,
                            $long_variable_name,
                            { 'key_1' => $value_one,
                              'key_2' => $value_two
                            }
                          );

If the literal list is a more regular block of items, padding things to make it more grid-like is preferred:

 @months_of_the_year = ( 'January', 'February', 'March',
                         'April',   'May',      'June',
                         'July',    'August',   'September',
                         'October', 'November', 'December'
                       );

Subroutines

This applies to all subroutines - public or private.

             sub get_value
             {
                 my( $self, $arg1, $arg2 ) = @_;

                 return $r;
             }

When writing "if/else" sets, have the short option at the top, even if this means doing an "if not" or an "unless"

  if (!$authenticated)
  {
      log_error("Not authenticated");
      return
  }
  else
  {
      # long sequence of code   
  }

Where possible, use "return" rather than an "if" block.

AVOID:

             sub get_value
             {
                 my( $self, $arg1 ) = @_;

                 my $r;
                 if( $arg1 )
                 {
                     $r = $arg1 * 2;
                 }
                 else
                 {
                     log( "some error" );
                 }
                     
                 return $r;
             }

Prefer this style instead, treating the problem like a basic exception to the normal running of the function:

             sub get_value
             {
                 my( $self, $arg1 ) = @_;

                 if( !defined $arg1 )
                 {
                     log( "some error" );
                     return;
                 }
     
                 return $arg1 * 2;
             }

Conditionals

Conditionals should always preceed the code block they control.

             if( ref($a) eq "ARRAY" )
             {
                 print "Dang";
                 return 0;
             }

If the code-block is a single line, and the conditional plus code-block can fit on a single line, then the perl short-form is acceptable:

             return 0 if( ref($a) eq "ARRAY" );
             $a = 23 if( !defined $a );
             $dom->appendChild($foo_bar) if (!$bish_bash)

However this is NEVER OK:

             if( !defined $a ) $a = 23;

Loops

             foreach my $field ( @fields )
             {
                     push @foo, $field->get_name();
             }

OR, when nested:

             FIELD: foreach my $field ( @fields )
             { 
                     VALUE: foreach my $value ( @{$field->{ "lsd" }} )
                     {
                             next VALUE if !defined $value; 
                             $values{ $value } = 1;
                     }
             }

Avoid $_ where possible, it is just confusing to everyone else (and has a tendency to be written over by Lazy Programmers)

Try and use "next" rather than if when inside loops.


Automatically reformatting code

There is a wonderful utility called perltidy (find link)

This reformats perl for you (and does a syntax check at the same time... lovely.

Use:

 perltidy -gnu -csc -b JSON.pm
 -gnu reformats into the general 'gnu' style (as opposed to the "Larry Wall/Perl" style)
 -csc adds comments to the end of longish loops
 -b edits the file in place (so you may want to leave that off initially)