=head1 NAME

perlstyle - Perl style guide

=head1 DESCRIPTION

Each programmer will, of course, have his or her own preferences
with regard to formatting, but there are some general guidelines that will
make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the B<-w>
flag at all times. You may turn it off explicitly for particular
portions of code via the C<no warnings> pragma or the C<$^W> variable
if you must. You should also always run under C<use strict> or know the
reason why not. The C<use sigtrap> and even C<use diagnostics> pragmas
may also prove useful.

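As a minimal sketch of the advice above, a script might enable both
pragmas at the top and switch off a single warning category only in the
block that needs it. The C<@fields> data and the choice of the
C<uninitialized> category are assumptions made for illustration:

    use strict;
    use warnings;

    my @fields = ('alpha', undef, 'gamma');    # hypothetical sample data

    my $line;
    {
        # Silence only the "uninitialized" category, only in this block.
        no warnings 'uninitialized';
        $line = join(',', @fields);
    }
    print "$line\n";                           # warnings are active again here
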
Regarding the aesthetics of code layout, about the only thing Larry
cares strongly about is that the closing curly bracket of
a multi-line BLOCK should line up with the keyword that started the construct.
Beyond that, he has other preferences that aren't so strong:

=over 4

=item *

4-column indent.

=item *

Opening curly on same line as keyword, if possible, otherwise line up.

=item *

Space before the opening curly of a multi-line BLOCK.

=item *

One-line BLOCK may be put on one line, including curlies.

=item *

No space before the semicolon.

=item *

Semicolon omitted in "short" one-line BLOCK.

=item *

Space around most operators.

=item *

Space around a "complex" subscript (inside brackets).

=item *

Blank lines between chunks that do different things.

=item *

Uncuddled elses.

=item *

No space between function name and its opening parenthesis.

=item *

Space after each comma.

=item *

Long lines broken after an operator (except C<and> and C<or>).

=item *

Space after last parenthesis matching on current line.

=item *

Line up corresponding items vertically.

=item *

Omit redundant punctuation as long as clarity doesn't suffer.

=back

Larry has his reasons for each of these things, but he doesn't claim that
everyone else's mind works the same as his does.

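To make the layout preferences listed above concrete, here is a short
sketch that follows several of them: 4-column indent, opening curly on
the keyword's line, a one-line BLOCK with no semicolon inside, space
after each comma, and an uncuddled else. The sub C<count_long_words()>
and its data are invented for illustration and are not part of any
module:

    use strict;
    use warnings;

    # Hypothetical helper: count the words longer than $min_length.
    sub count_long_words {
        my ($min_length, @words) = @_;

        # Short one-line BLOCK: curlies on one line, no semicolon inside.
        my @long = grep { length($_) > $min_length } @words;

        if (@long == 0) {
            warn "no words longer than $min_length\n";
        }
        else {
            print "found ", scalar(@long), " long words\n";
        }

        return scalar(@long);
    }

    my $found = count_long_words(4, qw(perl style guide layout));
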
Here are some other more substantive style issues to think about:

=over 4

=item *

Just because you I<CAN> do something a particular way doesn't mean that
you I<SHOULD> do it that way. Perl is designed to give you several
ways to do anything, so consider picking the most readable one. For
instance

    open(FOO,$foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(FOO,$foo);

because the second way hides the main point of the statement in a
modifier. On the other hand

    print "Starting analysis\n" if $verbose;

is better than

    $verbose && print "Starting analysis\n";

because the main point isn't whether the user typed B<-v> or not.

Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults. The defaults
are there for lazy systems programmers writing one-shot programs. If
you want your program to be readable, consider supplying the argument.

Along the same lines, just because you I<CAN> omit parentheses in many
places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

When in doubt, parenthesize. At the very least it will let some poor
schmuck bounce on the % key in B<vi>.

Even if you aren't in doubt, consider the mental welfare of the person
who has to maintain the code after you, and who will probably put
parentheses in the wrong place.

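To illustrate the point about default arguments, the sketch below does
the same work twice, first leaning on the implicit C<$_> and then
naming every argument. The arrays C<@terse> and C<@spelled> are
hypothetical sample data:

    use strict;
    use warnings;

    my @terse   = ("alpha\n", "beta\n");    # hypothetical input
    my @spelled = ("alpha\n", "beta\n");

    # Relying on defaults: everything operates on the implicit $_.
    for (@terse) {
        chomp;
        print;
        print "\n";
    }

    # Supplying the arguments: the reader does not have to infer them.
    for my $name (@spelled) {
        chomp($name);
        print($name, "\n");
    }

Both loops print the same lines; the second simply states what each
call acts on.
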
=item *

Don't go through silly contortions to exit a loop at the top or the
bottom, when Perl provides the C<last> operator so you can exit in
the middle. Just "outdent" it a little to make it more visible:

    LINE:
        for (;;) {
            statements;
          last LINE if $foo;
            next LINE if /^#/;
            statements;
        }

=item *

Don't be afraid to use loop labels--they're there to enhance
readability as well as to allow multilevel loop breaks. See the
previous example.

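The previous example has only one loop, so as a sketch of the
multilevel case, here is a labeled outer loop controlled from inside an
inner one. The label C<ROW> and the C<@rows> data are assumptions made
for illustration:

    use strict;
    use warnings;

    my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);   # hypothetical data
    my @seen;

    ROW:
    for my $row (@rows) {
        for my $cell (@$row) {
            next ROW if $cell < 0;      # abandon the rest of this row
            last ROW if $cell > 7;      # leave both loops at once
            push @seen, $cell;
        }
    }
    print "@seen\n";
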
=item *

Avoid using C<grep()> (or C<map()>) or `backticks` in a void context, that is,
when you just throw away their return values. Those functions all
have return values, so use them. Otherwise use a C<foreach()> loop or
the C<system()> function instead.

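A small sketch of the difference, using made-up file names. The
discouraged forms are shown only as comments, and C<$^X> (the path of
the running perl) stands in for whatever external command you would
really run:

    use strict;
    use warnings;

    my @files = ('a.txt', 'b.txt');          # hypothetical file names

    # Discouraged: map builds a list that is thrown away.
    #   map { print "$_\n" } @files;

    # Preferred: a foreach loop when only the side effect matters.
    foreach my $file (@files) {
        print "$file\n";
    }

    # map and grep earn their keep when the result is used.
    my @upper   = map  { uc($_) } @files;
    my @a_files = grep { /^a/ } @files;

    # Discouraged: backticks capture output that is then ignored.
    #   `$^X -e 1`;

    # Preferred: system() when the output is not needed.
    system($^X, '-e', '1') == 0
        or die "can't run perl: exit status $?\n";
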
=item *

For portability, when using features that may not be implemented on
every machine, test the construct in an eval to see if it fails. If
you know in which version or patchlevel a particular feature was
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
will be there. The C<Config> module will also let you interrogate values
determined by the B<Configure> program when Perl was installed.

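The three checks might be sketched as follows. The choice of
C<symlink()> as the feature to probe, the C<5.010> threshold, and the
variable names are illustrative assumptions:

    use strict;
    use warnings;
    use Config;

    # Probe a feature by trying it inside eval; symlink() is used here
    # only as an example of a call that some platforms lack.
    my $has_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

    # Compare $] numerically against the release that added a feature.
    my $has_say = $] >= 5.010 ? 1 : 0;          # say() arrived in 5.10

    # Ask Config what Configure recorded when this perl was built.
    my $osname = $Config{osname};

    print "symlink: $has_symlink, say: $has_say, os: $osname\n";
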
=item *

Choose mnemonic identifiers. If you can't remember what mnemonic means,
you've got a problem.

=item *

While short identifiers like C<$gotit> are probably OK, use underscores to
separate words in longer identifiers. It is generally easier to read
C<$var_names_like_this> than C<$VarNamesLikeThis>, especially for
non-native speakers of English. It's also a simple rule that works
consistently with C<VAR_NAMES_LIKE_THIS>.

Package names are sometimes an exception to this rule. Perl informally
reserves lowercase module names for "pragma" modules like C<integer> and
C<strict>. Other modules should begin with a capital letter and use mixed
case, but probably without underscores due to limitations in primitive
file systems' representations of module names as files that must fit into a
few sparse bytes.

=item *

You may find it helpful to use letter case to indicate the scope
or nature of a variable. For example:

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
    $Some_Caps_Here  package-wide global/static
    $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase.
E.g., C<$obj-E<gt>as_string()>.

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.

=item *

If you have a really hairy regular expression, use the C</x> modifier and
put in some whitespace to make it look a little less like line noise.
Don't use slash as a delimiter when your regexp has slashes or backslashes.

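As a sketch of both suggestions, the pattern below splits a path using
braces as delimiters and the C</x> modifier. The sample path and the
variable names are hypothetical:

    use strict;
    use warnings;

    my $path = '/usr/local/bin/perl';        # hypothetical input

    # Braces as delimiters mean the slashes need no escaping, and /x
    # leaves room for whitespace and comments inside the pattern.
    my ($dir, $file) = $path =~ m{
        ^ (.*) /        # everything up to the last slash
        ( [^/]+ ) $     # the final path component
    }x;

    print "$dir | $file\n";
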
=item *

Use the C<and> and C<or> operators to avoid having to parenthesize
list operators so much, and to reduce the incidence of punctuation
operators like C<&&> and C<||>. Call your subroutines as if they were
functions or list operators to avoid excessive ampersands and parentheses.

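A brief sketch of both halves of this advice. The directory handles and
the C<largest()> helper are invented for illustration:

    use strict;
    use warnings;

    # With ||, the parentheses are needed: without them this would
    # parse as opendir($dh, ('.' || die ...)) and die could never run.
    opendir(my $dh, '.') || die "can't opendir .: $!";
    closedir($dh);

    # or binds more loosely than a list operator's arguments, so the
    # parentheses can go.
    opendir my $dh2, '.' or die "can't opendir .: $!";
    closedir $dh2;

    sub largest {                            # hypothetical helper
        my $max = shift;
        for my $n (@_) { $max = $n if $n > $max }
        return $max;
    }

    my $old_style = &largest(3, 9, 4);       # works, but noisy
    my $new_style = largest(3, 9, 4);        # reads like a built-in
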
=item *

Use here documents instead of repeated C<print()> statements.

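For instance, a usage message might be written either way. The program
name and option list are hypothetical, and the terminator C<END_USAGE>
must start in the first column once the documentation indent shown here
is removed:

    use strict;
    use warnings;

    my $program = 'frobnicate';              # hypothetical program name

    # Repeated print statements:
    print "Usage: $program [options] file\n";
    print "  -v  be verbose\n";
    print "  -h  show this help\n";

    # The same text as a single here document:
    print <<"END_USAGE";
    Usage: $program [options] file
      -v  be verbose
      -h  show this help
    END_USAGE
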
=item *

Line up corresponding things vertically, especially if it'd be too long
to fit on one line anyway.

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME       if $opt_u;
    $IDX = $ST_CTIME       if $opt_c;
    $IDX = $ST_SIZE        if $opt_s;

    mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!";

=item *

Always check the return codes of system calls. Good error messages should
go to C<STDERR>, include which program caused the problem, what the failed
system call and arguments were, and (VERY IMPORTANT) should contain the
standard system error message for what went wrong. Here's a simple but
sufficient example:

    opendir(D, $dir) or die "can't opendir $dir: $!";

=item *

Line up your transliterations when it makes sense:

    tr [abc]
       [xyz];

=item *

Think about reusability. Why waste brainpower on a one-shot when you
might want to do something like it again? Consider generalizing your
code. Consider writing a module or object class. Consider making your
code run cleanly with C<use strict> and C<use warnings> (or B<-w>) in
effect. Consider giving away your code. Consider changing your whole
world view. Consider... oh, never mind.

=item *

Try to document your code and use Pod formatting in a consistent way. Here
are commonly expected conventions:

=over 4

=item *

Use C<CE<lt>E<gt>> for function, variable and module names (and more
generally anything that can be considered part of code, like filehandles
or specific values). Note that function names are considered more readable
with parentheses after their name, that is C<function()>.

=item *

Use C<BE<lt>E<gt>> for command names like B<cat> or B<grep>.

=item *

Use C<FE<lt>E<gt>> or C<CE<lt>E<gt>> for file names. C<FE<lt>E<gt>> should
be the only Pod code for file names, but as most Pod formatters render it
as italic, Unix and Windows paths with their slashes and backslashes may
be less readable, and better rendered with C<CE<lt>E<gt>>.

=back

=item *

Be consistent.

=item *

Be nice.

=back