=head1 NAME

perlfaq7 - General Perl Language Issues ($Revision: 1.28 $, $Date: 2005/12/31 00:54:37 $)

=head1 DESCRIPTION

This section deals with general Perl language issues that don't
clearly fit into any of the other sections.

=head2 Can I get a BNF/yacc/RE for the Perl language?

There is no BNF, but you can paw your way through the yacc grammar in
perly.y in the source distribution if you're particularly brave.  The
grammar relies on very smart tokenizing code, so be prepared to
venture into toke.c as well.

In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF.
The work of parsing perl is distributed between yacc, the lexer, smoke
and mirrors."

=head2 What are all these $@%&* punctuation signs, and how do I know when to use them?

They are type specifiers, as detailed in L<perldata>:

    $ for scalar values (number, string or reference)
    @ for arrays
    % for hashes (associative arrays)
    & for subroutines (aka functions, procedures, methods)
    * for all types of that symbol name.  In version 4 you used them like
      pointers, but in modern perls you can just use references.

There are a couple of other symbols that you're likely to encounter that
aren't really type specifiers:

    <> are used for inputting a record from a filehandle.
    \  takes a reference to something.

Note that <FILE> is I<neither> the type specifier for files
nor the name of the handle.  It is the C<< <> >> operator applied
to the handle FILE.  It reads one line (well, record--see
L<perlvar/$E<sol>>) from the handle FILE in scalar context, or I<all> lines
in list context.  When performing open, close, or any other operation
besides C<< <> >> on files, or even when talking about the handle, do
I<not> use the brackets.  These are correct: C<eof(FH)>, C<seek(FH, 0,
2)> and "copying from STDIN to FILE".

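As a rough illustration of the points above, the following sketch uses one
name with three different sigils, takes a reference with C<\>, and reads from
a handle in both scalar and list context.  The variable names and the
in-memory filehandle (which assumes Perl 5.8 or later) are made up for the
example and stand in for a real file.

```perl
use strict;
use warnings;

# One name, three unrelated variables: the sigil selects which one.
my $item = "scalar";
my @item = (1, 2, 3);
my %item = (key => "value");

# \ takes a reference; putting @ in front dereferences it again.
my $aref  = \@item;
my $count = scalar @$aref;

# An in-memory filehandle stands in for a real file here.
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $first = <$fh>;   # scalar context: reads one record
my @rest  = <$fh>;   # list context: reads all remaining records
close($fh);          # no angle brackets when naming the handle
chomp $first;

print "count=$count first=$first rest=", scalar(@rest), "\n";
```

The angle brackets appear only where a record is actually read; open() and
close() receive the bare handle.
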
=head2 Do I always/never have to quote my strings or use semicolons and commas?

Normally, a bareword doesn't need to be quoted, but in most cases
probably should be (and must be under C<use strict>).  But a hash key
consisting of a simple word (that isn't the name of a defined
subroutine) and the left-hand operand to the C<< => >> operator both
count as though they were quoted:

    This                    is like this
    ------------            ---------------
    $foo{line}              $foo{'line'}
    bar => stuff            'bar' => stuff

The final semicolon in a block is optional, as is the final comma in a
list.  Good style (see L<perlstyle>) says to put them in except for
one-liners:

    if ($whoops) { exit 1 }
    @nums = (1, 2, 3);

    if ($whoops) {
        exit 1;
    }
    @lines = (
        "There Beren came from mountains cold",
        "And lost he wandered under leaves",
    );

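A small sketch of the quoting rules above, run under C<use strict> to show
that neither form trips the bareword check.  The hash and its keys are
invented for the example.

```perl
use strict;
use warnings;

# Words to the left of => need no quotes, even under strict.
my %foo = (line => 42, bar => "stuff");

# A simple word inside the braces is treated as a quoted string.
my $same = $foo{line} == $foo{'line'} ? "same" : "different";

# The trailing comma in a list is optional and harmless.
my @nums = (1, 2, 3,);

print "$same, ", scalar(@nums), " elements\n";
```
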
=head2 How do I skip some return values?

One way is to treat the return values as a list and index into it:

    $dir = (getpwnam($user))[7];

Another way is to use undef as an element on the left-hand side:

    ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

You can also use a list slice to select only the elements that
you need:

    ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5];

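The same three techniques can be tried without touching the filesystem or
the password database.  This sketch swaps in C<gmtime(0)>, whose return
list is fixed, purely so the results are predictable; it is not part of the
original answer.

```perl
use strict;
use warnings;

# gmtime returns (sec, min, hour, mday, mon, year, wday, yday, isdst).
my $year = (gmtime(0))[5] + 1900;                    # index one element

my (undef, undef, undef, $mday, $mon) = gmtime(0);   # undef placeholders

my ($hour, $wday) = (gmtime(0))[2, 6];               # list slice

print "year=$year mday=$mday mon=$mon hour=$hour wday=$wday\n";
```
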
=head2 How do I temporarily block warnings?

If you are running Perl 5.6.0 or later, the C<use warnings> pragma
gives you fine control over which warnings are produced.
See L<perllexwarn> for more details.

    {
        no warnings;          # temporarily turn off warnings
        $a = $b + $c;         # I know these might be undef
    }

Additionally, you can enable and disable individual categories of
warnings.  Turn off the categories you want to ignore and you will
still get warnings from the others.  See L<perllexwarn> for the
complete details, including the category names and hierarchy.

    {
        no warnings 'uninitialized';
        $a = $b + $c;
    }

If you have an older version of Perl, the C<$^W> variable (documented
in L<perlvar>) controls runtime warnings for a block:

    {
        local $^W = 0;        # temporarily turn off warnings
        $a = $b + $c;         # I know these might be undef
    }

Note that, as with all the punctuation variables, you cannot currently
use my() on C<$^W>, only local().

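To see that C<no warnings> is lexically scoped, one possible check is to
collect warnings through a C<$SIG{__WARN__}> handler and count them.  The
handler and variable names below are illustrative assumptions, not part of
the answer above.

```perl
use strict;
use warnings;

my @caught;
# Collect warnings in @caught instead of printing them to STDERR.
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my ($x, $y);   # both undef

{
    no warnings 'uninitialized';   # silenced only inside this block
    my $quiet = $x + 1;
}

my $noisy = $y + 1;   # outside the block the warning fires again

print scalar(@caught), " warning(s) caught\n";
```

Only the addition outside the block should reach the handler.
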
=head2 What's an extension?

An extension is a way of calling compiled C code from Perl.
L<perlxstut> is a good place to learn more about extensions.

=head2 Why do Perl operators have different precedence than C operators?

Actually, they don't.  All C operators that Perl copies have the same
precedence in Perl as they do in C.  The problem is with operators that C
doesn't have, especially functions that give a list context to everything
on their right, e.g. print, chmod, exec, and so on.  Such functions are
called "list operators" and appear as such in the precedence table in
L<perlop>.

A common mistake is to write:

    unlink $file || die "snafu";

This gets interpreted as:

    unlink ($file || die "snafu");

To avoid this problem, either put in extra parentheses or use the
very low precedence C<or> operator:

    (unlink $file) || die "snafu";
    unlink $file or die "snafu";

The "English" operators (C<and>, C<or>, C<xor>, and C<not>)
deliberately have precedence lower than that of list operators for
just such situations as the one above.

Another operator with surprising precedence is exponentiation.  It
binds more tightly even than unary minus, making C<-2**2> produce a
negative four, not a positive one.  It is also right-associative, meaning
that C<2**3**2> is two raised to the ninth power, not eight squared.

Although it has the same precedence as in C, Perl's C<?:> operator
produces an lvalue.  This assigns $x to either $a or $b, depending
on the trueness of $maybe:

    ($maybe ? $a : $b) = $x;

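The precedence rules above can be checked with a few harmless expressions.
This sketch uses C<join> as the list operator in place of C<unlink> so that
nothing is deleted; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $neg   = -2 ** 2;       # parsed as -(2 ** 2)
my $tower = 2 ** 3 ** 2;   # parsed as 2 ** (3 ** 2)

# || binds tighter than the list operator, so it applies to the last argument.
my $tight = join ',', 'x', 0 || 'y';      # join(',', 'x', (0 || 'y'))
# Parentheses make the list operator finish first.
my $loose = (join ',', 'x', 0) || 'y';    # join result is true, so || is unused

# ?: yields an lvalue, so the assignment lands on whichever side is chosen.
my ($first, $second) = (0, 0);
my $maybe = 1;
($maybe ? $first : $second) = 7;

print "$neg $tower $tight $loose $first $second\n";
```
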
=head2 How do I declare/create a structure?

In general, you don't "declare" a structure.  Just use a (probably
anonymous) hash reference.  See L<perlref> and L<perldsc> for details.
Here's an example:

    $person = {};                   # new anonymous hash
    $person->{AGE}  = 24;           # set field AGE to 24
    $person->{NAME} = "Nat";        # set field NAME to "Nat"

If you're looking for something a bit more rigorous, try L<perltoot>.

=head2 How do I create a module?

(contributed by brian d foy)

L<perlmod>, L<perlmodlib>, and L<perlmodstyle> explain modules
in all the gory details.  L<perlnewmod> gives a brief
overview of the process along with a couple of suggestions
about style.

If you need to include C code or C library interfaces in
your module, you'll need h2xs.  h2xs will create the module
distribution structure and the initial interface files
you'll need.  L<perlxs> and L<perlxstut> explain the details.

If you don't need to use C code, other tools such as
ExtUtils::ModuleMaker and Module::Starter can help you
create a skeleton module distribution.

You may also want to see Sam Tregar's "Writing Perl Modules
for CPAN" ( http://apress.com/book/bookDisplay.html?bID=14 ),
which is the best hands-on guide to creating module
distributions.

=head2 How do I create a class?

See L<perltoot> for an introduction to classes and objects, as well as
L<perlobj> and L<perlbot>.

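For orientation before reading those documents, here is a minimal sketch of
one common way to write a class: a package, a constructor that blesses a
hash reference, and methods that take the object as their first argument.
The C<Counter> class and its C<start> argument are hypothetical names chosen
for the example.

```perl
use strict;
use warnings;

package Counter;   # hypothetical class name, for illustration only

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;   # associate the hash with the class
}

sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

sub count { return $_[0]{count} }

package main;

my $counter = Counter->new(start => 5);
$counter->increment for 1 .. 3;
print $counter->count, "\n";
```
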
=head2 How can I tell if a variable is tainted?

You can use the tainted() function of the Scalar::Util module, available
from CPAN (or included with Perl since release 5.8.0).
See also L<perlsec/"Laundering and Detecting Tainted Data">.

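A short sketch of how tainted() might be called.  The choice of
C<$ENV{PATH}> as the outside data is an assumption made for the example;
the check only reports taintedness when the script runs with taint mode
enabled (C<perl -T>).

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Under taint mode, data from outside the program (such as environment
# variables) is tainted; literals in the source are not.
my $from_env = defined $ENV{PATH} ? $ENV{PATH} : '';
my $literal  = "hello";

for my $pair ([ PATH => $from_env ], [ literal => $literal ]) {
    my ($name, $value) = @$pair;
    printf "%-8s %s\n", $name, tainted($value) ? "tainted" : "not tainted";
}
```

Without C<-T>, tainted() returns false for every value.
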
=head2 What's a closure?

Closures are documented in L<perlref>.

I<Closure> is a computer science term with a precise but
hard-to-explain meaning.  Closures are implemented in Perl as anonymous
subroutines with lasting references to lexical variables outside their
own scopes.  These lexicals magically refer to the variables that were
around when the subroutine was defined (deep binding).

Closures make sense in any programming language where you can have the
return value of a function be itself a function, as you can in Perl.
Note that some languages provide anonymous functions but are not
capable of providing proper closures: early versions of the Python
language (before nested scopes were added), for example.  For more
information on closures, check out any textbook on functional
programming.  Scheme is a language that not only supports but
encourages closures.

Here's a classic function-generating function:

    sub add_function_generator {
        return sub { shift() + shift() };
    }

    $add_sub = add_function_generator();
    $sum = $add_sub->(4,5);                # $sum is 9 now.

The closure works as a I<function template> with some customization
slots left out to be filled later.  The anonymous subroutine returned
by add_function_generator() isn't technically a closure because it
refers to no lexicals outside its own scope.

Contrast this with the following make_adder() function, in which the
returned anonymous function contains a reference to a lexical variable
outside the scope of that function itself.  Such a reference requires
that Perl return a proper closure, thus locking in for all time the
value that the lexical had when the function was created.

    sub make_adder {