| 1 | =head1 NAME
|
|---|
| 2 | X<subroutine> X<function>
|
|---|
| 3 |
|
|---|
| 4 | perlsub - Perl subroutines
|
|---|
| 5 |
|
|---|
| 6 | =head1 SYNOPSIS
|
|---|
| 7 |
|
|---|
| 8 | To declare subroutines:
|
|---|
| 9 | X<subroutine, declaration> X<sub>
|
|---|
| 10 |
|
|---|
| 11 | sub NAME; # A "forward" declaration.
|
|---|
| 12 | sub NAME(PROTO); # ditto, but with prototypes
|
|---|
| 13 | sub NAME : ATTRS; # with attributes
|
|---|
| 14 | sub NAME(PROTO) : ATTRS; # with attributes and prototypes
|
|---|
| 15 |
|
|---|
| 16 | sub NAME BLOCK # A declaration and a definition.
|
|---|
| 17 | sub NAME(PROTO) BLOCK # ditto, but with prototypes
|
|---|
| 18 | sub NAME : ATTRS BLOCK # with attributes
|
|---|
| 19 | sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
|
|---|
| 20 |
|
|---|
| 21 | To define an anonymous subroutine at runtime:
|
|---|
| 22 | X<subroutine, anonymous>
|
|---|
| 23 |
|
|---|
| 24 | $subref = sub BLOCK; # no proto
|
|---|
| 25 | $subref = sub (PROTO) BLOCK; # with proto
|
|---|
| 26 | $subref = sub : ATTRS BLOCK; # with attributes
|
|---|
| 27 | $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
|
|---|
| 28 |
|
|---|
| 29 | To import subroutines:
|
|---|
| 30 | X<import>
|
|---|
| 31 |
|
|---|
| 32 | use MODULE qw(NAME1 NAME2 NAME3);
|
|---|
| 33 |
|
|---|
| 34 | To call subroutines:
|
|---|
| 35 | X<subroutine, call> X<call>
|
|---|
| 36 |
|
|---|
| 37 | NAME(LIST); # & is optional with parentheses.
|
|---|
| 38 | NAME LIST; # Parentheses optional if predeclared/imported.
|
|---|
| 39 | &NAME(LIST); # Circumvent prototypes.
|
|---|
| 40 | &NAME; # Makes current @_ visible to called subroutine.
|
|---|
| 41 |
|
|---|
| 42 | =head1 DESCRIPTION
|
|---|
| 43 |
|
|---|
| 44 | Like many languages, Perl provides for user-defined subroutines.
|
|---|
| 45 | These may be located anywhere in the main program, loaded in from
|
|---|
| 46 | other files via the C<do>, C<require>, or C<use> keywords, or
|
|---|
| 47 | generated on the fly using C<eval> or anonymous subroutines.
|
|---|
| 48 | You can even call a function indirectly using a variable containing
|
|---|
| 49 | its name or a CODE reference.
|
|---|
| 50 |
|
|---|
| 51 | The Perl model for function call and return values is simple: all
|
|---|
| 52 | functions are passed as parameters one single flat list of scalars, and
|
|---|
| 53 | all functions likewise return to their caller one single flat list of
|
|---|
| 54 | scalars. Any arrays or hashes in these call and return lists will
|
|---|
| 55 | collapse, losing their identities--but you may always use
|
|---|
| 56 | pass-by-reference instead to avoid this. Both call and return lists may
|
|---|
| 57 | contain as many or as few scalar elements as you'd like. (Often a
|
|---|
| 58 | function without an explicit return statement is called a subroutine, but
|
|---|
| 59 | there's really no difference from Perl's perspective.)
|
|---|
| 60 | X<subroutine, parameter> X<parameter>
|
|---|
| 61 |
|
|---|
| 62 | Any arguments passed in show up in the array C<@_>. Therefore, if
|
|---|
| 63 | you called a function with two arguments, those would be stored in
|
|---|
| 64 | C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its
|
|---|
| 65 | elements are aliases for the actual scalar parameters. In particular,
|
|---|
| 66 | if an element C<$_[0]> is updated, the corresponding argument is
|
|---|
| 67 | updated (or an error occurs if it is not updatable). If an argument
|
|---|
| 68 | is an array or hash element which did not exist when the function
|
|---|
| 69 | was called, that element is created only when (and if) it is modified
|
|---|
| 70 | or a reference to it is taken. (Some earlier versions of Perl
|
|---|
| 71 | created the element whether or not the element was assigned to.)
|
|---|
| 72 | Assigning to the whole array C<@_> removes that aliasing, and does
|
|---|
| 73 | not update any arguments.
|
|---|
| 74 | X<subroutine, argument> X<argument> X<@_>
|
|---|
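The aliasing rules above can be seen in a minimal sketch. The subroutine names (C<double_in_place>, C<no_effect>, C<peek>) and the variables are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: modifies its first argument through the alias.
sub double_in_place {
    $_[0] *= 2;          # $_[0] is an alias for the caller's variable
    return;
}

# Hypothetical helper: assigning to the whole array removes the aliasing.
sub no_effect {
    @_ = (99);           # @_ now holds a fresh list; the caller is unaffected
    $_[0]++;
    return;
}

# Hypothetical helper: reads its argument but never modifies it.
sub peek { my $unused = $_[0]; return }

my $n = 21;
double_in_place($n);     # $n is now 42
no_effect($n);           # $n is still 42

my %h;
peek($h{missing});       # the element is not created by merely passing it

print "n=$n, key created: ", (exists $h{missing} ? "yes" : "no"), "\n";
```

Copying C<@_> into C<my> variables at the top of a subroutine avoids accidental in-place changes of this kind.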
| 75 |
|
|---|
| 76 | A C<return> statement may be used to exit a subroutine, optionally
|
|---|
| 77 | specifying the returned value, which will be evaluated in the
|
|---|
| 78 | appropriate context (list, scalar, or void) depending on the context of
|
|---|
| 79 | the subroutine call. If you specify no return value, the subroutine
|
|---|
| 80 | returns an empty list in list context, the undefined value in scalar
|
|---|
| 81 | context, or nothing in void context. If you return one or more
|
|---|
| 82 | aggregates (arrays and hashes), these will be flattened together into
|
|---|
| 83 | one large indistinguishable list.
|
|---|
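As a small illustration of the return rules just described (the names C<nothing> and C<both> are hypothetical), a bare C<return> adapts to the caller's context, and returned arrays lose their boundaries:

```perl
use strict;
use warnings;

# Hypothetical sub: a bare return yields () or undef depending on context.
sub nothing { return }

my @list   = nothing();          # empty list in list context
my $scalar = nothing();          # undef in scalar context

# Hypothetical sub: two arrays are flattened into one list on return.
sub both {
    my @a = (1, 2);
    my @b = (3, 4, 5);
    return (@a, @b);
}

my @flat = both();               # (1, 2, 3, 4, 5); the boundary is lost

printf "list has %d elements, scalar is %s, flat has %d elements\n",
    scalar(@list), (defined $scalar ? "defined" : "undef"), scalar(@flat);
```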
| 84 |
|
|---|
| 85 | If no C<return> is found and if the last statement is an expression, its
|
|---|
| 86 | value is returned. If the last statement is a loop control structure
|
|---|
| 87 | like a C<foreach> or a C<while>, the returned value is unspecified. The
|
|---|
| 88 | empty sub returns the empty list.
|
|---|
| 89 | X<subroutine, return value> X<return value> X<return>
|
|---|
| 90 |
|
|---|
| 91 | Perl does not have named formal parameters. In practice all you
|
|---|
| 92 | do is assign to a C<my()> list of these. Variables that aren't
|
|---|
| 93 | declared to be private are global variables. For gory details
|
|---|
| 94 | on creating private variables, see L<"Private Variables via my()">
|
|---|
| 95 | and L<"Temporary Values via local()">. To create protected
|
|---|
| 96 | environments for a set of functions in a separate package (and
|
|---|
| 97 | probably a separate file), see L<perlmod/"Packages">.
|
|---|
| 98 | X<formal parameter> X<parameter, formal>
|
|---|
| 99 |
|
|---|
| 100 | Example:
|
|---|
| 101 |
|
|---|
| 102 | sub max {
|
|---|
| 103 | my $max = shift(@_);
|
|---|
| 104 | foreach $foo (@_) {
|
|---|
| 105 | $max = $foo if $max < $foo;
|
|---|
| 106 | }
|
|---|
| 107 | return $max;
|
|---|
| 108 | }
|
|---|
| 109 | $bestday = max($mon,$tue,$wed,$thu,$fri);
|
|---|
| 110 |
|
|---|
| 111 | Example:
|
|---|
| 112 |
|
|---|
| 113 | # get a line, combining continuation lines
|
|---|
| 114 | # that start with whitespace
|
|---|
| 115 |
|
|---|
| 116 | sub get_line {
|
|---|
| 117 | $thisline = $lookahead; # global variables!
|
|---|
| 118 | LINE: while (defined($lookahead = <STDIN>)) {
|
|---|
| 119 | if ($lookahead =~ /^[ \t]/) {
|
|---|
| 120 | $thisline .= $lookahead;
|
|---|
| 121 | }
|
|---|
| 122 | else {
|
|---|
| 123 | last LINE;
|
|---|
| 124 | }
|
|---|
| 125 | }
|
|---|
| 126 | return $thisline;
|
|---|
| 127 | }
|
|---|
| 128 |
|
|---|
| 129 | $lookahead = <STDIN>; # get first line
|
|---|
| 130 | while (defined($line = get_line())) {
|
|---|
| 131 | ...
|
|---|
| 132 | }
|
|---|
| 133 |
|
|---|
| 134 | Assigning to a list of private variables to name your arguments:
|
|---|
| 135 |
|
|---|
| 136 | sub maybeset {
|
|---|
| 137 | my($key, $value) = @_;
|
|---|
| 138 | $Foo{$key} = $value unless $Foo{$key};
|
|---|
| 139 | }
|
|---|
| 140 |
|
|---|
| 141 | Because the assignment copies the values, this also has the effect
|
|---|
| 142 | of turning call-by-reference into call-by-value. Otherwise a
|
|---|
| 143 | function is free to do in-place modifications of C<@_> and change
|
|---|
| 144 | its caller's values.
|
|---|
| 145 | X<call-by-reference> X<call-by-value>
|
|---|
| 146 |
|
|---|
| 147 | upcase_in($v1, $v2); # this changes $v1 and $v2
|
|---|
| 148 | sub upcase_in {
|
|---|
| 149 | for (@_) { tr/a-z/A-Z/ }
|
|---|
| 150 | }
|
|---|
| 151 |
|
|---|
| 152 | You aren't allowed to modify constants in this way, of course. If an
|
|---|
| 153 | argument were actually literal and you tried to change it, you'd take a
|
|---|
| 154 | (presumably fatal) exception. For example, this won't work:
|
|---|
| 155 | X<call-by-reference> X<call-by-value>
|
|---|
| 156 |
|
|---|
| 157 | upcase_in("frederick");
|
|---|
| 158 |
|
|---|
| 159 | It would be much safer if the C<upcase_in()> function
|
|---|
| 160 | were written to return a copy of its parameters instead
|
|---|
| 161 | of changing them in place:
|
|---|
| 162 |
|
|---|
| 163 | ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
|
|---|
| 164 | sub upcase {
|
|---|
| 165 | return unless defined wantarray; # void context, do nothing
|
|---|
| 166 | my @parms = @_;
|
|---|
| 167 | for (@parms) { tr/a-z/A-Z/ }
|
|---|
| 168 | return wantarray ? @parms : $parms[0];
|
|---|
| 169 | }
|
|---|
| 170 |
|
|---|
| 171 | Notice how this (unprototyped) function doesn't care whether it was
|
|---|
| 172 | passed real scalars or arrays. Perl sees all arguments as one big,
|
|---|
| 173 | long, flat parameter list in C<@_>. This is one area where
|
|---|
| 174 | Perl's simple argument-passing style shines. The C<upcase()>
|
|---|
| 175 | function would work perfectly well without changing the C<upcase()>
|
|---|
| 176 | definition even if we fed it things like this:
|
|---|
| 177 |
|
|---|
| 178 | @newlist = upcase(@list1, @list2);
|
|---|
| 179 | @newlist = upcase( split /:/, $var );
|
|---|
| 180 |
|
|---|
| 181 | Do not, however, be tempted to do this:
|
|---|
| 182 |
|
|---|
| 183 | (@a, @b) = upcase(@list1, @list2);
|
|---|
| 184 |
|
|---|
| 185 | Like the flattened incoming parameter list, the return list is also
|
|---|
| 186 | flattened on return. So all you have managed to do here is store
|
|---|
| 187 | everything in C<@a> and make C<@b> empty. See
|
|---|
| 188 | L<Pass by Reference> for alternatives.
|
|---|
| 189 |
|
|---|
| 190 | A subroutine may be called using an explicit C<&> prefix. The
|
|---|
| 191 | C<&> is optional in modern Perl, as are parentheses if the
|
|---|
| 192 | subroutine has been predeclared. The C<&> is I<not> optional
|
|---|
| 193 | when just naming the subroutine, such as when it's used as
|
|---|
| 194 | an argument to defined() or undef(). Nor is it optional when you
|
|---|
| 195 | want to do an indirect subroutine call with a subroutine name or
|
|---|
| 196 | reference using the C<&$subref()> or C<&{$subref}()> constructs,
|
|---|
| 197 | although the C<< $subref->() >> notation solves that problem.
|
|---|
| 198 | See L<perlref> for more about all that.
|
|---|
| 199 | X<&>
|
|---|
| 200 |
|
|---|
| 201 | Subroutines may be called recursively. If a subroutine is called
|
|---|
| 202 | using the C<&> form, the argument list is optional, and if omitted,
|
|---|
| 203 | no C<@_> array is set up for the subroutine: the C<@_> array at the
|
|---|
| 204 | time of the call is visible to the subroutine instead. This is an
|
|---|
| 205 | efficiency mechanism that new users may wish to avoid.
|
|---|
| 206 | X<recursion>
|
|---|
| 207 |
|
|---|
| 208 | &foo(1,2,3); # pass three arguments
|
|---|
| 209 | foo(1,2,3); # the same
|
|---|
| 210 |
|
|---|
| 211 | foo(); # pass a null list
|
|---|
| 212 | &foo(); # the same
|
|---|
| 213 |
|
|---|
| 214 | &foo; # foo() gets current args, like foo(@_) !!
|
|---|
| 215 | foo; # like foo() IFF sub foo predeclared, else "foo"
|
|---|
| 216 |
|
|---|
| 217 | Not only does the C<&> form make the argument list optional, it also
|
|---|
| 218 | disables any prototype checking on arguments you do provide. This
|
|---|
| 219 | is partly for historical reasons, and partly for having a convenient way
|
|---|
| 220 | to cheat if you know what you're doing. See L<Prototypes> below.
|
|---|
| 221 | X<&>
|
|---|
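The two effects of the C<&> form can be sketched as follows. This is only an illustration: C<first_arg>, C<inner> and C<outer> are hypothetical names, and the C<($)> prototype is used merely because it makes the difference easy to observe.

```perl
use strict;
use warnings;

# Hypothetical sub with a ($) prototype: its argument is put in scalar context.
sub first_arg ($) { return $_[0] }

my @items = qw(a b c);

my $with_proto    = first_arg(@items);    # prototype applies: scalar(@items), i.e. 3
my $without_proto = &first_arg(@items);   # prototype ignored: list is flattened, "a"

# Hypothetical pair: "&inner;" makes the caller's current @_ visible to inner().
sub inner { return scalar @_ }
sub outer { return &inner; }

my $count = outer(10, 20, 30);            # inner() sees all three arguments

print "$with_proto $without_proto $count\n";   # prints: 3 a 3
```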
| 222 |
|
|---|
| 223 | Subroutines whose names are in all upper case are reserved to the Perl
|
|---|
| 224 | core, as are modules whose names are in all lower case. Naming a subroutine in
|
|---|
| 225 | all capitals is a loosely held convention meaning that it will be called
|
|---|
| 226 | indirectly by the run-time system itself, usually due to a triggered event.
|
|---|
| 227 | Subroutines that do special, pre-defined things include C<AUTOLOAD>, C<CLONE>,
|
|---|
| 228 | C<DESTROY> plus all functions mentioned in L<perltie> and L<PerlIO::via>.
|
|---|
| 229 |
|
|---|
| 230 | The C<BEGIN>, C<CHECK>, C<INIT> and C<END> subroutines are not so much
|
|---|
| 231 | subroutines as named special code blocks, of which you can have more
|
|---|
| 232 | than one in a package, and which you can B<not> call explicitly. See
|
|---|
| 233 | L<perlmod/"BEGIN, CHECK, INIT and END">.
|
|---|
| 234 |
|
|---|
| 235 | =head2 Private Variables via my()
|
|---|
| 236 | X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
|
|---|
| 237 | X<lexical scope> X<attributes, my>
|
|---|
| 238 |
|
|---|
| 239 | Synopsis:
|
|---|
| 240 |
|
|---|
| 241 | my $foo; # declare $foo lexically local
|
|---|
| 242 | my (@wid, %get); # declare list of variables local
|
|---|
| 243 | my $foo = "flurp"; # declare $foo lexical, and init it
|
|---|
| 244 | my @oof = @bar; # declare @oof lexical, and init it
|
|---|
| 245 | my $x : Foo = $y; # similar, with an attribute applied
|
|---|
| 246 |
|
|---|
| 247 | B<WARNING>: The use of attribute lists on C<my> declarations is still
|
|---|
| 248 | evolving. The current semantics and interface are subject to change.
|
|---|
| 249 | See L<attributes> and L<Attribute::Handlers>.
|
|---|
| 250 |
|
|---|
| 251 | The C<my> operator declares the listed variables to be lexically
|
|---|
| 252 | confined to the enclosing block, conditional (C<if/unless/elsif/else>),
|
|---|
| 253 | loop (C<for/foreach/while/until/continue>), subroutine, C<eval>,
|
|---|
| 254 | or C<do/require/use>'d file. If more than one value is listed, the
|
|---|
| 255 | list must be placed in parentheses. All listed elements must be
|
|---|
| 256 | legal lvalues. Only alphanumeric identifiers may be lexically
|
|---|
| 257 | scoped--magical built-ins like C<$/> must currently be C<local>ized
|
|---|
| 258 | with C<local> instead.
|
|---|
| 259 |
|
|---|
| 260 | Unlike dynamic variables created by the C<local> operator, lexical
|
|---|
| 261 | variables declared with C<my> are totally hidden from the outside
|
|---|
| 262 | world, including any called subroutines. This is true if it's the
|
|---|
| 263 | same subroutine called from itself or elsewhere--every call gets
|
|---|
| 264 | its own copy.
|
|---|
| 265 | X<local>
|
|---|
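A minimal sketch of that difference, using hypothetical names (C<$dyn>, C<show>, C<with_local>, C<with_my>): a called subroutine sees a C<local> value but never a caller's C<my> variable.

```perl
use strict;
use warnings;

our $dyn = "global";     # package variable; can be given a temporary value

sub show { return $dyn } # sees whatever value the package variable has when called

sub with_local {
    local $dyn = "localized";   # visible to show() for the duration of this call
    return show();
}

sub with_my {
    my $dyn = "lexical";        # private to this block; show() cannot see it
    return show();
}

print with_local(), " ", with_my(), " ", show(), "\n";
# prints: localized global global
```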
| 266 |
|
|---|
| 267 | This doesn't mean that a C<my> variable declared in a statically
|
|---|
| 268 | enclosing lexical scope would be invisible. Only dynamic scopes
|
|---|
| 269 | are cut off. For example, the C<bumpx()> function below has access
|
|---|
| 270 | to the lexical $x variable because both the C<my> and the C<sub>
|
|---|
| 271 | occurred at the same scope, presumably file scope.
|
|---|
| 272 |
|
|---|
| 273 | my $x = 10;
|
|---|
| 274 | sub bumpx { $x++ }
|
|---|
| 275 |
|
|---|
| 276 | An C<eval()>, however, can see lexical variables of the scope it is
|
|---|
| 277 | being evaluated in, so long as the names aren't hidden by declarations within
|
|---|
| 278 | the C<eval()> itself. See L<perlref>.
|
|---|
| 279 | X<eval, scope of>
|
|---|
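For instance, in this short sketch (variable names are illustrative only) the first string C<eval> sees the enclosing C<$x>, while the second hides it with its own declaration:

```perl
use strict;
use warnings;

my $x = 10;

# The string is compiled in the current lexical scope, so it can see $x.
my $seen = eval q{ $x + 1 };                  # 11

# A declaration inside the eval hides the outer $x for the rest of the eval.
my $hidden = eval q{ my $x = 100; $x + 1 };   # 101

print "$seen $hidden $x\n";                   # prints: 11 101 10
```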
| 280 |
|
|---|
| 281 | The parameter list to my() may be assigned to if desired, which allows you
|
|---|
| 282 | to initialize your variables. (If no initializer is given for a
|
|---|
| 283 | particular variable, it is created with the undefined value.) Commonly
|
|---|
| 284 | this is used to name input parameters to a subroutine. Examples:
|
|---|
| 285 |
|
|---|
| 286 | $arg = "fred"; # "global" variable
|
|---|
| 287 | $n = cube_root(27);
|
|---|
| 288 | print "$arg thinks the root is $n\n";
|
|---|
| 289 | fred thinks the root is 3
|
|---|
| 290 |
|
|---|
| 291 | sub cube_root {
|
|---|
| 292 | my $arg = shift; # name doesn't matter
|
|---|
| 293 | $arg **= 1/3;
|
|---|
| 294 | return $arg;
|
|---|
| 295 | }
|
|---|
| 296 |
|
|---|
| 297 | The C<my> is simply a modifier on something you might assign to. So when
|
|---|
| 298 | you do assign to variables in its argument list, C<my> doesn't
|
|---|
| 299 | change whether those variables are viewed as a scalar or an array. So
|
|---|
| 300 |
|
|---|
| 301 | my ($foo) = <STDIN>; # WRONG?
|
|---|
| 302 | my @FOO = <STDIN>;
|
|---|
| 303 |
|
|---|
| 304 | both supply a list context to the right-hand side, while
|
|---|
| 305 |
|
|---|
| 306 | my $foo = <STDIN>;
|
|---|
| 307 |
|
|---|
| 308 | supplies a scalar context. But the following declares only one variable:
|
|---|
| 309 |
|
|---|
| 310 | my $foo, $bar = 1; # WRONG
|
|---|
| 311 |
|
|---|
| 312 | That has the same effect as
|
|---|
| 313 |
|
|---|
| 314 | my $foo;
|
|---|
| 315 | $bar = 1;
|
|---|
| 316 |
|
|---|
| 317 | The declared variable is not introduced (is not visible) until after
|
|---|
| 318 | the current statement. Thus,
|
|---|
| 319 |
|
|---|
| 320 | my $x = $x;
|
|---|
| 321 |
|
|---|
| 322 | can be used to initialize a new $x with the value of the old $x, and
|
|---|
| 323 | the expression
|
|---|
| 324 |
|
|---|
| 325 | my $x = 123 and $x == 123
|
|---|
| 326 |
|
|---|
| 327 | is false unless the old $x happened to have the value C<123>.
|
|---|
| 328 |
|
|---|
| 329 | Lexical scopes of control structures are not bounded precisely by the
|
|---|
| 330 | braces that delimit their controlled blocks; control expressions are
|
|---|
| 331 | part of that scope, too. Thus in the loop
|
|---|
| 332 |
|
|---|
| 333 | while (my $line = <>) {
|
|---|
| 334 | $line = lc $line;
|
|---|
| 335 | } continue {
|
|---|
| 336 | print $line;
|
|---|
| 337 | }
|
|---|
| 338 |
|
|---|
| 339 | the scope of $line extends from its declaration throughout the rest of
|
|---|
| 340 | the loop construct (including the C<continue> clause), but not beyond
|
|---|
| 341 | it. Similarly, in the conditional
|
|---|
| 342 |
|
|---|
| 343 | if ((my $answer = <STDIN>) =~ /^yes$/i) {
|
|---|
| 344 | user_agrees();
|
|---|
| 345 | } elsif ($answer =~ /^no$/i) {
|
|---|
| 346 | user_disagrees();
|
|---|
| 347 | } else {
|
|---|
| 348 | chomp $answer;
|
|---|
| 349 | die "'$answer' is neither 'yes' nor 'no'";
|
|---|
| 350 | }
|
|---|
| 351 |
|
|---|
| 352 | the scope of $answer extends from its declaration through the rest
|
|---|
| 353 | of that conditional, including any C<elsif> and C<else> clauses,
|
|---|
| 354 | but not beyond it. See L<perlsyn/"Simple statements"> for information
|
|---|
| 355 | on the scope of variables in statements with modifiers.
|
|---|
| 356 |
|
|---|
| 357 | The C<foreach> loop defaults to scoping its index variable dynamically
|
|---|
| 358 | in the manner of C<local>. However, if the index variable is
|
|---|
| 359 | prefixed with the keyword C<my>, or if there is already a lexical
|
|---|
| 360 | by that name in scope, then a new lexical is created instead. Thus
|
|---|
| 361 | in the loop
|
|---|
| 362 | X<foreach> X<for>
|
|---|
| 363 |
|
|---|
| 364 | for my $i (1, 2, 3) {
|
|---|
| 365 | some_function();
|
|---|
| 366 | }
|
|---|
| 367 |
|
|---|
| 368 | the scope of $i extends to the end of the loop, but not beyond it,
|
|---|
| 369 | rendering the value of $i inaccessible within C<some_function()>.
|
|---|
| 370 | X<foreach> X<for>
|
|---|
| 371 |
|
|---|
| 372 | Some users may wish to encourage the use of lexically scoped variables.
|
|---|
| 373 | As an aid to catching implicit uses of package variables,
|
|---|
| 374 | which are always global, if you say
|
|---|
| 375 |
|
|---|
| 376 | use strict 'vars';
|
|---|
| 377 |
|
|---|
| 378 | then any variable mentioned from there to the end of the enclosing
|
|---|
| 379 | block must either refer to a lexical variable, be predeclared via
|
|---|
| 380 | C<our> or C<use vars>, or else must be fully qualified with the package name.
|
|---|
| 381 | A compilation error results otherwise. An inner block may countermand
|
|---|
| 382 | this with C<no strict 'vars'>.
|
|---|
| 383 |
|
|---|
| 384 | A C<my> has both a compile-time and a run-time effect. At compile
|
|---|
| 385 | time, the compiler takes notice of it. The principal usefulness
|
|---|
| 386 | of this is to quiet C<use strict 'vars'>, but it is also essential
|
|---|
| 387 | for generation of closures as detailed in L<perlref>. Actual
|
|---|
| 388 | initialization is delayed until run time, though, so it gets executed
|
|---|
| 389 | at the appropriate time, such as each time through a loop, for
|
|---|
| 390 | example.
|
|---|
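The run-time effect can be observed with closures. In this sketch (names are hypothetical), the C<my> is executed on every pass through the loop, so each anonymous subroutine captures a distinct variable:

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    my $square = $n * $n;               # a fresh $square each time through the loop
    push @subs, sub { return $square }; # the closure captures this iteration's copy
}

# Each closure still holds its own $square.
print join(", ", map { $_->() } @subs), "\n";   # prints: 1, 4, 9
```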
| 391 |
|
|---|
| 392 | Variables declared with C<my> are not part of any package and are therefore
|
|---|
| 393 | never fully qualified with the package name. In particular, you're not
|
|---|
| 394 | allowed to try to make a package variable (or other global) lexical:
|
|---|
| 395 |
|
|---|
| 396 | my $pack::var; # ERROR! Illegal syntax
|
|---|
| 397 | my $_; # also illegal (currently)
|
|---|
| 398 |
|
|---|
| 399 | In fact, dynamic variables (also known as package or global variables)
|
|---|
| 400 | are still accessible using the fully qualified C<::> notation even while a
|
|---|
| 401 | lexical of the same name is also visible:
|
|---|
| 402 |
|
|---|
| 403 | package main;
|
|---|
| 404 | local $x = 10;
|
|---|
| 405 | my $x = 20;
|
|---|
| 406 | print "$x and $::x\n";
|
|---|
| 407 |
|
|---|
| 408 | That will print out C<20> and C<10>.
|
|---|
| 409 |
|
|---|
| 410 | You may declare C<my> variables at the outermost scope of a file
|
|---|
| 411 | to hide any such identifiers from the world outside that file. This
|
|---|
| 412 | is similar in spirit to C's static variables when they are used at
|
|---|
| 413 | the file level. To do this with a subroutine requires the use of
|
|---|
| 414 | a closure (an anonymous function that accesses enclosing lexicals).
|
|---|
| 415 | If you want to create a private subroutine that cannot be called
|
|---|
| 416 | from outside that block, you can declare a lexical variable containing
|
|---|
| 417 | an anonymous sub reference:
|
|---|
| 418 |
|
|---|
| 419 | my $secret_version = '1.001-beta';
|
|---|
| 420 | my $secret_sub = sub { print $secret_version };
|
|---|
| 421 | &$secret_sub();
|
|---|
| 422 |
|
|---|
| 423 | As long as the reference is never returned by any function within the
|
|---|
| 424 | module, no outside module can see the subroutine, because its name is not in
|
|---|
| 425 | any package's symbol table. Remember that it's not I<REALLY> called
|
|---|
| 426 | C<$some_pack::secret_version> or anything; it's just $secret_version,
|
|---|
| 427 | unqualified and unqualifiable.
|
|---|
| 428 |
|
|---|
| 429 | This does not work with object methods, however; all object methods
|
|---|
| 430 | have to be in the symbol table of some package to be found. See
|
|---|
| 431 | L<perlref/"Function Templates"> for something of a work-around to
|
|---|
| 432 | this.
|
|---|
| 433 |
|
|---|
| 434 | =head2 Persistent Private Variables
|
|---|
| 435 | X<static> X<variable, persistent> X<variable, static> X<closure>
|
|---|
| 436 |
|
|---|
| 437 | Just because a lexical variable is lexically (also called statically)
|
|---|
| 438 | scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
|
|---|
| 439 | within a function it works like a C static. It normally works more
|
|---|
| 440 | like a C auto, but with implicit garbage collection.
|
|---|
| 441 |
|
|---|
| 442 | Unlike local variables in C or C++, Perl's lexical variables don't
|
|---|
| 443 | necessarily get recycled just because their scope has exited.
|
|---|
| 444 | If something more permanent is still aware of the lexical, it will
|
|---|
| 445 | stick around. So long as something else references a lexical, that
|
|---|
| 446 | lexical won't be freed--which is as it should be. You wouldn't want
|
|---|
| 447 | memory being freed until you were done using it, or kept around once you
|
|---|
| 448 | were done. Automatic garbage collection takes care of this for you.
|
|---|
| 449 |
|
|---|
| 450 | This means that you can pass back or save away references to lexical
|
|---|
| 451 | variables, whereas to return a pointer to a C auto is a grave error.
|
|---|
| 452 | It also gives us a way to simulate C's function statics. Here's a
|
|---|
| 453 | mechanism for giving a function private variables with both lexical
|
|---|
| 454 | scoping and a static lifetime. If you do want to create something like
|
|---|
| 455 | C's static variables, just enclose the whole function in an extra block,
|
|---|
| 456 | and put the static variable outside the function but in the block.
|
|---|
| 457 |
|
|---|
| 458 | {
|
|---|
| 459 | my $secret_val = 0;
|
|---|
| 460 | sub gimme_another {
|
|---|
| 461 | return ++$secret_val;
|
|---|
| 462 | }
|
|---|
| 463 | }
|
|---|
| 464 | # $secret_val now becomes unreachable by the outside
|
|---|
| 465 | # world, but retains its value between calls to gimme_another
|
|---|
| 466 |
|
|---|
| 467 | If this function is being sourced in from a separate file
|
|---|
| 468 | via C<require> or C<use>, then this is probably just fine. If it's
|
|---|
| 469 | all in the main program, you'll need to arrange for the C<my>
|
|---|
| 470 | to be executed early, either by putting the whole block above
|
|---|
| 471 | your main program, or more likely, placing merely a C<BEGIN>
|
|---|
| 472 | code block around it to make sure it gets executed before your program
|
|---|
| 473 | starts to run:
|
|---|
| 474 |
|
|---|
| 475 | BEGIN {
|
|---|
| 476 | my $secret_val = 0;
|
|---|
| 477 | sub gimme_another {
|
|---|
| 478 | return ++$secret_val;
|
|---|
| 479 | }
|
|---|
| 480 | }
|
|---|
| 481 |
|
|---|
| 482 | See L<perlmod/"BEGIN, CHECK, INIT and END"> about the
|
|---|
| 483 | special triggered code blocks, C<BEGIN>, C<CHECK>, C<INIT> and C<END>.
|
|---|
| 484 |
|
|---|
| 485 | If declared at the outermost scope (the file scope), then lexicals
|
|---|
| 486 | work somewhat like C's file statics. They are available to all
|
|---|
| 487 | functions in that same file declared below them, but are inaccessible
|
|---|
| 488 | from outside that file. This strategy is sometimes used in modules
|
|---|
| 489 | to create private variables that the whole module can see.
|
|---|
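A module-style sketch of that strategy, with hypothetical names (C<%cache>, C<remember>, C<recall>): the lexical is shared by the subroutines declared after it, yet it has no entry in any symbol table.

```perl
use strict;
use warnings;

my %cache;    # file-scoped lexical: shared by the subs below, invisible elsewhere

sub remember { my ($key, $value) = @_; $cache{$key} = $value; return }
sub recall   { my ($key) = @_; return $cache{$key} }

remember(colour => "blue");
print recall("colour"), "\n";    # prints: blue
```

Code in another file can reach the data only through C<remember()> and C<recall()>, which is what makes the variable private.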
| 490 |
|
|---|
| 491 | =head2 Temporary Values via local()
|
|---|
| 492 | X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
|
|---|
| 493 | X<variable, temporary>
|
|---|
| 494 |
|
|---|
| 495 | B<WARNING>: In general, you should be using C<my> instead of C<local>, because
|
|---|
| 496 | it's faster and safer. Exceptions to this include the global punctuation
|
|---|
| 497 | variables, global filehandles and formats, and direct manipulation of the
|
|---|
| 498 | Perl symbol table itself. C<local> is mostly used when the current value
|
|---|
| 499 | of a variable must be visible to called subroutines.
|
|---|
| 500 |
|
|---|
| 501 | Synopsis:
|
|---|
| 502 |
|
|---|
| 503 | # localization of values
|
|---|
| 504 |
|
|---|
| 505 | local $foo; # make $foo dynamically local
|
|---|
| 506 | local (@wid, %get); # make list of variables local
|
|---|
| 507 | local $foo = "flurp"; # make $foo dynamic, and init it
|
|---|
| 508 | local @oof = @bar; # make @oof dynamic, and init it
|
|---|
| 509 |
|
|---|
| 510 | local $hash{key} = "val"; # sets a local value for this hash entry
|
|---|
| 511 | local ($cond ? $v1 : $v2); # several types of lvalues support
|
|---|
| 512 | # localization
|
|---|
| 513 |
|
|---|
| 514 | # localization of symbols
|
|---|
| 515 |
|
|---|
| 516 | local *FH; # localize $FH, @FH, %FH, &FH ...
|
|---|
| 517 | local *merlyn = *randal; # now $merlyn is really $randal, plus
|
|---|
| 518 | # @merlyn is really @randal, etc
|
|---|
| 519 | local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
|
|---|
| 520 | local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
|
|---|
| 521 |
|
|---|
| 522 | A C<local> modifies its listed variables to be "local" to the
|
|---|
| 523 | enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
|
|---|
| 524 | called from within that block>. A C<local> just gives temporary
|
|---|
| 525 | values to global (meaning package) variables. It does I<not> create
|
|---|
| 526 | a local variable. This is known as dynamic scoping. Lexical scoping
|
|---|
| 527 | is done with C<my>, which works more like C's auto declarations.
|
|---|
| 528 |
|
|---|
| 529 | Some types of lvalues can be localized as well: hash and array elements
|
|---|
| 530 | and slices, conditionals (provided that their result is always
|
|---|
| 531 | localizable), and symbolic references. As for simple variables, this
|
|---|
| 532 | creates new, dynamically scoped values.
|
|---|
| 533 |
|
|---|
| 534 | If more than one variable or expression is given to C<local>, they must be
|
|---|
| 535 | placed in parentheses. This operator works
|
|---|
| 536 | by saving the current values of those variables in its argument list on a
|
|---|
| 537 | hidden stack and restoring them upon exiting the block, subroutine, or
|
|---|
| 538 | eval. This means that called subroutines can also reference the local
|
|---|
| 539 | variable, but not the global one. The argument list may be assigned to if
|
|---|
| 540 | desired, which allows you to initialize your local variables. (If no
|
|---|
| 541 | initializer is given for a particular variable, it is created with an
|
|---|
| 542 | undefined value.)
|
|---|
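The save-and-restore behaviour can be sketched as follows (C<$level>, C<report> and C<@log> are hypothetical names used only for this illustration):

```perl
use strict;
use warnings;

our $level = 0;                 # package variable

sub report { return "level=$level" }

my @log;
push @log, report();            # level=0
{
    local $level = 1;           # old value saved; temporary value installed
    push @log, report();        # level=1, seen by the called subroutine
}                               # old value restored on leaving the block
push @log, report();            # level=0 again

print join(" ", @log), "\n";    # prints: level=0 level=1 level=0
```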
| 543 |
|
|---|
| 544 | Because C<local> is a run-time operator, it gets executed each time
|
|---|
| 545 | through a loop. Consequently, it's more efficient to localize your
|
|---|
| 546 | variables outside the loop.
|
|---|
| 547 |
|
|---|
| 548 | =head3 Grammatical note on local()
|
|---|
| 549 | X<local, context>
|
|---|
| 550 |
|
|---|
| 551 | A C<local> is simply a modifier on an lvalue expression. When you assign to
|
|---|
| 552 | a C<local>ized variable, the C<local> doesn't change whether its list is viewed
|
|---|
| 553 | as a scalar or an array. So
|
|---|
| 554 |
|
|---|
| 555 | local($foo) = <STDIN>;
|
|---|
| 556 | local @FOO = <STDIN>;
|
|---|
| 557 |
|
|---|
| 558 | both supply a list context to the right-hand side, while
|
|---|
| 559 |
|
|---|
| 560 | local $foo = <STDIN>;
|
|---|
| 561 |
|
|---|
| 562 | supplies a scalar context.
|
|---|
| 563 |
|
|---|
| 564 | =head3 Localization of special variables
|
|---|
| 565 | X<local, special variable>
|
|---|
| 566 |
|
|---|
| 567 | If you localize a special variable, you'll be giving a new value to it,
|
|---|
| 568 | but its magic won't go away. That means that all side-effects related
|
|---|
| 569 | to this magic still work with the localized value.
|
|---|
| 570 |
|
|---|
| 571 | This feature allows code like this to work:
|
|---|
| 572 |
|
|---|
| 573 | # Read the whole contents of FILE in $slurp
|
|---|
| 574 | { local $/ = undef; $slurp = <FILE>; }
|
|---|
| 575 |
|
|---|
| 576 | Note, however, that this restricts localization of some values; for
|
|---|
| 577 | example, the following statement dies, as of perl 5.9.0, with an error
|
|---|
| 578 | I<Modification of a read-only value attempted>, because the $1 variable is
|
|---|
| 579 | magical and read-only:
|
|---|
| 580 |
|
|---|
| 581 | local $1 = 2;
|
|---|
| 582 |
|
|---|
| 583 | Similarly, but in a way more difficult to spot, the following snippet will
|
|---|
| 584 | die in perl 5.9.0:
|
|---|
| 585 |
|
|---|
| 586 | sub f { local $_ = "foo"; print }
|
|---|
| 587 | for ($1) {
|
|---|
| 588 | # now $_ is aliased to $1, thus is magic and readonly
|
|---|
| 589 | f();
|
|---|
| 590 | }
|
|---|
| 591 |
|
|---|
| 592 | See the next section for an alternative to this situation.
|
|---|
| 593 |
|
|---|
| 594 | B<WARNING>: Localization of tied arrays and hashes does not currently
|
|---|
| 595 | work as described.
|
|---|
| 596 | This will be fixed in a future release of Perl; in the meantime, avoid
|
|---|
| 597 | code that relies on any particular behaviour of localising tied arrays
|
|---|
| 598 | or hashes (localising individual elements is still okay).
|
|---|
| 599 | See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
|
|---|
| 600 | details.
|
|---|
| 601 | X<local, tie>
|
|---|
| 602 |
|
|---|
| 603 | =head3 Localization of globs
|
|---|
| 604 | X<local, glob> X<glob>
|
|---|
| 605 |
|
|---|
| 606 | The construct
|
|---|
| 607 |
|
|---|
| 608 | local *name;
|
|---|
| 609 |
|
|---|
| 610 | creates a whole new symbol table entry for the glob C<name> in the
|
|---|
| 611 | current package. That means that all variables in its glob slot ($name,
|
|---|
| 612 | @name, %name, &name, and the C<name> filehandle) are dynamically reset.
|
|---|
| 613 |
|
|---|
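A small sketch of this reset, using hypothetical package variables that all
share the glob C<*name>. Inside the block every slot starts out empty; the
original contents are back once the block is left:

```perl
use strict;
use warnings;

# Hypothetical package variables that share the glob *name.
our $name = "scalar";
our @name = (1, 2, 3);
our %name = (key => 'value');

my ($scalar_defined, $array_size, $hash_size);
{
    local *name;                       # fresh, empty glob for this scope
    $scalar_defined = defined $name ? 1 : 0;
    $array_size     = scalar @name;
    $hash_size      = scalar keys %name;
}
# After the block the original slots are visible again.
my $after = "$name / @name / $name{key}";
```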
| 614 | This implies, among other things, that any magic possibly carried by
|
|---|
| 615 | those variables is locally lost. In other words, saying C<local */>
|
|---|
| 616 | will not have any effect on the internal value of the input record
|
|---|
| 617 | separator.
|
|---|
| 618 |
|
|---|
| 619 | Notably, if you want to work with a brand new value of the default scalar
|
|---|
| 620 | $_, and avoid the potential problem listed above about $_ previously
|
|---|
| 621 | carrying a magic value, you should use C<local *_> instead of C<local $_>.
|
|---|
| 622 |
|
|---|
| 623 | =head3 Localization of elements of composite types
|
|---|
| 624 | X<local, composite type element> X<local, array element> X<local, hash element>
|
|---|
| 625 |
|
|---|
| 626 | It's also worth taking a moment to explain what happens when you
|
|---|
| 627 | C<local>ize a member of a composite type (i.e. an array or hash element).
|
|---|
| 628 | In this case, the element is C<local>ized I<by name>. This means that
|
|---|
| 629 | when the scope of the C<local()> ends, the saved value will be
|
|---|
| 630 | restored to the hash element whose key was named in the C<local()>, or
|
|---|
| 631 | the array element whose index was named in the C<local()>. If that
|
|---|
| 632 | element was deleted while the C<local()> was in effect (e.g. by a
|
|---|
| 633 | C<delete()> from a hash or a C<shift()> of an array), it will spring
|
|---|
| 634 | back into existence, possibly extending an array and filling in the
|
|---|
| 635 | skipped elements with C<undef>. For instance, if you say
|
|---|
| 636 |
|
|---|
| 637 | %hash = ( 'This' => 'is', 'a' => 'test' );
|
|---|
| 638 | @ary = ( 0..5 );
|
|---|
| 639 | {
|
|---|
| 640 | local($ary[5]) = 6;
|
|---|
| 641 | local($hash{'a'}) = 'drill';
|
|---|
| 642 | while (my $e = pop(@ary)) {
|
|---|
| 643 | print "$e . . .\n";
|
|---|
| 644 | last unless $e > 3;
|
|---|
| 645 | }
|
|---|
| 646 | if (@ary) {
|
|---|
| 647 | $hash{'only a'} = 'test';
|
|---|
| 648 | delete $hash{'a'};
|
|---|
| 649 | }
|
|---|
| 650 | }
|
|---|
| 651 | print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
|
|---|
| 652 | print "The array has ",scalar(@ary)," elements: ",
|
|---|
| 653 | join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
|
|---|
| 654 |
|
|---|
| 655 | Perl will print
|
|---|
| 656 |
|
|---|
| 657 | 6 . . .
|
|---|
| 658 | 4 . . .
|
|---|
| 659 | 3 . . .
|
|---|
| 660 | This is a test only a test.
|
|---|
| 661 | The array has 6 elements: 0, 1, 2, undef, undef, 5
|
|---|
| 662 |
|
|---|
| 663 | The behavior of local() on non-existent members of composite
|
|---|
| 664 | types is subject to change in future.
|
|---|
| 665 |
|
|---|
| 666 | =head2 Lvalue subroutines
|
|---|
| 667 | X<lvalue> X<subroutine, lvalue>
|
|---|
| 668 |
|
|---|
| 669 | B<WARNING>: Lvalue subroutines are still experimental and the
|
|---|
| 670 | implementation may change in future versions of Perl.
|
|---|
| 671 |
|
|---|
| 672 | It is possible to return a modifiable value from a subroutine.
|
|---|
| 673 | To do this, you have to declare the subroutine to return an lvalue.
|
|---|
| 674 |
|
|---|
| 675 | my $val;
|
|---|
| 676 | sub canmod : lvalue {
|
|---|
| 677 | # return $val; this doesn't work, don't say "return"
|
|---|
| 678 | $val;
|
|---|
| 679 | }
|
|---|
| 680 | sub nomod {
|
|---|
| 681 | $val;
|
|---|
| 682 | }
|
|---|
| 683 |
|
|---|
| 684 | canmod() = 5; # assigns to $val
|
|---|
| 685 | nomod() = 5; # ERROR
|
|---|
| 686 |
|
|---|
| 687 | The scalar/list context for the subroutine and for the right-hand
|
|---|
| 688 | side of assignment is determined as if the subroutine call is replaced
|
|---|
| 689 | by a scalar. For example, consider:
|
|---|
| 690 |
|
|---|
| 691 | data(2,3) = get_data(3,4);
|
|---|
| 692 |
|
|---|
| 693 | Both subroutines here are called in a scalar context, while in:
|
|---|
| 694 |
|
|---|
| 695 | (data(2,3)) = get_data(3,4);
|
|---|
| 696 |
|
|---|
| 697 | and in:
|
|---|
| 698 |
|
|---|
| 699 | (data(2),data(3)) = get_data(3,4);
|
|---|
| 700 |
|
|---|
| 701 | all the subroutines are called in a list context.
|
|---|
| 702 |
|
|---|
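One way to observe these contexts is to have each subroutine record what
C<wantarray> reports. This is only a sketch: C<data> and C<get_data> are
hypothetical stand-ins for the subroutines named above, and C<$slot> is an
assumed piece of storage behind the lvalue sub.

```perl
use strict;
use warnings;

my $slot;    # storage behind the hypothetical lvalue sub
my @seen;    # contexts recorded by each call

sub data : lvalue {
    push @seen, wantarray ? 'list' : 'scalar';
    $slot;                       # last expression is the lvalue
}
sub get_data {
    push @seen, wantarray ? 'list' : 'scalar';
    return 42;
}

data() = get_data();             # scalar assignment
my @scalar_case = @seen;
@seen = ();

(data()) = get_data();           # list assignment
my @list_case = @seen;
```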
| 703 | =over 4
|
|---|
| 704 |
|
|---|
| 705 | =item Lvalue subroutines are EXPERIMENTAL
|
|---|
| 706 |
|
|---|
| 707 | They appear to be convenient, but there are several reasons to be
|
|---|
| 708 | circumspect.
|
|---|
| 709 |
|
|---|
| 710 | You can't use the C<return> keyword; you must pass out the value before
|
|---|
| 711 | falling out of subroutine scope (see the comment in the example above). This
|
|---|
| 712 | is usually not a problem, but it disallows an explicit return out of a
|
|---|
| 713 | deeply nested loop, which is sometimes a nice way out.
|
|---|
| 714 |
|
|---|
| 715 | They violate encapsulation. A normal mutator can check the supplied
|
|---|
| 716 | argument before setting the attribute it is protecting; an lvalue
|
|---|
| 717 | subroutine never gets that chance. Consider:
|
|---|
| 718 |
|
|---|
| 719 | my $some_array_ref = []; # protected by mutators ??
|
|---|
| 720 |
|
|---|
| 721 | sub set_arr { # normal mutator
|
|---|
| 722 | my $val = shift;
|
|---|
| 723 | die("expected array, you supplied ", ref $val)
|
|---|
| 724 | unless ref $val eq 'ARRAY';
|
|---|
| 725 | $some_array_ref = $val;
|
|---|
| 726 | }
|
|---|
| 727 | sub set_arr_lv : lvalue { # lvalue mutator
|
|---|
| 728 | $some_array_ref;
|
|---|
| 729 | }
|
|---|
| 730 |
|
|---|
| 731 | # set_arr_lv cannot stop this !
|
|---|
| 732 | set_arr_lv() = { a => 1 };
|
|---|
| 733 |
|
|---|
| 734 | =back
|
|---|
| 735 |
|
|---|
| 736 | =head2 Passing Symbol Table Entries (typeglobs)
|
|---|
| 737 | X<typeglob> X<*>
|
|---|
| 738 |
|
|---|
| 739 | B<WARNING>: The mechanism described in this section was originally
|
|---|
| 740 | the only way to simulate pass-by-reference in older versions of
|
|---|
| 741 | Perl. While it still works fine in modern versions, the new reference
|
|---|
| 742 | mechanism is generally easier to work with. See below.
|
|---|
| 743 |
|
|---|
| 744 | Sometimes you don't want to pass the value of an array to a subroutine
|
|---|
| 745 | but rather the name of it, so that the subroutine can modify the global
|
|---|
| 746 | copy of it rather than working with a local copy. In perl you can
|
|---|
| 747 | refer to all objects of a particular name by prefixing the name
|
|---|
| 748 | with a star: C<*foo>. This is often known as a "typeglob", because the
|
|---|
| 749 | star on the front can be thought of as a wildcard match for all the
|
|---|
| 750 | funny prefix characters on variables and subroutines and such.
|
|---|
| 751 |
|
|---|
| 752 | When evaluated, the typeglob produces a scalar value that represents
|
|---|
| 753 | all the objects of that name, including any filehandle, format, or
|
|---|
| 754 | subroutine. When assigned to, it causes the name mentioned to refer to
|
|---|
| 755 | whatever C<*> value was assigned to it. Example:
|
|---|
| 756 |
|
|---|
| 757 | sub doubleary {
|
|---|
| 758 | local(*someary) = @_;
|
|---|
| 759 | foreach $elem (@someary) {
|
|---|
| 760 | $elem *= 2;
|
|---|
| 761 | }
|
|---|
| 762 | }
|
|---|
| 763 | doubleary(*foo);
|
|---|
| 764 | doubleary(*bar);
|
|---|
| 765 |
|
|---|
| 766 | Scalars are already passed by reference, so you can modify
|
|---|
| 767 | scalar arguments without using this mechanism by referring explicitly
|
|---|
| 768 | to C<$_[0]> etc. You can modify all the elements of an array by passing
|
|---|
| 769 | all the elements as scalars, but you have to use the C<*> mechanism (or
|
|---|
| 770 | the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
|
|---|
| 771 | an array. It will certainly be faster to pass the typeglob (or reference).
|
|---|
| 772 |
|
|---|
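To illustrate changing the size of a caller's array, here is a sketch that
does the same C<push> both ways. The names C<append_glob>, C<append_ref>,
C<@alias> and C<@queue> are invented for the example.

```perl
use strict;
use warnings;

our @queue = (1, 2);

# Typeglob style: alias a package array for the duration of the call.
sub append_glob {
    our @alias;                  # hypothetical name, used only as an alias
    local *alias = shift;        # @alias is now the caller's array
    push @alias, @_;
}

# Reference style: the same effect without touching the symbol table.
sub append_ref {
    my $aref = shift;
    push @$aref, @_;
}

append_glob(*queue, 3);
append_ref(\@queue, 4);
my $contents = "@queue";
```

The typeglob form needs a package array on the calling side, whereas the
reference form also accepts C<my> arrays.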
| 773 | Even if you don't want to modify an array, this mechanism is useful for
|
|---|
| 774 | passing multiple arrays in a single LIST, because normally the LIST
|
|---|
| 775 | mechanism will merge all the array values so that you can't extract out
|
|---|
| 776 | the individual arrays. For more on typeglobs, see
|
|---|
| 777 | L<perldata/"Typeglobs and Filehandles">.
|
|---|
| 778 |
|
|---|
| 779 | =head2 When to Still Use local()
|
|---|
| 780 | X<local> X<variable, local>
|
|---|
| 781 |
|
|---|
| 782 | Despite the existence of C<my>, there are still three places where the
|
|---|
| 783 | C<local> operator shines. In fact, in these three places, you
|
|---|
| 784 | I<must> use C<local> instead of C<my>.
|
|---|
| 785 |
|
|---|
| 786 | =over 4
|
|---|
| 787 |
|
|---|
| 788 | =item 1.
|
|---|
| 789 |
|
|---|
| 790 | You need to give a global variable a temporary value, especially $_.
|
|---|
| 791 |
|
|---|
| 792 | The global variables, like C<@ARGV> or the punctuation variables, must be
|
|---|
| 793 | C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits
|
|---|
| 794 | it up into chunks separated by lines of equal signs, which are placed
|
|---|
| 795 | in C<@Fields>.
|
|---|
| 796 |
|
|---|
| 797 | {
|
|---|
| 798 | local @ARGV = ("/etc/motd");
|
|---|
| 799 | local $/ = undef;
|
|---|
| 800 | local $_ = <>;
|
|---|
| 801 | @Fields = split /^\s*=+\s*$/;
|
|---|
| 802 | }
|
|---|
| 803 |
|
|---|
| 804 | In particular, it's important to C<local>ize $_ in any routine that assigns
|
|---|
| 805 | to it. Look out for implicit assignments in C<while> conditionals.
|
|---|
| 806 |
|
|---|
| 807 | =item 2.
|
|---|
| 808 |
|
|---|
| 809 | You need to create a local file or directory handle or a local function.
|
|---|
| 810 |
|
|---|
| 811 | A function that needs a filehandle of its own must use
|
|---|
| 812 | C<local()> on a complete typeglob. This can be used to create new symbol
|
|---|
| 813 | table entries:
|
|---|
| 814 |
|
|---|
| 815 | sub ioqueue {
|
|---|
| 816 | local (*READER, *WRITER); # not my!
|
|---|
| 817 | pipe (READER, WRITER) or die "pipe: $!";
|
|---|
| 818 | return (*READER, *WRITER);
|
|---|
| 819 | }
|
|---|
| 820 | ($head, $tail) = ioqueue();
|
|---|
| 821 |
|
|---|
| 822 | See the Symbol module for a way to create anonymous symbol table
|
|---|
| 823 | entries.
|
|---|
| 824 |
|
|---|
| 825 | Because assignment of a reference to a typeglob creates an alias, this
|
|---|
| 826 | can be used to create what is effectively a local function, or at least,
|
|---|
| 827 | a local alias.
|
|---|
| 828 |
|
|---|
| 829 | {
|
|---|
| 830 | local *grow = \&shrink; # only until this block exits
|
|---|
| 831 | grow(); # really calls shrink()
|
|---|
| 832 | move(); # if move() grow()s, it shrink()s too
|
|---|
| 833 | }
|
|---|
| 834 | grow(); # get the real grow() again
|
|---|
| 835 |
|
|---|
| 836 | See L<perlref/"Function Templates"> for more about manipulating
|
|---|
| 837 | functions by name in this way.
|
|---|
| 838 |
|
|---|
| 839 | =item 3.
|
|---|
| 840 |
|
|---|
| 841 | You want to temporarily change just one element of an array or hash.
|
|---|
| 842 |
|
|---|
| 843 | You can C<local>ize just one element of an aggregate. Usually this
|
|---|
| 844 | is done on dynamics:
|
|---|
| 845 |
|
|---|
| 846 | {
|
|---|
| 847 | local $SIG{INT} = 'IGNORE';
|
|---|
| 848 | funct(); # uninterruptible
|
|---|
| 849 | }
|
|---|
| 850 | # interruptibility automatically restored here
|
|---|
| 851 |
|
|---|
| 852 | But it also works on lexically declared aggregates. Prior to 5.005,
|
|---|
| 853 | this operation could on occasion misbehave.
|
|---|
| 854 |
|
|---|
| 855 | =back
|
|---|
| 856 |
|
|---|
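The third case applies to lexically declared aggregates as well. The sketch
below uses a hypothetical C<%option> hash and C<@level> array declared with
C<my>; only the named elements are saved and restored, and code called from
inside the block sees the temporary values.

```perl
use strict;
use warnings;

my %option = (verbose => 0, color => 1);   # hypothetical settings
my @level  = (10, 20, 30);

sub report { return "verbose=$option{verbose} level=$level[1]" }

my $inside;
{
    local $option{verbose} = 1;    # only this element is saved
    local $level[1]        = 99;
    $inside = report();            # called code sees the temporary values
}
my $outside = report();            # both elements are restored here
```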
| 857 | =head2 Pass by Reference
|
|---|
| 858 | X<pass by reference> X<pass-by-reference> X<reference>
|
|---|
| 859 |
|
|---|
| 860 | If you want to pass more than one array or hash into a function--or
|
|---|
| 861 | return them from it--and have them maintain their integrity, then
|
|---|
| 862 | you're going to have to use an explicit pass-by-reference. Before you
|
|---|
| 863 | do that, you need to understand references as detailed in L<perlref>.
|
|---|
| 864 | This section may not make much sense to you otherwise.
|
|---|
| 865 |
|
|---|
| 866 | Here are a few simple examples. First, let's pass in several arrays
|
|---|
| 867 | to a function and have it C<pop> all of them, returning a new list
|
|---|
| 868 | of all their former last elements:
|
|---|
| 869 |
|
|---|
| 870 | @tailings = popmany ( \@a, \@b, \@c, \@d );
|
|---|
| 871 |
|
|---|
| 872 | sub popmany {
|
|---|
| 873 | my $aref;
|
|---|
| 874 | my @retlist = ();
|
|---|
| 875 | foreach $aref ( @_ ) {
|
|---|
| 876 | push @retlist, pop @$aref;
|
|---|
| 877 | }
|
|---|
| 878 | return @retlist;
|
|---|
| 879 | }
|
|---|
| 880 |
|
|---|
| 881 | Here's how you might write a function that returns a
|
|---|
| 882 | list of keys occurring in all the hashes passed to it:
|
|---|
| 883 |
|
|---|
| 884 | @common = inter( \%foo, \%bar, \%joe );
|
|---|
| 885 | sub inter {
|
|---|
| 886 | my ($k, $href, %seen); # locals
|
|---|
| 887 | foreach $href (@_) {
|
|---|
| 888 | while ( $k = each %$href ) {
|
|---|
| 889 | $seen{$k}++;
|
|---|
| 890 | }
|
|---|
| 891 | }
|
|---|
| 892 | return grep { $seen{$_} == @_ } keys %seen;
|
|---|
| 893 | }
|
|---|
| 894 |
|
|---|
| 895 | So far, we're using just the normal list return mechanism.
|
|---|
| 896 | What happens if you want to pass or return a hash? Well,
|
|---|
| 897 | if you're using only one of them, or you don't mind them
|
|---|
| 898 | concatenating, then the normal calling convention is ok, although
|
|---|
| 899 | a little expensive.
|
|---|
| 900 |
|
|---|
| 901 | Where people get into trouble is here:
|
|---|
| 902 |
|
|---|
| 903 | (@a, @b) = func(@c, @d);
|
|---|
| 904 | or
|
|---|
| 905 | (%a, %b) = func(%c, %d);
|
|---|
| 906 |
|
|---|
| 907 | That syntax simply won't work. It sets just C<@a> or C<%a> and
|
|---|
| 908 | clears the C<@b> or C<%b>. Plus the function didn't get passed
|
|---|
| 909 | into two separate arrays or hashes: it got one long list in C<@_>,
|
|---|
| 910 | as always.
|
|---|
| 911 |
|
|---|
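The flattening is easy to see in a short sketch. Here C<func> is a
hypothetical subroutine that simply returns its arguments and records how
many it received:

```perl
use strict;
use warnings;

my $arg_count;
sub func {
    $arg_count = @_;     # the callee sees one flat list
    return @_;
}

my @c = (1, 2);
my @d = (3, 4, 5);
my (@a, @b) = func(@c, @d);

my $a_size = @a;         # every returned element lands in @a
my $b_size = @b;         # nothing is left over for @b
```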
| 912 | If you can arrange for everyone to deal with this through references, it's
|
|---|
| 913 | cleaner code, although not so nice to look at. Here's a function that
|
|---|
| 914 | takes two array references as arguments, returning the two array references
|
|---|
| 915 | in order of how many elements they have in them:
|
|---|
| 916 |
|
|---|
| 917 | ($aref, $bref) = func(\@c, \@d);
|
|---|
| 918 | print "@$aref has more than @$bref\n";
|
|---|
| 919 | sub func {
|
|---|
| 920 | my ($cref, $dref) = @_;
|
|---|
| 921 | if (@$cref > @$dref) {
|
|---|
| 922 | return ($cref, $dref);
|
|---|
| 923 | } else {
|
|---|
| 924 | return ($dref, $cref);
|
|---|
| 925 | }
|
|---|
| 926 | }
|
|---|
| 927 |
|
|---|
| 928 | It turns out that you can actually do this also:
|
|---|
| 929 |
|
|---|
| 930 | (*a, *b) = func(\@c, \@d);
|
|---|
| 931 | print "@a has more than @b\n";
|
|---|
| 932 | sub func {
|
|---|
| 933 | local (*c, *d) = @_;
|
|---|
| 934 | if (@c > @d) {
|
|---|
| 935 | return (\@c, \@d);
|
|---|
| 936 | } else {
|
|---|
| 937 | return (\@d, \@c);
|
|---|
| 938 | }
|
|---|
| 939 | }
|
|---|
| 940 |
|
|---|
| 941 | Here we're using the typeglobs to do symbol table aliasing. It's
|
|---|
| 942 | a tad subtle, though, and also won't work if you're using C<my>
|
|---|
| 943 | variables, because only globals (even in disguise as C<local>s)
|
|---|
| 944 | are in the symbol table.
|
|---|
| 945 |
|
|---|
| 946 | If you're passing around filehandles, you could usually just use the bare
|
|---|
| 947 | typeglob, like C<*STDOUT>, but typeglob references work, too.
|
|---|
| 948 | For example:
|
|---|
| 949 |
|
|---|
| 950 | splutter(\*STDOUT);
|
|---|
| 951 | sub splutter {
|
|---|
| 952 | my $fh = shift;
|
|---|
| 953 | print $fh "her um well a hmmm\n";
|
|---|
| 954 | }
|
|---|
| 955 |
|
|---|
| 956 | $rec = get_rec(\*STDIN);
|
|---|
| 957 | sub get_rec {
|
|---|
| 958 | my $fh = shift;
|
|---|
| 959 | return scalar <$fh>;
|
|---|
| 960 | }
|
|---|
| 961 |
|
|---|
| 962 | If you're planning on generating new filehandles, you could do this.
|
|---|
| 963 | Note that you pass back just the bare *FH, not its reference.
|
|---|
| 964 |
|
|---|
| 965 | sub openit {
|
|---|
| 966 | my $path = shift;
|
|---|
| 967 | local *FH;
|
|---|
| 968 | return open (FH, $path) ? *FH : undef;
|
|---|
| 969 | }
|
|---|
| 970 |
|
|---|
| 971 | =head2 Prototypes
|
|---|
| 972 | X<prototype> X<subroutine, prototype>
|
|---|
| 973 |
|
|---|
| 974 | Perl supports a very limited kind of compile-time argument checking
|
|---|
| 975 | using function prototyping. If you declare
|
|---|
| 976 |
|
|---|
| 977 | sub mypush (\@@)
|
|---|
| 978 |
|
|---|
| 979 | then C<mypush()> takes arguments exactly like C<push()> does. The
|
|---|
| 980 | function declaration must be visible at compile time. The prototype
|
|---|
| 981 | affects only interpretation of new-style calls to the function,
|
|---|
| 982 | where new-style is defined as not using the C<&> character. In
|
|---|
| 983 | other words, if you call it like a built-in function, then it behaves
|
|---|
| 984 | like a built-in function. If you call it like an old-fashioned
|
|---|
| 985 | subroutine, then it behaves like an old-fashioned subroutine. It
|
|---|
| 986 | naturally falls out from this rule that prototypes have no influence
|
|---|
| 987 | on subroutine references like C<\&foo> or on indirect subroutine
|
|---|
| 988 | calls like C<&{$subref}> or C<< $subref->() >>.
|
|---|
| 989 |
|
|---|
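A possible body for C<mypush()>, given as a sketch rather than a definitive
implementation, shows the three calling styles side by side. The array
C<@stack> is invented for the example.

```perl
use strict;
use warnings;

sub mypush (\@@) {
    my $aref = shift;            # the prototype turned the array into \@array
    push @$aref, @_;
    return scalar @$aref;
}

my @stack = (1);
mypush @stack, 2, 3;             # new-style call: the prototype applies
&mypush(\@stack, 4);             # old-style call: pass the reference yourself
my $code = \&mypush;
$code->(\@stack, 5);             # indirect call: the prototype is ignored too
my $contents = "@stack";
```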
| 990 | Method calls are not influenced by prototypes either, because the
|
|---|
| 991 | function to be called is indeterminate at compile time, since
|
|---|
| 992 | the exact code called depends on inheritance.
|
|---|
| 993 |
|
|---|
| 994 | Because the intent of this feature is primarily to let you define
|
|---|
| 995 | subroutines that work like built-in functions, here are prototypes
|
|---|
| 996 | for some other functions that parse almost exactly like the
|
|---|
| 997 | corresponding built-in.
|
|---|
| 998 |
|
|---|
| 999 | Declared as Called as
|
|---|
| 1000 |
|
|---|
| 1001 | sub mylink ($$) mylink $old, $new
|
|---|
| 1002 | sub myvec ($$$) myvec $var, $offset, 1
|
|---|
| 1003 | sub myindex ($$;$) myindex &getstring, "substr"
|
|---|
| 1004 | sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
|
|---|
| 1005 | sub myreverse (@) myreverse $a, $b, $c
|
|---|
| 1006 | sub myjoin ($@) myjoin ":", $a, $b, $c
|
|---|
| 1007 | sub mypop (\@) mypop @array
|
|---|
| 1008 | sub mysplice (\@$$@) mysplice @array, @array, 0, @pushme
|
|---|
| 1009 | sub mykeys (\%) mykeys %{$hashref}
|
|---|
| 1010 | sub myopen (*;$) myopen HANDLE, $name
|
|---|
| 1011 | sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
|
|---|
| 1012 | sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
|
|---|
| 1013 | sub myrand ($) myrand 42
|
|---|
| 1014 | sub mytime () mytime
|
|---|
| 1015 |
|
|---|
| 1016 | Any backslashed prototype character represents an actual argument
|
|---|
| 1017 | that absolutely must start with that character. The value passed
|
|---|
| 1018 | as part of C<@_> will be a reference to the actual argument given
|
|---|
| 1019 | in the subroutine call, obtained by applying C<\> to that argument.
|
|---|
| 1020 |
|
|---|
| 1021 | You can also backslash several argument types simultaneously by using
|
|---|
| 1022 | the C<\[]> notation:
|
|---|
| 1023 |
|
|---|
| 1024 | sub myref (\[$@%&*])
|
|---|
| 1025 |
|
|---|
| 1026 | will allow calling myref() as
|
|---|
| 1027 |
|
|---|
| 1028 | myref $var
|
|---|
| 1029 | myref @array
|
|---|
| 1030 | myref %hash
|
|---|
| 1031 | myref &sub
|
|---|
| 1032 | myref *glob
|
|---|
| 1033 |
|
|---|
| 1034 | and the first argument of myref() will be a reference to
|
|---|
| 1035 | a scalar, an array, a hash, a subroutine, or a glob.
|
|---|
| 1036 |
|
|---|
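As a sketch, a C<myref()> that just reports what kind of reference it was
handed makes the effect visible. The variables and the empty sub C<thing>
are hypothetical, and the calls use parentheses only to keep the list of
calls unambiguous.

```perl
use strict;
use warnings;

sub myref (\[$@%&*]) { return ref $_[0] }

my $var = 1;
my @array;
my %hash;
sub thing { }                    # hypothetical sub, never called here

my @kinds = (
    myref($var),                 # reference to a scalar
    myref(@array),               # reference to an array
    myref(%hash),                # reference to a hash
    myref(&thing),               # reference to the sub
    myref(*STDOUT),              # reference to a glob
);
my $kinds = "@kinds";
```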
| 1037 | Unbackslashed prototype characters have special meanings. Any
|
|---|
| 1038 | unbackslashed C<@> or C<%> eats all remaining arguments, and forces
|
|---|
| 1039 | list context. An argument represented by C<$> forces scalar context. An
|
|---|
| 1040 | C<&> requires an anonymous subroutine, which, if passed as the first
|
|---|
| 1041 | argument, does not require the C<sub> keyword or a subsequent comma.
|
|---|
| 1042 |
|
|---|
| 1043 | A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
|
|---|
| 1044 | typeglob, or a reference to a typeglob in that slot. The value will be
|
|---|
| 1045 | available to the subroutine either as a simple scalar, or (in the latter
|
|---|
| 1046 | two cases) as a reference to the typeglob. If you wish to always convert
|
|---|
| 1047 | such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
|
|---|
| 1048 | follows:
|
|---|
| 1049 |
|
|---|
| 1050 | use Symbol 'qualify_to_ref';
|
|---|
| 1051 |
|
|---|
| 1052 | sub foo (*) {
|
|---|
| 1053 | my $fh = qualify_to_ref(shift, caller);
|
|---|
| 1054 | ...
|
|---|
| 1055 | }
|
|---|
| 1056 |
|
|---|
| 1057 | A semicolon separates mandatory arguments from optional arguments.
|
|---|
| 1058 | It is redundant before C<@> or C<%>, which gobble up everything else.
|
|---|
| 1059 |
|
|---|
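For instance, a C<myindex()> with the C<($$;$)> prototype from the table
above could be sketched like this. The body is an assumption; it simply
delegates to the built-in C<index> and defaults the optional position to 0.

```perl
use strict;
use warnings;

sub myindex ($$;$) {
    my ($string, $substr, $position) = @_;
    $position = 0 unless defined $position;   # third argument is optional
    return index($string, $substr, $position);
}

my $first = myindex("hello world", "o");      # two mandatory arguments
my $later = myindex("hello world", "o", 5);   # optional start position
```

Calling it with only one argument would be rejected at compile time.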
| 1060 | Note how the last three examples in the table above are treated
|
|---|
| 1061 | specially by the parser. C<mygrep()> is parsed as a true list
|
|---|
| 1062 | operator, C<myrand()> is parsed as a true unary operator with unary
|
|---|
| 1063 | precedence the same as C<rand()>, and C<mytime()> is truly without
|
|---|
| 1064 | arguments, just like C<time()>. That is, if you say
|
|---|
| 1065 |
|
|---|
| 1066 | mytime +2;
|
|---|
| 1067 |
|
|---|
| 1068 | you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
|
|---|
| 1069 | without a prototype.
|
|---|
| 1070 |
|
|---|
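The difference in parsing can be checked with two hypothetical subroutines
that only count their arguments, one declared with an empty prototype and
one without:

```perl
use strict;
use warnings;

sub no_args () { return scalar @_ }   # empty prototype: takes no arguments
sub any_args   { return scalar @_ }   # no prototype: ordinary list operator

my $with_proto    = no_args +2;       # parsed as no_args() + 2
my $without_proto = any_args +2;      # parsed as any_args(+2)
```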
| 1071 | The interesting thing about C<&> is that you can generate new syntax with it,
|
|---|
| 1072 | provided it's in the initial position:
|
|---|
| 1073 | X<&>
|
|---|
| 1074 |
|
|---|
| 1075 | sub try (&@) {
|
|---|
| 1076 | my($try,$catch) = @_;
|
|---|
| 1077 | eval { &$try };
|
|---|
| 1078 | if ($@) {
|
|---|
| 1079 | local $_ = $@;
|
|---|
| 1080 | &$catch;
|
|---|
| 1081 | }
|
|---|
| 1082 | }
|
|---|
| 1083 | sub catch (&) { $_[0] }
|
|---|
| 1084 |
|
|---|
| 1085 | try {
|
|---|
| 1086 | die "phooey";
|
|---|
| 1087 | } catch {
|
|---|
| 1088 | /phooey/ and print "unphooey\n";
|
|---|
| 1089 | };
|
|---|
| 1090 |
|
|---|
| 1091 | That prints C<"unphooey">. (Yes, there are still unresolved
|
|---|
| 1092 | issues having to do with visibility of C<@_>. I'm ignoring that
|
|---|
| 1093 | question for the moment. (But note that if we make C<@_> lexically
|
|---|
| 1094 | scoped, those anonymous subroutines can act like closures... (Gee,
|
|---|
| 1095 | is this sounding a little Lispish? (Never mind.))))
|
|---|
| 1096 |
|
|---|
| 1097 | And here's a reimplementation of the Perl C<grep> operator:
|
|---|
| 1098 | X<grep>
|
|---|
| 1099 |
|
|---|
| 1100 | sub mygrep (&@) {
|
|---|
| 1101 | my $code = shift;
|
|---|
| 1102 | my @result;
|
|---|
| 1103 | foreach $_ (@_) {
|
|---|
| 1104 | push(@result, $_) if &$code;
|
|---|
| 1105 | }
|
|---|
| 1106 | @result;
|
|---|
| 1107 | }
|
|---|
| 1108 |
|
|---|
| 1109 | Some folks would prefer full alphanumeric prototypes. Alphanumerics have
|
|---|
| 1110 | been intentionally left out of prototypes for the express purpose of
|
|---|
| 1111 | someday in the future adding named, formal parameters. The current
|
|---|
| 1112 | mechanism's main goal is to let module writers provide better diagnostics
|
|---|
| 1113 | for module users. Larry feels the notation quite understandable to Perl
|
|---|
| 1114 | programmers, and that it will not intrude greatly upon the meat of the
|
|---|
| 1115 | module, nor make it harder to read. The line noise is visually
|
|---|
| 1116 | encapsulated into a small pill that's easy to swallow.
|
|---|
| 1117 |
|
|---|
| 1118 | If you try to use an alphanumeric sequence in a prototype you will
|
|---|
| 1119 | generate an optional warning - "Illegal character in prototype...".
|
|---|
| 1120 | Unfortunately earlier versions of Perl allowed the prototype to be
|
|---|
| 1121 | used as long as its prefix was a valid prototype. The warning may be
|
|---|
| 1122 | upgraded to a fatal error in a future version of Perl once the
|
|---|
| 1123 | majority of offending code is fixed.
|
|---|
| 1124 |
|
|---|
| 1125 | It's probably best to prototype new functions, not retrofit prototyping
|
|---|
| 1126 | into older ones. That's because you must be especially careful about
|
|---|
| 1127 | silent impositions of differing list versus scalar contexts. For example,
|
|---|
| 1128 | if you decide that a function should take just one parameter, like this:
|
|---|
| 1129 |
|
|---|
| 1130 | sub func ($) {
|
|---|
| 1131 | my $n = shift;
|
|---|
| 1132 | print "you gave me $n\n";
|
|---|
| 1133 | }
|
|---|
| 1134 |
|
|---|
| 1135 | and someone has been calling it with an array or expression
|
|---|
| 1136 | returning a list:
|
|---|
| 1137 |
|
|---|
| 1138 | func(@foo);
|
|---|
| 1139 | func( split /:/ );
|
|---|
| 1140 |
|
|---|
| 1141 | Then you've just supplied an automatic C<scalar> in front of their
|
|---|
| 1142 | argument, which can be more than a bit surprising. The old C<@foo>
|
|---|
| 1143 | which used to hold one thing doesn't get passed in. Instead,
|
|---|
| 1144 | C<func()> now gets passed in a C<1>; that is, the number of elements
|
|---|
| 1145 | in C<@foo>. And the C<split> gets called in scalar context so it
|
|---|
| 1146 | starts scribbling on your C<@_> parameter list. Ouch!
|
|---|
| 1147 |
|
|---|
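A short sketch of that surprise, with C<func()> changed to return its
message instead of printing it so the two results can be compared. The
contents of C<@foo> are made up.

```perl
use strict;
use warnings;

sub func ($) {
    my $n = shift;
    return "you gave me $n";
}

my @foo = ('apple');
my $new_style = func(@foo);      # prototype imposes scalar context on @foo
my $old_style = &func(@foo);     # & bypasses the prototype
```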
| 1148 | This is all very powerful, of course, and should be used only in moderation
|
|---|
| 1149 | to make the world a better place.
|
|---|
| 1150 |
|
|---|
| 1151 | =head2 Constant Functions
|
|---|
| 1152 | X<constant>
|
|---|
| 1153 |
|
|---|
| 1154 | Functions with a prototype of C<()> are potential candidates for
|
|---|
| 1155 | inlining. If the result after optimization and constant folding
|
|---|
| 1156 | is either a constant or a lexically-scoped scalar which has no other
|
|---|
| 1157 | references, then it will be used in place of function calls made
|
|---|
| 1158 | without C<&>. Calls made using C<&> are never inlined. (See
|
|---|
| 1159 | F<constant.pm> for an easy way to declare most constants.)
|
|---|
| 1160 |
|
|---|
| 1161 | The following functions would all be inlined:
|
|---|
| 1162 |
|
|---|
| 1163 | sub pi () { 3.14159 } # Not exact, but close.
|
|---|
| 1164 | sub PI () { 4 * atan2 1, 1 } # As good as it gets,
|
|---|
| 1165 | # and it's inlined, too!
|
|---|
| 1166 | sub ST_DEV () { 0 }
|
|---|
| 1167 | sub ST_INO () { 1 }
|
|---|
| 1168 |
|
|---|
| 1169 | sub FLAG_FOO () { 1 << 8 }
|
|---|
| 1170 | sub FLAG_BAR () { 1 << 9 }
|
|---|
| 1171 | sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
|
|---|
| 1172 |
|
|---|
| 1173 | sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
|
|---|
| 1174 |
|
|---|
| 1175 | sub N () { int(OPT_BAZ) / 3 }
|
|---|
| 1176 |
|
|---|
| 1177 | sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
|
|---|
| 1178 |
|
|---|
| 1179 | Be aware that these will not be inlined; as they contain inner scopes,
|
|---|
| 1180 | the constant folding doesn't reduce them to a single constant:
|
|---|
| 1181 |
|
|---|
| 1182 | sub foo_set () { if (FLAG_MASK & FLAG_FOO) { 1 } }
|
|---|
| 1183 |
|
|---|
| 1184 | sub baz_val () {
|
|---|
| 1185 | if (OPT_BAZ) {
|
|---|
| 1186 | return 23;
|
|---|
| 1187 | }
|
|---|
| 1188 | else {
|
|---|
| 1189 | return 42;
|
|---|
| 1190 | }
|
|---|
| 1191 | }
|
|---|
| 1192 |
|
|---|
| 1193 | If you redefine a subroutine that was eligible for inlining, you'll get
|
|---|
| 1194 | a mandatory warning. (You can use this warning to tell whether or not a
|
|---|
| 1195 | particular subroutine is considered constant.) The warning is
|
|---|
| 1196 | considered severe enough not to be optional because previously compiled
|
|---|
| 1197 | invocations of the function will still be using the old value of the
|
|---|
| 1198 | function. If you need to be able to redefine the subroutine, you need to
|
|---|
| 1199 | ensure that it isn't inlined, either by dropping the C<()> prototype
|
|---|
| 1200 | (which changes calling semantics, so beware) or by thwarting the
|
|---|
| 1201 | inlining mechanism in some other way, such as
|
|---|
| 1202 |
|
|---|
| 1203 | sub not_inlined () {
|
|---|
| 1204 | 23 if $];
|
|---|
| 1205 | }
|
|---|
| 1206 |
|
|---|
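The effect of inlining on previously compiled calls can be seen in a sketch
like the following, where C<ANSWER> and C<compiled_early> are hypothetical
names. The redefinition happens at run time, after C<compiled_early> has
already been compiled.

```perl
use strict;
use warnings;

sub ANSWER () { 42 }
sub compiled_early { return ANSWER }   # the constant is inlined here

{
    no warnings 'redefine';
    *ANSWER = sub () { 43 };           # run-time redefinition
}

my $early = compiled_early();          # still uses the old, inlined value
my $late  = &ANSWER();                 # & calls are never inlined
```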
| 1207 | =head2 Overriding Built-in Functions
|
|---|
| 1208 | X<built-in> X<override> X<CORE> X<CORE::GLOBAL>
|
|---|
| 1209 |
|
|---|
| 1210 | Many built-in functions may be overridden, though this should be tried
|
|---|
| 1211 | only occasionally and for good reason. Typically this might be
|
|---|
| 1212 | done by a package attempting to emulate missing built-in functionality
|
|---|
| 1213 | on a non-Unix system.
|
|---|
| 1214 |
|
|---|
| 1215 | Overriding may be done only by importing the name from a module at
|
|---|
| 1216 | compile time--ordinary predeclaration isn't good enough. However, the
|
|---|
| 1217 | C<use subs> pragma lets you, in effect, predeclare subs
|
|---|
| 1218 | via the import syntax, and these names may then override built-in ones:
|
|---|
| 1219 |
|
|---|
| 1220 | use subs 'chdir', 'chroot', 'chmod', 'chown';
|
|---|
| 1221 | chdir $somewhere;
|
|---|
| 1222 | sub chdir { ... }
|
|---|
| 1223 |
|
|---|
| 1224 | To unambiguously refer to the built-in form, precede the
|
|---|
| 1225 | built-in name with the special package qualifier C<CORE::>. For example,
|
|---|
| 1226 | saying C<CORE::open()> always refers to the built-in C<open()>, even
|
|---|
| 1227 | if the current package has imported some other subroutine called
|
|---|
| 1228 | C<&open()> from elsewhere. Even though it looks like a regular
|
|---|
| 1229 | function call, it isn't: you can't take a reference to it, such as
|
|---|
| 1230 | the incorrect C<\&CORE::open> might appear to produce.
|
|---|
| 1231 |
|
|---|
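A sketch of both forms together, overriding C<time> in the current package
with a hypothetical frozen clock while still reaching the real built-in
through C<CORE::>:

```perl
use strict;
use warnings;
use subs 'time';             # predeclare via import so the override applies

sub time { return 0 }        # hypothetical frozen clock

my $fake = time();           # calls the override in this package
my $real = CORE::time();     # always the built-in
```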
| 1232 | Library modules should not in general export built-in names like C<open>
|
|---|
| 1233 | or C<chdir> as part of their default C<@EXPORT> list, because these may
|
|---|
| 1234 | sneak into someone else's namespace and change the semantics unexpectedly.
|
|---|
| 1235 | Instead, if the module adds that name to C<@EXPORT_OK>, then it's
|
|---|
| 1236 | possible for a user to import the name explicitly, but not implicitly.
|
|---|
| 1237 | That is, they could say
|
|---|
| 1238 |
|
|---|
| 1239 | use Module 'open';
|
|---|
| 1240 |
|
|---|
| 1241 | and it would import the C<open> override. But if they said
|
|---|
| 1242 |
|
|---|
| 1243 | use Module;
|
|---|
| 1244 |
|
|---|
| 1245 | they would get the default imports without overrides.
|
|---|
| 1246 |
|
|---|
| 1247 | The foregoing mechanism for overriding built-ins is restricted, quite
|
|---|
| 1248 | deliberately, to the package that requests the import. There is a second
|
|---|
| 1249 | method that is sometimes applicable when you wish to override a built-in
|
|---|
| 1250 | everywhere, without regard to namespace boundaries. This is achieved by
|
|---|
| 1251 | importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an
|
|---|
| 1252 | example that quite brazenly replaces the C<glob> operator with something
|
|---|
| 1253 | that understands regular expressions.
|
|---|
| 1254 |
|
|---|
| 1255 | package REGlob;
|
|---|
| 1256 | require Exporter;
|
|---|
| 1257 | @ISA = 'Exporter';
|
|---|
| 1258 | @EXPORT_OK = 'glob';
|
|---|
| 1259 |
|
|---|
| 1260 | sub import {
|
|---|
| 1261 | my $pkg = shift;
|
|---|
| 1262 | return unless @_;
|
|---|
| 1263 | my $sym = shift;
|
|---|
| 1264 | my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
|
|---|
| 1265 | $pkg->export($where, $sym, @_);
|
|---|
| 1266 | }
|
|---|
| 1267 |
|
|---|
| 1268 | sub glob {
|
|---|
| 1269 | my $pat = shift;
|
|---|
| 1270 | my @got;
|
|---|
| 1271 | local *D;
|
|---|
| 1272 | if (opendir D, '.') {
|
|---|
| 1273 | @got = grep /$pat/, readdir D;
|
|---|
| 1274 | closedir D;
|
|---|
| 1275 | }
|
|---|
| 1276 | return @got;
|
|---|
| 1277 | }
|
|---|
| 1278 | 1;
|
|---|
| 1279 |
|
|---|
| 1280 | And here's how it could be (ab)used:
|
|---|
| 1281 |
|
|---|
| 1282 | #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
|
|---|
| 1283 | package Foo;
|
|---|
| 1284 | use REGlob 'glob'; # override glob() in Foo:: only
|
|---|
| 1285 | print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
|
|---|
| 1286 |
|
|---|
| 1287 | The initial comment shows a contrived, even dangerous example.
|
|---|
| 1288 | By overriding C<glob> globally, you would be forcing the new (and
|
|---|
| 1289 | subversive) behavior for the C<glob> operator for I<every> namespace,
|
|---|
| 1290 | without the complete cognizance or cooperation of the modules that own
|
|---|
| 1291 | those namespaces. Naturally, this should be done with extreme caution--if
|
|---|
| 1292 | it must be done at all.
|
|---|
| 1293 |
|
|---|
| 1294 | The C<REGlob> example above does not implement all the support needed to
|
|---|
| 1295 | cleanly override perl's C<glob> operator. The built-in C<glob> has
|
|---|
| 1296 | different behaviors depending on whether it appears in a scalar or list
|
|---|
| 1297 | context, but our C<REGlob> doesn't. Indeed, many perl built-ins have such
|
|---|
| 1298 | context-sensitive behaviors, and these must be adequately supported by
|
|---|
| 1299 | a properly written override. For a fully functional example of overriding
|
|---|
| 1300 | C<glob>, study the implementation of C<File::DosGlob> in the standard
|
|---|
| 1301 | library.
|
|---|
| 1302 |
|
|---|
| 1303 | When you override a built-in, your replacement should be consistent (if
|
|---|
| 1304 | possible) with the built-in native syntax. You can achieve this by using
|
|---|
| 1305 | a suitable prototype. To get the prototype of an overridable built-in,
|
|---|
| 1306 | use the C<prototype> function with an argument of C<"CORE::builtin_name">
|
|---|
| 1307 | (see L<perlfunc/prototype>).
|
|---|
| 1308 |
|
|---|
| 1309 | Note however that some built-ins can't have their syntax expressed by a
|
|---|
| 1310 | prototype (such as C<system> or C<chomp>). If you override them you won't
|
|---|
| 1311 | be able to fully mimic their original syntax.
|
|---|
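As a minimal sketch of the check described above, the following asks C<prototype> about a few built-ins. The loop and its output format are illustrative only; the exact prototype strings can differ between perl versions, so none are assumed here. An undefined result means the built-in's syntax cannot be expressed as a prototype.

```perl
use strict;
use warnings;

# Ask perl for the prototype of a few built-ins.  prototype() returns
# undef when the built-in's syntax cannot be expressed as a prototype.
for my $name (qw(open system chomp)) {
    my $proto = prototype("CORE::$name");
    printf "%-8s %s\n", $name,
        defined $proto ? "prototype is ($proto)" : "no prototype available";
}
```

A replacement for a built-in that does report a prototype can be declared with that same string, so that calls to it parse the way calls to the original do.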
| 1312 |
|
|---|
| 1313 | The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
|
|---|
| 1314 | to special magic, their original syntax is preserved, and you don't have
|
|---|
| 1315 | to define a prototype for their replacements. (You can't override the
|
|---|
| 1316 | C<do BLOCK> syntax, though.)
|
|---|
| 1317 |
|
|---|
| 1318 | C<require> has special additional dark magic: if you invoke your
|
|---|
| 1319 | C<require> replacement as C<require Foo::Bar>, it will actually receive
|
|---|
| 1320 | the argument C<"Foo/Bar.pm"> in C<@_>. See L<perlfunc/require>.
|
|---|
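The bareword-to-filename translation described above can be observed with a small global override. This is only a sketch: the C<@requested> log is a hypothetical name introduced for illustration, and the wrapper ignores the C<require VERSION> form, which a real override would need to handle.

```perl
use strict;
use warnings;

our @requested;    # hypothetical log of what the override receives

BEGIN {
    # Install the override at compile time, before the require
    # statements it should intercept are compiled.
    no warnings 'once';
    *CORE::GLOBAL::require = sub {
        my ($file) = @_;
        push @requested, $file;
        return CORE::require($file);    # hand off to the real built-in
    };
}

require File::Basename;    # bareword form

# Shows the argument the override actually received.
print "override was called with: $requested[0]\n";
```

Because the override is installed in C<CORE::GLOBAL::>, it is seen by every namespace, so the cautions given earlier about global overrides apply to it as well.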
| 1321 |
|
|---|
| 1322 | And, as you'll have noticed from the previous example, if you override
|
|---|
| 1323 | C<glob>, the C<E<lt>*E<gt>> glob operator is overridden as well.
|
|---|
| 1324 |
|
|---|
| 1325 | In a similar fashion, overriding the C<readline> function also overrides
|
|---|
| 1326 | the equivalent I/O operator C<< <FILEHANDLE> >>.
|
|---|
| 1327 |
|
|---|
| 1328 | Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
|
|---|
| 1329 |
|
|---|
| 1330 | =head2 Autoloading
|
|---|
| 1331 | X<autoloading> X<AUTOLOAD>
|
|---|
| 1332 |
|
|---|
| 1333 | If you call a subroutine that is undefined, you would ordinarily
|
|---|
| 1334 | get an immediate, fatal error complaining that the subroutine doesn't
|
|---|
| 1335 | exist. (Likewise for subroutines being used as methods, when the
|
|---|
| 1336 | method doesn't exist in any base class of the class's package.)
|
|---|
| 1337 | However, if an C<AUTOLOAD> subroutine is defined in the package or
|
|---|
| 1338 | packages used to locate the original subroutine, then that
|
|---|
| 1339 | C<AUTOLOAD> subroutine is called with the arguments that would have
|
|---|
| 1340 | been passed to the original subroutine. The fully qualified name
|
|---|
| 1341 | of the original subroutine magically appears in the global C<$AUTOLOAD>
|
|---|
| 1342 | variable of the same package as the C<AUTOLOAD> routine. The name
|
|---|
| 1343 | is not passed as an ordinary argument because, er, well, just
|
|---|
| 1344 | because, that's why...
|
|---|
| 1345 |
|
|---|
| 1346 | Many C<AUTOLOAD> routines load in a definition for the requested
|
|---|
| 1347 | subroutine using eval(), then execute that subroutine using a special
|
|---|
| 1348 | form of goto() that erases the stack frame of the C<AUTOLOAD> routine
|
|---|
| 1349 | without a trace. (See the source to the standard module documented
|
|---|
| 1350 | in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can
|
|---|
| 1351 | also just emulate the routine and never define it. For example,
|
|---|
| 1352 | let's pretend that a function that wasn't defined should just invoke
|
|---|
| 1353 | C<system> with those arguments. All you'd do is:
|
|---|
| 1354 |
|
|---|
| 1355 | sub AUTOLOAD {
|
|---|
| 1356 | my $program = $AUTOLOAD;
|
|---|
| 1357 | $program =~ s/.*:://;
|
|---|
| 1358 | system($program, @_);
|
|---|
| 1359 | }
|
|---|
| 1360 | date();
|
|---|
| 1361 | who('am', 'i');
|
|---|
| 1362 | ls('-l');
|
|---|
| 1363 |
|
|---|
| 1364 | In fact, if you predeclare functions you want to call that way, you don't
|
|---|
| 1365 | even need parentheses:
|
|---|
| 1366 |
|
|---|
| 1367 | use subs qw(date who ls);
|
|---|
| 1368 | date;
|
|---|
| 1369 | who "am", "i";
|
|---|
| 1370 | ls '-l';
|
|---|
| 1371 |
|
|---|
| 1372 | A more complete example of this is the standard Shell module, which
|
|---|
| 1373 | can treat undefined subroutine calls as calls to external programs.
|
|---|
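The other style mentioned earlier, where C<AUTOLOAD> defines the missing routine and then jumps to it with C<goto>, might be sketched as follows. The C<Greeter> package, the C<say_*> naming scheme, and the C<%generated> counter are all hypothetical, chosen only to make the behavior visible. The sketch installs a closure rather than using C<eval>.

```perl
package Greeter;
use strict;
use warnings;

our $AUTOLOAD;     # set by perl to the fully qualified name
our %generated;    # hypothetical counter: how often each sub was built

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                 # strip the package qualifier
    return if $name eq 'DESTROY';      # never autoload destructors
    die "No such function: $name\n" unless $name =~ /^say_(\w+)$/;
    my $word = $1;
    $generated{$name}++;

    no strict 'refs';
    *{$AUTOLOAD} = sub { return "$word, $_[0]!" };    # define it for next time
    goto &{$AUTOLOAD};    # replace this frame; @_ is passed along unchanged
}

package main;

print Greeter::say_hello('world'), "\n";
print Greeter::say_hello('again'), "\n";    # now calls the installed sub directly
```

After the first call the generated subroutine exists in the symbol table, so later calls no longer go through C<AUTOLOAD> at all.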
| 1374 |
|
|---|
| 1375 | Mechanisms are available to help module writers split their modules
|
|---|
| 1376 | into autoloadable files. See the standard AutoLoader module
|
|---|
| 1377 | described in L<AutoLoader> and in L<AutoSplit>, the standard
|
|---|
| 1378 | SelfLoader module in L<SelfLoader>, and the document on adding C
|
|---|
| 1379 | functions to Perl code in L<perlxs>.
|
|---|
| 1380 |
|
|---|
| 1381 | =head2 Subroutine Attributes
|
|---|
| 1382 | X<attribute> X<subroutine, attribute> X<attrs>
|
|---|
| 1383 |
|
|---|
| 1384 | A subroutine declaration or definition may have a list of attributes
|
|---|
| 1385 | associated with it. If such an attribute list is present, it is
|
|---|
| 1386 | broken up at space or colon boundaries and treated as though a
|
|---|
| 1387 | C<use attributes> had been seen. See L<attributes> for details
|
|---|
| 1388 | about what attributes are currently supported.
|
|---|
| 1389 | Unlike the limitation with the obsolescent C<use attrs>, the
|
|---|
| 1390 | C<sub : ATTRLIST> syntax works to associate the attributes with
|
|---|
| 1391 | a pre-declaration, and not just with a subroutine definition.
|
|---|
| 1392 |
|
|---|
| 1393 | The attributes must be valid as simple identifier names (without any
|
|---|
| 1394 | punctuation other than the '_' character). They may have a parameter
|
|---|
| 1395 | list appended, which is only checked for whether its parentheses ('(',')')
|
|---|
| 1396 | nest properly.
|
|---|
| 1397 |
|
|---|
| 1398 | Examples of valid syntax (even though the attributes are unknown):
|
|---|
| 1399 |
|
|---|
| 1400 | sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
|
|---|
| 1401 | sub plugh () : Ugly('\(") :Bad;
|
|---|
| 1402 | sub xyzzy : _5x5 { ... }
|
|---|
| 1403 |
|
|---|
| 1404 | Examples of invalid syntax:
|
|---|
| 1405 |
|
|---|
| 1406 | sub fnord : switch(10,foo(); # ()-string not balanced
|
|---|
| 1407 | sub snoid : Ugly('('); # ()-string not balanced
|
|---|
| 1408 | sub xyzzy : 5x5; # "5x5" not a valid identifier
|
|---|
| 1409 | sub plugh : Y2::north; # "Y2::north" not a simple identifier
|
|---|
| 1410 | sub snurt : foo + bar; # "+" not a colon or space
|
|---|
| 1411 |
|
|---|
| 1412 | The attribute list is passed as a list of constant strings to the code
|
|---|
| 1413 | which associates them with the subroutine. In particular, the second example
|
|---|
| 1414 | of valid syntax above currently looks like this in terms of how it's
|
|---|
| 1415 | parsed and invoked:
|
|---|
| 1416 |
|
|---|
| 1417 | use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
|
|---|
| 1418 |
|
|---|
| 1419 | For further details on attribute lists and their manipulation,
|
|---|
| 1420 | see L<attributes> and L<Attribute::Handlers>.
|
|---|
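To see the constant strings that the attribute machinery receives, a package can define a C<MODIFY_CODE_ATTRIBUTES> handler, the hook documented in L<attributes>. The following is a sketch only: the C<Demo> package, the C<@seen> array, and the attribute names are made up for illustration, and the handler accepts every attribute without checking it.

```perl
package Demo;
use strict;
use warnings;

our @seen;    # hypothetical record of the attribute strings received

# Called by the attributes machinery for code attributes declared in
# this package.  Returning an empty list means every attribute was
# accepted; any names returned are reported as invalid.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen, @attrs;
    return;
}

# Each attribute, including its parameter list, arrives as one string.
sub fnord : Switch(10,Foo(7,3)) : Expensive { return 1 }

package main;

print "$_\n" for @Demo::seen;
```

The handler has to be defined before the attributed subroutine is compiled, which is why it appears first in the file.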
| 1421 |
|
|---|
| 1422 | =head1 SEE ALSO
|
|---|
| 1423 |
|
|---|
| 1424 | See L<perlref/"Function Templates"> for more about references and closures.
|
|---|
| 1425 | See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
|
|---|
| 1426 | See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
|
|---|
| 1427 | See L<perlmod> to learn about bundling up your functions in separate files.
|
|---|
| 1428 | See L<perlmodlib> to learn what library modules come standard on your system.
|
|---|
| 1429 | See L<perltoot> to learn how to make object method calls.
|
|---|