=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation. It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete. It does not
even aim to be entirely accurate. In some cases perfection has been
sacrificed for the sake of getting the general idea across. You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the
Perl documentation. You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal). Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and
no doubt other places. From this we can determine that Perl is different
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

  perl progname.pl

Alternatively, put this as the first line of your script:

  #!/usr/bin/env perl

... and run the script as C</path/to/script.pl>. Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

For more information, including instructions for other platforms such as
Windows and Mac OS, read L<perlrun>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements. These
statements are simply written in the script in a straightforward
fashion. There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semicolon:

  print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

  # This is a comment

Whitespace is irrelevant:

  print
      "Hello, world"
      ;

... except inside quoted strings:

  # this would print with a linebreak in the middle
  print "Hello
  world";

Double quotes or single quotes may be used around literal strings:

  print "Hello, world";
  print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

  print "Hello, $name\n";     # works fine
  print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

  print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste. They are only required
occasionally to clarify issues of precedence.

  print("Hello, world\n");
  print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

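The quoting rules above can be checked with a short script. This is only a
sketch: the value assigned to C<$name> is a made-up placeholder, and the
variable names C<$interpolated> and C<$literal> are chosen for illustration.

```perl
use strict;
use warnings;

my $name = "world";    # hypothetical value, not taken from the text above

my $interpolated = "Hello, $name\n";   # double quotes expand $name and \n
my $literal      = 'Hello, $name\n';   # single quotes keep both as typed

print $interpolated;
print $literal, "\n";

# Parentheses are optional; both calls below do the same thing.
print("Literal length: ", length($literal), "\n");
print "Literal length: ", length($literal), "\n";
```

The single-quoted string is longer than it looks when printed next to the
double-quoted one, because the characters C<$name> and C<\n> are stored as
written rather than replaced.
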
=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

  my $animal = "camel";
  my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl
will automatically convert between them as required. There is no need
to pre-declare your variable types.

Scalar values can be used in various ways:

  print $animal;
  print "The animal is $animal\n";
  print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like
punctuation or line noise. These special variables are used for all
kinds of purposes, and are documented in L<perlvar>. The only one you
need to know about for now is C<$_>, which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.

  print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

  my @animals = ("camel", "llama", "owl");
  my @numbers = (23, 42, 69);
  my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

  print $animals[0];              # prints "camel"
  print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element
of an array:

  print $mixed[$#mixed];          # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there
are in an array. Don't bother. As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

  if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because
we're getting just a single value out of the array -- you ask for a scalar,
you get a scalar.

To get multiple values from an array:

  @animals[0,1];                  # gives ("camel", "llama");
  @animals[0..2];                 # gives ("camel", "llama", "owl");
  @animals[1..$#animals];         # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

  my @sorted    = sort @animals;
  my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine). These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

  my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

  my %fruit_color = (
      apple  => "red",
      banana => "yellow",
  );

To get at hash elements:

  $fruit_color{"apple"};          # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

  my @fruits = keys %fruit_color;
  my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.
The most well known of these is C<%ENV> which contains environment
variables. Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.

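As a rough illustration of how the three variable types fit together, the
following sketch reuses the C<@animals> and C<%fruit_color> data from above.
The names C<$count>, C<$last>, C<@rest> and C<@fruits> are introduced here
only for the example.

```perl
use strict;
use warnings;

my @animals     = ("camel", "llama", "owl");
my %fruit_color = (apple => "red", banana => "yellow");

my $count = @animals;                    # array in scalar context: element count
my $last  = $animals[$#animals];         # $#animals is the index of the last element
my @rest  = @animals[1 .. $#animals];    # slice: everything except the first element

print "There are $count animals and the last one is $last\n";
print "All but the first: @rest\n";

# Hashes have no particular order, so sort the keys for stable output.
my @fruits = sort keys %fruit_color;
foreach my $fruit (@fruits) {
    print "$fruit is $fruit_color{$fruit}\n";
}
```

Note the sigils: C<@animals> is the whole array, C<$animals[...]> is one
element of it, and C<@animals[...]> is a slice holding several elements.
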
More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type. So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and
hashes. The following example shows a two-level hash-of-hashes
structure using anonymous hash references.

  my $variables = {
      scalar  =>  {
                   description => "single item",
                   sigil => '$',
                  },
      array   =>  {
                   description => "ordered list of items",
                   sigil => '@',
                  },
      hash    =>  {
                   description => "key/value pairs",
                   sigil => '%',
                  },
  };

  print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.

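One way to walk a nested structure like C<$variables> is sketched below. The
loop, the C<@lines> array and the C<$same> check are additions for
illustration and are not part of the original example.

```perl
use strict;
use warnings;

my $variables = {
    scalar => { description => "single item",           sigil => '$' },
    array  => { description => "ordered list of items", sigil => '@' },
    hash   => { description => "key/value pairs",       sigil => '%' },
};

# %$variables dereferences the outer hash reference.
my @lines;
foreach my $type (sort keys %$variables) {
    my $info = $variables->{$type};      # $info is itself a hash reference
    push @lines, "$info->{sigil} $type ($info->{description})";
}
print "$_\n" for @lines;

# The arrow between two adjacent subscripts is optional.
my $same = $variables->{array}{sigil} eq $variables->{array}->{sigil};
```

Each inner hash is reached through a reference stored as a value in the
outer hash, which is exactly the "hashes within hashes" arrangement described
above.
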
=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

  my $var = "value";

The C<my> is actually not required; you could just use:

  $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice. C<my> creates lexically
scoped variables instead. The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

  my $x = "foo";
  if ($some_condition) {
      my $y = "bar";
      print $x;           # prints "foo"
      print $y;           # prints "bar"
  }
  print $x;               # prints "foo"
  print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common
programming errors. For instance, in the example above, the final
C<print $y> would cause a compile-time error and prevent you from
running the program. Using C<strict> is highly recommended.

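Lexical scoping also means an inner block can declare its own variable with
the same name as an outer one without disturbing it. A minimal sketch, with
variable names invented for the example:

```perl
use strict;
use warnings;

my $x = "outer";
my $seen_inside;
{
    my $x = "inner";        # a new variable that hides the outer $x in this block
    $seen_inside = $x;
}
my $seen_after = $x;        # the outer $x was never touched

print "inside: $seen_inside, after: $seen_after\n";
```

Because the script starts with C<use strict;>, misspelling either variable
name would stop the program at compile time instead of silently creating a
global.
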
=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs except for
case/switch (but if you really want it, there is a Switch module on CPAN,
which was also bundled with Perl 5.8 through 5.12. See the section on
modules, below, for more information about modules and CPAN).

The conditions can be any Perl expression. See the list of operators in
the next section for information on comparison and boolean logic operators,
which are commonly used in conditional statements.

=over 4

=item if

  if ( condition ) {
      ...
  } elsif ( other condition ) {
      ...
  } else {
      ...
  }

There's also a negated version of it:

  unless ( condition ) {
      ...
  }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block. However, there is a clever way of making your one-line
conditional blocks more English-like:

  # the traditional way
  if ($zippy) {
      print "Yow!";
  }

  # the Perlish post-condition way
  print "Yow!" if $zippy;
  print "We have no bananas" unless $bananas;

=item while

  while ( condition ) {
      ...
  }

There's also a negated version, for the same reason we have C<unless>:

  until ( condition ) {
      ...
  }

You can also use C<while> in a post-condition:

  print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

  for (my $i = 0; $i <= $max; $i++) {
      ...
  }

The C-style C<for> loop is rarely needed in Perl since Perl provides
the more friendly list-scanning C<foreach> loop.

=item foreach

  foreach (@array) {
      print "This element is $_\n";
  }

  # you don't have to use the default $_ either...
  foreach my $key (keys %hash) {
      print "The value of $key is $hash{$key}\n";
  }

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.

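The constructs above combine naturally. The sketch below uses made-up data
and variable names (C<@large>, C<@ticks>, C<$countdown>) to show a C<foreach>
loop with a post-condition and an C<until> loop.

```perl
use strict;
use warnings;

my @numbers = (23, 42, 69);

my @large;
foreach my $n (@numbers) {
    next if $n < 40;           # post-condition form: skip small numbers
    push @large, $n;
}

my $countdown = 3;
my @ticks;
until ($countdown == 0) {      # until is while with the condition negated
    push @ticks, $countdown;
    $countdown--;
}

print "Large numbers: @large\n";
print "Ticks: @ticks\n";
```

C<next> jumps straight to the following iteration; it is one of the loop
controls covered in L<perlsyn>.
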
=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions. Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>. A list of
them is given at the start of L<perlfunc> and you can easily read
about any given function by using C<perldoc -f I<functionname>>.

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

  +   addition
  -   subtraction
  *   multiplication
  /   division

=item Numeric comparison

  ==  equality
  !=  inequality
  <   less than
  >   greater than
  <=  less than or equal
  >=  greater than or equal

=item String comparison

  eq  equality
  ne  inequality
  lt  less than
  gt  greater than
  le  less than or equal
  ge  greater than or equal

(Why do we have separate numeric and string comparisons? Because we don't
have special variable types, and Perl needs to know whether to sort
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

  &&  and
  ||  or
  !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions
of the operators -- they're also supported as operators in their own
right. They're more readable than the C-style operators, but have
different precedence to C<&&> and friends. Check L<perlop> for more
detail.)

=item Miscellaneous

  =   assignment
  .   string concatenation
  x   string multiplication
  ..  range operator (creates a list of numbers)

=back

Many operators can be combined with a C<=> as follows:

  $a += 1;        # same as $a = $a + 1
  $a -= 1;        # same as $a = $a - 1
  $a .= "\n";     # same as $a = $a . "\n";

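The difference between numeric and string comparison is easiest to see side
by side. This sketch also uses the conditional operator C<?:> and the sort
comparators C<< <=> >> and C<cmp>, none of which appear in the tables above;
all three are documented in L<perlop>. The variable names are invented for
the example.

```perl
use strict;
use warnings;

# The same two values compare differently as numbers and as strings.
my $numeric_less = (99 <  100) ? "yes" : "no";   # numerically, 99 is less than 100
my $string_less  = (99 lt 100) ? "yes" : "no";   # as strings, "99" comes after "100"

my $rule  = "-" x 10;          # string multiplication: ten dashes
my @range = (1 .. 5);          # range operator builds the list 1, 2, 3, 4, 5
my $text  = "camel";
$text .= "s";                  # same as $text = $text . "s"

my @by_number = sort { $a <=> $b } (100, 99, 9);
my @by_string = sort { $a cmp $b } (100, 99, 9);

print "99 <  100? $numeric_less\n";
print "99 lt 100? $string_less\n";
print "numeric order: @by_number\n";
print "string order:  @by_string\n";
```
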
=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>,
but in short:

  open(INFILE,  "<",  "input.txt")  or die "Can't open input.txt: $!";
  open(OUTFILE, ">",  "output.txt") or die "Can't open output.txt: $!";
  open(LOGFILE, ">>", "my.log")     or die "Can't open logfile: $!";

You can read from an open filehandle using the C<< <> >> operator. In
scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

  my $line  = <INFILE>;
  my @lines = <INFILE>;

Reading in the whole file at one time is called slurping. It can
be useful but it may be a memory hog. Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

  while (<INFILE>) {     # assigns each line in turn to $_
      print "Just read in this line: $_";
  }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

  print STDERR "This is your final warning.\n";
  print OUTFILE $record;
  print LOGFILE $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

  close INFILE;

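A complete write-then-read round trip might look like the sketch below. Two
things here differ from the examples above and are choices made for this
illustration: the filehandles are stored in lexical variables (C<my $out>,
C<my $in>) rather than barewords, and the scratch file comes from the core
C<File::Temp> module so that the example does not touch any real file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module; supplies a scratch file name

my ($tmp_fh, $path) = tempfile(UNLINK => 1);   # removed when the program exits
close $tmp_fh;

open(my $out, ">", $path) or die "Can't open $path for writing: $!";
print $out "first line\n";
print $out "second line\n";
close $out or die "Can't close $path: $!";

open(my $in, "<", $path) or die "Can't open $path for reading: $!";
my $line_count = 0;
my $last_line;
while (my $line = <$in>) {      # one line per iteration, newline included
    chomp $line;                # strip the trailing newline
    $line_count++;
    $last_line = $line;
}
close $in;

print "Read $line_count lines, ending with: $last_line\n";
```

Reading line by line like this keeps memory use flat however large the file
is, which is the alternative to slurping mentioned above.
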
=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere. However, in short:

=over 4

=item Simple matching

  if (/foo/)       { ... }  # true if $_ contains "foo"
  if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>. It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

  s/foo/bar/;               # replaces foo with bar in $_
  $a =~ s/foo/bar/;         # replaces foo with bar in $a
  $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar in $a

The C<s///> substitution operator is documented in L<perlop>.

=item More complex regular expressions

You don't just have to match on fixed strings. In fact, you can match
on just about anything you could dream of by using more complex regular
expressions. These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

  .                   a single character
  \s                  a whitespace character (space, tab, newline)
  \S                  non-whitespace character
  \d                  a digit (0-9)
  \D                  a non-digit
  \w                  a word character (a-z, A-Z, 0-9, _)
  \W                  a non-word character
  [aeiou]             matches a single character in the given set
  [^aeiou]            matches a single character outside the given set
  (foo|bar|baz)       matches any of the alternatives specified

  ^                   start of string
  $                   end of string

Quantifiers can be used to specify how many of the previous thing you
want to match on, where "thing" means either a literal character, one
of the metacharacters listed above, or a group of characters or
metacharacters in parentheses.

  *                   zero or more of the previous thing
  +                   one or more of the previous thing
  ?                   zero or one of the previous thing
  {3}                 matches exactly 3 of the previous thing
  {3,6}               matches between 3 and 6 of the previous thing
  {3,}                matches 3 or more of the previous thing

Some brief examples:

  /^\d+/              string starts with one or more digits
  /^$/                nothing in the string (start and end are adjacent)
  /(\d\s){3}/         three digits, each followed by a whitespace
                      character (e.g. "3 4 5 ")
  /(a.)+/             matches a string in which every odd-numbered letter
                      is "a" (e.g. "abacadaf")

  # This loop reads from STDIN, and prints non-blank lines:
  while (<>) {
      next if /^$/;
      print;
  }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose. They can be
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

  # a cheap and nasty way to break an email address up into parts

  if ($email =~ /([^@]+)@(.+)/) {
      print "Username is $1\n";
      print "Hostname is $2\n";
  }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details. Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back

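Putting matching, capturing and substitution together, here is a small sketch
that pulls dates out of some invented log lines. The sample data and the
names C<@dated>, C<$once> and C<$all> are assumptions made for the example.

```perl
use strict;
use warnings;

# Made-up sample input: one dated line, one empty line, one undated line.
my @lines = ("2024-01-15 disk usage 91%", "", "no date here");

my @dated;
foreach my $line (@lines) {
    next if $line =~ /^$/;                              # skip empty lines
    if ($line =~ /^(\d{4})-(\d{2})-(\d{2})\s+(.+)/) {
        push @dated, "$3/$2/$1: $4";                    # captures are in $1, $2, ...
    }
}
print "$_\n" for @dated;

my $once = "foo bar foo";
my $all  = "foo bar foo";
$once =~ s/foo/FOO/;        # replaces only the first match
$all  =~ s/foo/FOO/g;       # /g replaces every match
print "$once\n$all\n";
```

The capture variables are only meaningful after a successful match, which is
why they are read inside the C<if> block.
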
=head2 Writing subroutines

Writing subroutines is easy:

  sub logger {
      my $logmessage = shift;
      print LOGFILE $logmessage;
  }

What's that C<shift>? Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>.

We can manipulate C<@_> in other ways too:

  my ($logmessage, $priority) = @_;       # common
  my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

  sub square {
      my $num = shift;
      my $result = $num * $num;
      return $result;
  }

For more information on writing subroutines, see L<perlsub>.

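The two argument-handling styles above can be seen in a runnable form below.
The subroutines C<format_entry> and C<min_max> are hypothetical helpers
written for this sketch; they show list assignment from C<@_>, a default for
an omitted argument, and returning more than one value.

```perl
use strict;
use warnings;

# Unpack named arguments from @_ in one list assignment.
sub format_entry {
    my ($message, $priority) = @_;
    $priority = "info" unless defined $priority;   # default when omitted
    return "[$priority] $message";
}

# A subroutine called in list context can return several values.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);              # index -1 is the last element
}

my $entry     = format_entry("disk almost full", "warn");
my $default   = format_entry("started");
my ($lo, $hi) = min_max(23, 42, 69, 7);

print "$entry\n$default\n";
print "smallest is $lo, largest is $hi\n";
```
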
=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.
Read L<perlboot>, L<perltoot>, L<perltooc> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ). A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics. A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general. L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

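As an example of the C<use I<Module::Name>> pattern, the sketch below loads
C<List::Util>, one of the modules shipped with Perl, and imports three of
its exported functions. The choice of module and the variable names are made
for illustration only.

```perl
use strict;
use warnings;
use List::Util qw(sum max first);   # import only the functions we need

my @numbers = (23, 42, 69);

my $total   = sum(@numbers);
my $largest = max(@numbers);
my $over_40 = first { $_ > 40 } @numbers;   # first element passing the test

print "total=$total largest=$largest first over 40=$over_40\n";
```

Running C<perldoc List::Util> lists everything else the module exports.
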
=head1 AUTHOR

Kirrily "Skud" Robert <[email protected]>