| 1 | =head1 NAME
|
|---|
| 2 |
|
|---|
| 3 | perlopentut - tutorial on opening things in Perl
|
|---|
| 4 |
|
|---|
| 5 | =head1 DESCRIPTION
|
|---|
| 6 |
|
|---|
| 7 | Perl has two simple, built-in ways to open files: the shell way for
|
|---|
| 8 | convenience, and the C way for precision. The shell way also has 2- and
|
|---|
| 9 | 3-argument forms, which have different semantics for handling the filename.
|
|---|
| 10 | The choice is yours.
|
|---|
| 11 |
|
|---|
| 12 | =head1 Open E<agrave> la shell
|
|---|
| 13 |
|
|---|
| 14 | Perl's C<open> function was designed to mimic the way command-line
|
|---|
| 15 | redirection in the shell works. Here are some basic examples
|
|---|
| 16 | from the shell:
|
|---|
| 17 |
|
|---|
| 18 | $ myprogram file1 file2 file3
|
|---|
| 19 | $ myprogram < inputfile
|
|---|
| 20 | $ myprogram > outputfile
|
|---|
| 21 | $ myprogram >> outputfile
|
|---|
| 22 | $ myprogram | otherprogram
|
|---|
| 23 | $ otherprogram | myprogram
|
|---|
| 24 |
|
|---|
| 25 | And here are some more advanced examples:
|
|---|
| 26 |
|
|---|
| 27 | $ otherprogram | myprogram f1 - f2
|
|---|
| 28 | $ otherprogram 2>&1 | myprogram -
|
|---|
| 29 | $ myprogram <&3
|
|---|
| 30 | $ myprogram >&4
|
|---|
| 31 |
|
|---|
| 32 | Programmers accustomed to constructs like those above can take comfort
|
|---|
| 33 | in learning that Perl directly supports these familiar constructs using
|
|---|
| 34 | virtually the same syntax as the shell.
|
|---|
| 35 |
|
|---|
| 36 | =head2 Simple Opens
|
|---|
| 37 |
|
|---|
| 38 | The C<open> function takes two arguments: the first is a filehandle,
|
|---|
| 39 | and the second is a single string comprising both what to open and how
|
|---|
| 40 | to open it. C<open> returns true when it works, and when it fails,
|
|---|
| 41 | returns a false value and sets the special variable C<$!> to reflect
|
|---|
| 42 | the system error. If the filehandle was previously opened, it will
|
|---|
| 43 | be implicitly closed first.
|
|---|
| 44 |
|
|---|
| 45 | For example:
|
|---|
| 46 |
|
|---|
| 47 | open(INFO, "datafile") || die("can't open datafile: $!");
|
|---|
| 48 | open(INFO, "< datafile") || die("can't open datafile: $!");
|
|---|
| 49 | open(RESULTS,"> runstats") || die("can't open runstats: $!");
|
|---|
| 50 | open(LOG, ">> logfile ") || die("can't open logfile: $!");
|
|---|
| 51 |
|
|---|
| 52 | If you prefer the low-punctuation version, you could write that this way:
|
|---|
| 53 |
|
|---|
| 54 | open INFO, "< datafile" or die "can't open datafile: $!";
|
|---|
| 55 | open RESULTS,"> runstats" or die "can't open runstats: $!";
|
|---|
| 56 | open LOG, ">> logfile " or die "can't open logfile: $!";
|
|---|
| 57 |
|
|---|
| 58 | A few things to notice. First, the leading less-than is optional.
|
|---|
| 59 | If omitted, Perl assumes that you want to open the file for reading.
|
|---|
| 60 |
|
|---|
| 61 | Note also that the first set of examples uses the C<||> logical operator, and
|
|---|
| 62 | the second uses C<or>, which has lower precedence. Using C<||> in the latter
|
|---|
| 63 | examples would effectively mean
|
|---|
| 64 |
|
|---|
| 65 | open INFO, ( "< datafile" || die "can't open datafile: $!" );
|
|---|
| 66 |
|
|---|
| 67 | which is definitely not what you want.
|
|---|
| 68 |
|
|---|
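The precedence difference can be seen by opening a file that is assumed not to exist. The following is only a sketch: the path in C<$missing> is hypothetical, and C<eval> is used purely to record whether C<die> fired.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this system.
my $missing = "/no/such/dir/datafile";

# Parenthesized call with ||: the || applies to open's return value, so die runs.
my $paren_died = eval { open(my $fh, "< $missing") || die "can't open: $!\n"; 1 } ? 0 : 1;

# No parentheses with ||: the || binds to the (true) string, so die never runs
# and the failed open goes unnoticed.
my $tight_died = eval { open my $fh, "< $missing" || die "never reached\n"; 1 } ? 0 : 1;

# No parentheses with or: the low-precedence or applies to open's return value.
my $loose_died = eval { open my $fh, "< $missing" or die "can't open: $!\n"; 1 } ? 0 : 1;

print "parens+||: $paren_died, bare+||: $tight_died, bare+or: $loose_died\n";
```

Only the middle form lets the failure slip through silently.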
| 69 | The other important thing to notice is that, just as in the shell,
|
|---|
| 70 | any whitespace before or after the filename is ignored. This is good,
|
|---|
| 71 | because you wouldn't want these to do different things:
|
|---|
| 72 |
|
|---|
| 73 | open INFO, "<datafile"
|
|---|
| 74 | open INFO, "< datafile"
|
|---|
| 75 | open INFO, "<  datafile"
|
|---|
| 76 |
|
|---|
| 77 | Ignoring surrounding whitespace also helps for when you read a filename
|
|---|
| 78 | in from a different file, and forget to trim it before opening:
|
|---|
| 79 |
|
|---|
| 80 | $filename = <INFO>; # oops, \n still there
|
|---|
| 81 | open(EXTRA, "< $filename") || die "can't open $filename: $!";
|
|---|
| 82 |
|
|---|
| 83 | This is not a bug, but a feature. Because C<open> mimics the shell in
|
|---|
| 84 | its style of using redirection arrows to specify how to open the file, it
|
|---|
| 85 | also mimics the shell in ignoring extra whitespace around the filename
|
|---|
| 86 | itself. For accessing files with naughty names, see
|
|---|
| 87 | L<"Dispelling the Dweomer">.
|
|---|
| 88 |
|
|---|
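A small experiment makes the trimming visible. This sketch assumes a Unix-like system and uses File::Temp to create a scratch file; the variable names are illustrative only. The name stored in C<$name> carries leading spaces and a trailing newline, so no file with that literal name exists, yet the 2-argument C<open> still finds the file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file whose real name has no surrounding whitespace.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "payload\n";
close($tmp) or die "can't close $path: $!";

# Simulate a name read from another file and never chomp()ed.
my $name = "  $path\n";

# Taken literally, no file has that name (newline warning silenced on purpose).
my $exists_literally = do { no warnings 'newline'; -e $name } ? 1 : 0;

# The 2-argument open strips the surrounding whitespace, so this succeeds.
open(my $fh, "< $name") or die "can't open $name: $!";
my $line = <$fh>;
close($fh);

print "literal name exists: $exists_literally; read: $line";
```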
| 89 | There is also a 3-argument version of C<open>, which lets you put the
|
|---|
| 90 | special redirection characters into their own argument:
|
|---|
| 91 |
|
|---|
| 92 | open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";
|
|---|
| 93 |
|
|---|
| 94 | In this case, the filename to open is the actual string in C<$datafile>,
|
|---|
| 95 | so you don't have to worry about C<$datafile> containing characters
|
|---|
| 96 | that might influence the open mode, or whitespace at the beginning of
|
|---|
| 97 | the filename that would be absorbed in the 2-argument version. Also,
|
|---|
| 98 | any reduction of unnecessary string interpolation is a good thing.
|
|---|
| 99 |
|
|---|
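As a rough illustration of the literal treatment, the sketch below creates a file whose final path component begins with a space and a C<E<gt>> and ends with a space. The name is made up for the demonstration, and a Unix-like filesystem is assumed. The directory listing shows that the 3-argument form kept the name byte for byte.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical awkward name: final component is " >odd name " (note both spaces).
my $datafile = "$dir/ >odd name ";

# The mode is its own argument, so nothing in $datafile is interpreted or trimmed.
open(my $out, ">", $datafile) or die "Can't create '$datafile': $!";
print {$out} "hello\n";
close($out) or die "Can't close '$datafile': $!";

open(my $in, "<", $datafile) or die "Can't read '$datafile': $!";
my $line = <$in>;
close($in);

# List the directory to see the name that was actually created.
opendir(my $dh, $dir) or die "Can't list $dir: $!";
my @names = grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

print "created: [$names[0]]\n";
```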
| 100 | =head2 Indirect Filehandles
|
|---|
| 101 |
|
|---|
| 102 | C<open>'s first argument can be a reference to a filehandle. As of
|
|---|
| 103 | perl 5.6.0, if the argument is uninitialized, Perl will automatically
|
|---|
| 104 | create a filehandle and put a reference to it in the first argument,
|
|---|
| 105 | like so:
|
|---|
| 106 |
|
|---|
| 107 | open( my $in, $infile ) or die "Couldn't read $infile: $!";
|
|---|
| 108 | while ( <$in> ) {
|
|---|
| 109 | # do something with $_
|
|---|
| 110 | }
|
|---|
| 111 | close $in;
|
|---|
| 112 |
|
|---|
| 113 | Indirect filehandles make namespace management easier. Since filehandles
|
|---|
| 114 | are global to the current package, two subroutines trying to open
|
|---|
| 115 | C<INFILE> will clash. With two functions opening indirect filehandles
|
|---|
| 116 | like C<my $infile>, there's no clash and no need to worry about future
|
|---|
| 117 | conflicts.
|
|---|
| 118 |
|
|---|
| 119 | Another convenient behavior is that an indirect filehandle automatically
|
|---|
| 120 | closes when it goes out of scope or when you undefine it:
|
|---|
| 121 |
|
|---|
| 122 | sub firstline {
|
|---|
| 123 | open( my $in, shift ) && return scalar <$in>;
|
|---|
| 124 | # no close() required
|
|---|
| 125 | }
|
|---|
| 126 |
|
|---|
| 127 | =head2 Pipe Opens
|
|---|
| 128 |
|
|---|
| 129 | In C, when you want to open a file using the standard I/O library,
|
|---|
| 130 | you use the C<fopen> function, but when opening a pipe, you use the
|
|---|
| 131 | C<popen> function. But in the shell, you just use a different redirection
|
|---|
| 132 | character. That's also the case for Perl. The C<open> call
|
|---|
| 133 | remains the same--just its argument differs.
|
|---|
| 134 |
|
|---|
| 135 | If the leading character is a pipe symbol, C<open> starts up a new
|
|---|
| 136 | command and opens a write-only filehandle leading into that command.
|
|---|
| 137 | This lets you write into that handle and have what you write show up on
|
|---|
| 138 | that command's standard input. For example:
|
|---|
| 139 |
|
|---|
| 140 | open(PRINTER, "| lpr -Plp1") || die "can't run lpr: $!";
|
|---|
| 141 | print PRINTER "stuff\n";
|
|---|
| 142 | close(PRINTER) || die "can't close lpr: $!";
|
|---|
| 143 |
|
|---|
| 144 | If the trailing character is a pipe, you start up a new command and open a
|
|---|
| 145 | read-only filehandle leading out of that command. This lets whatever that
|
|---|
| 146 | command writes to its standard output show up on your handle for reading.
|
|---|
| 147 | For example:
|
|---|
| 148 |
|
|---|
| 149 | open(NET, "netstat -i -n |") || die "can't fork netstat: $!";
|
|---|
| 150 | while (<NET>) { } # do something with input
|
|---|
| 151 | close(NET) || die "can't close netstat: $!";
|
|---|
| 152 |
|
|---|
| 153 | What happens if you try to open a pipe to or from a non-existent
|
|---|
| 154 | command? If possible, Perl will detect the failure and set C<$!> as
|
|---|
| 155 | usual. But if the command contains special shell characters, such as
|
|---|
| 156 | C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
|
|---|
| 157 | command directly. Instead, Perl runs the shell, which then tries to
|
|---|
| 158 | run the command. This means that it's the shell that gets the error
|
|---|
| 159 | indication. In such a case, the C<open> call will only indicate
|
|---|
| 160 | failure if Perl can't even run the shell. See L<perlfaq8/"How can I
|
|---|
| 161 | capture STDERR from an external command?"> to see how to cope with
|
|---|
| 162 | this. There's also an explanation in L<perlipc>.
|
|---|
| 163 |
|
|---|
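One way to sidestep the shell entirely is the list form of a pipe open, which is not covered in the text above but is documented in L<perlfunc/open>. The sketch below assumes a system where Perl can fork, and it uses C<$^X> (the running perl) as a stand-in command so that it does not depend on any external program. The path F</no/such/command> is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;

# Mode "-|" with a list: the command is run directly, with no shell involved.
my @cmd = ($^X, "-e", 'print "line $_\n" for 1 .. 3');
open(my $kid, "-|", @cmd) or die "can't run $cmd[0]: $!";
my @lines = <$kid>;
close($kid)
    or die($! ? "error closing pipe: $!" : "command exited with status $?");

# With no shell in the way, a missing command makes open itself fail.
my ($ran, $err);
{
    no warnings 'exec';    # silence the child's "Can't exec" warning
    $ran = open(my $none, "-|", "/no/such/command", "arg");
    $err = "$!";
}

print "read ", scalar(@lines), " lines; missing command ",
      ($ran ? "started" : "failed: $err"), "\n";
```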
| 164 | If you would like to open a bidirectional pipe, the IPC::Open2
|
|---|
| 165 | library will handle this for you. Check out
|
|---|
| 166 | L<perlipc/"Bidirectional Communication with Another Process">.
|
|---|
| 167 |
|
|---|
| 168 | =head2 The Minus File
|
|---|
| 169 |
|
|---|
| 170 | Again following the lead of the standard shell utilities, Perl's
|
|---|
| 171 | C<open> function treats a file whose name is a single minus, "-", in a
|
|---|
| 172 | special way. If you open minus for reading, it really means to access
|
|---|
| 173 | the standard input. If you open minus for writing, it really means to
|
|---|
| 174 | access the standard output.
|
|---|
| 175 |
|
|---|
| 176 | If minus can be used as the default input or default output, what happens
|
|---|
| 177 | if you open a pipe into or out of minus? What's the default command it
|
|---|
| 178 | would run? The same script as you're currently running! This is actually
|
|---|
| 179 | a stealth C<fork> hidden inside an C<open> call. See
|
|---|
| 180 | L<perlipc/"Safe Pipe Opens"> for details.
|
|---|
| 181 |
|
|---|
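The implicit fork might look something like the following sketch, which assumes a platform with a real C<fork>. The handle name C<$from_kid> is illustrative. In the child, C<open> returns 0 and the child's standard output is connected to the parent's handle; in the parent, C<open> returns the child's process ID.

```perl
use strict;
use warnings;

# Piping out of minus: fork, with the child's STDOUT feeding $from_kid.
my $pid = open(my $from_kid, "-|");
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: whatever is printed here is read by the parent.
    print "hello from child $$\n";
    exit 0;
}

# Parent: read the child's output, then reap it via close.
my $msg = <$from_kid>;
close($from_kid) or die "child exited with status $?";

print "parent $$ read: $msg";
```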
| 182 | =head2 Mixing Reads and Writes
|
|---|
| 183 |
|
|---|
| 184 | It is possible to specify both read and write access. All you do is
|
|---|
| 185 | add a "+" symbol in front of the redirection. But as in the shell,
|
|---|
| 186 | using a less-than on a file never creates a new file; it only opens an
|
|---|
| 187 | existing one. On the other hand, using a greater-than always clobbers
|
|---|
| 188 | (truncates to zero length) an existing file, or creates a brand-new one
|
|---|
| 189 | if there isn't an old one. Adding a "+" for read-write doesn't affect
|
|---|
| 190 | whether it only works on existing files or always clobbers existing ones.
|
|---|
| 191 |
|
|---|
| 192 | open(WTMP, "+< /usr/adm/wtmp")
|
|---|
| 193 | || die "can't open /usr/adm/wtmp: $!";
|
|---|
| 194 |
|
|---|
| 195 | open(SCREEN, "+> lkscreen")
|
|---|
| 196 | || die "can't open lkscreen: $!";
|
|---|
| 197 |
|
|---|
| 198 | open(LOGFILE, "+>> /var/log/applog")
|
|---|
| 199 | || die "can't open /var/log/applog: $!";
|
|---|
| 200 |
|
|---|
| 201 | The first one won't create a new file, and the second one will always
|
|---|
| 202 | clobber an old one. The third one will create a new file if necessary
|
|---|
| 203 | and not clobber an old one, and it will allow you to read at any point
|
|---|
| 204 | in the file, but all writes will always go to the end. In short,
|
|---|
| 205 | the first case is substantially more common than the second and third
|
|---|
| 206 | cases, which are almost always wrong. (If you know C, the plus in
|
|---|
| 207 | Perl's C<open> is historically derived from the one in C's fopen(3S),
|
|---|
| 208 | which it ultimately calls.)
|
|---|
| 209 |
|
|---|
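To make the first case concrete, here is a minimal sketch of an in-place update of one fixed-width record. The 4-byte record size and the file contents are assumptions chosen for the demonstration. Note the C<seek> between the read and the write, which is needed when switching direction on a read-write handle.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Scratch file holding three 4-byte records.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "AAAABBBBCCCC";
close($tmp) or die "can't close $path: $!";

# "+<" opens an existing file for reading and writing without truncating it.
open(my $fh, "+<", $path) or die "can't update $path: $!";

# Read the second record.
seek($fh, 4, SEEK_SET) or die "can't seek: $!";
read($fh, my $old, 4) == 4 or die "short read";

# Seek back before writing, then overwrite the record in place.
seek($fh, 4, SEEK_SET) or die "can't seek: $!";
print {$fh} lc $old;
close($fh) or die "can't close $path: $!";

# Read the whole file back to inspect the result.
open(my $check, "<", $path) or die "can't read $path: $!";
my $all = do { local $/; <$check> };
close($check);

print "$all\n";
```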
| 210 | In fact, when it comes to updating a file, unless you're working on
|
|---|
| 211 | a binary file as in the WTMP case above, you probably don't want to
|
|---|
| 212 | use this approach for updating. Instead, Perl's B<-i> flag comes to
|
|---|
| 213 | the rescue. The following command takes all the C, C++, or yacc source
|
|---|
| 214 | or header files and changes all their foo's to bar's, leaving
|
|---|
| 215 | the old version in the original filename with a ".orig" tacked
|
|---|
| 216 | on the end:
|
|---|
| 217 |
|
|---|
| 218 | $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]
|
|---|
| 219 |
|
|---|
| 220 | This is a shortcut for some renaming games that are really
|
|---|
| 221 | the best way to update text files. See the second question in
|
|---|
| 222 | L<perlfaq5> for more details.
|
|---|
| 223 |
|
|---|
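Done by hand, those renaming games could be sketched roughly as follows. This is not how B<-i> is implemented internally; C<rewrite_file>, the F<.tmp> suffix, and the sample file are all hypothetical names introduced for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Rewrite $path line by line through a scratch file, keeping the old
# contents in "$path.orig". $edit is a callback that returns the new line.
sub rewrite_file {
    my ($path, $edit) = @_;
    my $tmp = "$path.tmp";    # hypothetical scratch name
    open(my $in,  "<", $path) or die "can't read $path: $!";
    open(my $out, ">", $tmp)  or die "can't write $tmp: $!";
    while (my $line = <$in>) {
        print {$out} $edit->($line);
    }
    close($in);
    close($out) or die "can't close $tmp: $!";
    rename($path, "$path.orig") or die "can't rename $path: $!";
    rename($tmp, $path)         or die "can't rename $tmp: $!";
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open(my $fh, "<", $path) or die "can't read $path: $!";
    local $/;
    return scalar <$fh>;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.c";
open(my $fh, ">", $file) or die "can't create $file: $!";
print {$fh} "int foo; int food;\n";
close($fh) or die "can't close $file: $!";

rewrite_file($file, sub { (my $l = shift) =~ s/\bfoo\b/bar/g; $l });

my $new  = slurp($file);
my $orig = slurp("$file.orig");
print "new: $new", "old: $orig";
```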
| 224 | =head2 Filters
|
|---|
| 225 |
|
|---|
| 226 | One of the most common uses for C<open> is one you never
|
|---|
| 227 | even notice. When you process the ARGV filehandle using
|
|---|
| 228 | C<< <ARGV> >>, Perl actually does an implicit open
|
|---|
| 229 | on each file in @ARGV. Thus a program called like this:
|
|---|
| 230 |
|
|---|
| 231 | $ myprogram file1 file2 file3
|
|---|
| 232 |
|
|---|
| 233 | can have all its files opened and processed one at a time
|
|---|
| 234 | using a construct no more complex than:
|
|---|
| 235 |
|
|---|
| 236 | while (<>) {
|
|---|
| 237 | # do something with $_
|
|---|
| 238 | }
|
|---|
| 239 |
|
|---|
| 240 | If @ARGV is empty when the loop first begins, Perl pretends you've opened
|
|---|
| 241 | up minus, that is, the standard input. In fact, $ARGV, the currently
|
|---|
| 242 | open file during C<< <ARGV> >> processing, is even set to "-"
|
|---|
| 243 | in these circumstances.
|
|---|
| 244 |
|
|---|
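The implicit opens can be watched from inside a program by tracking C<$ARGV>. In this sketch the two scratch files and the assignment to C<@ARGV> stand in for real command-line arguments; the names C<file1> and C<file2> simply echo the shell example above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build two scratch files to play the role of command-line arguments.
my $dir     = tempdir(CLEANUP => 1);
my %content = (file1 => "a\nb\n", file2 => "c\n");
for my $name (sort keys %content) {
    open(my $fh, ">", "$dir/$name") or die "can't create $dir/$name: $!";
    print {$fh} $content{$name};
    close($fh) or die "can't close $dir/$name: $!";
}

# Stand-in for what the shell would have put in @ARGV.
@ARGV = map { "$dir/$_" } sort keys %content;

# <> opens each file in turn; $ARGV names the one currently being read.
my %lines_in;
while (<>) {
    $lines_in{$ARGV}++;
}

print "$_: $lines_in{$_} line(s)\n" for sort keys %lines_in;
```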
| 245 | You are welcome to pre-process your @ARGV before starting the loop to
|
|---|
| 246 | make sure it's to your liking. One reason to do this might be to remove
|
|---|
| 247 | command options beginning with a minus. While you can always roll the
|
|---|
| 248 | simple ones by hand, the Getopt modules are good for this:
|
|---|
| 249 |
|
|---|
| 250 | use Getopt::Std;
|
|---|
| 251 |
|
|---|
| 252 | # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
|
|---|
| 253 | getopts("vDo:");
|
|---|
| 254 |
|
|---|
| 255 | # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
|
|---|
| 256 | getopts("vDo:", \%args);
|
|---|
| 257 |
|
|---|
| 258 | Or the standard Getopt::Long module to permit named arguments:
|
|---|
| 259 |
|
|---|
| 260 | use Getopt::Long;
|
|---|
| 261 | GetOptions( "verbose" => \$verbose, # --verbose
|
|---|
| 262 | "Debug" => \$debug, # --Debug
|
|---|
| 263 | "output=s" => \$output );
|
|---|
| 264 | # --output=somestring or --output somestring
|
|---|
| 265 |
|
|---|
| 266 | Another reason for preprocessing arguments is to make an empty
|
|---|
| 267 | argument list default to all files:
|
|---|
| 268 |
|
|---|
| 269 | @ARGV = glob("*") unless @ARGV;
|
|---|
| 270 |
|
|---|
| 271 | You could even filter out everything except plain text files. This drops
|
|---|
| 272 | the other arguments silently, of course, so you might prefer to report them.
|
|---|
| 273 |
|
|---|
| 274 | @ARGV = grep { -f && -T } @ARGV;
|
|---|
| 275 |
|
|---|
| 276 | If you're using the B<-n> or B<-p> command-line options, you
|
|---|
| 277 | should put changes to @ARGV in a C<BEGIN{}> block.
|
|---|
| 278 |
|
|---|
| 279 | Remember that a normal C<open> has special properties, in that it might
|
|---|
| 280 | call fopen(3S) or it might call popen(3S), depending on what its
|
|---|
| 281 | argument looks like; that's why it's sometimes called "magic open".
|
|---|
| 282 | Here's an example:
|
|---|
| 283 |
|
|---|
| 284 | $pwdinfo = `domainname` =~ /^(\(none\))?$/
|
|---|
| 285 | ? '< /etc/passwd'
|
|---|
| 286 | : 'ypcat passwd |';
|
|---|
| 287 |
|
|---|
| 288 | open(PWD, $pwdinfo)
|
|---|
| 289 | or die "can't open $pwdinfo: $!";
|
|---|
| 290 |
|
|---|
| 291 | This sort of thing also comes into play in filter processing. Because
|
|---|
| 292 | C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
|
|---|
| 293 | it respects all the special things we've already seen:
|
|---|
| 294 |
|
|---|
| 295 | $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
|
|---|
| 296 |
|
|---|
| 297 | That program will read from the file F<f1>, the process F<cmd1>, standard
|
|---|
| 298 | input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
|
|---|
| 299 | and finally the F<f3> file.
|
|---|
| 300 |
|
|---|
| 301 | Yes, this also means that if you have files named "-" (and so on) in
|
|---|
| 302 | your directory, they won't be processed as literal files by C<open>.
|
|---|
| 303 | You'll need to pass them as "./-", much as you would for the I<rm> program,
|
|---|
| 304 | or you could use C<sysopen> as described below.
|
|---|
| 305 |
|
|---|
| 306 | One of the more interesting applications is to change files of a certain
|
|---|
| 307 | name into pipes. For example, to autoprocess gzipped or compressed
|
|---|
| 308 | files by decompressing them with I<gzip>:
|
|---|
| 309 |
|
|---|
| 310 | @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;
|
|---|
| 311 |
|
|---|
| 312 | Or, if you have the I<GET> program installed from LWP,
|
|---|
| 313 | you can fetch URLs before processing them:
|
|---|
| 314 |
|
|---|
| 315 | @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;
|
|---|
| 316 |
|
|---|
| 317 | It's not for nothing that this is called magic C<< <ARGV> >>.
|
|---|
| 318 | Pretty nifty, eh?
|
|---|
| 319 |
|
|---|
| 320 | =head1 Open E<agrave> la C
|
|---|
| 321 |
|
|---|
| 322 | If you want the convenience of the shell, then Perl's C<open> is
|
|---|
| 323 | definitely the way to go. On the other hand, if you want finer precision
|
|---|
| 324 | than C's simplistic fopen(3S) provides, you should look to Perl's
|
|---|
| 325 | C<sysopen>, which is a direct hook into the open(2) system call.
|
|---|
| 326 | That does mean it's a bit more involved, but that's the price of
|
|---|
| 327 | precision.
|
|---|
| 328 |
|
|---|
| 329 | C<sysopen> takes 3 (or 4) arguments.
|
|---|
| 330 |
|
|---|
| 331 | sysopen HANDLE, PATH, FLAGS, [MASK]
|
|---|
| 332 |
|
|---|
| 333 | The HANDLE argument is a filehandle just as with C<open>. The PATH is
|
|---|
| 334 | a literal path, one that doesn't pay attention to any greater-thans or
|
|---|
| 335 | less-thans or pipes or minuses, nor ignore whitespace. If it's there,
|
|---|
| 336 | it's part of the path. The FLAGS argument contains one or more values
|
|---|
| 337 | derived from the Fcntl module that have been or'd together using the
|
|---|
| 338 | bitwise "|" operator. The final argument, the MASK, is optional; if
|
|---|
| 339 | present, it is combined with the user's current umask for the creation
|
|---|
| 340 | mode of the file. You should usually omit this.
|
|---|
| 341 |
|
|---|
| 342 | Although the traditional values of read-only, write-only, and read-write
|
|---|
| 343 | are 0, 1, and 2 respectively, this is known not to hold true on some
|
|---|
| 344 | systems. Instead, it's best to load in the appropriate constants first
|
|---|
| 345 | from the Fcntl module, which supplies the following standard flags:
|
|---|
| 346 |
|
|---|
| 347 | O_RDONLY Read only
|
|---|
| 348 | O_WRONLY Write only
|
|---|
| 349 | O_RDWR Read and write
|
|---|
| 350 | O_CREAT Create the file if it doesn't exist
|
|---|
| 351 | O_EXCL Fail if the file already exists
|
|---|
| 352 | O_APPEND Append to the file
|
|---|
| 353 | O_TRUNC Truncate the file
|
|---|
| 354 | O_NONBLOCK Non-blocking access
|
|---|
| 355 |
|
|---|
| 356 | Less common flags that are sometimes available on some operating
|
|---|
| 357 | systems include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>,
|
|---|
| 358 | C<O_DEFER>, C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>,
|
|---|
| 359 | C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>. Consult your open(2)
|
|---|
| 360 | manpage or its local equivalent for details. (Note: starting from
|
|---|
| 361 | Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
|
|---|
| 362 | added to the sysopen() flags because large files are the default.)
|
|---|
| 363 |
|
|---|
| 364 | Here's how to use C<sysopen> to emulate the simple C<open> calls we had
|
|---|
| 365 | before. We'll omit the C<|| die $!> checks for clarity, but make sure
|
|---|
| 366 | you always check the return values in real code. These aren't quite
|
|---|
| 367 | the same, since C<open> will trim leading and trailing whitespace,
|
|---|
| 368 | but you'll get the idea.
|
|---|
| 369 |
|
|---|
| 370 | To open a file for reading:
|
|---|
| 371 |
|
|---|
| 372 | open(FH, "< $path");
|
|---|
| 373 | sysopen(FH, $path, O_RDONLY);
|
|---|
| 374 |
|
|---|
| 375 | To open a file for writing, creating a new file if needed or else truncating
|
|---|
| 376 | an old file:
|
|---|
| 377 |
|
|---|
| 378 | open(FH, "> $path");
|
|---|
| 379 | sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);
|
|---|
| 380 |
|
|---|
| 381 | To open a file for appending, creating one if necessary:
|
|---|
| 382 |
|
|---|
| 383 | open(FH, ">> $path");
|
|---|
| 384 | sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);
|
|---|
| 385 |
|
|---|
| 386 | To open a file for update, where the file must already exist:
|
|---|
| 387 |
|
|---|
| 388 | open(FH, "+< $path");
|
|---|
| 389 | sysopen(FH, $path, O_RDWR);
|
|---|
| 390 |
|
|---|
| 391 | And here are things you can do with C<sysopen> that you cannot do with
|
|---|
| 392 | a regular C<open>. As you'll see, it's just a matter of controlling the
|
|---|
| 393 | flags in the third argument.
|
|---|
| 394 |
|
|---|
| 395 | To open a file for writing, creating a new file which must not previously
|
|---|
| 396 | exist:
|
|---|
| 397 |
|
|---|
| 398 | sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);
|
|---|
| 399 |
|
|---|
| 400 | To open a file for appending, where that file must already exist:
|
|---|
| 401 |
|
|---|
| 402 | sysopen(FH, $path, O_WRONLY | O_APPEND);
|
|---|
| 403 |
|
|---|
| 404 | To open a file for update, creating a new file if necessary:
|
|---|
| 405 |
|
|---|
| 406 | sysopen(FH, $path, O_RDWR | O_CREAT);
|
|---|
| 407 |
|
|---|
| 408 | To open a file for update, where that file must not already exist:
|
|---|
| 409 |
|
|---|
| 410 | sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);
|
|---|
| 411 |
|
|---|
| 412 | To open a file without blocking, creating one if necessary:
|
|---|
| 413 |
|
|---|
| 414 | sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
|
|---|
| 415 |
|
|---|
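Here is a small, hedged demonstration of the must-not-exist case, with the error checks put back in. The path is a made-up name containing spaces and a C<E<gt>>, which C<sysopen> takes literally; a Unix-like system is assumed. The second attempt is expected to fail with C<EEXIST>, checked here through the C<%!> hash.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/ > lock file ";    # hypothetical name, used literally

# First attempt: the file does not exist yet, so creation succeeds.
my $first = sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL);
die "can't create '$path': $!" unless $first;
print {$fh} "$$\n";
close($fh) or die "can't close '$path': $!";

# Second attempt: O_EXCL makes sysopen fail because the file now exists.
my $second        = sysopen(my $again, $path, O_WRONLY | O_CREAT | O_EXCL);
my $already_there = !$second && $!{EEXIST} ? 1 : 0;

print "first: ", ($first ? "created" : "failed"),
      "; second: ", ($already_there ? "refused, file exists" : "unexpected"), "\n";
```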
| 416 | =head2 Permissions E<agrave> la mode
|
|---|
| 417 |
|
|---|
| 418 | If you omit the MASK argument to C<sysopen>, Perl uses the octal value
|
|---|
| 419 | 0666. The normal MASK to use for executables and directories should
|
|---|
| 420 | be 0777, and for anything else, 0666.
|
|---|
| 421 |
|
|---|
| 422 | Why so permissive? Well, it isn't really. The MASK will be modified
|
|---|
| 423 | by your process's current C<umask>. A umask is a number representing
|
|---|
| 424 | I<disabled> permissions bits; that is, bits that will not be turned on
|
|---|
| 425 | in the created file's permissions field.
|
|---|
| 426 |
|
|---|
| 427 | For example, if your C<umask> were 027, then the 020 part would
|
|---|
| 428 | disable the group from writing, and the 007 part would disable others
|
|---|
| 429 | from reading, writing, or executing. Under these conditions, passing
|
|---|
| 430 | C<sysopen> 0666 would create a file with mode 0640, since C<0666 & ~027>
|
|---|
| 431 | is 0640.
|
|---|
| 432 |
|
|---|
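The arithmetic can be checked against a real file. The sketch below assumes a Unix-like filesystem that honors permission bits; it temporarily sets the process umask to 027, creates a scratch file with a MASK of 0666, and then restores the previous umask. The file name is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/report";    # hypothetical file name

# What the text predicts: the requested mode with the umask bits cleared.
my $expected = 0666 & ~027;

# Apply the umask only around the sysopen, then put the old one back.
my $old_umask = umask(027);
my $ok        = sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666);
my $err       = "$!";
umask($old_umask);
die "can't create $path: $err" unless $ok;
close($fh) or die "can't close $path: $!";

# Keep only the permission bits of the file's mode.
my $mode = (stat($path))[2] & 07777;

printf "expected %04o, got %04o\n", $expected, $mode;
```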
| 433 | You should seldom use the MASK argument to C<sysopen()>. That takes
|
|---|
| 434 | away the user's freedom to choose what permission new files will have.
|
|---|
| 435 | Denying choice is almost always a bad thing. One exception would be for
|
|---|
| 436 | cases where sensitive or private data is being stored, such as with mail
|
|---|
| 437 | folders, cookie files, and internal temporary files.
|
|---|
| 438 |
|
|---|
| 439 | =head1 Obscure Open Tricks
|
|---|
| 440 |
|
|---|
| 441 | =head2 Re-Opening Files (dups)
|
|---|
| 442 |
|
|---|
| 443 | Sometimes you already have a filehandle open, and want to make another
|
|---|
| 444 | handle that's a duplicate of the first one. In the shell, we place an
|
|---|
| 445 | ampersand in front of a file descriptor number when doing redirections.
|
|---|
| 446 | For example, C<< 2>&1 >> makes descriptor 2 (that's STDERR in Perl)
|
|---|
| 447 | be redirected into descriptor 1 (which is usually Perl's STDOUT).
|
|---|
| 448 | The same is essentially true in Perl: a filename that begins with an
|
|---|
| 449 | ampersand is treated instead as a file descriptor if a number, or as a
|
|---|
| 450 | filehandle if a string.
|
|---|
| 451 |
|
|---|
| 452 | open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
|
|---|
| 453 | open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!";
|
|---|
| 454 |
|
|---|
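With lexical handles, the same dup can be written in the 3-argument form, passing the existing handle as the last argument; that form is documented in L<perlfunc/open> rather than in the text above. This sketch uses a scratch file and illustrative handle names. The duplicate gets its own descriptor number but shares the file position with the original, so the second write lands after the first.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "can't close $path: $!";

# Open a log handle, then duplicate it with ">&".
open(my $log, ">", $path) or die "can't open $path: $!";
open(my $dup, ">&", $log) or die "couldn't dup log handle: $!";

# Flush after every print so the two buffers cannot reorder the output.
$_->autoflush(1) for $log, $dup;

print {$log} "first\n";
print {$dup} "second\n";

# A dup is a new descriptor opened to the same place.
my $distinct = fileno($log) != fileno($dup) ? 1 : 0;

close($dup) or die "can't close dup: $!";
close($log) or die "can't close log: $!";

open(my $check, "<", $path) or die "can't read $path: $!";
my $text = do { local $/; <$check> };
close($check);

print "distinct descriptors: $distinct\n$text";
```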
| 455 | That means that if a function is expecting a filename, but you don't
|
|---|
| 456 | want to give it a filename because you already have the file open, you
|
|---|
| 457 | can just pass the filehandle with a leading ampersand. It's best to
|
|---|
| 458 | use a fully qualified handle though, just in case the function happens
|
|---|
| 459 | to be in a different package:
|
|---|
| 460 |
|
|---|
| 461 | somefunction("&main::LOGFILE");
|
|---|
| 462 |
|
|---|
| 463 | This way if somefunction() is planning on opening its argument, it can
|
|---|
| 464 | just use the already opened handle. This differs from passing a handle,
|
|---|
| 465 | because with a handle, you don't open the file. Here you have something
|
|---|
| 466 | you can pass to open.
|
|---|
| 467 |
|
|---|
| 468 | If you have one of those tricky, newfangled I/O objects that the C++
|
|---|
| 469 | folks are raving about, then this doesn't work because those aren't
|
|---|
| 470 | proper filehandles in the native Perl sense. You'll have to use fileno()
|
|---|
| 471 | to pull out the proper descriptor number, assuming you can:
|
|---|
| 472 |
|
|---|
| 473 | use IO::Socket;
|
|---|
| 474 | $handle = IO::Socket::INET->new("www.perl.com:80");
|
|---|
| 475 | $fd = $handle->fileno;
|
|---|
| 476 | somefunction("&$fd"); # not an indirect function call
|
|---|
| 477 |
|
|---|
| 478 | It can be easier (and certainly will be faster) just to use real
|
|---|
| 479 | filehandles though:
|
|---|
| 480 |
|
|---|
| 481 | use IO::Socket;
|
|---|
| 482 | local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
|
|---|
| 483 | die "can't connect" unless defined(fileno(REMOTE));
|
|---|
| 484 | somefunction("&main::REMOTE");
|
|---|
| 485 |
|
|---|
| 486 | If the filehandle or descriptor number is preceded not just with a simple
|
|---|
| 487 | "&" but rather with a "&=" combination, then Perl will not create a
|
|---|
| 488 | completely new descriptor opened to the same place using the dup(2)
|
|---|
| 489 | system call. Instead, it will just make something of an alias to the
|
|---|
| 490 | existing one using the fdopen(3S) library call. This is slightly more
|
|---|
| 491 | parsimonious of system resources, although this is less a concern
|
|---|
| 492 | these days. Here's an example of that:
|
|---|
| 493 |
|
|---|
| 494 | $fd = $ENV{"MHCONTEXTFD"};
|
|---|
| 495 | open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";
|
|---|
| 496 |
|
|---|
| 497 | If you're using magic C<< <ARGV> >>, you could even pass in as a
|
|---|
| 498 | command line argument in @ARGV something like C<"<&=$MHCONTEXTFD">,
|
|---|
| 499 | but we've never seen anyone actually do this.
|
|---|
| 500 |
|
|---|
| 501 | =head2 Dispelling the Dweomer
|
|---|
| 502 |
|
|---|
| 503 | Perl is more of a DWIMmer language than something like Java--where DWIM
|
|---|
| 504 | is an acronym for "do what I mean". But this principle sometimes leads
|
|---|
| 505 | to more hidden magic than one knows what to do with. In this way, Perl
|
|---|
|
|---|