package PerlIO;

our $VERSION = '1.04';

# Map layer name to package that defines it
our %alias;

sub import
{
    my $class = shift;
    while (@_)
    {
        my $layer = shift;
        if (exists $alias{$layer})
        {
            # Aliased layer: load the package the alias names.
            $layer = $alias{$layer};
        }
        else
        {
            # Default: layer 'foo' is looked up as "${class}::foo".
            $layer = "${class}::$layer";
        }
        # Load the layer on demand; a failure is reported but is not fatal.
        eval "require $layer";
        warn $@ if $@;
    }
}

sub F_UTF8 () { 0x8000 }

1;
__END__

=head1 NAME

PerlIO - On-demand loader for PerlIO layers and root of PerlIO::* namespace

=head1 SYNOPSIS

  open($fh,"<:crlf", "my.txt"); # support platform-native and CRLF text files

  open($fh,"<","his.jpg");      # portably open a binary file for reading
  binmode($fh);

  Shell:
    PERLIO=perlio perl ....

=head1 DESCRIPTION

When an undefined layer 'foo' is encountered in an C<open> or
C<binmode> layer specification, the C code performs the equivalent of:

  use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate the layer by doing

  require PerlIO::foo;

Otherwise the C<PerlIO> package is a placeholder for additional
PerlIO-related functions.

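As a rough illustration of the lookup described above, the following sketch
mirrors the name resolution done by C<import> without loading anything. The
helper name C<layer_package> and the sample alias entry are hypothetical,
introduced only for this example; the real C<%PerlIO::alias> may hold
different entries, or none.

```perl
use strict;
use warnings;

# Hypothetical alias table for illustration only.
my %alias = (demo => 'My::Demo::Layer');

# Hypothetical helper: return the package that would be required for a
# layer name, using the same rule as PerlIO::import (alias first, then
# "${class}::$layer").
sub layer_package {
    my ($class, $layer) = @_;
    return exists $alias{$layer} ? $alias{$layer} : "${class}::$layer";
}

print layer_package('PerlIO', 'foo'),  "\n";
print layer_package('PerlIO', 'demo'), "\n";
```

An unaliased layer name resolves to a package under the C<PerlIO::>
namespace, while an aliased one resolves to whatever package the alias names.
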
The following layers are currently defined:

=over 4

=item :unix

Lowest level layer which provides basic PerlIO operations in terms of
UNIX/POSIX numeric file descriptor calls
(open(), read(), write(), lseek(), close()).

=item :stdio

Layer which calls C<fread>, C<fwrite> and C<fseek>/C<ftell> etc. Note
that as this is "real" stdio it will ignore any layers beneath it and
go straight to the operating system via the C library as usual.

=item :perlio

A from-scratch implementation of buffering for PerlIO. Provides fast
access to the buffer for C<sv_gets>, which implements perl's readline/E<lt>E<gt>,
and in general attempts to minimize data copying.

C<:perlio> will insert a C<:unix> layer below itself to do low level IO.

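One way to observe the stacking described above is to list the layers on an
open handle. This is a minimal sketch that assumes a perl recent enough to
provide C<PerlIO::get_layers>, a function this section does not describe;
the scratch file comes from the core C<File::Temp> module and is only there
so that there is something to open.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so there is something to open.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close($tmp) or die "close: $!";

# Ask for :perlio explicitly, then list the layers actually on the handle.
open(my $fh, "<:perlio", $path) or die "open $path: $!";
my @layers = PerlIO::get_layers($fh);
close($fh);

print "layers: @layers\n";
```

The list should include C<unix> beneath C<perlio>, even though only
C<:perlio> was requested.
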
=item :crlf

A layer that implements DOS/Windows like CRLF line endings. On read
converts pairs of CR,LF to a single "\n" newline character. On write
converts each "\n" to a CR,LF pair. Note that this layer likes to be
one of its kind: it silently ignores attempts to be pushed into the
layer stack more than once.

It currently does I<not> mimic MS-DOS in treating Control-Z as an
end-of-file marker.

(Gory details follow) To be more exact, what happens is this: after
pushing itself to the stack, the C<:crlf> layer checks all the layers
below itself to find the first layer that is capable of being a CRLF
layer but is not yet enabled to be a CRLF layer. If it finds such a
layer, it enables the CRLFness of that other deeper layer, and then
pops itself off the stack. If not, fine, use the one we just pushed.

The end result is that a C<:crlf> means "please enable the first CRLF
layer you can find, and if you can't find one, here would be a good
spot to place a new one."

Based on the C<:perlio> layer.

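The read and write translation can be checked with a short round trip. This
is a sketch only: the scratch file from the core C<File::Temp> module and the
variable names are assumptions made for the example, and C<:raw> is used
simply to read the bytes back untranslated.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "close: $!";

# Write two lines through :crlf; each "\n" becomes a CR,LF pair on disk.
open(my $out, ">:crlf", $path) or die "open $path: $!";
print {$out} "one\ntwo\n";
close($out) or die "close $path: $!";

# Read the bytes back without any translation.
open(my $raw, "<:raw", $path) or die "open $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Read again through :crlf; CR,LF pairs collapse back to "\n".
open(my $in, "<:crlf", $path) or die "open $path: $!";
my $text = do { local $/; <$in> };
close($in);

printf "raw length %d, translated length %d\n",
    length($bytes), length($text);
```

The raw read is longer than the translated one by one byte per line, which
is the CR that C<:crlf> added on write and removed on read.
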
=item :mmap

A layer which implements "reading" of files by using C<mmap()> to
make the (whole) file appear in the process's address space, and then
using that as PerlIO's "buffer". This I<may> be faster in certain
circumstances for large files, and may result in less physical memory
use when multiple processes are reading the same file.

Files which are not C<mmap()>-able revert to behaving like the C<:perlio>
layer. Writes also behave like the C<:perlio> layer, as C<mmap()> for write
needs extra house-keeping (to extend the file) which negates any advantage.

The C<:mmap> layer will not exist if the platform does not support C<mmap()>.

=item :utf8

Declares that the stream accepts perl's internal encoding of
characters. (Which really is UTF-8 on ASCII machines, but is
UTF-EBCDIC on EBCDIC machines.) This allows any character perl can
represent to be read from or written to the stream. The UTF-X encoding
is chosen to render simple text parts (i.e. non-accented letters,
digits and common punctuation) human readable in the encoded file.

Here is how to write your native data out using UTF-8 (or UTF-EBCDIC)
and then read it back in.

    open(my $fh, ">:utf8", "data.utf") or die "data.utf: $!";
    print $fh $out;
    close($fh);

    open($fh, "<:utf8", "data.utf") or die "data.utf: $!";
    $in = <$fh>;
    close($fh);

=item :bytes

This is the inverse of the C<:utf8> layer. It turns off the flag
on the layer below so that data read from it is considered to
be "octets", i.e. characters in the range 0..255 only. Likewise