=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=back

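One way to limit which symbolic values can be called is to look the
externally supplied name up in a fixed table of handlers instead of
calling it directly. The following is a minimal sketch of that idea; the
command names, the C<%dispatch> table, and C<run_command()> are
hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical handlers: only these can be reached from outside input.
my %dispatch = (
    status => sub { return "status ok" },
    help   => sub { return "commands: help, status" },
);

# Look the name up in the table rather than calling it symbolically.
sub run_command {
    my ($name, @args) = @_;
    my $handler = $dispatch{$name}
        or die "Unknown command\n";
    return $handler->(@args);
}

print run_command('status'), "\n";
```

Because the name is only ever used as a hash key, a value such as
C<POSIX::system> is rejected rather than called.
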
For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are never tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
    # have used an external program to do the filename expansion; but in
    # either case the result is tainted since the list of filenames comes
    # from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, and whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

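As an illustration of the C<tainted()> function mentioned above, here is
a small sketch that reports which of several values are tainted. The
helper name C<taint_report()> and the labels passed to it are
assumptions made for this example. When the program is not running
under B<-T>, C<tainted()> returns false for every value.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Map each label to 1 if its value is tainted and 0 otherwise.
sub taint_report {
    my (%values) = @_;
    return map { ($_ => (tainted($values{$_}) ? 1 : 0)) } sort keys %values;
}

my %report = taint_report(path => $ENV{PATH}, literal => 'abc');
for my $label (sort keys %report) {
    printf "%-8s %s\n", $label, $report{$label} ? 'tainted' : 'clean';
}
```

Run with C<perl -T>, the environment value should be reported as
tainted while the literal string is not.
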
But testing for taintedness gets you only so far. Sometimes you just
have to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., that
you knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

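The same test can be wrapped in a small function so that every caller
launders data the same way. This is only a sketch; the name
C<untaint_word()> is hypothetical. It uses the C<\A> and C<\z> anchors,
which are slightly stricter than C<^> and C<$> because C<$> also
matches before a trailing newline.

```perl
use strict;
use warnings;

# Return an untainted copy of $value if it contains only word characters,
# hyphens, at signs, and dots; return undef for anything else.
sub untaint_word {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

my $clean = untaint_word('user@example.com');
print defined $clean ? "accepted: $clean\n" : "rejected\n";
```

Returning C<undef> instead of dying leaves the caller to decide how to
log or report the bad input.
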
The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.

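A sketch of that arrangement is shown here, assuming a program that
enables C<use locale> at file scope. The function name
C<launder_word()> is hypothetical. The C<no locale> pragma is lexically
scoped, so it affects only the body of the function.

```perl
use strict;
use warnings;
use locale;    # assumption: the rest of the program is locale-aware

sub launder_word {
    my ($data) = @_;
    no locale;    # \w in this block no longer depends on the locale
    return $data =~ /^([-\@\w.]+)$/ ? $1 : undef;
}

my $name = launder_word('report-2.txt');
print "laundered: $name\n" if defined $name;
```
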
=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automatically remove any duplicated directories, while the latter
will not.

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

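If a library directory really must come from outside the program, it
has to be laundered before it is added to C<@INC>. The following is a
rough sketch under stated assumptions: C<MYAPP_LIB> is a hypothetical
environment variable, C<clean_lib_dir()> is a hypothetical helper, and
the set of characters allowed in a directory name is an arbitrary
choice that a real program would adapt.

```perl
use strict;
use warnings;

# Return an untainted absolute directory name, or undef if the candidate
# has a ".." component or any character outside [-\w./].
sub clean_lib_dir {
    my ($dir) = @_;
    return undef unless defined $dir;
    return undef if $dir =~ m{(?:\A|/)\.\.(?:/|\z)};
    return $dir =~ m{\A(/[-\w./]+)\z} ? $1 : undef;
}

# MYAPP_LIB is a made-up variable name used only for this example.
my $dir = clean_lib_dir($ENV{MYAPP_LIB});
unshift @INC, $dir if defined $dir;
```
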
=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

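The two steps above can be collected in one place and paired with a
rough sanity check of the resulting PATH. This is a sketch only: the
function names are hypothetical, and C<path_looks_safe()> is an
approximation written for this example, not the test that Perl itself
performs.

```perl
use strict;
use warnings;

# Set a known PATH and drop the shell-related variables.
sub sanitize_env {
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
    return;
}

# Approximate pre-flight check: every entry must be absolute, and an
# existing directory must not be writable by "others".
sub path_looks_safe {
    my ($path) = @_;
    for my $dir (split /:/, $path, -1) {
        return 0 unless $dir =~ m{\A/};    # relative or empty entry
        my @st = stat($dir);
        next unless @st;                   # missing directories are skipped
        return 0 if $st[2] & 0002;         # world-writable
    }
    return 1;
}

sanitize_env();
print "PATH looks safe\n" if path_looks_safe($ENV{PATH});
```
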
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for
reading, so be careful what you print out. The tainting mechanism is
intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English '-no_match_vars';
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone: trying to restore
        # the saved effective IDs must fail
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID  && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

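As a sketch of the C<readdir> alternative, the function below lists the
C<*.c> files of a directory without calling C<glob>. The name
C<list_c_files()> and the pattern used to accept file names are
assumptions made for this example. Names returned by C<readdir> are
tainted, so only the text captured by the pattern is kept.

```perl
use strict;
use warnings;

# Return the sorted names of the *.c files in $dir.
sub list_c_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    my @files;
    while (defined(my $name = readdir $dh)) {
        # Keep only names made of word characters, hyphens, and dots.
        push @files, $1 if $name =~ /\A([-\w.]+\.c)\z/;
    }
    closedir $dh;
    @files = sort @files;
    return @files;
}

print "$_\n" for list_c_files('.');
```
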
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, check out the Safe module,
included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.

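A minimal sketch of a Safe compartment follows. It relies on the
module's C<new()> and C<reval()> methods and on the default operator
mask, under which C<system> is not permitted; the code strings being
evaluated are arbitrary examples.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Ordinary arithmetic is allowed by the default operator mask.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# system() is trapped: reval returns undef and the reason is left in $@.
my $result = $compartment->reval('system("echo", "hello")');
print "blocked: $@" if !defined $result && $@;
```

A compartment limits what the evaluated code can do, but it should be
treated as one layer of defence rather than a complete sandbox.
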
=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternatively, it can simply ignore the set-id bits on scripts. If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts. It does
this via a special executable called F<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure. You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        (void)argc;
        execv(REAL_PATH, argv);
        return 1;   /* reached only if execv() fails */
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes F</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.6.1 of Perl, bugs in the code of F<suidperl> could
introduce a security hole.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your licence's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Function - the algorithm used to "order" hash elements has been
changed several times during the development of Perl, mainly to be
reasonably fast. In Perl 5.8.1 the security aspect was also taken
into account.

In Perls before 5.8.1 one could rather easily generate data that as
hash keys would cause Perl to consume large amounts of time because
the internal structure of hashes would badly degenerate. In Perl 5.8.1
the hash function is randomly perturbed by a pseudorandom seed, which
makes generating such naughty hash keys harder.
See L<perlrun/PERL_HASH_SEED> for more information.

The random perturbation is done by default, but if one wants for some
reason to emulate the old behaviour, one can set the environment variable
PERL_HASH_SEED to zero (or any other integer). One possible reason
for wanting to emulate the old behaviour is that in the new behaviour
consecutive runs of Perl will order hash keys differently, which may
confuse some applications (like Data::Dumper: the outputs of two
different runs are no longer identical).

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.

Also note that while the order of the hash elements might be
randomised, this "pseudoordering" should B<not> be used for
applications like shuffling a list randomly (use List::Util::shuffle()
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module Algorithm::Numerical::Shuffle), or for generating
permutations (use e.g. the CPAN modules Algorithm::Permute or
Algorithm::FastPermute), or for any cryptographic applications.

=item *

Regular expressions - Perl's regular expression engine is a so-called
NFA (Nondeterministic Finite Automaton), which among other things means
that it can rather easily consume large amounts of both time and space
if the regular expression may match in several ways. Careful crafting of
the regular expressions can help, but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time. Nothing more is required than
resorting a list already sorted. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used. Mergesort has O(N log N)
worst-case behaviour whatever the order of its input, so it cannot be
similarly fooled.

=back

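For programs that are confused by the changing hash order, sorting the
keys explicitly is usually simpler than fixing the seed. The sketch
below illustrates this, together with C<List::Util::shuffle()> for the
cases where a random order is actually wanted. The C<%age> data and the
C<stable_dump()> helper are made up for the example;
C<$Data::Dumper::Sortkeys> is a documented Data::Dumper setting.

```perl
use strict;
use warnings;
use Data::Dumper;
use List::Util qw(shuffle);

my %age = (alice => 31, bob => 27, carol => 45);

# Sort the keys when the output must be the same on every run.
for my $name (sort keys %age) {
    print "$name: $age{$name}\n";
}

# Data::Dumper can sort keys itself through its Sortkeys setting.
sub stable_dump {
    my ($ref) = @_;
    local $Data::Dumper::Sortkeys = 1;
    return Dumper($ref);
}
print stable_dump(\%age);

# For a random order, shuffle the list instead of relying on hash order.
my @random_order = shuffle(keys %age);
print "random order: @random_order\n";
```
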
See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
and any computer science textbook on algorithmic complexity.

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.