| 1 | =head1 NAME
|
|---|
| 2 |
|
|---|
| 3 | perldebguts - Guts of Perl debugging
|
|---|
| 4 |
|
|---|
| 5 | =head1 DESCRIPTION
|
|---|
| 6 |
|
|---|
| 7 | This is not the perldebug(1) manpage, which tells you how to use
|
|---|
| 8 | the debugger. This manpage describes low-level details concerning
|
|---|
| 9 | the debugger's internals, which range from difficult to impossible
|
|---|
| 10 | to understand for anyone who isn't incredibly intimate with Perl's guts.
|
|---|
| 11 | Caveat lector.
|
|---|
| 12 |
|
|---|
| 13 | =head1 Debugger Internals
|
|---|
| 14 |
|
|---|
| 15 | Perl has special debugging hooks at compile-time and run-time used
|
|---|
| 16 | to create debugging environments. These hooks are not to be confused
|
|---|
| 17 | with the I<perl -Dxxx> command described in L<perlrun>, which is
|
|---|
| 18 | usable only if a special Perl is built per the instructions in the
|
|---|
| 19 | F<INSTALL> podpage in the Perl source tree.
|
|---|
| 20 |
|
|---|
| 21 | For example, whenever you call Perl's built-in C<caller> function
|
|---|
| 22 | from the package C<DB>, the arguments that the corresponding stack
|
|---|
| 23 | frame was called with are copied to the C<@DB::args> array. These
|
|---|
| 24 | mechanisms are enabled by calling Perl with the B<-d> switch.
|
|---|
| 25 | Specifically, the following additional features are enabled
|
|---|
| 26 | (cf. L<perlvar/$^P>):
|
|---|
| 27 |
|
|---|
| 28 | =over 4
|
|---|
| 29 |
|
|---|
| 30 | =item *
|
|---|
| 31 |
|
|---|
| 32 | Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
|
|---|
| 33 | 'perl5db.pl'}> if not present) before the first line of your program.
|
|---|
| 34 |
|
|---|
| 35 | =item *
|
|---|
| 36 |
|
|---|
| 37 | Each array C<@{"_<$filename"}> holds the lines of $filename for a
|
|---|
| 38 | file compiled by Perl. The same is also true for C<eval>ed strings
|
|---|
| 39 | that contain subroutines, or which are currently being executed.
|
|---|
| 40 | The $filename for C<eval>ed strings looks like C<(eval 34)>.
|
|---|
| 41 | Code assertions in regexes look like C<(re_eval 19)>.
|
|---|
| 42 |
|
|---|
| 43 | Values in this array are magical in numeric context: they compare
|
|---|
| 44 | equal to zero only if the line is not breakable.
|
|---|
| 45 |
|
|---|
| 46 | =item *
|
|---|
| 47 |
|
|---|
| 48 | Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
|
|---|
| 49 | by line number. Individual entries (as opposed to the whole hash)
|
|---|
| 50 | are settable. Perl only cares about Boolean true here, although
|
|---|
| 51 | the values used by F<perl5db.pl> have the form
|
|---|
| 52 | C<"$break_condition\0$action">.
|
|---|
| 53 |
|
|---|
| 54 | The same holds for evaluated strings that contain subroutines, or
|
|---|
| 55 | which are currently being executed. The $filename for C<eval>ed strings
|
|---|
| 56 | looks like C<(eval 34)> or C<(re_eval 19)>.
|
|---|
| 57 |
|
|---|
| 58 | =item *
|
|---|
| 59 |
|
|---|
| 60 | Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
|
|---|
| 61 | also the case for evaluated strings that contain subroutines, or
|
|---|
| 62 | which are currently being executed. The $filename for C<eval>ed
|
|---|
| 63 | strings looks like C<(eval 34)> or C<(re_eval 19)>.
|
|---|
| 64 |
|
|---|
| 65 | =item *
|
|---|
| 66 |
|
|---|
| 67 | After each C<require>d file is compiled, but before it is executed,
|
|---|
| 68 | C<DB::postponed(*{"_<$filename"})> is called if the subroutine
|
|---|
| 69 | C<DB::postponed> exists. Here, the $filename is the expanded name of
|
|---|
| 70 | the C<require>d file, as found in the values of %INC.
|
|---|
| 71 |
|
|---|
| 72 | =item *
|
|---|
| 73 |
|
|---|
| 74 | After each subroutine C<subname> is compiled, the existence of
|
|---|
| 75 | C<$DB::postponed{subname}> is checked. If this key exists,
|
|---|
| 76 | C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
|
|---|
| 77 | also exists.
|
|---|
| 78 |
|
|---|
| 79 | =item *
|
|---|
| 80 |
|
|---|
| 81 | A hash C<%DB::sub> is maintained, whose keys are subroutine names
|
|---|
| 82 | and whose values have the form C<filename:startline-endline>.
|
|---|
| 83 | C<filename> has the form C<(eval 34)> for subroutines defined inside
|
|---|
| 84 | C<eval>s, or C<(re_eval 19)> for those within regex code assertions.
|
|---|
| 85 |
|
|---|
| 86 | =item *
|
|---|
| 87 |
|
|---|
| 88 | When the execution of your program reaches a point that can hold a
|
|---|
| 89 | breakpoint, the C<DB::DB()> subroutine is called if any of the variables
|
|---|
| 90 | C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables
|
|---|
| 91 | are not C<local>izable. This feature is disabled when executing
|
|---|
| 92 | inside C<DB::DB()>, including functions called from it
|
|---|
| 93 | unless C<< $^D & (1<<30) >> is true.
|
|---|
| 94 |
|
|---|
| 95 | =item *
|
|---|
| 96 |
|
|---|
| 97 | When execution of the program reaches a subroutine call, a call to
|
|---|
| 98 | C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
|
|---|
| 99 | name of the called subroutine. (This doesn't happen if the subroutine
|
|---|
| 100 | was compiled in the C<DB> package.)
|
|---|
| 101 |
|
|---|
| 102 | =back
|
|---|
| 103 |
|
|---|
| 104 | Note that if C<&DB::sub> needs external data for it to work, no
|
|---|
| 105 | subroutine call is possible without it. As an example, the standard
|
|---|
| 106 | debugger's C<&DB::sub> depends on the C<$DB::deep> variable
|
|---|
| 107 | (it defines how many levels of recursion deep into the debugger you can go
|
|---|
| 108 | before a mandatory break). If C<$DB::deep> is not defined, subroutine
|
|---|
| 109 | calls are not possible, even though C<&DB::sub> exists.
|
|---|
| 110 |
|
|---|
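The C<caller>/C<@DB::args> behaviour described earlier in this section can be observed without B<-d>. The following is a minimal sketch rather than anything the debugger itself ships; the helper names C<caller_args> and C<greet> are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments that the sub which called us
# was itself invoked with.
sub caller_args {
    package DB;              # compile the caller() call in package DB
    my @frame = caller(1);   # list context plus an argument fills @DB::args
    return @DB::args;
}

sub greet { return join ',', caller_args() }

print greet('a', 'b'), "\n";    # prints "a,b"
```

The C<package DB> statement only affects the enclosing block, so the rest of the program keeps its own package.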
| 111 | =head2 Writing Your Own Debugger
|
|---|
| 112 |
|
|---|
| 113 | =head3 Environment Variables
|
|---|
| 114 |
|
|---|
| 115 | The C<PERL5DB> environment variable can be used to define a debugger.
|
|---|
| 116 | For example, the minimal "working" debugger (it actually doesn't do anything)
|
|---|
| 117 | consists of one line:
|
|---|
| 118 |
|
|---|
| 119 | sub DB::DB {}
|
|---|
| 120 |
|
|---|
| 121 | It can easily be defined like this:
|
|---|
| 122 |
|
|---|
| 123 | $ PERL5DB="sub DB::DB {}" perl -d your-script
|
|---|
| 124 |
|
|---|
| 125 | Another brief debugger, slightly more useful, can be created
|
|---|
| 126 | with only the line:
|
|---|
| 127 |
|
|---|
| 128 | sub DB::DB {print ++$i; scalar <STDIN>}
|
|---|
| 129 |
|
|---|
| 130 | This debugger prints a number which increments for each statement
|
|---|
| 131 | encountered and waits for you to hit a newline before continuing
|
|---|
| 132 | to the next statement.
|
|---|
| 133 |
|
|---|
| 134 | The following debugger is actually useful:
|
|---|
| 135 |
|
|---|
| 136 | {
|
|---|
| 137 | package DB;
|
|---|
| 138 | sub DB {}
|
|---|
| 139 | sub sub {print ++$i, " $sub\n"; &$sub}
|
|---|
| 140 | }
|
|---|
| 141 |
|
|---|
| 142 | It prints the sequence number of each subroutine call and the name of the
|
|---|
| 143 | called subroutine. Note that C<&DB::sub> is being compiled into the
|
|---|
| 144 | package C<DB> through the use of the C<package> directive.
|
|---|
| 145 |
|
|---|
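For anything longer than one line it may be more convenient to keep the debugger in a file and load it through C<PERL5DB>. The sketch below assumes a file named F<mydb.pl> in the current directory; the file name and the C<@DB::trace> array are illustrative and not part of Perl. It relies on C<caller>, when called inside C<DB::DB>, reporting the package, file, and line of the statement about to run.

```perl
# mydb.pl (hypothetical file name)
package DB;

our @trace;    # illustrative log of "file:line" strings

sub DB {
    # Inside DB::DB, caller() reports the statement about to be executed.
    my ($package, $file, $line) = caller;
    push @trace, "$file:$line";
    print STDERR "$file:$line\n";
}

1;    # a required file must return a true value
```

Under these assumptions it could be loaded with something like C<PERL5DB='BEGIN { require "./mydb.pl" }' perl -d your-script>.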
| 146 | When it starts, the debugger reads your rc file (F<./.perldb> or
|
|---|
| 147 | F<~/.perldb> under Unix), which can set important options.
|
|---|
| 148 | (A subroutine (C<&afterinit>) can be defined here as well; it is executed
|
|---|
| 149 | after the debugger completes its own initialization.)
|
|---|
| 150 |
|
|---|
| 151 | After the rc file is read, the debugger reads the PERLDB_OPTS
|
|---|
| 152 | environment variable and uses it to set debugger options. The
|
|---|
| 153 | contents of this variable are treated as if they were the argument
|
|---|
| 154 | of an C<o ...> debugger command (q.v. in L<perldebug/Options>).
|
|---|
| 155 |
|
|---|
| 156 | =head3 Debugger internal variables
|
|---|
| 157 | In addition to the file and subroutine-related variables mentioned above,
|
|---|
| 158 | the debugger also maintains various magical internal variables.
|
|---|
| 159 |
|
|---|
| 160 | =over 4
|
|---|
| 161 |
|
|---|
| 162 | =item *
|
|---|
| 163 |
|
|---|
| 164 | C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
|
|---|
| 165 | holds the lines of the currently-selected file (compiled by Perl), either
|
|---|
| 166 | explicitly chosen with the debugger's C<f> command, or implicitly by flow
|
|---|
| 167 | of execution.
|
|---|
| 168 |
|
|---|
| 169 | Values in this array are magical in numeric context: they compare
|
|---|
| 170 | equal to zero only if the line is not breakable.
|
|---|
| 171 |
|
|---|
| 172 | =item *
|
|---|
| 173 |
|
|---|
| 174 | C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
|
|---|
| 175 | contains breakpoints and actions keyed by line number in
|
|---|
| 176 | the currently-selected file, either explicitly chosen with the
|
|---|
| 177 | debugger's C<f> command, or implicitly by flow of execution.
|
|---|
| 178 |
|
|---|
| 179 | As previously noted, individual entries (as opposed to the whole hash)
|
|---|
| 180 | are settable. Perl only cares about Boolean true here, although
|
|---|
| 181 | the values used by F<perl5db.pl> have the form
|
|---|
| 182 | C<"$break_condition\0$action">.
|
|---|
| 183 |
|
|---|
| 184 | =back
|
|---|
| 185 |
|
|---|
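As an illustration of the numeric magic described above, the following sketch lists the breakable lines of a file. The helper name C<breakable_lines> is hypothetical, and the result is only meaningful when the program runs under B<-d>, since otherwise Perl does not fill the C<@{"::_E<lt>$filename"}> arrays.

```perl
package DB;

# Hypothetical helper: list the breakable line numbers of a compiled file.
# Only meaningful under -d, when Perl fills @{"::_<$file"}.
sub breakable_lines {
    my ($file) = @_;
    no strict 'refs';
    my $lines = \@{"::_<$file"};    # line numbers start at 1
    # A line is breakable if its entry does not compare equal to zero.
    return grep { defined $lines->[$_] && $lines->[$_] != 0 } 1 .. $#$lines;
}
```

Warnings are deliberately left off here, because non-breakable lines are plain source text and would otherwise trigger "isn't numeric" warnings in the comparison.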
| 186 | =head3 Debugger customization functions
|
|---|
| 187 |
|
|---|
| 188 | Some functions are provided to simplify customization.
|
|---|
| 189 |
|
|---|
| 190 | =over 4
|
|---|
| 191 |
|
|---|
| 192 | =item *
|
|---|
| 193 |
|
|---|
| 194 | C<DB::parse_options(string)> parses debugger options; see
|
|---|
| 195 | L<perldebug/Options> for a description of the options
|
|---|
| 196 | recognized.
|
|---|
| 197 |
|
|---|
| 198 | =item *
|
|---|
| 199 |
|
|---|
| 200 | C<DB::dump_trace(skip[,count])> skips the specified number of frames
|
|---|
| 201 | and returns a list containing information about the calling frames (all
|
|---|
| 202 | of them, if C<count> is missing). Each entry is a reference to a hash
|
|---|
| 203 | with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
|
|---|
| 204 | name, or info about C<eval>), C<args> (C<undef> or a reference to
|
|---|
| 205 | an array), C<file>, and C<line>.
|
|---|
| 206 |
|
|---|
| 207 | =item *
|
|---|
| 208 |
|
|---|
| 209 | C<DB::print_trace(FH, skip[, count[, short]])> prints
|
|---|
| 210 | formatted info about caller frames. The last two functions may be
|
|---|
| 211 | convenient as arguments to C<< < >>, C<< << >> commands.
|
|---|
| 212 |
|
|---|
| 213 | =back
|
|---|
| 214 |
|
|---|
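The frame hashes described for C<DB::dump_trace> can be post-processed with ordinary Perl. The following sketch assumes only the keys listed above; C<format_frame> and the sample data are hypothetical, and the output layout is merely modelled on the C<frame> listings found elsewhere in this document.

```perl
use strict;
use warnings;

# Hypothetical formatter for one entry of the list returned by
# DB::dump_trace, using only the documented keys.
sub format_frame {
    my ($frame) = @_;
    my $args = defined $frame->{args}
        ? join(', ', map { defined $_ ? "'$_'" : 'undef' } @{ $frame->{args} })
        : '';
    return sprintf '%s=%s(%s) from %s:%d',
        @{$frame}{qw(context sub)}, $args, @{$frame}{qw(file line)};
}

# Made-up sample entry, shaped like the description above.
my %sample = (
    context => '$',
    sub     => 'Config::TIEHASH',
    args    => ['Config'],
    file    => 'lib/Config.pm',
    line    => 644,
);
print format_frame(\%sample), "\n";
```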
| 215 | Note that any variables and functions that are not documented in
|
|---|
| 216 | this manpage (or in L<perldebug>) are considered for internal
|
|---|
| 217 | use only, and as such are subject to change without notice.
|
|---|
| 218 |
|
|---|
| 219 | =head1 Frame Listing Output Examples
|
|---|
| 220 |
|
|---|
| 221 | The C<frame> option can be used to control the output of frame
|
|---|
| 222 | information. For example, contrast this expression trace:
|
|---|
| 223 |
|
|---|
| 224 | $ perl -de 42
|
|---|
| 225 | Stack dump during die enabled outside of evals.
|
|---|
| 226 |
|
|---|
| 227 | Loading DB routines from perl5db.pl patch level 0.94
|
|---|
| 228 | Emacs support available.
|
|---|
| 229 |
|
|---|
| 230 | Enter h or `h h' for help.
|
|---|
| 231 |
|
|---|
| 232 | main::(-e:1): 0
|
|---|
| 233 | DB<1> sub foo { 14 }
|
|---|
| 234 |
|
|---|
| 235 | DB<2> sub bar { 3 }
|
|---|
| 236 |
|
|---|
| 237 | DB<3> t print foo() * bar()
|
|---|
| 238 | main::((eval 172):3): print foo() * bar();
|
|---|
| 239 | main::foo((eval 168):2):
|
|---|
| 240 | main::bar((eval 170):2):
|
|---|
| 241 | 42
|
|---|
| 242 |
|
|---|
| 243 | with this one, once the C<o>ption C<frame=2> has been set:
|
|---|
| 244 |
|
|---|
| 245 | DB<4> o f=2
|
|---|
| 246 | frame = '2'
|
|---|
| 247 | DB<5> t print foo() * bar()
|
|---|
| 248 | 3: foo() * bar()
|
|---|
| 249 | entering main::foo
|
|---|
| 250 | 2: sub foo { 14 };
|
|---|
| 251 | exited main::foo
|
|---|
| 252 | entering main::bar
|
|---|
| 253 | 2: sub bar { 3 };
|
|---|
| 254 | exited main::bar
|
|---|
| 255 | 42
|
|---|
| 256 |
|
|---|
| 257 | By way of demonstration, we present below a laborious listing
|
|---|
| 258 | resulting from setting your C<PERLDB_OPTS> environment variable to
|
|---|
| 259 | the value C<f=n N>, and running I<perl -d -V> from the command line.
|
|---|
| 260 | Examples using various values of C<n> are shown to give you a feel
|
|---|
| 261 | for the difference between settings. Long though it may be, this
|
|---|
| 262 | is not a complete listing, but only excerpts.
|
|---|
| 263 |
|
|---|
| 264 | =over 4
|
|---|
| 265 |
|
|---|
| 266 | =item 1
|
|---|
| 267 |
|
|---|
| 268 | entering main::BEGIN
|
|---|
| 269 | entering Config::BEGIN
|
|---|
| 270 | Package lib/Exporter.pm.
|
|---|
| 271 | Package lib/Carp.pm.
|
|---|
| 272 | Package lib/Config.pm.
|
|---|
| 273 | entering Config::TIEHASH
|
|---|
| 274 | entering Exporter::import
|
|---|
| 275 | entering Exporter::export
|
|---|
| 276 | entering Config::myconfig
|
|---|
| 277 | entering Config::FETCH
|
|---|
| 278 | entering Config::FETCH
|
|---|
| 279 | entering Config::FETCH
|
|---|
| 280 | entering Config::FETCH
|
|---|
| 281 |
|
|---|
| 282 | =item 2
|
|---|
| 283 |
|
|---|
| 284 | entering main::BEGIN
|
|---|
| 285 | entering Config::BEGIN
|
|---|
| 286 | Package lib/Exporter.pm.
|
|---|
| 287 | Package lib/Carp.pm.
|
|---|
| 288 | exited Config::BEGIN
|
|---|
| 289 | Package lib/Config.pm.
|
|---|
| 290 | entering Config::TIEHASH
|
|---|
| 291 | exited Config::TIEHASH
|
|---|
| 292 | entering Exporter::import
|
|---|
| 293 | entering Exporter::export
|
|---|
| 294 | exited Exporter::export
|
|---|
| 295 | exited Exporter::import
|
|---|
| 296 | exited main::BEGIN
|
|---|
| 297 | entering Config::myconfig
|
|---|
| 298 | entering Config::FETCH
|
|---|
| 299 | exited Config::FETCH
|
|---|
| 300 | entering Config::FETCH
|
|---|
| 301 | exited Config::FETCH
|
|---|
| 302 | entering Config::FETCH
|
|---|
| 303 |
|
|---|
| 304 | =item 4
|
|---|
| 305 |
|
|---|
| 306 | in $=main::BEGIN() from /dev/null:0
|
|---|
| 307 | in $=Config::BEGIN() from lib/Config.pm:2
|
|---|
| 308 | Package lib/Exporter.pm.
|
|---|
| 309 | Package lib/Carp.pm.
|
|---|
| 310 | Package lib/Config.pm.
|
|---|
| 311 | in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|---|
| 312 | in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 313 | in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
|
|---|
| 314 | in @=Config::myconfig() from /dev/null:0
|
|---|
| 315 | in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|---|
| 316 | in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|---|
| 317 | in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
|
|---|
| 318 | in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
|
|---|
| 319 | in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
|
|---|
| 320 | in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
|
|---|
| 321 |
|
|---|
| 322 | =item 6
|
|---|
| 323 |
|
|---|
| 324 | in $=main::BEGIN() from /dev/null:0
|
|---|
| 325 | in $=Config::BEGIN() from lib/Config.pm:2
|
|---|
| 326 | Package lib/Exporter.pm.
|
|---|
| 327 | Package lib/Carp.pm.
|
|---|
| 328 | out $=Config::BEGIN() from lib/Config.pm:0
|
|---|
| 329 | Package lib/Config.pm.
|
|---|
| 330 | in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|---|
| 331 | out $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|---|
| 332 | in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 333 | in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
|
|---|
| 334 | out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
|
|---|
| 335 | out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 336 | out $=main::BEGIN() from /dev/null:0
|
|---|
| 337 | in @=Config::myconfig() from /dev/null:0
|
|---|
| 338 | in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|---|
| 339 | out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|---|
| 340 | in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|---|
| 341 | out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|---|
| 342 | in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
|
|---|
| 343 | out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
|
|---|
| 344 | in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
|
|---|
| 345 |
|
|---|
| 346 | =item 14
|
|---|
| 347 |
|
|---|
| 348 | in $=main::BEGIN() from /dev/null:0
|
|---|
| 349 | in $=Config::BEGIN() from lib/Config.pm:2
|
|---|
| 350 | Package lib/Exporter.pm.
|
|---|
| 351 | Package lib/Carp.pm.
|
|---|
| 352 | out $=Config::BEGIN() from lib/Config.pm:0
|
|---|
| 353 | Package lib/Config.pm.
|
|---|
| 354 | in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|---|
| 355 | out $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|---|
| 356 | in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 357 | in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
|
|---|
| 358 | out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
|
|---|
| 359 | out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 360 | out $=main::BEGIN() from /dev/null:0
|
|---|
| 361 | in @=Config::myconfig() from /dev/null:0
|
|---|
| 362 | in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
|
|---|
| 363 | out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
|
|---|
| 364 | in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
|
|---|
| 365 | out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
|
|---|
| 366 |
|
|---|
| 367 | =item 30
|
|---|
| 368 |
|
|---|
| 369 | in $=CODE(0x15eca4)() from /dev/null:0
|
|---|
| 370 | in $=CODE(0x182528)() from lib/Config.pm:2
|
|---|
| 371 | Package lib/Exporter.pm.
|
|---|
| 372 | out $=CODE(0x182528)() from lib/Config.pm:0
|
|---|
| 373 | scalar context return from CODE(0x182528): undef
|
|---|
| 374 | Package lib/Config.pm.
|
|---|
| 375 | in $=Config::TIEHASH('Config') from lib/Config.pm:628
|
|---|
| 376 | out $=Config::TIEHASH('Config') from lib/Config.pm:628
|
|---|
| 377 | scalar context return from Config::TIEHASH: empty hash
|
|---|
| 378 | in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 379 | in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
|
|---|
| 380 | out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
|
|---|
| 381 | scalar context return from Exporter::export: ''
|
|---|
| 382 | out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|---|
| 383 | scalar context return from Exporter::import: ''
|
|---|
| 384 |
|
|---|
| 385 | =back
|
|---|
| 386 |
|
|---|
| 387 | In all cases shown above, the line indentation shows the call tree.
|
|---|
| 388 | If bit 2 of C<frame> is set, a line is printed on exit from a
|
|---|
| 389 | subroutine as well. If bit 4 is set, the arguments are printed
|
|---|
| 390 | along with the caller info. If bit 8 is set, the arguments are
|
|---|
| 391 | printed even if they are tied or references. If bit 16 is set, the
|
|---|
| 392 | return value is printed, too.
|
|---|
| 393 |
|
|---|
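The bit meanings above can be summarised in a few lines of Perl. This is only a sketch of the description given here, not code taken from the debugger; C<describe_frame> and C<@FRAME_BITS> are hypothetical names.

```perl
use strict;
use warnings;

# Bit values of the frame option as described above.
my @FRAME_BITS = (
    [  2 => 'print a line on subroutine exit' ],
    [  4 => 'print arguments with the caller info' ],
    [  8 => 'print arguments even if tied or references' ],
    [ 16 => 'print the return value' ],
);

# Hypothetical helper: describe which of those bits are set in $frame.
sub describe_frame {
    my ($frame) = @_;
    return map { $_->[1] } grep { $frame & $_->[0] } @FRAME_BITS;
}

print "frame=$_: ", join('; ', describe_frame($_)) || 'entry lines only', "\n"
    for 1, 2, 6, 14, 30;
```

The values 1, 2, 6, 14, and 30 are those used in the listings above, apart from 4; for example 30 is 2+4+8+16, so all four descriptions apply.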
| 394 | When a package is compiled, a line like this
|
|---|
| 395 |
|
|---|
| 396 | Package lib/Carp.pm.
|
|---|
| 397 |
|
|---|
| 398 | is printed with proper indentation.
|
|---|
| 399 |
|
|---|
| 400 | =head1 Debugging regular expressions
|
|---|
| 401 |
|
|---|
| 402 | There are two ways to enable debugging output for regular expressions.
|
|---|
| 403 |
|
|---|
| 404 | If your perl is compiled with C<-DDEBUGGING>, you may use the
|
|---|
| 405 | B<-Dr> flag on the command line.
|
|---|
| 406 |
|
|---|
| 407 | Otherwise, one can C<use re 'debug'>, which has effects at
|
|---|
| 408 | compile time and run time. It is not lexically scoped.
|
|---|
| 409 |
|
|---|
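As a minimal illustration, the following script enables the pragma, then compiles and runs the pattern used in the examples in the rest of this section. The exact text printed on standard error varies between Perl versions, so it may not match the listings here character for character.

```perl
use strict;
use warnings;
use re 'debug';    # debugging output for regexes goes to STDERR

# Pattern and subject string taken from the examples in this document.
my $matched = 'abcdefg__gh__' =~ /[bc]d(ef*g)+h[ij]k$/;
print $matched ? "matched\n" : "no match\n";    # prints "no match"
```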
| 410 | =head2 Compile-time output
|
|---|
| 411 |
|
|---|
| 412 | The debugging output at compile time looks like this:
|
|---|
| 413 |
|
|---|
| 414 | Compiling REx `[bc]d(ef*g)+h[ij]k$'
|
|---|
| 415 | size 45 Got 364 bytes for offset annotations.
|
|---|
| 416 | first at 1
|
|---|
| 417 | rarest char g at 0
|
|---|
| 418 | rarest char d at 0
|
|---|
| 419 | 1: ANYOF[bc](12)
|
|---|
| 420 | 12: EXACT <d>(14)
|
|---|
| 421 | 14: CURLYX[0] {1,32767}(28)
|
|---|
| 422 | 16: OPEN1(18)
|
|---|
| 423 | 18: EXACT <e>(20)
|
|---|
| 424 | 20: STAR(23)
|
|---|
| 425 | 21: EXACT <f>(0)
|
|---|
| 426 | 23: EXACT <g>(25)
|
|---|
| 427 | 25: CLOSE1(27)
|
|---|
| 428 | 27: WHILEM[1/1](0)
|
|---|
| 429 | 28: NOTHING(29)
|
|---|
| 430 | 29: EXACT <h>(31)
|
|---|
| 431 | 31: ANYOF[ij](42)
|
|---|
| 432 | 42: EXACT <k>(44)
|
|---|
| 433 | 44: EOL(45)
|
|---|
| 434 | 45: END(0)
|
|---|
| 435 | anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
|
|---|
| 436 | stclass `ANYOF[bc]' minlen 7
|
|---|
| 437 | Offsets: [45]
|
|---|
| 438 | 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
|
|---|
| 439 | 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
|
|---|
| 440 | 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
|
|---|
| 441 | 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
|
|---|
| 442 | Omitting $` $& $' support.
|
|---|
| 443 |
|
|---|
| 444 | The first line shows the pre-compiled form of the regex. The second
|
|---|
| 445 | shows the size of the compiled form (in arbitrary units, usually
|
|---|
| 446 | 4-byte words) and the total number of bytes allocated for the
|
|---|
| 447 | offset/length table, usually 4+C<size>*8. The next line shows the
|
|---|
| 448 | label I<id> of the first node that does a match.
|
|---|
| 449 |
|
|---|
| 450 | The
|
|---|
| 451 |
|
|---|
| 452 | anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
|
|---|
| 453 | stclass `ANYOF[bc]' minlen 7
|
|---|
| 454 |
|
|---|
| 455 | line (split into two lines above) contains optimizer
|
|---|
| 456 | information. In the example shown, the optimizer found that the match
|
|---|
| 457 | should contain a substring C<de> at offset 1, plus substring C<gh>
|
|---|
| 458 | at some offset between 3 and infinity. Moreover, when checking for
|
|---|
| 459 | these substrings (to abandon impossible matches quickly), Perl will check
|
|---|
| 460 | for the substring C<gh> before checking for the substring C<de>. The
|
|---|
| 461 | optimizer may also use the knowledge that the match starts (at the
|
|---|
| 462 | C<first> I<id>) with a character class, and no string
|
|---|
| 463 | shorter than 7 characters can possibly match.
|
|---|
| 464 |
|
|---|
| 465 | The fields of interest which may appear in this line are
|
|---|
| 466 |
|
|---|
| 467 | =over 4
|
|---|
| 468 |
|
|---|
| 469 | =item C<anchored> I<STRING> C<at> I<POS>
|
|---|
| 470 |
|
|---|
| 471 | =item C<floating> I<STRING> C<at> I<POS1..POS2>
|
|---|
| 472 |
|
|---|
| 473 | See above.
|
|---|
| 474 |
|
|---|
| 475 | =item C<checking floating/anchored>
|
|---|
| 476 |
|
|---|
| 477 | Which substring to check first.
|
|---|
| 478 |
|
|---|
| 479 | =item C<minlen>
|
|---|
| 480 |
|
|---|
| 481 | The minimal length of the match.
|
|---|
| 482 |
|
|---|
| 483 | =item C<stclass> I<TYPE>
|
|---|
| 484 |
|
|---|
| 485 | Type of first matching node.
|
|---|
| 486 |
|
|---|
| 487 | =item C<noscan>
|
|---|
| 488 |
|
|---|
| 489 | Don't scan for the found substrings.
|
|---|
| 490 |
|
|---|
| 491 | =item C<isall>
|
|---|
| 492 |
|
|---|
| 493 | Means that the optimizer information is all that the regular
|
|---|
| 494 | expression contains, and thus one does not need to enter the regex engine at
|
|---|
| 495 | all.
|
|---|
| 496 |
|
|---|
| 497 | =item C<GPOS>
|
|---|
| 498 |
|
|---|
| 499 | Set if the pattern contains C<\G>.
|
|---|
| 500 |
|
|---|
| 501 | =item C<plus>
|
|---|
| 502 |
|
|---|
| 503 | Set if the pattern starts with a repeated char (as in C<x+y>).
|
|---|
| 504 |
|
|---|
| 505 | =item C<implicit>
|
|---|
| 506 |
|
|---|
| 507 | Set if the pattern starts with C<.*>.
|
|---|
| 508 |
|
|---|
| 509 | =item C<with eval>
|
|---|
| 510 |
|
|---|
| 511 | Set if the pattern contains eval-groups, such as C<(?{ code })> and
|
|---|
| 512 | C<(??{ code })>.
|
|---|
| 513 |
|
|---|
| 514 | =item C<anchored(TYPE)>
|
|---|
| 515 |
|
|---|
| 516 | Set if the pattern may match only at a handful of places, with C<TYPE>
|
|---|
| 517 | being C<BOL>, C<MBOL>, or C<GPOS>. See the table below.
|
|---|
| 518 |
|
|---|
| 519 | =back
|
|---|
| 520 |
|
|---|
| 521 | If a substring is known to match at end-of-line only, it may be
|
|---|
| 522 | followed by C<$>, as in C<floating `k'$>.
|
|---|
| 523 |
|
|---|
| 524 | The optimizer-specific information is used to avoid entering (a slow) regex
|
|---|
| 525 | engine on strings that will definitely not match. If the C<isall> flag
|
|---|
| 526 | is set, a call to the regex engine may be avoided even when the optimizer
|
|---|
| 527 | found an appropriate place for the match.
|
|---|
| 528 |
|
|---|
| 529 | Above the optimizer section is the list of I<nodes> of the compiled
|
|---|
| 530 | form of the regex. Each line has the format
|
|---|
| 531 |
|
|---|
| 532 | C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
|
|---|
| 533 |
|
|---|
| 534 | =head2 Types of nodes
|
|---|
| 535 |
|
|---|
| 536 | Here are the possible types, with short descriptions:
|
|---|
| 537 |
|
|---|
| 538 | # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
|
|---|
| 539 |
|
|---|
| 540 | # Exit points
|
|---|
| 541 | END no End of program.
|
|---|
| 542 | SUCCEED no Return from a subroutine, basically.
|
|---|
| 543 |
|
|---|
| 544 | # Anchors:
|
|---|
| 545 | BOL no Match "" at beginning of line.
|
|---|
| 546 | MBOL no Same, assuming multiline.
|
|---|
| 547 | SBOL no Same, assuming singleline.
|
|---|
| 548 | EOS no Match "" at end of string.
|
|---|
| 549 | EOL no Match "" at end of line.
|
|---|
| 550 | MEOL no Same, assuming multiline.
|
|---|
| 551 | SEOL no Same, assuming singleline.
|
|---|
| 552 | BOUND no Match "" at any word boundary
|
|---|
| 553 | BOUNDL no Match "" at any word boundary
|
|---|
| 554 | NBOUND no Match "" at any word non-boundary
|
|---|
| 555 | NBOUNDL no Match "" at any word non-boundary
|
|---|
| 556 | GPOS no Matches where last m//g left off.
|
|---|
| 557 |
|
|---|
| 558 | # [Special] alternatives
|
|---|
| 559 | ANY no Match any one character (except newline).
|
|---|
| 560 | SANY no Match any one character.
|
|---|
| 561 | ANYOF sv Match character in (or not in) this class.
|
|---|
| 562 | ALNUM no Match any alphanumeric character
|
|---|
| 563 | ALNUML no Match any alphanumeric char in locale
|
|---|
| 564 | NALNUM no Match any non-alphanumeric character
|
|---|
| 565 | NALNUML no Match any non-alphanumeric char in locale
|
|---|
| 566 | SPACE no Match any whitespace character
|
|---|
| 567 | SPACEL no Match any whitespace char in locale
|
|---|
| 568 | NSPACE no Match any non-whitespace character
|
|---|
| 569 | NSPACEL no Match any non-whitespace char in locale
|
|---|
| 570 | DIGIT no Match any numeric character
|
|---|
| 571 | NDIGIT no Match any non-numeric character
|
|---|
| 572 |
|
|---|
| 573 | # BRANCH The set of branches constituting a single choice are hooked
|
|---|
| 574 | # together with their "next" pointers, since precedence prevents
|
|---|
| 575 | # anything being concatenated to any individual branch. The
|
|---|
| 576 | # "next" pointer of the last BRANCH in a choice points to the
|
|---|
| 577 | # thing following the whole choice. This is also where the
|
|---|
| 578 | # final "next" pointer of each individual branch points; each
|
|---|
| 579 | # branch starts with the operand node of a BRANCH node.
|
|---|
| 580 | #
|
|---|
| 581 | BRANCH node Match this alternative, or the next...
|
|---|
| 582 |
|
|---|
| 583 | # BACK Normal "next" pointers all implicitly point forward; BACK
|
|---|
| 584 | # exists to make loop structures possible.
|
|---|
| 585 | # not used
|
|---|
| 586 | BACK no Match "", "next" ptr points backward.
|
|---|
| 587 |
|
|---|
| 588 | # Literals
|
|---|
| 589 | EXACT sv Match this string (preceded by length).
|
|---|
| 590 | EXACTF sv Match this string, folded (prec. by length).
|
|---|
| 591 | EXACTFL sv Match this string, folded in locale (w/len).
|
|---|
| 592 |
|
|---|
| 593 | # Do nothing
|
|---|
| 594 | NOTHING no Match empty string.
|
|---|
| 595 | # A variant of above which delimits a group, thus stops optimizations
|
|---|
| 596 | TAIL no Match empty string. Can jump here from outside.
|
|---|
| 597 |
|
|---|
| 598 | # STAR,PLUS '?', and complex '*' and '+', are implemented as circular
|
|---|
| 599 | # BRANCH structures using BACK. Simple cases (one character
|
|---|
| 600 | # per match) are implemented with STAR and PLUS for speed
|
|---|
| 601 | # and to minimize recursive plunges.
|
|---|
| 602 | #
|
|---|
| 603 | STAR node Match this (simple) thing 0 or more times.
|
|---|
| 604 | PLUS node Match this (simple) thing 1 or more times.
|
|---|
| 605 |
|
|---|
| 606 | CURLY sv 2 Match this simple thing {n,m} times.
|
|---|
| 607 | CURLYN no 2 Match next-after-this simple thing
|
|---|
| 608 | # {n,m} times, set parens.
|
|---|
| 609 | CURLYM no 2 Match this medium-complex thing {n,m} times.
|
|---|
| 610 | CURLYX sv 2 Match this complex thing {n,m} times.
|
|---|
| 611 |
|
|---|
| 612 | # This terminator creates a loop structure for CURLYX
|
|---|
| 613 | WHILEM no Do curly processing and see if rest matches.
|
|---|
| 614 |
|
|---|
| 615 | # OPEN,CLOSE,GROUPP ...are numbered at compile time.
|
|---|
| 616 | OPEN num 1 Mark this point in input as start of #n.
|
|---|
| 617 | CLOSE num 1 Analogous to OPEN.
|
|---|
| 618 |
|
|---|
| 619 | REF num 1 Match some already matched string
|
|---|
| 620 | REFF num 1 Match already matched string, folded
|
|---|
| 621 | REFFL num 1 Match already matched string, folded in loc.
|
|---|
| 622 |
|
|---|
| 623 | # grouping assertions
|
|---|
| 624 | IFMATCH off 1 2 Succeeds if the following matches.
|
|---|
| 625 | UNLESSM off 1 2 Fails if the following matches.
|
|---|
| 626 | SUSPEND off 1 1 "Independent" sub-regex.
|
|---|
| 627 | IFTHEN off 1 1 Switch, should be preceded by switcher .
|
|---|
| 628 | GROUPP num 1 Whether the group matched.
|
|---|
| 629 |
|
|---|
| 630 | # Support for long regex
|
|---|
| 631 | LONGJMP off 1 1 Jump far away.
|
|---|
| 632 | BRANCHJ off 1 1 BRANCH with long offset.
|
|---|
| 633 |
|
|---|
| 634 | # The heavy worker
|
|---|
| 635 | EVAL evl 1 Execute some Perl code.
|
|---|
| 636 |
|
|---|
| 637 | # Modifiers
|
|---|
| 638 | MINMOD no Next operator is not greedy.
|
|---|
| 639 | LOGICAL no Next opcode should set the flag only.
|
|---|
| 640 |
|
|---|
| 641 | # This is not used yet
|
|---|
| 642 | RENUM off 1 1 Group with independently numbered parens.
|
|---|
| 643 |
|
|---|
| 644 | # This is not really a node, but an optimized away piece of a "long" node.
|
|---|
| 645 | # To simplify debugging output, we mark it as if it were a node
|
|---|
| 646 | OPTIMIZED off Placeholder for dump.
|
|---|
| 647 |
|
|---|
| 648 | =for unprinted-credits
|
|---|
| 649 | Next section M-J. Dominus ([email protected]) 20010421
|
|---|
| 650 |
|
|---|
| 651 | Following the optimizer information is a dump of the offset/length
|
|---|
| 652 | table, here split across several lines:
|
|---|
| 653 |
|
|---|
| 654 | Offsets: [45]
|
|---|
| 655 | 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
|
|---|
| 656 | 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
|
|---|
| 657 | 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
|
|---|
| 658 | 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
|
|---|
| 659 |
|
|---|
| 660 | The first line here indicates that the offset/length table contains 45
|
|---|
| 661 | entries. Each entry is a pair of integers, denoted by C<offset[length]>.
|
|---|
| 662 | Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
|
|---|
| 663 | entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
|
|---|
| 664 | (the C<1: ANYOF[bc]>) begins at character position 1 in the
|
|---|
| 665 | pre-compiled form of the regex, and has a length of 4 characters.
|
|---|
| 666 | C<5[1]> in position 12
|
|---|
| 667 | indicates that the node labeled C<12:>
|
|---|
| 668 | (the C<< 12: EXACT <d> >>) begins at character position 5 in the
|
|---|
| 669 | pre-compiled form of the regex, and has a length of 1 character.
|
|---|
| 670 | C<12[1]> in position 14
|
|---|
| 671 | indicates that the node labeled C<14:>
|
|---|
| 672 | (the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
|
|---|
| 673 | pre-compiled form of the regex, and has a length of 1 character; that
|
|---|
| 674 | is, it corresponds to the C<+> symbol in the pre-compiled regex.
|
|---|
| 675 |
|
|---|
| 676 | C<0[0]> items indicate that there is no corresponding node.
|
|---|
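The numbering rule described above can be checked mechanically. The following is a minimal sketch, not part of Perl itself: C<parse_offsets> is a hypothetical helper that walks the dump shown earlier, numbers the entries from 1, and keeps only those that are not C<0[0]>.

```perl
use strict;
use warnings;

# The Offsets dump from the example above, joined into one string.
my $dump = join ' ',
    '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]',
    '0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]',
    '11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]',
    '0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]';

# parse_offsets() is a hypothetical helper for illustration only.
# It returns the number of entries seen and a hash that maps the
# entry number (starting at 1) to [offset, length], skipping the
# 0[0] placeholders that have no corresponding node.
sub parse_offsets {
    my ($text) = @_;
    my %entry;
    my $n = 0;
    while ($text =~ /(\d+)\[(\d+)\]/g) {
        $n++;
        next if $1 == 0 && $2 == 0;
        $entry{$n} = [ $1, $2 ];
    }
    return ($n, \%entry);
}

my ($count, $entries) = parse_offsets($dump);
print "entries: $count\n";
for my $id (sort { $a <=> $b } keys %$entries) {
    printf "node %2d: offset %2d, length %d\n", $id, @{ $entries->{$id} };
}
```

Each line printed pairs a node label with the position and length of the regex text it came from, which is the same reading given in the prose above.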
| 677 |
|
|---|
| 678 | =head2 Run-time output
|
|---|
| 679 |
|
|---|
| 680 | First of all, when doing a match, one may get no run-time output even
|
|---|
| 681 | if debugging is enabled. This means that the regex engine was never
|
|---|
| 682 | entered and that all of the job was therefore done by the optimizer.
|
|---|
| 683 |
|
|---|
| 684 | If the regex engine was entered, the output may look like this:
|
|---|
| 685 |
|
|---|
| 686 | Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
|
|---|
| 687 | Setting an EVAL scope, savestack=3
|
|---|
| 688 | 2 <ab> <cdefg__gh_> | 1: ANYOF
|
|---|
| 689 | 3 <abc> <defg__gh_> | 11: EXACT <d>
|
|---|
| 690 | 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
|
|---|
| 691 | 4 <abcd> <efg__gh_> | 26: WHILEM
|
|---|
| 692 | 0 out of 1..32767 cc=effff31c
|
|---|
| 693 | 4 <abcd> <efg__gh_> | 15: OPEN1
|
|---|
| 694 | 4 <abcd> <efg__gh_> | 17: EXACT <e>
|
|---|
| 695 | 5 <abcde> <fg__gh_> | 19: STAR
|
|---|
| 696 | EXACT <f> can match 1 times out of 32767...
|
|---|
| 697 | Setting an EVAL scope, savestack=3
|
|---|
| 698 | 6 <bcdef> <g__gh__> | 22: EXACT <g>
|
|---|
| 699 | 7 <bcdefg> <__gh__> | 24: CLOSE1
|
|---|
| 700 | 7 <bcdefg> <__gh__> | 26: WHILEM
|
|---|
| 701 | 1 out of 1..32767 cc=effff31c
|
|---|
| 702 | Setting an EVAL scope, savestack=12
|
|---|
| 703 | 7 <bcdefg> <__gh__> | 15: OPEN1
|
|---|
| 704 | 7 <bcdefg> <__gh__> | 17: EXACT <e>
|
|---|
| 705 | restoring \1 to 4(4)..7
|
|---|
| 706 | failed, try continuation...
|
|---|
| 707 | 7 <bcdefg> <__gh__> | 27: NOTHING
|
|---|
| 708 | 7 <bcdefg> <__gh__> | 28: EXACT <h>
|
|---|
| 709 | failed...
|
|---|
| 710 | failed...
|
|---|
| 711 |
|
|---|
| 712 | The most significant information in the output is about the particular I<node>
|
|---|
| 713 | of the compiled regex that is currently being tested against the target string.
|
|---|
| 714 | The format of these lines is
|
|---|
| 715 |
|
|---|
| 716 | C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
|
|---|
| 717 |
|
|---|
| 718 | The I<TYPE> info is indented with respect to the backtracking level.
|
|---|
| 719 | Other incidental information appears interspersed within.
|
|---|
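If you need to post-process such a trace, the node lines can be picked out with a pattern that follows the format just described. This is only a sketch: C<parse_trace_line> is a hypothetical helper, and it assumes that the pre-string and post-string do not themselves contain C<< > >> followed by whitespace and C<< < >>, which may not hold for every target string.

```perl
use strict;
use warnings;

# parse_trace_line() is a hypothetical helper for illustration only.
# It returns a hash reference for a node line, or nothing for the
# incidental lines interspersed in the trace.
sub parse_trace_line {
    my ($line) = @_;
    return unless $line =~ m{
        ^ \s* (\d+)          # STRING-OFFSET
        \s+ <(.*?)>          # PRE-STRING
        \s+ <(.*?)>          # POST-STRING
        \s* \| \s* (\d+) :   # ID
        \s* (.*?) \s* $      # TYPE
    }x;
    return { offset => $1, pre => $2, post => $3, id => $4, type => $5 };
}

my @trace = (
    '   3 <abc> <defg__gh_>    |  11:  EXACT <d>',
    '                              0 out of 1..32767  cc=effff31c',
);

for my $line (@trace) {
    my $node = parse_trace_line($line);
    if ($node) {
        print "node $node->{id} ($node->{type}) at offset $node->{offset}\n";
    }
    else {
        print "incidental: $line\n";
    }
}
```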
| 720 |
|
|---|
| 721 | =head1 Debugging Perl memory usage
|
|---|
| 722 |
|
|---|
| 723 | Perl is a profligate wastrel when it comes to memory use. There
|
|---|
| 724 | is a saying that to estimate memory usage of Perl, assume a reasonable
|
|---|
| 725 | algorithm for memory allocation, multiply that estimate by 10, and
|
|---|
| 726 | while you still may miss the mark, at least you won't be quite so
|
|---|
| 727 | astonished. This is not absolutely true, but may provide a good
|
|---|
| 728 | grasp of what happens.
|
|---|
| 729 |
|
|---|
| 730 | Assume that an integer cannot take less than 20 bytes of memory, a
|
|---|
| 731 | float cannot take less than 24 bytes, a string cannot take less
|
|---|
| 732 | than 32 bytes (all these examples assume 32-bit architectures, the
|
|---|
| 733 | results are quite a bit worse on 64-bit architectures). If a variable
|
|---|
| 734 | is accessed in two of three different ways (which require an integer,
|
|---|
| 735 | a float, or a string), the memory footprint may increase yet another
|
|---|
| 736 | 20 bytes. A sloppy malloc(3) implementation can inflate these
|
|---|
| 737 | numbers dramatically.
|
|---|
| 738 |
|
|---|
| 739 | On the opposite end of the scale, a declaration like
|
|---|
| 740 |
|
|---|
| 741 | sub foo;
|
|---|
| 742 |
|
|---|
| 743 | may take up to 500 bytes of memory, depending on which release of Perl
|
|---|
| 744 | you're running.
|
|---|
| 745 |
|
|---|
| 746 | Anecdotal estimates of source-to-compiled code bloat suggest an
|
|---|
| 747 | eightfold increase. This means that the compiled form of reasonable
|
|---|
| 748 | (normally commented, properly indented etc.) code will take
|
|---|
| 749 | about eight times more space in memory than the code took
|
|---|
| 750 | on disk.
|
|---|
| 751 |
|
|---|
| 752 | The B<-DL> command-line switch has been obsolete since about Perl 5.6.0
|
|---|
| 753 | (it was available only if Perl was built with C<-DDEBUGGING>).
|
|---|
| 754 | The switch was used to track Perl's memory allocations and possible
|
|---|
| 755 | memory leaks. These days the use of malloc debugging tools like
|
|---|
| 756 | F<Purify> or F<valgrind> is suggested instead.
|
|---|
| 757 |
|
|---|
| 758 | One way to find out how much memory is being used by Perl data
|
|---|
| 759 | structures is to install the Devel::Size module from CPAN: it gives
|
|---|
| 760 | you the minimum number of bytes required to store a particular data
|
|---|
| 761 | structure. Please be mindful of the difference between size()
|
|---|
| 762 | and total_size().
|
|---|
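The difference is easiest to see on a nested structure. The sketch below assumes Devel::Size is installed; since it is a CPAN module rather than a core one, the code loads it at run time and only prints a note if it is missing. The variable names are illustrative.

```perl
use strict;
use warnings;

# Devel::Size comes from CPAN and may not be installed, so load it
# at run time and skip the report if it is missing.
my $have_size = eval { require Devel::Size; 1 };

my @inner = (1 .. 100);
my $outer = [ \@inner ];    # one-element array holding a reference

my ($shallow, $deep);
if ($have_size) {
    # size() measures the outer array only; total_size() also
    # follows the reference and counts @inner and its elements.
    $shallow = Devel::Size::size($outer);
    $deep    = Devel::Size::total_size($outer);
    print "size():       $shallow bytes\n";
    print "total_size(): $deep bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```

The exact figures depend on the Perl build and architecture, so none are shown here.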
| 763 |
|
|---|
| 764 | If Perl has been compiled using Perl's malloc you can analyze Perl
|
|---|
| 765 | memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.
|
|---|
| 766 |
|
|---|
| 767 | =head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
|
|---|
| 768 |
|
|---|
| 769 | If your perl is using Perl's malloc() and was compiled with the
|
|---|
| 770 | necessary switches (this is the default), then it will print memory
|
|---|
| 771 | usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
|
|---|
| 772 | > 1 >>, and before termination of the program when C<<
|
|---|
| 773 | $ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to
|
|---|
| 774 | the following example:
|
|---|
| 775 |
|
|---|
| 776 | $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
|
|---|
| 777 | Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
|
|---|
| 778 | 14216 free: 130 117 28 7 9 0 2 2 1 0 0
|
|---|
| 779 | 437 61 36 0 5
|
|---|
| 780 | 60924 used: 125 137 161 55 7 8 6 16 2 0 1
|
|---|
| 781 | 74 109 304 84 20
|
|---|
| 782 | Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
|
|---|
| 783 | Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
|
|---|
| 784 | 30888 free: 245 78 85 13 6 2 1 3 2 0 1
|
|---|
| 785 | 315 162 39 42 11
|
|---|
| 786 | 175816 used: 265 176 1112 111 26 22 11 27 2 1 1
|
|---|
| 787 | 196 178 1066 798 39
|
|---|
| 788 | Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
|
|---|
| 789 |
|
|---|
| 790 | It is possible to ask for such a statistic at arbitrary points in
|
|---|
| 791 | your execution using the mstat() function out of the standard
|
|---|
| 792 | Devel::Peek module.
|
|---|
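A minimal sketch of such a call follows. The label string and the data being built are arbitrary choices for illustration. The report goes to STDERR, and on a perl that does not use Perl's malloc you should expect only a short notice rather than statistics.

```perl
use strict;
use warnings;
use Devel::Peek qw(mstat);

# Allocate something noticeable, then ask for a report at this point.
my @data = map { "x" x 100 } 1 .. 1000;

# The argument is a free-form label printed with the report.
mstat("after building \@data");
```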
| 793 |
|
|---|
| 794 | Here is some explanation of that format:
|
|---|
| 795 |
|
|---|
| 796 | =over 4
|
|---|
| 797 |
|
|---|
| 798 | =item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
|
|---|
| 799 |
|
|---|
| 800 | Perl's malloc() uses bucketed allocations. Every request is rounded
|
|---|
| 801 | up to the closest bucket size available, and a bucket is taken from
|
|---|
| 802 | the pool of buckets of that size.
|
|---|
| 803 |
|
|---|
| 804 | The line above describes the limits of buckets currently in use.
|
|---|
| 805 | Each bucket has two sizes: memory footprint and the maximal size
|
|---|
| 806 | of user data that can fit into this bucket. Suppose in the above
|
|---|
| 807 | example that the smallest bucket were size 4. The biggest bucket
|
|---|
| 808 | would have usable size 8188, and the memory footprint would be 8192.
|
|---|
| 809 |
|
|---|
| 810 | In a Perl built for debugging, some buckets may have negative usable
|
|---|
| 811 | size. This means that these buckets cannot (and will not) be used.
|
|---|
| 812 | For larger buckets, the memory footprint may be one page greater
|
|---|
| 813 | than a power of 2. If so, the corresponding power of two is
|
|---|
| 814 | printed in the C<APPROX> field above.
|
|---|
| 815 |
|
|---|
| 816 | =item Free/Used
|
|---|
| 817 |
|
|---|
| 818 | The 1 or 2 rows of numbers following that correspond to the number
|
|---|
| 819 | of buckets of each size between C<SMALLEST> and C<GREATEST>. In
|
|---|
| 820 | the first row, the sizes (memory footprints) of buckets are powers
|
|---|
| 821 | of two--or possibly one page greater. In the second row, if present,
|
|---|
| 822 | the memory footprints of the buckets are between the memory footprints
|
|---|
| 823 | of two buckets "above".
|
|---|
| 824 |
|
|---|
| 825 | For example, suppose that in the previous example the memory footprints
|
|---|
| 826 | were
|
|---|
| 827 |
|
|---|
| 828 | free: 8 16 32 64 128 256 512 1024 2048 4096 8192
|
|---|
| 829 | 4 12 24 48 80
|
|---|
| 830 |
|
|---|
| 831 | With non-C<DEBUGGING> perl, the buckets starting from C<128> have
|
|---|
| 832 | a 4-byte overhead, and thus an 8192-long bucket may take up to
|
|---|
| 833 | 8188-byte allocations.
|
|---|
| 834 |
|
|---|
| 835 | =item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
|
|---|
| 836 |
|
|---|
| 837 | The first two fields give the total amount of memory perl sbrk(2)ed
|
|---|
| 838 | (ess-broken? :-) and the number of sbrk(2)s used. The third number is
|
|---|
| 839 | what perl thinks about continuity of returned chunks. So long as
|
|---|
| 840 | this number is positive, malloc() will assume that it is probable
|
|---|
| 841 | that sbrk(2) will provide continuous memory.
|
|---|
| 842 |
|
|---|
| 843 | Memory allocated by external libraries is not counted.
|
|---|
| 844 |
|
|---|
| 845 | =item C<pad: 0>
|
|---|
| 846 |
|
|---|
| 847 | The amount of sbrk(2)ed memory needed to keep buckets aligned.
|
|---|
| 848 |
|
|---|
| 849 | =item C<heads: 2192>
|
|---|
| 850 |
|
|---|
| 851 | Although memory overhead of bigger buckets is kept inside the bucket, for
|
|---|
| 852 | smaller buckets, it is kept in separate areas. This field gives the
|
|---|
| 853 | total size of these areas.
|
|---|
| 854 |
|
|---|
| 855 | =item C<chain: 0>
|
|---|
| 856 |
|
|---|
| 857 | malloc() may want to subdivide a bigger bucket into smaller buckets.
|
|---|
| 858 | If only a part of the deceased bucket is left unsubdivided, the rest
|
|---|
| 859 | is kept as an element of a linked list. This field gives the total
|
|---|
| 860 | size of these chunks.
|
|---|
| 861 |
|
|---|
| 862 | =item C<tail: 6144>
|
|---|
| 863 |
|
|---|
| 864 | To minimize the number of sbrk(2)s, malloc() asks for more memory. This
|
|---|
| 865 | field gives the size of the yet unused part, which is sbrk(2)ed, but
|
|---|
| 866 | never touched.
|
|---|
| 867 |
|
|---|
| 868 | =back
|
|---|
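The last line of each report can be split into the fields listed above with a single pattern. This is a sketch under the assumption that the line has exactly the layout shown in the example; the hash keys are names chosen here for illustration, not anything Perl defines.

```perl
use strict;
use warnings;

# Final line of the "after execution" report from the example above.
my $line = 'Total sbrk(): 215040/47:145. '
         . 'Odd ends: pad+heads+chain+tail: 0+2192+0+6144.';

my %stat;
if ($line =~ m{
        Total \s sbrk\(\): \s (\d+) / (\d+) : (-?\d+) \.
        \s Odd \s ends: \s pad\+heads\+chain\+tail: \s
        (\d+) \+ (\d+) \+ (\d+) \+ (\d+)
    }x) {
    # Key names are hypothetical labels for the fields described above.
    @stat{qw(sbrked sbrks continuous pad heads chain tail)}
        = ($1, $2, $3, $4, $5, $6, $7);
}

# Memory that was sbrk(2)ed but is not part of any bucket.
my $odd_ends = $stat{pad} + $stat{heads} + $stat{chain} + $stat{tail};

printf "%d bytes from %d sbrk(2) calls, %d bytes in odd ends\n",
    $stat{sbrked}, $stat{sbrks}, $odd_ends;
```

The third field is matched with an optional minus sign because the text above implies it is not always positive.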
| 869 |
|
|---|
| 870 | =head1 SEE ALSO
|
|---|
| 871 |
|
|---|
| 872 | L<perldebug>,
|
|---|
| 873 | L<perlguts>,
|
|---|
| 874 | L<perlrun>,
|
|---|
| 875 | L<re>,
|
|---|
| 876 | and
|
|---|
| 877 | L<Devel::DProf>.
|
|---|