| 1 | \chapter{The Python Profilers \label{profile}}
|
|---|
| 2 |
|
|---|
| 3 | \sectionauthor{James Roskind}{}
|
|---|
| 4 |
|
|---|
| 5 | Copyright \copyright{} 1994, by InfoSeek Corporation, all rights reserved.
|
|---|
| 6 | \index{InfoSeek Corporation}
|
|---|
| 7 |
|
|---|
| 8 | Written by James Roskind.\footnote{
|
|---|
| 9 | Updated and converted to \LaTeX\ by Guido van Rossum.
|
|---|
| 10 | Further updated by Armin Rigo to integrate the documentation for the new
|
|---|
| 11 | \module{cProfile} module of Python 2.5.}
|
|---|
| 12 |
|
|---|
| 13 | Permission to use, copy, modify, and distribute this Python software
|
|---|
| 14 | and its associated documentation for any purpose (subject to the
|
|---|
| 15 | restriction in the following sentence) without fee is hereby granted,
|
|---|
| 16 | provided that the above copyright notice appears in all copies, and
|
|---|
| 17 | that both that copyright notice and this permission notice appear in
|
|---|
| 18 | supporting documentation, and that the name of InfoSeek not be used in
|
|---|
| 19 | advertising or publicity pertaining to distribution of the software
|
|---|
| 20 | without specific, written prior permission. This permission is
|
|---|
| 21 | explicitly restricted to the copying and modification of the software
|
|---|
| 22 | to remain in Python, compiled Python, or other languages (such as C)
|
|---|
| 23 | wherein the modified or derived code is exclusively imported into a
|
|---|
| 24 | Python module.
|
|---|
| 25 |
|
|---|
| 26 | INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
|
|---|
| 27 | SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
|
|---|
| 28 | FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY
|
|---|
| 29 | SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
|
|---|
| 30 | RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
|
|---|
| 31 | CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
|
|---|
| 32 | CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
|---|
| 33 |
|
|---|
| 34 |
|
|---|
| 35 | The profiler was written after only programming in Python for 3 weeks.
|
|---|
| 36 | As a result, it is probably clumsy code, but I don't know for sure yet
|
|---|
| 37 | 'cause I'm a beginner :-). I did work hard to make the code run fast,
|
|---|
| 38 | so that profiling would be a reasonable thing to do. I tried not to
|
|---|
| 39 | repeat code fragments, but I'm sure I did some stuff in really awkward
|
|---|
| 40 | ways at times. Please send suggestions for improvements to:
|
|---|
| 41 | \email{[email protected]}. I won't promise \emph{any} support. ...but
|
|---|
| 42 | I'd appreciate the feedback.
|
|---|
| 43 |
|
|---|
| 44 |
|
|---|
| 45 | \section{Introduction to the profilers}
|
|---|
| 46 | \nodename{Profiler Introduction}
|
|---|
| 47 |
|
|---|
| 48 | A \dfn{profiler} is a program that describes the run time performance
|
|---|
| 49 | of a program, providing a variety of statistics. This documentation
|
|---|
| 50 | describes the profiler functionality provided in the modules
|
|---|
| 51 | \module{profile} and \module{pstats}. This profiler provides
|
|---|
| 52 | \dfn{deterministic profiling} of Python programs. It also
|
|---|
| 53 | provides a series of report generation tools to allow users to rapidly
|
|---|
| 54 | examine the results of a profile operation.
|
|---|
| 55 | \index{deterministic profiling}
|
|---|
| 56 | \index{profiling, deterministic}
|
|---|
| 57 |
|
|---|
| 58 | The Python standard library provides three different profilers:
|
|---|
| 59 |
|
|---|
| 60 | \begin{enumerate}
|
|---|
| 61 | \item \module{profile}, a pure Python module, described in the sequel.
|
|---|
| 62 | Copyright \copyright{} 1994, by InfoSeek Corporation.
|
|---|
| 63 | \versionchanged[also reports the time spent in calls to built-in
|
|---|
| 64 | functions and methods]{2.4}
|
|---|
| 65 |
|
|---|
| 66 | \item \module{cProfile}, a module written in C, with a reasonable
|
|---|
| 67 | overhead that makes it suitable for profiling long-running programs.
|
|---|
| 68 | Based on \module{lsprof}, contributed by Brett Rosen and Ted Czotter.
|
|---|
| 69 | \versionadded{2.5}
|
|---|
| 70 |
|
|---|
| 71 | \item \module{hotshot}, a C module focusing on minimizing the overhead
|
|---|
| 72 | while profiling, at the expense of long data post-processing times.
|
|---|
| 73 | \versionchanged[the results should be more meaningful than in the
|
|---|
| 74 | past: the timing core contained a critical bug]{2.5}
|
|---|
| 75 | \end{enumerate}
|
|---|
| 76 |
|
|---|
| 77 | The \module{profile} and \module{cProfile} modules export the same
|
|---|
| 78 | interface, so they are mostly interchangeable; \module{cProfile} has a
|
|---|
| 79 | much lower overhead but is not yet as well tested and might not be
|
|---|
| 80 | available on all systems. \module{cProfile} is really a compatibility
|
|---|
| 81 | layer on top of the internal \module{_lsprof} module. The
|
|---|
| 82 | \module{hotshot} module is reserved for specialized usage.
|
|---|
| 83 |
|
|---|
| 84 | %\section{How Is This Profiler Different From The Old Profiler?}
|
|---|
| 85 | %\nodename{Profiler Changes}
|
|---|
| 86 | %
|
|---|
| 87 | %(This section is of historical importance only; the old profiler
|
|---|
| 88 | %discussed here was last seen in Python 1.1.)
|
|---|
| 89 | %
|
|---|
| 90 | %The big changes from old profiling module are that you get more
|
|---|
| 91 | %information, and you pay less CPU time. It's not a trade-off, it's a
|
|---|
| 92 | %trade-up.
|
|---|
| 93 | %
|
|---|
| 94 | %To be specific:
|
|---|
| 95 | %
|
|---|
| 96 | %\begin{description}
|
|---|
| 97 | %
|
|---|
| 98 | %\item[Bugs removed:]
|
|---|
| 99 | %Local stack frame is no longer molested, execution time is now charged
|
|---|
| 100 | %to correct functions.
|
|---|
| 101 | %
|
|---|
| 102 | %\item[Accuracy increased:]
|
|---|
| 103 | %Profiler execution time is no longer charged to user's code,
|
|---|
| 104 | %calibration for platform is supported, file reads are not done \emph{by}
|
|---|
| 105 | %profiler \emph{during} profiling (and charged to user's code!).
|
|---|
| 106 | %
|
|---|
| 107 | %\item[Speed increased:]
|
|---|
| 108 | %Overhead CPU cost was reduced by more than a factor of two (perhaps a
|
|---|
| 109 | %factor of five), lightweight profiler module is all that must be
|
|---|
| 110 | %loaded, and the report generating module (\module{pstats}) is not needed
|
|---|
| 111 | %during profiling.
|
|---|
| 112 | %
|
|---|
| 113 | %\item[Recursive functions support:]
|
|---|
| 114 | %Cumulative times in recursive functions are correctly calculated;
|
|---|
| 115 | %recursive entries are counted.
|
|---|
| 116 | %
|
|---|
| 117 | %\item[Large growth in report generating UI:]
|
|---|
| 118 | %Distinct profiles runs can be added together forming a comprehensive
|
|---|
| 119 | %report; functions that import statistics take arbitrary lists of
|
|---|
| 120 | %files; sorting criteria is now based on keywords (instead of 4 integer
|
|---|
| 121 | %options); reports shows what functions were profiled as well as what
|
|---|
| 122 | %profile file was referenced; output format has been improved.
|
|---|
| 123 | %
|
|---|
| 124 | %\end{description}
|
|---|
| 125 |
|
|---|
| 126 |
|
|---|
| 127 | \section{Instant User's Manual \label{profile-instant}}
|
|---|
| 128 |
|
|---|
| 129 | This section is provided for users that ``don't want to read the
|
|---|
| 130 | manual.'' It provides a very brief overview, and allows a user to
|
|---|
| 131 | rapidly perform profiling on an existing application.
|
|---|
| 132 |
|
|---|
| 133 | To profile an application with a main entry point of \function{foo()},
|
|---|
| 134 | you would add the following to your module:
|
|---|
| 135 |
|
|---|
| 136 | \begin{verbatim}
|
|---|
| 137 | import cProfile
|
|---|
| 138 | cProfile.run('foo()')
|
|---|
| 139 | \end{verbatim}
|
|---|
| 140 |
|
|---|
| 141 | (Use \module{profile} instead of \module{cProfile} if the latter is not
|
|---|
| 142 | available on your system.)
|
|---|
| 143 |
|
|---|
| 144 | The above action would cause \function{foo()} to be run, and a series of
|
|---|
| 145 | informative lines (the profile) to be printed. The above approach is
|
|---|
| 146 | most useful when working with the interpreter. If you would like to
|
|---|
| 147 | save the results of a profile into a file for later examination, you
|
|---|
| 148 | can supply a file name as the second argument to the \function{run()}
|
|---|
| 149 | function:
|
|---|
| 150 |
|
|---|
| 151 | \begin{verbatim}
|
|---|
| 152 | import cProfile
|
|---|
| 153 | cProfile.run('foo()', 'fooprof')
|
|---|
| 154 | \end{verbatim}
|
|---|
| 155 |
|
|---|
| 156 | The file \file{cProfile.py} can also be invoked as
|
|---|
| 157 | a script to profile another script. For example:
|
|---|
| 158 |
|
|---|
| 159 | \begin{verbatim}
|
|---|
| 160 | python -m cProfile myscript.py
|
|---|
| 161 | \end{verbatim}
|
|---|
| 162 |
|
|---|
| 163 | \file{cProfile.py} accepts two optional arguments on the command line:
|
|---|
| 164 |
|
|---|
| 165 | \begin{verbatim}
|
|---|
| 166 | cProfile.py [-o output_file] [-s sort_order]
|
|---|
| 167 | \end{verbatim}
|
|---|
| 168 |
|
|---|
| 169 | \programopt{-s} only applies to standard output (that is, when \programopt{-o} is
|
|---|
| 170 | not supplied). Look in the \class{Stats} documentation for valid sort
|
|---|
| 171 | values.
|
|---|
| 172 |
|
|---|
| 173 | When you wish to review the profile, you should use the methods in the
|
|---|
| 174 | \module{pstats} module. Typically you would load the statistics data as
|
|---|
| 175 | follows:
|
|---|
| 176 |
|
|---|
| 177 | \begin{verbatim}
|
|---|
| 178 | import pstats
|
|---|
| 179 | p = pstats.Stats('fooprof')
|
|---|
| 180 | \end{verbatim}
|
|---|
| 181 |
|
|---|
| 182 | The class \class{Stats} (the above code just created an instance of
|
|---|
| 183 | this class) has a variety of methods for manipulating and printing the
|
|---|
| 184 | data that was just read into \code{p}. When you ran
|
|---|
| 185 | \function{cProfile.run()} above, what was printed was the result of three
|
|---|
| 186 | method calls:
|
|---|
| 187 |
|
|---|
| 188 | \begin{verbatim}
|
|---|
| 189 | p.strip_dirs().sort_stats(-1).print_stats()
|
|---|
| 190 | \end{verbatim}
|
|---|
| 191 |
|
|---|
| 192 | The first method removed the extraneous path from all the module
|
|---|
| 193 | names. The second method sorted all the entries according to the
|
|---|
| 194 | standard module/line/name string that is printed.
|
|---|
| 195 | %(this is to comply with the semantics of the old profiler).
|
|---|
| 196 | The third method printed out
|
|---|
| 197 | all the statistics. You might try the following sort calls:
|
|---|
| 198 |
|
|---|
| 199 | \begin{verbatim}
|
|---|
| 200 | p.sort_stats('name')
|
|---|
| 201 | p.print_stats()
|
|---|
| 202 | \end{verbatim}
|
|---|
| 203 |
|
|---|
| 204 | The first call will actually sort the list by function name, and the
|
|---|
| 205 | second call will print out the statistics. The following are some
|
|---|
| 206 | interesting calls to experiment with:
|
|---|
| 207 |
|
|---|
| 208 | \begin{verbatim}
|
|---|
| 209 | p.sort_stats('cumulative').print_stats(10)
|
|---|
| 210 | \end{verbatim}
|
|---|
| 211 |
|
|---|
| 212 | This sorts the profile by cumulative time in a function, and then only
|
|---|
| 213 | prints the ten most significant lines. If you want to understand what
|
|---|
| 214 | algorithms are taking time, the above line is what you would use.
|
|---|
| 215 |
|
|---|
| 216 | If you were looking to see what functions were looping a lot, and
|
|---|
| 217 | taking a lot of time, you would do:
|
|---|
| 218 |
|
|---|
| 219 | \begin{verbatim}
|
|---|
| 220 | p.sort_stats('time').print_stats(10)
|
|---|
| 221 | \end{verbatim}
|
|---|
| 222 |
|
|---|
| 223 | to sort according to time spent within each function, and then print
|
|---|
| 224 | the statistics for the top ten functions.
|
|---|
| 225 |
|
|---|
| 226 | You might also try:
|
|---|
| 227 |
|
|---|
| 228 | \begin{verbatim}
|
|---|
| 229 | p.sort_stats('file').print_stats('__init__')
|
|---|
| 230 | \end{verbatim}
|
|---|
| 231 |
|
|---|
| 232 | This will sort all the statistics by file name, and then print out
|
|---|
| 233 | statistics for only the class init methods (since they are spelled
|
|---|
| 234 | with \code{__init__} in them). As one final example, you could try:
|
|---|
| 235 |
|
|---|
| 236 | \begin{verbatim}
|
|---|
| 237 | p.sort_stats('time', 'cum').print_stats(.5, 'init')
|
|---|
| 238 | \end{verbatim}
|
|---|
| 239 |
|
|---|
| 240 | This line sorts statistics with a primary key of time, and a secondary
|
|---|
| 241 | key of cumulative time, and then prints out some of the statistics.
|
|---|
| 242 | To be specific, the list is first culled down to 50\% (re: \samp{.5})
|
|---|
| 243 | of its original size, then only lines containing \code{init} are
|
|---|
| 244 | maintained, and that sub-sub-list is printed.
|
|---|
| 245 |
|
|---|
| 246 | If you wondered what functions called the above functions, you could
|
|---|
| 247 | now (\code{p} is still sorted according to the last criterion) do:
|
|---|
| 248 |
|
|---|
| 249 | \begin{verbatim}
|
|---|
| 250 | p.print_callers(.5, 'init')
|
|---|
| 251 | \end{verbatim}
|
|---|
| 252 |
|
|---|
| 253 | and you would get a list of callers for each of the listed functions.
|
|---|
| 254 |
|
|---|
| 255 | If you want more functionality, you're going to have to read the
|
|---|
| 256 | manual, or guess what the following functions do:
|
|---|
| 257 |
|
|---|
| 258 | \begin{verbatim}
|
|---|
| 259 | p.print_callees()
|
|---|
| 260 | p.add('fooprof')
|
|---|
| 261 | \end{verbatim}
|
|---|
| 262 |
|
|---|
| 263 | Invoked as a script, the \module{pstats} module is a statistics
|
|---|
| 264 | browser for reading and examining profile dumps. It has a simple
|
|---|
| 265 | line-oriented interface (implemented using \refmodule{cmd}) and
|
|---|
| 266 | interactive help.
|
|---|
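The steps of this section can be combined into one self-contained
script. The following is a minimal sketch, not part of the profiler
itself: the functions \function{foo()} and \function{helper()} and the
temporary file are assumptions made for illustration, and
\function{runctx()} is used in place of \function{run()} so that the
example does not depend on \function{foo()} being defined in
\module{__main__}. It is written for a current Python interpreter.

```python
import cProfile
import io
import os
import pstats
import tempfile


def helper(n):
    # Hypothetical workload: sum of squares below n.
    return sum(i * i for i in range(n))


def foo():
    # Hypothetical main entry point of the application.
    return [helper(1000) for _ in range(50)]


# Save the raw profile data to a file for later examination.
fd, stats_path = tempfile.mkstemp(suffix=".prof")
os.close(fd)
cProfile.runctx("foo()", globals(), locals(), stats_path)

# Load the data and print the ten entries with the largest cumulative time.
report = io.StringIO()
p = pstats.Stats(stats_path, stream=report)
p.strip_dirs().sort_stats("cumulative").print_stats(10)
os.remove(stats_path)

print(report.getvalue())
```

Writing the report to a string buffer rather than to standard output
is only a convenience here; omitting the \code{stream} argument prints
the same report directly.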
| 267 |
|
|---|
| 268 | \section{What Is Deterministic Profiling?}
|
|---|
| 269 | \nodename{Deterministic Profiling}
|
|---|
| 270 |
|
|---|
| 271 | \dfn{Deterministic profiling} is meant to reflect the fact that all
|
|---|
| 272 | \emph{function call}, \emph{function return}, and \emph{exception} events
|
|---|
| 273 | are monitored, and precise timings are made for the intervals between
|
|---|
| 274 | these events (during which time the user's code is executing). In
|
|---|
| 275 | contrast, \dfn{statistical profiling} (which is not done by this
|
|---|
| 276 | module) randomly samples the effective instruction pointer, and
|
|---|
| 277 | deduces where time is being spent. The latter technique traditionally
|
|---|
| 278 | involves less overhead (as the code does not need to be instrumented),
|
|---|
| 279 | but provides only relative indications of where time is being spent.
|
|---|
| 280 |
|
|---|
| 281 | In Python, since there is an interpreter active during execution, the
|
|---|
| 282 | presence of instrumented code is not required to do deterministic
|
|---|
| 283 | profiling. Python automatically provides a \dfn{hook} (optional
|
|---|
| 284 | callback) for each event. In addition, the interpreted nature of
|
|---|
| 285 | Python tends to add so much overhead to execution, that deterministic
|
|---|
| 286 | profiling tends to only add small processing overhead in typical
|
|---|
| 287 | applications. The result is that deterministic profiling is not that
|
|---|
| 288 | expensive, yet provides extensive run time statistics about the
|
|---|
| 289 | execution of a Python program.
|
|---|
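The hook mentioned above can be observed directly with
\function{sys.setprofile()}. The following sketch only counts
\emph{function call} events and takes no timings, so it is a much
simpler tool than the profilers described here; the helper
\function{count\_calls()} and the function \function{fact()} are
hypothetical names introduced for illustration.

```python
import sys
from collections import Counter


def count_calls(func, *args):
    # Install a profiling hook, run func, and return how many times
    # each Python function was entered while the hook was active.
    counts = Counter()

    def hook(frame, event, arg):
        # The interpreter reports 'call' and 'return' events (and their
        # C-function counterparts); only Python-level calls are counted.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return counts


def fact(n):
    # Hypothetical recursive function used as the workload.
    return 1 if n <= 1 else n * fact(n - 1)


counts = count_calls(fact, 5)
print("fact was entered", counts["fact"], "times")
```

No change to \function{fact()} was needed to monitor it, which is the
sense in which instrumented code is not required.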
| 290 |
|
|---|
| 291 | Call count statistics can be used to identify bugs in code (surprising
|
|---|
| 292 | counts), and to identify possible inline-expansion points (high call
|
|---|
| 293 | counts). Internal time statistics can be used to identify ``hot
|
|---|
| 294 | loops'' that should be carefully optimized. Cumulative time
|
|---|
| 295 | statistics should be used to identify high level errors in the
|
|---|
| 296 | selection of algorithms. Note that the unusual handling of cumulative
|
|---|
| 297 | times in this profiler allows statistics for recursive implementations
|
|---|
| 298 | of algorithms to be directly compared to iterative implementations.
|
|---|
| 299 |
|
|---|
| 300 |
|
|---|
| 301 | \section{Reference Manual -- \module{profile} and \module{cProfile}}
|
|---|
| 302 |
|
|---|
| 303 | \declaremodule{standard}{profile}
|
|---|
| 304 | \declaremodule{standard}{cProfile}
|
|---|
| 305 | \modulesynopsis{Python profiler}
|
|---|
| 306 |
|
|---|
| 307 |
|
|---|
| 308 |
|
|---|
| 309 | The primary entry point for the profiler is the global function
|
|---|
| 310 | \function{profile.run()} (resp. \function{cProfile.run()}).
|
|---|
| 311 | It is typically used to create any profile
|
|---|
| 312 | information. The reports are formatted and printed using methods of
|
|---|
| 313 | the class \class{pstats.Stats}. The following is a description of all
|
|---|
| 314 | of these standard entry points and functions. For a more in-depth
|
|---|
| 315 | view of some of the code, consider reading the later section on
|
|---|
| 316 | Profiler Extensions, which includes discussion of how to derive
|
|---|
| 317 | ``better'' profilers from the classes presented, or reading the source
|
|---|
| 318 | code for these modules.
|
|---|
| 319 |
|
|---|
| 320 | \begin{funcdesc}{run}{command\optional{, filename}}
|
|---|
| 321 |
|
|---|
| 322 | This function takes a single argument that can be passed to the
|
|---|
| 323 | \keyword{exec} statement, and an optional file name. In all cases this
|
|---|
| 324 | routine attempts to \keyword{exec} its first argument, and gather profiling
|
|---|
| 325 | statistics from the execution. If no file name is present, then this
|
|---|
| 326 | function automatically prints a simple profiling report, sorted by the
|
|---|
| 327 | standard name string (file/line/function-name) that is presented in
|
|---|
| 328 | each line. The following is a typical output from such a call:
|
|---|
| 329 |
|
|---|
| 330 | \begin{verbatim}
|
|---|
| 331 | 2706 function calls (2004 primitive calls) in 4.504 CPU seconds
|
|---|
| 332 |
|
|---|
| 333 | Ordered by: standard name
|
|---|
| 334 |
|
|---|
| 335 | ncalls tottime percall cumtime percall filename:lineno(function)
|
|---|
| 336 | 2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects)
|
|---|
| 337 | 43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate)
|
|---|
| 338 | ...
|
|---|
| 339 | \end{verbatim}
|
|---|
| 340 |
|
|---|
| 341 | The first line indicates that 2706 calls were
|
|---|
| 342 | monitored. Of those calls, 2004 were \dfn{primitive}. We define
|
|---|
| 343 | \dfn{primitive} to mean that the call was not induced via recursion.
|
|---|
| 344 | The next line: \code{Ordered by:\ standard name}, indicates that
|
|---|
| 345 | the text string in the far right column was used to sort the output.
|
|---|
| 346 | The column headings include:
|
|---|
| 347 |
|
|---|
| 348 | \begin{description}
|
|---|
| 349 |
|
|---|
| 350 | \item[ncalls ]
|
|---|
| 351 | for the number of calls,
|
|---|
| 352 |
|
|---|
| 353 | \item[tottime ]
|
|---|
| 354 | for the total time spent in the given function (and excluding time
|
|---|
| 355 | spent in calls to sub-functions),
|
|---|
| 356 |
|
|---|
| 357 | \item[percall ]
|
|---|
| 358 | is the quotient of \code{tottime} divided by \code{ncalls}
|
|---|
| 359 |
|
|---|
| 360 | \item[cumtime ]
|
|---|
| 361 | is the total time spent in this and all subfunctions (from invocation
|
|---|
| 362 | till exit). This figure is accurate \emph{even} for recursive
|
|---|
| 363 | functions.
|
|---|
| 364 |
|
|---|
| 365 | \item[percall ]
|
|---|
| 366 | is the quotient of \code{cumtime} divided by primitive calls
|
|---|
| 367 |
|
|---|
| 368 | \item[filename:lineno(function) ]
|
|---|
| 369 | provides the respective data of each function
|
|---|
| 370 |
|
|---|
| 371 | \end{description}
|
|---|
| 372 |
|
|---|
| 373 | When there are two numbers in the first column (for example,
|
|---|
| 374 | \samp{43/3}), then the latter is the number of primitive calls, and
|
|---|
| 375 | the former is the actual number of calls. Note that when the function
|
|---|
| 376 | does not recurse, these two values are the same, and only the single
|
|---|
| 377 | figure is printed.
|
|---|
| 378 |
|
|---|
| 379 | \end{funcdesc}
|
|---|
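To see where a two-number entry in the \code{ncalls} column comes
from, the following sketch profiles a small recursive function and
reads the call totals back. The function \function{fact()} is
hypothetical, and \code{total\_calls} and \code{prim\_calls} are
attributes of the \class{Stats} implementation rather than part of the
interface documented here, so their use should be read as an
assumption.

```python
import cProfile
import io
import os
import pstats
import tempfile


def fact(n):
    # Hypothetical recursive function: fact(5) calls itself four more times.
    return 1 if n <= 1 else n * fact(n - 1)


fd, stats_path = tempfile.mkstemp(suffix=".prof")
os.close(fd)
cProfile.runctx("fact(5)", globals(), locals(), stats_path)

report = io.StringIO()
p = pstats.Stats(stats_path, stream=report)
os.remove(stats_path)

# Assumed implementation attributes: all monitored calls, and the subset
# of those calls that was not induced via recursion.
recursive_calls = p.total_calls - p.prim_calls

# Print only the entries whose standard name matches 'fact'.
p.strip_dirs().sort_stats("calls").print_stats("fact")
print(report.getvalue())
print("calls induced via recursion:", recursive_calls)
```

In the printed report the entry for \function{fact()} shows both the
actual and the primitive call count, separated by a slash.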
| 380 |
|
|---|
| 381 | \begin{funcdesc}{runctx}{command, globals, locals\optional{, filename}}
|
|---|
| 382 | This function is similar to \function{run()}, with added
|
|---|
| 383 | arguments to supply the globals and locals dictionaries for the
|
|---|
| 384 | \var{command} string.
|
|---|
| 385 | \end{funcdesc}
|
|---|
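As a sketch of how \function{runctx()} might be used, the example
below supplies explicit dictionaries so that the command sees only the
names placed in them. The function \function{work()} and the
dictionary names are assumptions made for illustration.

```python
import cProfile
import io
import os
import pstats
import tempfile


def work(n):
    # Hypothetical function to be profiled.
    return sorted(range(n), reverse=True)


# The command string is executed against these dictionaries, so every
# name it uses must be present in one of them.
global_names = {"work": work}
local_names = {"size": 1000}

fd, stats_path = tempfile.mkstemp(suffix=".prof")
os.close(fd)
cProfile.runctx("result = work(size)", global_names, local_names, stats_path)

# Assignments made by the command land in the locals dictionary.
result = local_names["result"]

report = io.StringIO()
p = pstats.Stats(stats_path, stream=report)
os.remove(stats_path)
p.strip_dirs().sort_stats("name").print_stats("work")
print(report.getvalue())
```

Keeping the namespaces explicit makes it possible to profile a command
from inside a function or a test without relying on the contents of
\module{__main__}.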
| 386 |
|
|---|
| 387 | Analysis of the profiler data is done using the \class{Stats} class.
|
|---|
| 388 |
|
|---|
| 389 | \note{The \class{Stats} class is defined in the \module{pstats} module.}
|
|---|
| 390 |
|
|---|
| 391 | % now switch modules....
|
|---|
| 392 | % (This \stmodindex use may be hard to change ;-( )
|
|---|
| 393 | \stmodindex{pstats}
|
|---|
| 394 |
|
|---|
| 395 | \begin{classdesc}{Stats}{filename\optional{, stream=sys.stdout\optional{, \moreargs}}}
|
|---|
| 396 | This class constructor creates an instance of a ``statistics object''
|
|---|
| 397 | from a \var{filename} (or set of filenames). \class{Stats} objects are
|
|---|
| 398 | manipulated by methods, in order to print useful reports. You may specify
|
|---|
| 399 | an alternate output stream by giving the keyword argument, \code{stream}.
|
|---|
| 400 |
|
|---|
| 401 | The file selected by the above constructor must have been created by the
|
|---|
| 402 | corresponding version of \module{profile} or \module{cProfile}. To be
|
|---|
| 403 | specific, there is \emph{no} file compatibility guaranteed with future
|
|---|
| 404 | versions of this profiler, and there is no compatibility with files produced
|
|---|
| 405 | by other profilers.
|
|---|
| 406 | %(such as the old system profiler).
|
|---|
| 407 |
|
|---|
| 408 | If several files are provided, all the statistics for identical
|
|---|
| 409 | functions will be coalesced, so that an overall view of several
|
|---|
| 410 | processes can be considered in a single report. If additional files
|
|---|
| 411 | need to be combined with data in an existing \class{Stats} object, the
|
|---|
| 412 | \method{add()} method can be used.
|
|---|
| 413 |
|
|---|
| 414 | \versionchanged[The \var{stream} parameter was added]{2.5}
|
|---|
| 415 | \end{classdesc}
|
|---|
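The following sketch shows one way the constructor might be used with
several files and an alternate stream. The function \function{task()}
and the temporary files are hypothetical; the two profile runs stand
in for separate processes.

```python
import cProfile
import io
import os
import pstats
import tempfile


def task(n):
    # Hypothetical function profiled in two separate runs.
    return sum(range(n))


paths = []
for size in (1000, 2000):
    fd, path = tempfile.mkstemp(suffix=".prof")
    os.close(fd)
    cProfile.runctx("task(size)", globals(), {"size": size}, path)
    paths.append(path)

# Both files are loaded at once; entries for identical functions are
# coalesced, and the report goes to the given stream instead of stdout.
report = io.StringIO()
combined = pstats.Stats(paths[0], paths[1], stream=report)
combined.strip_dirs().sort_stats("name").print_stats("task")

for path in paths:
    os.remove(path)

print(report.getvalue())
```

The entry for \function{task()} in the combined report counts the
calls from both runs.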
| 416 |
|
|---|
| 417 |
|
|---|
| 418 | \subsection{The \class{Stats} Class \label{profile-stats}}
|
|---|
| 419 |
|
|---|
| 420 | \class{Stats} objects have the following methods:
|
|---|
| 421 |
|
|---|
| 422 | \begin{methoddesc}[Stats]{strip_dirs}{}
|
|---|
| 423 | This method for the \class{Stats} class removes all leading path
|
|---|
| 424 | information from file names. It is very useful in reducing the size
|
|---|
| 425 | of the printout to fit within (close to) 80 columns. This method
|
|---|
| 426 | modifies the object, and the stripped information is lost. After
|
|---|
| 427 | performing a strip operation, the object is considered to have its
|
|---|
| 428 | entries in a ``random'' order, as it was just after object
|
|---|
| 429 | initialization and loading. If \method{strip_dirs()} causes two
|
|---|
| 430 | function names to be indistinguishable (they are on the same
|
|---|
| 431 | line of the same filename, and have the same function name), then the
|
|---|
| 432 | statistics for these two entries are accumulated into a single entry.
|
|---|
| 433 | \end{methoddesc}
|
|---|
| 434 |
|
|---|
| 435 |
|
|---|
| 436 | \begin{methoddesc}[Stats]{add}{filename\optional{, \moreargs}}
|
|---|
| 437 | This method of the \class{Stats} class accumulates additional
|
|---|
| 438 | profiling information into the current profiling object. Its
|
|---|
| 439 | arguments should refer to filenames created by the corresponding
|
|---|
| 440 | version of \function{profile.run()} or \function{cProfile.run()}.
|
|---|
| 441 | Statistics for identically named
|
|---|
| 442 | (re: file, line, name) functions are automatically accumulated into
|
|---|
| 443 | single function statistics.
|
|---|
| 444 | \end{methoddesc}
|
|---|
| 445 |
|
|---|
| 446 | \begin{methoddesc}[Stats]{dump_stats}{filename}
|
|---|
| 447 | Save the data loaded into the \class{Stats} object to a file named
|
|---|
| 448 | \var{filename}. The file is created if it does not exist, and is
|
|---|
| 449 | overwritten if it already exists. This is equivalent to the method of
|
|---|
| 450 | the same name on the \class{profile.Profile} and
|
|---|
| 451 | \class{cProfile.Profile} classes.
|
|---|
| 452 | \versionadded{2.3}
|
|---|
| 453 | \end{methoddesc}
|
|---|
| 454 |
|
|---|
| 455 | \begin{methoddesc}[Stats]{sort_stats}{key\optional{, \moreargs}}
|
|---|
| 456 | This method modifies the \class{Stats} object by sorting it according
|
|---|
| 457 | to the supplied criteria. The argument is typically a string
|
|---|
| 458 | identifying the basis of a sort (example: \code{'time'} or
|
|---|
| 459 | \code{'name'}).
|
|---|
| 460 |
|
|---|
| 461 | When more than one key is provided, then additional keys are used as
|
|---|
| 462 | secondary criteria when there is equality in all keys selected
|
|---|
| 463 | before them. For example, \code{sort_stats('name', 'file')} will sort
|
|---|
| 464 | all the entries according to their function name, and resolve all ties
|
|---|
| 465 | (identical function names) by sorting by file name.
|
|---|
| 466 |
|
|---|
| 467 | Abbreviations can be used for any key names, as long as the
|
|---|
| 468 | abbreviation is unambiguous. The following are the keys currently
|
|---|
| 469 | defined:
|
|---|
| 470 |
|
|---|
| 471 | \begin{tableii}{l|l}{code}{Valid Arg}{Meaning}
|
|---|
| 472 | \lineii{'calls'}{call count}
|
|---|
| 473 | \lineii{'cumulative'}{cumulative time}
|
|---|
| 474 | \lineii{'file'}{file name}
|
|---|
| 475 | \lineii{'module'}{file name}
|
|---|
| 476 | \lineii{'pcalls'}{primitive call count}
|
|---|
| 477 | \lineii{'line'}{line number}
|
|---|
| 478 | \lineii{'name'}{function name}
|
|---|
| 479 | \lineii{'nfl'}{name/file/line}
|
|---|
| 480 | \lineii{'stdname'}{standard name}
|
|---|
| 481 | \lineii{'time'}{internal time}
|
|---|
| 482 | \end{tableii}
|
|---|
| 483 |
|
|---|
| 484 | Note that all sorts on statistics are in descending order (placing
|
|---|
| 485 | most time consuming items first), whereas name, file, and line number
|
|---|
| 486 | sorts are in ascending order (alphabetical). The subtle
|
|---|
| 487 | distinction between \code{'nfl'} and \code{'stdname'} is that the
|
|---|
| 488 | standard name is a sort of the name as printed, which means that the
|
|---|
| 489 | embedded line numbers get compared in an odd way. For example, lines
|
|---|
| 490 | 3, 20, and 40 would (if the file names were the same) appear in the
|
|---|
| 491 | string order 20, 3 and 40. In contrast, \code{'nfl'} does a numeric
|
|---|
| 492 | compare of the line numbers. In fact, \code{sort_stats('nfl')} is the
|
|---|
| 493 | same as \code{sort_stats('name', 'file', 'line')}.
|
|---|
| 494 |
|
|---|
| 495 | %For compatibility with the old profiler,
|
|---|
| 496 | For backward-compatibility reasons, the numeric arguments
|
|---|
| 497 | \code{-1}, \code{0}, \code{1}, and \code{2} are permitted. They are
|
|---|
| 498 | interpreted as \code{'stdname'}, \code{'calls'}, \code{'time'}, and
|
|---|
| 499 | \code{'cumulative'} respectively. If this old style format (numeric)
|
|---|
| 500 | is used, only one sort key (the numeric key) will be used, and
|
|---|
| 501 | additional arguments will be silently ignored.
|
|---|
| 502 | \end{methoddesc}
|
|---|
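The difference between \code{'stdname'} and \code{'nfl'} can be
modelled in a few lines of plain Python. This sketch does not call
\module{pstats}; it only reproduces the two orderings described above
on three hypothetical entries.

```python
# Hypothetical entries as (file, line, function-name) triples.
entries = [("mod.py", 40, "f"), ("mod.py", 3, "f"), ("mod.py", 20, "f")]


def printed_name(entry):
    # The standard name as it appears in the last column of a report.
    filename, line, name = entry
    return "%s:%d(%s)" % (filename, line, name)


# 'stdname' compares the printed strings, so "20" sorts before "3".
by_stdname = sorted(entries, key=printed_name)

# 'nfl' compares name, then file, then the line number as an integer.
by_nfl = sorted(entries, key=lambda e: (e[2], e[0], e[1]))

print([line for _, line, _ in by_stdname])
print([line for _, line, _ in by_nfl])
```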
| 503 |
|
|---|
| 504 |
|
|---|
| 505 | \begin{methoddesc}[Stats]{reverse_order}{}
|
|---|
| 506 | This method for the \class{Stats} class reverses the ordering of the basic
|
|---|
| 507 | list within the object. %This method is provided primarily for
|
|---|
| 508 | %compatibility with the old profiler.
|
|---|
| 509 | Note that by default ascending vs descending order is properly selected
|
|---|
| 510 | based on the sort key of choice.
|
|---|
| 511 | \end{methoddesc}
|
|---|
| 512 |
|
|---|
| 513 | \begin{methoddesc}[Stats]{print_stats}{\optional{restriction, \moreargs}}
|
|---|
| 514 | This method for the \class{Stats} class prints out a report as described
|
|---|
| 515 | in the \function{profile.run()} definition.
|
|---|
| 516 |
|
|---|
| 517 | The order of the printing is based on the last \method{sort_stats()}
|
|---|
| 518 | operation done on the object (subject to caveats in \method{add()} and
|
|---|
| 519 | \method{strip_dirs()}).
|
|---|
| 520 |
|
|---|
| 521 | The arguments provided (if any) can be used to limit the list down to
|
|---|
| 522 | the significant entries. Initially, the list is taken to be the
|
|---|
| 523 | complete set of profiled functions. Each restriction is either an
|
|---|
| 524 | integer (to select a count of lines), or a decimal fraction between
|
|---|
| 525 | 0.0 and 1.0 inclusive (to select a percentage of lines), or a regular
|
|---|
| 526 | expression (to pattern match the standard name that is printed; as of
|
|---|
| 527 | Python 1.5b1, this uses the Perl-style regular expression syntax
|
|---|
| 528 | defined by the \refmodule{re} module). If several restrictions are
|
|---|
| 529 | provided, then they are applied sequentially. For example:
|
|---|
| 530 |
|
|---|
| 531 | \begin{verbatim}
|
|---|
| 532 | print_stats(.1, 'foo:')
|
|---|
| 533 | \end{verbatim}
|
|---|
| 534 |
|
|---|
| 535 | would first limit the printing to the first 10\% of the list, and then only
|
|---|
| 536 | print functions that were part of filename \file{.*foo:}. In
|
|---|
| 537 | contrast, the command:
|
|---|
| 538 |
|
|---|
| 539 | \begin{verbatim}
|
|---|
| 540 | print_stats('foo:', .1)
|
|---|
| 541 | \end{verbatim}
|
|---|
| 542 |
|
|---|
| 543 | would limit the list to all functions having file names \file{.*foo:},
|
|---|
| 544 | and then proceed to only print the first 10\% of them.
|
|---|
| 545 | \end{methoddesc}
|
|---|
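Because restrictions are applied sequentially, their order changes the
result. The sketch below is an illustrative model of that behaviour,
not the code used by \module{pstats}; the function
\function{apply\_restrictions()}, its rounding rule, and the list of
names are assumptions.

```python
import re


def apply_restrictions(names, *restrictions):
    # Illustrative model only: narrow an already sorted list of
    # standard names by applying each restriction in turn.
    selected = list(names)
    for restriction in restrictions:
        if isinstance(restriction, str):
            # Regular expression: keep the names it matches.
            selected = [n for n in selected if re.search(restriction, n)]
        elif isinstance(restriction, float) and 0.0 <= restriction <= 1.0:
            # Fraction: keep that share of the current list.
            selected = selected[:int(len(selected) * restriction + 0.5)]
        else:
            # Integer: keep that many lines.
            selected = selected[:restriction]
    return selected


# Ten hypothetical entries, already sorted; only the last two are in foo.py.
names = ["bar.py:%d(f%d)" % (i, i) for i in range(1, 9)]
names += ["foo.py:1(g)", "foo.py:2(h)"]

half_then_foo = apply_restrictions(names, .5, r"foo\.py:")
foo_then_half = apply_restrictions(names, r"foo\.py:", .5)

print(half_then_foo)
print(foo_then_half)
```

Taking the fraction first discards the \file{foo.py} entries before
the pattern is ever tried, while matching first keeps one of them.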
| 546 |
|
|---|
| 547 |
|
|---|
| 548 | \begin{methoddesc}[Stats]{print_callers}{\optional{restriction, \moreargs}}
|
|---|
| 549 | This method for the \class{Stats} class prints a list of all functions
|
|---|
| 550 | that called each function in the profiled database. The ordering is
|
|---|
| 551 | identical to that provided by \method{print_stats()}, and the definition
|
|---|
| 552 | of the restricting argument is also identical. Each caller is reported on
|
|---|
| 553 | its own line. The format differs slightly depending on the profiler that
|
|---|
| 554 | produced the stats:
|
|---|
| 555 |
|
|---|
| 556 | \begin{itemize}
|
|---|
| 557 | \item With \module{profile}, a number is shown in parentheses after each
|
|---|
| 558 | caller to show how many times this specific call was made. For
|
|---|
| 559 | convenience, a second non-parenthesized number repeats the cumulative
|
|---|
| 560 | time spent in the function at the right.
|
|---|
| 561 |
|
|---|
| 562 | \item With \module{cProfile}, each caller is preceded by three numbers:
|
|---|
| 563 | the number of times this specific call was made, and the total and
|
|---|
| 564 | cumulative times spent in the current function while it was invoked by
|
|---|
| 565 | this specific caller.
|
|---|
| 566 | \end{itemize}
|
|---|
| 567 | \end{methoddesc}
|
|---|
| 568 |
|
|---|
| 569 | \begin{methoddesc}[Stats]{print_callees}{\optional{restriction, \moreargs}}
|
|---|
| 570 | This method for the \class{Stats} class prints a list of all functions
|
|---|
| 571 | that were called by the indicated function. Aside from this reversal
|
|---|
| 572 | of direction of calls (re: called vs was called by), the arguments and
|
|---|
| 573 | ordering are identical to the \method{print_callers()} method.
|
|---|
| 574 | \end{methoddesc}
|
|---|
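As a minimal sketch of how \method{print_callers()} and \method{print_callees()} might be used together, the following profiles a toy call chain with \module{cProfile} and writes both reports to an in-memory stream. The functions \function{helper()} and \function{work()} are hypothetical stand-ins for real code, and the example assumes an interpreter that provides \module{io.StringIO}.

```python
import cProfile
import io
import pstats


def helper():
    # Hypothetical leaf function, called many times by work().
    return sum(range(1000))


def work():
    # Hypothetical caller; a plain loop keeps the call graph simple.
    total = 0
    for _ in range(50):
        total += helper()
    return total


pr = cProfile.Profile()
pr.runcall(work)

# Send the reports to a string buffer instead of standard output.
out = io.StringIO()
stats = pstats.Stats(pr, stream=out)
stats.strip_dirs().sort_stats('cumulative')

# The restriction is a regular expression matched against each
# function's printed name, as with print_stats().
stats.print_callers('helper')   # who called helper()?
stats.print_callees('work')     # whom did work() call?

report = out.getvalue()
print(report)
```

Because the statistics come from \module{cProfile}, each caller line in the first report should carry the call count followed by the total and cumulative times, as described above.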
| 575 |
|
|---|
| 576 |
|
|---|
| 577 | \section{Limitations \label{profile-limits}}
|
|---|
| 578 |
|
|---|
| 579 | One limitation has to do with accuracy of timing information.
|
|---|
| 580 | There is a fundamental problem with deterministic profilers involving
|
|---|
| 581 | accuracy. The most obvious restriction is that the underlying ``clock''
|
|---|
| 582 | typically ticks only about once every 0.001 seconds. Hence no
|
|---|
| 583 | measurements will be more accurate than the underlying clock. If
|
|---|
| 584 | enough measurements are taken, then the ``error'' will tend to average
|
|---|
| 585 | out. Unfortunately, removing this first error induces a second source
|
|---|
| 586 | of error.
|
|---|
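The granularity of the clock can be estimated empirically. The sketch below is not part of the profiler; \function{estimate_tick()} is a hypothetical helper that spins until the timer's value changes and reports the smallest step it observed. It assumes \function{time.time()} as the timer, which may differ from the timer a given profiler actually uses.

```python
import time


def estimate_tick(timer=time.time, samples=50):
    """Return the smallest positive step seen between timer readings."""
    smallest = None
    for _ in range(samples):
        start = timer()
        now = timer()
        while now == start:       # spin until the clock advances
            now = timer()
        step = now - start
        if smallest is None or step < smallest:
            smallest = step
    return smallest


tick = estimate_tick()
print("observed clock step: %.9f seconds" % tick)
```

No single measurement taken with this timer can be trusted to better than roughly the value printed.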
| 587 |
|
|---|
| 588 | The second problem is that it ``takes a while'' from when an event is
|
|---|
| 589 | dispatched until the profiler's call to get the time actually
|
|---|
| 590 | \emph{gets} the state of the clock. Similarly, there is a certain lag
|
|---|
| 591 | when exiting the profiler event handler from the time that the clock's
|
|---|
| 592 | value was obtained (and then squirreled away), until the user's code
|
|---|
| 593 | is once again executing. As a result, functions that are called many
|
|---|
| 594 | times, or call many functions, will typically accumulate this error.
|
|---|
| 595 | The error that accumulates in this fashion is typically less than the
|
|---|
| 596 | accuracy of the clock (less than one clock tick), but it
|
|---|
| 597 | \emph{can} accumulate and become very significant.
|
|---|
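The accumulated overhead described above can be made visible with a rough experiment. The following sketch times a loop of trivial calls once directly and once under \module{profile}, then divides the difference by the number of calls. The names \function{noop()}, \function{call_many()} and \function{timed()} are hypothetical, and the figure obtained is only an approximation that varies from run to run.

```python
import profile
import time


def noop():
    # Does nothing: almost all measured cost is call/profiling overhead.
    pass


def call_many(n):
    for _ in range(n):
        noop()


def timed(func, *args):
    """Return the wall-clock seconds taken by func(*args)."""
    start = time.time()
    func(*args)
    return time.time() - start


n = 20000
plain = timed(call_many, n)

pr = profile.Profile()
profiled = timed(pr.runcall, call_many, n)

# Rough extra cost that the profiler adds to each profiled call.
overhead_per_call = (profiled - plain) / n
print("plain: %.6f s  profiled: %.6f s" % (plain, profiled))
print("approximate overhead per call: %.9f s" % overhead_per_call)
```

Multiplying a per-call overhead of this size by a large call count shows how an error smaller than one clock tick can still grow to dominate the reported time of a frequently called function.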
| 598 |
|
|---|
| 599 | The problem is more important with \module{profile} than with the
|
|---|
| 600 | lower-overhead \module{cProfile}. For this reason, \module{profile}
|
|---|
| 601 | provides a means of calibrating itself for a given platform so that
|
|---|
| 602 | this error can be probabilistically (on the average) removed.
|
|---|
| 603 | After the profiler is calibrated, it will be more accurate (in a least
|
|---|
| 604 | squares sense), but it will sometimes produce negative numbers (when
|
|---|
| 605 | call counts are exceptionally low, and the gods of probability work
|
|---|
| 606 | against you :-).) Do \emph{not} be alarmed by negative numbers in
|
|---|
| 607 | the profile. They should \emph{only} appear if you have calibrated
|
|---|
| 608 | your profiler, and the results are actually better than without
|
|---|
| 609 | calibration.
|
|---|
| 610 |
|
|---|
| 611 |
|
|---|
| 612 | \section{Calibration \label{profile-calibration}}
|
|---|
| 613 |
|
|---|
| 614 | The profiler of the \module{profile} module subtracts a constant from each
|
|---|
| 615 | event handling time to compensate for the overhead of calling the time
|
|---|
| 616 | function, and socking away the results. By default, the constant is 0.
|
|---|
| 617 | The following procedure can
|
|---|
| 618 | be used to obtain a better constant for a given platform (see discussion
|
|---|
| 619 | in section~\ref{profile-limits} above).
|
|---|
| 620 |
|
|---|
| 621 | \begin{verbatim}
|
|---|
| 622 | import profile
|
|---|
| 623 | pr = profile.Profile()
|
|---|
| 624 | for i in range(5):
|
|---|
| 625 |     print(pr.calibrate(10000))
|
|---|
| 626 | \end{verbatim}
|
|---|
| 627 |
|
|---|
| 628 | The method executes the number of Python calls given by the argument,
|
|---|
| 629 | directly and again under the profiler, measuring the time for both.
|
|---|
| 630 | It then computes the hidden overhead per profiler event, and returns
|
|---|
| 631 | that as a float. For example, on an 800 MHz Pentium running
|
|---|
| 632 | Windows 2000, and using \function{time.clock()} as the timer,
|
|---|
| 633 | the magical number is about 12.5e-6.
|
|---|
| 634 |
|
|---|
| 635 | The object of this exercise is to get a fairly consistent result.
|
|---|
| 636 | If your computer is \emph{very} fast, or your timer function has poor
|
|---|
| 637 | resolution, you might have to pass 100000, or even 1000000, to get
|
|---|
| 638 | consistent results.
|
|---|
| 639 |
|
|---|
| 640 | When you have a consistent answer,
|
|---|
| 641 | there are three ways you can use it:\footnote{Prior to Python 2.2, it
|
|---|
| 642 | was necessary to edit the profiler source code to embed the bias as
|
|---|
| 643 | a literal number. You still can, but that method is no longer
|
|---|
| 644 | described, because it is no longer needed.}
|
|---|
| 645 |
|
|---|
| 646 | \begin{verbatim}
|
|---|
| 647 | import profile
|
|---|
| 648 |
|
|---|
| 649 | # 1. Apply computed bias to all Profile instances created hereafter.
|
|---|
| 650 | profile.Profile.bias = your_computed_bias
|
|---|
| 651 |
|
|---|
| 652 | # 2. Apply computed bias to a specific Profile instance.
|
|---|
| 653 | pr = profile.Profile()
|
|---|
| 654 | pr.bias = your_computed_bias
|
|---|
| 655 |
|
|---|
| 656 | # 3. Specify computed bias in instance constructor.
|
|---|
| 657 | pr = profile.Profile(bias=your_computed_bias)
|
|---|
| 658 | \end{verbatim}
|
|---|
| 659 |
|
|---|
| 660 | If you have a choice, you are better off choosing a smaller constant, and
|
|---|
| 661 | then your results will ``less often'' show up as negative in profile
|
|---|
| 662 | statistics.
|
|---|
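One way to follow this advice, sketched here under the assumption that a handful of calibration runs is representative, is to collect several results from \method{calibrate()} and keep the smallest. The helper \function{pick_bias()} is hypothetical and not part of the \module{profile} module.

```python
import profile


def pick_bias(runs=3, calls=10000):
    """Calibrate several times; return (smallest result, all results)."""
    pr = profile.Profile()
    samples = [pr.calibrate(calls) for _ in range(runs)]
    return min(samples), samples


bias, samples = pick_bias()
print("calibration samples: %r" % (samples,))

# Apply the conservative (smallest) constant to a new instance only.
pr = profile.Profile(bias=bias)
print("using bias: %r" % pr.bias)
```

If the samples differ widely, the advice above about passing a larger argument to \method{calibrate()} applies before any of them is used.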
| 663 |
|
|---|
| 664 |
|
|---|
| 665 | \section{Extensions --- Deriving Better Profilers}
|
|---|
| 666 | \nodename{Profiler Extensions}
|
|---|
| 667 |
|
|---|
| 668 | The \class{Profile} classes of both modules, \module{profile} and
|
|---|
| 669 | \module{cProfile}, were written so that
|
|---|
| 670 | derived classes could be developed to extend the profiler. The details
|
|---|
| 671 | are not described here, as doing this successfully requires an expert
|
|---|
| 672 | understanding of how the \class{Profile} class works internally. Study
|
|---|
| 673 | the source code of the module carefully if you want to
|
|---|
| 674 | pursue this.
|
|---|
| 675 |
|
|---|
| 676 | If all you want to do is change how current time is determined (for
|
|---|
| 677 | example, to force use of wall-clock time or elapsed process time),
|
|---|
| 678 | pass the timing function you want to the \class{Profile} class
|
|---|
| 679 | constructor:
|
|---|
| 680 |
|
|---|
| 681 | \begin{verbatim}
|
|---|
| 682 | pr = profile.Profile(your_time_func)
|
|---|
| 683 | \end{verbatim}
|
|---|
| 684 |
|
|---|
| 685 | The resulting profiler will then call \function{your_time_func()}.
|
|---|
| 686 |
|
|---|
| 687 | \begin{description}
|
|---|
| 688 | \item[\class{profile.Profile}]
|
|---|
| 689 | \function{your_time_func()} should return a single number, or a list of
|
|---|
| 690 | numbers whose sum is the current time (like what \function{os.times()}
|
|---|
| 691 | returns). If the function returns a single time number, or the list of
|
|---|
| 692 | returned numbers has length 2, then you will get an especially fast
|
|---|
| 693 | version of the dispatch routine.
|
|---|
| 694 |
|
|---|
| 695 | Be warned that you should calibrate the profiler class for the
|
|---|
| 696 | timer function that you choose. For most machines, a timer that
|
|---|
| 697 | returns a lone integer value will provide the best results in terms of
|
|---|
| 698 | low overhead during profiling. (\function{os.times()} is
|
|---|
| 699 | \emph{pretty} bad, as it returns a tuple of floating point values). If
|
|---|
| 700 | you want to substitute a better timer in the cleanest fashion,
|
|---|
| 701 | derive a class and hardwire a replacement dispatch method that best
|
|---|
| 702 | handles your timer call, along with the appropriate calibration
|
|---|
| 703 | constant.
|
|---|
| 704 |
|
|---|
| 705 | \item[\class{cProfile.Profile}]
|
|---|
| 706 | \function{your_time_func()} should return a single number. If it returns
|
|---|
| 707 | plain integers, you can also invoke the class constructor with a second
|
|---|
| 708 | argument specifying the real duration of one unit of time. For example,
|
|---|
| 709 | if \function{your_integer_time_func()} returns times measured in thousandths
|
|---|
| 710 | of seconds, you would construct the \class{Profile} instance as follows:
|
|---|
| 711 |
|
|---|
| 712 | \begin{verbatim}
|
|---|
| 713 | pr = cProfile.Profile(your_integer_time_func, 0.001)
|
|---|
| 714 | \end{verbatim}
|
|---|
| 715 |
|
|---|
| 716 | As the \class{cProfile.Profile} class cannot be calibrated, custom
|
|---|
| 717 | timer functions should be used with care and should be as fast as
|
|---|
| 718 | possible. For the best results with a custom timer, it might be
|
|---|
| 719 | necessary to hard-code it in the C source of the internal
|
|---|
| 720 | \module{_lsprof} module.
|
|---|
| 721 |
|
|---|
| 722 | \end{description}
|
|---|
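The two kinds of timer described above might be supplied as in the following sketch. It passes \function{time.time()} to \class{profile.Profile} to force wall-clock timing, and passes a hypothetical integer timer, \function{millis()}, to \class{cProfile.Profile} together with the duration of one unit. The function \function{busy()} is only a placeholder workload, and \function{millis()} is an illustration of an integer-valued timer rather than a recommended one.

```python
import cProfile
import profile
import pstats
import time


def busy():
    # Placeholder workload to give the profilers something to measure.
    total = 0
    for i in range(100000):
        total += i * i
    return total


def millis():
    # Hypothetical integer timer: whole milliseconds of wall-clock time.
    return int(time.time() * 1000)


# profile.Profile: a timer returning a single number (wall-clock here).
wall_pr = profile.Profile(time.time)
wall_pr.runcall(busy)
wall_stats = pstats.Stats(wall_pr)

# cProfile.Profile: an integer timer plus the length of one unit, in
# seconds (0.001 because millis() counts thousandths of a second).
int_pr = cProfile.Profile(millis, 0.001)
int_pr.runcall(busy)
int_stats = pstats.Stats(int_pr)

print("profile, wall clock: %.6f s" % wall_stats.total_tt)
print("cProfile, ms timer:  %.6f s" % int_stats.total_tt)
```

The profilers are run one after the other rather than nested, since only one profiler is meant to be active at a time.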