\documentclass{howto}
\usepackage{distutils}
% $Id: whatsnew24.tex 50936 2006-07-29 15:42:46Z andrew.kuchling $

% Don't write extensive text for new sections; I'll do that.
% Feel free to add commented-out reminders of things that need
% to be covered. --amk

\title{What's New in Python 2.4}
\release{1.02}
\author{A.M.\ Kuchling}
\authoraddress{
\strong{Python Software Foundation}\\
Email: \email{[email protected]}
}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.4.1, released on
March~30, 2005.

Python 2.4 is a medium-sized release. It doesn't introduce as many
changes as the radical Python 2.2, but introduces more features than
the conservative 2.3 release. The most significant new language
features are function decorators and generator expressions; most other
changes are to the standard library.

According to the CVS change logs, there were 481 patches applied and
502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to
be underestimates.

This article doesn't attempt to provide a complete specification of
every single new feature, but instead provides a brief introduction to
each feature. For full details, you should refer to the documentation
for Python 2.4, such as the \citetitle[../lib/lib.html]{Python Library
Reference} and the \citetitle[../ref/ref.html]{Python Reference
Manual}. Often you will be referred to the PEP for a particular new
feature for explanations of the implementation and design rationale.


%======================================================================
\section{PEP 218: Built-In Set Objects}

Python 2.3 introduced the \module{sets} module. C implementations of
set data types have now been added to the Python core as two new
built-in types, \function{set(\var{iterable})} and
\function{frozenset(\var{iterable})}. They provide high speed
operations for membership testing, for eliminating duplicates from
sequences, and for mathematical operations like unions, intersections,
differences, and symmetric differences.

\begin{verbatim}
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'

>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
\end{verbatim}

The \function{frozenset} type is an immutable version of \function{set}.
Since it is immutable and hashable, it may be used as a dictionary key or
as a member of another set.

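The following minimal sketch illustrates both uses of
\function{frozenset}. The file names and user names are invented for
the example, and the variable names are not part of any API.

```python
# Hypothetical data: map a set of file names to the user who owns them.
readers = {
    frozenset(['a.txt', 'b.txt']): 'alice',
    frozenset(['c.txt']): 'bob',
}

# Element order doesn't matter when looking up a frozenset key.
owner = readers[frozenset(['b.txt', 'a.txt'])]

# frozensets can be members of an ordinary set; equal ones collapse.
groups = set([frozenset([1, 2]), frozenset([2, 1]), frozenset([3])])

# A mutable set is unhashable, so it cannot be nested the same way.
try:
    set([set([1, 2])])
    nested_ok = True
except TypeError:
    nested_ok = False
```

Here \code{owner} is \code{'alice'}, \code{groups} holds two members,
and the attempt to nest a mutable \function{set} fails with a
\exception{TypeError}.
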
The \module{sets} module remains in the standard library, and may be
useful if you wish to subclass the \class{Set} or \class{ImmutableSet}
classes. There are currently no plans to deprecate the module.

\begin{seealso}
\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by
Greg Wilson and ultimately implemented by Raymond Hettinger.}
\end{seealso}


%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

The lengthy transition process for this PEP, begun in Python 2.2,
takes another step forward in Python 2.4. In 2.3, certain integer
operations that would behave differently after int/long unification
triggered \exception{FutureWarning} warnings and returned values
limited to 32 or 64 bits (depending on your platform). In 2.4, these
expressions no longer produce a warning and instead produce a
different result that's usually a long integer.

The problematic expressions are primarily left shifts and lengthy
hexadecimal and octal constants. For example,
\code{2 \textless{}\textless{} 32} results
in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python
2.4, this expression now returns the correct answer, 8589934592.

\begin{seealso}
\seepep{237}{Unifying Long Integers and Integers}{Original PEP
written by Moshe Zadka and GvR. The changes for 2.4 were implemented by
Kalle Svensson.}
\end{seealso}


%======================================================================
\section{PEP 289: Generator Expressions}

The iterator feature introduced in Python 2.2 and the
\module{itertools} module make it easier to write programs that loop
through large data sets without having the entire data set in memory
at one time. List comprehensions don't fit into this picture very
well because they produce a Python list object containing all of the
items. This unavoidably pulls all of the objects into memory, which
can be a problem if your data set is very large. When trying to write
a functionally-styled program, it would be natural to write something
like:

\begin{verbatim}
links = [link for link in get_all_links() if not link.followed]
for link in links:
    ...
\end{verbatim}

instead of

\begin{verbatim}
for link in get_all_links():
    if link.followed:
        continue
    ...
\end{verbatim}

The first form is more concise and perhaps more readable, but if
you're dealing with a large number of link objects you'd have to write
the second form to avoid having all link objects in memory at the same
time.

Generator expressions work similarly to list comprehensions but don't
materialize the entire list; instead they create a generator that will
return elements one by one. The above example could be written as:

\begin{verbatim}
links = (link for link in get_all_links() if not link.followed)
for link in links:
    ...
\end{verbatim}

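To see that elements really are produced one at a time, the sketch
below records each item as the underlying iterator yields it. The
\function{get_all_links()} here is a hypothetical stand-in that yields
plain integers, and the \code{produced} list exists only for
illustration.

```python
produced = []

def get_all_links():
    # Hypothetical stand-in: yields integers and logs each one produced.
    for n in range(5):
        produced.append(n)
        yield n

# Creating the generator expression runs none of the loop body yet.
evens = (n for n in get_all_links() if n % 2 == 0)
before = list(produced)

# Pull a single element; only as much input as needed is consumed.
for first in evens:
    break
after = list(produced)

# Draining the generator consumes the remaining input.
rest = list(evens)
```

Nothing has been produced when \code{before} is captured, exactly one
input item has been consumed by the time \code{after} is captured, and
\code{rest} receives the remaining matching elements.
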
Generator expressions always have to be written inside parentheses, as
in the above example. The parentheses signalling a function call also
count, so if you want to create an iterator that will be immediately
passed to a function you could write:

\begin{verbatim}
print sum(obj.count for obj in list_all_objects())
\end{verbatim}

Generator expressions differ from list comprehensions in various small
ways. Most notably, the loop variable (\var{obj} in the above
example) is not accessible outside of the generator expression. List
comprehensions leave the variable assigned to its last value; future
versions of Python will change this, making list comprehensions match
generator expressions in this respect.

\begin{seealso}
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.}
\end{seealso}


%======================================================================
\section{PEP 292: Simpler String Substitutions}

Some new classes in the standard library provide an alternative
mechanism for substituting variables into strings; this style of
substitution may be better for applications where untrained
users need to edit templates.

The usual way of substituting variables by name is the \code{\%}
operator:

\begin{verbatim}
>>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'}
'2: The Best of Times'
\end{verbatim}

When writing the template string, it can be easy to forget the
\samp{i} or \samp{s} after the closing parenthesis. This isn't a big
problem if the template is in a Python module, because you run the
code, get an ``Unsupported format character'' \exception{ValueError},
and fix the problem. However, consider an application such as Mailman
where template strings or translations are being edited by users who
aren't aware of the Python language. The format string's syntax is
complicated to explain to such users, and if they make a mistake, it's
difficult to provide helpful feedback to them.

PEP 292 adds a \class{Template} class to the \module{string} module
that uses \samp{\$} to indicate a substitution:

\begin{verbatim}
>>> import string
>>> t = string.Template('$page: $title')
>>> t.substitute({'page':2, 'title': 'The Best of Times'})
'2: The Best of Times'
\end{verbatim}

% $ Terminate $-mode for Emacs

If a key is missing from the dictionary, the \method{substitute} method
will raise a \exception{KeyError}. There's also a \method{safe_substitute}
method that ignores missing keys:

\begin{verbatim}
>>> t = string.Template('$page: $title')
>>> t.safe_substitute({'page':3})
'3: $title'
\end{verbatim}

% $ Terminate math-mode for Emacs


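One way an application might combine the two methods is sketched
below: try the strict \method{substitute} first and fall back to
\method{safe_substitute} when a user-edited template names a value the
program doesn't supply. The \function{render()} helper is hypothetical
and not part of the \module{string} module.

```python
import string

def render(template_text, values):
    """Fill in a user-edited template, tolerating unknown placeholders."""
    t = string.Template(template_text)
    try:
        # Strict path: every placeholder must have a value.
        return t.substitute(values)
    except KeyError:
        # Lenient path: unknown placeholders are left in the output.
        return t.safe_substitute(values)

complete = render('$page: $title', {'page': 2, 'title': 'The Best of Times'})
partial = render('$page: $title', {'page': 3})

# '$$' is the escape for a literal dollar sign.
priced = render('Cost: $$5 for $item', {'item': 'tea'})
```

A real application would probably also want to log the missing name
rather than silently falling back.
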
\begin{seealso}
\seepep{292}{Simpler String Substitutions}{Written and implemented
by Barry Warsaw.}
\end{seealso}


%======================================================================
\section{PEP 318: Decorators for Functions and Methods}

Python 2.2 extended Python's object model by adding static methods and
class methods, but it didn't extend Python's syntax to provide any new
way of defining static or class methods. Instead, you had to write a
\keyword{def} statement in the usual way, and pass the resulting
method to a \function{staticmethod()} or \function{classmethod()}
function that would wrap up the function as a method of the new type.
Your code would look like this:

\begin{verbatim}
class C:
    def meth (cls):
        ...

    meth = classmethod(meth)   # Rebind name to wrapped-up class method
\end{verbatim}

If the method was very long, it would be easy to miss or forget the
\function{classmethod()} invocation after the function body.

The intention was always to add some syntax to make such definitions
more readable, but at the time of 2.2's release a good syntax was not
obvious. Today a good syntax \emph{still} isn't obvious but users are
asking for easier access to the feature; a new syntactic feature has
been added to meet this need.

The new feature is called ``function decorators''. The name comes
from the idea that \function{classmethod}, \function{staticmethod},
and friends are storing additional information on a function object;
they're \emph{decorating} functions with more details.

The notation borrows from Java and uses the \character{@} character as an
indicator. Using the new syntax, the example above would be written:

\begin{verbatim}
class C:

    @classmethod
    def meth (cls):
        ...

\end{verbatim}

The \code{@classmethod} is shorthand for the
\code{meth=classmethod(meth)} assignment. More generally, if you have
the following:

\begin{verbatim}
@A
@B
@C
def f ():
    ...
\end{verbatim}

It's equivalent to the following pre-decorator code:

\begin{verbatim}
def f(): ...
f = A(B(C(f)))
\end{verbatim}

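The order of application implied by \code{f = A(B(C(f)))} can be
checked with a short sketch. The three decorators below are
placeholders that do nothing except record their name in a list when
they are applied.

```python
applied = []

def A(func):
    applied.append('A')
    return func

def B(func):
    applied.append('B')
    return func

def C(func):
    applied.append('C')
    return func

@A
@B
@C
def f():
    return 'result'

# The decorator nearest the def statement runs first.
order = list(applied)
```

The decorators are applied from the bottom up, so \code{order} ends up
as \code{['C', 'B', 'A']}.
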
Decorators must come on the line before a function definition, one decorator
per line, and can't be on the same line as the def statement, meaning that
\code{@A def f(): ...} is illegal. You can only decorate function
definitions, either at the module level or inside a class; you can't
decorate class definitions.

A decorator is just a function that takes the function to be decorated as an
argument and returns either the same function or some new object. The
return value of the decorator need not be callable (though it typically is),
unless further decorators will be applied to the result. It's easy to write
your own decorators. The following simple example just sets an attribute on
the function object:

\begin{verbatim}
>>> def deco(func):
...    func.attr = 'decorated'
...    return func
...
>>> @deco
... def f(): pass
...
>>> f
<function f at 0x402ef0d4>
>>> f.attr
'decorated'
>>>
\end{verbatim}

As a slightly more realistic example, the following decorator checks
that the supplied argument is an integer:

\begin{verbatim}
def require_int (func):
    def wrapper (arg):
        assert isinstance(arg, int)
        return func(arg)

    return wrapper

@require_int
def p1 (arg):
    print arg

@require_int
def p2(arg):
    print arg*2
\end{verbatim}

An example in \pep{318} contains a fancier version of this idea that
lets you both specify the required type and check the returned type.

Decorator functions can take arguments. If arguments are supplied,
your decorator function is called with only those arguments and must
return a new decorator function; this function must take a single
function and return a function, as previously described. In other
words, \code{@A @B @C(args)} becomes:

\begin{verbatim}
def f(): ...
_deco = C(args)
f = A(B(_deco(f)))
\end{verbatim}

Getting this right can be slightly brain-bending, but it's not too
difficult.

A small related change makes the \member{func_name} attribute of
functions writable. This attribute is used to display function names
in tracebacks, so decorators should change the name of any new
function that's constructed and returned.

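Both points, decorators that take arguments and renaming the returned
function, can be combined in one sketch. The \function{require_type()}
decorator below is a hypothetical generalization of the
\function{require_int()} example, not the version from \pep{318}. It
assigns to \member{__name__}, which in Python 2.4 refers to the same
value as \member{func_name} and is the spelling that later Python
versions keep.

```python
def require_type(expected):
    # Called first, with only the decorator's own arguments...
    def decorator(func):
        # ...and must return the function that does the decorating.
        def wrapper(arg):
            assert isinstance(arg, expected)
            return func(arg)
        # Copy the name so tracebacks show 'double', not 'wrapper'.
        wrapper.__name__ = func.__name__
        return wrapper
    return decorator

@require_type(int)
def double(arg):
    return arg * 2

result = double(4)
name = double.__name__
```

Without the assignment to \member{__name__}, every function wrapped
this way would be reported as \code{wrapper}.
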
\begin{seealso}
\seepep{318}{Decorators for Functions, Methods and Classes}{Written
by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people
wrote patches implementing function decorators, but the one that was
actually checked in was patch \#979728, written by Mark Russell.}

\seeurl{http://www.python.org/moin/PythonDecoratorLibrary}
{This Wiki page contains several examples of decorators.}

\end{seealso}


%======================================================================
\section{PEP 322: Reverse Iteration}

A new built-in function, \function{reversed(\var{seq})}, takes a sequence
and returns an iterator that loops over the elements of the sequence
in reverse order.

\begin{verbatim}
>>> for i in reversed(xrange(1,4)):
...    print i
...
3
2
1
\end{verbatim}

Compared to extended slicing, such as \code{range(1,4)[::-1]},
\function{reversed()} is easier to read, runs faster, and uses
substantially less memory.

Note that \function{reversed()} only accepts sequences, not arbitrary
iterators. If you want to reverse an iterator, first convert it to
a list with \function{list()}.

\begin{verbatim}
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...   print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
...
\end{verbatim}

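The difference between sequences and arbitrary iterators can be shown
without touching the file system. In this small sketch the generator
expression stands in for any iterator that has no length or indexing.

```python
# A string is a sequence, so reversed() accepts it directly.
letters = list(reversed('abc'))

# A generator is an iterator but not a sequence.
gen = (n * n for n in range(4))
try:
    reversed(gen)
    accepted = True
except TypeError:
    accepted = False

# Converting to a list first makes the reversal possible.
squares = list(reversed(list(gen)))
```

The failed \function{reversed()} call doesn't consume anything from
the generator, so the later \function{list()} conversion still sees
every element.
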
\begin{seealso}
\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 324: New subprocess Module}

The standard library provides a number of ways to execute a
subprocess, offering different features and different levels of
complexity. \function{os.system(\var{command})} is easy to use, but
slow (it runs a shell process which executes the command) and
dangerous (you have to be careful about escaping the shell's
metacharacters). The \module{popen2} module offers classes that can
capture standard output and standard error from the subprocess, but
the naming is confusing. The \module{subprocess} module cleans
this up, providing a unified interface that offers all the features
you might need.

Instead of \module{popen2}'s collection of classes,
\module{subprocess} contains a single class called \class{Popen}
whose constructor supports a number of different keyword arguments.

\begin{verbatim}
class Popen(args, bufsize=0, executable=None,
            stdin=None, stdout=None, stderr=None,
            preexec_fn=None, close_fds=False, shell=False,
            cwd=None, env=None, universal_newlines=False,
            startupinfo=None, creationflags=0):
\end{verbatim}

\var{args} is commonly a sequence of strings that will be the
arguments to the program executed as the subprocess. (If the
\var{shell} argument is true, \var{args} can be a string which will
then be passed on to the shell for interpretation, just as
\function{os.system()} does.)

\var{stdin}, \var{stdout}, and \var{stderr} specify what the
subprocess's input, output, and error streams will be. You can
provide a file object or a file descriptor, or you can use the
constant \code{subprocess.PIPE} to create a pipe between the
subprocess and the parent.

The constructor has a number of handy options:

\begin{itemize}
\item \var{close_fds} requests that all file descriptors be closed
before running the subprocess.

\item \var{cwd} specifies the working directory in which the
subprocess will be executed (defaulting to whatever the parent's
working directory is).

\item \var{env} is a dictionary specifying environment variables.

\item \var{preexec_fn} is a function that gets called before the
child is started.

\item \var{universal_newlines} opens the child's input and output
using Python's universal newline feature.

\end{itemize}

Once you've created the \class{Popen} instance,
you can call its \method{wait()} method to pause until the subprocess
has exited, \method{poll()} to check if it's exited without pausing,
or \method{communicate(\var{data})} to send the string \var{data} to
the subprocess's standard input. \method{communicate(\var{data})}
then reads any data that the subprocess has sent to its standard output
or standard error, returning a tuple \code{(\var{stdout_data},
\var{stderr_data})}.

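A minimal sketch of \class{Popen} with pipes and
\method{communicate()} follows. To stay portable it assumes the child
is another Python interpreter, found through \code{sys.executable},
running a one-line program invented for this example. The data is
encoded explicitly because later Python versions require byte strings
on pipes.

```python
import subprocess
import sys

# Child program: read all of standard input and echo it upper-cased.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

p = subprocess.Popen([sys.executable, '-c', child_code],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)

# communicate() sends the data, closes the child's stdin, and then
# collects whatever the child wrote until it exits.
out, err = p.communicate('hello'.encode('ascii'))

# The child has already exited, so wait() just returns its status.
status = p.wait()
```

Because \var{stderr} was not set to \code{subprocess.PIPE}, the second
element of the returned tuple is \code{None}.
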
\function{call()} is a shortcut that passes its arguments along to the
\class{Popen} constructor, waits for the command to complete, and
returns the status code of the subprocess. It can serve as a safer
analog to \function{os.system()}:

\begin{verbatim}
sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb'])
if sts == 0:
    # Success
    ...
else:
    # dpkg returned an error
    ...
\end{verbatim}

The command is invoked without use of the shell. If you really do want to
use the shell, you can add \code{shell=True} as a keyword argument and provide
a string instead of a sequence:

\begin{verbatim}
sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True)
\end{verbatim}

The PEP takes various examples of shell and Python code and shows how
they'd be translated into Python code that uses \module{subprocess}.
Reading this section of the PEP is highly recommended.

\begin{seealso}
\seepep{324}{subprocess - New process module}{Written and implemented by Peter {\AA}strand, with assistance from Fredrik Lundh and others.}
\end{seealso}


%======================================================================
\section{PEP 327: Decimal Data Type}

Python has always supported floating-point (FP) numbers, based on the
underlying C \ctype{double} type, as a data type. However, while most
programming languages provide a floating-point type, many people (even
programmers) are unaware that floating-point numbers don't represent
certain decimal fractions accurately. The new \class{Decimal} type
can represent these fractions accurately, up to a user-specified
precision limit.


\subsection{Why is Decimal needed?}

The limitations arise from the representation used for floating-point numbers.
FP numbers are made up of three components:

\begin{itemize}
\item The sign, which is positive or negative.
\item The mantissa, which is a single-digit binary number
followed by a fractional part. For example, \code{1.01} in base-2 notation
is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation.
\item The exponent, which tells where the decimal point is located in the number represented.
\end{itemize}

For example, the number 1.25 has positive sign, a mantissa value of
1.01 (in binary), and an exponent of 0 (the decimal point doesn't need
to be shifted). The number 5 has the same sign and mantissa, but the
exponent is 2 because the mantissa is multiplied by 4 (2 to the power
of the exponent 2); 1.25 * 4 equals 5.

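The three components can be recovered with \function{math.frexp()},
as in the sketch below. The \function{decompose()} helper is
hypothetical; it rescales the result because
\function{math.frexp()} returns a mantissa between 0.5 and 1 rather
than between 1 and 2, and it assumes a nonzero argument.

```python
import math

def decompose(x):
    """Split nonzero x into (sign, mantissa, exponent) as in the text."""
    if x < 0:
        sign = '-'
    else:
        sign = '+'
    # frexp() gives abs(x) == m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(abs(x))
    # Rescale so that 1 <= mantissa < 2, matching the description above.
    return sign, m * 2, e - 1

one_and_a_quarter = decompose(1.25)
five = decompose(5.0)
```

Both values share the mantissa 1.25 (binary \code{1.01}); only the
exponent differs, 0 for 1.25 and 2 for 5.
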
Modern systems usually provide floating-point support that conforms to
a standard called IEEE 754. C's \ctype{double} type is usually
implemented as a 64-bit IEEE 754 number, which uses 52 bits of space
for the mantissa. This means that numbers can only be specified to 52
bits of precision. If you're trying to represent numbers whose
expansion repeats endlessly, the expansion is cut off after 52 bits.
Unfortunately, most software needs to produce output in base 10, and
common fractions in base 10 are often repeating decimals in binary.
For example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 =
1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE
754 has to chop off that infinitely repeated decimal after 52 digits,
so the representation is slightly inaccurate.

Sometimes you can see this inaccuracy when the number is printed:
\begin{verbatim}
>>> 1.1
1.1000000000000001
\end{verbatim}

The inaccuracy isn't always visible when you print the number because
the FP-to-decimal-string conversion is provided by the C library, and
most C libraries try to produce sensible output. Even if it's not
displayed, however, the inaccuracy is still there and subsequent
operations can magnify the error.

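A short sketch of how the error accumulates, assuming the platform's
\ctype{double} is an IEEE 754 64-bit number: adding 0.1 ten times does
not give exactly 1.0.

```python
total = 0.0
for i in range(10):
    total += 0.1        # each addition carries a tiny rounding error

# The accumulated error is small but not zero.
exact = (total == 1.0)
error = abs(total - 1.0)
```

The comparison fails even though printing \code{total} with a few
decimal places would show \code{1.0}.
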
For many applications this doesn't matter. If I'm plotting points and
displaying them on my monitor, the difference between 1.1 and
1.1000000000000001 is too small to be visible. Reports often limit
output to a certain number of decimal places, and if you round the
number to two or three or even eight decimal places, the error is
never apparent. However, for applications where it does matter,
it's a lot of work to implement your own custom arithmetic routines.

Hence, the \class{Decimal} type was created.

\subsection{The \class{Decimal} type}

A new module, \module{decimal}, was added to Python's standard
library. It contains two classes, \class{Decimal} and
\class{Context}. \class{Decimal} instances represent numbers, and
\class{Context} instances are used to wrap up various settings such as
the precision and default rounding mode.

\class{Decimal} instances are immutable, like regular Python integers
and FP numbers; once it's been created, you can't change the value an
instance represents. \class{Decimal} instances can be created from
integers or strings:

\begin{verbatim}
>>> import decimal
>>> decimal.Decimal(1972)
Decimal("1972")
>>> decimal.Decimal("1.1")
Decimal("1.1")
\end{verbatim}

You can also provide tuples containing the sign, the mantissa represented
as a tuple of decimal digits, and the exponent:

\begin{verbatim}
>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
Decimal("-14.75")
\end{verbatim}

Cautionary note: the sign bit is a Boolean value, so 0 is positive and
1 is negative.

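The tuple form can be inspected in the other direction with the
\method{as_tuple()} method, as this brief sketch shows. The variable
names are chosen only for illustration.

```python
import decimal

d = decimal.Decimal((1, (1, 4, 7, 5), -2))

# as_tuple() returns the same three components that were passed in.
sign, digits, exponent = d.as_tuple()

# Flipping the sign flag from 1 to 0 gives the positive value.
positive = decimal.Decimal((0, digits, exponent))

as_text = str(d)
positive_text = str(positive)
```

The digits 1475 with an exponent of -2 mean 1475 times 10 to the
power -2, which is 14.75; the sign flag then decides whether the
result is negative.
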
| 625 | Converting from floating-point numbers poses a bit of a problem:
|
|---|
| 626 | should the FP number representing 1.1 turn into the decimal number for
|
|---|
| 627 | exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced?
|
|---|
| 628 | The decision was to dodge the issue and leave such a conversion out of
|
|---|
| 629 | the API. Instead, you should convert the floating-point number into a
|
|---|
| 630 | string using the desired precision and pass the string to the
|
|---|
| 631 | \class{Decimal} constructor:
|
|---|
| 632 |
|
|---|
| 633 | \begin{verbatim}
|
|---|
| 634 | >>> f = 1.1
|
|---|
| 635 | >>> decimal.Decimal(str(f))
|
|---|
| 636 | Decimal("1.1")
|
|---|
| 637 | >>> decimal.Decimal('%.12f' % f)
|
|---|
| 638 | Decimal("1.100000000000")
|
|---|
| 639 | \end{verbatim}
|
|---|
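If this conversion is needed in several places, the string round-trip can be wrapped in a small function. The sketch below assumes a hypothetical helper named \function{float_to_decimal()}; it is not part of the \module{decimal} module.

```python
import decimal

def float_to_decimal(value, places):
    """Hypothetical helper: format a float to a fixed number of
    decimal places and hand the resulting string to Decimal."""
    return decimal.Decimal('%.*f' % (places, value))

price = float_to_decimal(1.1, 2)
# 0.1 + 0.2 is not exactly 0.3 as a float, but the formatted string is "0.30".
total = float_to_decimal(0.1 + 0.2, 2)
```

The caller chooses \var{places}, so the decision about how much of the binary representation to keep stays explicit.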
| 640 |
|
|---|
| 641 | Once you have \class{Decimal} instances, you can perform the usual
|
|---|
| 642 | mathematical operations on them. One limitation: exponentiation
|
|---|
| 643 | requires an integer exponent:
|
|---|
| 644 |
|
|---|
| 645 | \begin{verbatim}
|
|---|
| 646 | >>> a = decimal.Decimal('35.72')
|
|---|
| 647 | >>> b = decimal.Decimal('1.73')
|
|---|
| 648 | >>> a+b
|
|---|
| 649 | Decimal("37.45")
|
|---|
| 650 | >>> a-b
|
|---|
| 651 | Decimal("33.99")
|
|---|
| 652 | >>> a*b
|
|---|
| 653 | Decimal("61.7956")
|
|---|
| 654 | >>> a/b
|
|---|
| 655 | Decimal("20.64739884393063583815028902")
|
|---|
| 656 | >>> a ** 2
|
|---|
| 657 | Decimal("1275.9184")
|
|---|
| 658 | >>> a**b
|
|---|
| 659 | Traceback (most recent call last):
|
|---|
| 660 | ...
|
|---|
| 661 | decimal.InvalidOperation: x ** (non-integer)
|
|---|
| 662 | \end{verbatim}
|
|---|
| 663 |
|
|---|
| 664 | You can combine \class{Decimal} instances with integers, but not with
|
|---|
| 665 | floating-point numbers:
|
|---|
| 666 |
|
|---|
| 667 | \begin{verbatim}
|
|---|
| 668 | >>> a + 4
|
|---|
| 669 | Decimal("39.72")
|
|---|
| 670 | >>> a + 4.5
|
|---|
| 671 | Traceback (most recent call last):
|
|---|
| 672 | ...
|
|---|
| 673 | TypeError: You can interact Decimal only with int, long or Decimal data types.
|
|---|
| 674 | >>>
|
|---|
| 675 | \end{verbatim}
|
|---|
| 676 |
|
|---|
| 677 | \class{Decimal} numbers can be used with the \module{math} and
|
|---|
| 678 | \module{cmath} modules, but note that they'll be immediately converted to
|
|---|
| 679 | floating-point numbers before the operation is performed, resulting in
|
|---|
| 680 | a possible loss of precision and accuracy. You'll also get back a
|
|---|
| 681 | regular floating-point number and not a \class{Decimal}.
|
|---|
| 682 |
|
|---|
| 683 | \begin{verbatim}
|
|---|
| 684 | >>> import math, cmath
|
|---|
| 685 | >>> d = decimal.Decimal('123456789012.345')
|
|---|
| 686 | >>> math.sqrt(d)
|
|---|
| 687 | 351364.18288201344
|
|---|
| 688 | >>> cmath.sqrt(-d)
|
|---|
| 689 | 351364.18288201344j
|
|---|
| 690 | \end{verbatim}
|
|---|
| 691 |
|
|---|
| 692 | \class{Decimal} instances have a \method{sqrt()} method that
|
|---|
| 693 | returns a \class{Decimal}, but if you need other things such as
|
|---|
| 694 | trigonometric functions you'll have to implement them.
|
|---|
| 695 |
|
|---|
| 696 | \begin{verbatim}
|
|---|
| 697 | >>> d.sqrt()
|
|---|
| 698 | Decimal("351364.1828820134592177245001")
|
|---|
| 699 | \end{verbatim}
|
|---|
| 700 |
|
|---|
| 701 |
|
|---|
| 702 | \subsection{The \class{Context} type}
|
|---|
| 703 |
|
|---|
| 704 | Instances of the \class{Context} class encapsulate several settings for
|
|---|
| 705 | decimal operations:
|
|---|
| 706 |
|
|---|
| 707 | \begin{itemize}
|
|---|
| 708 | \item \member{prec} is the precision, the number of significant digits.
|
|---|
| 709 | \item \member{rounding} specifies the rounding mode. The \module{decimal}
|
|---|
| 710 | module has constants for the various possibilities:
|
|---|
| 711 | \constant{ROUND_DOWN}, \constant{ROUND_CEILING},
|
|---|
| 712 | \constant{ROUND_HALF_EVEN}, and various others.
|
|---|
| 713 | \item \member{traps} is a dictionary specifying what happens on
|
|---|
| 714 | encountering certain error conditions: either an exception is raised or
|
|---|
| 715 | a value is returned. Some examples of error conditions are
|
|---|
| 716 | division by zero, loss of precision, and overflow.
|
|---|
| 717 | \end{itemize}
|
|---|
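To illustrate the \member{prec} and \member{rounding} settings listed above, here is a short sketch that creates two private \class{Context} instances and performs the same division through each. The variable names are arbitrary.

```python
import decimal

# Two private contexts that differ only in their rounding mode.
down = decimal.Context(prec=3, rounding=decimal.ROUND_DOWN)
half_even = decimal.Context(prec=3, rounding=decimal.ROUND_HALF_EVEN)

two, three = decimal.Decimal(2), decimal.Decimal(3)
truncated = down.divide(two, three)      # digits beyond the third are dropped
rounded = half_even.divide(two, three)   # rounded to the nearest value
```

With these settings \code{truncated} is 0.666 and \code{rounded} is 0.667. Calling methods on an explicit context leaves the thread's default context untouched.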
| 718 |
|
|---|
| 719 | There's a thread-local default context available by calling
|
|---|
| 720 | \function{getcontext()}; you can change the properties of this context
|
|---|
| 721 | to alter the default precision, rounding, or trap handling. The
|
|---|
| 722 | following example shows the effect of changing the precision of the default
|
|---|
| 723 | context:
|
|---|
| 724 |
|
|---|
| 725 | \begin{verbatim}
|
|---|
| 726 | >>> decimal.getcontext().prec
|
|---|
| 727 | 28
|
|---|
| 728 | >>> decimal.Decimal(1) / decimal.Decimal(7)
|
|---|
| 729 | Decimal("0.1428571428571428571428571429")
|
|---|
| 730 | >>> decimal.getcontext().prec = 9
|
|---|
| 731 | >>> decimal.Decimal(1) / decimal.Decimal(7)
|
|---|
| 732 | Decimal("0.142857143")
|
|---|
| 733 | \end{verbatim}
|
|---|
| 734 |
|
|---|
| 735 | The default action for error conditions is selectable; the module can
|
|---|
| 736 | either return a special value such as infinity or not-a-number, or
|
|---|
| 737 | exceptions can be raised:
|
|---|
| 738 |
|
|---|
| 739 | \begin{verbatim}
|
|---|
| 740 | >>> decimal.Decimal(1) / decimal.Decimal(0)
|
|---|
| 741 | Traceback (most recent call last):
|
|---|
| 742 | ...
|
|---|
| 743 | decimal.DivisionByZero: x / 0
|
|---|
| 744 | >>> decimal.getcontext().traps[decimal.DivisionByZero] = False
|
|---|
| 745 | >>> decimal.Decimal(1) / decimal.Decimal(0)
|
|---|
| 746 | Decimal("Infinity")
|
|---|
| 747 | >>>
|
|---|
| 748 | \end{verbatim}
|
|---|
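When a trap is disabled, the condition is still recorded in the context's \member{flags} mapping. The sketch below works on a copy of the current context, so the thread's default settings are not modified.

```python
import decimal

# Copy the current context so the default one is left as it was.
ctx = decimal.getcontext().copy()
ctx.traps[decimal.DivisionByZero] = False
ctx.clear_flags()

result = ctx.divide(decimal.Decimal(1), decimal.Decimal(0))
# No exception was raised, but the condition was noted in the flags.
was_flagged = bool(ctx.flags[decimal.DivisionByZero])
```

Checking \member{flags} after a batch of calculations is one way to detect problems without wrapping every operation in a \keyword{try} statement.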
| 749 |
|
|---|
| 750 | The \class{Context} instance also has various methods for formatting
|
|---|
| 751 | numbers such as \method{to_eng_string()} and \method{to_sci_string()}.
|
|---|
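As a brief sketch of the two formatting methods just mentioned, the same value is rendered below in scientific and in engineering notation. The sample value is arbitrary.

```python
import decimal

ctx = decimal.getcontext()
value = decimal.Decimal('12.3E+3')

scientific = ctx.to_sci_string(value)    # one digit before the decimal point
engineering = ctx.to_eng_string(value)   # exponent is a multiple of three
```

For this value the scientific form is \code{'1.23E+4'} and the engineering form is \code{'12.3E+3'}.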
| 752 |
|
|---|
| 753 | For more information, see the documentation for the \module{decimal}
|
|---|
| 754 | module, which includes a quick-start tutorial and a reference.
|
|---|
| 755 |
|
|---|
| 756 | \begin{seealso}
|
|---|
| 757 | \seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
|
|---|
| 758 | by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.}
|
|---|
| 759 |
|
|---|
| 760 | \seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html}
|
|---|
| 761 | {A more detailed overview of the IEEE-754 representation.}
|
|---|
| 762 |
|
|---|
| 763 | \seeurl{http://www.lahey.com/float.htm}
|
|---|
| 764 | {The article uses Fortran code to illustrate many of the problems
|
|---|
| 765 | that floating-point inaccuracy can cause.}
|
|---|
| 766 |
|
|---|
| 767 | \seeurl{http://www2.hursley.ibm.com/decimal/}
|
|---|
| 768 | {A description of a decimal-based representation. This representation
|
|---|
| 769 | is being proposed as a standard, and underlies the new Python decimal
|
|---|
| 770 | type. Much of this material was written by Mike Cowlishaw, designer of the
|
|---|
| 771 | Rexx language.}
|
|---|
| 772 |
|
|---|
| 773 | \end{seealso}
|
|---|
| 774 |
|
|---|
| 775 |
|
|---|
| 776 | %======================================================================
|
|---|
| 777 | \section{PEP 328: Multi-line Imports}
|
|---|
| 778 |
|
|---|
| 779 | One language change is a small syntactic tweak aimed at making it
|
|---|
| 780 | easier to import many names from a module. In a
|
|---|
| 781 | \code{from \var{module} import \var{names}} statement,
|
|---|
| 782 | \var{names} is a sequence of names separated by commas. If the sequence is
|
|---|
| 783 | very long, you can either write multiple imports from the same module,
|
|---|
| 784 | or you can use backslashes to escape the line endings like this:
|
|---|
| 785 |
|
|---|
| 786 | \begin{verbatim}
|
|---|
| 787 | from SimpleXMLRPCServer import SimpleXMLRPCServer,\
|
|---|
| 788 | SimpleXMLRPCRequestHandler,\
|
|---|
| 789 | CGIXMLRPCRequestHandler,\
|
|---|
| 790 | resolve_dotted_attribute
|
|---|
| 791 | \end{verbatim}
|
|---|
| 792 |
|
|---|
| 793 | The syntactic change in Python 2.4 simply allows putting the names
|
|---|
| 794 | within parentheses. Python ignores newlines within a parenthesized
|
|---|
| 795 | expression, so the backslashes are no longer needed:
|
|---|
| 796 |
|
|---|
| 797 | \begin{verbatim}
|
|---|
| 798 | from SimpleXMLRPCServer import (SimpleXMLRPCServer,
|
|---|
| 799 | SimpleXMLRPCRequestHandler,
|
|---|
| 800 | CGIXMLRPCRequestHandler,
|
|---|
| 801 | resolve_dotted_attribute)
|
|---|
| 802 | \end{verbatim}
|
|---|
| 803 |
|
|---|
| 804 | The PEP also proposes that all \keyword{import} statements be absolute
|
|---|
| 805 | imports, with a leading \samp{.} character to indicate a relative
|
|---|
| 806 | import. This part of the PEP was not implemented for Python 2.4,
|
|---|
| 807 | but was completed for Python 2.5.
|
|---|
| 808 |
|
|---|
| 809 | \begin{seealso}
|
|---|
| 810 | \seepep{328}{Imports: Multi-Line and Absolute/Relative}
|
|---|
| 811 | {Written by Aahz. Multi-line imports were implemented by
|
|---|
| 812 | Dima Dorfman.}
|
|---|
| 813 | \end{seealso}
|
|---|
| 814 |
|
|---|
| 815 |
|
|---|
| 816 | %======================================================================
|
|---|
| 817 | \section{PEP 331: Locale-Independent Float/String Conversions}
|
|---|
| 818 |
|
|---|
| 819 | The \module{locale} module lets Python software select various
|
|---|
| 820 | conversions and display conventions that are localized to a particular
|
|---|
| 821 | country or language. However, the module was careful to not change
|
|---|
| 822 | the numeric locale because various functions in Python's
|
|---|
| 823 | implementation required that the numeric locale remain set to the
|
|---|
| 824 | \code{'C'} locale. Often this was because the code was using the C library's
|
|---|
| 825 | \cfunction{atof()} function.
|
|---|
| 826 |
|
|---|
| 827 | Not setting the numeric locale caused trouble for extensions that used
|
|---|
| 828 | third-party C libraries, however, because they wouldn't have the
|
|---|
| 829 | correct locale set. The motivating example was GTK+, whose user
|
|---|
| 830 | interface widgets weren't displaying numbers in the current locale.
|
|---|
| 831 |
|
|---|
| 832 | The solution described in the PEP is to add three new functions to the
|
|---|
| 833 | Python API that perform ASCII-only conversions, ignoring the locale
|
|---|
| 834 | setting:
|
|---|
| 835 |
|
|---|
| 836 | \begin{itemize}
|
|---|
| 837 | \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})}
|
|---|
| 838 | and \cfunction{PyOS_ascii_atof(\var{str})}
|
|---|
| 839 | both convert a string to a C \ctype{double}.
|
|---|
| 840 | \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string.
|
|---|
| 841 | \end{itemize}
|
|---|
| 842 |
|
|---|
| 843 | The code for these functions came from the GLib library
|
|---|
| 844 | (\url{http://developer.gnome.org/arch/gtk/glib.html}), whose
|
|---|
| 845 | developers kindly relicensed the relevant functions and donated them
|
|---|
| 846 | to the Python Software Foundation. The \module{locale} module
|
|---|
| 847 | can now change the numeric locale, letting extensions such as GTK+
|
|---|
| 848 | produce the correct results.
|
|---|
| 849 |
|
|---|
| 850 | \begin{seealso}
|
|---|
| 851 | \seepep{331}{Locale-Independent Float/String Conversions}
|
|---|
| 852 | {Written by Christian R. Reis, and implemented by Gustavo Carneiro.}
|
|---|
| 853 | \end{seealso}
|
|---|
| 854 |
|
|---|
| 855 | %======================================================================
|
|---|
| 856 | \section{Other Language Changes}
|
|---|
| 857 |
|
|---|
| 858 | Here are all of the changes that Python 2.4 makes to the core Python
|
|---|
| 859 | language.
|
|---|
| 860 |
|
|---|
| 861 | \begin{itemize}
|
|---|
| 862 |
|
|---|
| 863 | \item Decorators for functions and methods were added (\pep{318}).
|
|---|
| 864 |
|
|---|
| 865 | \item Built-in \function{set} and \function{frozenset} types were
|
|---|
| 866 | added (\pep{218}). Other new built-ins include the \function{reversed(\var{seq})} function (\pep{322}).
|
|---|
| 867 |
|
|---|
| 868 | \item Generator expressions were added (\pep{289}).
|
|---|
| 869 |
|
|---|
| 870 | \item Certain numeric expressions no longer return values restricted to 32 or 64 bits (\pep{237}).
|
|---|
| 871 |
|
|---|
| 872 | \item You can now put parentheses around the list of names in a
|
|---|
| 873 | \code{from \var{module} import \var{names}} statement (\pep{328}).
|
|---|
| 874 |
|
|---|
| 875 | \item The \method{dict.update()} method now accepts the same
|
|---|
| 876 | argument forms as the \class{dict} constructor. This includes any
|
|---|
| 877 | mapping, any iterable of key/value pairs, and keyword arguments.
|
|---|
| 878 | (Contributed by Raymond Hettinger.)
|
|---|
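A minimal sketch of the three argument forms accepted by \method{dict.update()}; the dictionary contents are made up for illustration.

```python
settings = {'host': 'localhost'}

settings.update({'port': 8080})                   # another mapping
settings.update([('debug', False)])               # an iterable of key/value pairs
settings.update(timeout=30, host='example.org')   # keyword arguments
```

Keys that already exist, such as \code{'host'} here, are overwritten by the later value.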
| 879 |
|
|---|
| 880 | \item The string methods \method{ljust()}, \method{rjust()}, and
|
|---|
| 881 | \method{center()} now take an optional argument for specifying a
|
|---|
| 882 | fill character other than a space.
|
|---|
| 883 | (Contributed by Raymond Hettinger.)
|
|---|
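For instance, the optional fill character might be used like this (the strings and widths are arbitrary):

```python
title = 'Total'

centered = title.center(11, '*')   # pad both sides with asterisks
left = title.ljust(11, '.')        # pad on the right with dots
right = title.rjust(11, '0')       # pad on the left with zeros
```

This produces \code{'***Total***'}, \code{'Total......'}, and \code{'000000Total'} respectively.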
| 884 |
|
|---|
| 885 | \item Strings also gained an \method{rsplit()} method that
|
|---|
| 886 | works like the \method{split()} method but splits from the end of
|
|---|
| 887 | the string.
|
|---|
| 888 | (Contributed by Sean Reifschneider.)
|
|---|
| 889 |
|
|---|
| 890 | \begin{verbatim}
|
|---|
| 891 | >>> 'www.python.org'.split('.', 1)
|
|---|
| 892 | ['www', 'python.org']
|
|---|
| 893 | >>> 'www.python.org'.rsplit('.', 1)
|
|---|
| 894 | ['www.python', 'org']
|
|---|
| 895 | \end{verbatim}
|
|---|
| 896 |
|
|---|
| 897 | \item Three keyword parameters, \var{cmp}, \var{key}, and
|
|---|
| 898 | \var{reverse}, were added to the \method{sort()} method of lists.
|
|---|
| 899 | These parameters make some common usages of \method{sort()} simpler.
|
|---|
| 900 | All of these parameters are optional.
|
|---|
| 901 |
|
|---|
| 902 | For the \var{cmp} parameter, the value should be a comparison function
|
|---|
| 903 | that takes two parameters and returns -1, 0, or +1 depending on how
|
|---|
| 904 | the parameters compare. This function will then be used to sort the
|
|---|
| 905 | list. Previously this was the only parameter that could be provided
|
|---|
| 906 | to \method{sort()}.
|
|---|
| 907 |
|
|---|
| 908 | \var{key} should be a single-parameter function that takes a list
|
|---|
| 909 | element and returns a comparison key for the element. The list is
|
|---|
| 910 | then sorted using the comparison keys. The following example sorts a
|
|---|
| 911 | list case-insensitively:
|
|---|
| 912 |
|
|---|
| 913 | \begin{verbatim}
|
|---|
| 914 | >>> L = ['A', 'b', 'c', 'D']
|
|---|
| 915 | >>> L.sort() # Case-sensitive sort
|
|---|
| 916 | >>> L
|
|---|
| 917 | ['A', 'D', 'b', 'c']
|
|---|
| 918 | >>> # Using 'key' parameter to sort list
|
|---|
| 919 | >>> L.sort(key=lambda x: x.lower())
|
|---|
| 920 | >>> L
|
|---|
| 921 | ['A', 'b', 'c', 'D']
|
|---|
| 922 | >>> # Old-fashioned way
|
|---|
| 923 | >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
|
|---|
| 924 | >>> L
|
|---|
| 925 | ['A', 'b', 'c', 'D']
|
|---|
| 926 | \end{verbatim}
|
|---|
| 927 |
|
|---|
| 928 | The last example, which uses the \var{cmp} parameter, is the old way
|
|---|
| 929 | to perform a case-insensitive sort. It works but is slower than using
|
|---|
| 930 | a \var{key} parameter. Using \var{key} calls the \method{lower()} method
|
|---|
| 931 | once for each element in the list while using \var{cmp} will call it
|
|---|
| 932 | twice for each comparison, so using \var{key} saves on invocations of
|
|---|
| 933 | the \method{lower()} method.
|
|---|
| 934 |
|
|---|
| 935 | For simple key functions and comparison functions, it is often
|
|---|
| 936 | possible to avoid a \keyword{lambda} expression by using an unbound
|
|---|
| 937 | method instead. For example, the above case-insensitive sort is best
|
|---|
| 938 | written as:
|
|---|
| 939 |
|
|---|
| 940 | \begin{verbatim}
|
|---|
| 941 | >>> L.sort(key=str.lower)
|
|---|
| 942 | >>> L
|
|---|
| 943 | ['A', 'b', 'c', 'D']
|
|---|
| 944 | \end{verbatim}
|
|---|
| 945 |
|
|---|
| 946 | Finally, the \var{reverse} parameter takes a Boolean value. If the
|
|---|
| 947 | value is true, the list will be sorted into reverse order.
|
|---|
| 948 | Instead of \code{L.sort() ; L.reverse()}, you can now write
|
|---|
| 949 | \code{L.sort(reverse=True)}.
|
|---|
| 950 |
|
|---|
| 951 | The results of sorting are now guaranteed to be stable. This means
|
|---|
| 952 | that two entries with equal keys will be returned in the same order as
|
|---|
| 953 | they were input. For example, you can sort a list of people by name,
|
|---|
| 954 | and then sort the list by age, resulting in a list sorted by age where
|
|---|
| 955 | people with the same age are in name-sorted order.
|
|---|
| 956 |
|
|---|
| 957 | (All changes to \method{sort()} contributed by Raymond Hettinger.)
|
|---|
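The name-then-age example and the \var{reverse} parameter can be sketched as follows; the list of people is invented for illustration.

```python
people = [('Dave', 31), ('Alice', 25), ('Carol', 31), ('Bob', 25)]

people.sort(key=lambda person: person[0])   # first pass: by name
people.sort(key=lambda person: person[1])   # second pass: by age; ties keep name order

# reverse=True also preserves the original order of equal elements.
oldest_first = list(people)
oldest_first.sort(key=lambda person: person[1], reverse=True)
```

After the two passes \code{people} is ordered by age, and people of the same age remain in alphabetical order. In \code{oldest_first} the 31-year-olds come first, still with Carol before Dave.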
| 958 |
|
|---|
| 959 | \item There is a new built-in function
|
|---|
| 960 | \function{sorted(\var{iterable})} that works like the in-place
|
|---|
| 961 | \method{list.sort()} method but can be used in
|
|---|
| 962 | expressions. The differences are:
|
|---|
| 963 | \begin{itemize}
|
|---|
| 964 | \item the input may be any iterable;
|
|---|
| 965 | \item a newly formed copy is sorted, leaving the original intact; and
|
|---|
| 966 | \item the expression returns the new sorted copy
|
|---|
| 967 | \end{itemize}
|
|---|
| 968 |
|
|---|
| 969 | \begin{verbatim}
|
|---|
| 970 | >>> L = [9,7,8,3,2,4,1,6,5]
|
|---|
| 971 | >>> [10+i for i in sorted(L)] # usable in a list comprehension
|
|---|
| 972 | [11, 12, 13, 14, 15, 16, 17, 18, 19]
|
|---|
| 973 | >>> L # original is left unchanged
|
|---|
| 974 | [9, 7, 8, 3, 2, 4, 1, 6, 5]
|
|---|
| 975 | >>> sorted('Monty Python') # any iterable may be an input
|
|---|
| 976 | [' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']
|
|---|
| 977 |
|
|---|
| 978 | >>> # List the contents of a dict sorted by key values
|
|---|
| 979 | >>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
|
|---|
| 980 | >>> for k, v in sorted(colormap.iteritems()):
|
|---|
| 981 | ... print k, v
|
|---|
| 982 | ...
|
|---|
| 983 | black 4
|
|---|
| 984 | blue 2
|
|---|
| 985 | green 3
|
|---|
| 986 | red 1
|
|---|
| 987 | yellow 5
|
|---|
| 988 | \end{verbatim}
|
|---|
| 989 |
|
|---|
| 990 | (Contributed by Raymond Hettinger.)
|
|---|
| 991 |
|
|---|
| 992 | \item Integer operations will no longer trigger an \exception{OverflowWarning}.
|
|---|
| 993 | The \exception{OverflowWarning} warning will disappear in Python 2.5.
|
|---|
| 994 |
|
|---|
| 995 | \item The interpreter gained a new switch, \programopt{-m}, that
|
|---|
| 996 | takes a name, searches for the corresponding module on \code{sys.path},
|
|---|
| 997 | and runs the module as a script. For example,
|
|---|
| 998 | you can now run the Python profiler with \code{python -m profile}.
|
|---|
| 999 | (Contributed by Nick Coghlan.)
|
|---|
| 1000 |
|
|---|
| 1001 | \item The \function{eval(\var{expr}, \var{globals}, \var{locals})}
|
|---|
| 1002 | and \function{execfile(\var{filename}, \var{globals}, \var{locals})}
|
|---|
| 1003 | functions and the \keyword{exec} statement now accept any mapping type
|
|---|
| 1004 | for the \var{locals} parameter. Previously this had to be a regular
|
|---|
| 1005 | Python dictionary. (Contributed by Raymond Hettinger.)
|
|---|
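As a sketch of what a non-dictionary \var{locals} argument makes possible, the hypothetical class below computes values on demand instead of storing them. It implements only \method{__getitem__()}, which is all that name lookup needs in this example.

```python
class UpperNames(object):
    """Hypothetical mapping: resolves any all-lowercase name
    to the upper-case spelling of that name."""
    def __getitem__(self, name):
        if name.islower():
            return name.upper()
        raise KeyError(name)

# 'hello' and 'world' are never defined; the mapping supplies them.
greeting = eval("hello + ' ' + world", {}, UpperNames())
```

The expression evaluates to \code{'HELLO WORLD'}. Note that such a mapping also intercepts lower-case built-in names, so a real implementation would need to be more selective.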
| 1006 |
|
|---|
| 1007 | \item The \function{zip()} built-in function and \function{itertools.izip()}
|
|---|
| 1008 | now return an empty list (an empty iterator, for \function{izip()}) if called with no arguments.
|
|---|
| 1009 | Previously they raised a \exception{TypeError}
|
|---|
| 1010 | exception. This makes them more
|
|---|
| 1011 | suitable for use with variable length argument lists:
|
|---|
| 1012 |
|
|---|
| 1013 | \begin{verbatim}
|
|---|
| 1014 | >>> def transpose(array):
|
|---|
| 1015 | ... return zip(*array)
|
|---|
| 1016 | ...
|
|---|
| 1017 | >>> transpose([(1,2,3), (4,5,6)])
|
|---|
| 1018 | [(1, 4), (2, 5), (3, 6)]
|
|---|
| 1019 | >>> transpose([])
|
|---|
| 1020 | []
|
|---|
| 1021 | \end{verbatim}
|
|---|
| 1022 | (Contributed by Raymond Hettinger.)
|
|---|
| 1023 |
|
|---|
| 1024 | \item Encountering a failure while importing a module no longer leaves
|
|---|
| 1025 | a partially-initialized module object in \code{sys.modules}. The
|
|---|
| 1026 | incomplete module object left behind would fool further imports of the
|
|---|
| 1027 | same module into succeeding, leading to confusing errors.
|
|---|
| 1028 | (Fixed by Tim Peters.)
|
|---|
| 1029 |
|
|---|
| 1030 | \item \constant{None} is now a constant; code that binds a new value to
|
|---|
| 1031 | the name \samp{None} is now a syntax error.
|
|---|
| 1032 | (Contributed by Raymond Hettinger.)
|
|---|
| 1033 |
|
|---|
| 1034 | \end{itemize}
|
|---|
| 1035 |
|
|---|
| 1036 |
|
|---|
| 1037 | %======================================================================
|
|---|
| 1038 | \subsection{Optimizations}
|
|---|
| 1039 |
|
|---|
| 1040 | \begin{itemize}
|
|---|
| 1041 |
|
|---|
| 1042 | \item The inner loops for list and tuple slicing
|
|---|
| 1043 | were optimized and now run about one-third faster. The inner loops
|
|---|
| 1044 | for dictionaries were also optimized, resulting in performance boosts for
|
|---|
| 1045 | \method{keys()}, \method{values()}, \method{items()},
|
|---|
| 1046 | \method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}.
|
|---|
| 1047 | (Contributed by Raymond Hettinger.)
|
|---|
| 1048 |
|
|---|
| 1049 | \item The machinery for growing and shrinking lists was optimized for
|
|---|
| 1050 | speed and for space efficiency. Appending and popping from lists now
|
|---|
| 1051 | runs faster due to more efficient code paths and less frequent use of
|
|---|
| 1052 | the underlying system \cfunction{realloc()}. List comprehensions
|
|---|
| 1053 | also benefit. \method{list.extend()} was also optimized and no
|
|---|
| 1054 | longer converts its argument into a temporary list before extending
|
|---|
| 1055 | the base list. (Contributed by Raymond Hettinger.)
|
|---|
| 1056 |
|
|---|
| 1057 | \item \function{list()}, \function{tuple()}, \function{map()},
|
|---|
| 1058 | \function{filter()}, and \function{zip()} now run several times
|
|---|
| 1059 | faster with non-sequence arguments that supply a \method{__len__()}
|
|---|
| 1060 | method. (Contributed by Raymond Hettinger.)
|
|---|
| 1061 |
|
|---|
| 1062 | \item The methods \method{list.__getitem__()},
|
|---|
| 1063 | \method{dict.__getitem__()}, and \method{dict.__contains__()} are
|
|---|
| 1064 | now implemented as \class{method_descriptor} objects rather
|
|---|
| 1065 | than \class{wrapper_descriptor} objects. This form of
|
|---|
| 1066 | access doubles their performance and makes them more suitable for
|
|---|
| 1067 | use as arguments to functionals:
|
|---|
| 1068 | \samp{map(mydict.__getitem__, keylist)}.
|
|---|
| 1069 | (Contributed by Raymond Hettinger.)
|
|---|
| 1070 |
|
|---|
| 1071 | \item Added a new opcode, \code{LIST_APPEND}, that simplifies
|
|---|
| 1072 | the generated bytecode for list comprehensions and speeds them up
|
|---|
| 1073 | by about a third. (Contributed by Raymond Hettinger.)
|
|---|
| 1074 |
|
|---|
| 1075 | \item The peephole bytecode optimizer has been improved to
|
|---|
| 1076 | produce shorter, faster bytecode; remarkably, the resulting bytecode is
|
|---|
| 1077 | more readable. (Enhanced by Raymond Hettinger.)
|
|---|
| 1078 |
|
|---|
| 1079 | \item String concatenations in statements of the form \code{s = s +
|
|---|
| 1080 | "abc"} and \code{s += "abc"} are now performed more efficiently in
|
|---|
| 1081 | certain circumstances. This optimization won't be present in other
|
|---|
| 1082 | Python implementations such as Jython, so you shouldn't rely on it;
|
|---|
| 1083 | using the \method{join()} method of strings is still recommended when
|
|---|
| 1084 | you want to efficiently glue a large number of strings together.
|
|---|
| 1085 | (Contributed by Armin Rigo.)
|
|---|
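The recommended \method{join()} idiom looks like this in its simplest form: collect the pieces in a list and glue them together once at the end. The data is invented for illustration.

```python
lines = []
for number in range(1, 4):
    # Accumulate pieces in a list instead of growing a string.
    lines.append('line %d' % number)

# A single join() call builds the final string.
report = '\n'.join(lines)
```

This approach does not depend on any implementation-specific optimization of string concatenation.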
| 1086 |
|
|---|
| 1087 | \end{itemize}
|
|---|
| 1088 |
|
|---|
| 1089 | % pystone is almost useless for comparing different versions of Python;
|
|---|
| 1090 | % instead, it excels at predicting relative Python performance on
|
|---|
| 1091 | % different machines.
|
|---|
| 1092 | % So, this section would be more informative if it used other tools
|
|---|
| 1093 | % such as pybench and parrotbench. For a more application oriented
|
|---|
| 1094 | % benchmark, try comparing the timings of test_decimal.py under 2.3
|
|---|
| 1095 | % and 2.4.
|
|---|
| 1096 |
|
|---|
| 1097 | The net result of the 2.4 optimizations is that Python 2.4 runs the
|
|---|
| 1098 | pystone benchmark around 5\% faster than Python 2.3 and 35\% faster
|
|---|
| 1099 | than Python 2.2. (pystone is not a particularly good benchmark, but
|
|---|
| 1100 | it's the most commonly used measurement of Python's performance. Your
|
|---|
| 1101 | own applications may show greater or smaller benefits from Python~2.4.)
|
|---|
| 1102 |
|
|---|
| 1103 |
|
|---|
| 1104 | %======================================================================
|
|---|
| 1105 | \section{New, Improved, and Deprecated Modules}
|
|---|
| 1106 |
|
|---|
| 1107 | As usual, Python's standard library received a number of enhancements and
|
|---|
| 1108 | bug fixes. Here's a partial list of the most notable changes, sorted
|
|---|
| 1109 | alphabetically by module name. Consult the
|
|---|
| 1110 | \file{Misc/NEWS} file in the source tree for a more
|
|---|
| 1111 | complete list of changes, or look through the CVS logs for all the
|
|---|
| 1112 | details.
|
|---|
| 1113 |
|
|---|
| 1114 | \begin{itemize}
|
|---|
| 1115 |
|
|---|
| 1116 | \item The \module{asyncore} module's \function{loop()} function now
|
|---|
| 1117 | has a \var{count} parameter that lets you perform a limited number
|
|---|
| 1118 | of passes through the polling loop. The default is still to loop
|
|---|
| 1119 | forever.
|
|---|
| 1120 |
|
|---|
| 1121 | \item The \module{base64} module now has more complete RFC 3548 support
|
|---|
| 1122 | for Base64, Base32, and Base16 encoding and decoding, including
|
|---|
| 1123 | optional case folding and optional alternative alphabets.
|
|---|
| 1124 | (Contributed by Barry Warsaw.)
|
|---|
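A short sketch of the encodings mentioned above, including an alternative Base64 alphabet and case folding when decoding Base16. It uses byte-string literals, so it assumes a Python version that accepts the \code{b'...'} prefix.

```python
import base64

data = b'\xfb\xffPython 2.4'

b32 = base64.b32encode(data)
b16 = base64.b16encode(data)
# Alternative alphabet: '-' and '_' replace '+' and '/'.
b64_alt = base64.b64encode(data, b'-_')

# Decoding reverses each step; casefold=True accepts lower-case hex digits.
from_b32 = base64.b32decode(b32)
from_b16 = base64.b16decode(b16.lower(), casefold=True)
from_b64 = base64.b64decode(b64_alt, b'-_')
```

All three decoded values are equal to the original \code{data}.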
| 1125 |
|
|---|
| 1126 | \item The \module{bisect} module now has an underlying C implementation
|
|---|
| 1127 | for improved performance.
|
|---|
| 1128 | (Contributed by Dmitry Vasiliev.)
|
|---|
| 1129 |
|
|---|
| 1130 | \item The CJKCodecs collection of East Asian codecs, maintained
|
|---|
| 1131 | by Hye-Shik Chang, was integrated into 2.4.
|
|---|
| 1132 | The new encodings are:
|
|---|
| 1133 |
|
|---|
| 1134 | \begin{itemize}
|
|---|
| 1135 | \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz
|
|---|
| 1136 | \item Chinese (ROC): big5, cp950
|
|---|
| 1137 | \item Japanese: cp932, euc-jis-2004, euc-jp,
|
|---|
| 1138 | euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2,
|
|---|
| 1139 | iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004,
|
|---|
| 1140 | shift-jis, shift-jisx0213, shift-jis-2004
|
|---|
| 1141 | \item Korean: cp949, euc-kr, johab, iso-2022-kr
|
|---|
| 1142 | \end{itemize}
|
|---|
| 1143 |
|
|---|
| 1144 | \item Some other new encodings were added: HP Roman8,
|
|---|
| 1145 | ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620.
|
|---|
| 1146 |
|
|---|
| 1147 | \item The UTF-8 and UTF-16 codecs now cope better with receiving partial input.
|
|---|
| 1148 | Previously the \class{StreamReader} class would try to read more data,
|
|---|
| 1149 | making it impossible to resume decoding from the stream. The
|
|---|
| 1150 | \method{read()} method will now return as much data as it can and future
|
|---|
| 1151 | calls will resume decoding where previous ones left off.
|
|---|
| 1152 | (Implemented by Walter D\"orwald.)
|
|---|
| 1153 |
|
|---|
| 1154 | \item There is a new \module{collections} module for
|
|---|
| 1155 | various specialized collection datatypes.
|
|---|
| 1156 | Currently it contains just one type, \class{deque},
|
|---|
| 1157 | a double-ended queue that supports efficiently adding and removing
|
|---|
| 1158 | elements from either end:
|
|---|
| 1159 |
|
|---|
| 1160 | \begin{verbatim}
|
|---|
| 1161 | >>> from collections import deque
|
|---|
| 1162 | >>> d = deque('ghi') # make a new deque with three items
|
|---|
| 1163 | >>> d.append('j') # add a new entry to the right side
|
|---|
| 1164 | >>> d.appendleft('f') # add a new entry to the left side
|
|---|
| 1165 | >>> d # show the representation of the deque
|
|---|
| 1166 | deque(['f', 'g', 'h', 'i', 'j'])
|
|---|
| 1167 | >>> d.pop() # return and remove the rightmost item
|
|---|
| 1168 | 'j'
|
|---|
| 1169 | >>> d.popleft() # return and remove the leftmost item
|
|---|
| 1170 | 'f'
|
|---|
| 1171 | >>> list(d) # list the contents of the deque
|
|---|
| 1172 | ['g', 'h', 'i']
|
|---|
| 1173 | >>> 'h' in d # search the deque
|
|---|
| 1174 | True
|
|---|
| 1175 | \end{verbatim}
|
|---|
| 1176 |
|
|---|
| 1177 | Several modules, such as the \module{Queue} and \module{threading}
|
|---|
| 1178 | modules, now take advantage of \class{collections.deque} for improved
|
|---|
| 1179 | performance. (Contributed by Raymond Hettinger.)
|
|---|
| 1180 |
|
|---|
| 1181 | \item The \module{ConfigParser} classes have been enhanced slightly.
|
|---|
| 1182 | The \method{read()} method now returns a list of the files that
|
|---|
| 1183 | were successfully parsed, and the \method{set()} method raises
|
|---|
| 1184 | \exception{TypeError} if passed a \var{value} argument that isn't a
|
|---|
| 1185 | string. (Contributed by John Belmonte and David Goodger.)
|
|---|
| 1186 |
|
|---|
| 1187 | \item The \module{curses} module now supports the ncurses extension
|
|---|
| 1188 | \function{use_default_colors()}. On platforms where the terminal
|
|---|
| 1189 | supports transparency, this makes it possible to use a transparent
|
|---|
| 1190 | background. (Contributed by J\"org Lehmann.)
|
|---|
| 1191 |
|
|---|
| 1192 | \item The \module{difflib} module now includes an \class{HtmlDiff} class
|
|---|
| 1193 | that creates an HTML table showing a side-by-side comparison
|
|---|
| 1194 | of two versions of a text. (Contributed by Dan Gass.)
|
|---|
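A minimal sketch of \class{HtmlDiff} in use; the two line lists and the column headings are invented for illustration.

```python
import difflib

before = ['spam', 'eggs', 'ham']
after = ['spam', 'bacon', 'ham']

# make_table() returns only the <table> markup as a string;
# make_file() would wrap the table in a complete HTML page.
table = difflib.HtmlDiff().make_table(before, after, 'Before', 'After')
```

The resulting string can be embedded in a larger HTML document or written to a file for viewing in a browser.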
| 1195 |
|
|---|
| 1196 | \item The \module{email} package was updated to version 3.0,
|
|---|
| 1197 | which dropped various deprecated APIs and removed support for Python
|
|---|
| 1198 | versions earlier than 2.3. The 3.0 version of the package uses a new
|
|---|
| 1199 | incremental parser for MIME messages, available in the
|
|---|
| 1200 | \module{email.FeedParser} module. The new parser doesn't require
|
|---|
| 1201 | reading the entire message into memory, and doesn't throw exceptions
|
|---|
| 1202 | if a message is malformed; instead it records any problems in the
|
|---|
| 1203 | \member{defects} attribute of the message. (Developed by Anthony
|
|---|
| 1204 | Baxter, Barry Warsaw, Thomas Wouters, and others.)
|
|---|
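The incremental parser might be used as sketched below, feeding the message in pieces as they arrive. The module is spelled \module{email.FeedParser} in Python 2.4 and \module{email.feedparser} in later releases, so the import tries both; the message text is invented.

```python
try:
    from email.feedparser import FeedParser   # spelling in later Python releases
except ImportError:
    from email.FeedParser import FeedParser   # spelling in Python 2.4

parser = FeedParser()
# Data can be supplied in arbitrary chunks.
parser.feed('Subject: status report\n')
parser.feed('\n')
parser.feed('All systems nominal.\n')

message = parser.close()      # returns the finished message object
problems = message.defects    # a list; empty for a well-formed message
```

For malformed input, \code{problems} would contain objects describing what went wrong, rather than an exception being raised during parsing.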
| 1205 |
|
|---|
| 1206 | \item The \module{heapq} module has been converted to C. The resulting
|
|---|
| 1207 | tenfold improvement in speed makes the module suitable for handling
|
|---|
| 1208 | high volumes of data. In addition, the module has two new functions
|
|---|
| 1209 | \function{nlargest()} and \function{nsmallest()} that use heaps to
|
|---|
| 1210 | find the N largest or smallest values in a dataset without the
|
|---|
| 1211 | expense of a full sort. (Contributed by Raymond Hettinger.)
|
|---|
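For example, the two new functions can pick out the extremes of a list without sorting it. The data is arbitrary.

```python
import heapq

scores = [72, 95, 61, 88, 79, 95, 54]

top_three = heapq.nlargest(3, scores)     # largest first
bottom_two = heapq.nsmallest(2, scores)   # smallest first
```

Here \code{top_three} is \code{[95, 95, 88]} and \code{bottom_two} is \code{[54, 61]}; the original list is left unchanged.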
| 1212 |
|
|---|
| 1213 | \item The \module{httplib} module now contains constants for HTTP
|
|---|
| 1214 | status codes defined in various HTTP-related RFC documents. Constants
|
|---|
| 1215 | have names such as \constant{OK}, \constant{CREATED},
|
|---|
| 1216 | \constant{CONTINUE}, and \constant{MOVED_PERMANENTLY}; use pydoc to
|
|---|
| 1217 | get a full list. (Contributed by Andrew Eland.)
|
|---|
| 1218 |
|
|---|
| 1219 | \item The \module{imaplib} module now supports IMAP's THREAD command
|
|---|
| 1220 | (contributed by Yves Dionne) and new \method{deleteacl()} and
|
|---|
| 1221 | \method{myrights()} methods (contributed by Arnaud Mazin).
|
|---|
| 1222 |
|
|---|
| 1223 | \item The \module{itertools} module gained a
|
|---|
| 1224 | \function{groupby(\var{iterable}\optional{, \var{func}})} function.
|
|---|
| 1225 | \var{iterable} is something that can be iterated over to return a
|
|---|
| 1226 | stream of elements, and the optional \var{func} parameter is a
|
|---|
| 1227 | function that takes an element and returns a key value; if omitted,
|
|---|
| 1228 | the key is simply the element itself. \function{groupby()} then
|
|---|
| 1229 | groups the elements into subsequences which have matching values of
|
|---|
| 1230 | the key, and returns a series of 2-tuples containing the key value
|
|---|
| 1231 | and an iterator over the subsequence.
|
|---|
| 1232 |
|
|---|
| 1233 | Here's an example to make this clearer. The \var{func} function simply
|
|---|
| 1234 | returns whether a number is even or odd, so the result of
|
|---|
| 1235 | \function{groupby()} is to return consecutive runs of odd or even
|
|---|
| 1236 | numbers.
|
|---|
| 1237 |
|
|---|
| 1238 | \begin{verbatim}
|
|---|
| 1239 | >>> import itertools
|
|---|
| 1240 | >>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
|
|---|
| 1241 | >>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
|
|---|
| 1242 | ...     print key_val, list(it)
|
|---|
| 1243 | ...
|
|---|
| 1244 | 0 [2, 4, 6]
|
|---|
| 1245 | 1 [7]
|
|---|
| 1246 | 0 [8]
|
|---|
| 1247 | 1 [9, 11]
|
|---|
| 1248 | 0 [12, 14]
|
|---|
| 1249 | >>>
|
|---|
| 1250 | \end{verbatim}
|
|---|
| 1251 |
|
|---|
| 1252 | \function{groupby()} is typically used with sorted input. The logic
|
|---|
| 1253 | for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter,
|
|---|
| 1254 | which makes it handy for eliminating, counting, or identifying
|
|---|
| 1255 | duplicate elements:
|
|---|
| 1256 |
|
|---|
| 1257 | \begin{verbatim}
|
|---|
| 1258 | >>> word = 'abracadabra'
|
|---|
| 1259 | >>> letters = sorted(word) # Turn string into a sorted list of letters
|
|---|
| 1260 | >>> letters
|
|---|
| 1261 | ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
|
|---|
| 1262 | >>> for k, g in itertools.groupby(letters):
|
|---|
| 1263 | ...     print k, list(g)
|
|---|
| 1264 | ...
|
|---|
| 1265 | a ['a', 'a', 'a', 'a', 'a']
|
|---|
| 1266 | b ['b', 'b']
|
|---|
| 1267 | c ['c']
|
|---|
| 1268 | d ['d']
|
|---|
| 1269 | r ['r', 'r']
|
|---|
| 1270 | >>> # List unique letters
|
|---|
| 1271 | >>> [k for k, g in itertools.groupby(letters)]
|
|---|
| 1272 | ['a', 'b', 'c', 'd', 'r']
|
|---|
| 1273 | >>> # Count letter occurrences
|
|---|
| 1274 | >>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
|
|---|
| 1275 | [('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
|
|---|
| 1276 | \end{verbatim}
|
|---|
| 1277 |
|
|---|
| 1278 | (Contributed by Hye-Shik Chang.)
|
|---|
| 1279 |
|
|---|
| 1280 | \item \module{itertools} also gained a function named
|
|---|
| 1281 | \function{tee(\var{iterator}, \var{N})} that returns \var{N} independent
|
|---|
| 1282 | iterators that replicate \var{iterator}. If \var{N} is omitted, the
|
|---|
| 1283 | default is 2.
|
|---|
| 1284 |
|
|---|
| 1285 | \begin{verbatim}
|
|---|
| 1286 | >>> L = [1,2,3]
|
|---|
| 1287 | >>> i1, i2 = itertools.tee(L)
|
|---|
| 1288 | >>> i1,i2
|
|---|
| 1289 | (<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
|
|---|
| 1290 | >>> list(i1) # Run the first iterator to exhaustion
|
|---|
| 1291 | [1, 2, 3]
|
|---|
| 1292 | >>> list(i2) # Run the second iterator to exhaustion
|
|---|
| 1293 | [1, 2, 3]
|
|---|
| 1294 | \end{verbatim}
|
|---|
| 1295 |
|
|---|
| 1296 | Note that \function{tee()} has to keep copies of the values returned
|
|---|
| 1297 | by the iterator; in the worst case, it may need to keep all of them.
|
|---|
| 1298 | This should therefore be used carefully if the leading iterator
|
|---|
| 1299 | can run far ahead of the trailing iterator in a long stream of inputs.
|
|---|
| 1300 | If the separation is large, then you might as well use
|
|---|
| 1301 | \function{list()} instead. When the iterators track closely with one
|
|---|
| 1302 | another, \function{tee()} is ideal. Possible applications include
|
|---|
| 1303 | bookmarking, windowing, or lookahead iterators.
|
|---|
| 1304 | (Contributed by Raymond Hettinger.)
|
|---|
| 1305 |
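A windowing use of \function{tee()} could be sketched as follows. The \function{pairwise()} helper is a hypothetical name chosen for this example; because the two iterators never drift more than one element apart, \function{tee()} only has to buffer a single value.

```python
import itertools

def pairwise(iterable):
    """Return overlapping pairs (s0, s1), (s1, s2), ... as a list."""
    trail, lead = itertools.tee(iterable)
    for _ in lead:          # advance the leading iterator by one element
        break
    return list(zip(trail, lead))

pairs = pairwise([1, 2, 3, 4])
```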
|
|---|
| 1306 | \item A number of functions were added to the \module{locale}
|
|---|
| 1307 | module, such as \function{bind_textdomain_codeset()} to specify a
|
|---|
| 1308 | particular encoding and a family of \function{l*gettext()} functions
|
|---|
| 1309 | that return messages in the chosen encoding.
|
|---|
| 1310 | (Contributed by Gustavo Niemeyer.)
|
|---|
| 1311 |
|
|---|
| 1312 | \item Some keyword arguments were added to the \module{logging}
|
|---|
| 1313 | package's \function{basicConfig()} function to simplify log
|
|---|
| 1314 | configuration. The default behavior is to log messages to standard
|
|---|
| 1315 | error, but various keyword arguments can be specified to log to a
|
|---|
| 1316 | particular file, change the logging format, or set the logging level.
|
|---|
| 1317 | For example:
|
|---|
| 1318 |
|
|---|
| 1319 | \begin{verbatim}
|
|---|
| 1320 | import logging
|
|---|
| 1321 | logging.basicConfig(filename='/var/log/application.log',
|
|---|
| 1322 |     level=0,  # Log all messages
|
|---|
| 1323 |     format='%(levelname)s:%(process)d:%(thread)d:%(message)s')
|
|---|
| 1324 | \end{verbatim}
|
|---|
| 1325 |
|
|---|
| 1326 | Other additions to the \module{logging} package include a
|
|---|
| 1327 | \method{log(\var{level}, \var{msg})} convenience method, as well as a
|
|---|
| 1328 | \class{TimedRotatingFileHandler} class that rotates its log files at a
|
|---|
| 1329 | timed interval. The module already had \class{RotatingFileHandler},
|
|---|
| 1330 | which rotated logs once the file exceeded a certain size. Both
|
|---|
| 1331 | classes derive from a new \class{BaseRotatingHandler} class that can
|
|---|
| 1332 | be used to implement other rotating handlers.
|
|---|
| 1333 |
|
|---|
| 1334 | (Changes implemented by Vinay Sajip.)
|
|---|
| 1335 |
|
|---|
| 1336 | \item The \module{marshal} module now shares interned strings on unpacking a
|
|---|
| 1337 | data structure. This may shrink the size of certain marshal strings,
|
|---|
| 1338 | but the primary effect is to make \file{.pyc} files significantly smaller.
|
|---|
| 1339 | (Contributed by Martin von~L\"owis.)
|
|---|
| 1340 |
|
|---|
| 1341 | \item The \module{nntplib} module's \class{NNTP} class gained
|
|---|
| 1342 | \method{description()} and \method{descriptions()} methods to retrieve
|
|---|
| 1343 | newsgroup descriptions for a single group or for a range of groups.
|
|---|
| 1344 | (Contributed by J\"urgen A. Erhard.)
|
|---|
| 1345 |
|
|---|
| 1346 | \item Two new functions were added to the \module{operator} module,
|
|---|
| 1347 | \function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}.
|
|---|
| 1348 | Both functions return callables that take a single argument and return
|
|---|
| 1349 | the corresponding attribute or item; these callables make excellent
|
|---|
| 1350 | data extractors when used with \function{map()} or
|
|---|
| 1351 | \function{sorted()}. For example:
|
|---|
| 1352 |
|
|---|
| 1353 | \begin{verbatim}
|
|---|
| 1354 | >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
|
|---|
| 1355 | >>> map(operator.itemgetter(0), L)
|
|---|
| 1356 | ['c', 'd', 'a', 'b']
|
|---|
| 1357 | >>> map(operator.itemgetter(1), L)
|
|---|
| 1358 | [2, 1, 4, 3]
|
|---|
| 1359 | >>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item
|
|---|
| 1360 | [('d', 1), ('c', 2), ('b', 3), ('a', 4)]
|
|---|
| 1361 | \end{verbatim}
|
|---|
| 1362 |
|
|---|
| 1363 | (Contributed by Raymond Hettinger.)
|
|---|
| 1364 |
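\function{attrgetter()} works the same way for attributes. In the sketch below, the \class{Employee} class and its data are hypothetical and exist only to give the extractor something to read.

```python
import operator

class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

staff = [Employee('ann', 300), Employee('bob', 100), Employee('cy', 200)]

# attrgetter('salary') returns a callable equivalent to: lambda e: e.salary
by_salary = sorted(staff, key=operator.attrgetter('salary'))
names = [e.name for e in by_salary]
```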
|
|---|
| 1365 | \item The \module{optparse} module was updated in various ways. The
|
|---|
| 1366 | module now passes its messages through \function{gettext.gettext()},
|
|---|
| 1367 | making it possible to internationalize Optik's help and error
|
|---|
| 1368 | messages. Help messages for options can now include the string
|
|---|
| 1369 | \code{'\%default'}, which will be replaced by the option's default
|
|---|
| 1370 | value. (Contributed by Greg Ward.)
|
|---|
| 1371 |
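The \code{'\%default'} substitution might be used as in this sketch; the option name and its help text are invented for the example.

```python
import optparse

parser = optparse.OptionParser()
parser.add_option('-n', '--count', type='int', default=3,
                  help='number of retries [default: %default]')

# format_help() returns the text that --help would print;
# '%default' has been replaced by the option's default value.
help_text = parser.format_help()
```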
|
|---|
| 1372 | \item The long-term plan is to deprecate the \module{rfc822} module
|
|---|
| 1373 | in some future Python release in favor of the \module{email} package.
|
|---|
| 1374 | To this end, the \function{email.Utils.formatdate()} function has been
|
|---|
| 1375 | changed to make it usable as a replacement for
|
|---|
| 1376 | \function{rfc822.formatdate()}. You may want to write new e-mail
|
|---|
| 1377 | processing code with this in mind. (Change implemented by Anthony
|
|---|
| 1378 | Baxter.)
|
|---|
| 1379 |
|
|---|
| 1380 | \item A new \function{urandom(\var{n})} function was added to the
|
|---|
| 1381 | \module{os} module, returning a string containing \var{n} bytes of
|
|---|
| 1382 | random data. This function provides access to platform-specific
|
|---|
| 1383 | sources of randomness such as \file{/dev/urandom} on Linux or the
|
|---|
| 1384 | Windows CryptoAPI. (Contributed by Trevor Perrin.)
|
|---|
| 1385 |
|
|---|
| 1386 | \item Another new function: \function{os.path.lexists(\var{path})}
|
|---|
| 1387 | returns true if the file specified by \var{path} exists, whether or
|
|---|
| 1388 | not it's a symbolic link. This differs from the existing
|
|---|
| 1389 | \function{os.path.exists(\var{path})} function, which returns false if
|
|---|
| 1390 | \var{path} is a symlink that points to a destination that doesn't exist.
|
|---|
| 1391 | (Contributed by Beni Cherniavsky.)
|
|---|
| 1392 |
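The difference between the two functions shows up with a dangling symbolic link. This sketch assumes a platform where \function{os.symlink()} is available; the file names are placeholders.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
link = os.path.join(workdir, 'dangling')

# Create a symlink whose target does not exist.
os.symlink(os.path.join(workdir, 'missing-target'), link)

follows_link = os.path.exists(link)    # looks at the (missing) target
sees_link = os.path.lexists(link)      # looks at the link itself

# Clean up the temporary directory.
os.remove(link)
os.rmdir(workdir)
```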
|
|---|
| 1393 | \item A new \function{getsid()} function was added to the
|
|---|
| 1394 | \module{posix} module that underlies the \module{os} module.
|
|---|
| 1395 | (Contributed by J. Raynor.)
|
|---|
| 1396 |
|
|---|
| 1397 | \item The \module{poplib} module now supports POP over SSL. (Contributed by
|
|---|
| 1398 | Hector Urtubia.)
|
|---|
| 1399 |
|
|---|
| 1400 | \item The \module{profile} module can now profile C extension functions.
|
|---|
| 1401 | (Contributed by Nick Bastin.)
|
|---|
| 1402 |
|
|---|
| 1403 | \item The \module{random} module has a new method called
|
|---|
| 1404 | \method{getrandbits(\var{N})} that returns a long integer \var{N}
|
|---|
| 1405 | bits in length. The existing \method{randrange()} method now uses
|
|---|
| 1406 | \method{getrandbits()} where appropriate, making generation of
|
|---|
| 1407 | arbitrarily large random numbers more efficient. (Contributed by
|
|---|
| 1408 | Raymond Hettinger.)
|
|---|
| 1409 |
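As a small sketch, the following draws a 128-bit random integer; the bit count is an arbitrary choice for the example.

```python
import random

# The result is a non-negative integer strictly below 2**128.
value = random.getrandbits(128)
in_range = 0 <= value < 2 ** 128
```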
|
|---|
| 1410 | \item The regular expression language accepted by the \module{re} module
|
|---|
| 1411 | was extended with simple conditional expressions, written as
|
|---|
| 1412 | \regexp{(?(\var{group})\var{A}|\var{B})}. \var{group} is either a
|
|---|
| 1413 | numeric group ID or a group name defined with \regexp{(?P<group>...)}
|
|---|
| 1414 | earlier in the expression. If the specified group matched, the
|
|---|
| 1415 | regular expression pattern \var{A} will be tested against the string; if
|
|---|
| 1416 | the group didn't match, the pattern \var{B} will be used instead.
|
|---|
| 1417 | (Contributed by Gustavo Niemeyer.)
|
|---|
| 1418 |
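One way to see the conditional at work is a pattern for a simplified e-mail address that may be wrapped in angle brackets. The pattern and the sample strings are illustrative assumptions, not a complete address grammar: if group 1 matched the opening \samp{<}, a closing \samp{>} is required; otherwise the string must end.

```python
import re

pattern = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

samples = ['<user@host.com>',   # both brackets: accepted
           'user@host.com',     # no brackets: accepted
           '<user@host.com',    # missing closing bracket: rejected
           'user@host.com>']    # stray closing bracket: rejected

results = [bool(pattern.match(s)) for s in samples]
```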
|
|---|
| 1419 | \item The \module{re} module is also no longer recursive, thanks to a
|
|---|
| 1420 | massive amount of work by Gustavo Niemeyer. In a recursive regular
|
|---|
| 1421 | expression engine, certain patterns result in a large amount of C
|
|---|
| 1422 | stack space being consumed, and it was possible to overflow the stack.
|
|---|
| 1423 | For example, if you matched a 30000-byte string of \samp{a} characters
|
|---|
| 1424 | against the expression \regexp{(a|b)+}, one stack frame was consumed
|
|---|
| 1425 | per character. Python 2.3 tried to check for stack overflow and raise
|
|---|
| 1426 | a \exception{RuntimeError} exception, but certain patterns could
|
|---|
| 1427 | sidestep the check and, if you were unlucky, Python could segfault.
|
|---|
| 1428 | Python 2.4's regular expression engine can match this pattern without
|
|---|
| 1429 | problems.
|
|---|
| 1430 |
|
|---|
| 1431 | \item The \module{signal} module now performs tighter error-checking
|
|---|
| 1432 | on the parameters to the \function{signal.signal()} function. For
|
|---|
| 1433 | example, you can't set a handler on the \constant{SIGKILL} signal;
|
|---|
| 1434 | previous versions of Python would quietly accept this, but 2.4 will
|
|---|
| 1435 | raise a \exception{RuntimeError} exception.
|
|---|
| 1436 |
|
|---|
| 1437 | \item Two new functions were added to the \module{socket} module.
|
|---|
| 1438 | \function{socketpair()} returns a pair of connected sockets and
|
|---|
| 1439 | \function{getservbyport(\var{port})} looks up the service name for a
|
|---|
| 1440 | given port number. (Contributed by Dave Cole and Barry Warsaw.)
|
|---|
| 1441 |
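A minimal sketch of \function{socketpair()}, assuming a platform that provides it: the two sockets are already connected to each other, so data sent on one end can be read from the other. The message content is arbitrary.

```python
import socket

left, right = socket.socketpair()

message = 'ping'.encode('ascii')
left.sendall(message)
received = right.recv(len(message))

left.close()
right.close()
```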
|
|---|
| 1442 | \item The \function{sys.exitfunc()} function has been deprecated. Code
|
|---|
| 1443 | should be using the existing \module{atexit} module, which correctly
|
|---|
| 1444 | handles calling multiple exit functions. Eventually
|
|---|
| 1445 | \function{sys.exitfunc()} will become a purely internal interface,
|
|---|
| 1446 | accessed only by \module{atexit}.
|
|---|
| 1447 |
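Registering several exit functions with \module{atexit} might look like this sketch; the two handler functions and the \code{events} list are hypothetical. Handlers run in the reverse of the order in which they were registered, so \function{close_log()} would run before \function{flush_cache()} at interpreter exit.

```python
import atexit

events = []

def flush_cache():
    events.append('cache flushed')

def close_log():
    events.append('log closed')

# Unlike assigning to sys.exitfunc, registering a second handler
# does not replace the first one.
atexit.register(flush_cache)
atexit.register(close_log)
```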
|
|---|
| 1448 | \item The \module{tarfile} module now generates GNU-format tar files
|
|---|
| 1449 | by default. (Contributed by Lars Gustaebel.)
|
|---|
| 1450 |
|
|---|
| 1451 | \item The \module{threading} module now has an elegantly simple way to support
|
|---|
| 1452 | thread-local data. The module contains a \class{local} class whose
|
|---|
| 1453 | attribute values are local to different threads.
|
|---|
| 1454 |
|
|---|
| 1455 | \begin{verbatim}
|
|---|
| 1456 | import threading
|
|---|
| 1457 |
|
|---|
| 1458 | data = threading.local()
|
|---|
| 1459 | data.number = 42
|
|---|
| 1460 | data.url = ('www.python.org', 80)
|
|---|
| 1461 | \end{verbatim}
|
|---|
| 1462 |
|
|---|
| 1463 | Other threads can assign and retrieve their own values for the
|
|---|
| 1464 | \member{number} and \member{url} attributes. You can subclass
|
|---|
| 1465 | \class{local} to initialize attributes or to add methods.
|
|---|
| 1466 | (Contributed by Jim Fulton.)
|
|---|
| 1467 |
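The isolation between threads can be sketched as follows. The \function{worker()} function and the \code{seen} list are invented for the example; the worker thread starts without the attribute set by the main thread, and its own assignment leaves the main thread's value alone.

```python
import threading

data = threading.local()
data.number = 42              # set in the main thread

seen = []

def worker():
    # The main thread's attribute is not visible in this thread.
    seen.append(hasattr(data, 'number'))
    data.number = 7           # affects only this thread's copy

thread = threading.Thread(target=worker)
thread.start()
thread.join()

main_value = data.number      # unchanged by the worker
```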
|
|---|
| 1468 | \item The \module{timeit} module now automatically disables periodic
|
|---|
| 1469 | garbage collection during the timing loop. This change makes
|
|---|
| 1470 | consecutive timings more comparable. (Contributed by Raymond Hettinger.)
|
|---|
| 1471 |
|
|---|
| 1472 | \item The \module{weakref} module now supports a wider variety of objects
|
|---|
| 1473 | including Python functions, class instances, sets, frozensets, deques,
|
|---|
| 1474 | arrays, files, sockets, and regular expression pattern objects.
|
|---|
| 1475 | (Contributed by Raymond Hettinger.)
|
|---|
| 1476 |
|
|---|
| 1477 | \item The \module{xmlrpclib} module now supports a multi-call extension for
|
|---|
| 1478 | transmitting multiple XML-RPC calls in a single HTTP operation.
|
|---|
| 1479 | (Contributed by Brian Quinlan.)
|
|---|
| 1480 |
|
|---|
| 1481 | \item The \module{mpz}, \module{rotor}, and \module{xreadlines} modules have
|
|---|
| 1482 | been removed.
|
|---|
| 1483 |
|
|---|
| 1484 | \end{itemize}
|
|---|
| 1485 |
|
|---|
| 1486 |
|
|---|
| 1487 | %======================================================================
|
|---|
| 1488 | % whole new modules get described in subsections here
|
|---|
| 1489 |
|
|---|
| 1490 | %=====================
|
|---|
| 1491 | \subsection{cookielib}
|
|---|
| 1492 |
|
|---|
| 1493 | The \module{cookielib} library supports client-side handling for HTTP
|
|---|
| 1494 | cookies, mirroring the \module{Cookie} module's server-side cookie
|
|---|
| 1495 | support. Cookies are stored in cookie jars; the library transparently
|
|---|
| 1496 | stores cookies offered by the web server in the cookie jar, and
|
|---|
| 1497 | fetches the cookie from the jar when connecting to the server. As in
|
|---|
| 1498 | web browsers, policy objects control whether cookies are accepted or
|
|---|
| 1499 | not.
|
|---|
| 1500 |
|
|---|
| 1501 | In order to store cookies across sessions, two implementations of
|
|---|
| 1502 | cookie jars are provided: one that stores cookies in the Netscape
|
|---|
| 1503 | format so applications can use the Mozilla or Lynx cookie files, and
|
|---|
| 1504 | one that stores cookies in the same format as the Perl libwww library.
|
|---|
| 1505 |
|
|---|
| 1506 | \module{urllib2} has been changed to interact with \module{cookielib}:
|
|---|
| 1507 | \class{HTTPCookieProcessor} manages a cookie jar that is used when
|
|---|
| 1508 | accessing URLs.
|
|---|
| 1509 |
|
|---|
| 1510 | This module was contributed by John J. Lee.
|
|---|
| 1511 |
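Wiring a cookie jar into \module{urllib2} could be sketched like this. No network request is made, so the jar stays empty; the import falls back to other module names because later Python releases renamed both modules.

```python
try:
    import cookielib                      # module names in Python 2.4
    import urllib2
except ImportError:
    # Later Python versions renamed these modules.
    import http.cookiejar as cookielib
    import urllib.request as urllib2

jar = cookielib.CookieJar()               # in-memory jar, nothing saved to disk
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# opener.open(url) would now store and send cookies through `jar`.
cookie_count = len(jar)
```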
|
|---|
| 1512 |
|
|---|
| 1513 | % ==================
|
|---|
| 1514 | \subsection{doctest}
|
|---|
| 1515 |
|
|---|
| 1516 | The \module{doctest} module underwent considerable refactoring thanks
|
|---|
| 1517 | to Edward Loper and Tim Peters. Testing can still be as simple as
|
|---|
| 1518 | running \function{doctest.testmod()}, but the refactorings allow
|
|---|
| 1519 | customizing the module's operation in various ways.
|
|---|
| 1520 |
|
|---|
| 1521 | The new \class{DocTestFinder} class extracts the tests from a given
|
|---|
| 1522 | object's docstrings:
|
|---|
| 1523 |
|
|---|
| 1524 | \begin{verbatim}
|
|---|
| 1525 | def f (x, y):
|
|---|
| 1526 |     """>>> f(2,2)
|
|---|
| 1527 |     4
|
|---|
| 1528 |     >>> f(3,2)
|
|---|
| 1529 |     6
|
|---|
| 1530 |     """
|
|---|
| 1531 |     return x*y
|
|---|
| 1532 |
|
|---|
| 1533 | finder = doctest.DocTestFinder()
|
|---|
| 1534 |
|
|---|
| 1535 | # Get list of DocTest instances
|
|---|
| 1536 | tests = finder.find(f)
|
|---|
| 1537 | \end{verbatim}
|
|---|
| 1538 |
|
|---|
| 1539 | The new \class{DocTestRunner} class then runs individual tests and can
|
|---|
| 1540 | produce a summary of the results:
|
|---|
| 1541 |
|
|---|
| 1542 | \begin{verbatim}
|
|---|
| 1543 | runner = doctest.DocTestRunner()
|
|---|
| 1544 | for t in tests:
|
|---|
| 1545 |     failed, tried = runner.run(t)
|
|---|
| 1546 |
|
|---|
| 1547 | runner.summarize(verbose=1)
|
|---|
| 1548 | \end{verbatim}
|
|---|
| 1549 |
|
|---|
| 1550 | The above example produces the following output:
|
|---|
| 1551 |
|
|---|
| 1552 | \begin{verbatim}
|
|---|
| 1553 | 1 items passed all tests:
|
|---|
| 1554 | 2 tests in f
|
|---|
| 1555 | 2 tests in 1 items.
|
|---|
| 1556 | 2 passed and 0 failed.
|
|---|
| 1557 | Test passed.
|
|---|
| 1558 | \end{verbatim}
|
|---|
| 1559 |
|
|---|
| 1560 | \class{DocTestRunner} uses an instance of the \class{OutputChecker}
|
|---|
| 1561 | class to compare the expected output with the actual output. This
|
|---|
| 1562 | class takes a number of different flags that customize its behavior;
|
|---|
| 1563 | ambitious users can also write a completely new subclass of
|
|---|
| 1564 | \class{OutputChecker}.
|
|---|
| 1565 |
|
|---|
| 1566 | The default output checker provides a number of handy features.
|
|---|
| 1567 | For example, with the \constant{doctest.ELLIPSIS} option flag,
|
|---|
| 1568 | an ellipsis (\samp{...}) in the expected output matches any substring,
|
|---|
| 1569 | making it easier to accommodate outputs that vary in minor ways:
|
|---|
| 1570 |
|
|---|
| 1571 | \begin{verbatim}
|
|---|
| 1572 | def o (n):
|
|---|
| 1573 |     """>>> o(1)
|
|---|
| 1574 |     <__main__.C instance at 0x...>
|
|---|
| 1575 |     >>>
|
|---|
| 1576 |     """
|
|---|
| 1577 | \end{verbatim}
|
|---|
| 1578 |
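Passing the flag to a runner might look like the following sketch. The \function{make_object()} function is a hypothetical example, and the \code{globs} dictionary is supplied explicitly so that the test does not depend on which module the function was defined in.

```python
import doctest

def make_object():
    """
    >>> make_object()
    <object object at 0x...>
    """
    return object()

finder = doctest.DocTestFinder()
# ELLIPSIS lets '...' in the expected output stand for the varying address.
runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.ELLIPSIS)

for test in finder.find(make_object, 'make_object',
                        globs={'make_object': make_object}):
    runner.run(test)
```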
|
|---|
| 1579 | Another special string, \samp{<BLANKLINE>}, matches a blank line:
|
|---|
| 1580 |
|
|---|
| 1581 | \begin{verbatim}
|
|---|
| 1582 | def p (n):
|
|---|
| 1583 |     """>>> p(1)
|
|---|
| 1584 |     <BLANKLINE>
|
|---|
| 1585 |     >>>
|
|---|
| 1586 |     """
|
|---|
| 1587 | \end{verbatim}
|
|---|
| 1588 |
|
|---|
| 1589 | Another new capability is producing a diff-style display of the output
|
|---|
| 1590 | by specifying the \constant{doctest.REPORT_UDIFF} (unified diffs),
|
|---|
| 1591 | \constant{doctest.REPORT_CDIFF} (context diffs), or
|
|---|
| 1592 | \constant{doctest.REPORT_NDIFF} (delta-style) option flags. For example:
|
|---|
| 1593 |
|
|---|
| 1594 | \begin{verbatim}
|
|---|
| 1595 | def g (n):
|
|---|
| 1596 |     """>>> g(4)
|
|---|
| 1597 |     here
|
|---|
| 1598 |     is
|
|---|
| 1599 |     a
|
|---|
| 1600 |     lengthy
|
|---|
| 1601 |     >>>"""
|
|---|
| 1602 |     L = 'here is a rather lengthy list of words'.split()
|
|---|
| 1603 |     for word in L[:n]:
|
|---|
| 1604 |         print word
|
|---|
| 1605 | \end{verbatim}
|
|---|
| 1606 |
|
|---|
| 1607 | Running the above function's tests with
|
|---|
| 1608 | \constant{doctest.REPORT_UDIFF} specified, you get the following output:
|
|---|
| 1609 |
|
|---|
| 1610 | \begin{verbatim}
|
|---|
| 1611 | **********************************************************************
|
|---|
| 1612 | File "t.py", line 15, in g
|
|---|
| 1613 | Failed example:
|
|---|
| 1614 | g(4)
|
|---|
| 1615 | Differences (unified diff with -expected +actual):
|
|---|
| 1616 | @@ -2,3 +2,3 @@
|
|---|
| 1617 |  is
|
|---|
| 1618 |  a
|
|---|
| 1619 | -lengthy
|
|---|
| 1620 | +rather
|
|---|
| 1621 | **********************************************************************
|
|---|
| 1622 | \end{verbatim}
|
|---|
| 1623 |
|
|---|
| 1624 |
|
|---|
| 1625 | % ======================================================================
|
|---|
| 1626 | \section{Build and C API Changes}
|
|---|
| 1627 |
|
|---|
| 1628 | Some of the changes to Python's build process and to the C API are:
|
|---|
| 1629 |
|
|---|
| 1630 | \begin{itemize}
|
|---|
| 1631 |
|
|---|
| 1632 | \item Three new convenience macros were added for common return
|
|---|
| 1633 | values from extension functions: \csimplemacro{Py_RETURN_NONE},
|
|---|
| 1634 | \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.
|
|---|
| 1635 | (Contributed by Brett Cannon.)
|
|---|
| 1636 |
|
|---|
| 1637 | \item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})},
|
|---|
| 1638 | decreases the reference count of \var{obj} and sets \var{obj} to the
|
|---|
| 1639 | null pointer. (Contributed by Jim Fulton.)
|
|---|
| 1640 |
|
|---|
| 1641 | \item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1},
|
|---|
| 1642 | \var{obj2}, ..., \var{objN})}, constructs tuples from a variable
|
|---|
| 1643 | length argument list of Python objects. (Contributed by Raymond Hettinger.)
|
|---|
| 1644 |
|
|---|
| 1645 | \item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})},
|
|---|
| 1646 | implements fast dictionary lookups without masking exceptions raised
|
|---|
| 1647 | during the look-up process. (Contributed by Raymond Hettinger.)
|
|---|
| 1648 |
|
|---|
| 1649 | \item The \csimplemacro{Py_IS_NAN(\var{X})} macro returns 1 if
|
|---|
| 1650 | its float or double argument \var{X} is a NaN.
|
|---|
| 1651 | (Contributed by Tim Peters.)
|
|---|
| 1652 |
|
|---|
| 1653 | \item C code can avoid unnecessary locking by using the new
|
|---|
| 1654 | \cfunction{PyEval_ThreadsInitialized()} function to tell
|
|---|
| 1655 | if any thread operations have been performed. If this function
|
|---|
| 1656 | returns false, no lock operations are needed.
|
|---|
| 1657 | (Contributed by Nick Coghlan.)
|
|---|
| 1658 |
|
|---|
| 1659 | \item A new function, \cfunction{PyArg_VaParseTupleAndKeywords()},
|
|---|
| 1660 | is the same as \cfunction{PyArg_ParseTupleAndKeywords()} but takes a
|
|---|
| 1661 | \ctype{va_list} instead of a number of arguments.
|
|---|
| 1662 | (Contributed by Greg Chapman.)
|
|---|
| 1663 |
|
|---|
| 1664 | \item A new method flag, \constant{METH_COEXISTS}, allows a function
|
|---|
| 1665 | defined in slots to co-exist with a \ctype{PyCFunction} having the
|
|---|
| 1666 | same name. This can halve the access time for a method such as
|
|---|
| 1667 | \method{set.__contains__()}. (Contributed by Raymond Hettinger.)
|
|---|
| 1668 |
|
|---|
| 1669 | \item Python can now be built with additional profiling for the
|
|---|
| 1670 | interpreter itself, intended as an aid to people developing the
|
|---|
| 1671 | Python core. Providing \longprogramopt{--enable-profiling} to the
|
|---|
| 1672 | \program{configure} script will let you profile the interpreter with
|
|---|
| 1673 | \program{gprof}, and providing the \longprogramopt{--with-tsc}
|
|---|
| 1674 | switch enables profiling using the Pentium's Time-Stamp-Counter
|
|---|
| 1675 | register. Note that the \longprogramopt{--with-tsc} switch is slightly
|
|---|
| 1676 | misnamed, because the profiling feature also works on the PowerPC
|
|---|
| 1677 | platform, though that processor architecture doesn't call that
|
|---|
| 1678 | register ``the TSC register''. (Contributed by Jeremy Hylton.)
|
|---|
| 1679 |
|
|---|
| 1680 | \item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}.
|
|---|
| 1681 |
|
|---|
| 1682 | \end{itemize}
|
|---|
| 1683 |
|
|---|
| 1684 |
|
|---|
| 1685 | %======================================================================
|
|---|
| 1686 | \subsection{Port-Specific Changes}
|
|---|
| 1687 |
|
|---|
| 1688 | \begin{itemize}
|
|---|
| 1689 |
|
|---|
| 1690 | \item The Windows port now builds under MSVC++ 7.1 as well as version 6.
|
|---|
| 1691 | (Contributed by Martin von~L\"owis.)
|
|---|
| 1692 |
|
|---|
| 1693 | \end{itemize}
|
|---|
| 1694 |
|
|---|
| 1695 |
|
|---|
| 1696 |
|
|---|
| 1697 | %======================================================================
|
|---|
| 1698 | \section{Porting to Python 2.4}
|
|---|
| 1699 |
|
|---|
| 1700 | This section lists previously described changes that may require
|
|---|
| 1701 | changes to your code:
|
|---|
| 1702 |
|
|---|
| 1703 | \begin{itemize}
|
|---|
| 1704 |
|
|---|
| 1705 | \item Left shifts and hexadecimal/octal constants that are too
|
|---|
| 1706 | large no longer trigger a \exception{FutureWarning} and return
|
|---|
| 1707 | a value limited to 32 or 64 bits; instead they return a long integer.
|
|---|
| 1708 |
|
|---|
| 1709 | \item Integer operations will no longer trigger an \exception{OverflowWarning}.
|
|---|
| 1710 | The \exception{OverflowWarning} warning will disappear in Python 2.5.
|
|---|
| 1711 |
|
|---|
| 1712 | \item The \function{zip()} built-in function and \function{itertools.izip()}
|
|---|
| 1713 | now return an empty list (an empty iterator for \function{izip()}) instead of raising a \exception{TypeError}
|
|---|
| 1714 | exception if called with no arguments.
|
|---|
| 1715 |
|
|---|
| 1716 | \item You can no longer compare the \class{date} and \class{datetime}
|
|---|
| 1717 | instances provided by the \module{datetime} module. Two
|
|---|
| 1718 | instances of different classes will now always be unequal, and
|
|---|
| 1719 | relative comparisons (\code{<}, \code{>}) will raise a \exception{TypeError}.
|
|---|
| 1720 |
|
|---|
| 1721 | \item \function{dircache.listdir()} now passes exceptions to the caller
|
|---|
| 1722 | instead of returning empty lists.
|
|---|
| 1723 |
|
|---|
| 1724 | \item \function{LexicalHandler.startDTD()} used to receive the public and
|
|---|
| 1725 | system IDs in the wrong order. This has been corrected; applications
|
|---|
| 1726 | relying on the wrong order need to be fixed.
|
|---|
| 1727 |
|
|---|
| 1728 | \item \function{fcntl.ioctl()} now warns if the \var{mutate}
|
|---|
| 1729 | argument is omitted and relevant.
|
|---|
| 1730 |
|
|---|
| 1731 | \item The \module{tarfile} module now generates GNU-format tar files
|
|---|
| 1732 | by default.
|
|---|
| 1733 |
|
|---|
| 1734 | \item Encountering a failure while importing a module no longer leaves
|
|---|
| 1735 | a partially-initialized module object in \code{sys.modules}.
|
|---|
| 1736 |
|
|---|
| 1737 | \item \constant{None} is now a constant; code that binds a new value to
|
|---|
| 1738 | the name \samp{None} is now a syntax error.
|
|---|
| 1739 |
|
|---|
| 1740 | \item The \function{signal.signal()} function now raises a
|
|---|
| 1741 | \exception{RuntimeError} exception for certain illegal values;
|
|---|
| 1742 | previously these errors would pass silently. For example, you can no
|
|---|
| 1743 | longer set a handler on the \constant{SIGKILL} signal.
|
|---|
| 1744 |
|
|---|
| 1745 | \end{itemize}
|
|---|
| 1746 |
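Two of the changes listed above can be checked with a short sketch like the following. The dates are arbitrary sample values; the code records the outcome of each comparison rather than assuming it.

```python
import datetime

day = datetime.date(2005, 3, 30)
moment = datetime.datetime(2005, 3, 30, 12, 0)

# Instances of the two classes never compare equal...
equal = (day == moment)

# ...and relative comparisons between them raise TypeError.
try:
    day < moment
    ordering_allowed = True
except TypeError:
    ordering_allowed = False

# zip() with no arguments produces an empty result instead of an error.
empty = list(zip())
```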
|
|---|
| 1747 |
|
|---|
| 1748 | %======================================================================
|
|---|
| 1749 | \section{Acknowledgements \label{acks}}
|
|---|
| 1750 |
|
|---|
| 1751 | The author would like to thank the following people for offering
|
|---|
| 1752 | suggestions, corrections and assistance with various drafts of this
|
|---|
| 1753 | article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger,
|
|---|
| 1754 | Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider,
|
|---|
| 1755 | and Sadruddin Rejeb.
|
|---|
| 1756 |
|
|---|
| 1757 | \end{document}
|
|---|