\section{\module{collections} ---
         High-performance container datatypes}

\declaremodule{standard}{collections}
\modulesynopsis{High-performance datatypes}
\moduleauthor{Raymond Hettinger}{[email protected]}
\sectionauthor{Raymond Hettinger}{[email protected]}
\versionadded{2.4}


This module implements high-performance container datatypes.  Currently,
there are two datatypes, \class{deque} and \class{defaultdict}.
Future additions may include balanced trees and ordered dictionaries.
\versionchanged[Added defaultdict]{2.5}

\subsection{\class{deque} objects \label{deque-objects}}

\begin{funcdesc}{deque}{\optional{iterable}}
Returns a new deque object initialized left-to-right (using
\method{append()}) with data from \var{iterable}.  If \var{iterable}
is not specified, the new deque is empty.

Deques are a generalization of stacks and queues (the name is pronounced
``deck'' and is short for ``double-ended queue'').  Deques support
thread-safe, memory-efficient appends and pops from either side of the deque
with approximately the same \code{O(1)} performance in either direction.

Though \class{list} objects support similar operations, they are optimized
for fast fixed-length operations and incur \code{O(n)} memory movement costs
for \samp{pop(0)} and \samp{insert(0, v)} operations, which change both the
size and position of the underlying data representation.
\versionadded{2.4}
\end{funcdesc}
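As a minimal sketch of the description above, the following shows one \class{deque} used first as a queue and then as a stack.  The variable names are illustrative only and are not part of the module.

```python
from collections import deque

# Build a deque from an iterable; items are appended left to right.
queue = deque([1, 2, 3])
empty = deque()              # no iterable given: the new deque is empty

# Queue usage: append on the right, pop from the left.
queue.append(4)
first = queue.popleft()      # 1; a list would need pop(0), which shifts items

# Stack usage: append and pop on the same (right) side.
last = queue.pop()           # 4

remaining = list(queue)      # [2, 3]
```

Both ends are served by the same object, so no separate stack and queue types are needed.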

Deque objects support the following methods:

\begin{methoddesc}{append}{x}
Add \var{x} to the right side of the deque.
\end{methoddesc}

\begin{methoddesc}{appendleft}{x}
Add \var{x} to the left side of the deque.
\end{methoddesc}

\begin{methoddesc}{clear}{}
Remove all elements from the deque, leaving it with length 0.
\end{methoddesc}

\begin{methoddesc}{extend}{iterable}
Extend the right side of the deque by appending elements from
\var{iterable}.
\end{methoddesc}

\begin{methoddesc}{extendleft}{iterable}
Extend the left side of the deque by appending elements from
\var{iterable}.  Note that the series of left appends reverses
the order of the elements in \var{iterable}.
\end{methoddesc}

\begin{methoddesc}{pop}{}
Remove and return an element from the right side of the deque.
If no elements are present, raises an \exception{IndexError}.
\end{methoddesc}

\begin{methoddesc}{popleft}{}
Remove and return an element from the left side of the deque.
If no elements are present, raises an \exception{IndexError}.
\end{methoddesc}

\begin{methoddesc}{remove}{value}
Remove the first occurrence of \var{value}.  If not found,
raises a \exception{ValueError}.
\versionadded{2.5}
\end{methoddesc}
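The behaviour of \method{remove()} can be sketched as follows.  The data and variable names are made up for illustration.

```python
from collections import deque

d = deque('abcabc')
d.remove('b')                    # only the first 'b' is removed
after_remove = list(d)           # ['a', 'c', 'a', 'b', 'c']

try:
    d.remove('z')                # 'z' does not occur in the deque
except ValueError:
    not_found = True
else:
    not_found = False
```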

\begin{methoddesc}{rotate}{n}
Rotate the deque \var{n} steps to the right.  If \var{n} is
negative, rotate to the left.  Rotating one step to the right
is equivalent to:  \samp{d.appendleft(d.pop())}.
\end{methoddesc}
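A short sketch of \method{rotate()} in both directions, together with the stated equivalence for a single step to the right.  The names \code{manual} and \code{builtin} are illustrative.

```python
from collections import deque

d = deque([1, 2, 3, 4, 5])
d.rotate(2)                      # two steps to the right
right = list(d)                  # [4, 5, 1, 2, 3]

d.rotate(-2)                     # two steps to the left undoes it
restored = list(d)               # [1, 2, 3, 4, 5]

# One step to the right, spelled out by hand ...
manual = deque([1, 2, 3, 4, 5])
manual.appendleft(manual.pop())

# ... gives the same result as rotate(1).
builtin = deque([1, 2, 3, 4, 5])
builtin.rotate(1)
same = (manual == builtin)
```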

In addition to the above, deques support iteration, pickling, \samp{len(d)},
\samp{reversed(d)}, \samp{copy.copy(d)}, \samp{copy.deepcopy(d)},
membership testing with the \keyword{in} operator, and subscript references
such as \samp{d[-1]}.

Example:

\begin{verbatim}
>>> from collections import deque
>>> d = deque('ghi')                 # make a new deque with three items
>>> for elem in d:                   # iterate over the deque's elements
...     print elem.upper()
G
H
I

>>> d.append('j')                    # add a new entry to the right side
>>> d.appendleft('f')                # add a new entry to the left side
>>> d                                # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])

>>> d.pop()                          # return and remove the rightmost item
'j'
>>> d.popleft()                      # return and remove the leftmost item
'f'
>>> list(d)                          # list the contents of the deque
['g', 'h', 'i']
>>> d[0]                             # peek at leftmost item
'g'
>>> d[-1]                            # peek at rightmost item
'i'

>>> list(reversed(d))                # list the contents of a deque in reverse
['i', 'h', 'g']
>>> 'h' in d                         # search the deque
True
>>> d.extend('jkl')                  # add multiple elements at once
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> d.rotate(1)                      # right rotation
>>> d
deque(['l', 'g', 'h', 'i', 'j', 'k'])
>>> d.rotate(-1)                     # left rotation
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])

>>> deque(reversed(d))               # make a new deque in reverse order
deque(['l', 'k', 'j', 'i', 'h', 'g'])
>>> d.clear()                        # empty the deque
>>> d.pop()                          # cannot pop from an empty deque
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in -toplevel-
    d.pop()
IndexError: pop from an empty deque

>>> d.extendleft('abc')              # extendleft() reverses the input order
>>> d
deque(['c', 'b', 'a'])
\end{verbatim}

\subsubsection{Recipes \label{deque-recipes}}

This section shows various approaches to working with deques.

The \method{rotate()} method provides a way to implement \class{deque}
slicing and deletion.  For example, a pure Python implementation of
\code{del d[n]} relies on the \method{rotate()} method to position
elements to be popped:

\begin{verbatim}
def delete_nth(d, n):
    d.rotate(-n)
    d.popleft()
    d.rotate(n)
\end{verbatim}

To implement \class{deque} slicing, use a similar approach, applying
\method{rotate()} to bring a target element to the left side of the deque.
Remove old entries with \method{popleft()}, add new entries with
\method{extend()}, and then reverse the rotation.
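The slicing recipe above might be sketched as follows.  The helper name \code{replace\_slice} and its argument names are hypothetical, and the sketch assumes \code{0 <= start <= stop <= len(d)}.  Because \method{extend()} adds the new entries on the right, the final rotation also has to carry them back, hence the extra \code{len(new\_items)} steps.

```python
from collections import deque

def replace_slice(d, start, stop, new_items):
    """Emulate ``d[start:stop] = new_items`` in place (hypothetical helper).

    Assumes 0 <= start <= stop <= len(d).
    """
    new_items = list(new_items)
    d.rotate(-start)                     # bring d[start] to the left side
    for _ in range(stop - start):
        d.popleft()                      # remove the old entries
    d.extend(new_items)                  # add the new entries on the right
    d.rotate(start + len(new_items))     # reverse the rotation

d = deque('abcdef')
replace_slice(d, 1, 3, 'XYZ')            # like d[1:3] = 'XYZ' on a list
result = list(d)                         # ['a', 'X', 'Y', 'Z', 'd', 'e', 'f']
```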

With minor variations on that approach, it is easy to implement Forth-style
stack manipulations such as \code{dup}, \code{drop}, \code{swap}, \code{over},
\code{pick}, \code{rot}, and \code{roll}.
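One possible sketch of these stack words is shown below.  It assumes that the top of the stack is the right end of the deque and that \code{pick} and \code{roll} count from zero at the top.  The function names mirror the Forth words but are otherwise illustrative and not part of the module.

```python
from collections import deque

# Assumed convention: the top of the stack is the right end of the deque.

def dup(s):
    """( a -- a a )"""
    s.append(s[-1])

def drop(s):
    """( a -- )"""
    s.pop()

def swap(s):
    """( a b -- b a )"""
    b, a = s.pop(), s.pop()
    s.extend([b, a])

def over(s):
    """( a b -- a b a )"""
    s.append(s[-2])

def rot(s):
    """( a b c -- b c a )"""
    c, b, a = s.pop(), s.pop(), s.pop()
    s.extend([b, c, a])

def pick(s, n):
    """Copy the item n places below the top (0 is the top) onto the top."""
    s.append(s[-1 - n])

def roll(s, n):
    """Move the item n places below the top (0 is the top) onto the top."""
    s.rotate(n)          # park the n items above the target on the left side
    x = s.pop()          # the target is now the rightmost item
    s.rotate(-n)         # bring the parked items back
    s.append(x)

stack = deque([1, 2, 3])
dup(stack)               # 1 2 3 3
drop(stack)              # 1 2 3
swap(stack)              # 1 3 2
over(stack)              # 1 3 2 3
rot(stack)               # 1 2 3 3
pick(stack, 3)           # 1 2 3 3 1
roll(stack, 2)           # 1 2 3 1 3
```

Only \code{roll} needs \method{rotate()}; the other words touch a fixed number of items at the right end.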

A round-robin task server can be built from a \class{deque} using
\method{popleft()} to select the current task and \method{append()}
to add it back to the task list if the input stream is not exhausted:

\begin{verbatim}
def roundrobin(*iterables):
    pending = deque(iter(i) for i in iterables)
    while pending:
        task = pending.popleft()
        try:
            yield task.next()
        except StopIteration:
            continue
        pending.append(task)

>>> for value in roundrobin('abc', 'd', 'efgh'):
...     print value

a
d
e
b
f
c
g
h

\end{verbatim}


Multi-pass data reduction algorithms can be succinctly expressed and
efficiently coded by extracting elements with multiple calls to
\method{popleft()}, applying the reduction function, and calling
\method{append()} to add the result back to the queue.

For example, building a balanced binary tree of nested lists entails
reducing two adjacent nodes into one by grouping them in a list:

\begin{verbatim}
def maketree(iterable):
    d = deque(iterable)
    while len(d) > 1:
        pair = [d.popleft(), d.popleft()]
        d.append(pair)
    return list(d)

>>> print maketree('abcdefgh')
[[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]]

\end{verbatim}



\subsection{\class{defaultdict} objects \label{defaultdict-objects}}

\begin{funcdesc}{defaultdict}{\optional{default_factory\optional{, ...}}}
Returns a new dictionary-like object.  \class{defaultdict} is a subclass
of the builtin \class{dict} class.  It overrides one method and adds one
writable instance variable.  The remaining functionality is the same as
for the \class{dict} class and is not documented here.

The first argument provides the initial value for the
\member{default_factory} attribute; it defaults to \code{None}.
All remaining arguments are treated the same as if they were
passed to the \class{dict} constructor, including keyword arguments.

\versionadded{2.5}
\end{funcdesc}
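A brief sketch of the constructor arguments described above.  The keys and values are made up for illustration.

```python
from collections import defaultdict

# The first argument is the default_factory; the remaining arguments are
# handled as they would be by the dict constructor.
d = defaultdict(list, [('a', [1])], b=[2])
d['c'].append(3)                 # missing key: list() supplies the default

contents = sorted(d.items())     # [('a', [1]), ('b', [2]), ('c', [3])]
is_dict = isinstance(d, dict)    # defaultdict is a dict subclass

plain = defaultdict()            # default_factory defaults to None
```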

\class{defaultdict} objects support the following method in addition to
the standard \class{dict} operations:

\begin{methoddesc}{__missing__}{key}
If the \member{default_factory} attribute is \code{None}, this raises
a \exception{KeyError} exception with the \var{key} as argument.

If \member{default_factory} is not \code{None}, it is called without
arguments to provide a default value for the given \var{key}; this
value is inserted in the dictionary for the \var{key} and returned.

If calling \member{default_factory} raises an exception, this exception
is propagated unchanged.

This method is called by the \method{__getitem__} method of the
\class{dict} class when the requested key is not found; whatever it
returns or raises is then returned or raised by \method{__getitem__}.
\end{methoddesc}
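The three cases above can be sketched as follows.  The factory \code{failing\_factory} is a hypothetical function written only for this illustration, and the \samp{except ... as} syntax assumes a Python version that supports it.

```python
from collections import defaultdict

# Case 1: a factory is set, so a missing key is created and returned.
d = defaultdict(int)
value = d['x']                   # int() is called without arguments
stored = ('x' in d)              # the default was inserted for the key

# Only subscript access goes through __getitem__; get() does not add keys.
fallback = d.get('y')
untouched = ('y' not in d)

# Case 2: default_factory is None, so a missing key raises KeyError.
strict = defaultdict(None)
try:
    strict['x']
except KeyError as exc:
    key_error_args = exc.args    # the key is passed as the argument

# Case 3: an exception raised by the factory propagates unchanged.
def failing_factory():           # hypothetical factory, for illustration only
    raise RuntimeError('no default available')

broken = defaultdict(failing_factory)
try:
    broken['x']
except RuntimeError:
    propagated = True
```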

\class{defaultdict} objects support the following instance variable:

\begin{datadesc}{default_factory}
This attribute is used by the \method{__missing__} method; it is initialized
from the first argument to the constructor, if present, or to \code{None},
if absent.
\end{datadesc}
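Because \member{default_factory} is writable, it can be changed after construction.  The following is a small sketch with made-up keys: the factory is set to \code{None} to stop new keys from being created, and then switched to a different factory.

```python
from collections import defaultdict

d = defaultdict(list)
d['seen'].append(1)

# Turn off automatic defaults: missing keys now raise KeyError.
d.default_factory = None
try:
    d['unseen']
except KeyError:
    frozen = True

# Switch to a different factory for keys created from now on.
d.default_factory = set
d['tags'].add('x')

keys = sorted(d)                 # ['seen', 'tags']
```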


\subsubsection{\class{defaultdict} Examples \label{defaultdict-examples}}

Using \class{list} as the \member{default_factory}, it is easy to group
a sequence of key-value pairs into a dictionary of lists:

\begin{verbatim}
>>> from collections import defaultdict
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
        d[k].append(v)

>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
\end{verbatim}

When each key is encountered for the first time, it is not already in the
mapping, so an entry is automatically created using the
\member{default_factory} function, which returns an empty \class{list}.  The
\method{list.append()} operation then attaches the value to the new list.  When
keys are encountered again, the look-up proceeds normally (returning the list
for that key) and the \method{list.append()} operation adds another value to
the list.  This technique is simpler and faster than an equivalent technique
using \method{dict.setdefault()}:

\begin{verbatim}
>>> d = {}
>>> for k, v in s:
        d.setdefault(k, []).append(v)

>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
\end{verbatim}

Setting the \member{default_factory} to \class{int} makes the
\class{defaultdict} useful for counting (like a bag or multiset in other
languages):

\begin{verbatim}
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
        d[k] += 1

>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
\end{verbatim}

When a letter is first encountered, it is missing from the mapping, so the
\member{default_factory} function calls \function{int()} to supply a default
count of zero.  The increment operation then builds up the count for each
letter.  This technique makes counting simpler and faster than an equivalent
technique using \method{dict.get()}:

\begin{verbatim}
>>> d = {}
>>> for k in s:
        d[k] = d.get(k, 0) + 1

>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
\end{verbatim}

Setting the \member{default_factory} to \class{set} makes the
\class{defaultdict} useful for building a dictionary of sets:

\begin{verbatim}
>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
>>> d = defaultdict(set)
>>> for k, v in s:
        d[k].add(v)

>>> d.items()
[('blue', set([2, 4])), ('red', set([1, 3]))]
\end{verbatim}