\section{\module{itertools} ---
         Functions creating iterators for efficient looping}

\declaremodule{standard}{itertools}
\modulesynopsis{Functions creating iterators for efficient looping.}
\moduleauthor{Raymond Hettinger}{[email protected]}
\sectionauthor{Raymond Hettinger}{[email protected]}
\versionadded{2.3}


This module implements a number of iterator building blocks inspired
by constructs from the Haskell and SML programming languages. Each
has been recast in a form suitable for Python.

The module standardizes a core set of fast, memory efficient tools
that are useful by themselves or in combination. Standardization helps
avoid the readability and reliability problems which arise when many
different individuals create their own slightly varying implementations,
each with their own quirks and naming conventions.

The tools are designed to combine readily with one another. This makes
it easy to construct more specialized tools succinctly and efficiently
in pure Python.

For instance, SML provides a tabulation tool: \code{tabulate(f)}
which produces a sequence \code{f(0), f(1), ...}. This toolbox
provides \function{imap()} and \function{count()} which can be combined
to form \code{imap(f, count())} and produce an equivalent result.
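
The \code{tabulate} combination can be tried directly. The following is a
minimal sketch rather than part of the module: \code{square} is a hypothetical
stand-in for \code{f}, and the \code{try}/\code{except} fallback to the
built-in \function{map()} is an assumption added so that the snippet also runs
on interpreters that do not provide \function{imap()}. Because the stream is
infinite, \function{islice()} is used to truncate it.

```python
from itertools import count, islice

try:
    from itertools import imap      # spelling used throughout this section
except ImportError:
    imap = map                      # assumption: a lazy built-in map() is available instead


def square(n):
    """Hypothetical stand-in for the function f being tabulated."""
    return n * n


# imap(square, count()) is the infinite stream square(0), square(1), ...
# islice() cuts it off after five items so that list() terminates.
first_five = list(islice(imap(square, count()), 5))
# first_five == [0, 1, 4, 9, 16]
```

Nothing is computed until \function{list()} asks for values, which is why the
unbounded \function{count()} causes no problem here.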

Likewise, the functional tools are designed to work well with the
high-speed functions provided by the \refmodule{operator} module.

The module author welcomes suggestions for other basic building blocks
to be added to future versions of the module.

Whether cast in pure Python form or compiled code, tools that use iterators
are more memory efficient (and faster) than their list-based counterparts.
Adopting the principles of just-in-time manufacturing, they create
data when and where needed instead of consuming memory with the
computer equivalent of ``inventory''.

The performance advantage of iterators becomes more acute as the number
of elements increases -- at some point, lists grow large enough to
severely impact memory cache performance and start running slowly.

\begin{seealso}
  \seetext{The Standard ML Basis Library,
           \citetitle[http://www.standardml.org/Basis/]
           {The Standard ML Basis Library}.}

  \seetext{Haskell, A Purely Functional Language,
           \citetitle[http://www.haskell.org/definition/]
           {Definition of Haskell and the Standard Libraries}.}
\end{seealso}


\subsection{Itertool functions \label{itertools-functions}}

The following module functions all construct and return iterators.
Some provide streams of infinite length, so they should only be accessed
by functions or loops that truncate the stream.

\begin{funcdesc}{chain}{*iterables}
Make an iterator that returns elements from the first iterable until
it is exhausted, then proceeds to the next iterable, until all of the
iterables are exhausted. Used for treating consecutive sequences as
a single sequence. Equivalent to:

\begin{verbatim}
def chain(*iterables):
    for it in iterables:
        for element in it:
            yield element
\end{verbatim}
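
As a short illustration (the input values are made up for the example),
\function{chain()} accepts any mix of iterables, here a list, a tuple and a
string, and walks them as one stream:

```python
from itertools import chain

# Three unrelated iterables are visited one after another, left to right.
merged = list(chain([1, 2], (3, 4), 'ab'))
# merged == [1, 2, 3, 4, 'a', 'b']
```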
\end{funcdesc}

\begin{funcdesc}{count}{\optional{n}}
Make an iterator that returns consecutive integers starting with \var{n}.
If not specified \var{n} defaults to zero.
Does not currently support Python long integers. Often used as an
argument to \function{imap()} to generate consecutive data points.
Also used with \function{izip()} to add sequence numbers. Equivalent to:

\begin{verbatim}
def count(n=0):
    while True:
        yield n
        n += 1
\end{verbatim}

Note, \function{count()} does not check for overflow and will return
negative numbers after exceeding \code{sys.maxint}. This behavior
may change in the future.
\end{funcdesc}

\begin{funcdesc}{cycle}{iterable}
Make an iterator returning elements from the iterable and saving a
copy of each. When the iterable is exhausted, return elements from
the saved copy. Repeats indefinitely. Equivalent to:

\begin{verbatim}
def cycle(iterable):
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
\end{verbatim}

Note, this member of the toolkit may require significant
auxiliary storage (depending on the length of the iterable).
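
A small usage sketch, with made-up input values: since \function{cycle()}
does not stop on a non-empty input, it is normally paired with something that
bounds the stream, such as \function{islice()}.

```python
from itertools import cycle, islice

# cycle() never stops on a non-empty input, so islice() takes a bounded prefix.
colors = list(islice(cycle(['red', 'green', 'blue']), 7))
# colors == ['red', 'green', 'blue', 'red', 'green', 'blue', 'red']

# An empty input produces an empty iterator rather than looping forever,
# matching the "while saved" test in the equivalent code.
nothing = list(cycle([]))
# nothing == []
```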
\end{funcdesc}

\begin{funcdesc}{dropwhile}{predicate, iterable}
Make an iterator that drops elements from the iterable as long as
the predicate is true; afterwards, returns every element. Note,
the iterator does not produce \emph{any} output until the predicate
first becomes false, so it may have a lengthy start-up time. Equivalent to:

\begin{verbatim}
def dropwhile(predicate, iterable):
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x
\end{verbatim}
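
The following sketch (the data and the threshold are made up for the example)
shows that the predicate is consulted only until it first fails; later
elements pass through even if they would have satisfied it.

```python
from itertools import dropwhile

readings = [1, 3, 6, 2, 8, 1]

# Elements are dropped while x < 5 holds.  The first failure (6) switches
# the iterator to pass-through mode, so the later 2 and 1 are NOT dropped.
tail = list(dropwhile(lambda x: x < 5, readings))
# tail == [6, 2, 8, 1]
```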
\end{funcdesc}

\begin{funcdesc}{groupby}{iterable\optional{, key}}
Make an iterator that returns consecutive keys and groups from the
\var{iterable}. The \var{key} is a function computing a key value for each
element. If \var{key} is not specified or is \code{None}, it defaults to an
identity function and returns the element unchanged. Generally, the
iterable needs to already be sorted on the same key function.

The returned group is itself an iterator that shares the underlying
iterable with \function{groupby()}. Because the source is shared, when
the \function{groupby} object is advanced, the previous group is no
longer visible. So, if that data is needed later, it should be stored
as a list:

\begin{verbatim}
groups = []
uniquekeys = []
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
\end{verbatim}

\function{groupby()} is equivalent to:

\begin{verbatim}
class groupby(object):
    def __init__(self, iterable, key=None):
        if key is None:
            key = lambda x: x
        self.keyfunc = key
        self.it = iter(iterable)
        self.tgtkey = self.currkey = self.currvalue = xrange(0)
    def __iter__(self):
        return self
    def next(self):
        while self.currkey == self.tgtkey:
            self.currvalue = self.it.next() # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
        self.tgtkey = self.currkey
        return (self.currkey, self._grouper(self.tgtkey))
    def _grouper(self, tgtkey):
        while self.currkey == tgtkey:
            yield self.currvalue
            self.currvalue = self.it.next() # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
\end{verbatim}
\versionadded{2.4}
\end{funcdesc}

\begin{funcdesc}{ifilter}{predicate, iterable}
Make an iterator that filters elements from iterable returning only
those for which the predicate is \code{True}.
If \var{predicate} is \code{None}, return the items that are true.
Equivalent to:

\begin{verbatim}
def ifilter(predicate, iterable):
    if predicate is None:
        predicate = bool
    for x in iterable:
        if predicate(x):
            yield x
\end{verbatim}
\end{funcdesc}

\begin{funcdesc}{ifilterfalse}{predicate, iterable}
Make an iterator that filters elements from iterable returning only
those for which the predicate is \code{False}.
If \var{predicate} is \code{None}, return the items that are false.
Equivalent to:

\begin{verbatim}
def ifilterfalse(predicate, iterable):
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x
\end{verbatim}
\end{funcdesc}

\begin{funcdesc}{imap}{function, *iterables}
Make an iterator that computes the function using arguments from
each of the iterables. If \var{function} is set to \code{None}, then
\function{imap()} returns the arguments as a tuple. Like
\function{map()} but stops when the shortest iterable is exhausted
instead of filling in \code{None} for shorter iterables. The reason
for the difference is that infinite iterator arguments are typically
an error for \function{map()} (because the output is fully evaluated)
but represent a common and useful way of supplying arguments to
\function{imap()}.
Equivalent to:

\begin{verbatim}
def imap(function, *iterables):
    iterables = map(iter, iterables)
    while True:
        args = [i.next() for i in iterables]
        if function is None:
            yield tuple(args)
        else:
            yield function(*args)
\end{verbatim}
\end{funcdesc}

\begin{funcdesc}{islice}{iterable, \optional{start,} stop \optional{, step}}
Make an iterator that returns selected elements from the iterable.
If \var{start} is non-zero, then elements from the iterable are skipped
until \var{start} is reached. Afterward, elements are returned consecutively
unless \var{step} is set higher than one, which results in items being
skipped. If \var{stop} is \code{None}, then iteration continues until
the iterator is exhausted, if at all; otherwise, it stops at the specified
position. Unlike regular slicing,
\function{islice()} does not support negative values for \var{start},
\var{stop}, or \var{step}. Can be used to extract related fields
from data where the internal structure has been flattened (for
example, a multi-line report may list a name field on every
third line). Equivalent to:

\begin{verbatim}
def islice(iterable, *args):
    s = slice(*args)
    it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
    nexti = it.next()
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti = it.next()
\end{verbatim}

If \var{start} is \code{None}, then iteration starts at zero.
If \var{step} is \code{None}, then the step defaults to one.
\versionchanged[accept \code{None} values for default \var{start} and
                \var{step}]{2.5}
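
The ``name field on every third line'' case can be sketched as follows. The
report contents are hypothetical; the point is only to show the
\code{(start, stop, step)} and single-argument call forms side by side.

```python
from itertools import islice

# Hypothetical flattened report: each record is name, city, blank line.
report = ['alice', 'Oslo', '', 'bob', 'Lima', '', 'carol', 'Rome', '']

names = list(islice(report, 0, None, 3))    # start=0, no stop, step=3
cities = list(islice(report, 1, None, 3))   # same stride, offset by one
first_two = list(islice(report, 2))         # a single argument is the stop
# names == ['alice', 'bob', 'carol']
# cities == ['Oslo', 'Lima', 'Rome']
# first_two == ['alice', 'Oslo']
```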
\end{funcdesc}

\begin{funcdesc}{izip}{*iterables}
Make an iterator that aggregates elements from each of the iterables.
Like \function{zip()} except that it returns an iterator instead of
a list. Used for lock-step iteration over several iterables at a
time. Equivalent to:

\begin{verbatim}
def izip(*iterables):
    iterables = map(iter, iterables)
    while iterables:
        result = [it.next() for it in iterables]
        yield tuple(result)
\end{verbatim}

\versionchanged[When no iterables are specified, returns a zero length
                iterator instead of raising a \exception{TypeError}
                exception]{2.4}

Note, the left-to-right evaluation order of the iterables is guaranteed.
This makes possible an idiom for clustering a data series into n-length
groups using \samp{izip(*[iter(s)]*n)}. For data that doesn't fit
n-length groups exactly, the last tuple can be pre-padded with fill
values using \samp{izip(*[chain(s, [None]*(n-1))]*n)}.

Note, when \function{izip()} is used with unequal length inputs, subsequent
iteration over the longer iterables cannot reliably be continued after
\function{izip()} terminates. Potentially, up to one entry will be missing
from each of the left-over iterables. This occurs because a value is fetched
from each iterator in turn, but the process ends when one of the iterators
terminates. This leaves the last fetched values in limbo (they cannot be
returned in a final, incomplete tuple and they cannot be pushed back
into the iterator for retrieval with \code{it.next()}). In general,
\function{izip()} should only be used with unequal length inputs when you
don't care about trailing, unmatched values from the longer iterables.
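
Both notes can be checked with a short experiment. This is a sketch with
made-up data; the fallback to the built-in \function{zip()} is an assumption
added so that the snippet also runs on interpreters that do not provide
\function{izip()}.

```python
try:
    from itertools import izip      # spelling used throughout this section
except ImportError:
    izip = zip                      # assumption: a lazy built-in zip() is available instead

# Clustering idiom: the SAME iterator object appears n times, so each output
# tuple pulls n consecutive items from it.
s = 'abcdef'
n = 2
pairs = list(izip(*[iter(s)] * n))
# pairs == [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Unequal lengths: the longer iterator is listed first, so a value is fetched
# from it before the shorter one is found to be exhausted.
longer = iter([1, 2, 3, 4])
shorter = iter('xy')
matched = list(izip(longer, shorter))
leftover = list(longer)
# matched == [(1, 'x'), (2, 'y')]
# The 3 was fetched and then discarded, so leftover == [4].
```

Listing the shorter input first would avoid the lost value in this particular
case, but that relies on knowing in advance which input is shorter.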
\end{funcdesc}

\begin{funcdesc}{repeat}{object\optional{, times}}
Make an iterator that returns \var{object} over and over again.
Runs indefinitely unless the \var{times} argument is specified.
Used as an argument to \function{imap()} for invariant parameters
to the called function. Also used with \function{izip()} to create
an invariant part of a tuple record. Equivalent to:

\begin{verbatim}
def repeat(object, times=None):
    if times is None:
        while True:
            yield object
    else:
        for i in xrange(times):
            yield object
\end{verbatim}
\end{funcdesc}

\begin{funcdesc}{starmap}{function, iterable}
Make an iterator that computes the function using argument tuples
obtained from the iterable. Used instead of \function{imap()} when
the arguments are already grouped in tuples from a single iterable
(the data has been ``pre-zipped''). The difference between
\function{imap()} and \function{starmap()} parallels the distinction
between \code{function(a,b)} and \code{function(*c)}.
Equivalent to:

\begin{verbatim}
def starmap(function, iterable):
    iterable = iter(iterable)
    while True:
        yield function(*iterable.next())
\end{verbatim}
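
A minimal illustration of the ``pre-zipped'' case, using made-up argument
tuples and \code{operator.pow} as the function being applied:

```python
import operator
from itertools import starmap

# Each tuple is unpacked into the positional arguments of operator.pow,
# i.e. pow(2, 5), pow(3, 2), pow(10, 3).
pre_zipped = [(2, 5), (3, 2), (10, 3)]
powers = list(starmap(operator.pow, pre_zipped))
# powers == [32, 9, 1000]
```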
\end{funcdesc}

\begin{funcdesc}{takewhile}{predicate, iterable}
Make an iterator that returns elements from the iterable as long as
the predicate is true. Equivalent to:

\begin{verbatim}
def takewhile(predicate, iterable):
    for x in iterable:
        if predicate(x):
            yield x
        else:
            break
\end{verbatim}
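
One consequence of the equivalent code is worth seeing in a sketch (the data
is made up): the element that makes the predicate fail has already been pulled
from the underlying iterator, so it is not available to later consumers.

```python
from itertools import takewhile

source = iter([1, 4, 6, 4, 1])

head = list(takewhile(lambda x: x < 5, source))
# head == [1, 4]

# The 6 that ended the run was already consumed from the source, so it is
# gone; only the elements after it remain.
rest = list(source)
# rest == [4, 1]
```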
\end{funcdesc}

\begin{funcdesc}{tee}{iterable\optional{, n=2}}
Return \var{n} independent iterators from a single iterable.
The case where \code{n==2} is equivalent to:

\begin{verbatim}
def tee(iterable):
    def gen(next, data={}, cnt=[0]):
        for i in count():
            if i == cnt[0]:
                item = data[i] = next()
                cnt[0] += 1
            else:
                item = data.pop(i)
            yield item
    it = iter(iterable)
    return (gen(it.next), gen(it.next))
\end{verbatim}

Note, once \function{tee()} has made a split, the original \var{iterable}
should not be used anywhere else; otherwise, the \var{iterable} could get
advanced without the tee objects being informed.

Note, this member of the toolkit may require significant auxiliary
storage (depending on how much temporary data needs to be stored).
In general, if one iterator is going to use most or all of the data before
the other iterator, it is faster to use \function{list()} instead of
\function{tee()}.
\versionadded{2.4}
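
The independence of the returned iterators can be sketched as follows, using
a small made-up range as the source:

```python
from itertools import islice, tee

source = iter(range(6))
a, b = tee(source)          # from here on, use only a and b, not source

# Advancing one branch does not advance the other; items that b has not
# seen yet are buffered internally until b asks for them.
ahead = list(islice(a, 3))
behind = list(b)
remaining = list(a)
# ahead == [0, 1, 2]
# behind == [0, 1, 2, 3, 4, 5]
# remaining == [3, 4, 5]
```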
\end{funcdesc}


\subsection{Examples \label{itertools-example}}

The following examples show common uses for each tool and
demonstrate ways they can be combined.

\begin{verbatim}

>>> from itertools import *
>>> amounts = [120.15, 764.05, 823.14]
>>> for checknum, amount in izip(count(1200), amounts):
...     print 'Check %d is for $%.2f' % (checknum, amount)
...
Check 1200 is for $120.15
Check 1201 is for $764.05
Check 1202 is for $823.14

>>> import operator
>>> for cube in imap(operator.pow, xrange(1,5), repeat(3)):
...     print cube
...
1
8
27
64

>>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura',
...                '', 'martin', '', 'walter', '', 'mark']
>>> for name in islice(reportlines, 3, None, 2):
...     print name.title()
...
Alex
Laura
Martin
Walter
Mark

# Show a dictionary sorted and grouped by value
>>> from operator import itemgetter
>>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
>>> di = sorted(d.iteritems(), key=itemgetter(1))
>>> for k, g in groupby(di, key=itemgetter(1)):
...     print k, map(itemgetter(0), g)
...
1 ['a', 'c', 'e']
2 ['b', 'd', 'f']
3 ['g']

# Find runs of consecutive numbers using groupby.  The key to the solution
# is differencing with a range so that consecutive numbers all appear in
# the same group.
>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
...     print map(operator.itemgetter(1), g)
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

\end{verbatim}


\subsection{Recipes \label{itertools-recipes}}

This section shows recipes for creating an extended toolset using the
existing itertools as building blocks.

The extended tools offer the same high performance as the underlying
toolset. The superior memory performance is kept by processing elements one
at a time rather than bringing the whole iterable into memory all at once.
Code volume is kept small by linking the tools together in a functional style
which helps eliminate temporary variables. High speed is retained by
preferring ``vectorized'' building blocks over the use of for-loops and
generators which incur interpreter overhead.

\begin{verbatim}
from itertools import *
import operator

def take(n, seq):
    return list(islice(seq, n))

def enumerate(iterable):
    return izip(count(), iterable)

def tabulate(function):
    "Return function(0), function(1), ..."
    return imap(function, count())

def iteritems(mapping):
    return izip(mapping.iterkeys(), mapping.itervalues())

def nth(iterable, n):
    "Returns the nth item in a one-element list, or an empty list"
    return list(islice(iterable, n, n+1))

def all(seq, pred=None):
    "Returns True if pred(x) is true for every element in the iterable"
    for elem in ifilterfalse(pred, seq):
        return False
    return True

def any(seq, pred=None):
    "Returns True if pred(x) is true for at least one element in the iterable"
    for elem in ifilter(pred, seq):
        return True
    return False

def no(seq, pred=None):
    "Returns True if pred(x) is false for every element in the iterable"
    for elem in ifilter(pred, seq):
        return False
    return True

def quantify(seq, pred=bool):
    "Count how many times the predicate is true in the sequence"
    # The default is bool rather than None: imap(None, seq) yields tuples,
    # which sum() cannot add.
    return sum(imap(pred, seq))

def padnone(seq):
    """Returns the sequence elements and then returns None indefinitely.

    Useful for emulating the behavior of the built-in map() function.
    """
    return chain(seq, repeat(None))

def ncycles(seq, n):
    "Returns the sequence elements n times"
    return chain(*repeat(seq, n))

def dotproduct(vec1, vec2):
    return sum(imap(operator.mul, vec1, vec2))

def flatten(listOfLists):
    return list(chain(*listOfLists))

def repeatfunc(func, times=None, *args):
    """Repeat calls to func with specified arguments.

    Example:  repeatfunc(random.random)
    """
    if times is None:
        return starmap(func, repeat(args))
    else:
        return starmap(func, repeat(args, times))

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    try:
        b.next()
    except StopIteration:
        pass
    return izip(a, b)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

\end{verbatim}
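
As a usage sketch, four of the recipes are reproduced below with made-up
inputs. These particular recipes were chosen because they rely only on
\function{chain()}, \function{islice()} and \function{repeat()}; the
docstring wording for \code{nth} is this example's own.

```python
from itertools import chain, islice, repeat


def take(n, seq):
    return list(islice(seq, n))


def nth(iterable, n):
    "Returns the nth item in a one-element list (empty if out of range)"
    return list(islice(iterable, n, n+1))


def ncycles(seq, n):
    "Returns the sequence elements n times"
    return chain(*repeat(seq, n))


def flatten(listOfLists):
    return list(chain(*listOfLists))


first_three = take(3, 'abcdefg')            # ['a', 'b', 'c']
third = nth('abcdefg', 2)                   # ['c']
missing = nth('abc', 10)                    # []
twice = list(ncycles([1, 2], 2))            # [1, 2, 1, 2]
flat = flatten([[1, 2], [3], [4, 5]])       # [1, 2, 3, 4, 5]
```

Note that \code{nth} returns a list in both cases, so an out-of-range index
shows up as an empty list rather than as an exception.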