\section{\module{bz2} ---
         Compression compatible with \program{bzip2}}

\declaremodule{builtin}{bz2}
\modulesynopsis{Interface to compression and decompression
                routines compatible with \program{bzip2}.}
\moduleauthor{Gustavo Niemeyer}{[email protected]}
\sectionauthor{Gustavo Niemeyer}{[email protected]}

\versionadded{2.3}

This module provides a comprehensive interface for the bz2 compression library.
It implements a complete file interface, one-shot (de)compression functions,
and types for sequential (de)compression.

Here is a summary of the features offered by the bz2 module:

\begin{itemize}
\item The \class{BZ2File} class implements a complete file interface, including
      \method{readline()}, \method{readlines()},
      \method{writelines()}, \method{seek()}, etc.;
\item The \class{BZ2File} class implements emulated \method{seek()} support;
\item The \class{BZ2File} class implements universal newline support;
\item The \class{BZ2File} class offers an optimized line iteration using
      the readahead algorithm borrowed from file objects;
\item Sequential (de)compression is supported by the \class{BZ2Compressor} and
      \class{BZ2Decompressor} classes;
\item One-shot (de)compression is supported by the \function{compress()} and
      \function{decompress()} functions;
\item Thread safety is provided by an individual locking mechanism;
\item Complete inline documentation.
\end{itemize}


\subsection{(De)compression of files}

Handling of compressed files is offered by the \class{BZ2File} class.

\begin{classdesc}{BZ2File}{filename\optional{, mode\optional{,
                  buffering\optional{, compresslevel}}}}
Open a bz2 file. \var{mode} can be either \code{'r'} or \code{'w'}, for reading
(default) or writing. When opened for writing, the file will be created if
it doesn't exist, and truncated otherwise. If \var{buffering} is given,
\code{0} means unbuffered, and larger numbers specify the buffer size;
the default is \code{0}. If
\var{compresslevel} is given, it must be a number between \code{1} and
\code{9}; the default is \code{9}.
Add a \character{U} to \var{mode} to open the file for input with universal
newline support. Any line ending in the input file will be seen as a
\character{\e n} in Python. Also, a file so opened gains the
attribute \member{newlines}; the value for this attribute is one of
\code{None} (no newline read yet), \code{'\e r'}, \code{'\e n'},
\code{'\e r\e n'} or a tuple containing all the newline types
seen. Universal newlines are available only when reading.
Instances support iteration in the same way as normal \class{file}
instances.
\end{classdesc}

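As a minimal sketch of the constructor described above, the following writes
a small compressed file and reads it back by iterating over the instance.
It assumes a recent interpreter in which data is passed as byte strings
(the \code{b'...'} literals); the temporary directory and file name are
hypothetical, and \var{buffering} is deliberately left out.

```python
import bz2
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.bz2")

# Open for writing: the file is created (or truncated if it exists).
out = bz2.BZ2File(path, "w", compresslevel=9)
out.write(b"first line\nsecond line\n")
out.close()

# Open for reading (the default mode) and iterate line by line,
# as with a normal file object.
inp = bz2.BZ2File(path)
lines = [line for line in inp]
inp.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```

Each element of \code{lines} keeps its trailing newline, as with ordinary
file iteration.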
\begin{methoddesc}[BZ2File]{close}{}
Close the file. Sets the data attribute \member{closed} to true. A closed file
cannot be used for further I/O operations. \method{close()} may be called
more than once without error.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{read}{\optional{size}}
Read at most \var{size} uncompressed bytes, returned as a string. If the
\var{size} argument is negative or omitted, read until EOF is reached.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{readline}{\optional{size}}
Return the next line from the file, as a string, retaining the newline.
A non-negative \var{size} argument limits the maximum number of bytes to
return (an incomplete line may then be returned). Return an empty
string at EOF.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{readlines}{\optional{size}}
Return a list of lines read. The optional \var{size} argument, if given,
is an approximate bound on the total number of bytes in the lines returned.
\end{methoddesc}

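The three reading methods can be combined on one open file. The sketch below
is illustrative only: it assumes byte-string data on a recent interpreter,
and the file name and contents are made up for the example.

```python
import bz2
import os
import tempfile

# Hypothetical scratch file holding three short lines.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "lines.bz2")
f = bz2.BZ2File(path, "w")
f.write(b"alpha\nbeta\ngamma\n")
f.close()

f = bz2.BZ2File(path)
head = f.read(3)             # at most 3 uncompressed bytes
rest_of_line = f.readline()  # remainder of the first line, newline kept
remaining = f.readlines()    # every remaining line, as a list
at_eof = f.readline()        # empty string once EOF is reached
f.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```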
\begin{methoddesc}[BZ2File]{xreadlines}{}
For backward compatibility. \class{BZ2File} objects now include the
performance optimizations previously implemented in the
\module{xreadlines} module.
\deprecated{2.3}{This exists only for compatibility with the method by
                 this name on \class{file} objects, which is
                 deprecated. Use \code{for line in file} instead.}
\end{methoddesc}

\begin{methoddesc}[BZ2File]{seek}{offset\optional{, whence}}
Move to a new file position. Argument \var{offset} is a byte count. Optional
argument \var{whence} defaults to \code{0} (offset from start of file,
offset should be \code{>= 0}); other values are \code{1} (move relative to
current position, positive or negative), and \code{2} (move relative to end
of file, usually negative, although many platforms allow seeking beyond
the end of a file).

Note that seeking of bz2 files is emulated, and depending on the parameters
the operation may be extremely slow.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{tell}{}
Return the current file position, an integer (may be a long integer).
\end{methoddesc}

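The three \var{whence} values of \method{seek()} and the positions reported
by \method{tell()} can be sketched as follows. Positions refer to the
uncompressed data. The example assumes byte-string data on a recent
interpreter and uses a hypothetical ten-byte file; on large files the same
calls may be slow because seeking is emulated.

```python
import bz2
import os
import tempfile

# Hypothetical scratch file containing ten known bytes.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "digits.bz2")
f = bz2.BZ2File(path, "w")
f.write(b"0123456789")
f.close()

f = bz2.BZ2File(path)
f.seek(4)                    # whence 0: absolute offset from the start
pos_after_seek = f.tell()
chunk = f.read(3)            # reads the bytes at offsets 4, 5 and 6
f.seek(-2, 1)                # whence 1: two bytes back from the current position
pos_relative = f.tell()
f.seek(-1, 2)                # whence 2: one byte before the end of the data
last = f.read()
f.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```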
\begin{methoddesc}[BZ2File]{write}{data}
Write string \var{data} to file. Note that due to buffering, \method{close()}
may be needed before the file on disk reflects the data written.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{writelines}{sequence_of_strings}
Write the sequence of strings to the file. Note that newlines are not added.
The sequence can be any iterable object producing strings. This is equivalent
to calling \method{write()} for each string.
\end{methoddesc}


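Because \method{writelines()} does not add newlines, the caller supplies
them. The following sketch passes both a list and a generator, then closes
the file before reading the data back. It assumes byte-string data on a
recent interpreter; the file name and words are hypothetical.

```python
import bz2
import os
import tempfile

# Hypothetical scratch file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "words.bz2")

f = bz2.BZ2File(path, "w")
# A list whose items already end with a newline.
f.writelines([b"one\n", b"two\n"])
# Any iterable works; here a generator appends the newline itself.
f.writelines(word + b"\n" for word in [b"three", b"four"])
# Close so that the file on disk reflects everything written.
f.close()

f = bz2.BZ2File(path)
data = f.read()
f.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```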
\subsection{Sequential (de)compression}

Sequential compression and decompression are done using the classes
\class{BZ2Compressor} and \class{BZ2Decompressor}.

\begin{classdesc}{BZ2Compressor}{\optional{compresslevel}}
Create a new compressor object. This object may be used to compress
data sequentially. If you want to compress data in one shot, use the
\function{compress()} function instead. The \var{compresslevel} parameter,
if given, must be a number between \code{1} and \code{9}; the default
is \code{9}.
\end{classdesc}

\begin{methoddesc}[BZ2Compressor]{compress}{data}
Provide more data to the compressor object. It will return chunks of compressed
data whenever possible. When you've finished providing data to compress, call
the \method{flush()} method to finish the compression process, and return what
is left in internal buffers.
\end{methoddesc}

\begin{methoddesc}[BZ2Compressor]{flush}{}
Finish the compression process and return what is left in internal buffers. You
must not use the compressor object after calling this method.
\end{methoddesc}

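A minimal sketch of sequential compression, assuming byte-string data on a
recent interpreter: each input chunk is passed to \method{compress()}, whose
return value may be empty while data is still buffered, and
\method{flush()} is called once at the end. The chunk contents are made up
for the example.

```python
import bz2

# Hypothetical input arriving in several pieces.
chunks = [b"spam " * 100, b"eggs " * 100, b"ham " * 100]

compressor = bz2.BZ2Compressor(9)
pieces = []
for chunk in chunks:
    # May return an empty string while the compressor is still buffering.
    pieces.append(compressor.compress(chunk))
# Finish the stream; the compressor must not be used after this call.
pieces.append(compressor.flush())

compressed = b"".join(pieces)
```

The joined result is a complete bz2 stream, so it can be checked with the
one-shot \function{decompress()} function.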
\begin{classdesc}{BZ2Decompressor}{}
Create a new decompressor object. This object may be used to decompress
data sequentially. If you want to decompress data in one shot, use the
\function{decompress()} function instead.
\end{classdesc}

\begin{methoddesc}[BZ2Decompressor]{decompress}{data}
Provide more data to the decompressor object. It will return chunks of
decompressed data whenever possible. If you try to decompress data after the
end of stream is found, \exception{EOFError} will be raised. If any data was
found after the end of stream, it will be ignored and saved in the
\member{unused\_data} attribute.
\end{methoddesc}


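The behaviour at the end of the stream can be sketched as follows, assuming
byte-string data on a recent interpreter. The compressed stream is fed in two
pieces, the second of which carries some hypothetical trailing bytes that are
not part of the stream.

```python
import bz2

# Hypothetical payload and trailing data for illustration.
payload = b"payload " * 50
stream = bz2.compress(payload)
extra = b"trailing bytes"
half = len(stream) // 2

decompressor = bz2.BZ2Decompressor()
# Feed the stream in two pieces; output is returned whenever possible.
result = decompressor.decompress(stream[:half])
result += decompressor.decompress(stream[half:] + extra)

# Bytes found after the end of the stream are saved, not decompressed.
leftover = decompressor.unused_data

# Further input after the end of the stream raises EOFError.
try:
    decompressor.decompress(b"more")
    raised = False
except EOFError:
    raised = True
```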
\subsection{One-shot (de)compression}

One-shot compression and decompression are provided through the
\function{compress()} and \function{decompress()} functions.

\begin{funcdesc}{compress}{data\optional{, compresslevel}}
Compress \var{data} in one shot. If you want to compress data sequentially,
use an instance of \class{BZ2Compressor} instead. The \var{compresslevel}
parameter, if given, must be a number between \code{1} and \code{9};
the default is \code{9}.
\end{funcdesc}

\begin{funcdesc}{decompress}{data}
Decompress \var{data} in one shot. If you want to decompress data
sequentially, use an instance of \class{BZ2Decompressor} instead.
\end{funcdesc}
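A round trip through the two one-shot functions can be sketched as follows,
assuming byte-string data on a recent interpreter; the sample text is
arbitrary.

```python
import bz2

# Arbitrary, repetitive sample data so that compression is visible.
original = b"The quick brown fox jumps over the lazy dog. " * 20

packed = bz2.compress(original, 9)   # highest compression level
restored = bz2.decompress(packed)    # recovers the original bytes
```

For data that is already available in full, this is simpler than driving a
\class{BZ2Compressor} or \class{BZ2Decompressor} instance by hand.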