\section{\module{xml.sax} ---
Support for SAX2 parsers}

\declaremodule{standard}{xml.sax}
\modulesynopsis{Package containing SAX2 base classes and convenience
functions.}
\moduleauthor{Lars Marius Garshol}{[email protected]}
\sectionauthor{Fred L. Drake, Jr.}{[email protected]}
\sectionauthor{Martin v. L\"owis}{[email protected]}

\versionadded{2.0}

The \module{xml.sax} package provides a number of modules which
implement the Simple API for XML (SAX) interface for Python. The
package itself provides the SAX exceptions and the convenience
functions that users of the SAX API will need most often.

The convenience functions are:

\begin{funcdesc}{make_parser}{\optional{parser_list}}
Create and return a SAX \class{XMLReader} object. The first parser
found will be used. If \var{parser_list} is provided, it must be a
sequence of strings which name modules that have a function named
\function{create_parser()}. Modules listed in \var{parser_list}
will be used before modules in the default list of parsers.
\end{funcdesc}
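
The behaviour of \function{make_parser()} described above can be sketched as follows. This is a minimal illustration that assumes a Python 3 interpreter whose standard library includes the expat-based reader module \module{xml.sax.expatreader}; other installations may offer different parser modules.

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# With no arguments, make_parser() walks the default list of parser modules
# and returns a reader from the first one that can be imported.
parser = xml.sax.make_parser()

# Modules named in parser_list are tried before the default list. The module
# named here is assumed to be present; it defines create_parser().
expat_parser = xml.sax.make_parser(["xml.sax.expatreader"])

print(isinstance(parser, XMLReader))
print(type(expat_parser).__module__)
```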

\begin{funcdesc}{parse}{filename_or_stream, handler\optional{, error_handler}}
Create a SAX parser and use it to parse a document. The document,
passed in as \var{filename_or_stream}, can be a filename or a file
object. The \var{handler} parameter must be a SAX
\class{ContentHandler} instance. If \var{error_handler} is given,
it must be a SAX \class{ErrorHandler} instance; if omitted,
\exception{SAXParseException} will be raised on all errors. There
is no return value; all work must be done by the \var{handler}
passed in.
\end{funcdesc}
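
Because \function{parse()} returns nothing, results have to be accumulated on the handler. The sketch below assumes Python 3 and uses a hypothetical \class{ElementCounter} handler; an in-memory \class{io.BytesIO} object stands in for a real file.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


# A file object is accepted as well as a filename.
document = io.BytesIO(b"<root><item/><item/></root>")

counter = ElementCounter()
xml.sax.parse(document, counter)  # no return value; state lives on the handler
print(counter.counts)
```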

\begin{funcdesc}{parseString}{string, handler\optional{, error_handler}}
Similar to \function{parse()}, but parses from a buffer \var{string}
received as a parameter.
\end{funcdesc}
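
A short sketch of \function{parseString()}, assuming Python 3 and a bytes buffer. The \class{TextCollector} handler is a hypothetical name introduced for illustration. Character data may arrive in several chunks, so the handler joins them afterwards.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The reader may split text into several calls; keep every piece.
        self.chunks.append(content)


collector = TextCollector()
xml.sax.parseString(b"<greeting>Hello, SAX</greeting>", collector)
text = "".join(collector.chunks)
print(text)
```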

A typical SAX application uses three kinds of objects: readers,
handlers and input sources. ``Reader'' in this context is another
term for parser, i.e.\ some piece of code that reads the bytes or
characters from the input source, and produces a sequence of events.
The events then get distributed to the handler objects, i.e.\ the
reader invokes a method on the handler. A SAX application must
therefore obtain a reader object, create or open the input sources,
create the handlers, and connect these objects together. As the
final step of preparation, the reader is called to parse the input.
During parsing, methods on the handler objects are called based on
structural and syntactic events from the input data.
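
The steps in the preceding paragraph (obtain a reader, create an input source, create a handler, connect them, then parse) might look like this. It is a sketch under the assumption of Python 3; \class{EventLogger} is a hypothetical handler that simply records the events it receives.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLogger(ContentHandler):
    """Hypothetical handler that records element events in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the handler and connect it to the reader.
handler = EventLogger()
reader.setContentHandler(handler)

# 3. Create the input source and attach the data to it.
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 4. Ask the reader to parse; it calls back into the handler.
reader.parse(source)
print(handler.events)
```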

For these objects, only the interfaces are relevant; they are normally
not instantiated by the application itself. Since Python does not have
an explicit notion of interface, they are formally introduced as
classes, but applications may use implementations which do not inherit
from the provided classes. The \class{InputSource}, \class{Locator},
\class{Attributes}, \class{AttributesNS}, and
\class{XMLReader} interfaces are defined in the module
\refmodule{xml.sax.xmlreader}. The handler interfaces are defined in
\refmodule{xml.sax.handler}. For convenience, \class{InputSource}
(which is often instantiated directly) and the handler classes are
also available from \module{xml.sax}. These interfaces are described
below.
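
The convenience names can be checked directly. The sketch below assumes a current CPython 3 standard library, where \class{InputSource}, \class{ContentHandler} and \class{ErrorHandler} are re-exported from \module{xml.sax}; it makes no claim about the other handler classes.

```python
import xml.sax
import xml.sax.handler
import xml.sax.xmlreader

# The names in xml.sax are the same objects as those in the submodules,
# not copies or subclasses.
same_input_source = xml.sax.InputSource is xml.sax.xmlreader.InputSource
same_content_handler = xml.sax.ContentHandler is xml.sax.handler.ContentHandler
same_error_handler = xml.sax.ErrorHandler is xml.sax.handler.ErrorHandler

print(same_input_source, same_content_handler, same_error_handler)
```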

In addition to these classes, \module{xml.sax} provides the following
exception classes.

\begin{excclassdesc}{SAXException}{msg\optional{, exception}}
Encapsulate an XML error or warning. This class can contain basic
error or warning information from either the XML parser or the
application; it can be subclassed to provide additional
functionality or to add localization. Note that although the
handlers defined in the \class{ErrorHandler} interface receive
instances of this exception, raising the exception is not
required; it is also useful as a container for information.

When instantiated, \var{msg} should be a human-readable description
of the error. The optional \var{exception} parameter, if given,
should be \code{None} or an exception that was caught by the parsing
code and is being passed along as information.

This is the base class for the other SAX exception classes.
\end{excclassdesc}
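
As an illustration of subclassing \exception{SAXException} and of using an instance purely as a container, consider the sketch below. It assumes Python 3; the \class{ConfigError} class, its \var{filename} attribute and the file name are hypothetical and not part of \module{xml.sax}.

```python
from xml.sax import SAXException


class ConfigError(SAXException):
    """Hypothetical application-level subclass that records a file name."""

    def __init__(self, msg, filename, exception=None):
        super().__init__(msg, exception)
        self.filename = filename


try:
    int("not a number")
except ValueError as exc:
    # Built but not raised: the instance only carries information here.
    error = ConfigError("bad port value", "server.xml", exc)

print(error.filename)
print(isinstance(error, SAXException))
```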

\begin{excclassdesc}{SAXParseException}{msg, exception, locator}
Subclass of \exception{SAXException} raised on parse errors.
Instances of this class are passed to the methods of the SAX
\class{ErrorHandler} interface to provide information about the
parse error. This class supports the SAX \class{Locator} interface
as well as the \class{SAXException} interface.
\end{excclassdesc}
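
Since \exception{SAXParseException} supports the \class{Locator} interface, a caught instance can report where the error occurred. The following sketch assumes Python 3 and relies on the default error handling, which raises on fatal errors; the malformed document is made up for the example.

```python
import xml.sax
from xml.sax import SAXParseException
from xml.sax.handler import ContentHandler

caught = None
try:
    # <unclosed> is never closed, so the document is not well-formed.
    xml.sax.parseString(b"<root><unclosed></root>", ContentHandler())
except SAXParseException as err:
    caught = err

if caught is not None:
    # Locator-style accessors give the position of the error.
    print("line:", caught.getLineNumber())
    print("column:", caught.getColumnNumber())
    print("message:", caught.getMessage())
```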

\begin{excclassdesc}{SAXNotRecognizedException}{msg\optional{, exception}}
Subclass of \exception{SAXException} raised when a SAX
\class{XMLReader} is confronted with an unrecognized feature or
property. SAX applications and extensions may use this class for
similar purposes.
\end{excclassdesc}

\begin{excclassdesc}{SAXNotSupportedException}{msg\optional{, exception}}
Subclass of \exception{SAXException} raised when a SAX
\class{XMLReader} is asked to enable a feature that is not
supported, or to set a property to a value that the implementation
does not support. SAX applications and extensions may use this
class for similar purposes.
\end{excclassdesc}
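
The difference between the two exceptions can be sketched as follows. The example assumes Python 3 and that \function{make_parser()} returns the expat-based reader, which is non-validating; a different reader could behave differently. The feature URI in the first call is a made-up name used only to trigger the error.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_validation

reader = xml.sax.make_parser()

# An unknown feature name: the reader does not recognize it at all.
not_recognized = False
try:
    reader.getFeature("http://example.org/sax/features/no-such-feature")
except SAXNotRecognizedException:
    not_recognized = True

# A known feature that this reader cannot enable (assumes the expat reader).
not_supported = False
try:
    reader.setFeature(feature_validation, True)
except SAXNotSupportedException:
    not_supported = True

print(not_recognized, not_supported)
```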


\begin{seealso}
\seetitle[http://www.saxproject.org/]{SAX: The Simple API for
XML}{This site is the focal point for the definition of
the SAX API. It provides a Java implementation and online
documentation. Links to implementations and historical
information are also available.}

\seemodule{xml.sax.handler}{Definitions of the interfaces for
application-provided objects.}

\seemodule{xml.sax.saxutils}{Convenience functions for use in SAX
applications.}

\seemodule{xml.sax.xmlreader}{Definitions of the interfaces for
parser-provided objects.}
\end{seealso}


\subsection{SAXException Objects \label{sax-exception-objects}}

The \class{SAXException} exception class supports the following
methods:

\begin{methoddesc}[SAXException]{getMessage}{}
Return a human-readable message describing the error condition.
\end{methoddesc}

\begin{methoddesc}[SAXException]{getException}{}
Return an encapsulated exception object, or \code{None}.
\end{methoddesc}
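
A brief sketch of both methods, assuming Python 3. The messages and the wrapped \exception{KeyError} are invented for the example.

```python
from xml.sax import SAXException

cause = KeyError("encoding")
error = SAXException("could not configure reader", cause)

print(error.getMessage())            # could not configure reader
print(error.getException() is cause)  # True

# Without a second argument there is no encapsulated exception.
plain = SAXException("no underlying exception")
print(plain.getException())          # None
```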