% XXX Label can't be _ast?
% XXX Where should this section/chapter go?
\chapter{Abstract Syntax Trees\label{ast}}

\sectionauthor{Martin v. L\"owis}{[email protected]}

\versionadded{2.5}

The \code{_ast} module helps Python applications process
trees of the Python abstract syntax grammar. The Python compiler
currently provides read-only access to such trees, meaning that
applications can only create a tree for a given piece of Python
source code; generating byte code from a (potentially modified)
tree is not supported. The abstract syntax itself might change with
each Python release; this module helps to find out programmatically
what the current grammar looks like.

An abstract syntax tree can be generated by passing \code{_ast.PyCF_ONLY_AST}
as a flag to the built-in \function{compile()} function. The result is a tree
of objects whose classes all inherit from \code{_ast.AST}.

The actual classes are derived from the \code{Parser/Python.asdl} file,
which is reproduced below. There is one class defined for each left-hand
side symbol in the abstract grammar (for example, \code{_ast.stmt} or \code{_ast.expr}).
In addition, there is one class defined for each constructor on the
right-hand side; these classes inherit from the classes for the left-hand
side trees. For example, \code{_ast.BinOp} inherits from \code{_ast.expr}.
For production rules with alternatives (also known as ``sums''), the left-hand
side class is abstract: only instances of specific constructor nodes are ever
created.
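
The inheritance relationships described above can be checked directly.
This is only a sketch; the parsed expression \code{a + b} is an arbitrary
example.

```python
import _ast

# Constructor classes inherit from the class of their left-hand side symbol.
binop_is_expr = issubclass(_ast.BinOp, _ast.expr)
expr_is_ast = issubclass(_ast.expr, _ast.AST)

# Parsing yields constructor nodes such as BinOp; the "sum" class expr
# is a base class of the node, not its exact type.
tree = compile("a + b", "<example>", "eval", _ast.PyCF_ONLY_AST)
node = tree.body
exact_type_name = type(node).__name__
node_is_expr = isinstance(node, _ast.expr)

print(exact_type_name)
```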

Each concrete class has an attribute \code{_fields} that gives the
names of all child nodes.

Each instance of a concrete class has one attribute for each child node,
of the type as defined in the grammar. For example, \code{_ast.BinOp}
instances have an attribute \code{left} of type \code{_ast.expr}.
Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also
have \code{lineno} and \code{col_offset} attributes. The \code{lineno} is
the line number of the source text (1-indexed, so the first line is line 1),
and the \code{col_offset} is the UTF-8 byte offset of the first token that
generated the node. The UTF-8 offset is recorded because the parser uses
UTF-8 internally.
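
The position attributes can be read as in the following sketch. The
two-line source text is an arbitrary example; it is pure ASCII, so the
UTF-8 byte offsets coincide with character columns here.

```python
import _ast

# Illustrative two-line module; any source text would do.
source = "x = 1\ny = x + 2\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

second = tree.body[1]   # the statement "y = x + 2"
binop = second.value    # the expression "x + 2"

# Collect (lineno, col_offset) for the statement and for its expression.
positions = [(second.lineno, second.col_offset),
             (binop.lineno, binop.col_offset)]

print(positions)
```

The statement starts at offset 0 of line 2, and the expression starts at
offset 4, where its first token \code{x} appears.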

If these attributes are marked as optional in the grammar (using a
question mark), the value might be \code{None}. If the attributes
can have zero or more values (marked with an asterisk), the
values are represented as Python lists.
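
Both cases can be observed in a small tree. The sketch below relies on two
rules from the grammar, \code{Module(stmt* body)} and
\code{Return(expr? value)}; the function \code{f} in the source text is an
arbitrary example.

```python
import _ast

# A function whose return statement has no value.
source = "def f():\n    return\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

# Module(stmt* body): a zero-or-more field is a Python list.
body_is_list = isinstance(tree.body, list)

# Return(expr? value): an optional field is None when omitted in the source.
return_stmt = tree.body[0].body[0]
value_is_none = return_stmt.value is None

print(body_is_list)
print(value_is_none)
```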

\section{Abstract Grammar}

The module defines a string constant \code{__version__} which
is the decimal Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

\verbatiminput{../../Parser/Python.asdl}