\documentclass{howto}

\title{Python Advocacy HOWTO}

\release{0.03}

\author{A.M. Kuchling}
\authoraddress{\email{[email protected]}}

\begin{document}
\maketitle

\begin{abstract}
\noindent
It's usually difficult to get your management to accept open source
software, and Python is no exception to this rule. This document
discusses reasons to use Python, strategies for winning acceptance,
facts and arguments you can use, and cases where you \emph{shouldn't}
try to use Python.

This document is available from the Python HOWTO page at
\url{http://www.python.org/doc/howto}.

\end{abstract}

\tableofcontents

\section{Reasons to Use Python}

There are several reasons to incorporate a scripting language into
your development process. This section discusses them, and explains
which properties make Python a particularly good choice.

\subsection{Programmability}

Programs are often organized in a modular fashion. Lower-level
operations are grouped together and called by higher-level functions,
which may in turn be used as basic operations by still higher
levels.

For example, the lowest level might define a
set of functions for accessing a hash table. The next level might use
hash tables to store the headers of a mail message, mapping a header
name like \samp{Date} to a value such as \samp{Tue, 13 May 1997
20:00:54 -0400}. A yet higher level may operate on message objects,
without knowing or caring that message headers are stored in a hash
table, and so forth.

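The layering described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the names
\code{parse\_headers()} and \code{Message} are hypothetical, and
Python's built-in dictionary stands in for the low-level hash table.

```python
from email.utils import parsedate_to_datetime

# Lowest level: a hash table. Python's built-in dict plays this role.

def parse_headers(raw_lines):
    """Middle level (hypothetical helper): map header names to values."""
    headers = {}
    for line in raw_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


class Message:
    """Highest level (hypothetical class): callers see a message,
    not the hash table that stores its headers."""

    def __init__(self, raw_lines):
        self._headers = parse_headers(raw_lines)

    def date(self):
        # Convert the Date header string into a datetime object.
        return parsedate_to_datetime(self._headers["Date"])


msg = Message(["Date: Tue, 13 May 1997 20:00:54 -0400", "Subject: Hello"])
```

Code that calls \code{msg.date()} never touches the dictionary
directly, so the storage could be changed without affecting the
higher levels.
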
Often, the lowest levels do very simple things; they implement a data
structure such as a binary tree or hash table, or they perform some
simple computation, such as converting a date string to a number. The
higher levels then contain logic connecting these primitive
operations. Using this approach, the primitives can be seen as basic
building blocks which are then glued together to produce the complete
product.

Why is this design approach relevant to Python? Because Python is
well suited to functioning as such a glue language. A common approach
is to write a Python module that implements the lower-level
operations; for the sake of speed, the implementation might be in C,
Java, or even Fortran. Once the primitives are available to Python
programs, the logic underlying higher-level operations is written in
the form of Python code. The high-level logic is then more
understandable, and easier to modify.

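As a small illustration of the glue idea, the sketch below uses the
standard \module{zlib} module, which in CPython is a C extension
wrapping the zlib compression library. The functions
\code{pack\_records()} and \code{unpack\_records()} are hypothetical
names chosen for this example; they hold the high-level logic, while
the compression itself runs in C.

```python
import zlib  # in CPython, a C extension wrapping the zlib library


def pack_records(records):
    """High-level logic in Python; the compression runs in C."""
    payload = "\n".join(records).encode("utf-8")
    return zlib.compress(payload)


def unpack_records(blob):
    """Reverse of pack_records(): decompress, then split into records."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


records = ["alpha", "beta", "gamma"]
blob = pack_records(records)
```

Changing the record format only means editing the Python functions;
the compiled primitive stays untouched.
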
John Ousterhout's paper ``Scripting: Higher Level Programming for the
21st Century'' explains this idea at greater length. I recommend that
you read this paper; see the references for the URL. Ousterhout is
the inventor of the Tcl language, and therefore argues that Tcl should
be used for this purpose; he only briefly refers to other languages
such as Python, Perl, and Lisp/Scheme. In reality, Ousterhout's
argument applies to scripting languages in general, since you could
equally write extensions for any of the languages mentioned above.

\subsection{Prototyping}

In \emph{The Mythical Man-Month}, Frederick Brooks suggests the
following rule when planning software projects: ``Plan to throw one
away; you will anyway.'' Brooks is saying that the first attempt at a
software design often turns out to be wrong; unless the problem is
very simple or you're an extremely good designer, you'll find that new
requirements and features become apparent once development has
actually started. If these new requirements can't be cleanly
incorporated into the program's structure, you're presented with two
unpleasant choices: hammer the new features into the program somehow,
or scrap everything and write a new version of the program, taking the
new features into account from the beginning.

Python provides you with a good environment for quickly developing an
initial prototype. That lets you get the overall program structure
and logic right, and you can fine-tune small details in the fast
development cycle that Python provides. Once you're satisfied with
the user interface or program output, you can translate the Python code
into C++, Fortran, Java, or some other compiled language.

Prototyping this way means you have to be careful not to use too many
Python features that are hard to implement in your other language.
For example, using \code{eval()}, regular expressions, or the
\module{pickle} module means that you're going to need C or Java
libraries for formula evaluation, regular expressions, and
serialization. But it's not hard to avoid such tricky code, and in
the end the translation usually isn't very difficult. The resulting
code can be rapidly debugged, because any serious logical errors will
have been removed from the prototype, leaving only more minor slip-ups
in the translation to track down.

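To make the warning concrete, here is a short sketch of the three
features mentioned above. The variable names and sample values are
made up for illustration; each comment notes what a port to C or Java
would have to supply instead.

```python
import pickle
import re

# eval(): formula evaluation. A port needs an expression parser.
# (eval() on untrusted input is unsafe; treat it as a prototype shortcut.)
area = eval("width * height", {"__builtins__": {}}, {"width": 3, "height": 4})

# Regular expressions: a port needs a regex library.
match = re.match(r"(\d+)x(\d+)", "4000x4000")
rows, cols = int(match.group(1)), int(match.group(2))

# pickle: a port needs its own serialization format.
restored = pickle.loads(pickle.dumps({"rows": rows, "cols": cols}))
```

Each of these is a single line in the prototype, which is exactly why
it is easy to overlook how much work it represents in the translated
version.
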
This strategy builds on the earlier discussion of programmability.
Using Python as glue to connect lower-level components has obvious
relevance for constructing prototype systems. In this way Python can
help you with development, even if end users never come in contact
with Python code at all. If the performance of the Python version is
adequate and corporate politics allow it, you may not need to do a
translation into C or Java, but it can still be faster to develop a
prototype and then translate it, instead of attempting to produce the
final version immediately.

One example of this development strategy is Microsoft Merchant Server.
Version 1.0 was written in pure Python, by a company that was
subsequently purchased by Microsoft. Version 2.0 began to translate
the code into \Cpp, shipping with some \Cpp{} code and some Python
code. Version 3.0 didn't contain any Python at all; all the code had
been translated into \Cpp. Even though the product doesn't contain a
Python interpreter, the Python language has still served a useful
purpose by speeding up development.

This is a very common use for Python. Past conference papers have
also described this approach for developing high-level numerical
algorithms; see David M. Beazley and Peter S. Lomdahl's paper
``Feeding a Large-scale Physics Application to Python'' in the
references for a good example. If an algorithm's basic operations are
things like ``Take the inverse of this 4000x4000 matrix'', and are
implemented in some lower-level language, then Python has almost no
additional performance cost; the extra time required for Python to
evaluate an expression like \code{m.invert()} is dwarfed by the cost
of the actual computation. This approach is also particularly good
for applications where seemingly endless tweaking is required to get
things right; GUIs and Web sites are prime examples.

The Python code is also shorter and faster to write (once you're
familiar with Python), so it's easier to throw it away if you decide
your approach was wrong; if you'd spent two weeks working on it
instead of just two hours, you might waste time trying to patch up
what you've got out of a natural reluctance to admit that those two
weeks were wasted. Truthfully, those two weeks haven't been wasted,
since you've learnt something about the problem and the technology
you're using to solve it, but it's human nature to view this as a
failure of some sort.

\subsection{Simplicity and Ease of Understanding}

Python is definitely \emph{not} a toy language that's only usable for
small tasks. The language features are general and powerful enough to
enable it to be used for many different purposes. It's useful at the
small end, for 10- or 20-line scripts, but it also scales up to larger
systems that contain thousands of lines of code.

However, this expressiveness doesn't come at the cost of an obscure or
tricky syntax. While Python has some dark corners that can lead to
obscure code, there are relatively few such corners, and proper design
can isolate their use to only a few classes or modules. It's
certainly possible to write confusing code by using too many features
with too little concern for clarity, but most Python code can look a
lot like a slightly formalized version of human-understandable
pseudocode.

In \emph{The New Hacker's Dictionary}, Eric S. Raymond gives the following
definition for ``compact'':

\begin{quotation}
Compact \emph{adj.} Of a design, describes the valuable property
that it can all be apprehended at once in one's head. This
generally means the thing created from the design can be used
with greater facility and fewer errors than an equivalent tool
that is not compact. Compactness does not imply triviality or
lack of power; for example, C is compact and FORTRAN is not,
but C is more powerful than FORTRAN. Designs become
non-compact through accreting features and cruft that don't
merge cleanly into the overall design scheme (thus, some fans
of Classic C maintain that ANSI C is no longer compact).
\end{quotation}

(From \url{http://www.catb.org/~esr/jargon/html/C/compact.html})

In this sense of the word, Python is quite compact, because the
language has just a few ideas, which are used in lots of places. Take
namespaces, for example. Import a module with \code{import math}, and
you create a new namespace called \samp{math}. Classes are also
namespaces that share many of the properties of modules, and have a
few of their own; for example, you can create instances of a class.
Instances? They're yet another namespace. Namespaces are currently
implemented as Python dictionaries, so they have the same methods as
the standard dictionary data type: \code{.keys()} returns all the
keys, and so forth.

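The following sketch shows the three kinds of namespace side by side.
The class \code{Point} and its attributes are hypothetical; the
built-in \code{vars()} function returns the mapping behind each
namespace. Note that in current CPython a class namespace is exposed
through a read-only mapping proxy rather than a plain dictionary, but
it still supports \code{.keys()}.

```python
import math


class Point:
    dimensions = 2          # stored in the class namespace


p = Point()
p.x = 1.5                   # stored in the instance namespace

# The same dictionary-style method works on all three namespaces.
module_names = vars(math).keys()
class_names = vars(Point).keys()
instance_names = vars(p).keys()
```

Here \code{module\_names} includes names such as \samp{pi},
\code{class\_names} includes \samp{dimensions}, and
\code{instance\_names} contains only \samp{x}.
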
This simplicity arises from Python's development history. The
language syntax derives from different sources; ABC, a relatively
obscure teaching language, is one primary influence, and Modula-3 is
another. (For more information about ABC and Modula-3, consult their
respective Web sites at \url{http://www.cwi.nl/~steven/abc/} and
\url{http://www.m3.org}.) Other features have come from C, Icon,
Algol-68, and even Perl. Python hasn't really innovated very much,
but instead has tried to keep the language small and easy to learn,
building on ideas that have been tried in other languages and found
useful.

Simplicity is a virtue that should not be underestimated. It lets you
learn the language more quickly, and then rapidly write code that
often works the first time you run it.

\subsection{Java Integration}

If you're working with Java, Jython