# $NetBSD: TOUR,v 1.8 1996/10/16 14:24:56 christos Exp $
# @(#)TOUR 8.1 (Berkeley) 5/31/93

NOTE -- This is the original TOUR paper distributed with ash and
does not represent the current state of the shell. It is included
anyway because it explains how the shell is structured, but be
warned that things have changed -- the current shell is still
under development.

================================================================

                       A Tour through Ash

              Copyright 1989 by Kenneth Almquist.


DIRECTORIES: The subdirectory bltin contains commands which can
be compiled stand-alone. The rest of the source is in the main
ash directory.

SOURCE CODE GENERATORS: Files whose names begin with "mk" are
programs that generate source code. A complete list of these
programs is:

        program         input files         generates
        -------         -----------         ---------
        mkbuiltins      builtins            builtins.h builtins.c
        mkinit          *.c                 init.c
        mknodes         nodetypes           nodes.h nodes.c
        mksignames      -                   signames.h signames.c
        mksyntax        -                   syntax.h syntax.c
        mktokens        -                   token.h
        bltin/mkexpr    unary_op binary_op  operators.h operators.c

There are undoubtedly too many of these. Mkinit searches all the
C source files for entries looking like:

        INIT {
              x = 1;    /* executed during initialization */
        }

        RESET {
              x = 2;    /* executed when the shell does a longjmp
                           back to the main command loop */
        }

        SHELLPROC {
              x = 3;    /* executed when the shell runs a shell procedure */
        }

It pulls this code out into routines which are called when
particular events occur. The intent is to improve modularity by
isolating the information about which modules need to be
explicitly initialized/reset within the modules themselves.
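
To make the effect concrete, here is a minimal sketch of what the
collected code could look like once the three fragments above have
been gathered into init.c. The routine names init, reset, and
initshellproc are assumptions made for the illustration; the text
above does not name them, and real mkinit output may differ.

```c
/* Sketch of a generated init.c. The routine names are assumed. */

int x;  /* stands in for state owned by some module */

/* Gathered from every INIT { ... } block: run once at startup. */
void init(void)
{
    x = 1;
}

/* Gathered from every RESET { ... } block: run after a longjmp
 * back to the main command loop. */
void reset(void)
{
    x = 2;
}

/* Gathered from every SHELLPROC { ... } block: run when the shell
 * starts interpreting a shell procedure. */
void initshellproc(void)
{
    x = 3;
}
```

Each module keeps its initialization code next to the data it
initializes; only the generated file has to know the full list.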

Mkinit recognizes several constructs for placing declarations in
the init.c file.
        INCLUDE "file.h"
includes a file. The storage class MKINIT makes a declaration
available in the init.c file, for example:
        MKINIT int funcnest;    /* depth of function calls */
MKINIT alone on a line introduces a structure or union
declaration:
        MKINIT
        struct redirtab {
              short renamed[10];
        };
Preprocessor #define statements are copied to init.c without any
special action to request this.

INDENTATION: The ash source is indented in multiples of six
spaces. The only study that I have heard of on the subject
concluded that the optimal amount to indent is in the range of
four to six spaces. I use six spaces since it is not too big a
jump from the widely used eight spaces. If you really hate
six-space indentation, use the adjind program (source included)
to change it to something else.

EXCEPTIONS: Code for dealing with exceptions appears in
exceptions.c. The C language doesn't include exception handling,
so I implement it using setjmp and longjmp. The global variable
exception contains the type of exception. EXERROR is raised by
calling error. EXINT is an interrupt. EXSHELLPROC is an
exception which is raised when a shell procedure is invoked. The
purpose of EXSHELLPROC is to perform the cleanup actions
associated with other exceptions. After these cleanup actions,
the shell can interpret a shell procedure itself without
exec'ing a new copy of the shell.
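
The following is a minimal sketch of the setjmp/longjmp technique
described above, not the code from exceptions.c. The names jmploc,
handler, exraise, risky, and try_risky, and the numeric values of
the exception types, are assumptions made for the example.

```c
#include <setjmp.h>
#include <stddef.h>

/* Exception types named in the text; the values are assumed. */
#define EXINT       0
#define EXERROR     1
#define EXSHELLPROC 2

struct jmploc {
    jmp_buf loc;
};

struct jmploc *handler;  /* innermost active handler, if any */
volatile int exception;  /* type of the exception being raised */

/* Raise an exception: record its type, jump to the handler. */
void exraise(int e)
{
    exception = e;
    longjmp(handler->loc, 1);
}

/* Hypothetical routine that may fail part way through. */
void risky(int fail)
{
    if (fail)
        exraise(EXERROR);
}

/* Run risky() under a handler. Returns -1 if it completed, or the
 * type of the exception it raised. */
int try_risky(int fail)
{
    struct jmploc jmploc;
    struct jmploc *saved = handler;
    int result = -1;

    if (setjmp(jmploc.loc)) {
        result = exception;  /* reached through longjmp */
    } else {
        handler = &jmploc;
        risky(fail);
    }
    handler = saved;         /* reinstate the outer handler */
    return result;
}
```

Saving and restoring the previous handler lets handlers nest, so
an inner routine can clean up and the outer one still receives
control on a later exception.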

INTERRUPTS: In an interactive shell, an interrupt will cause an
EXINT exception to return to the main command loop. (Exception:
EXINT is not raised if the user traps interrupts using the trap
command.) The INTOFF and INTON macros (defined in exception.h)
provide uninterruptible critical sections. Between the execution
of INTOFF and the execution of INTON, interrupt signals will be
held for later delivery. INTOFF and INTON can be nested.
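
One way to get nestable critical sections is a depth counter plus
a pending flag. The sketch below illustrates that idea only; the
variables suppressint, intpending, and delivered, the function
onint, and the macro bodies are assumptions and are not quoted
from exception.h.

```c
int suppressint;  /* nesting depth of INTOFF */
int intpending;   /* set if an interrupt arrived while held off */
int delivered;    /* counts interrupts actually acted on */

/* Called when an interrupt arrives, and again from INTON when a
 * held interrupt can finally be delivered. */
void onint(void)
{
    if (suppressint > 0) {
        intpending = 1;  /* inside a critical section: hold it */
        return;
    }
    intpending = 0;
    delivered++;         /* a real shell would raise EXINT here */
}

#define INTOFF (suppressint++)
#define INTON                                          \
    do {                                               \
        if (--suppressint == 0 && intpending)          \
            onint();                                   \
    } while (0)
```

Because only the outermost INTON brings the counter back to zero,
a routine that uses INTOFF/INTON can safely be called from inside
another critical section.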

MEMALLOC.C: Memalloc.c defines versions of malloc and realloc
which call error when there is no memory left. It also defines a
stack-oriented memory allocation scheme. Allocating off a stack
is probably more efficient than allocating with malloc, but the
big advantage is that when an exception occurs all we have to do
to free up the memory in use at the time of the exception is to
restore the stack pointer. The stack is implemented using a
linked list of blocks.
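
A stack allocator built on a linked list of blocks can be sketched
as follows. This is an illustration under stated assumptions, not
the contents of memalloc.c: the names stalloc, setstackmark, and
popstackmark, the MINSIZE constant, and the use of abort in place
of the shell's error routine are all choices made for the example.

```c
#include <stdlib.h>

#define MINSIZE 504  /* assumed minimum payload per block */

struct stack_block {
    struct stack_block *prev;  /* block allocated before this one */
    char space[];              /* payload (C99 flexible array) */
};

struct stackmark {
    struct stack_block *stackp;
    char *stacknxt;
    size_t stacknleft;
};

struct stack_block *stackp;  /* newest block, or NULL */
char *stacknxt;              /* next free byte in newest block */
size_t stacknleft;           /* bytes left in newest block */

/* Allocate nbytes from the stack, adding a block if needed. */
void *stalloc(size_t nbytes)
{
    /* Round up so successive allocations stay pointer-aligned. */
    nbytes = (nbytes + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    if (nbytes > stacknleft) {
        size_t payload = nbytes > MINSIZE ? nbytes : MINSIZE;
        struct stack_block *sp = malloc(sizeof *sp + payload);
        if (sp == NULL)
            abort();  /* the shell would call error instead */
        sp->prev = stackp;
        stackp = sp;
        stacknxt = sp->space;
        stacknleft = payload;
    }
    void *p = stacknxt;
    stacknxt += nbytes;
    stacknleft -= nbytes;
    return p;
}

/* Remember the current top of the stack. */
void setstackmark(struct stackmark *mark)
{
    mark->stackp = stackp;
    mark->stacknxt = stacknxt;
    mark->stacknleft = stacknleft;
}

/* Free everything allocated since the mark was taken. */
void popstackmark(const struct stackmark *mark)
{
    while (stackp != mark->stackp) {
        struct stack_block *sp = stackp;
        stackp = sp->prev;
        free(sp);
    }
    stacknxt = mark->stacknxt;
    stacknleft = mark->stacknleft;
}
```

An exception handler only needs the mark taken before the failed
operation; popping it releases every allocation made since then,
however many there were.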

STPUTC: If the stack were contiguous, it would be easy to store
strings on the stack without knowing in advance how long the
string was going to be:
        p = stackptr;
        *p++ = c;               /* repeated as many times as needed */
        stackptr = p;
The following three macros (defined in memalloc.h) perform these
operations, but grow the stack if you run off the end:
        STARTSTACKSTR(p);
        STPUTC(c, p);           /* repeated as many times as needed */
        grabstackstr(p);
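
The grow-on-demand idea behind these macros can be sketched with
simplified stand-ins. STARTSTR, PUTC, growstr, and build below are
hypothetical names, and a single realloc'ed region replaces the
block stack, so this shows the technique rather than the actual
definitions in memalloc.h.

```c
#include <stdlib.h>

char *strbase;  /* start of the region holding the string */
char *strend;   /* one past the last usable byte */

/* Enlarge the region; return p relocated into the new storage. */
char *growstr(char *p)
{
    size_t used = strbase ? (size_t)(p - strbase) : 0;
    size_t size = strbase ? (size_t)(strend - strbase) * 2 : 16;
    char *nb = realloc(strbase, size);

    if (nb == NULL)
        abort();
    strbase = nb;
    strend = nb + size;
    return nb + used;
}

#define STARTSTR(p) ((p) = strbase)
#define PUTC(c, p)                                     \
    do {                                               \
        if ((p) == strend)                             \
            (p) = growstr(p);                          \
        *(p)++ = (c);                                  \
    } while (0)

/* Copy s one character at a time, as a lexer might, without
 * knowing its length in advance. */
char *build(const char *s)
{
    char *p;

    STARTSTR(p);
    while (*s != '\0')
        PUTC(*s++, p);
    PUTC('\0', p);
    return strbase;
}
```

The caller keeps the write pointer in a local variable, so the
common case costs one comparison and one store per character.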

We now start a top-down look at the code:

MAIN.C: The main routine performs some initialization, executes
the user's profile if necessary, and calls cmdloop. Cmdloop
repeatedly parses and executes commands.

OPTIONS.C: This file contains the option processing code. It is
called from main to parse the shell arguments when the shell is
invoked, and it also contains the set builtin. The -i and -j
options (the latter turns on job control) require changes in
signal handling. The routines setjobctl (in jobs.c) and
setinteractive (in trap.c) are called to handle changes to these
options.

PARSING: The parser code is all in parser.c. A recursive descent
parser is used. Syntax tables (generated by mksyntax) are used to
classify characters during lexical analysis. There are three
tables: one for normal use, one for use when inside single
quotes, and one for use when inside double quotes. The tables are
machine dependent because they are indexed by character variables
and the range of a char varies from machine to machine.
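
The machine dependence can be illustrated with a table that is
indexed by a plain char plus an offset. The sketch below is not
mksyntax output: the class names, the SYNBASE offset, and the
run-time initialization are assumptions made for the example (the
text says the real tables are generated ahead of time).

```c
#include <limits.h>

/* Character classes; names and values are assumed. */
enum { CWORD, CNL, CSQUOTE, CDQUOTE, CVAR };

/* Offset mapping the smallest char value to index 0. It is 0
 * where plain char is unsigned, and typically 128 where it is
 * signed. */
#define SYNBASE (-CHAR_MIN)

/* One entry per possible char value; zero-filled, so every
 * character starts out as CWORD. */
unsigned char basesyntax[CHAR_MAX - CHAR_MIN + 1];

/* Fill in the few special characters used in this sketch. */
void init_basesyntax(void)
{
    basesyntax['\n' + SYNBASE] = CNL;
    basesyntax['\'' + SYNBASE] = CSQUOTE;
    basesyntax['"' + SYNBASE] = CDQUOTE;
    basesyntax['$' + SYNBASE] = CVAR;
}

/* Classify a character with a single table lookup. */
int classify(char c)
{
    return basesyntax[c + SYNBASE];
}
```

Switching to the single-quote or double-quote table would then be
a matter of pointing the lexer at a different array.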

PARSE OUTPUT: The output of the parser consists of a tree of
nodes. The various types of nodes are defined in the file
nodetypes.

Nodes of type NARG are used to represent both words and the
contents of here documents. An early version of ash kept the
contents of here documents in temporary files, but keeping here
documents in memory typically results in significantly better
performance. It would have been nice to make it an option to use
temporary files for here documents, for the benefit of small
machines, but the code to keep track of when to delete the
temporary files was complex and I never fixed all the bugs in it.
(AT&T has been maintaining the Bourne shell for more than ten
years, and to the best of my knowledge they still haven't gotten
it to handle temporary files correctly in obscure cases.)

The text field of a NARG structure points to the text of the
word. The text consists of ordinary characters and a number of
special codes defined in parser.h. The special codes are:

        CTLVAR              Variable substitution
        CTLENDVAR           End of variable substitution
        CTLBACKQ            Command substitution
        CTLBACKQ|CTLQUOTE   Command substitution inside double quotes
        CTLESC              Escape next character
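
As an illustration of how a consumer of such text might treat
CTLESC, the sketch below strips the markers while keeping each
escaped character. The value given to CTLESC and the function name
strip_escapes are assumptions made for the example; the real
definitions are in parser.h and are not reproduced here.

```c
/* Assumed value, chosen outside the ASCII range so that it cannot
 * collide with an ordinary character in this sketch. */
#define CTLESC ((char)'\201')

/* Remove CTLESC markers in place, keeping what each one escapes. */
void strip_escapes(char *s)
{
    char *out = s;

    while (*s != '\0') {
        if (*s == CTLESC && s[1] != '\0')
            s++;  /* skip the marker, keep the next character */
        *out++ = *s++;
    }
    *out = '\0';
}
```

With this encoding, a character that would otherwise be taken for
a special code can appear literally in a word by putting CTLESC in
front of it.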

A variable substitution contains the following elements:
