# $NetBSD: TOUR,v 1.8 1996/10/16 14:24:56 christos Exp $
# @(#)TOUR 8.1 (Berkeley) 5/31/93

NOTE -- This is the original TOUR paper distributed with ash and
does not represent the current state of the shell. It is provided anyway
since it provides helpful information for how the shell is structured,
but be warned that things have changed -- the current shell is
still under development.

================================================================

                       A Tour through Ash

               Copyright 1989 by Kenneth Almquist.


DIRECTORIES: The subdirectory bltin contains commands which can
be compiled stand-alone. The rest of the source is in the main
ash directory.

SOURCE CODE GENERATORS: Files whose names begin with "mk" are
programs that generate source code. A complete list of these
programs is:

        program         input files         generates
        -------         -----------         ---------
        mkbuiltins      builtins            builtins.h builtins.c
        mkinit          *.c                 init.c
        mknodes         nodetypes           nodes.h nodes.c
        mksignames      -                   signames.h signames.c
        mksyntax        -                   syntax.h syntax.c
        mktokens        -                   token.h
        bltin/mkexpr    unary_op binary_op  operators.h operators.c

There are undoubtedly too many of these. Mkinit searches all the
C source files for entries looking like:

        INIT {
              x = 1;    /* executed during initialization */
        }

        RESET {
              x = 2;    /* executed when the shell does a longjmp
                           back to the main command loop */
        }

        SHELLPROC {
              x = 3;    /* executed when the shell runs a shell procedure */
        }

It pulls this code out into routines which are called when
particular events occur. The intent is to improve modularity by
isolating the information about which modules need to be explicitly
initialized/reset within the modules themselves.

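As a rough illustration of the idea, the generated init.c could
collect the bodies of the three kinds of blocks into one routine
per event. This is only a sketch: the routine names init, reset
and initshellproc, and the demo function, are assumptions made for
the example and are not taken from the text above.

```c
#include <assert.h>

/* Hypothetical module state, standing in for the x used in the
 * INIT, RESET and SHELLPROC examples. */
static int x;

/* Would hold the body of every INIT { ... } block found in *.c. */
void init(void) {
      { x = 1; }
}

/* Would hold the body of every RESET { ... } block. */
void reset(void) {
      { x = 2; }
}

/* Would hold the body of every SHELLPROC { ... } block. */
void initshellproc(void) {
      { x = 3; }
}

/* Walk through the three events in order and check each effect. */
void demo(void) {
      init();          assert(x == 1);   /* shell start-up */
      reset();         assert(x == 2);   /* back in the main loop */
      initshellproc(); assert(x == 3);   /* running a shell procedure */
}
```

Each module keeps its own block next to the data it touches, and
the callers only need to know about the three collected routines.
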
Mkinit recognizes several constructs for placing declarations in
the init.c file.
        INCLUDE "file.h"
includes a file. The storage class MKINIT makes a declaration
available in the init.c file, for example:
        MKINIT int funcnest;    /* depth of function calls */
MKINIT alone on a line introduces a structure or union
declaration:
        MKINIT
        struct redirtab {
              short renamed[10];
        };
Preprocessor #define statements are copied to init.c without any
special action to request this.

INDENTATION: The ash source is indented in multiples of six
spaces. The only study that I have heard of on the subject
concluded that the optimal amount to indent is in the range of four
to six spaces. I use six spaces since it is not too big a jump
from the widely used eight spaces. If you really hate six space
indentation, use the adjind (source included) program to change
it to something else.

EXCEPTIONS: Code for dealing with exceptions appears in
exceptions.c. The C language doesn't include exception handling,
so I implement it using setjmp and longjmp. The global variable
exception contains the type of exception. EXERROR is raised by
calling error. EXINT is an interrupt. EXSHELLPROC is an
exception which is raised when a shell procedure is invoked. The
purpose of EXSHELLPROC is to perform the cleanup actions associated
with other exceptions. After these cleanup actions, the shell
can interpret a shell procedure itself without exec'ing a new
copy of the shell.

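The setjmp/longjmp technique can be sketched as follows. This is
a minimal illustration, not the code in the shell: the numeric
values of the exception types, the handler pointer, and the
functions raise_exception, sh_error, returns_normally and
run_guarded are all assumptions made for the example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Exception types named in the text; the values are assumed. */
enum { EXINT = 0, EXERROR = 1, EXSHELLPROC = 2 };

int exception;                /* type of the exception being raised */
static jmp_buf *handler;      /* innermost active handler (assumed) */

/* Record the exception type and jump to the active handler. */
void raise_exception(int e) {
      assert(handler != NULL);
      exception = e;
      longjmp(*handler, 1);
}

/* Stand-in for the shell's error routine: raises EXERROR. */
void sh_error(void) {
      raise_exception(EXERROR);
}

/* A routine that completes without raising anything. */
void returns_normally(void) {
}

/* Run fn with a handler installed.  Returns -1 if fn finished
 * normally, otherwise the type of the exception it raised. */
int run_guarded(void (*fn)(void)) {
      jmp_buf jmploc;
      jmp_buf *saved = handler;     /* allow handlers to nest */
      int result = -1;

      if (setjmp(jmploc) == 0) {
            handler = &jmploc;
            fn();
      } else {
            result = exception;     /* reached via longjmp */
      }
      handler = saved;
      return result;
}
```

Here run_guarded(sh_error) yields EXERROR, while
run_guarded(returns_normally) yields -1. Saving and restoring the
handler pointer is what lets an inner routine install its own
handler and then hand control back to the outer one.
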
INTERRUPTS: In an interactive shell, an interrupt will cause an
EXINT exception to return to the main command loop. (Exception:
EXINT is not raised if the user traps interrupts using the trap
command.) The INTOFF and INTON macros (defined in exception.h)
provide uninterruptible critical sections. Between the execution
of INTOFF and the execution of INTON, interrupt signals will be
held for later delivery. INTOFF and INTON can be nested.

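One way to get nestable critical sections is a depth counter plus
a pending flag. The sketch below assumes the variable names
suppressint and intpending and the helpers deliver_interrupt and
on_interrupt; none of them are given in the text, and a counter
named delivered stands in for actually raising EXINT.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t suppressint;  /* INTOFF nesting depth */
static volatile sig_atomic_t intpending;   /* interrupt being held */
static int delivered;                      /* interrupts acted upon */

/* The real shell would raise EXINT here; the sketch just counts. */
static void deliver_interrupt(void) {
      intpending = 0;
      delivered++;
}

#define INTOFF  (suppressint++)
#define INTON   do { \
            assert(suppressint > 0); \
            if (--suppressint == 0 && intpending) \
                  deliver_interrupt(); \
      } while (0)

/* What the signal handler would do when an interrupt arrives. */
void on_interrupt(void) {
      if (suppressint > 0)
            intpending = 1;         /* hold for later delivery */
      else
            deliver_interrupt();
}
```

With two nested INTOFF calls, an interrupt that arrives in between
is delivered only when the second INTON brings the depth back to
zero.
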
MEMALLOC.C: Memalloc.c defines versions of malloc and realloc
which call error when there is no memory left. It also defines a
stack oriented memory allocation scheme. Allocating off a stack
is probably more efficient than allocation using malloc, but the
big advantage is that when an exception occurs all we have to do
to free up the memory in use at the time of the exception is to
restore the stack pointer. The stack is implemented using a
linked list of blocks.

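A stack allocator built on a linked list of blocks might look
roughly like this. It is a sketch under assumptions: the names
stalloc, setstackmark, popstackmark, struct stack_block and
struct stackmark, the 512-byte block size, and the use of abort
in place of the shell's error routine are all chosen for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCKSIZE 512               /* assumed minimum block size */
#define ALIGN(n)  (((n) + sizeof(void *) - 1) & ~(sizeof(void *) - 1))

struct stack_block {
      struct stack_block *prev;     /* block allocated before this one */
      size_t size;                  /* usable bytes in space[] */
      char space[];                 /* storage handed out by stalloc */
};

struct stackmark {                  /* a saved stack pointer */
      struct stack_block *block;
      size_t used;
};

static struct stack_block *curblock;
static size_t curused;              /* bytes used in curblock */

/* Allocate n bytes (pointer-aligned) from the top of the stack. */
void *stalloc(size_t n) {
      n = ALIGN(n);
      if (curblock == NULL || curblock->size - curused < n) {
            size_t size = n > BLOCKSIZE ? n : BLOCKSIZE;
            struct stack_block *b = malloc(sizeof *b + size);
            if (b == NULL)
                  abort();          /* the shell would call error */
            b->prev = curblock;
            b->size = size;
            curblock = b;
            curused = 0;
      }
      void *p = curblock->space + curused;
      curused += n;
      return p;
}

/* Remember the current top of the stack. */
void setstackmark(struct stackmark *m) {
      m->block = curblock;
      m->used = curused;
}

/* Free everything allocated since the mark was set. */
void popstackmark(const struct stackmark *m) {
      while (curblock != m->block) {
            struct stack_block *b = curblock;
            curblock = b->prev;
            free(b);
      }
      curused = m->used;
}
```

An exception handler only has to call popstackmark with a mark
taken before the failed operation; every allocation made after
that point is released without tracking the individual pointers.
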
STPUTC: If the stack were contiguous, it would be easy to store
strings on the stack without knowing in advance how long the
string was going to be:
        p = stackptr;
        *p++ = c;               /* repeated as many times as needed */
        stackptr = p;
The following three macros (defined in memalloc.h) perform these
operations, but grow the stack if you run off the end:
        STARTSTACKSTR(p);
        STPUTC(c, p);           /* repeated as many times as needed */
        grabstackstr(p);

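The growing behaviour can be illustrated with a simplified
stand-in. The sketch below is not the memalloc.h implementation:
it keeps the string in a single realloc'd buffer instead of the
block stack, grabstackstr is written as a function that hands the
finished buffer to the caller, and stackbase, stacksize and
growstackstr are names assumed for the example.

```c
#include <assert.h>
#include <stdlib.h>

static char *stackbase;       /* start of the string being built */
static size_t stacksize;      /* bytes available at stackbase */

/* Enlarge the buffer; return the position that corresponds to p. */
char *growstackstr(char *p) {
      size_t len = stackbase ? (size_t)(p - stackbase) : 0;
      size_t newsize = stacksize ? stacksize * 2 : 16;
      char *nb = realloc(stackbase, newsize);
      if (nb == NULL)
            abort();          /* the shell would call error */
      stackbase = nb;
      stacksize = newsize;
      return nb + len;
}

#define STARTSTACKSTR(p) \
      ((p) = stackbase ? stackbase : growstackstr(NULL))

#define STPUTC(c, p) do { \
            if ((p) == stackbase + stacksize) \
                  (p) = growstackstr(p); \
            *(p)++ = (c); \
      } while (0)

/* Finish the string: return its start and detach the buffer so the
 * next STARTSTACKSTR begins fresh.  The caller frees the result. */
char *grabstackstr(char *p) {
      char *s = stackbase;
      assert(p >= stackbase && p <= stackbase + stacksize);
      stackbase = NULL;
      stacksize = 0;
      return s;
}
```

The caller never checks for room: STPUTC compares p against the
end of the buffer and replaces p with the relocated position
whenever the buffer has to grow.
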
We now start a top-down look at the code:

MAIN.C: The main routine performs some initialization, executes
the user's profile if necessary, and calls cmdloop. Cmdloop
repeatedly parses and executes commands.

OPTIONS.C: This file contains the option processing code. It is
called from main to parse the shell arguments when the shell is
invoked, and it also contains the set builtin. The -i and -j
options (the latter turns on job control) require changes in signal
handling. The routines setjobctl (in jobs.c) and setinteractive
(in trap.c) are called to handle changes to these options.

PARSING: The parser code is all in parser.c. A recursive
descent parser is used. Syntax tables (generated by mksyntax) are
used to classify characters during lexical analysis. There are
three tables: one for normal use, one for use when inside single
quotes, and one for use when inside double quotes. The tables
are machine dependent because they are indexed by character
variables and the range of a char varies from machine to machine.

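The machine dependence comes from indexing an array with a char
that may be signed. A sketch of how such tables can be laid out
follows; the table names basesyntax, sqsyntax and dqsyntax, the
character class names, the SYNBASE offset and the helper functions
are assumptions for illustration, and the tables are filled in at
run time here rather than generated.

```c
#include <assert.h>
#include <limits.h>

/* Assumed character classes; CWORD is 0 so it is the default. */
enum { CWORD, CSPCL, CNL, CSQUOTE, CDQUOTE, CBACK, CVAR };

#define SYNBASE (-CHAR_MIN)               /* maps CHAR_MIN to index 0 */
#define NSYN    (CHAR_MAX - CHAR_MIN + 1) /* one entry per char value */

unsigned char basesyntax[NSYN];           /* normal use */
unsigned char sqsyntax[NSYN];             /* inside single quotes */
unsigned char dqsyntax[NSYN];             /* inside double quotes */

void init_syntax(void) {
      const char *special = ";&|<>()";
      assert(NSYN > 0);
      for (const char *p = special; *p != '\0'; p++)
            basesyntax[SYNBASE + *p] = CSPCL;
      basesyntax[SYNBASE + '\n'] = CNL;
      basesyntax[SYNBASE + '\''] = CSQUOTE;
      basesyntax[SYNBASE + '"'] = CDQUOTE;
      basesyntax[SYNBASE + '\\'] = CBACK;
      basesyntax[SYNBASE + '$'] = CVAR;

      sqsyntax[SYNBASE + '\''] = CSQUOTE; /* only ' is special */

      dqsyntax[SYNBASE + '"'] = CDQUOTE;
      dqsyntax[SYNBASE + '\\'] = CBACK;
      dqsyntax[SYNBASE + '$'] = CVAR;
}

/* Classify c; the offset keeps the index in range even where char
 * is signed and c is negative. */
int classify(const unsigned char *table, char c) {
      return table[SYNBASE + c];
}
```

The lexer can then switch tables when it enters or leaves quotes
instead of testing the quoting state for every character: '$' is
CVAR in basesyntax and dqsyntax but an ordinary CWORD in sqsyntax.
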
PARSE OUTPUT: The output of the parser consists of a tree of
nodes. The various types of nodes are defined in the file
nodetypes.

Nodes of type NARG are used to represent both words and the
contents of here documents. An early version of ash kept the
contents of here documents in temporary files, but keeping here
documents in memory typically results in significantly better
performance. It would have been nice to make it an option to use
temporary files for here documents, for the benefit of small
machines, but the code to keep track of when to delete the
temporary files was complex and I never fixed all the bugs in it.
(AT&T has been maintaining the Bourne shell for more than ten
years, and to the best of my knowledge they still haven't gotten
it to handle temporary files correctly in obscure cases.)

The text field of a NARG structure points to the text of the
word. The text consists of ordinary characters and a number of
special codes defined in parser.h. The special codes are:

        CTLVAR              Variable substitution
        CTLENDVAR           End of variable substitution
        CTLBACKQ            Command substitution
        CTLBACKQ|CTLQUOTE   Command substitution inside double quotes
        CTLESC              Escape next character

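To make the role of CTLESC concrete, the sketch below removes the
escape codes from a word's text and keeps the escaped characters
as literals. The value given to CTLESC and the function name
rmescapes are assumptions for the example; the actual definitions
are the ones in parser.h.

```c
#include <assert.h>
#include <stddef.h>

#define CTLESC '\201'         /* assumed value, for illustration */

/* Copy text to out, dropping each CTLESC and keeping the character
 * that follows it.  out must have room for the whole text plus the
 * terminating NUL.  Returns the length of the result. */
size_t rmescapes(const char *text, char *out) {
      size_t n = 0;

      assert(text != NULL && out != NULL);
      while (*text != '\0') {
            if (*text == CTLESC && text[1] != '\0')
                  text++;     /* skip the escape, keep what follows */
            out[n++] = *text++;
      }
      out[n] = '\0';
      return n;
}
```

Because the escaped character is copied without being examined, a
character that would otherwise be read as one of the special codes
survives as ordinary text.
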
A variable substitution contains the following elements:
