source: trunk/essentials/sys-apps/gawk/NEWS@ 3264

Last change on this file since 3264 was 3076, checked in by bird, 19 years ago

gawk 3.1.5

Changes from 3.1.4 to 3.1.5
---------------------------

1. The random() suite has been updated to a current FreeBSD version, which
   works on systems with > 32-bit ints.

2. A new option, `--exec', has been added. It's like -f but ends option
   processing. It also disables `x=y' variable assignments, but not -v.
   It's needed mainly for CGI scripts, so that source code can't be
   passed in as part of the URL.

3. dfa.[ch] have been synced with GNU grep development. This also fixes
   multiple regex matching problems in multibyte locales.

4. Updated to Automake 1.9.5.

5. Updated to Bison 2.0.

6. The getopt* and regex* files were synchronized with current GLIBC CVS.
   See the ChangeLog for the versions and minor edits made.

7. `configure --disable-nls' now disables just gawk's own translations.
   Gawk continues to work with the locale's numeric formatting. This
   includes a bug fix in handling the printf ' flag (e.g., %'d).

8. Gawk is now multibyte aware. This means that index(), length(),
   substr() and match() all work in terms of characters, not bytes.

9. Gawk is now smarter about parsing numeric constants in corner cases.

10. Not closing open redirections no longer causes gawk to exit non-zero.

11. The VMS port has been updated.

12. Changes from Andrew Schorr at the xmlgawk project to provide for
    open hooks from extensions are now included. This will let the
    xmlgawk extension work in the standard gawk.

13. Updated to gettext 0.14.4. Gawk no longer includes its own copy
    of the gettext `intl' library, following current GNU practice to
    rely on there being an external version thereof.

14. With --lint, a regexp of the form `//' now generates a warning that
    it is not a C++ comment (awk.y).

15. The ^ and ^= operators with an integer exponent now use exponentiation
    by squaring. This simultaneously fixes a problem with ^= and a negative
    integer exponent.

16. length(array) now returns the number of elements in the array. This
    is a non-standard extension that will fail in POSIX mode.

17. Carriage return characters are now ignored in program source code.

18. Four new translations added.

19. Various minor bugs fixed. See the ChangeLog for the details.
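
As an illustration of the integer-exponent change to ^ and ^= noted above,
here is a minimal C sketch of exponentiation by squaring that also handles a
negative exponent. It is not gawk's source; the function name `int_pow' and
its signature are assumptions made for this example.

```c
/* Hypothetical helper, not taken from gawk: raise `base' to an integer
 * power using exponentiation by squaring, which needs only
 * O(log |exp|) multiplications. */
double int_pow(double base, long exp)
{
    /* Work on the magnitude as an unsigned value so LONG_MIN is safe. */
    unsigned long n = (exp < 0) ? 0UL - (unsigned long) exp
                                : (unsigned long) exp;
    double result = 1.0;

    while (n > 0) {
        if (n & 1UL)          /* low bit set: fold the current square in */
            result *= base;
        base *= base;         /* square for the next bit */
        n >>= 1;
    }
    /* A negative exponent is the reciprocal of the positive power. */
    return (exp < 0) ? 1.0 / result : result;
}
```

For example, int_pow(2.0, 10) yields 1024 and int_pow(2.0, -2) yields 0.25.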

Changes from 3.1.3 to 3.1.4
---------------------------

1. Gawk now supports the POSIX %F format, falling back to %f if the local
   system printf doesn't handle it.

2. Gawk now supports the ' flag in printf. E.g., %'d in a locale with thousands
   separators includes the thousands separator in the value, e.g. 12,345.

   This has one problem; the ' flag is next to impossible to use on the
   command line, without major quoting games. Oh well, TANSTAAFL.

3. The dfa code has been reinstated; the performance degradation was
   just too awful. Sigh. (For fun, use `export GAWK_NO_DFA=1' to
   see the difference.)

4. The special case `x = x y' is now recognized in the grammar, and gawk
   now uses `realloc' to append the new value to the end of the existing
   one. This can speed up the common case of appending onto a string.

5. The dfa code was upgraded with most of the fixes from grep 2.5.1, and
   the regex code was upgraded with GLIBC as of mid-January 2004. The regex
   code is faster than it was, but still not as fast as the dfa code, so
   the dfa code stays in. The getopt code was also synced to current GLIBC.

6. Support code upgraded to Automake 1.8.5, Autoconf 2.59, and gettext 0.14.1.

7. When --posix is in effect, sub/gsub now follow the 2001 POSIX behavior.
   Yippee. This is even documented in the manual.

8. Gawk will now recover children that have died (input pipelines, two-way
   pipes), upon detecting EOF from them, thus avoiding filling
   up the process table. Open file descriptors are not recovered
   (unfortunately), since that could break awk semantics. See the
   ChangeLog and the source code for the details.

9. Handling of numbers like `0,1' in non-American locales ought to
   work correctly now.

10. IGNORECASE is now locale-aware for characters with values above 128.
    The dfa matcher is now used for IGNORECASE matches too.

11. Dynamic function loading is better. The documentation has been improved
    and some new APIs for use by dynamic functions have been added.

12. Gawk now has a fighting chance of working on older systems,
    a la SunOS 4.1.x.

13. Issues with multibyte support on HP-UX are now resolved. `configure' now
    disables such support there, since it's not up to what gawk needs.

14. There are now even more tests in the test suite.

15. Various bugs fixed; see ChangeLog for the details.
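
The `x = x y' entry above describes growing the existing string with
`realloc' rather than building a fresh copy. A rough C sketch of that idea
follows. It is only an illustration under assumptions: the helper name
`append_in_place' and its error convention are invented here and are not
gawk's internal API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not gawk's code: append `tail' to a heap string
 * in place. `*dst' must be NULL or a pointer obtained from
 * malloc/realloc, and `tail' must not point into `*dst'.
 * Returns 0 on success, or -1 if memory could not be obtained, in which
 * case `*dst' is left unchanged. */
int append_in_place(char **dst, const char *tail)
{
    size_t old_len = (*dst != NULL) ? strlen(*dst) : 0;
    size_t add_len = strlen(tail);
    char *grown = realloc(*dst, old_len + add_len + 1);

    if (grown == NULL)
        return -1;
    /* Copy the new text, including its terminating NUL, after the old text. */
    memcpy(grown + old_len, tail, add_len + 1);
    *dst = grown;
    return 0;
}
```

When the allocator can extend the block where it sits, the old contents are
not copied at all, which is where the speedup for repeated appends comes from.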

Changes from 3.1.2 to 3.1.3
---------------------------

1. Gawk now follows POSIX in handling of local numeric formats for
   input, output and number/string conversions.

2. Multibyte detection improved. See README_d/README.multibyte for more
   info about multibyte locales.

3. Handling of `close' made more POSIX-compliant for POSIXLY_CORRECT;
   see the documentation.

4. The record reading code was redone, again. This time it's much
   better. Really!

5. For RS = "\n" and RS = "", gawk now only sets RT when it has changed.
   This provides considerable performance improvement.

6. `match' now sets all the subscripts in the third argument array
   correctly, even if not all subexpressions matched.

7. Updated to Automake 1.7.5. configure.in renamed configure.ac.

8. C-style switch statements are available, but must be enabled at
   compile time via `configure --enable-switch'. For 3.2 they'll be
   enabled by default. Thanks to Michael Benzinger for the initial
   code.

9. %c now always prints no more than one character, whatever
   precision is provided.

10. strtonum(<number>) now works again.

11. Gawk is now much better about scalar/array typing of global
    uninitialized variables passed as parameters. Once the parameter
    is then used one way or the other, the global var's type is
    adjusted accordingly. Thanks to Stepan Kasal for the original
    (considerable) changes.

12. Dynamic function loading under Windows32 should now be possible. See
    README_d/README.pcdynamic. Thanks to Patrick T.J. McPhee for the changes.

13. Updated to gettext 0.12.1.

14. Gawk now follows historical practice and POSIX for the return
    value of `rand': It's now 0 <= N < 1.
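
To make the half-open range 0 <= N < 1 for `rand' concrete, the C sketch
below scales a raw generator value into that interval. It assumes the raw
value comes from random(3), which POSIX defines as lying between 0 and
2^31 - 1. The source does not say how gawk itself does the scaling, and the
function name `scale_to_unit' is invented for this example.

```c
/* Hypothetical sketch, not gawk's code: map a raw random(3) value,
 * assumed to be in [0, 2^31 - 1], onto the half-open interval [0, 1). */
double scale_to_unit(long raw)
{
    /* Dividing by 2^31, one more than the largest raw value, keeps the
     * result strictly below 1 while still allowing exactly 0. */
    return (double) raw / 2147483648.0;
}
```

Dividing by the maximum value itself instead would let the result reach
exactly 1, which is the case the half-open range rules out.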

Changes from 3.1.1 to 3.1.2
---------------------------

1. Loops of the form:

       for (iggy in foo)
           next

   no longer leak memory.

2. gawk -v FIELDWIDTHS="..." now sets PROCINFO["FS"] correctly.

3. All builtin operations and functions should now fully evaluate their
   arguments so that side effects take place correctly.

4. Fixed a logic bug in gsub/gensub for matches to null strings that occurred
   later in the string after a nonnull match.

5. getgroups code now works on Ultrix again.

6. Completely new version of the full GNU regex engine now in place.

7. Argument parsing and variable assignment has been cleaned up.

8. An I/O bug on HP-UX has been documented and worked around. See
   README_d/README.hpux.

9. awklib/grcat should now compile correctly.

10. Updated to automake 1.7.3, autoconf 2.57 and gettext 0.11.5; thanks to
    Paul Eggert for the initial automake and autoconf work.

11. As a result of #6, removed the use of the dfa code from GNU grep.

12. It is now possible to use ptys for |& two-way pipes instead of
    pipes. The basic plumbing for this was provided by Paolo Bonzini.
    To make this happen:

        command = "unix command etc"
        PROCINFO[command, "pty"] = 1

        print ... |& command
        command |& getline stuff

    In other words, set the element in PROCINFO *before* opening the
    two-way pipe, and then gawk will use ptys instead of pipes.

    On systems without ptys or where all the ptys are in use, gawk
    will fall back to using plain pipes.

13. Fixed a regex matching across buffer boundaries bug, with a
    heuristic. See io.c:rsre_get_a_record.

14. Profiling no longer dumps core if there are extension functions in place.

15. Grammar and scanner cleaned up, courtesy of Stepan Kasal, to hopefully
    once and for all fix the `/=' operator vs. `/=.../' regex ambiguity.
    Lots of other grammar simplifications applied, as well.

16. BINMODE should work now on more Windows ports.

17. Updated to bison 1.875. Includes fix to bisonfix.sed script.

18. The NODE structure is now 20% (8 bytes) smaller (on x86, anyway), which
    should help conserve memory.

19. Builds not in the source directory should work again.

20. Arrays now use two NODEs per element instead of three. Combined with
    #18, (on the x86) this reduces the overhead from 120 bytes per element
    to just 64 bytes: almost a 50% improvement.

21. Programs that make heavy use of changing IGNORECASE should now be
    much faster, particularly if using a regular expression for FS or RS.
    IGNORECASE now correctly affects RS regex record splitting, as well.

22. IGNORECASE no longer affects single-character field splitting (FS = "c"),
    or single-character record splitting (RS = "c").

    This cleans up some weird behavior, and makes gawk better match the
    documentation, which says it only affects regex-based field splitting
    and record splitting.

    The documentation on this was improved, too.

23. The framework in test/ has been simplified, making it much easier to
    add new tests while keeping the size of Makefile.am reasonable. Thanks
    for this to Stepan Kasal.

24. --lint=invalid causes lint warnings only about stuff that's actually
    invalid. This needs additional work.

25. More translations.

26. The `get_a_record' routine has been revamped (currently by splitting it
    into three variants). This should improve long-term maintainability.

27. `match' now adds more entries to 3rd array arg:
        match("the big dog", /([a-z]+) ([a-z]+) ([a-z]+)/, data)
    fills in variables:
        data[1, "start"], data[1, "length"], and so on.

28. New `asorti' function with same interface as `asort', but sorts indices
    instead of values.

29. Documentation updated to FDL 1.2.

30. New `configure' option --disable-lint at compile time disables lint
    checking. With GCC dead-code-elimination, cuts almost 200K off the
    executable size on GNU/Linux x86. Presumably speeds up runtime.

    Using this will cause some of the tests in the test suite to fail.
    This option may be removed at a later date.

31. Various minor cleanups, see the ChangeLog for details.
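
The "start" and "length" entries that `match' stores for each parenthesized
subexpression, described above, can be pictured with plain POSIX <regex.h>.
The C sketch below is not gawk's matcher; the function name `match_group',
its signature and the limit of ten groups are assumptions made for this
example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical sketch using POSIX regexec(), not gawk's own matcher.
 * For subexpression `group' (0 means the whole match), store the 1-based
 * start position and the length, mirroring data[group, "start"] and
 * data[group, "length"]. Returns 0 on success, -1 on any failure. */
int match_group(const char *pattern, const char *text, size_t group,
                int *start, int *length)
{
    regex_t re;
    regmatch_t m[10];
    int rc = -1;

    if (group >= 10 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    /* rm_so is -1 for a subexpression that did not take part in the match. */
    if (regexec(&re, text, 10, m, 0) == 0 && m[group].rm_so != -1) {
        *start = (int) m[group].rm_so + 1;      /* awk strings are 1-based */
        *length = (int) (m[group].rm_eo - m[group].rm_so);
        rc = 0;
    }
    regfree(&re);
    return rc;
}
```

With the pattern and text from the `match' entry, group 2 ("big") comes back
with start 5 and length 3.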

Changes from 3.1.0 to 3.1.1
---------------------------

1. Six new translations.

2. Having more than 4 different values for OFMT and/or CONVFMT now works.

3. The handling of dynamic regexes is now more sane, esp. w.r.t.
   the profiling code. The profiling code has been fixed in several
   places.

4. The return value of index("", "") is now 1.

5. Gawk should no longer close fd 0 in child processes.

6. Fixed test for strtod semantics and regenerated configure.

7. Gawk can now be built with byacc; an accidental bison dependency was
   removed.

8. `yyerror' will no longer dump core on long source lines.

9. Gawk now correctly queries getgroups(2) to figure out how many groups
   the process has.

10. New configure option to force use of included strftime, e.g. on
    Solaris systems. See `./configure --help' for the details. Replaced
    the included strftime.c with the one from textutils.

11. OS/2 port has been updated.

12. Multi-byte character support has been added, courtesy of IBM Japan.

13. The `for (iggy in foo) delete foo[iggy]' -> `delete foo' optimisation
    now works.

14. Upgraded to gettext 0.11.2 and automake 1.5.

15. Full gettext compatibility (new dcngettext function).

16. The O'Reilly copyedits and indexing changes for the documentation have
    been folded into the texinfo version of the manuals.

17. A humongously long value for the AWKPATH environment variable will no
    longer dump core.

18. Configuration / Installation issues have been straightened out in
    Makefile.am.
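
For the getgroups(2) entry above, the usual POSIX idiom is to call
getgroups() with a size of zero to learn the group count, then allocate a
buffer of that size and call it again. The C sketch below shows that idiom.
It is not gawk's code, and the helper name `fetch_groups' is an assumption.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch, not gawk's code: ask getgroups(2) how many
 * supplementary groups the process has, then fetch them.
 * On success, returns the count and stores a malloc'd array in `*out'
 * (the caller frees it). On failure, returns -1. */
int fetch_groups(gid_t **out)
{
    int n = getgroups(0, NULL);     /* size 0: report the count only */
    gid_t *list;

    if (n < 0)
        return -1;
    /* Allocate at least one element so malloc(0) is never requested. */
    list = malloc((n > 0 ? (size_t) n : 1) * sizeof *list);
    if (list == NULL)
        return -1;
    n = getgroups(n, list);
    if (n < 0) {
        free(list);
        return -1;
    }
    *out = list;
    return n;
}
```

Asking for the count first avoids guessing a fixed array size, which is the
kind of assumption that goes wrong on systems with many groups.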

Changes from 3.0.6 to 3.1.0
---------------------------

1. A new PROCINFO array provides info about the process. The non-I/O /dev/xxx
   files are now obsolete, and their use always generates a warning.

2. A new `mktime' builtin function was added for creating time stamps. The
   `mktime' function written in awk was removed from the user's guide.

3. New `--gen-po' option creates GNU gettext .po files for strings marked
   with a leading underscore.

4. Gawk now completely interprets special file names internally, ignoring the
   existence of real /dev/stdin, /dev/stdout files, etc.

5. The mmap code was removed. It was a worthwhile experiment that just
   didn't work out.

6. The BINMODE variable is new; on non-UNIX systems it affects how gawk
   opens files for text vs. binary.

7. The Atari port is now unsupported.

8. Gawk no longer supports `next file' as two words.

9. On systems that support it, gawk now sets the `close on exec' flag on all
   files and pipes it opens. This makes sure that child processes run via
   `system' or pipes have plenty of file descriptors available.

10. New ports: Tandem and BeOS. The Tandem port is unsupported.

11. If `--posix' is in effect, newlines are not allowed after ?:.

12. Weird OFMT/CONVFMT formats no longer cause fatal errors.

13. Diagnostics about array parameters now include the parameter's name,
    not just its number.

14. configure should now automatically add -D_SYSV3 for ISC Unix.
    (This seems to have made it into the gawk 3.0.x line long ago.)

15. It is now possible to open a two-way pipe via the `|&' operator.
    See the discussion in the manual about putting `sort' into such a pipeline,
    though. (NOTE! This is borrowed from ksh: it is not the same as
    the same operator in csh!)

16. The `close' function now takes an optional second string argument
    that allows closing one or the other end of the two-way pipe to
    a co-process. This is needed to use `sort' in a co-process; see
    the doc.

17. If TCP/IP is available, special file names beginning with `/inet'
    can be used with `|&' for IPC. Thanks to Juergen Kahrs for the initial
    code.

18. With `--enable-portals' on the configure command line, gawk will also
    treat file names that start with `/p/' as a 4.4 BSD type portal file,
    i.e., a two-way pipe for `|&'.

19. Unrecognized escapes, such as "\q" now always generate a warning.

20. The LINT variable is new; it provides dynamic control over the --lint
    option.

21. Lint warnings can be made fatal by using --lint=fatal or `LINT = "fatal"'.
    Use this if you're really serious about portable code.

22. Due to an enhanced sed script, there is no longer any need to worry
    about finding or using alloca. alloca.c is thus now gone.

23. A number of lint warnings have been added. Most notably, gawk will
    detect if a variable is used before assigned to. Warnings for
    when a string that isn't a number gets converted to a number are
    in the code but disabled; they seem to be too picky in practice.

    Also, gawk will now warn about function parameter names that shadow
    global variable names.

24. It is now possible to dynamically add builtin functions on systems
    that support dlopen. This facility is not (yet) as portable or well
    integrated as it might be. *** WARNING *** THIS FEATURE WILL EVOLVE!

25. There are *many* new tests in the test suite.

26. Profiling has been added! A separate version of gawk, named pgawk, is
    built and generates a run-time execution profile. The --profile option
    can be used to change the default output file. In regular gawk, this
    option pretty-prints the parse tree.

27. Gawk has been internationalized, using GNU gettext. Translations for
    future distributions are most welcome. Simultaneously, gawk was switched
    over to using automake. You need Automake 1.4a (from the CVS archive)
    if you want to muck with the Makefile.am files.

28. New `asort' function for sorting arrays. See the doc for details.

29. The match function takes an optional array third argument to hold
    the text matched by parenthesized sub-expressions.

30. The bit op functions and octal and hex source code constants are on by
    default, no longer a configure-time option. Recognition of non-decimal
    data is now enabled at runtime with --non-decimal-data command line option.

31. Internationalization features available at the awk level: new TEXTDOMAIN
    variable and `bindtextdomain' and `dcgettext' functions. printf formats
    may contain the "%2$3.5d" kind of notation for use in translations. See
    the texinfo manual for details.

32. The return value from `close' has been rationalized. Most notably,
    closing something that wasn't open returns -1 but remains non-fatal.

33. The array efficiency change from 3.0.5 was reverted; the semantics were
    not right. Additionally, index values of previously stored elements
    can no longer change dynamically.

34. The new option --dump-variables dumps a list of all global variables and
    their final types and values to a file you give, or to `awkvars.out'.

35. Gawk now uses a recent version of random.c courtesy of the FreeBSD
    project.

36. The gawk source code now uses ANSI C function definitions (new style),
    with ansi2knr to translate code for old compilers.

37. `for (iggy in foo)' loops should be more robust now in the face of
    adding/deleting elements in the middle; they loop over just the elements
    that are present in the array when the loop starts.
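
The `close on exec' entry above refers to a per-descriptor flag that POSIX
systems expose through fcntl(2). The C sketch below shows the standard way
to set it. It is an illustration rather than gawk's code, and the helper name
`set_close_on_exec' is an assumption.

```c
#include <fcntl.h>

/* Hypothetical helper, not gawk's code: mark a descriptor close-on-exec
 * so that a program started via exec() does not inherit it.
 * Returns 0 on success, -1 on error. */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);     /* read the current descriptor flags */

    if (flags == -1)
        return -1;
    /* Preserve any other descriptor flags while adding FD_CLOEXEC. */
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) ? -1 : 0;
}
```

A descriptor marked this way is closed automatically in the child at exec
time, so the child starts with its descriptor table mostly free.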

Changes from 3.0.5 to 3.0.6
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs fixed and changes made:

1. Subscripting an array with a variable that is just a number no
   longer magically converts the variable into a string.

2. Similarly, running a `for (iggy in foo)' loop where `foo' is a
   function parameter now works correctly.

3. Similarly, `i = ""; v[i] = a; if (i in v) ...' now works again.

4. Gawk now special cases `for (iggy in foo) delete foo[iggy]' and
   treats it as the moral equivalent of `delete foo'. This should be
   a major efficiency win when portably deleting large arrays.

5. VMS port brought up to date.

Changes from 3.0.4 to 3.0.5
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs Fixed:

 1. `function foo(foo)' is now a fatal error.

 2. Array indexing is now much more efficient: where possible, only one
    copy of an index string is kept, even if used in multiple arrays.

 3. Support was added for MacOS X and an `install-strip' target.

 4. [s]printf formatting for `0' flag and floating point formats now
    works correctly.

 5. HP-UX large file support with GCC 2.95.1 now works.

 6. Arguments that contain `=' but that aren't syntactically valid are
    now treated as filenames, instead of as fatal errors.

 7. `-v NF=foo' now works.

 8. Non-ascii alphanumeric characters are now treated as such in the
    right locales by regex.c. Similarly, a Latin-1 y-umlaut (decimal
    value 255) in the program text no longer acts like EOF.

 9. Array indexes are always compared as strings; fixes an obscure bug
    when user input gets used for the `x in array' test.

10. The usage message now points users to the documentation for how
    to report bugs.

11. `/=' now works after an array.

12. `b += b += 1' now works correctly.

13. Changing IGNORECASE in combination with calls to `match' now works
    better. (Fix for semi-obscure bug.)

14. Multicharacter values for RS now generate a lint warning.

15. The gawk open file caching is now much more efficient.

16. Global arrays passed to functions are now managed better. In particular,
    test/arynocls.awk won't crash referencing freed memory.

17. In obscure cases, `getline var' can no longer clobber $0.
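
The y-umlaut entry above (byte value 255 acting like EOF) is a classic C
pitfall. The source does not say what the actual cause was in gawk, so the
sketch below only shows the commonly seen form of the problem, where a byte
is held in a signed character type before being compared with EOF. Both
function names are invented for this example, and the narrow version relies
on EOF being -1 with two's complement conversion, as on common platforms.

```c
#include <stdio.h>

/* Hypothetical illustration, not gawk's code. The mistake: the byte is
 * narrowed to a signed char, so 255 becomes -1 and compares equal to
 * EOF on common platforms. */
int looks_like_eof_narrow(unsigned char byte)
{
    signed char c = (signed char) byte;
    return c == EOF;
}

/* The usual remedy: keep the value in an int, as getc() returns it, so
 * that every byte value 0..255 stays distinct from EOF. */
int looks_like_eof_wide(unsigned char byte)
{
    int c = byte;
    return c == EOF;
}
```

With byte 255, the narrow version reports end of file and the wide version
does not.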

Changes from 3.0.3 to 3.0.4
---------------------------

This is a bug fix release only, pending further development on 3.1.0.

Bugs Fixed:

 1. A memory leak when turning a function parameter into an array was
    fixed.

 2. The non-decimal data option now works correctly.

 3. Using an empty pair of brackets as an array subscript no longer causes
    a core dump during parsing. In general, syntax errors should not
    cause core dumps any more.

 4. Standard input is no longer closed if it provides program source,
    avoiding strange I/O problems.

 5. Memory corruption during printing with `print' has been fixed.

 6. The gsub function now correctly counts the number of matches.

 7. A typo in doc/Makefile.in has been fixed, making installation work.

 8. Calling `next' or `nextfile' from a BEGIN or END rule is now fatal.

 9. Subtle problems in rebuilding $0 when fields were changed have been
    fixed.

10. `FS = FS' now correctly turns off the use of FIELDWIDTHS.