restore djgpp, eventually
merge TODO lists
add unit tests for lib/*.c

strip: add an option to specify the program used to strip binaries.
  suggestion from Karl Berry

doc/coreutils.texi:
  Address this comment: FIXME: mv's behavior in this case is system-dependent
  Better still: fix the code so it's *not* system-dependent.

implement --target-directory=DIR for install (per texinfo documentation)

ls: add --format=FORMAT option that controls how each line is printed.

cp --no-preserve=X should not attempt to preserve attribute X
  reported by Andreas Schwab

copy.c: Address the FIXME-maybe comment in copy_internal.
  And once that's done, add an exclusion so that `cp --link'
  no longer incurs the overhead of saving the source dev/ino and
  destination file name in the hash table.

See if we can be consistent about where --verbose sends its output:
  These all send --verbose output to stdout:
    head, tail, rm, cp, mv, ln, chmod, chown, chgrp, install
  These send it to stderr:
    shred, mkdir, split
  readlink is different

Write an autoconf test to work around build failure in HPUX's 64-bit mode.
See notes in README -- and remove them once there's a work-around.

Integrate use of sendfile, suggested here:
  http://mail.gnu.org/archive/html/bug-fileutils/2003-03/msg00030.html
I don't plan to do that, since a few tests demonstrate no significant benefit.

Should printf '\0123' print "\n3"?
  per report from TAKAI Kousuke on Mar 27
  http://mail.gnu.org/archive/html/bug-coreutils/2003-03/index.html
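
One way to look at the printf question above, as a hedged sketch rather than
the printf.c code: POSIX limits a backslash escape in the format string to at
most three octal digits, so '\0123' would be the byte 012 (newline) followed
by a literal '3'. The helper name decode_octal_escape and its max_digits
parameter are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Decode one octal escape at S, where S points at the backslash.
   Read at most MAX_DIGITS octal digits, store the resulting byte in *OUT,
   and return the number of input characters consumed, or 0 if S does not
   start an octal escape.  (Hypothetical helper, not from printf.c.)  */
size_t
decode_octal_escape (const char *s, int max_digits, unsigned char *out)
{
  if (s[0] != '\\' || s[1] < '0' || s[1] > '7')
    return 0;

  unsigned int value = 0;
  size_t i = 1;
  for (int n = 0; n < max_digits && '0' <= s[i] && s[i] <= '7'; n++, i++)
    value = value * 8 + (unsigned int) (s[i] - '0');

  *out = (unsigned char) value;
  return i;
}
```

With max_digits set to 3, the input \0123 decodes to a newline and leaves
the 3 unread. A max_digits of 4 would instead mimic the %b-style \0ddd form,
where the leading zero does not count toward the three digits.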

printf: consider adapting builtins/printf.def from bash

df: add `--total' option, suggested here http://bugs.debian.org/186007

seq: give better diagnostics for invalid formats:
  e.g. no or too many % directives
seq: consider allowing format string to contain no %-directives
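
A minimal sketch of the check behind the two seq items above: count the
conversion directives, treating "%%" as a literal percent sign, and report
which case applies. The function names and the message wording are
assumptions for illustration; they are not the diagnostics seq prints.

```c
#include <stddef.h>

/* Return the number of conversion directives in FMT.
   "%%" is a literal percent sign and is not counted.  */
int
count_directives (const char *fmt)
{
  int n = 0;
  for (const char *p = fmt; *p; p++)
    if (*p == '%')
      {
        if (p[1] == '%')
          p++;                  /* skip the second '%' of "%%" */
        else
          n++;
      }
  return n;
}

/* Return NULL if FMT has exactly one directive, else a message that
   says what is wrong.  (Hypothetical wording.)  */
const char *
format_problem (const char *fmt)
{
  int n = count_directives (fmt);
  if (n == 0)
    return "format string has no % directive";
  if (n > 1)
    return "format string has too many % directives";
  return NULL;
}
```

If seq were changed to accept a format with no %-directives, the n == 0
branch would simply stop being an error.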

m4: rename all macros that start with AC_ to start with another prefix

resolve RH report on cp -a forwarded by Tim Waugh

Martin Michlmayr's patch to provide ls with `--sort directory' option

tail: don't use xlseek; it *exits*.
  Instead, maybe use a macro and return nonzero.
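
A hedged sketch of the non-exiting variant suggested above. It uses a
function rather than a macro so that the arguments are type-checked; the
name try_lseek and the message format are hypothetical, and this is not the
code in tail.c.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek FD to OFFSET relative to WHENCE.  On failure, print a diagnostic
   that names FILENAME and return nonzero instead of exiting, so that the
   caller can go on to the next file.  (Hypothetical helper.)  */
int
try_lseek (int fd, off_t offset, int whence, const char *filename)
{
  if (lseek (fd, offset, whence) >= 0)
    return 0;

  fprintf (stderr, "%s: cannot seek: %s\n", filename, strerror (errno));
  return 1;
}
```

A caller would then record the failure in its own exit status and continue,
which is the behavior the item asks for.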

add mktemp? Suggested by Nelson Beebe

df: alignment problem of `Used' heading with e.g., -mP
  reported by Karl Berry

tr: support nontrivial equivalence classes, e.g. [=e=] with LC_COLLATE=fr_FR

lib/strftime.c: Since %N is the only format that we need but that
  glibc's strftime doesn't support, consider using a wrapper that
  would expand /%(-_)?\d*N/ to the desired string and then pass the
  resulting string to glibc's strftime.
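
The wrapper idea above could start from something like the following sketch,
which rewrites only the %N directives and leaves everything else for
strftime. Several points are assumptions, not taken from lib/strftime.c: the
name expand_nsec, reading the pattern as one optional '-' or '_' flag,
ignoring that flag, and treating a width below nine as "keep only the
leading digits".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy FMT to OUT (capacity OUTSIZE), replacing each %N directive, with an
   optional '-' or '_' flag and an optional width, by digits of NSEC.
   NSEC must be in the range 0..999999999.  "%%" is copied unchanged so
   that "%%N" stays a literal.  Return 0 on success, -1 if OUT is too
   small.  (Hypothetical helper; padding semantics are assumed.)  */
int
expand_nsec (const char *fmt, long nsec, char *out, size_t outsize)
{
  char digits[10];
  size_t o = 0;

  if (outsize == 0)
    return -1;

  /* Nine zero-padded decimal digits.  */
  for (int i = 8; i >= 0; i--)
    {
      digits[i] = (char) ('0' + nsec % 10);
      nsec /= 10;
    }
  digits[9] = '\0';

  for (const char *p = fmt; *p; )
    {
      const char *src = p;      /* bytes to copy */
      size_t len = 1;           /* how many of them */
      const char *next = p + 1; /* where to resume scanning */

      if (p[0] == '%' && p[1] == '%')
        {
          len = 2;
          next = p + 2;
        }
      else if (p[0] == '%')
        {
          const char *q = p + 1;
          size_t width = 0;
          if (*q == '-' || *q == '_')
            q++;                /* flag accepted but ignored here */
          for (; isdigit ((unsigned char) *q); q++)
            if (width < 100)
              width = width * 10 + (size_t) (*q - '0');
          if (*q == 'N')
            {
              src = digits;
              len = (1 <= width && width < 9) ? width : 9;
              next = q + 1;
            }
        }

      if (o + len >= outsize)
        return -1;
      memcpy (out + o, src, len);
      o += len;
      p = next;
    }

  out[o] = '\0';
  return 0;
}
```

The expanded string would then be handed to the system strftime unchanged,
so every directive other than %N keeps its native behavior.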

sort: Compress temporary files when doing large external sort/merges.
  This improves performance when you can compress/uncompress faster than
  you can read/write, which is common in these days of fast CPUs.
  suggestion from Charles Randall on 2001-08-10

sort: Add an ordering option -R that causes 'sort' to sort according
  to a random permutation of the correct sort order. Also, add an
  option --random-seed=SEED that causes 'sort' to use an arbitrary
  string SEED to select which permutations to use, in a deterministic
  manner: that is, if you sort a permutation of the same input file
  with the same --random-seed=SEED option twice, you'll get the same
  output. The default SEED is chosen at random, and contains enough
  information to ensure that the output permutation is random.
  suggestion from Feth AREZKI, Stephan Kasal, and Paul Eggert on 2003-07-17
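
One possible way to get the determinism described above, sketched under
assumptions: give each line a key that is a hash of SEED followed by the
line, then sort by key. Any permutation of the same input then yields the
same output for the same SEED, and equal lines stay adjacent. FNV-1a is used
here only because it is short; it is an illustrative choice, not a proposal
for the real hash, and the names below are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct keyed_line
{
  uint64_t key;                 /* seeded hash of the line */
  const char *line;
};

/* Continue the 64-bit FNV-1a hash H over the bytes of S.  */
static uint64_t
fnv1a (uint64_t h, const char *s)
{
  for (; *s; s++)
    {
      h ^= (unsigned char) *s;
      h *= 1099511628211ULL;
    }
  return h;
}

static int
compare_keyed (const void *a, const void *b)
{
  const struct keyed_line *x = a;
  const struct keyed_line *y = b;
  if (x->key != y->key)
    return x->key < y->key ? -1 : 1;
  return strcmp (x->line, y->line);     /* hash collision: compare bytes */
}

/* Reorder LINES[0..N-1] in place by seeded hash.
   Return 0 on success, -1 if memory is exhausted.  */
int
seeded_shuffle_sort (const char **lines, size_t n, const char *seed)
{
  if (n == 0)
    return 0;

  struct keyed_line *k = malloc (n * sizeof *k);
  if (!k)
    return -1;

  uint64_t h0 = fnv1a (14695981039346656037ULL, seed);
  for (size_t i = 0; i < n; i++)
    {
      k[i].key = fnv1a (h0, lines[i]);
      k[i].line = lines[i];
    }

  qsort (k, n, sizeof *k, compare_keyed);

  for (size_t i = 0; i < n; i++)
    lines[i] = k[i].line;
  free (k);
  return 0;
}
```

Choosing the default SEED at random, as the item requires, is left out of
this sketch.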

unexpand: [http://www.opengroup.org/onlinepubs/007908799/xcu/unexpand.html]
  printf 'x\t \t y\n'|unexpand -t 8,9 should print its input, unmodified.
  printf 'x\t \t y\n'|unexpand -t 5,8 should print "x\ty\n"

Let GNU su use the `wheel' group if appropriate.
  (there are a couple of patches, already)

sort: Investigate better sorting algorithms; see Knuth vol. 3.

  We tried list merge sort, but it was about 50% slower than the
  recursive algorithm currently used by sortlines, and it used more
  comparisons. We're not sure why this was, as the theory suggests it
  should do fewer comparisons, so perhaps this should be revisited.
  List merge sort was implemented in the style of Knuth algorithm
  5.2.4L, with the optimization suggested by exercise 5.2.4-22. The
  test case was 140,213,394 bytes, 426,4424 lines, text taken from the
  GCC 3.3 distribution, sort.c compiled with GCC 2.95.4 and running on
  Debian 3.0r1 GNU/Linux, 2.4GHz Pentium 4, single pass with no
  temporary files and plenty of RAM.

  Since comparisons seem to be the bottleneck, perhaps the best
  algorithm to try next should be merge insertion. See Knuth section
  5.3.1, who credits Lester Ford, Jr. and Selmer Johnson, American
  Mathematical Monthly 66 (1959), 387-389.

cp --recursive: perform dir traversals in source and dest hierarchy rather
  than forming full file names. The latter (current) approach fails
  unnecessarily when the names become very long.

tail --p is now ambiguous

Remove suspicious uses of alloca (ones that may allocate more than
  about 4k)

Adapt these contribution guidelines for coreutils:
  http://sources.redhat.com/automake/contribute.html


Changes expected to go in, post-5.2.1:
======================================

wc: add an option, --files0-from [as for du] to make it read NUL-delimited
  file name arguments from a file.
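
The parsing step for such an option is small. The following is only a
sketch, with a hypothetical name and interface, and is not the du code:
split a buffer that holds NUL-delimited file names into an array of
pointers. It assumes that the caller has stored a NUL at buf[len], so that
a final name without its own terminator is still a valid string.

```c
#include <stddef.h>
#include <string.h>

/* Split BUF (LEN bytes, with buf[LEN] == '\0' guaranteed by the caller)
   into NUL-delimited names.  Store up to MAX_NAMES pointers into NAMES
   and return how many were stored.  Names may contain any byte except
   NUL, including newlines and spaces.  (Hypothetical helper.)  */
size_t
split_nul_names (const char *buf, size_t len,
                 const char **names, size_t max_names)
{
  size_t count = 0;
  size_t pos = 0;

  while (pos < len && count < max_names)
    {
      names[count++] = buf + pos;
      pos += strlen (buf + pos) + 1;    /* skip the name and its NUL */
    }
  return count;
}
```

Two adjacent NUL bytes produce an empty name here; whether to reject that
is left to the caller.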

dd patch from Olivier Delhomme

Apply Andreas Gruenbacher's ACL and xattr changes

Apply Bruno Haible's hostname changes

test/mv/*: clean up $other_partition_tmpdir in all cases

ls: when both -l and --dereference-command-line-symlink-to-dir are
  specified, consider whether to let the latter select whether to
  dereference command line symlinks to directories, since -l has
  an implicit --NO-dereference-command-line-symlink-to-dir meaning.
  Pointed out by Karl Berry.

A more efficient version of factor, and possibly one that
  accepts inputs of size 2^64 and larger.

dd: consider adding an option to suppress `bytes/block read/written'
  output to stderr. Suggested here:
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=165045

Pending copyright papers:
-------------------------
ls --color: Ed Avis' patch to suppress escape sequences for
  non-highlighted files

getpwnam from Bruce Korb

pb (progress bar) from Miika Pekkarinen

------------------------------
Look into improving the performance of md5sum.
`openssl md5' is consistently about 30% faster than md5sum on an idle
AMD 2000-XP system with plenty of RAM and a 261 MB input file.
openssl's md5 implementation is in assembly, generated by a perl script.

On an AMD-64 system, using a 700MB file on a tmpfs file system
(and enough RAM so that no actual disk reads were performed),
GNU md5sum is slightly faster than `openssl md5', e.g.:

  2.38s user 0.38s system 100% cpu 2.756 total  (gnu md5sum)
  vs.
  2.52s user 0.34s system 100% cpu 2.869 total  (openssl md5)

However, `openssl sha1' is about 5% faster than GNU sha1sum:

  3.32s user 0.33s system 99% cpu 3.653 total  (openssl sha1)
  3.45s user 0.39s system 99% cpu 3.843 total  (gnu sha1sum)

The above are using the debian-sid (amd_64 alioth) binaries from
coreutils-5.2.1. When I compile the latest (coreutils-cvs) with
gcc-4.0 -O3, I get slightly (2-3%) better sha1sum performance,
and a ~7% *decrease* in performance for md5sum. I suspect that
with the right compiler options you can do much better.
------------------------------

Have euidaccess.m4 check for eaccess as well as euidaccess.
  If eaccess is found, then do `#define euidaccess eaccess'.

Remove long-deprecated options like tail's --allow-missing

Add a distcheck-time test to ensure that every distributed
  file is either read-only (indicating generated) or is
  version-controlled and up to date.

Implement Ulrich Drepper's suggestion to use getgrouplist rather
  than getugroups. This affects only `id', but makes a big difference
  on systems with many users and/or groups, and makes id usable once
  again on systems where access restrictions make getugroups fail.
  But first we'll need a run-test (either in an autoconf macro or at
  run time) to avoid the segfault bug in libc-2.3.2's getgrouplist.
  In that case, we'd revert to using a new (to-be-written) getgrouplist
  module that does most of what `id' already does.

remove `%s' notation:
  grep -E "\`%.{,4}s'" src/*.c

remove.c should never exit, yet may do so (see uses of EXIT_FAILURE)

remove or adjust chown's --changes option, since it
  can't always do what it currently says it does.

Adapt tools like wc, tr, fmt, etc. (most of the textutils) to be
  multibyte aware. The problem is that I want to avoid duplicating
  significant blocks of logic, yet I also want to incur only minimal
  (preferably `no') cost when operating in single-byte mode.

Remove all uses of the `register' keyword

rm: add support for a -I option, like that from FreeBSD's rm:
  -I      Request confirmation once if more than three files are being
          removed or if a directory is being recursively removed. This
          is a far less intrusive option than -i yet provides almost
          the same level of protection against mistakes.
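
The decision rule quoted above is simple enough to state as code. This is a
sketch only: the function name is hypothetical, and "a directory is being
recursively removed" is approximated by whether the recursive option was
given, which is an assumption about how rm would implement it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if a single confirmation prompt should be issued before
   removing N_FILES operands.  RECURSIVE says whether recursive removal
   was requested.  (Hypothetical helper, not from rm.c.)  */
bool
prompt_once_needed (size_t n_files, bool recursive)
{
  return recursive || n_files > 3;
}
```

Exactly three operands without recursion would not trigger the prompt,
since the wording is "more than three".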