\input texinfo @c -*-texinfo-*-
@comment $Id$
@comment %**start of header
@setfilename diff.info
@include version.texi
@settitle Comparing and Merging Files
@syncodeindex vr cp
@setchapternewpage odd
@comment %**end of header
@copying
This manual is for GNU Diffutils
(version @value{VERSION}, @value{UPDATED}),
and documents the @sc{gnu} @command{diff}, @command{diff3},
@command{sdiff}, and @command{cmp} commands for showing the
differences between files and the @sc{gnu} @command{patch} command for
using their output to update files.

Copyright @copyright{} 1992, 1993, 1994, 1998, 2001, 2002 Free
Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the
license is included in the section entitled ``GNU Free Documentation
License.''

(a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify
this GNU Manual, like GNU software. Copies published by the Free
Software Foundation raise funds for GNU development.''
@end quotation
@end copying

@c Debian install-info (up through at least version 1.9.20) uses only the
@c first dircategory. Put this one first, as it is more useful in practice.
@dircategory Individual utilities
@direntry
* cmp: (diff)Invoking cmp. Compare 2 files byte by byte.
* diff: (diff)Invoking diff. Compare 2 files line by line.
* diff3: (diff)Invoking diff3. Compare 3 files line by line.
* patch: (diff)Invoking patch. Apply a patch to a file.
* sdiff: (diff)Invoking sdiff. Merge 2 files side-by-side.
@end direntry

@dircategory GNU packages
@direntry
* Diff: (diff). Comparing and merging files.
@end direntry

@titlepage
@title Comparing and Merging Files
@subtitle for Diffutils @value{VERSION} and @code{patch} 2.5.4
@subtitle @value{UPDATED}
@author David MacKenzie, Paul Eggert, and Richard Stallman
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@shortcontents
@contents

@ifnottex
@node Top
@top Comparing and Merging Files

@insertcopying
@end ifnottex

@menu
* Overview:: Preliminary information.
* Comparison:: What file comparison means.

* Output Formats:: Formats for two-way difference reports.
* Incomplete Lines:: Lines that lack trailing newlines.
* Comparing Directories:: Comparing files and directories.
* Adjusting Output:: Making @command{diff} output prettier.
* diff Performance:: Making @command{diff} smarter or faster.

* Comparing Three Files:: Formats for three-way difference reports.
* diff3 Merging:: Merging from a common ancestor.

* Interactive Merging:: Interactive merging with @command{sdiff}.

* Merging with patch:: Using @command{patch} to change old files into new ones.
* Making Patches:: Tips for making and using patch distributions.

* Invoking cmp:: Compare two files byte by byte.
* Invoking diff:: Compare two files line by line.
* Invoking diff3:: Compare three files line by line.
* Invoking patch:: Apply a diff file to an original.
* Invoking sdiff:: Side-by-side merge of file differences.

* Standards conformance:: Conformance to the @sc{posix} standard.
* Projects:: If you've found a bug or other shortcoming.

* Copying This Manual:: How to make copies of this manual.
* Index:: Index.
@end menu

@node Overview
@unnumbered Overview
@cindex overview of @command{diff} and @command{patch}

Computer users often find occasion to ask how two files differ. Perhaps
one file is a newer version of the other file. Or maybe the two files
started out as identical copies but were changed by different people.

You can use the @command{diff} command to show differences between two
files, or each corresponding file in two directories. @command{diff}
outputs differences between files line by line in any of several
formats, selectable by command line options. This set of differences is
often called a @dfn{diff} or @dfn{patch}. For files that are identical,
@command{diff} normally produces no output; for binary (non-text) files,
@command{diff} normally reports only that they are different.

You can use the @command{cmp} command to show the byte and line numbers
where two files differ. @command{cmp} can also show all the bytes
that differ between the two files, side by side. A way to compare
two files character by character is the Emacs command @kbd{M-x
compare-windows}. @xref{Other Window, , Other Window, emacs, The @sc{gnu}
Emacs Manual}, for more information on that command.

You can use the @command{diff3} command to show differences among three
files. When two people have made independent changes to a common
original, @command{diff3} can report the differences between the original
and the two changed versions, and can produce a merged file that
contains both persons' changes together with warnings about conflicts.

You can use the @command{sdiff} command to merge two files interactively.

You can use the set of differences produced by @command{diff} to distribute
updates to text files (such as program source code) to other people.
This method is especially useful when the differences are small compared
to the complete files. Given @command{diff} output, you can use the
@command{patch} program to update, or @dfn{patch}, a copy of the file. If you
think of @command{diff} as subtracting one file from another to produce
their difference, you can think of @command{patch} as adding the difference
to one file to reproduce the other.

This manual first concentrates on making diffs, and later shows how to
use diffs to update files.

@sc{gnu} @command{diff} was written by Paul Eggert, Mike Haertel,
David Hayes, Richard Stallman, and Len Tower. Wayne Davison designed and
implemented the unified output format. The basic algorithm is described
in ``An O(ND) Difference Algorithm and its Variations'', Eugene W. Myers,
@cite{Algorithmica} Vol.@: 1 No.@: 2, 1986, pp.@: 251--266; and in ``A File
Comparison Program'', Webb Miller and Eugene W. Myers,
@cite{Software---Practice and Experience} Vol.@: 15 No.@: 11, 1985,
pp.@: 1025--1040.
@c From: "Gene Myers" <[email protected]>
@c They are about the same basic algorithm; the Algorithmica
@c paper gives a rigorous treatment and the sub-algorithm for
@c delivering scripts and should be the primary reference, but
@c both should be mentioned.
The algorithm was independently discovered as described in
``Algorithms for Approximate String Matching'',
E. Ukkonen, @cite{Information and Control} Vol.@: 64, 1985, pp.@: 100--118.
@c From: "Gene Myers" <[email protected]>
@c Date: Wed, 29 Sep 1993 08:27:55 MST
@c Ukkonen should be given credit for also discovering the algorithm used
@c in GNU diff.

@sc{gnu} @command{diff3} was written by Randy Smith. @sc{gnu}
@command{sdiff} was written by Thomas Lord. @sc{gnu} @command{cmp}
was written by Torbjorn Granlund and David MacKenzie.

@command{patch} was written mainly by Larry Wall and Paul Eggert;
several @sc{gnu} enhancements were contributed by Wayne Davison and
David MacKenzie. Parts of this manual are adapted from a manual page
written by Larry Wall, with his permission.

@node Comparison
@chapter What Comparison Means
@cindex introduction

There are several ways to think about the differences between two files.
One way to think of the differences is as a series of lines that were
deleted from, inserted in, or changed in one file to produce the other
file. @command{diff} compares two files line by line, finds groups of
lines that differ, and reports each group of differing lines. It can
report the differing lines in several formats, which have different
purposes.

@sc{gnu} @command{diff} can show whether files are different without detailing
the differences. It also provides ways to suppress certain kinds of
differences that are not important to you. Most commonly, such
differences are changes in the amount of white space between words or
lines. @command{diff} also provides ways to suppress differences in
alphabetic case or in lines that match a regular expression that you
provide. These options can accumulate; for example, you can ignore
changes in both white space and alphabetic case.

Another way to think of the differences between two files is as a
sequence of pairs of bytes that can be either identical or
different. @command{cmp} reports the differences between two files
byte by byte, instead of line by line. As a result, it is often
more useful than @command{diff} for comparing binary files. For text
files, @command{cmp} is useful mainly when you want to know only whether
two files are identical, or whether one file is a prefix of the other.

To illustrate the effect that considering changes byte by byte
can have compared with considering them line by line, think of what
happens if a single newline character is added to the beginning of a
file. If that file is then compared with an otherwise identical file
that lacks the newline at the beginning, @command{diff} will report that a
blank line has been added to the file, while @command{cmp} will report that
almost every byte of the two files differs.

@command{diff3} normally compares three input files line by line, finds
groups of lines that differ, and reports each group of differing lines.
Its output is designed to make it easy to inspect two different sets of
changes to the same file.

@menu
* Hunks:: Groups of differing lines.
* White Space:: Suppressing differences in white space.
* Blank Lines:: Suppressing differences in blank lines.
* Case Folding:: Suppressing differences in alphabetic case.
* Specified Folding:: Suppressing differences that match regular expressions.
* Brief:: Summarizing which files are different.
* Binary:: Comparing binary files or forcing text comparisons.
@end menu

@node Hunks
@section Hunks
@cindex hunks

When comparing two files, @command{diff} finds sequences of lines common to
both files, interspersed with groups of differing lines called
@dfn{hunks}. Comparing two identical files yields one sequence of
common lines and no hunks, because no lines differ. Comparing two
entirely different files yields no common lines and one large hunk that
contains all lines of both files. In general, there are many ways to
match up lines between two given files. @command{diff} tries to minimize
the total hunk size by finding large sequences of common lines
interspersed with small hunks of differing lines.

For example, suppose the file @file{F} contains the three lines
@samp{a}, @samp{b}, @samp{c}, and the file @file{G} contains the same
three lines in reverse order @samp{c}, @samp{b}, @samp{a}. If
@command{diff} finds the line @samp{c} as common, then the command
@samp{diff F G} produces this output:

@example
1,2d0
< a
< b
3a2,3
> b
> a
@end example

@noindent
But if @command{diff} notices the common line @samp{b} instead, it produces
this output:

@example
1c1
< a
---
> c
3c3
< c
---
> a
@end example

@noindent
It is also possible to find @samp{a} as the common line. @command{diff}
does not always find an optimal matching between the files; it takes
shortcuts to run faster. But its output is usually close to the
shortest possible. You can adjust this tradeoff with the
@option{--minimal} option (@pxref{diff Performance}).
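
As a rough illustration of hunks (a sketch only, and not the algorithm
that @sc{gnu} @command{diff} uses), the following Python fragment relies on
the standard @code{difflib} module to split @file{F} and @file{G} from the
example above into common lines and hunks. The helper names
@code{hunks} and @code{common_lines} are hypothetical, and
@code{difflib} may settle on a different common line than
@command{diff} does.

@example
```python
import difflib

def hunks(old, new):
    """Return (tag, old_lines, new_lines) for each group of differing lines."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is 'replace', 'delete', or 'insert'
            result.append((tag, old[i1:i2], new[j1:j2]))
    return result

def common_lines(old, new):
    """Return the lines that the matcher treats as common to both inputs."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    common = []
    for block in matcher.get_matching_blocks():
        common.extend(old[block.a:block.a + block.size])
    return common

F = ["a", "b", "c"]
G = ["c", "b", "a"]
print("common:", common_lines(F, G))
for tag, old_part, new_part in hunks(F, G):
    print(tag, old_part, new_part)
```
@end example

@noindent
Whichever line is chosen as common, the remaining lines of each file
land in hunks, because no common subsequence of @file{F} and @file{G}
is longer than one line.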

@node White Space
@section Suppressing Differences in Blank and Tab Spacing
@cindex blank and tab difference suppression
@cindex tab and blank difference suppression

The @option{-E} and @option{--ignore-tab-expansion} options ignore the
distinction between tabs and spaces on input. A tab is considered to be
equivalent to the number of spaces to the next tab stop. @command{diff}
assumes that tab stops are set every 8 print columns.

The @option{-b} and @option{--ignore-space-change} options are stronger.
They ignore white space at line end, and consider all other sequences of
one or more white space characters to be equivalent. With these
options, @command{diff} considers the following two lines to be equivalent,
where @samp{$} denotes the line end:

@example
Here lyeth muche rychnesse in lytell space. -- John Heywood$
Here lyeth muche rychnesse in lytell space. -- John Heywood $
@end example

The @option{-w} and @option{--ignore-all-space} options are stronger still.
They ignore differences even if one line has white space where
the other line has none. @dfn{White space} characters include
tab, newline, vertical tab, form feed, carriage return, and space;
some locales may define additional characters to be white space.
With these options, @command{diff} considers the
following two lines to be equivalent, where @samp{$} denotes the line
end and @samp{^M} denotes a carriage return:

@example
Here lyeth muche rychnesse in lytell space.-- John Heywood$
 He relyeth much erychnes seinly tells pace. --John Heywood ^M$
@end example
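
One way to picture these three levels of strictness is as functions
that map each line to a comparison key, so that two lines count as equal
when their keys are equal. The Python sketch below is only a model of
the behavior described above, not how @command{diff} is implemented; the
function names are hypothetical, and Python's @code{\s} class stands in
for the locale's notion of white space (it may match a few more
characters).

@example
```python
import re

TAB_WIDTH = 8  # tab stops every 8 print columns, as described above

def key_tab_expansion(line):
    """Key in the spirit of -E: expand each tab to the next tab stop."""
    return line.expandtabs(TAB_WIDTH)

def key_space_change(line):
    """Key in the spirit of -b: drop white space at line end, then
    collapse every remaining run of white space to a single space."""
    return re.sub(r"\s+", " ", line.rstrip())

def key_all_space(line):
    """Key in the spirit of -w: remove white space altogether."""
    return re.sub(r"\s+", "", line)

first = "Here lyeth muche rychnesse in lytell space.-- John Heywood"
second = " He relyeth much erychnes seinly tells pace. --John Heywood \r"
print(key_all_space(first) == key_all_space(second))
print(key_space_change(first) == key_space_change(second))
```
@end example

@noindent
The two sample lines have equal keys only under the strongest rule,
since they place white space at different positions within the text.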

@node Blank Lines
@section Suppressing Differences in Blank Lines
@cindex blank line difference suppression

The @option{-B} and @option{--ignore-blank-lines} options ignore insertions
or deletions of blank lines. These options affect only lines
that are completely empty; they do not affect lines that look empty but
contain space or tab characters. With these options, for example, a
file containing
@example
1. A point is that which has no part.

2. A line is breadthless length.
-- Euclid, The Elements, I
@end example
@noindent
is considered identical to a file containing
@example
1. A point is that which has no part.
2. A line is breadthless length.


-- Euclid, The Elements, I
@end example

@node Case Folding
@section Suppressing Case Differences
@cindex case difference suppression

@sc{gnu} @command{diff} can treat lower case letters as equivalent to their
upper case counterparts, so that, for example, it considers @samp{Funky
Stuff}, @samp{funky STUFF}, and @samp{fUNKy stuFf} to all be the same.
To request this, use the @option{-i} or @option{--ignore-case} option.

@node Specified Folding
@section Suppressing Lines Matching a Regular Expression
@cindex regular expression suppression

To ignore insertions and deletions of lines that match a
@command{grep}-style regular expression, use the @option{-I
@var{regexp}} or @option{--ignore-matching-lines=@var{regexp}} option.
You should escape
regular expressions that contain shell metacharacters to prevent the
shell from expanding them. For example, @samp{diff -I '^[[:digit:]]'} ignores
all changes to lines beginning with a digit.

However, @option{-I} only ignores the insertion or deletion of lines that
contain the regular expression if every changed line in the hunk---every
insertion and every deletion---matches the regular expression. In other
words, for each nonignorable change, @command{diff} prints the complete set
of changes in its vicinity, including the ignorable ones.

You can specify more than one regular expression for lines to ignore by
using more than one @option{-I} option. @command{diff} tries to match each
line against each regular expression.
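
The all-or-nothing rule for a hunk can be sketched in a few lines of
Python. This is a model of the rule as stated above, not code taken from
@command{diff}; the name @code{hunk_is_ignorable} is hypothetical, and
Python regular expressions are used in place of @command{grep}-style
ones (so @code{^[0-9]} stands in for @code{^[[:digit:]]}).

@example
```python
import re

def hunk_is_ignorable(changed_lines, patterns):
    """True only if every inserted or deleted line in the hunk
    matches at least one of the given regular expressions."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return all(
        any(regex.search(line) for regex in compiled)
        for line in changed_lines
    )

patterns = [r"^[0-9]"]
print(hunk_is_ignorable(["1. one", "2. two"], patterns))
print(hunk_is_ignorable(["1. one", "two"], patterns))
```
@end example

@noindent
In the second call a single non-matching line makes the whole hunk
nonignorable, so the matching line would be printed along with it.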

@node Brief
@section Summarizing Which Files Differ
@cindex summarizing which files differ
@cindex brief difference reports

When you only want to find out whether files are different, and you
don't care what the differences are, you can use the summary output
format. In this format, instead of showing the differences between the
files, @command{diff} simply reports whether files differ. The @option{-q}
and @option{--brief} options select this output format.

This format is especially useful when comparing the contents of two
directories. It is also much faster than doing the normal line by line
comparisons, because @command{diff} can stop analyzing the files as soon as
it knows that there are any differences.

You can also get a brief indication of whether two files differ by using
@command{cmp}. For files that are identical, @command{cmp} produces no
output. When the files differ, by default, @command{cmp} outputs the byte
and line number where the first difference occurs. You can use
the @option{-s} option to suppress that information, so that @command{cmp}
produces no output and reports whether the files differ using only its
exit status (@pxref{Invoking cmp}).

@c Fix this.
Unlike @command{diff}, @command{cmp} cannot compare directories; it can only
compare two files.
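
To make the ``byte and line number'' of the first difference concrete,
here is a small Python sketch that computes both for two byte strings
held in memory. It illustrates the idea only; the function name is
hypothetical, it does not reproduce @command{cmp}'s messages or exit
status, and it simply returns @code{None} when the inputs agree over
their common length.

@example
```python
def first_difference(a, b):
    """Return (byte_number, line_number), both counted from 1, for the
    first position where the byte strings a and b differ, or None if
    they agree over the length of the shorter one."""
    line = 1
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return (offset + 1, line)
        if x == 0x0A:  # a newline byte ends the current line
            line += 1
    return None

print(first_difference(b"one\ntwo\n", b"one\ntoo\n"))
print(first_difference(b"same\n", b"same\n"))
```
@end example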

@node Binary
@section Binary Files and Forcing Text Comparisons
@cindex binary file diff
@cindex text versus binary diff

If @command{diff} thinks that either of the two files it is comparing is
binary (a non-text file), it normally treats that pair of files much as
if the summary output format had been selected (@pxref{Brief}), and
reports only that the binary files are different. This is because line
by line comparisons are usually not meaningful for binary files.

@command{diff} determines whether a file is text or binary by checking the
first few bytes in the file; the exact number of bytes is system
dependent, but it is typically several thousand. If every byte in
that part of the file is non-null, @command{diff} considers the file to be
text; otherwise it considers the file to be binary.
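
The heuristic just described amounts to a search for a null byte in the
leading part of the file. A minimal Python sketch follows; the name
@code{looks_binary} and the probe size of 4096 bytes are assumptions
made for illustration, since the real number of bytes examined is system
dependent.

@example
```python
PROBE_SIZE = 4096  # assumed value; the real size is system dependent

def looks_binary(data, probe_size=PROBE_SIZE):
    """Apply the null-byte test to the leading bytes of a file's contents."""
    return b"\0" in data[:probe_size]

print(looks_binary(b"plain text\n"))
print(looks_binary(b"text with a \0 byte\n"))
```
@end example

@noindent
Note that a null byte beyond the probed region goes unnoticed under
this test, which is why the classification is only a guess.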

Sometimes you might want to force @command{diff} to consider files to be
text. For example, you might be comparing text files that contain
null characters; @command{diff} would erroneously decide that those are
non-text files. Or you might be comparing documents that are in a
format used by a word processing system that uses null characters to
indicate special formatting. You can force @command{diff} to consider all
files to be text files, and compare them line by line, by using the
@option{-a} or @option{--text} option. If the files you compare using this
option do not in fact contain text, they will probably contain few
newline characters, and the @command{diff} output will consist of hunks
showing differences between long lines of whatever characters the files
contain.

You can also force @command{diff} to consider all files to be binary files,
and report only whether they differ (but not how). Use the
@option{-q} or @option{--brief} option for this.

Differing binary files are considered to cause trouble because the
resulting @command{diff} output does not capture all the differences.
This trouble causes @command{diff} to exit with status 2. However,
this trouble cannot occur with the @option{-a} or @option{--text}
option, or with the @option{-q} or @option{--brief} option, because in
both cases @command{diff} produces exactly the kind of output that was
requested: a line by line comparison with @option{-a}, and a bare
report of whether the files differ with @option{-q}.

In operating systems that distinguish between text and binary files,
@command{diff} normally reads and writes all data as text. Use the
@option{--binary} option to force @command{diff} to read and write binary
data instead. This option has no effect on a @sc{posix}-compliant system
like @sc{gnu} or traditional Unix. However, many personal computer
operating systems represent the end of a line with a carriage return
followed by a newline. On such systems, @command{diff} normally ignores
these carriage returns on input and generates them at the end of each
output line, but with the @option{--binary} option @command{diff} treats
each carriage return as just another input character, and does not
generate a carriage return at the end of each output line. This can be
useful when dealing with non-text files that are meant to be
interchanged with @sc{posix}-compliant systems.

The @option{--strip-trailing-cr} option causes @command{diff} to treat input
lines that end in carriage return followed by newline as if they end
in plain newline. This can be useful when comparing text that is
imperfectly imported from many personal computer operating systems.
This option affects how lines are read, which in turn affects how they
are compared and output.

If you want to compare two files byte by byte, you can use the
@command{cmp} program with the @option{-l} option to show the values
of each differing byte in the two files. With @sc{gnu} @command{cmp},
you can also use the @option{-b} option to show the @sc{ascii}
representation of those bytes. @xref{Invoking cmp}, for more
information.

If @command{diff3} thinks that any of the files it is comparing is binary
(a non-text file), it normally reports an error, because such
comparisons are usually not useful. @command{diff3} uses the same test as
@command{diff} to decide whether a file is binary. As with @command{diff}, if
the input files contain a few non-text bytes but otherwise are like
text files, you can force @command{diff3} to consider all files to be text
files and compare them line by line by using the @option{-a} or
@option{--text} option.

@node Output Formats
@chapter @command{diff} Output Formats
@cindex output formats
@cindex format of @command{diff} output

@command{diff} has several mutually exclusive options for output format.
The following sections describe each format, illustrating how
@command{diff} reports the differences between two sample input files.

@menu
* Sample diff Input:: Sample @command{diff} input files for examples.
* Normal:: Showing differences without surrounding text.
* Context:: Showing differences with the surrounding text.
* Side by Side:: Showing differences in two columns.
* Scripts:: Generating scripts for other programs.
* If-then-else:: Merging files with if-then-else.
@end menu

@node Sample diff Input
@section Two Sample Input Files
@cindex @command{diff} sample input
@cindex sample input for @command{diff}

Here are two sample files that we will use in numerous examples to
illustrate the output of @command{diff} and how various options can change
it.

This is the file @file{lao}:

@example
The Way that can be told of is not the eternal Way;
The name that can be named is not the eternal name.
The Nameless is the origin of Heaven and Earth;
The Named is the mother of all things.
Therefore let there always be non-being,
  so we may see their subtlety,
And let there always be being,
  so we may see their outcome.
The two are the same,
But after they are produced,
  they have different names.
@end example

This is the file @file{tzu}:

@example
The Nameless is the origin of Heaven and Earth;
The named is the mother of all things.

Therefore let there always be non-being,
  so we may see their subtlety,
And let there always be being,
  so we may see their outcome.
The two are the same,
But after they are produced,
  they have different names.
They both may be called deep and profound.
Deeper and more profound,
The door of all subtleties!
@end example

In this example, the first hunk contains just the first two lines of
@file{lao}, the second hunk contains the fourth line of @file{lao}
opposing the second and third lines of @file{tzu}, and the last hunk
contains just the last three lines of @file{tzu}.

@node Normal
@section Showing Differences Without Context
@cindex normal output format
@cindex @samp{<} output format

The ``normal'' @command{diff} output format shows each hunk of differences
without any surrounding context. Sometimes such output is the clearest
way to see how lines have changed, without the clutter of nearby
unchanged lines (although you can get similar results with the context
or unified formats by using 0 lines of context). However, this format
is no longer widely used for sending out patches; for that purpose, the
context format (@pxref{Context Format}) and the unified format
(@pxref{Unified Format}) are superior. Normal format is the default for
compatibility with older versions of @command{diff} and the @sc{posix}
standard. Use the @option{--normal} option to select this output
format explicitly.

@menu
* Detailed Normal:: A detailed description of normal output format.
* Example Normal:: Sample output in the normal format.
@end menu

@node Detailed Normal
@subsection Detailed Description of Normal Format

The normal output format consists of one or more hunks of differences;
each hunk shows one area where the files differ. Normal format hunks
look like this:

@example
@var{change-command}
< @var{from-file-line}
< @var{from-file-line}@dots{}
---
> @var{to-file-line}
> @var{to-file-line}@dots{}
@end example

There are three types of change commands. Each consists of a line
number or comma-separated range of lines in the first file, a single
character indicating the kind of change to make, and a line number or
comma-separated range of lines in the second file. All line numbers are
the original line numbers in each file. The types of change commands
are:

@table @samp
@item @var{l}a@var{r}
Add the lines in range @var{r} of the second file after line @var{l} of
the first file. For example, @samp{8a12,15} means append lines 12--15
of file 2 after line 8 of file 1; or, if changing file 2 into file 1,
delete lines 12--15 of file 2.

@item @var{f}c@var{t}
Replace the lines in range @var{f} of the first file with lines in range
@var{t} of the second file. This is like a combined add and delete, but
more compact. For example, @samp{5,7c8,10} means change lines 5--7 of
file 1 to read as lines 8--10 of file 2; or, if changing file 2 into
file 1, change lines 8--10 of file 2 to read as lines 5--7 of file 1.

@item @var{r}d@var{l}
Delete the lines in range @var{r} from the first file; line @var{l} is where
they would have appeared in the second file had they not been deleted.
For example, @samp{5,7d3} means delete lines 5--7 of file 1; or, if
changing file 2 into file 1, append lines 5--7 of file 1 after line 3 of
file 2.
@end table
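
The shape of a change command lends itself to a small parser. The
following Python sketch splits a command into its two line ranges and
the change letter; it is an illustration of the syntax described in the
table above, with a hypothetical function name and a regular expression
written for this example rather than taken from @command{diff} or
@command{patch}.

@example
```python
import re

# range, change letter, range; each range is N or N,M
CHANGE_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def parse_change_command(text):
    """Split a normal-format change command into
    (first_file_range, kind, second_file_range), where each range is
    an inclusive (start, end) pair of original line numbers."""
    match = CHANGE_COMMAND.match(text)
    if match is None:
        raise ValueError("not a change command: " + repr(text))
    f_start, f_end, kind, t_start, t_end = match.groups()
    first = (int(f_start), int(f_end or f_start))
    second = (int(t_start), int(t_end or t_start))
    return first, kind, second

for command in ["8a12,15", "5,7c8,10", "5,7d3"]:
    print(command, parse_change_command(command))
```
@end example

@noindent
A single line number is reported as a range whose start and end are
equal, so all three kinds of command yield the same structure.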

@node Example Normal
@subsection An Example of Normal Format

Here is the output of the command @samp{diff lao tzu}
(@pxref{Sample diff Input}, for the complete contents of the two files).
Notice that it shows only the lines that are different between the two
files.

@example
1,2d0
< The Way that can be told of is not the eternal Way;
< The name that can be named is not the eternal name.
4c2,3
< The Named is the mother of all things.
---
> The named is the mother of all things.
>
11a11,13
> They both may be called deep and profound.
> Deeper and more profound,
> The door of all subtleties!
@end example
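
For readers who want to experiment, the sketch below produces
normal-format output for the two sample files from the line matching
computed by Python's standard @code{difflib} module. It is an
approximation for study purposes: @code{difflib} uses its own matching
method rather than the algorithm of @sc{gnu} @command{diff}, so the hunks
need not agree for other inputs, and the names @code{line_range} and
@code{normal_diff} are hypothetical.

@example
```python
import difflib

# The two sample files, one list item per line.
LAO = """\
The Way that can be told of is not the eternal Way;
The name that can be named is not the eternal name.
The Nameless is the origin of Heaven and Earth;
The Named is the mother of all things.
Therefore let there always be non-being,
  so we may see their subtlety,
And let there always be being,
  so we may see their outcome.
The two are the same,
But after they are produced,
  they have different names.
""".splitlines()

TZU = """\
The Nameless is the origin of Heaven and Earth;
The named is the mother of all things.

Therefore let there always be non-being,
  so we may see their subtlety,
And let there always be being,
  so we may see their outcome.
The two are the same,
But after they are produced,
  they have different names.
They both may be called deep and profound.
Deeper and more profound,
The door of all subtleties!
""".splitlines()

def line_range(start, stop):
    """Format the 0-based half-open range [start, stop) as 1-based lines."""
    if stop - start == 1:
        return str(stop)
    return "%d,%d" % (start + 1, stop)

def normal_diff(old, new):
    """Return normal-format output lines for two lists of lines."""
    out = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "insert":
            out.append("%da%s" % (i1, line_range(j1, j2)))
        elif tag == "delete":
            out.append("%sd%d" % (line_range(i1, i2), j1))
        else:  # "replace" corresponds to a 'c' command
            out.append("%sc%s" % (line_range(i1, i2), line_range(j1, j2)))
        out.extend("< " + line for line in old[i1:i2])
        if tag == "replace":
            out.append("---")
        out.extend("> " + line for line in new[j1:j2])
    return out

print("\n".join(normal_diff(LAO, TZU)))
```
@end example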

@node Context
@section Showing Differences in Their Context
@cindex context output format
@cindex @samp{!} output format

Usually, when you are looking at the differences between files, you will
also want to see the parts of the files near the lines that differ, to
help you understand exactly what has changed. These nearby parts of the
files are called the @dfn{context}.

@sc{gnu} @command{diff} provides two output formats that show context
around the differing lines: @dfn{context format} and @dfn{unified
format}. It can optionally show in which function or section of the
file the differing lines are found.

If you are distributing new versions of files to other people in the
form of @command{diff} output, you should use one of the output formats
that show context so that they can apply the diffs even if they have
made small changes of their own to the files. @command{patch} can apply
the diffs in this case by searching in the files for the lines of
context around the differing lines; if those lines are actually a few
lines away from where the diff says they are, @command{patch} can adjust
the line numbers accordingly and still apply the diff correctly.
@xref{Imperfect}, for more information on using @command{patch} to apply
imperfect diffs.

@menu
* Context Format:: An output format that shows surrounding lines.
* Unified Format:: A more compact output format that shows context.
* Sections:: Showing which sections of the files differences are in.
* Alternate Names:: Showing alternate file names in context headers.
@end menu

@node Context Format
@subsection Context Format

The context output format shows several lines of context around the
lines that differ. It is the standard format for distributing updates
to source code.

To select this output format, use the @option{-C @var{lines}},
@option{--context@r{[}=@var{lines}@r{]}}, or @option{-c} option. The
argument @var{lines} that some of these options take is the number of
lines of context to show. If you do not specify @var{lines}, it
defaults to three. For proper operation, @command{patch} typically needs
at least two lines of context.

@menu
* Detailed Context:: A detailed description of the context output format.
* Example Context:: Sample output in context format.
* Less Context:: Another sample with less context.
@end menu

686@node Detailed Context
687@subsubsection Detailed Description of Context Format
688
689The context output format starts with a two-line header, which looks
690like this:
691
692@example
693*** @var{from-file} @var{from-file-modification-time}
694--- @var{to-file} @var{to-file-modification time}
695@end example
696
697@noindent
698@vindex LC_TIME
699@cindex time stamp format, context diffs
700The time stamp normally looks like @samp{2002-02-21 23:30:39.942229878
701-0800} to indicate the date, time with fractional seconds, and time
702zone in @uref{ftp://ftp.isi.edu/in-notes/rfc2822.txt, Internet RFC
7032822 format}. However, a traditional time stamp like @samp{Thu Feb 21
70423:30:39 2002} is used if the @env{LC_TIME} locale category is either
705@samp{C} or @samp{POSIX}.
706
707You can change the header's content with the
708@option{--label=@var{label}} option; see @ref{Alternate Names}.
709
710Next come one or more hunks of differences; each hunk shows one area
711where the files differ. Context format hunks look like this:
712
713@example
714***************
715*** @var{from-file-line-range} ****
716 @var{from-file-line}
717 @var{from-file-line}@dots{}
718--- @var{to-file-line-range} ----
719 @var{to-file-line}
720 @var{to-file-line}@dots{}
721@end example
722
723The lines of context around the lines that differ start with two space
724characters. The lines that differ between the two files start with one
725of the following indicator characters, followed by a space character:
726
727@table @samp
728@item !
729A line that is part of a group of one or more lines that changed between
730the two files. There is a corresponding group of lines marked with
731@samp{!} in the part of this hunk for the other file.
732
733@item +
734An ``inserted'' line in the second file that corresponds to nothing in
735the first file.
736
737@item -
738A ``deleted'' line in the first file that corresponds to nothing in the
739second file.
740@end table
741
742If all of the changes in a hunk are insertions, the lines of
743@var{from-file} are omitted. If all of the changes are deletions, the
744lines of @var{to-file} are omitted.
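
The indicator rules above can be sketched as a small classifier. This
is an illustrative Python fragment, not part of @command{diff} or
@command{patch}; the name @code{classify_context_line} is hypothetical.
It assumes the caller passes only body lines of a hunk, not the
@samp{***} or @samp{---} range lines.

```python
def classify_context_line(line):
    """Name the role of one body line from a context-format hunk."""
    roles = (("!", "changed"), ("+", "inserted"),
             ("-", "deleted"), (" ", "context"))
    # Pad to two characters so that a bare indicator, left behind when
    # trailing blanks were stripped, is still accepted.
    indicator, separator = line[:2].ljust(2)
    if separator != " ":
        raise ValueError("missing space after indicator: " + repr(line))
    for character, role in roles:
        if indicator == character:
            return role
    raise ValueError("unknown indicator: " + repr(line))

print(classify_context_line("! The Named is the mother of all things."))
print(classify_context_line("  The Nameless is the origin of Heaven and Earth;"))
```

Checking the second character as well as the first keeps range lines
such as @samp{*** 1,7 ****} from being mistaken for body lines.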
745
746@node Example Context
747@subsubsection An Example of Context Format
748
749Here is the output of @samp{diff -c lao tzu} (@pxref{Sample diff Input},
750for the complete contents of the two files). Notice that up to three
751lines that are not different are shown around each line that is
752different; they are the context lines. Also notice that the first two
753hunks have run together, because their contents overlap.
754
@example
*** lao 2002-02-21 23:30:39.942229878 -0800
--- tzu 2002-02-21 23:30:50.442260588 -0800
***************
*** 1,7 ****
- The Way that can be told of is not the eternal Way;
- The name that can be named is not the eternal name.
  The Nameless is the origin of Heaven and Earth;
! The Named is the mother of all things.
  Therefore let there always be non-being,
    so we may see their subtlety,
  And let there always be being,
--- 1,6 ----
  The Nameless is the origin of Heaven and Earth;
! The named is the mother of all things.
!
  Therefore let there always be non-being,
    so we may see their subtlety,
  And let there always be being,
***************
*** 9,11 ****
--- 8,13 ----
  The two are the same,
  But after they are produced,
    they have different names.
+ They both may be called deep and profound.
+ Deeper and more profound,
+ The door of all subtleties!
@end example
784
785@node Less Context
786@subsubsection An Example of Context Format with Less Context
787
788Here is the output of @samp{diff -C 1 lao tzu} (@pxref{Sample diff
789Input}, for the complete contents of the two files). Notice that at
790most one context line is reported here.
791
@example
*** lao 2002-02-21 23:30:39.942229878 -0800
--- tzu 2002-02-21 23:30:50.442260588 -0800
***************
*** 1,5 ****
- The Way that can be told of is not the eternal Way;
- The name that can be named is not the eternal name.
  The Nameless is the origin of Heaven and Earth;
! The Named is the mother of all things.
  Therefore let there always be non-being,
--- 1,4 ----
  The Nameless is the origin of Heaven and Earth;
! The named is the mother of all things.
!
  Therefore let there always be non-being,
***************
*** 11 ****
--- 10,13 ----
    they have different names.
+ They both may be called deep and profound.
+ Deeper and more profound,
+ The door of all subtleties!
@end example
815
816@node Unified Format
817@subsection Unified Format
818@cindex unified output format
819@cindex @samp{+-} output format
820
821The unified output format is a variation on the context format that is
822more compact because it omits redundant context lines. To select this
823output format, use the @option{-U @var{lines}},
824@option{--unified@r{[}=@var{lines}@r{]}}, or @option{-u}
825option. The argument @var{lines} is the number of lines of context to
826show. When it is not given, it defaults to three.
827
828At present, only @sc{gnu} @command{diff} can produce this format and
829only @sc{gnu} @command{patch} can automatically apply diffs in this
830format. For proper operation, @command{patch} typically needs at
831least three lines of context.
832
833@menu
834* Detailed Unified:: A detailed description of unified format.
835* Example Unified:: Sample output in unified format.
836@end menu
837
838@node Detailed Unified
839@subsubsection Detailed Description of Unified Format
840
841The unified output format starts with a two-line header, which looks
842like this:
843
844@example
845--- @var{from-file} @var{from-file-modification-time}
846+++ @var{to-file} @var{to-file-modification-time}
847@end example
848
849@noindent
850@cindex time stamp format, unified diffs
851The time stamp looks like @samp{2002-02-21 23:30:39.942229878 -0800}
852to indicate the date, time with fractional seconds, and time zone.
853
You can change the header's content with the
@option{--label=@var{label}} option; see @ref{Alternate Names}.
856
857Next come one or more hunks of differences; each hunk shows one area
858where the files differ. Unified format hunks look like this:
859
860@example
861@@@@ @var{from-file-range} @var{to-file-range} @@@@
862 @var{line-from-either-file}
863 @var{line-from-either-file}@dots{}
864@end example
865
866The lines common to both files begin with a space character. The lines
867that actually differ between the two files have one of the following
868indicator characters in the left print column:
869
870@table @samp
871@item +
872A line was added here to the first file.
873
874@item -
875A line was removed here from the first file.
876@end table
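
To make the relation between a hunk's range line and its body concrete,
here is a minimal Python sketch. It is an illustration only, not code
from @command{diff} or @command{patch}, and the function names are
hypothetical. It assumes that each range is written as a start line
optionally followed by a comma and a line count, and that an omitted
count stands for a one-line range.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk_header(line):
    """Return (from_start, from_count, to_start, to_count)."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError("not a unified hunk header: " + line)
    from_start, from_count, to_start, to_count = match.groups()
    # Assumption: an omitted count stands for a one-line range.
    return (int(from_start), int(from_count or 1),
            int(to_start), int(to_count or 1))

def hunk_is_consistent(header, body):
    """Check that the body line counts agree with the header ranges."""
    _, from_count, _, to_count = parse_hunk_header(header)
    # Common and removed lines belong to the first file;
    # common and added lines belong to the second file.
    from_lines = sum(1 for line in body if line[:1] in (" ", "-"))
    to_lines = sum(1 for line in body if line[:1] in (" ", "+"))
    return from_lines == from_count and to_lines == to_count

print(parse_hunk_header("@@ -9,3 +8,6 @@"))
```

A hunk whose body has three common lines and three added lines is
consistent with the header @samp{@@@@ -9,3 +8,6 @@@@}: three lines come
from the first file and six from the second.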
877
878@node Example Unified
879@subsubsection An Example of Unified Format
880
881Here is the output of the command @samp{diff -u lao tzu}
882(@pxref{Sample diff Input}, for the complete contents of the two files):
883
@example
--- lao 2002-02-21 23:30:39.942229878 -0800
+++ tzu 2002-02-21 23:30:50.442260588 -0800
@@@@ -1,7 +1,6 @@@@
-The Way that can be told of is not the eternal Way;
-The name that can be named is not the eternal name.
 The Nameless is the origin of Heaven and Earth;
-The Named is the mother of all things.
+The named is the mother of all things.
+
 Therefore let there always be non-being,
   so we may see their subtlety,
 And let there always be being,
@@@@ -9,3 +8,6 @@@@
 The two are the same,
 But after they are produced,
   they have different names.
+They both may be called deep and profound.
+Deeper and more profound,
+The door of all subtleties!
@end example
905
906@node Sections
907@subsection Showing Which Sections Differences Are in
908@cindex headings
909@cindex section headings
910
911Sometimes you might want to know which part of the files each change
912falls in. If the files are source code, this could mean which function
913was changed. If the files are documents, it could mean which chapter or
914appendix was changed. @sc{gnu} @command{diff} can show this by displaying the
915nearest section heading line that precedes the differing lines. Which
916lines are ``section headings'' is determined by a regular expression.
917
918@menu
919* Specified Headings:: Showing headings that match regular expressions.
920* C Function Headings:: Showing headings of C functions.
921@end menu
922
923@node Specified Headings
924@subsubsection Showing Lines That Match Regular Expressions
925@cindex specified headings
926@cindex regular expression matching headings
927
928To show in which sections differences occur for files that are not
929source code for C or similar languages, use the @option{-F @var{regexp}}
930or @option{--show-function-line=@var{regexp}} option. @command{diff}
931considers lines that match the @command{grep}-style regular expression
932@var{regexp} to be the beginning
933of a section of the file. Here are suggested regular expressions for
934some common languages:
935
936@c Please add to this list, e.g. Fortran, Pascal, Perl, Python.
937@table @samp
938@item ^[[:alpha:]$_]
939C, C++, Prolog
940@item ^(
941Lisp
942@item ^@@node
943Texinfo
944@end table
945
946This option does not automatically select an output format; in order to
947use it, you must select the context format (@pxref{Context Format}) or
948unified format (@pxref{Unified Format}). In other output formats it
949has no effect.
950
951The @option{-F} and @option{--show-function-line} options find the nearest
952unchanged line that precedes each hunk of differences and matches the
953given regular expression. Then they add that line to the end of the
954line of asterisks in the context format, or to the @samp{@@@@} line in
955unified format. If no matching line exists, they leave the output for
956that hunk unchanged. If that line is more than 40 characters long, they
957output only the first 40 characters. You can specify more than one
958regular expression for such lines; @command{diff} tries to match each line
959against each regular expression, starting with the last one given. This
960means that you can use @option{-p} and @option{-F} together, if you wish.
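
The search described above can be sketched in Python as follows. This
is a simplified illustration, not @command{diff}'s implementation: the
name @code{section_heading} is hypothetical, it treats every line before
the hunk as unchanged, and it uses Python regular expressions, so the
bracket expression @samp{[[:alpha:]$_]} is written as
@samp{[A-Za-z$_]}.

```python
import re

def section_heading(lines, hunk_start, patterns, limit=40):
    """Return the heading text for a hunk, or None when nothing matches.

    lines       the first file as a list of strings without newlines
    hunk_start  1-based number of the first line shown in the hunk
    patterns    regular expressions in the order they were given
    """
    # The last expression given is tried first.
    compiled = [re.compile(p) for p in reversed(patterns)]
    # Walk backwards, starting at the line just before the hunk.
    for index in range(hunk_start - 2, -1, -1):
        for pattern in compiled:
            if pattern.search(lines[index]):
                return lines[index][:limit]
    return None

source = ["int", "main (void)", "  int x = 1;", "  return x;"]
print(section_heading(source, 4, ["^[A-Za-z$_]"]))
```

Returning @code{None} corresponds to leaving the hunk's output
unchanged, and the slice implements the 40-character limit.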
961
962@node C Function Headings
963@subsubsection Showing C Function Headings
964@cindex C function headings
965@cindex function headings, C
966
967To show in which functions differences occur for C and similar
968languages, you can use the @option{-p} or @option{--show-c-function} option.
969This option automatically defaults to the context output format
970(@pxref{Context Format}), with the default number of lines of context.
971You can override that number with @option{-C @var{lines}} elsewhere in the
972command line. You can override both the format and the number with
973@option{-U @var{lines}} elsewhere in the command line.
974
975The @option{-p} and @option{--show-c-function} options are equivalent to
976@option{-F '^[[:alpha:]$_]'} if the unified format is specified, otherwise
977@option{-c -F '^[[:alpha:]$_]'} (@pxref{Specified Headings}). @sc{gnu}
978@command{diff} provides them for the sake of convenience.
979
980@node Alternate Names
981@subsection Showing Alternate File Names
982@cindex alternate file names
983@cindex file name alternates
984
985If you are comparing two files that have meaningless or uninformative
986names, you might want @command{diff} to show alternate names in the header
987of the context and unified output formats. To do this, use the
988@option{--label=@var{label}} option. The first time
989you give this option, its argument replaces the name and date of the
990first file in the header; the second time, its argument replaces the
991name and date of the second file. If you give this option more than
992twice, @command{diff} reports an error. The @option{--label} option does not
993affect the file names in the @command{pr} header when the @option{-l} or
994@option{--paginate} option is used (@pxref{Pagination}).
995
996Here are the first two lines of the output from @samp{diff -C 2
997--label=original --label=modified lao tzu}:
998
999@example
1000*** original
1001--- modified
1002@end example
1003
1004@node Side by Side
1005@section Showing Differences Side by Side
1006@cindex side by side
1007@cindex two-column output
1008@cindex columnar output
1009
1010@command{diff} can produce a side by side difference listing of two files.
1011The files are listed in two columns with a gutter between them. The
1012gutter contains one of the following markers:
1013
1014@table @asis
1015@item white space
1016The corresponding lines are in common. That is, either the lines are
1017identical, or the difference is ignored because of one of the
1018@option{--ignore} options (@pxref{White Space}).
1019
1020@item @samp{|}
1021The corresponding lines differ, and they are either both complete
1022or both incomplete.
1023
1024@item @samp{<}
1025The files differ and only the first file contains the line.
1026
1027@item @samp{>}
1028The files differ and only the second file contains the line.
1029
1030@item @samp{(}
1031Only the first file contains the line, but the difference is ignored.
1032
1033@item @samp{)}
1034Only the second file contains the line, but the difference is ignored.
1035
1036@item @samp{\}
1037The corresponding lines differ, and only the first line is incomplete.
1038
1039@item @samp{/}
1040The corresponding lines differ, and only the second line is incomplete.
1041@end table
1042
Normally, an output line is incomplete if and only if the lines that it
contains are incomplete (@pxref{Incomplete Lines}). However, when an
output line represents two differing lines, one might be incomplete
while the other is not. In this case, the output line is complete,
but the gutter is marked @samp{\} if the first line is incomplete, or
@samp{/} if the second line is.
1049
1050Side by side format is sometimes easiest to read, but it has limitations.
1051It generates much wider output than usual, and truncates lines that are
1052too long to fit. Also, it relies on lining up output more heavily than
1053usual, so its output looks particularly bad if you use varying
1054width fonts, nonstandard tab stops, or nonprinting characters.
1055
1056You can use the @command{sdiff} command to interactively merge side by side
1057differences. @xref{Interactive Merging}, for more information on merging files.
1058
1059@menu
1060* Side by Side Format:: Controlling side by side output format.
1061* Example Side by Side:: Sample side by side output.
1062@end menu
1063
1064@node Side by Side Format
1065@subsection Controlling Side by Side Format
1066@cindex side by side format
1067
1068The @option{-y} or @option{--side-by-side} option selects side by side
1069format. Because side by side output lines contain two input lines, the
1070output is wider than usual: normally 130 print columns, which can fit
1071onto a traditional printer line. You can set the width of the output
1072with the @option{-W @var{columns}} or @option{--width=@var{columns}}
1073option. The output is split into two halves of equal width, separated by a
1074small gutter to mark differences; the right half is aligned to a tab
1075stop so that tabs line up. Input lines that are too long to fit in half
1076of an output line are truncated for output.
1077
1078The @option{--left-column} option prints only the left column of two
1079common lines. The @option{--suppress-common-lines} option suppresses
1080common lines entirely.
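
The layout of a single output line can be sketched as follows. This
Python fragment is an illustration under stated assumptions, not
@command{diff}'s own code: the name @code{side_by_side} is hypothetical,
the geometry is fixed by hand for a 72-column line (two halves of 32
columns, with the right half starting at the tab stop in column 41),
and tabs in the input are not expanded.

```python
def side_by_side(left, right, marker=" ", half=32, column2=40):
    """Lay out one side by side output line.

    half     assumed width of each half, in columns
    column2  assumed zero-based column where the right half starts
    """
    # Place the marker midway between the end of the left half
    # and the start of the right half.
    gutter = (half + column2 - 1) // 2
    line = left[:half].ljust(gutter) + marker
    if right:
        line = line.ljust(column2) + right[:half]
    return line.rstrip()

print(side_by_side("The Named is the mother of all things.",
                   "The named is the mother of all things.", "|"))
print(side_by_side("", "Deeper and more profound,", ">"))
```

Both input lines are cut to the width of a half before they are joined,
which is why long lines appear truncated in this format.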
1081
1082@node Example Side by Side
1083@subsection An Example of Side by Side Format
1084
1085Here is the output of the command @samp{diff -y -W 72 lao tzu}
1086(@pxref{Sample diff Input}, for the complete contents of the two files).
1087
@example
The Way that can be told of is n   <
The name that can be named is no   <
The Nameless is the origin of He        The Nameless is the origin of He
The Named is the mother of all t   |    The named is the mother of all t
                                   >
Therefore let there always be no        Therefore let there always be no
  so we may see their subtlety,           so we may see their subtlety,
And let there always be being,          And let there always be being,
  so we may see their outcome.            so we may see their outcome.
The two are the same,                   The two are the same,
But after they are produced,            But after they are produced,
  they have different names.              they have different names.
                                   >    They both may be called deep and
                                   >    Deeper and more profound,
                                   >    The door of all subtleties!
@end example
1105
1106@node Scripts
1107@section Making Edit Scripts
1108@cindex script output formats
1109
1110Several output modes produce command scripts for editing @var{from-file}
1111to produce @var{to-file}.
1112
1113@menu
1114* ed Scripts:: Using @command{diff} to produce commands for @command{ed}.
1115* Forward ed:: Making forward @command{ed} scripts.
1116* RCS:: A special @command{diff} output format used by @sc{rcs}.
1117@end menu
1118
1119@node ed Scripts
1120@subsection @command{ed} Scripts
1121@cindex @command{ed} script output format
1122
1123@command{diff} can produce commands that direct the @command{ed} text editor
1124to change the first file into the second file. Long ago, this was the
1125only output mode that was suitable for editing one file into another
1126automatically; today, with @command{patch}, it is almost obsolete. Use the
1127@option{-e} or @option{--ed} option to select this output format.
1128
1129Like the normal format (@pxref{Normal}), this output format does not
1130show any context; unlike the normal format, it does not include the
1131information necessary to apply the diff in reverse (to produce the first
1132file if all you have is the second file and the diff).
1133
1134If the file @file{d} contains the output of @samp{diff -e old new}, then
1135the command @samp{(cat d && echo w) | ed - old} edits @file{old} to make
1136it a copy of @file{new}. More generally, if @file{d1}, @file{d2},
1137@dots{}, @file{dN} contain the outputs of @samp{diff -e old new1},
1138@samp{diff -e new1 new2}, @dots{}, @samp{diff -e newN-1 newN},
1139respectively, then the command @samp{(cat d1 d2 @dots{} dN && echo w) |
1140ed - old} edits @file{old} to make it a copy of @file{newN}.
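
To show what such a script does to a file, here is a minimal Python
sketch that applies the @samp{a} (append), @samp{c} (change), and
@samp{d} (delete) commands to a list of lines. It is not a replacement
for @command{ed}: the name @code{apply_ed_script} is hypothetical, it
handles only these three commands, and it does not undo the doubling of
a line that consists of a single period.

```python
import re

def apply_ed_script(old_lines, script_lines):
    """Apply a, c and d commands from an ed-style script to a list of lines."""
    result = list(old_lines)
    command = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")
    i = 0
    while i < len(script_lines):
        match = command.match(script_lines[i])
        if match is None:
            raise ValueError("unsupported command: " + script_lines[i])
        first = int(match.group(1))
        last = int(match.group(2) or match.group(1))
        kind = match.group(3)
        i += 1
        text = []
        if kind in "ac":
            # Input text runs up to a line holding a single period.
            while script_lines[i] != ".":
                text.append(script_lines[i])
                i += 1
            i += 1  # skip the terminating period
        if kind == "a":
            result[first:first] = text          # insert after line first
        elif kind == "c":
            result[first - 1:last] = text       # replace the range
        else:
            result[first - 1:last] = []         # delete the range
    return result

script = ["5a", "x", "y", ".", "3c", "C", ".", "1,2d"]
print(apply_ed_script(["a", "b", "c", "d", "e"], script))
```

The commands in this sample script run from the end of the file toward
the beginning, so each line number still refers to the original
numbering when its command is applied.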
1141
1142@menu
1143* Detailed ed:: A detailed description of @command{ed} format.
1144* Example ed:: A sample @command{ed} script.
1145@end menu
1146
1147@node Detailed ed
1148@subsubsection Detailed Description of @command{ed} Format
1149
1150The @command{ed} output format consists of one or more hunks of
1151differences. The changes closest to the ends of the files come first so
1152that commands that change the number of lines do not affect how
1153@command{ed} interprets line numbers in succeeding commands. @command{ed}
1154format hunks look like this:
1155
1156@example
1157@var{change-command}
1158@var{to-file-line}
1159@var{to-file-line}@dots{}
1160.
1161@end example
1162
1163Because @command{ed} uses a single period on a line to indicate the end of
1164input, @sc{gnu} @command{diff} protects lines of changes that contain a single
1165period on a line by writing two periods instead, then writing a
1166subsequent @command{ed} command to change the two periods into one. The
1167@command{ed} format cannot represent an incomplete line, so if the second
1168file ends in a changed incomplete line, @command{diff} reports an error and
1169then pretends that a newline was appended.
1170
1171There are three types of change commands. Each consists of a line
1172number or comma-separated range of lines in the first file and a single
1173character indicating the kind of change to make. All line numbers are
1174the original line numbers in the file. The types of change commands
1175are:
1176
1177@table @samp
1178@item @var{l}a
1179Add text from the second file after line @var{l} in the first file. For
1180example, @samp{8a} means to add the following lines after line 8 of file
11811.
1182
1183@item @var{r}c
1184Replace the lines in range @var{r} in the first file with the following