Linux diff and sdiff Command Examples

Compare files with diff and sdiff

Diff Command


The diff command is used to compare files line by line. The general syntax of the diff command is as follows:

syntax: diff file1 file2

Basically, diff describes what changes would need to be applied to file1 in order for it to match the contents of file2.

The output of the diff command uses a small set of keys to describe each difference. These keys are listed below:



Key    Description
a      Added
c      Changed
d      Deleted
<      File 1
>      File 2

The easiest way to demonstrate the diff command is by example.

Diff Command Example 1


Below are two files (filea and fileb). The content of each file is displayed using the "cat" command:

Content of filea


john@ubuntu01-pc:~/test$ cat filea
I am line one of this test
I am line two of this test
I am line three of this test
I am line four of this test

Content of fileb


john@ubuntu01-pc:~/test$ cat fileb
I am line one of this test
I am not line two
I am not line three
I am not line four

Output from diff command:


john@ubuntu01-pc:~/test$ diff filea fileb
2,4c2,4
< I am line two of this test
< I am line three of this test
< I am line four of this test
---
> I am not line two
> I am not line three
> I am not line four

The first line of the diff output contains:

"Line Numbers" corresponding to the first file, a descriptor (a for Add, c for Change, or d for Delete), and line numbers that correspond to the second file.

In the output above, the first line is "2,4c2,4". This means:

Lines 2-4 in the first file need to be changed in order to match lines 2-4 of the second file. The affected lines from each file are then displayed.

Lines prefixed by "<" are lines from the first file (filea).

Lines prefixed by ">" are lines from the second file (fileb).

The three dashes ("---") separate the lines of the first file from the lines of the second file.
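
The example above can be reproduced with a short script. This is a minimal sketch, assuming GNU diff and a scratch directory created with mktemp (the variable names are illustrative only):

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files shown above.
printf '%s\n' 'I am line one of this test' \
              'I am line two of this test' \
              'I am line three of this test' \
              'I am line four of this test' > filea
printf '%s\n' 'I am line one of this test' \
              'I am not line two' \
              'I am not line three' \
              'I am not line four' > fileb

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going if it is run under "set -e".
normal_output=$(diff filea fileb || true)
printf '%s\n' "$normal_output"

# The first line of the normal output is the change command.
change_command=$(printf '%s\n' "$normal_output" | head -n 1)
echo "Change command: $change_command"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

The last line printed should report the change command "2,4c2,4", matching the output shown above.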

diff - Side by Side Comparison


An alternative way to display the output of the diff command is to pass the "-y" flag, which produces a "side by side" comparison:


john@ubuntu01-pc:~/test$ diff -y filea fileb
I am line one of this test					I am line one of this test
I am line two of this test				      |	I am not line two
I am line three of this test				      |	I am not line three
I am line four of this test				      |	I am not line four

From the above output we can see that lines 2-4 of "fileb" differ from those of "filea" (fileb is shown on the right-hand side). The "|" character marks lines that differ.

Another way of carrying out a "side by side" comparison is to use the "sdiff" command:


john@ubuntu01-pc:~/test$ sdiff filea fileb
I am line one of this test					I am line one of this test
I am line two of this test				      |	I am not line two
I am line three of this test				      |	I am not line three
I am line four of this test				      |	I am not line four
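
For larger files it can help to show only the lines that differ. The following is a minimal sketch, assuming GNU diff and the filea and fileb contents shown above; it combines "-y" with the "--suppress-common-lines" and "-W" options from the option list later in this article:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'I am line one of this test' \
              'I am line two of this test' \
              'I am line three of this test' \
              'I am line four of this test' > filea
printf '%s\n' 'I am line one of this test' \
              'I am not line two' \
              'I am not line three' \
              'I am not line four' > fileb

# -y                       two-column output
# --suppress-common-lines  hide lines that are identical in both files
# -W 80                    limit the output to 80 print columns
changed_only=$(diff -y --suppress-common-lines -W 80 filea fileb || true)
printf '%s\n' "$changed_only"

# Each changed line carries a "|" in the gutter between the columns.
changed_count=$(printf '%s\n' "$changed_only" | grep -c '|' || true)
echo "Lines that differ: $changed_count"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

Line one is identical in both files, so it no longer appears in the output.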

Diff Command Example 2


Content of file1


john@ubuntu01-pc:~/test$ cat file1
I like Arch Linux
I like Ubuntu
I like Debian

Content of file2


john@ubuntu01-pc:~/test$ cat file2
I like Arch Linux
I like Ubuntu
I like Debian
I like CentOS

Output from the diff command:


john@ubuntu01-pc:~/test$ diff file1 file2
3a4
> I like CentOS

From the above output we can see that after line 3 of the first file, a line needs to be added: line 4 of the second file. This line is then shown, prefixed by ">".

Diff Command Example 3


Content of filec


john@ubuntu01-pc:~/test$ cat filec
I am line one
I am line two
I am line three
I am line four

Content of filed


john@ubuntu01-pc:~/test$ cat filed
I am line one
I am line two
I am line three

Output from the diff command:


john@ubuntu01-pc:~/test$ diff filec filed
4d3
< I am line four

Here, the output is telling us: delete line 4 in the first file so that both files sync up at line 3. The content of the line that needs to be deleted is then displayed.

Output from diff command with -y parameter:


john@ubuntu01-pc:~/test$ diff -y filec filed
I am line one							I am line one
I am line two							I am line two
I am line three							I am line three
I am line four						      <
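
Because diff describes how to turn the first file into the second, swapping the two arguments turns the deletion into an addition. A minimal sketch, assuming GNU diff, the filec and filed contents shown above, and a scratch directory created with mktemp:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'I am line one' 'I am line two' 'I am line three' 'I am line four' > filec
printf '%s\n' 'I am line one' 'I am line two' 'I am line three' > filed

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going if it is run under "set -e".
forward=$(diff filec filed || true)   # filec to filed: a deletion
reverse=$(diff filed filec || true)   # filed to filec: an addition

printf 'filec to filed:\n%s\n\n' "$forward"
printf 'filed to filec:\n%s\n' "$reverse"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

The first comparison reports "4d3" and the second reports "3a4", with the same line shown under a ">" prefix instead of "<".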

Diff Command Example 4


The diff command can also be used to quickly compare the contents of two directories:


john@ubuntu01-pc:~$ ls dir*
dira:
file1  file2  file3  file4

dirb:
file1  file2  file4

Output from the diff directory comparison:


john@ubuntu01-pc:~$ diff dira dirb
Only in dira: file3

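From the above you can see that only directory "dira" contains the file "file3". Files that exist in both directories are compared as well; since nothing else is reported, their contents are identical.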
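
Directory comparisons are often combined with the "-r" and "-q" options from the option list later in this article. The following is a minimal sketch, assuming GNU diff; the subdirectory "sub" and the file "notes" are hypothetical additions that are not part of the example above:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build two directory trees similar to dira and dirb above.
mkdir -p dira/sub dirb/sub
for name in file1 file2 file4; do
    echo "shared content" > "dira/$name"
    echo "shared content" > "dirb/$name"
done
echo "only in dira" > dira/file3

# Hypothetical nested file that differs between the two trees.
echo "version A" > dira/sub/notes
echo "version B" > dirb/sub/notes

# -r  descend into subdirectories
# -q  report only whether files differ, not how
dir_report=$(diff -r -q dira dirb || true)
printf '%s\n' "$dir_report"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

The report should name "file3" as present only in "dira" and list "sub/notes" as differing, without printing the changed lines themselves.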




Context and Unified modes


As we saw in the earlier examples, the "diff" command can easily identify differences between files and directories. However, the default output is not always easy to follow in larger files, because it shows none of the surrounding lines. Two further output modes are available that include surrounding lines of context: "Context" mode and "Unified" mode.

Context Mode


To use the diff command in "Context Mode", pass the "-c" flag.

Syntax: diff -c file1 file2

Content of filec


john@ubuntu01-pc:~/test$ cat filec
I am line one
I am line two
I am line three
I am line four

Content of filed


john@ubuntu01-pc:~/test$ cat filed
I am line one
I am line two
I am line three

Output from diff "Context Mode" Command:


john@ubuntu01-pc:~/test$ diff -c filec filed
*** filec	2014-11-11 20:28:06.225478584 +0000
--- filed	2014-11-11 20:28:28.769478124 +0000
***************
*** 1,4 ****
  I am line one
  I am line two
  I am line three
- I am line four
--- 1,3 ----

The first two lines of output show us information about the "from" file (filec) and the "to" file (filed).

Each of these lines lists the name of the file followed by its modification date and time. The "from" file is indicated by "***", and the "to" file is indicated by "---".

The line "***************" is a separator.

The next line has three asterisks ("***"), followed by a line range from the first file (in this case lines 1-4, written as "1,4"), followed by four asterisks ("****").

Then it shows us the contents of those lines. If the line is unchanged, it is simply prefixed by two spaces. If the line is changed, it is prefixed by an indicator character and a space (in this example, "-").

The meanings of these characters are described in the table below:

Character  Description
!          Indicates a line is part of a group of one or more lines that needs to be changed. There is a corresponding group of lines prefixed with an "!" in the other file's context too.
+          Indicates a line within the second file that needs to be added to the first file.
-          Indicates a line within the first file that needs to be deleted.

After the lines from the first file, there are three dashes ("---"), then a line range, then four dashes ("----"). This indicates the line range in the second file that will sync up with the changes in the first file. In this example no lines follow "--- 1,3 ----" because the only change is a deletion, so diff omits the lines of the second file.

If there is more than one section that needs to change, the diff command will show these sections one after the other. Lines from the first file will still be indicated with "***", and lines from the second file with "---".
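
The number of context lines can be adjusted with the "-C NUM" option from the option list later in this article. A minimal sketch, assuming GNU diff and the filec and filed contents shown above, requesting a single line of context:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'I am line one' 'I am line two' 'I am line three' 'I am line four' > filec
printf '%s\n' 'I am line one' 'I am line two' 'I am line three' > filed

# -C 1 asks for one line of context instead of the default three.
# "|| true" keeps the script going under "set -e", because diff
# exits with status 1 when the files differ.
context_output=$(diff -C 1 filec filed || true)
printf '%s\n' "$context_output"

# Pick out the line range reported for the first file. The file
# header also starts with "***" but is followed by a name, not a digit.
hunk_header=$(printf '%s\n' "$context_output" | grep '^[*][*][*] [0-9]' || true)
echo "Range in first file: $hunk_header"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

With one line of context the range for the first file shrinks from "1,4" to "3,4": the deleted line plus the single unchanged line before it.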

Unified Mode


To use the diff command in "Unified Mode", pass the "-u" flag. Unified Mode is similar to Context Mode, but it is more compact because it does not repeat the unchanged context lines. The example below uses the same files as the previous example:

Output from diff "Unified Mode" Command:


john@ubuntu01-pc:~/test$ diff -u filec filed
--- filec	2014-11-11 20:28:06.225478584 +0000
+++ filed	2014-11-11 20:28:28.769478124 +0000
@@ -1,4 +1,3 @@
 I am line one
 I am line two
 I am line three
-I am line four

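The output is similar to the context example, but the differences are "unified" into one set. The line "@@ -1,4 +1,3 @@" gives the line ranges of the section: 4 lines starting at line 1 of the first file, and 3 lines starting at line 1 of the second file. Lines prefixed by "-" are removed from the first file, lines prefixed by "+" are added from the second file, and unchanged lines are prefixed by a space.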
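
Because every changed line in unified output starts with "+" or "-", the output is easy to summarise with standard tools. The following is a minimal sketch, assuming GNU diff and the filec and filed contents shown above; the counting approach is only an illustration, not a feature of diff itself:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'I am line one' 'I am line two' 'I am line three' 'I am line four' > filec
printf '%s\n' 'I am line one' 'I am line two' 'I am line three' > filed

# "|| true" keeps the script going under "set -e", because diff
# exits with status 1 when the files differ.
unified_output=$(diff -u filec filed || true)

# The "@@" line carries the line ranges for both files.
hunk_header=$(printf '%s\n' "$unified_output" | grep '^@@' || true)

# Count changed lines while skipping the "---" and "+++" file headers.
# This simple pattern would miss a changed line whose own text starts
# with "-" or "+".
removed=$(printf '%s\n' "$unified_output" | grep -c '^-[^-]' || true)
added=$(printf '%s\n' "$unified_output" | grep -c '^[+][^+]' || true)

echo "Hunk header:   $hunk_header"
echo "Lines removed: $removed"
echo "Lines added:   $added"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

For these two files the summary should show one removed line and no added lines.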

Arguments that can be passed to diff



	--normal
              output a normal diff (the default)

	-q, --brief
              report only when files differ

	-s, --report-identical-files
              report when two files are the same

	-c, -C NUM, --context[=NUM]
              output NUM (default 3) lines of copied context

	-u, -U NUM, --unified[=NUM]
              output NUM (default 3) lines of unified context

	-e, --ed
              output an ed script

	-n, --rcs
              output an RCS format diff

	-y, --side-by-side
              output in two columns

	-W, --width=NUM
              output at most NUM (default 130) print columns

	--left-column
              output only the left column of common lines

	--suppress-common-lines
              do not output common lines

	-p, --show-c-function
              show which C function each change is in

	-F, --show-function-line=RE
              show the most recent line matching RE

	--label LABEL
              use LABEL instead of file name (can be repeated)

	-t, --expand-tabs
              expand tabs to spaces in output

	-T, --initial-tab
              make tabs line up by prepending a tab

	--tabsize=NUM
              tab stops every NUM (default 8) print columns

	--suppress-blank-empty
              suppress space or tab before empty output lines

	-l, --paginate
              pass output through `pr' to paginate it

	-r, --recursive
              recursively compare any subdirectories found
	
	-N, --new-file
              treat absent files as empty

	--unidirectional-new-file
              treat absent first files as empty

	--ignore-file-name-case
              ignore case when comparing file names

	--no-ignore-file-name-case
              consider case when comparing file names

	-x, --exclude=PAT
              exclude files that match PAT

	-X, --exclude-from=FILE
              exclude files that match any pattern in FILE

	-S, --starting-file=FILE
              start with FILE when comparing directories

	--from-file=FILE1
              compare FILE1 to all operands; FILE1 can be a directory

	--to-file=FILE2
              compare all operands to FILE2; FILE2 can be a directory

	-i, --ignore-case
              ignore case differences in file contents

	-E, --ignore-tab-expansion
              ignore changes due to tab expansion

	-Z, --ignore-trailing-space
              ignore white space at line end

	-b, --ignore-space-change
              ignore changes in the amount of white space

	-w, --ignore-all-space
              ignore all white space

	-B, --ignore-blank-lines
              ignore changes whose lines are all blank

	-I, --ignore-matching-lines=RE
              ignore changes whose lines all match RE

	-a, --text
              treat all files as text

	--strip-trailing-cr
              strip trailing carriage return on input

	-D, --ifdef=NAME
              output merged file with `#ifdef NAME' diffs

	--GTYPE-group-format=GFMT
              format GTYPE input groups with GFMT

	--line-format=LFMT
              format all input lines with LFMT

	--LTYPE-line-format=LFMT
              format LTYPE input lines with LFMT

              These format options provide fine-grained control over the
              output of diff, generalizing -D/--ifdef.

              LTYPE is `old', `new', or `unchanged'.
              GTYPE is LTYPE or `changed'.

              GFMT (only) may contain:

	%<     lines from FILE1

	%>     lines from FILE2

	%=     lines common to FILE1 and FILE2

	%[-][WIDTH][.[PREC]]{doxX}LETTER
              printf-style spec for LETTER

              LETTERs are as follows for new group, lower case for old group:

	F      first line number

	L      last line number

	N      number of lines = L-F+1

	E      F-1

	M      L+1

	%(A=B?T:E)
              if A equals B then T else E

              LFMT (only) may contain:

	%L     contents of line

	%l     contents of line, excluding any trailing newline

	%[-][WIDTH][.[PREC]]{doxX}n
              printf-style spec for input line number

              Both GFMT and LFMT may contain:

	%%     %

	%c'C'  the single character C

	%c'\OOO'
              the character with octal code OOO

	C      the character C (other characters represent themselves)

	-d, --minimal
              try hard to find a smaller set of changes

	--horizon-lines=NUM
              keep NUM lines of the common prefix and suffix

	--speed-large-files
              assume large files and many scattered small changes

	--help
              display this help and exit

	-v, --version
              output version information and exit


	FILES  are  `FILE1  FILE2'  or `DIR1 DIR2' or `DIR FILE...' or `FILE...
	DIR'.  If --from-file or --to-file is given, there are no  restrictions
	on  FILE(s).   If a FILE is `-', read standard input.  Exit status is 0
	if inputs are the same, 1 if different, 2 if trouble.
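
The exit status described above makes diff convenient to use in scripts. The following is a minimal sketch, assuming GNU diff; the file names "original.txt", "edited.txt" and "no_such_file" are hypothetical. It also shows how the "-i" and "-b" options change the result:

```shell
# Work in a scratch directory so no existing files are touched.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two files that differ only in letter case and amount of white space.
printf 'Hello World\n' > original.txt
printf 'hello   world\n' > edited.txt

# Strict comparison: the files differ, so diff exits with status 1.
strict_status=0
diff -q original.txt edited.txt > /dev/null 2>&1 || strict_status=$?

# -i ignores case and -b ignores changes in the amount of white space,
# so the files now count as the same and diff exits with status 0.
relaxed_status=0
diff -q -i -b original.txt edited.txt > /dev/null 2>&1 || relaxed_status=$?

# A missing file counts as trouble, so diff exits with status 2.
trouble_status=0
diff -q original.txt no_such_file > /dev/null 2>&1 || trouble_status=$?

echo "strict comparison:  $strict_status"
echo "relaxed comparison: $relaxed_status"
echo "missing file:       $trouble_status"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

Capturing the status with "|| variable=$?" rather than reading "$?" on a separate line keeps the script usable under "set -e", where a non-zero exit from diff would otherwise stop it.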