How to Compare Two Text Files in the Linux Terminal

  Illustration of a Terminal Window on Linux
Fatmawati Achmad Zaenuri / Shutterstock.com

Need to see the differences between two revisions of a text file? Then diff is the command you need. This tutorial shows you how to use diff on Linux and macOS, the easy way.

Diving into diff

The diff command compares two files and produces a list of the differences between them. To be more precise, it produces a list of the changes that would have to be made to the first file to make it match the second file. Keep this in mind and you will find the output of diff easier to understand. The diff command was designed to find differences between source code files and to produce output that can be read and acted on by other programs, such as the patch command. This tutorial covers the most useful human-friendly ways to use diff.

Let's dive right in and analyze two files. The order of the files on the command line determines which file diff considers the "first file" and which the "second file". In the example below, alpha1 is the first file and alpha2 is the second file. Both files contain the phonetic alphabet, but the second file, alpha2, has been edited further, so the two files are not identical.

We can compare the files with this command. Type diff, a space, the name of the first file, a space, the name of the second file, and then press Enter.

  diff alpha1 alpha2 

  Output of the command diff without options

How do we read this output? Once you know what to look for, it's not so bad. Each difference is listed in turn in a single column, and each difference is labeled. The label has numbers on either side of a letter, such as 4c4. The first number is the line number in alpha1 and the second number is the line number in alpha2. The letter in the middle can be:

  • c : The line in the first file must be changed to match the line in the second file.
  • d : The line in the first file must be deleted to match the second file.
  • a : Additional content must be added to the first file to match the second file.

The 4c4 in our example tells us that line four of alpha1 must be changed to match line four of alpha2. This is the first difference between the two files that diff found.

Lines beginning with < refer to the first file, in our example alpha1, and lines beginning with > refer to the second file, alpha2. The line < Delta tells us that the word Delta is the content of line four in alpha1. The line > Dave tells us that the word Dave is the content of line four in alpha2. In short, Delta on line four of alpha1 must be replaced with Dave for that line to match in both files.

The next change is indicated by 12c12. Following the same logic, we learn that line 12 in alpha1 contains the word Lima, but line 12 of alpha2 contains the word Linux.

The third change refers to a line that has been deleted from alpha2. The label 21d20 is deciphered as "line 21 must be deleted from the first file so that both files are back in sync from line 20 onward." The line < Uniform shows us the content of the line that has to be deleted from alpha1.

The fourth difference is labeled 26a26,28. This change refers to three extra lines that were added to alpha2. Note the 26,28 in the label. Two line numbers separated by a comma represent a range of line numbers. In this example, the range runs from line 26 to line 28. The label is read as "after line 26 in the first file, add lines 26 to 28 from the second file." The three lines from alpha2 that must be added to alpha1 are shown. They contain the words Quirk, Strange, and Charm.
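
To try the three label types on something small, here is a minimal sketch. It does not use the article's alpha1 and alpha2 files; old.txt and new.txt are made-up five-line files created in a temporary directory, chosen so that one line is deleted, one is changed, and one is added.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' Alpha Bravo Charlie Delta Echo  > "$workdir/old.txt"
printf '%s\n' Alpha Charlie Dave Echo Foxtrot > "$workdir/new.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going if it is run under "set -e".
output=$(diff "$workdir/old.txt" "$workdir/new.txt" || true)
printf '%s\n' "$output"

# Keep only the labels, which are the lines that start with a digit.
labels=$(printf '%s\n' "$output" | grep '^[0-9]')
echo "Labels found:"
printf '%s\n' "$labels"

rm -r "$workdir"
```

The labels are 2d1 (delete Bravo), 4c3 (change Delta to Dave), and 5a5 (add Foxtrot after line five). Note how the second number drifts away from the first once lines have been deleted or added.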

Snappy One-Liner

If you just want to know if two files are the same, use the -s (report identical files) option.

  diff -s alpha1 alpha3 

  Output of the command diff with option -s

You can use the -q (brief) option to get an equally concise statement about two files that are different.

  diff -q alpha1 alpha2 

  Output of the command diff with option -q

Note that with two identical files, the -q (brief) option stays completely silent and reports nothing.
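
In scripts, the exit status is often more useful than the message. The article does not cover this, but diff returns 0 when no differences were found, 1 when differences were found, and a higher value on error. The sketch below uses made-up file names in a temporary directory.

```shell
workdir=$(mktemp -d)
printf '%s\n' Alpha Bravo > "$workdir/a.txt"
cp "$workdir/a.txt" "$workdir/same.txt"
printf '%s\n' Alpha Dave  > "$workdir/b.txt"

# Identical files: -q prints nothing and the status stays 0.
same_status=0
diff -q "$workdir/a.txt" "$workdir/same.txt" || same_status=$?

# Different files: -q prints a one-line message and the status is 1.
diff_status=0
diff -q "$workdir/a.txt" "$workdir/b.txt" || diff_status=$?

echo "identical pair: $same_status, different pair: $diff_status"

# The status lets diff act directly as a condition.
if diff -q "$workdir/a.txt" "$workdir/b.txt" > /dev/null; then
    verdict="files match"
else
    verdict="files differ"
fi
echo "$verdict"

rm -r "$workdir"
```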

An alternate view

The -y (side by side) option uses a different layout to describe the file differences. It is often convenient to combine the side-by-side view with the -W (width) option to limit the number of columns displayed. This avoids ugly wrapped lines that make the output hard to read. Here we have told diff to produce a side-by-side display and to limit the output to 70 columns.

  diff -yW 70 alpha1 alpha2 

  Output diff command side by side display

The first file on the command line, alpha1, appears on the left and the second file on the command line, alpha2, appears on the right. The lines of each file are displayed side by side. Indicator characters appear next to the lines in alpha2 that have been changed, deleted, or added.

  • | : A line that has been changed in the second file.
  • < : A line that has been deleted from the second file.
  • > : A line added to the second file and not included in the first file.

If you prefer a more compact side-by-side summary of the file differences, use the --suppress-common-lines option. This forces diff to list only the changed, added, or deleted lines.

  diff -y -W 70 --suppress-common-lines alpha1 alpha2 

  Output of the diff command with the --suppress-common-lines option
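
As a rough illustration of how much --suppress-common-lines trims, the sketch below compares the two side-by-side layouts on a pair of made-up five-line files (not the article's alpha1 and alpha2). It assumes GNU diff, since -y, -W, and --suppress-common-lines are not available in every diff implementation.

```shell
workdir=$(mktemp -d)
printf '%s\n' Alpha Bravo Charlie Delta Echo  > "$workdir/old.txt"
printf '%s\n' Alpha Charlie Dave Echo Foxtrot > "$workdir/new.txt"

# Full side-by-side view: every line of both files is shown.
full=$(diff -y -W 40 "$workdir/old.txt" "$workdir/new.txt" || true)

# Compact view: only lines marked with <, | or > are shown.
compact=$(diff -y -W 40 --suppress-common-lines \
    "$workdir/old.txt" "$workdir/new.txt" || true)

printf '%s\n' "$full"
echo "----"
printf '%s\n' "$compact"

# grep -c '' counts the lines in each listing.
full_lines=$(printf '%s\n' "$full" | grep -c '')
compact_lines=$(printf '%s\n' "$compact" | grep -c '')
echo "full: $full_lines lines, compact: $compact_lines lines"

rm -r "$workdir"
```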

Adding a splash of color

Another utility, called colordiff, adds color highlighting to the diff output. This makes it much easier to see which lines have differences.

Use apt-get to install this package on your system if you are using Ubuntu or another Debian-based distribution. On other Linux distributions, use your distribution's package management tool instead.

  sudo apt-get install colordiff 

Use colordiff just as you would use diff.

 Output of the colordiff command without options

In fact, colordiff is a wrapper for diff, and diff does all the work behind the scenes. For this reason, all of the diff options work with colordiff.
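
Because the options are identical, a script can fall back to plain diff on machines where colordiff is not installed. This is one possible sketch. The function name show_diff is made up for illustration and is not part of either tool.

```shell
# show_diff is a hypothetical helper: it runs colordiff when it is
# installed and plain diff otherwise, passing every argument through.
show_diff() {
    if command -v colordiff > /dev/null 2>&1; then
        colordiff "$@"
    else
        diff "$@"
    fi
}

workdir=$(mktemp -d)
printf '%s\n' Alpha Bravo > "$workdir/a.txt"
printf '%s\n' Alpha Dave  > "$workdir/b.txt"

# Any diff option can be used; the files differ, so ignore status 1.
show_diff -u "$workdir/a.txt" "$workdir/b.txt" || true

# Comparing a file with itself produces no output at all.
identical_output=$(show_diff "$workdir/a.txt" "$workdir/a.txt")

rm -r "$workdir"
```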

 Output of the colordiff command with the --suppress-common-lines option

Providing context

To find a middle ground between showing every line of both files on the screen and listing only the changed lines, you can ask diff to provide some context. There are two ways to do this. Both have the same purpose, which is to show a few lines before and after each changed line. That lets you see what is going on in the file at the place where the difference was detected.

The first method uses the option -c (copied context).

  colordiff -c alpha1 alpha2 

 Output of the colordiff with the option -c

The diff output has a header. The header lists the two file names and their modification times. There are asterisks (*) before the name of the first file and dashes (-) before the name of the second file. Asterisks and dashes are used to indicate which file the lines in the output belong to.

The line of asterisks with 1,7 in the middle indicates that we are looking at lines from alpha1. To be specific, we are looking at lines one through seven. The word Delta is flagged as changed. It has an exclamation mark (!) next to it and is shown in red. Three lines of unchanged text are displayed before and after that line so that we can see its context in the file.

The line of dashes with 1,7 in the middle indicates that we are now looking at lines from alpha2. Again, we are looking at lines one through seven, with the word Dave on line four flagged as different.

Three lines of context above and below each change is the default. You can specify how many lines of context you want diff to provide. Use the -C (copied context) option with a capital "C" and provide the number of lines you want:

  colordiff -C 2 alpha1 alpha2 

 Output of colordiff with the -C 2 option
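
To see how the range markers follow the -C value, here is a small sketch using made-up files: the numbers 1 to 10, with line 5 changed in the second file. Plain diff is used so that the example runs without colordiff installed.

```shell
workdir=$(mktemp -d)
seq 1 10 > "$workdir/old.txt"
sed 's/^5$/five/' "$workdir/old.txt" > "$workdir/new.txt"

# One line of context on each side of the change.
context=$(diff -C 1 "$workdir/old.txt" "$workdir/new.txt" || true)
printf '%s\n' "$context"

# Pick out the two range markers: asterisks for the first file,
# dashes for the second. The header lines are skipped because the
# file names from mktemp start with "/", not with a digit.
ranges=$(printf '%s\n' "$context" | grep -E '^(\*\*\*|---) [0-9]')
echo "Range markers:"
printf '%s\n' "$ranges"

rm -r "$workdir"
```

With one line of context around line 5, both markers cover lines 4 to 6, and the changed line is flagged with an exclamation mark in each half.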

The second option that provides context is the option -u (unified context).

  colordiff -u alpha1 alpha2 

 Output of colordiff with option -u

As before, the output starts with a header. The two files are named and their modification times are shown. There are dashes (-) before the name of alpha1 and plus signs (+) before the name of alpha2. This tells us that dashes are used for alpha1 and plus signs for alpha2. Scattered throughout the listing are lines that begin with at signs (@). These lines mark the start of each difference. They also tell us which lines from each file are being shown.

We are shown the three lines before and after the line flagged as different so that we can see the context of the changed line. In the unified view, the lines with the difference are shown one above the other. The line from alpha1 is preceded by a dash, and the line from alpha2 is preceded by a plus sign. This display achieves in eight lines what the copied context display above took fifteen to do.

As you would expect, we can ask diff to provide exactly the number of lines of unified context we want. Use the -U (unified context) option with a capital "U" and provide the number of lines you want:

  colordiff -U 2 alpha1 alpha2 

 Output of colordiff with the -U 2 option
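
The at-sign lines are easier to read once you have seen one for a known change. This sketch uses made-up files (the numbers 1 to 10, with line 5 changed) and plain diff, so it runs without colordiff installed.

```shell
workdir=$(mktemp -d)
seq 1 10 > "$workdir/old.txt"
sed 's/^5$/five/' "$workdir/old.txt" > "$workdir/new.txt"

# One line of unified context on each side of the change.
unified=$(diff -U 1 "$workdir/old.txt" "$workdir/new.txt" || true)
printf '%s\n' "$unified"

# The line starting with @@ says which lines of each file follow.
hunk_header=$(printf '%s\n' "$unified" | grep '^@@')
echo "Hunk header: $hunk_header"

rm -r "$workdir"
```

The header reads @@ -4,3 +4,3 @@: three lines starting at line 4 of the first file, and three lines starting at line 4 of the second.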

Ignoring spaces and case

Let's analyze two more files, test4 and test5. These have the names of six superheroes.

  colordiff -y -W 70 test4 test5 

 Output of colordiff in files test4 and test5

The results show that diff finds nothing different in the Black Widow, Spider-Man, and Thor lines. It does flag changes in the Captain America, Ironman, and The Hulk lines.

So what is different? Well, in test5, Hulk is spelled with a lowercase "h", and Captain America has an extra space between "Captain" and "America." Okay, that's plain to see, but what's wrong with the Ironman line? There are no visible differences. Here's a good rule of thumb: if you can't see it, the answer is white space. There are almost certainly one or two stray spaces or a tab character at the end of that line.
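
If you want to confirm the suspicion before telling diff to ignore anything, grep can list lines that end in white space. This is a sketch on a made-up file, heroes.txt, that imitates the problem lines described above. It is not the article's actual test5 file.

```shell
workdir=$(mktemp -d)
# The second line ends with a single trailing space.
printf '%s\n' 'Captain  America' 'Ironman ' 'The hulk' > "$workdir/heroes.txt"

# -n prefixes each match with its line number. "|| true" covers the
# case where no line has trailing white space (grep then exits 1).
trailing=$(grep -n '[[:space:]]$' "$workdir/heroes.txt" || true)

# sed's "l" command marks the end of each line with "$", which makes
# the trailing space visible.
printf '%s\n' "$trailing" | sed -n 'l'

rm -r "$workdir"
```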

If these differences do not matter to you, you can tell diff to ignore particular types of line difference, including:

  • -i : Ignore differences in case.
  • -Z : Ignore trailing white space.
  • -b : Ignore changes in the amount of white space.
  • -w : Ignore all white space changes.

Let's ask diff to check these two files again, but this time to ignore any differences in case.

  colordiff -i -y -W 70 test4 test5 

 Output of colordiff ignoring case

The lines with "The Hulk" and "The hulk" are now considered a match, and no difference is flagged for the lowercase "h". Let's ask diff to ignore trailing white space as well.

  colordiff -i -Z -y -W 70 test4 test5 

 Output of colordiff ignoring trailing white space

As suspected, trailing white space must have been the difference on the Ironman line, because diff no longer flags a difference for it. That leaves Captain America. Let's ask diff to ignore case and to ignore all white space issues.

  colordiff -i -w -y -W70 test4 test5 

 Output from colordiff Ignore All Spaces

By telling diff to ignore the differences we are not concerned about, diff tells us that, for our purposes, the files match.
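
The same progression can be checked in a script through the exit status, where 0 means no differences were found and 1 means differences were found. This sketch uses two made-up three-line files that imitate the differences described above, and it assumes GNU diff because -Z is not available in every diff implementation. The helper name check is made up for illustration.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Captain America'  'Ironman'  'The Hulk' > "$workdir/a.txt"
printf '%s\n' 'Captain  America' 'Ironman ' 'The hulk' > "$workdir/b.txt"

# check (hypothetical helper) runs diff quietly with the given options
# and prints the exit status.
check() {
    status=0
    diff "$@" "$workdir/a.txt" "$workdir/b.txt" > /dev/null || status=$?
    echo "$status"
}

plain=$(check)                  # all three lines differ
case_only=$(check -i)           # the Hulk line now matches
case_and_trailing=$(check -i -Z) # the Ironman line now matches too
case_and_all_space=$(check -i -w) # the Captain America line matches

echo "no options: $plain"
echo "-i:         $case_only"
echo "-i -Z:      $case_and_trailing"
echo "-i -w:      $case_and_all_space"

rm -r "$workdir"
```

Only the last combination reports status 0, because the extra space inside "Captain  America" is neither a case difference nor trailing white space.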

The diff command has many more options, but most of them relate to producing machine-readable output. They are described in the Linux man page. The options we used in the examples above will let you track down all the differences between versions of your text files, using the command line and human eyeballs.
