Linux comm Command


Introduction

Linux is an open-source operating system that provides a wide range of powerful and flexible tools for managing and manipulating files and data. One of these tools is the comm command, which compares two sorted files line by line and reports which lines are unique to each file and which are common to both. In this article, we will discuss the comm command, its syntax, and examples.

Syntax of comm Command

The syntax of the comm command is as follows −

comm [OPTION]... FILE1 FILE2

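Here, FILE1 and FILE2 are the two files to be compared. Either one (but not both) may be given as - to read from standard input. The options available in GNU comm, the version shipped with most Linux distributions, are −

  • -1 − Suppress column 1 (lines unique to FILE1)

  • -2 − Suppress column 2 (lines unique to FILE2)

  • -3 − Suppress column 3 (lines common to both files)

  • -z − Use a zero byte (NUL) rather than newline as the line delimiter

  • --check-order − Check that input is correctly sorted, even if all input lines are pairable

  • --nocheck-order − Do not check that input is correctly sorted

  • --output-delimiter=STR − Separate columns with STR instead of a tab

Note that GNU comm has no -i or -u option. An -i option for case-insensitive comparison exists in the BSD and macOS versions of comm, but not in the GNU version.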

Comparing Two Sorted Files Using comm Command

The comm command compares two sorted files. If the files are not sorted, the output may be incorrect. The command compares the two files line by line and displays the output in three columns separated by tabs. The first column displays lines that are unique to the first file, the second column displays lines that are unique to the second file, and the third column displays lines that are common to both files.
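Because comm relies on sorted input, it can help to verify the order before comparing. The following is a minimal sketch that uses sort -c, which checks whether a file is already sorted without printing the sorted data. The file names, the is_sorted helper, and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample files, created in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' apple banana grape > "$workdir/sorted.txt"
printf '%s\n' grape apple banana > "$workdir/unsorted.txt"

# sort -c exits with status 0 when the file is already in order
# and with a non-zero status otherwise (the diagnostic is discarded here).
is_sorted() {
    sort -c "$1" 2>/dev/null
}

if is_sorted "$workdir/sorted.txt"; then
    sorted_status="in order"
else
    sorted_status="not in order"
fi

if is_sorted "$workdir/unsorted.txt"; then
    unsorted_status="in order"
else
    unsorted_status="not in order"
fi

echo "sorted.txt:   $sorted_status"
echo "unsorted.txt: $unsorted_status"

rm -rf "$workdir"
```

The check is only meaningful when sort and comm run under the same locale settings, since both order lines according to the current collation rules.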

Example

Let's assume we have two sorted files: file1.txt and file2.txt. The content of file1.txt is −

apple
banana
grape
mango
orange

The content of file2.txt is −

apple
banana
cherry
mango
watermelon

To compare these two files, we can use the following command −

$ comm file1.txt file2.txt

The output of this command will be −

                apple
                banana
        cherry
grape
                mango
orange
        watermelon

In the output, lines without indentation (grape and orange) are unique to file1.txt and form the first column. Lines indented by one tab (cherry and watermelon) are unique to file2.txt and form the second column. Lines indented by two tabs (apple, banana, and mango) are common to both files and form the third column.
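The suppression options can be combined to keep a single column. As a small sketch, the script below recreates the two sample files in a temporary directory (the directory and the variable names are illustrative) and uses -12 to hide columns 1 and 2, so that only the lines common to both files remain.

```shell
#!/bin/sh
# Recreate the two sample files from the example in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' apple banana grape mango orange > "$workdir/file1.txt"
printf '%s\n' apple banana cherry mango watermelon > "$workdir/file2.txt"

# -1 and -2 hide the first two columns, so only column 3 (common lines) is left.
# With the other columns suppressed, the common lines are printed without indentation.
common=$(comm -12 "$workdir/file1.txt" "$workdir/file2.txt")
echo "$common"

rm -rf "$workdir"
```

For these files the script prints apple, banana, and mango, one per line.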

Comparing Two Unsorted Files Using comm Command

If the files are not sorted, the output of the comm command may be incorrect. In such cases, we can use the sort command to sort the files before comparing them. We can use the following command to sort a file −

$ sort FILENAME > SORTED_FILENAME

Here, FILENAME is the name of the file that needs to be sorted, and SORTED_FILENAME is the name of the sorted file.

Example

Let's assume we have two unsorted files: file1.txt and file2.txt. The content of file1.txt is −

grape
apple
orange
banana
mango

The content of file2.txt is −

mango
watermelon
cherry
apple
banana

To compare these two files, we can sort the files first and then use the comm command to compare them. We can use the following commands to sort the files −

$ sort file1.txt > sorted_file1.txt
$ sort file2.txt > sorted_file2.txt

Now, we can use the comm command to compare the sorted files −

$ comm sorted_file1.txt sorted_file2.txt

The output of this command will be −

                apple
                banana
        cherry
grape
                mango
orange
        watermelon

As we can see, the output is the same as in the previous example, where the files were already sorted.
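The intermediate sorted files are not strictly necessary. comm accepts - in place of one file name to read that input from standard input, so one of the files can be sorted on the fly in a pipeline. A minimal sketch, with the temporary directory and variable names chosen for illustration:

```shell
#!/bin/sh
# Recreate the two unsorted sample files in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' grape apple orange banana mango > "$workdir/file1.txt"
printf '%s\n' mango watermelon cherry apple banana > "$workdir/file2.txt"

# Sort the second file once and keep the result on disk.
sort "$workdir/file2.txt" > "$workdir/sorted_file2.txt"

# Sort the first file in a pipeline; "-" tells comm to read it from stdin.
result=$(sort "$workdir/file1.txt" | comm - "$workdir/sorted_file2.txt")
echo "$result"

rm -rf "$workdir"
```

Only one of the two inputs can come from standard input this way, so the other file still has to be sorted beforehand.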

Ignoring Case Distinctions While Comparing Files

The comm command treats uppercase and lowercase letters as different characters, so Apple and apple do not count as the same line. However, sometimes we may want to compare files without considering the case of characters. GNU comm on Linux has no option for this (the -i option exists only in the BSD and macOS versions of comm), so we convert both files to a single case and sort them before comparing.

Example

Let's assume we have two files: file1.txt and file2.txt. The content of file1.txt is −

Apple
banana
Grape
Mango
orange

The content of file2.txt is −

apple
banana
cherry
mango
watermelon

To compare these two files without considering the case of characters, we can first convert them to lowercase with tr and sort the result −

$ tr '[:upper:]' '[:lower:]' < file1.txt | sort > lower_file1.txt
$ tr '[:upper:]' '[:lower:]' < file2.txt | sort > lower_file2.txt

Now, we can compare the lowercase files −

$ comm lower_file1.txt lower_file2.txt

The output of this command will be −

                apple
                banana
        cherry
grape
                mango
orange
        watermelon

As we can see, apple and mango now appear in the third column as common lines even though their case differs in the original files.

Printing Only Unique Lines From Two Files

Sometimes, we may want to print only the lines that are unique to each file and not the lines that are common to both files. The comm command has no -u option for this. Instead, we can use the -3 option, which suppresses the third column and leaves only the lines unique to FILE1 and the lines unique to FILE2.

Example

Let's assume we have two files: file1.txt and file2.txt. The content of file1.txt is −

apple
banana
grape
mango
orange

The content of file2.txt is −

apple
banana
cherry
mango
watermelon

To print only the lines that are unique to these two files, we can use the following command −

$ comm -3 file1.txt file2.txt

The output of this command will be −

        cherry
grape
orange
        watermelon

As we can see, the output no longer contains the common lines. Lines unique to file1.txt (grape and orange) appear in the first column, and lines unique to file2.txt (cherry and watermelon) are indented by one tab in the second column.
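When the lines unique to just one of the files are needed, the suppression options can be paired: -23 keeps only column 1 and -13 keeps only column 2. The sketch below recreates the sample files in a temporary directory; the directory and variable names are illustrative.

```shell
#!/bin/sh
# Recreate the two sample files in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' apple banana grape mango orange > "$workdir/file1.txt"
printf '%s\n' apple banana cherry mango watermelon > "$workdir/file2.txt"

# -23 hides columns 2 and 3: lines that appear only in file1.txt.
only_in_file1=$(comm -23 "$workdir/file1.txt" "$workdir/file2.txt")

# -13 hides columns 1 and 3: lines that appear only in file2.txt.
only_in_file2=$(comm -13 "$workdir/file1.txt" "$workdir/file2.txt")

echo "Only in file1.txt:"
echo "$only_in_file1"
echo "Only in file2.txt:"
echo "$only_in_file2"

rm -rf "$workdir"
```

Because only one column is printed in each call, the lines come out without any leading tabs, which makes the result easy to pass on to other commands.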

Using zero-byte as a Line Separator

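By default, the comm command treats a newline character as the line delimiter. However, sometimes the records themselves may contain newlines, for example file names produced by find -print0. In such cases, we can use the -z option with the comm command. This option tells comm to use a zero byte (NUL) as the line delimiter for both its input and its output, so the input files must themselves be NUL-delimited.

Example

Let's assume we have two files: file1.txt and file2.txt. The content of file1.txt is −

apple
banana
grape
mango
orange

The content of file2.txt is −

apple
banana
cherry
mango
watermelon

Since these files are newline-delimited, we first convert them to NUL-delimited files with tr −

$ tr '\n' '\0' < file1.txt > file1.nul
$ tr '\n' '\0' < file2.txt > file2.nul

To compare them using a zero byte as the line delimiter, and convert the output back to newlines so that it is readable on the terminal, we can use the following command −

$ comm -z file1.nul file2.nul | tr '\0' '\n'

The output of this command will be −

                apple
                banana
        cherry
grape
                mango
orange
        watermelon

As we can see, the result is the same as in the first example. Without the final tr, each output line would end with a zero byte instead of a newline character.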
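A typical use of NUL-delimited records is comparing lists of file names, which may legally contain newlines. The following sketch assumes the GNU versions of find, sort, and comm (for -print0 and -z); the directory layout and file names are invented for illustration.

```shell
#!/bin/sh
# Build two small hypothetical directory trees to compare.
workdir=$(mktemp -d)
mkdir "$workdir/dir_a" "$workdir/dir_b"
touch "$workdir/dir_a/one.txt" "$workdir/dir_a/two.txt"
touch "$workdir/dir_b/two.txt" "$workdir/dir_b/three.txt"

# List each directory with NUL-terminated names and sort the records with -z.
(cd "$workdir/dir_a" && find . -type f -print0 | sort -z) > "$workdir/a.list"
(cd "$workdir/dir_b" && find . -type f -print0 | sort -z) > "$workdir/b.list"

# comm -z reads and writes NUL-terminated records; -12 keeps the common names.
# tr converts the NUL bytes to newlines so the result can be displayed.
in_both=$(comm -z -12 "$workdir/a.list" "$workdir/b.list" | tr '\0' '\n')
echo "$in_both"

rm -rf "$workdir"
```

Here the only name present in both directories is ./two.txt. In a real pipeline the final tr would usually be replaced by a consumer that understands NUL-delimited input, such as xargs -0.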

Conclusion

The comm command is a powerful tool that allows us to compare two sorted files line by line. We can use its options to customize the output according to our requirements. The examples discussed in this article demonstrate various ways in which we can use the comm command. With knowledge of the comm command and its options, we can efficiently compare files and find the differences between them.

Updated on: 23-Mar-2023
