Linux comm Command

The comm command compares two sorted files line by line. It prints the result in three columns: lines unique to the first file, lines unique to the second file, and lines common to both files. This makes it useful for comparing datasets and for finding entries that are missing from, or shared between, two lists.

Syntax

comm [OPTION]... FILE1 FILE2

Where FILE1 and FILE2 are the two sorted files to be compared.

Common Options

  • -1 Suppress column 1 (lines unique to FILE1)

  • -2 Suppress column 2 (lines unique to FILE2)

  • -3 Suppress column 3 (lines common to both files)

  • -i Ignore case distinctions in comparisons (BSD and macOS comm only; GNU coreutils comm, the usual implementation on Linux, has no -i option)

  • --check-order Verify that input files are correctly sorted
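As a rough illustration of --check-order, the sketch below feeds comm a deliberately unsorted file and records whether the command succeeds. The scratch directory and the file names are made up for this example, and the behavior shown assumes GNU coreutils comm.

```shell
# Hypothetical scratch files; the names are illustrative only.
tmpdir=$(mktemp -d)
printf 'banana\napple\n' > "$tmpdir/unsorted.txt"   # deliberately out of order
printf 'apple\nbanana\n' > "$tmpdir/sorted.txt"

# With --check-order, GNU comm stops with an error on unsorted input
# instead of carrying on with a possibly misleading comparison.
if comm --check-order "$tmpdir/unsorted.txt" "$tmpdir/sorted.txt" >/dev/null 2>&1; then
    check_status="accepted"
else
    check_status="rejected"
fi
echo "unsorted input was $check_status"

rm -rf "$tmpdir"
```

Checking the exit status this way can be handy in scripts that should stop early when their input is not sorted.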

How It Works

The comm command produces output in three tab-separated columns. Entries in column 2 are preceded by one tab and entries in column 3 by two tabs:

Column 1                Column 2                Column 3
Lines unique to FILE1   Lines unique to FILE2   Lines common to both files

Example: Basic Comparison

Consider two sorted files with the following content:

file1.txt:

apple
banana
grape
mango
orange

file2.txt:

apple
banana
cherry
mango
watermelon

Comparing these files:

comm file1.txt file2.txt

Output:

		apple
		banana
	cherry
grape
		mango
orange
	watermelon

In this output, grape and orange are unique to file1.txt, cherry and watermelon are unique to file2.txt, while apple, banana, and mango are common to both.
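Because the columns are separated only by tabs, they can be hard to tell apart on screen. One way to make them visible, sketched here under the assumption that GNU comm is installed (--output-delimiter is a GNU extension), is to replace the tab with a printable character. The scratch directory is hypothetical; the file contents match the example above.

```shell
# Recreate the two example files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ngrape\nmango\norange\n' > "$tmpdir/file1.txt"
printf 'apple\nbanana\ncherry\nmango\nwatermelon\n' > "$tmpdir/file2.txt"

# Use '|' instead of a tab as the column separator (GNU extension).
# No prefix = column 1, one '|' = column 2, two '|' = column 3.
marked=$(comm --output-delimiter='|' "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$marked"

rm -rf "$tmpdir"
```

With this delimiter, a line such as ||apple belongs to column 3 and |cherry to column 2.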

Suppressing Columns

You can suppress specific columns to focus on particular comparisons:

# Show only lines unique to file1
comm -23 file1.txt file2.txt
grape
orange

# Show only common lines
comm -12 file1.txt file2.txt
apple
banana
mango
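The remaining combination, -13, works the same way and leaves only the lines unique to the second file. The sketch below shows it, and also counts the common lines by piping the output to wc. The scratch directory and the variable names are assumptions made for this example.

```shell
# Recreate the two example files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ngrape\nmango\norange\n' > "$tmpdir/file1.txt"
printf 'apple\nbanana\ncherry\nmango\nwatermelon\n' > "$tmpdir/file2.txt"

# Suppress columns 1 and 3: only lines unique to file2 remain.
only_in_file2=$(comm -13 "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$only_in_file2"

# Count the common lines; tr strips padding that some wc versions add.
common_count=$(comm -12 "$tmpdir/file1.txt" "$tmpdir/file2.txt" | wc -l | tr -d ' ')
echo "common lines: $common_count"

rm -rf "$tmpdir"
```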

Handling Unsorted Files

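Important: comm expects both files to be sorted in the collation order of the current locale. On unsorted input it may print an error or report misleading results. For unsorted files, use sort first: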

# Sort files before comparison
sort file1.txt > sorted_file1.txt
sort file2.txt > sorted_file2.txt
comm sorted_file1.txt sorted_file2.txt

# Or use process substitution (works in bash, zsh, and ksh, but not in plain POSIX sh)
comm <(sort file1.txt) <(sort file2.txt)
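Sorting only helps if sort and comm agree on the ordering, and that ordering depends on the locale. A minimal sketch of one way to keep the two consistent is to pin both commands to the C locale, which orders lines by byte value. The file names and contents below are invented for illustration.

```shell
# Hypothetical unsorted input with mixed case.
tmpdir=$(mktemp -d)
printf 'banana\nApple\ncherry\n' > "$tmpdir/a.txt"
printf 'cherry\nApple\n' > "$tmpdir/b.txt"

# Run sort and comm under the same locale so both use the same ordering.
LC_ALL=C sort "$tmpdir/a.txt" > "$tmpdir/a.sorted"
LC_ALL=C sort "$tmpdir/b.txt" > "$tmpdir/b.sorted"

# Lines present in both files.
common_lines=$(LC_ALL=C comm -12 "$tmpdir/a.sorted" "$tmpdir/b.sorted")
printf '%s\n' "$common_lines"

rm -rf "$tmpdir"
```

In the C locale uppercase letters sort before lowercase ones, so "Apple" comes before "banana" in both sorted files.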

Case-Insensitive Comparison

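GNU comm, the implementation shipped with most Linux distributions, has no option for ignoring case; the -i option exists only in the BSD and macOS versions. On Linux, convert both files to a single case before sorting and comparing:

comm <(tr '[:upper:]' '[:lower:]' < file1.txt | sort) <(tr '[:upper:]' '[:lower:]' < file2.txt | sort)

This treats "Apple" and "apple" as identical lines, although the output shows the lowercased text rather than the original spelling.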

Practical Use Cases

  • Finding unique entries: Identify items present in one file but not the other

  • Data validation: Verify completeness of datasets by finding missing records

  • Set operations: Perform union, intersection, and difference operations on sorted lists

  • Log analysis: Compare log files to identify changes or differences
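The data validation and set operation cases above can be sketched as follows. The record IDs, file names, and variable names are hypothetical. Note that the union here is produced with sort -u rather than with comm itself.

```shell
# Hypothetical ID lists, already sorted.
tmpdir=$(mktemp -d)
printf '1001\n1002\n1003\n1004\n' > "$tmpdir/expected.txt"
printf '1002\n1004\n1005\n' > "$tmpdir/received.txt"

# Difference: expected records that never arrived.
missing=$(comm -23 "$tmpdir/expected.txt" "$tmpdir/received.txt")

# Difference in the other direction: records nobody expected.
unexpected=$(comm -13 "$tmpdir/expected.txt" "$tmpdir/received.txt")

# Intersection: records present in both lists.
matched=$(comm -12 "$tmpdir/expected.txt" "$tmpdir/received.txt")

# Union: every distinct record, using sort -u instead of comm.
all_ids=$(sort -u "$tmpdir/expected.txt" "$tmpdir/received.txt")

printf 'missing:\n%s\n' "$missing"
printf 'unexpected:\n%s\n' "$unexpected"

rm -rf "$tmpdir"
```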

Conclusion

The comm command is an efficient tool for comparing sorted files and identifying unique or common lines. Once you understand its three-column output and the column-suppression options, you can use it for set-style comparisons and data checks in Linux environments.

Updated on: 2026-03-17T09:01:38+05:30
