The uniq Command in Linux

The uniq command in Linux is a text processing utility that filters duplicate lines from text files. It works by comparing adjacent lines and removing consecutive duplicates, which makes it an essential tool for data cleaning and text manipulation tasks, and is why it is typically run on sorted input.

Syntax

The basic syntax of the uniq command is straightforward:

uniq [options] [input_file] [output_file]

Where options are command-line switches that modify the behavior, input_file is the file to process (defaults to stdin), and output_file is where results are written (defaults to stdout).
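The three invocation forms can be sketched with a small sample file (demo.txt and result.txt are hypothetical names used only for illustration):

```shell
# Create a small sample file (demo.txt is a made-up name)
printf 'a\na\nb\n' > demo.txt

uniq demo.txt                # file -> stdout: prints "a" then "b"
uniq demo.txt result.txt     # file -> file: result.txt holds the deduplicated lines
printf 'a\na\nb\n' | uniq    # stdin -> stdout, same result
```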

Important Note

Critical: The uniq command only removes adjacent duplicate lines. For unsorted data, you typically need to sort first:

sort file.txt | uniq
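A quick sketch of why sorting matters (unsorted.txt is a fabricated sample):

```shell
# Non-adjacent duplicates: "apple" appears first and last
printf 'apple\nbanana\napple\n' > unsorted.txt

# uniq alone keeps all three lines, because the two "apple"s are not adjacent
uniq unsorted.txt

# Sorting groups the duplicates together, so uniq can drop one
sort unsorted.txt | uniq     # prints: apple, banana
```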

Options

Option   Description                            Example / Effect
-c       Prefix lines with occurrence count     3 apple
-d       Show only duplicate lines              Each repeated line shown once
-u       Show only unique lines                 Lines that appear exactly once
-i       Ignore case when comparing             Treats "Apple" and "apple" as the same
-f N     Skip the first N fields                Comparison starts at field N+1
-s N     Skip the first N characters            Comparison starts at character N+1
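The -f and -s options are easiest to see with timestamped lines; log.txt below is a hypothetical sample whose first field is a timestamp:

```shell
printf '09:01 error disk full\n09:02 error disk full\n' > log.txt

# Skip the first field (the timestamp) before comparing lines
uniq -f 1 log.txt      # prints only: 09:01 error disk full

# Equivalent here: skip the first 6 characters ("09:01 ")
uniq -s 6 log.txt
```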

Examples

Example 1: Basic Duplicate Removal

Given a sorted file fruits.txt:

cat fruits.txt
apple
apple
banana
banana
orange

Remove duplicates:

uniq fruits.txt
apple
banana
orange

Example 2: Counting Occurrences

uniq -c fruits.txt
   2 apple
   2 banana
   1 orange

Example 3: Show Only Duplicates

uniq -d fruits.txt
apple
banana

Example 4: Show Only Unique Lines

uniq -u fruits.txt
orange

Example 5: Case-Insensitive Processing

Given a file with mixed case (sample contents chosen to match the output below):

cat mixed_case.txt
Apple
apple
APPLE
Banana
banana

uniq -i -c mixed_case.txt
   3 Apple
   2 Banana

Practical Use Cases

  • Log Analysis: Remove duplicate entries from log files for cleaner analysis

  • Data Cleaning: Eliminate duplicate records from CSV files or datasets

  • System Administration: Find unique IP addresses in access logs

  • Text Processing: Remove repeated lines from configuration files

  • Shell Scripting: Create unique lists for automated processing
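The access-log use case above can be sketched like this (access.log is a fabricated sample; real log formats vary):

```shell
printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 GET /b\n' > access.log

# Extract the IP field, sort so duplicates are adjacent, then dedupe
cut -d ' ' -f 1 access.log | sort | uniq
# 10.0.0.1
# 10.0.0.2

# Or, with counts, ranked by frequency:
cut -d ' ' -f 1 access.log | sort | uniq -c | sort -nr
```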

Common Patterns

Process Unsorted Data

sort data.txt | uniq -c | sort -nr

This sorts the data, counts duplicates, then sorts by frequency (highest first).

Find Most Common Lines

sort access.log | uniq -c | sort -nr | head -10

Skip Headers When Processing

tail -n +2 file.csv | sort | uniq

Common Errors

  • Command appears to hang: With no file argument, uniq waits for input on stdin. Provide a file or pipe data in.

  • Empty output file: Redirecting output back to the input file (uniq file.txt > file.txt) truncates the input before uniq reads it. Write to a different file.

  • "uniq: cannot open file": Check the file path and permissions.

  • Unexpected results: Sort the data first so that non-adjacent duplicates become adjacent.

Conclusion

The uniq command is essential for text processing and data analysis in Linux. It efficiently removes adjacent duplicate lines and provides options for counting, filtering, and case-insensitive processing. Remember that uniq only works on adjacent lines, so combine it with sort for complete duplicate removal from unsorted data.

Updated on: 2026-03-17T09:01:38+05:30
