The uniq Command in Linux

The uniq command in Linux is a text processing utility that filters duplicate lines from text files. It works by comparing adjacent lines and removing consecutive duplicates, which makes it an essential tool for data cleaning and text manipulation tasks, and is why it is typically run on sorted input.

Syntax

The basic syntax of the uniq command is straightforward:

uniq [options] [input_file] [output_file]

Where options are command-line switches that modify the behavior, input_file is the file to process (defaults to stdin), and output_file is where results are written (defaults to stdout).
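The three invocation forms can be sketched with a small sample file (demo.txt and result.txt are hypothetical names used only for illustration):

```shell
# Create a small sample file (demo.txt is a made-up name)
printf 'a\na\nb\n' > demo.txt

uniq demo.txt                # file -> stdout: prints "a" then "b"
uniq demo.txt result.txt     # file -> file: result.txt holds the deduplicated lines
printf 'a\na\nb\n' | uniq    # stdin -> stdout, same result
```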

Important Note

Critical: The uniq command only removes adjacent duplicate lines. For unsorted data, you typically need to sort first:

sort file.txt | uniq
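A quick sketch of why sorting matters (unsorted.txt is a fabricated sample):

```shell
# Non-adjacent duplicates: "apple" appears first and last
printf 'apple\nbanana\napple\n' > unsorted.txt

# uniq alone keeps all three lines, because the two "apple"s are not adjacent
uniq unsorted.txt

# Sorting groups the duplicates together, so uniq can drop one
sort unsorted.txt | uniq     # prints: apple, banana
```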

Options

Option   Description                            Example / Effect
-c       Prefix lines with occurrence count     3 apple
-d       Show only duplicate lines              Each repeated line shown once
-u       Show only unique lines                 Lines that appear exactly once
-i       Ignore case when comparing             Treats "Apple" and "apple" as the same
-f N     Skip the first N fields                Comparison starts at field N+1
-s N     Skip the first N characters            Comparison starts at character N+1
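The -f and -s options are easiest to see with timestamped lines; log.txt below is a hypothetical sample whose first field is a timestamp:

```shell
printf '09:01 error disk full\n09:02 error disk full\n' > log.txt

# Skip the first field (the timestamp) before comparing lines
uniq -f 1 log.txt      # prints only: 09:01 error disk full

# Equivalent here: skip the first 6 characters ("09:01 ")
uniq -s 6 log.txt
```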

Examples

Example 1: Basic Duplicate Removal

Given a sorted file fruits.txt:

cat fruits.txt
apple
apple
banana
banana
orange

Remove duplicates:

uniq fruits.txt
apple
banana
orange

Example 2: Counting Occurrences

uniq -c fruits.txt
   2 apple
   2 banana
   1 orange

Example 3: Show Only Duplicates

uniq -d fruits.txt
apple
banana

Example 4: Show Only Unique Lines

uniq -u fruits.txt
orange

Example 5: Case-Insensitive Processing

Given a file with mixed case (sample contents chosen to match the output below):

cat mixed_case.txt
Apple
apple
APPLE
Banana
banana

uniq -i -c mixed_case.txt
   3 Apple
   2 Banana

Practical Use Cases

  • Log Analysis: Remove duplicate entries from log files for cleaner analysis

  • Data Cleaning: Eliminate duplicate records from CSV files or datasets

  • System Administration: Find unique IP addresses in access logs

  • Text Processing: Remove repeated lines from configuration files

  • Shell Scripting: Create unique lists for automated processing
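The access-log use case above can be sketched like this (access.log is a fabricated sample; real log formats vary):

```shell
printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 GET /b\n' > access.log

# Extract the IP field, sort so duplicates are adjacent, then dedupe
cut -d ' ' -f 1 access.log | sort | uniq
# 10.0.0.1
# 10.0.0.2

# Or, with counts, ranked by frequency:
cut -d ' ' -f 1 access.log | sort | uniq -c | sort -nr
```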

Common Patterns

Process Unsorted Data

sort data.txt | uniq -c | sort -nr

This sorts the data, counts duplicates, then sorts by frequency (highest first).

Find Most Common Lines

sort access.log | uniq -c | sort -nr | head -10

Skip Headers When Processing

tail -n +2 file.csv | sort | uniq

Common Errors

  • Command appears to hang: With no file argument, uniq waits for input on stdin. Provide a file or pipe data in.

  • Empty output file: Redirecting output back to the input file (uniq file.txt > file.txt) truncates the input before uniq reads it. Write to a different file.

  • "uniq: cannot open file": Check the file path and permissions.

  • Unexpected results: Sort the data first so that non-adjacent duplicates become adjacent.

Conclusion

The uniq command is essential for text processing and data analysis in Linux. It efficiently removes adjacent duplicate lines and provides options for counting, filtering, and case-insensitive processing. Remember that uniq only works on adjacent lines, so combine it with sort for complete duplicate removal from unsorted data.

Updated on: 2026-03-17T09:01:38+05:30
