How to remove duplicate lines from a sorted file in Linux?
To remove duplicate lines from a sorted file in Linux, use the uniq command. uniq is a filter: it reads its input, collapses each run of adjacent identical lines into a single line, and writes the result to standard output. The input file itself is not modified. Because only adjacent lines are compared, the input normally needs to be sorted first. Ports of uniq are also available for Windows and IBM i.
Syntax
The general syntax of the uniq command is as follows −
uniq [OPTION]... [INPUT [OUTPUT]]
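Both INPUT and OUTPUT in the syntax above are optional. The following is a minimal sketch of the three calling forms. The temporary directory and the file names fruits.txt and deduped.txt are illustrative assumptions, not part of the original example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Sample input with adjacent duplicates (hypothetical file name).
printf 'apple\napple\nbanana\nbanana\ncherry\n' > "$workdir/fruits.txt"

# 1. INPUT only: the result goes to standard output.
uniq "$workdir/fruits.txt"

# 2. INPUT and OUTPUT: the result is written to the second file.
uniq "$workdir/fruits.txt" "$workdir/deduped.txt"

# 3. No file operands: uniq reads standard input, so it can end a pipeline.
from_stdin=$(uniq < "$workdir/fruits.txt")
printf '%s\n' "$from_stdin"
```

The second form is handy in scripts, while the third is the one used when uniq sits at the end of a pipeline such as sort file | uniq.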
Options
Brief description of options available in the uniq command −
| Sr.No. | Option | Description |
|---|---|---|
| 1 | -c, --count | Display how many times each line was repeated. |
| 2 | -d, --repeated | Display only repeated lines, one for each group. |
| 3 | -D | Display all duplicate lines. |
| 4 | -f, --skip-fields=N | Avoid comparing the first N fields. |
| 5 | -i, --ignore-case | Ignore differences in case while comparing. |
| 6 | -s, --skip-chars=N | Avoid comparing the first N characters. |
| 7 | -u, --unique | Print only unique lines (lines that appear exactly once). |
| 8 | -w, --check-chars=N | Compare no more than N characters in lines. |
| 9 | --help | Display help and exit. |
| 10 | --version | Output version information and exit. |
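The examples further down cover -c, -u and -d. As a rough sketch of the options that change how lines are compared, the snippet below pipes small made-up inputs into uniq. It assumes GNU coreutils: -i and -w are not part of POSIX and may be missing from other implementations.

```shell
# -i: compare case-insensitively; the first line of each group is kept.
ignore_case=$(printf 'Hello\nhello\nHELLO\nworld\n' | uniq -i)

# -f 1: skip the first blank-separated field (here, a leading ID column).
skip_field=$(printf '1 apple\n2 apple\n3 banana\n' | uniq -f 1)

# -s 2: skip the first two characters of each line before comparing.
skip_chars=$(printf 'a-same\nb-same\nc-other\n' | uniq -s 2)

# -w 5: compare at most the first five characters of each line.
check_chars=$(printf 'apple pie\napple tart\nbanana\n' | uniq -w 5)

printf '%s\n' "$ignore_case" "$skip_field" "$skip_chars" "$check_chars"
```

In every case uniq prints the first line of each group of lines it considers equal, so the skipped or ignored parts still appear in the output.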
Examples
Basic Usage − Removing Duplicates
To remove duplicate lines from a file, use the uniq command as shown below −
$ cat > text.txt
Print only unique lines.
The earth is round.
The earth is round.
Welcome to the tutorialspoint...
Welcome to the tutorialspoint...
$ uniq text.txt
Print only unique lines.
The earth is round.
Welcome to the tutorialspoint...
Counting Duplicate Lines
To print the number of occurrences for each line, use the -c or --count option −
$ uniq -c text.txt
1 Print only unique lines.
2 The earth is round.
2 Welcome to the tutorialspoint...
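A common use of -c is building a frequency table. The sketch below is one way to do that, assuming a hypothetical file of log levels. The input is sorted first so that equal lines become adjacent, and the counts are then sorted numerically in descending order.

```shell
# Hypothetical input: one log level per line, in arbitrary order.
levels=$(mktemp)
printf 'error\ninfo\nerror\nwarn\ninfo\nerror\n' > "$levels"

# sort groups equal lines together, uniq -c counts each group,
# sort -rn puts the most frequent line first.
# awk strips the left padding that uniq -c adds to the count column.
counts=$(sort "$levels" | uniq -c | sort -rn | awk '{print $1, $2}')

printf '%s\n' "$counts"
```

Without the initial sort, uniq -c would report a separate count for every run of equal lines instead of one total per distinct line.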
Displaying Only Unique Lines
To print only lines that appear exactly once (no duplicates), use the -u or --unique option −
$ uniq -u text.txt
Print only unique lines.
Displaying Only Duplicate Lines
To show only lines that appear more than once, use the -d or --repeated option −
$ uniq -d text.txt
The earth is round.
Welcome to the tutorialspoint...
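The -D option from the options table differs from -d in that it prints every line of each duplicate group rather than one representative. A small sketch, assuming GNU uniq (-D is not specified by POSIX) and a temporary copy of the sample text:

```shell
# Recreate the sample text in a temporary file (the path is arbitrary).
sample=$(mktemp)
printf '%s\n' \
  'Print only unique lines.' \
  'The earth is round.' \
  'The earth is round.' \
  'Welcome to the tutorialspoint...' \
  'Welcome to the tutorialspoint...' > "$sample"

# -d: one line per duplicate group.
one_per_group=$(uniq -d "$sample")

# -D: every line that belongs to a duplicate group.
all_members=$(uniq -D "$sample")

printf '%s\n' "$all_members"
```

With this input, -d yields two lines and -D yields four, since each of the two repeated sentences occurs twice.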
Important Notes
- The uniq command only removes adjacent duplicate lines. If duplicate lines are not consecutive, they won't be removed.
- To remove all duplicates regardless of position, first sort the file using sort filename | uniq.
- You can redirect output to a new file using uniq input.txt > output.txt.
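The notes above can be checked with a short experiment. This is a sketch using a made-up five-line file in which no two equal lines are adjacent. The sort -u and awk variants are alternatives not covered by the article, shown only for comparison.

```shell
# Hypothetical input with non-adjacent duplicates.
notes=$(mktemp)
printf 'b\na\nb\nc\na\n' > "$notes"

# uniq alone: no equal lines are adjacent, so nothing is removed.
plain=$(uniq "$notes")

# sort first, then uniq: all duplicates are removed.
sorted_uniq=$(sort "$notes" | uniq)

# sort -u sorts and deduplicates in a single step.
sort_u=$(sort -u "$notes")

# awk alternative: keeps the first occurrence of each line in original order.
in_order=$(awk '!seen[$0]++' "$notes")

# Write the result to a different file. Redirecting onto the input file
# itself would truncate it before it is read.
sort "$notes" | uniq > "$notes.dedup"
```

The sort-based forms change the order of the lines. When the original order matters, an order-preserving approach such as the awk one-liner is the usual choice.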
Getting Help and Version Information
To display usage information for the uniq command −
$ uniq --help
To check version information of the uniq command −
$ uniq --version
Conclusion
The uniq command is an essential Linux utility for removing duplicate lines from sorted files. It works best when combined with the sort command to ensure all duplicates are adjacent and can be properly filtered.
