The uniq Command in Linux


Introduction

Unix-like operating systems such as Linux are known for their powerful command-line interface and extensive collection of tools. Among these tools, the "uniq" command is a popular utility that filters out repeated lines in text input. It compares only adjacent lines, so it is usually applied to sorted data. The command is often used together with other command-line tools and in shell scripts to manipulate data and automate tasks. In this article, we will explore the "uniq" command in detail, including its syntax, options, and examples of its usage.

Syntax

The syntax of the "uniq" command is simple. The basic syntax of the command is as follows −

uniq [options] [input_file] [output_file]

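Here, "options" are command-line switches that modify the behavior of the command, "input_file" is the name of the file to be processed, and "output_file" is the name of the file where the processed data will be written. Both file names are optional: without "input_file" the command reads standard input, and without "output_file" it writes to standard output.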
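As a minimal sketch of the syntax above, the three ways of supplying input and output look like this. The file names and contents are made up for illustration, and the sketch assumes the usual behavior that "uniq" falls back to standard input and standard output when the file operands are omitted.

```shell
# Work in a throwaway directory; all file names here are illustrative.
workdir=$(mktemp -d)
printf 'red\nred\nblue\n' > "$workdir/colors.txt"

# 1. No file operands: read standard input, write standard output.
from_stdin=$(uniq < "$workdir/colors.txt")

# 2. Input file only: read the file, write standard output.
from_file=$(uniq "$workdir/colors.txt")

# 3. Input file and output file: read the first, write the second.
uniq "$workdir/colors.txt" "$workdir/colors_out.txt"
to_file=$(cat "$workdir/colors_out.txt")

# All three forms produce the same two lines: "red" then "blue".
printf '%s\n' "$from_stdin"

rm -rf "$workdir"
```

The first form is the one used when "uniq" sits at the end of a pipeline.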

Options

The "uniq" command has several options that allow you to customize its behavior. Here are some of the most commonly used options −

  • -c − This option prefixes each output line with the number of times it occurred consecutively in the input. For example, if the input contains the same line twice in a row, "uniq -c" outputs that line once, prefixed with the count 2.

  • -d − This option displays only lines that are repeated on adjacent lines, printing one copy of each. In other words, it drops every line that is not immediately repeated.

  • -i − This option ignores the case of letters when comparing lines. For example, the lines "Apple" and "apple" are considered the same if the -i option is used.

  • -u − This option displays only lines that are not repeated on adjacent lines. In other words, it drops every line that has an adjacent duplicate, including the first copy.
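The effect of these four options can be compared side by side on one small input. This is only a sketch: the sample lines are invented, and the "sed" step is added here to strip the leading padding that "uniq -c" puts before each count, since the width of that padding differs between implementations.

```shell
# Illustrative sample: "apple" and "cherry" are repeated on adjacent lines.
sample='apple
apple
Apple
banana
cherry
cherry'

# -c: count consecutive occurrences (padding before the count removed).
counted=$(printf '%s\n' "$sample" | uniq -c | sed 's/^ *//')

# -d: one copy of each line that is immediately repeated.
repeated=$(printf '%s\n' "$sample" | uniq -d)

# -i: compare without regard to case; the first line of each group is kept.
nocase=$(printf '%s\n' "$sample" | uniq -i)

# -u: only lines that have no adjacent duplicate.
singles=$(printf '%s\n' "$sample" | uniq -u)

printf -- '-c:\n%s\n' "$counted"    # 2 apple / 1 Apple / 1 banana / 2 cherry
printf -- '-d:\n%s\n' "$repeated"   # apple / cherry
printf -- '-i:\n%s\n' "$nocase"     # apple / banana / cherry
printf -- '-u:\n%s\n' "$singles"    # Apple / banana
```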

Examples

Now that we have covered the syntax and options of the "uniq" command, let's explore some examples of its usage.

Example 1: Removing Duplicates

Suppose you have a file called "data.txt" that contains a list of names, with some names appearing multiple times on consecutive lines. To remove these duplicates from the file, you can use the following command −

uniq data.txt > output.txt

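This command reads "data.txt", collapses each run of identical adjacent lines into a single line, and writes the result to a new file called "output.txt". Because "uniq" compares only neighboring lines, duplicates that are separated by other lines are kept. If every duplicate must be removed, sort the data first, for example with "sort data.txt | uniq".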
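One caveat is worth a short demonstration: "uniq" compares only adjacent lines, so duplicates that are separated by other lines survive unless the input is sorted first. The sketch below uses invented sample names to show the difference.

```shell
workdir=$(mktemp -d)

# Illustrative data: "alice" appears twice, but not on adjacent lines.
printf 'alice\nbob\nalice\ncarol\n' > "$workdir/data.txt"

# Without sorting, nothing is adjacent, so all four lines are kept.
adjacent_only=$(uniq "$workdir/data.txt")

# Sorting first brings the two "alice" lines together so uniq can merge them.
deduplicated=$(sort "$workdir/data.txt" | uniq)

printf 'uniq alone:\n%s\n' "$adjacent_only"     # alice / bob / alice / carol
printf 'sort | uniq:\n%s\n' "$deduplicated"     # alice / bob / carol

rm -rf "$workdir"
```

When only the deduplicated list is needed, "sort -u data.txt" gives the same result in one step. The pipeline form is useful when "uniq" options such as -c or -d are also wanted.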

Example 2: Counting Duplicates

Suppose you want to know how many times each name appears in the "data.txt" file. You can use the following command −

uniq -c data.txt > output.txt

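This command reads "data.txt", counts how many times each line appears consecutively, and writes the results to a new file called "output.txt". Each output line is preceded by its count. To get a single total per name regardless of where it appears in the file, sort the file before passing it to "uniq -c".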
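A common variation is a frequency table: sort the input so equal lines are adjacent, count them, then sort the counts in descending order. The following sketch uses invented names, and the final "sed" is added only to remove the implementation-dependent padding in front of each count.

```shell
workdir=$(mktemp -d)

# Illustrative, unsorted list of names.
printf 'bob\nalice\nbob\ncarol\nbob\nalice\n' > "$workdir/data.txt"

# sort     : group identical names together
# uniq -c  : one line per name, prefixed by its count
# sort -rn : highest count first
ranking=$(sort "$workdir/data.txt" | uniq -c | sort -rn | sed 's/^ *//')

printf '%s\n' "$ranking"
# 3 bob
# 2 alice
# 1 carol

rm -rf "$workdir"
```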

Example 3: Ignoring Case

Suppose you have a file called "data.txt" that contains a list of names, with some names appearing on consecutive lines in different cases. To remove these duplicates regardless of case, you can use the following command −

uniq -i data.txt > output.txt

This command reads "data.txt", treats adjacent lines that differ only in letter case as duplicates, keeps the first line of each such group, and writes the result to a new file called "output.txt".
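If the differently cased spellings are not on adjacent lines, the input has to be sorted case-insensitively first. The sketch below pairs "sort -f" (fold lowercase to uppercase when comparing) with "uniq -i". The names are invented, and no exact output is shown because which spelling survives depends on the order "sort" gives to lines that differ only in case.

```shell
workdir=$(mktemp -d)

# Illustrative data: the same two names in different cases, not adjacent.
printf 'Alice\nbob\nalice\nBOB\n' > "$workdir/data.txt"

# uniq -i alone finds no adjacent duplicates, so all four lines remain.
unsorted_result=$(uniq -i "$workdir/data.txt")

# sort -f places "Alice"/"alice" and "bob"/"BOB" next to each other,
# so uniq -i can reduce each pair to a single line.
deduplicated=$(sort -f "$workdir/data.txt" | uniq -i)

printf 'uniq -i alone:\n%s\n' "$unsorted_result"
printf 'sort -f | uniq -i:\n%s\n' "$deduplicated"   # one spelling per name

rm -rf "$workdir"
```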

Example 4: Displaying Duplicates

Suppose you want to see only the lines that are repeated in the "data.txt" file. You can use the following command −

uniq -d data.txt > output.txt

This command reads "data.txt", discards every line that is not immediately repeated, and writes one copy of each repeated line to a new file called "output.txt".

Example 5: Displaying Unique Lines

Suppose you want to see only the lines that occur once in the "data.txt" file. You can use the following command −

uniq -u data.txt > output.txt

This command reads "data.txt", discards every line that has an adjacent duplicate (all copies, not just the extra ones), and writes only the remaining lines to a new file called "output.txt".
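Taken together, the -d and -u options split a list into two groups: entries seen more than once and entries seen exactly once. As a rough sketch with invented names, and sorting first so that the result does not depend on where the duplicates sit in the file:

```shell
workdir=$(mktemp -d)

# Illustrative, unsorted list: only "bob" appears more than once.
printf 'bob\nalice\nbob\ncarol\n' > "$workdir/data.txt"

# Sort once and reuse the result for both reports.
sort "$workdir/data.txt" > "$workdir/sorted.txt"

seen_many=$(uniq -d "$workdir/sorted.txt")   # names that occur more than once
seen_once=$(uniq -u "$workdir/sorted.txt")   # names that occur exactly once

printf 'repeated:\n%s\n' "$seen_many"   # bob
printf 'unique:\n%s\n' "$seen_once"     # alice / carol

rm -rf "$workdir"
```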

Uses of the "uniq" command

The "uniq" command is a versatile tool that can be used in various situations. Some of the most common use cases of the command are −

  • Data cleaning − The "uniq" command can be used to remove duplicate entries from data files, making them easier to analyze.

  • Data analysis − With the "-c" option, the "uniq" command can count the number of occurrences of each line in a sorted file, which gives a quick frequency summary.

  • Data transformation − The "uniq" command can be used in conjunction with other command-line tools, such as "sort" and "cut", to transform data files into different formats.

  • Scripting − The "uniq" command can be used in shell scripts to automate tasks involving text files.

  • Version control − The "uniq" command can be used to remove duplicate entries from line-oriented text files, such as sorted lists, making them easier to manage in version control systems.
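As one illustration of combining "uniq" with other tools, the sketch below lists the distinct values in the first column of a comma-separated file and counts them. The file name "orders.csv" and its contents are hypothetical.

```shell
workdir=$(mktemp -d)

# Hypothetical CSV with two columns: customer,item
printf 'alice,pen\nbob,ink\nalice,paper\ncarol,pen\n' > "$workdir/orders.csv"

# cut  : keep only the first comma-separated field
# sort : bring identical customers together
# uniq : keep one line per customer
customers=$(cut -d, -f1 "$workdir/orders.csv" | sort | uniq)

# Count the distinct customers (tr removes padding some wc versions add).
customer_count=$(printf '%s\n' "$customers" | wc -l | tr -d ' ')

printf '%s\n' "$customers"                      # alice / bob / carol
printf 'distinct customers: %s\n' "$customer_count"   # 3

rm -rf "$workdir"
```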

Common errors with the "uniq" command

While the "uniq" command is generally straightforward to use, there are some common problems that users may encounter. Here are some of the most common ones and how to fix them −

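  • No input file given − When the "uniq" command is run without an input file, it does not report an error. It reads from standard input, so in a terminal it appears to hang while it waits for typed input. To fix this, provide the command with the name of the file to be processed, or pipe data into it.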

  • Output file is the same as the input file − A command such as "uniq data.txt > data.txt" empties the file, because the shell truncates "data.txt" before "uniq" reads it. To avoid this, specify a different output file name and rename the result afterwards if needed.

  • "uniq: data.txt: No such file or directory" − An error of this form (as printed by GNU "uniq", with the file name you supplied) occurs when the "uniq" command cannot open the specified input file. To fix this error, make sure that the file exists and that the correct file path is specified.

  • "uniq: invalid option" − This error occurs when an invalid option is specified with the "uniq" command. To fix this error, make sure that the option is spelled correctly and that it is supported by the version of the "uniq" command that you are using.
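Two of these situations can be handled defensively in a script. The sketch below checks whether "uniq" succeeded, and it updates a file "in place" by writing to a temporary file first. The file names are illustrative, and the ".tmp" suffix is just a convention chosen here.

```shell
workdir=$(mktemp -d)
printf 'x\nx\ny\n' > "$workdir/data.txt"

# A missing input file makes uniq print a diagnostic and exit non-zero,
# so the exit status can be tested directly.
if uniq "$workdir/no_such_file.txt" > /dev/null 2>&1; then
    missing_status=ok
else
    missing_status=failed
fi

# Never redirect output onto the input file. Write to a temporary file
# and replace the original only if uniq succeeded.
if uniq "$workdir/data.txt" > "$workdir/data.txt.tmp"; then
    mv "$workdir/data.txt.tmp" "$workdir/data.txt"
fi
result=$(cat "$workdir/data.txt")

printf 'missing file: %s\n' "$missing_status"   # failed
printf '%s\n' "$result"                          # x / y

rm -rf "$workdir"
```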

Conclusion

In conclusion, the "uniq" command is a powerful tool for filtering out repeated adjacent lines in text files, and it is most effective when combined with "sort". Its options let you tailor its behavior to your needs. Whether you want to remove duplicates, count duplicates, ignore case, or display only unique lines, the "uniq" command is a useful tool to have in your Linux toolkit. With the examples provided in this article, you can start using the "uniq" command with confidence in your everyday work.

Updated on: 24-Mar-2023
