Delete Lines in a Text File That Contain a Specific String


Introduction

Text files are widely used for storing data in fields such as computer science, engineering, healthcare, and finance. Sometimes it is necessary to remove the lines of a text file that contain a specific string or pattern. This task can be accomplished with many tools and languages, including Python, Perl, and Bash. In this article, we will show how to delete lines that contain a specific string using Python and using the sed command in Bash.

Prerequisites

Before proceeding, make sure the following prerequisites are in place −

  • A text file containing data and information.

  • A programming language such as Python, Perl, Bash, etc.

  • Basic knowledge of regular expressions (regex) for pattern matching.

Method 1: Using Python

Python is a popular programming language that is used for a variety of tasks including data analysis, web development, machine learning, etc. It is also well-suited for text processing tasks such as deleting lines that contain a specific string.

  • Step 1 − Open the text file. The first step is to open the text file that we want to modify using Python's open() function. We also need to specify the mode in which the file is opened (read, write, append, etc.); "r" opens it for reading.

file = open("data.txt", "r")

  • Step 2 − Read the file. The next step is to read the contents of the file using the readlines() method, which reads all lines in the file and returns them as a list of strings. Each string keeps its trailing newline character.

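lines = file.readlines()
file.close()  # the contents are now in memory, so the file can be closed

  • Step 3 − Remove lines that contain a specific string. The next step is to keep only the lines that do not contain the string, using a for loop and an if statement. The in operator tests whether the string occurs anywhere in the line. The strip() call removes whitespace characters (spaces, tabs, newlines, etc.) from the beginning and end of the line before the test; it is optional here, because surrounding whitespace only affects a substring test when the search string itself starts or ends with whitespace.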

new_lines = []
for line in lines:
   if "specific string" not in line.strip():
      new_lines.append(line)

  • Step 4 − Write the modified data to the file. The final step is to write the remaining lines back to the file. Opening the file in "w" mode truncates it, and the writelines() method then writes the list of kept lines. We also need to close the file after we are done with it.

file = open("data.txt", "w")
file.writelines(new_lines)
file.close()
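
The four steps above can also be collected into one function. The following is a minimal sketch, not a definitive implementation: the function name remove_lines_containing and the encoding parameter are assumptions made for illustration. It uses with blocks so that the file is closed automatically, even if an error occurs while reading or writing.

```python
def remove_lines_containing(path, needle, encoding="utf-8"):
    """Rewrite the file at `path` without the lines that contain `needle`.

    Returns the number of lines that were removed.
    (Hypothetical helper; the name and parameters are illustrative.)
    """
    # Steps 1 and 2: open the file for reading and load every line.
    with open(path, "r", encoding=encoding) as f:
        lines = f.readlines()

    # Step 3: keep only the lines that do not contain the search string.
    kept = [line for line in lines if needle not in line]

    # Step 4: reopen in write mode (this truncates the file) and save.
    with open(path, "w", encoding=encoding) as f:
        f.writelines(kept)

    return len(lines) - len(kept)
```

A call such as remove_lines_containing("data.txt", "specific string") would then perform the whole procedure and report how many lines were dropped.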

Example: Suppose we have a text file called data.txt that contains the following lines −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 4: Line 4 contains specific string that we want to remove.
Line 5: This is line 5.

If we want to remove all lines that contain the string "specific string", we can use the following Python code −

file = open("data.txt", "r")
lines = file.readlines()
new_lines = []
for line in lines:
   if "specific string" not in line.strip():
      new_lines.append(line)
file.close()
file = open("data.txt", "w")
file.writelines(new_lines)
file.close()

After running this code, the contents of data.txt will be −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 5: This is line 5.
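
The code above loads the whole file into memory with readlines() and then overwrites the original. For large files, or when a half-written file would be a problem, one possible alternative is to stream the lines into a temporary file and swap it in at the end. The sketch below assumes a hypothetical function name, remove_lines_streaming, and assumes that the directory containing the file is writable.

```python
import os
import shutil
import tempfile


def remove_lines_streaming(path, needle, encoding="utf-8"):
    """Remove lines containing `needle` without loading the whole file.

    Hypothetical helper: lines are streamed into a temporary file in the
    same directory, which then replaces the original.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as dst, \
                open(path, "r", encoding=encoding) as src:
            for line in src:  # iterates one line at a time
                if needle not in line:
                    dst.write(line)
        # mkstemp creates a private file; copy the original permission bits.
        shutil.copymode(path, tmp_path)
        # Swap the filtered file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the temporary file and leave the original.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The trade-off is a little more code in exchange for constant memory use and an original file that stays intact if the script is interrupted midway.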

Method 2: Using Bash

Bash is a Unix shell and command language that is commonly used for scripting and automation tasks. Together with standard utilities such as sed, it provides a simple and efficient way to delete lines that contain a specific string from a text file.

  • Step 1 − Create a backup of the original file. The first step is to create a backup of the original file. This is important in case we make a mistake while modifying the file and want to restore it to its original state. We can use the cp command to create the backup, for example: cp data.txt data.txt.bak
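
For readers who drive the whole task from Python instead of the shell, the backup step can be sketched with the standard shutil module. The function name backup_file and the ".bak" suffix are assumptions chosen for illustration.

```python
import shutil


def backup_file(path, suffix=".bak"):
    """Copy `path` to `path + suffix` and return the backup's path.

    Hypothetical helper; shutil.copy2 also copies file metadata such as
    modification times where the platform allows it.
    """
    backup_path = path + suffix
    shutil.copy2(path, backup_path)
    return backup_path
```

Note that an existing backup with the same name would be overwritten, just as with a plain cp.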

  • Step 2 − Use the sed command to delete lines. The next step is to use the sed command to delete lines that contain a specific string. sed is a stream editor: it reads its input line by line, applies the given commands to each line, and prints the result, which makes it well suited to operations such as deleting lines or search and replace.

The sed command takes the following form −

sed '/pattern/action' filename

Where pattern is the regular expression that we want to match, and action is the command that sed performs on each line where the pattern is found (for example, d to delete the line).

To delete lines that contain a specific string, we can use the following sed command −

sed '/specific string/d' data.txt

This command prints every line of data.txt except those that contain the string "specific string". On its own it only writes the result to standard output; data.txt itself is not modified.
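
Because sed treats the text between the slashes as a regular expression, characters such as "." have a special meaning. The Python sketch below illustrates the difference between a regex match and a literal match using the re module. The sample lines and the search string "1.5" are made up for illustration, and Python's regex dialect differs from sed's in other details, so this is only an analogy for how "." behaves.

```python
import re

# Made-up sample data for illustration.
lines = ["price: 1.5\n", "price: 125\n", "no price\n"]
needle = "1.5"

# Used as a regex, "." matches any character, so "125" matches as well.
kept_regex = [line for line in lines if not re.search(needle, line)]

# re.escape makes every character literal, so only "1.5" itself matches.
kept_literal = [
    line for line in lines if not re.search(re.escape(needle), line)
]

print(kept_regex)    # ['no price\n']
print(kept_literal)  # ['price: 125\n', 'no price\n']
```

In sed, the corresponding precaution is to escape special characters in the pattern with a backslash, for example '/1\.5/d'.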

  • Step 3 − Save the modified file. The final step is to save the modified file. We can redirect the output of the sed command to a new file, or we can use the -i option to edit the file in place.

To redirect the output to a new file, we can use the following command −

sed '/specific string/d' data.txt > newdata.txt

This command will create a new file called newdata.txt that contains all lines from the original file except the ones that contain the string "specific string".

To edit the file in place, we can use the following command −

sed -i '/specific string/d' data.txt

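This command will modify the data.txt file directly, deleting all lines that contain the string "specific string". The form shown works with GNU sed; the sed shipped with macOS and other BSD systems requires a backup suffix after -i, which may be empty: sed -i '' '/specific string/d' data.txt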
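
A rough Python counterpart to in-place editing is the standard fileinput module, which can redirect printed output back into the file while keeping a backup copy. The sketch below assumes a hypothetical function name, delete_lines_in_place, and a ".bak" backup suffix.

```python
import fileinput


def delete_lines_in_place(path, needle, backup_suffix=".bak"):
    """Delete lines containing `needle` from `path`, editing it in place.

    Hypothetical helper. With inplace=True, fileinput moves the original
    to `path + backup_suffix` and redirects print() output into `path`.
    """
    with fileinput.input(
        files=[path], inplace=True, backup=backup_suffix
    ) as stream:
        for line in stream:
            if needle not in line:
                # Lines already end with a newline, so suppress print's own.
                print(line, end="")
```

After a call such as delete_lines_in_place("data.txt", "specific string"), the original content remains available as data.txt.bak.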

Example: Suppose we have a text file called data.txt that contains the following lines −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 4: Line 4 contains specific string that we want to remove.
Line 5: This is line 5.

If we want to remove all lines that contain the string "specific string", we can use the following sed command −

sed '/specific string/d' data.txt

After running this command, the output will be −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 5: This is line 5.

If we want to save the modified data to a new file, we can use the following command −

sed '/specific string/d' data.txt > newdata.txt

After running this command, the contents of newdata.txt will be −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 5: This is line 5.

If we want to modify the data.txt file directly, we can use the following command −

sed -i '/specific string/d' data.txt

After running this command, the contents of data.txt will be −

Line 1: This is a sample text file.
Line 2: It contains some lines that we want to remove.
Line 3: This is line 3.
Line 5: This is line 5.

Conclusion

In this article, we have discussed two methods for deleting lines in a text file that contain a specific string. The first method used Python, and the second used the sed command from a Bash shell. Each method has its own advantages and disadvantages, and the choice depends on the user's preferences and requirements.

Python is a popular programming language that is easy to learn and use. It has a large number of libraries and modules that can be used to perform various operations on text files. However, Python may not be the best choice for users who are not familiar with programming, or for users who want to perform simple operations quickly.

The sed command is a stream editor that is available on most Unix-like systems. It has a compact syntax and can be used to perform various operations on text files, including deleting lines and search and replace. However, users may need to learn some basic regular expression syntax to use the sed command effectively.

In conclusion, deleting lines in a text file that contain a specific string is a common task that can be accomplished using various methods. Python and the sed command are two popular options for this task. By choosing the right method, users can perform the task quickly and easily.

Updated on: 24-Mar-2023
