 
Remove Lines Which Appear in File B From Another File A in Linux
You can use the grep command in Linux to remove the lines from file A that appear in file B.
The basic syntax is:
grep -v -f fileB.txt fileA.txt > outputFile.txt
The -f option reads patterns from fileB.txt, one per line, and the -v option inverts the match, so grep prints only the lines of fileA.txt that match none of those patterns. The output is redirected to a new file called outputFile.txt. By default each pattern is a regular expression that may match anywhere in a line, so add -x (whole-line match) and -F (fixed strings) if only exact, literal matches should be removed.
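Alternatively, you can use the sed command. sed cannot read its patterns from a data file directly, so one approach (GNU sed assumed) is to convert each line of file B into a delete command and feed the result to a second sed as its script:
sed 's|[][\.*^$/]|\\&|g; s|.*|/^&$/d|' fileB.txt | sed -i -f /dev/stdin fileA.txt
The first sed escapes regular-expression metacharacters and wraps every line of fileB.txt as /^line$/d, where /.../d deletes the lines matching the pattern. The second sed uses the -i option to edit fileA.txt in place, so keep a backup if you need the original.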
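One way to see how sed can be driven by the contents of file B is to generate the delete script as a separate, visible step. The sketch below is illustrative only: the sample data and the delete.sed file name are made up, GNU sed is assumed, and the result is written to a new file rather than edited in place.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf '%s\n' 'a.c' 'abc' 'x/y' 'keep' > fileA.txt
printf '%s\n' 'a.c' 'x/y' > fileB.txt

# Step 1: escape regex metacharacters and "/", then wrap each
# line of file B as a whole-line delete command: /^line$/d
sed 's|[][\.*^$/]|\\&|g; s|.*|/^&$/d|' fileB.txt > delete.sed
cat delete.sed        # /^a\.c$/d and /^x\/y$/d

# Step 2: run the generated script against file A.
sed -f delete.sed fileA.txt > outputFile.txt
cat outputFile.txt    # abc, keep
```

Because the dot is escaped, the line abc survives even though the unescaped pattern a.c would match it.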
You can also use the awk command:
awk 'FNR==NR{a[$0];next} !($0 in a)' fileB.txt fileA.txt > outputFile.txt
While reading fileB.txt (the only time FNR equals NR), awk stores every line as a key of the array a. While reading fileA.txt, it prints only the lines that are not keys of a, and the output is redirected to outputFile.txt.
Using the comm and sort Commands
You can use the comm and sort commands in Linux to remove the lines from file A that appear in file B.
First, you need to sort both files:
sort fileA.txt > fileA_sorted.txt
sort fileB.txt > fileB_sorted.txt
Then, use the comm command to compare the two sorted files:
comm -23 fileA_sorted.txt fileB_sorted.txt > outputFile.txt
The -23 option suppresses column 2 (lines unique to file B) and column 3 (lines common to both files), so comm prints only the lines that are unique to file A. The output is redirected to a new file called outputFile.txt.
For the reverse operation, you can use
comm -13 fileA_sorted.txt fileB_sorted.txt > outputFile.txt
This prints only the lines that appear in file B but not in file A.
It's important to note that both files need to be sorted, in the same collation order that comm uses, before running the comm command. The output is therefore in sorted order rather than in the original order of file A.
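The sort-then-compare steps can be tried end to end on a small sample. This is a sketch with made-up file contents; setting LC_ALL=C is an assumption added here so that sort and comm agree on one byte-wise ordering.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf '%s\n' 'pear' 'apple' 'fig' 'apple' > fileA.txt
printf '%s\n' 'fig' 'apple' > fileB.txt

# Sort both inputs with the same collation that comm will use.
LC_ALL=C sort fileA.txt > fileA_sorted.txt
LC_ALL=C sort fileB.txt > fileB_sorted.txt

# -2 hides lines unique to file B, -3 hides lines common to both,
# leaving only the lines unique to file A.
LC_ALL=C comm -23 fileA_sorted.txt fileB_sorted.txt > outputFile.txt

cat outputFile.txt   # apple, pear
```

Note that one apple survives: comm pairs duplicate lines one for one, so file A's second copy has no partner in file B. If every copy should be removed, deduplicating with sort -u first is one option.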
Using the join and sort Commands
You can use the join and sort commands in Linux to remove the lines from file A that appear in file B.
First, you need to sort both files:
sort fileA.txt > fileA_sorted.txt
sort fileB.txt > fileB_sorted.txt
Then, use the join command to compare the two sorted files:
join -v 1 fileA_sorted.txt fileB_sorted.txt > outputFile.txt
The -v 1 option tells join to print only the unpairable lines of the first file, that is, the lines of file A whose join field has no match in file B. The output is redirected to a new file called outputFile.txt.
For the reverse operation, you can use
join -v 2 fileA_sorted.txt fileB_sorted.txt > outputFile.txt
This prints only the lines of file B whose join field has no match in file A.
It's important to note that both files need to be sorted on the join field before using the join command. join compares only that field, which by default is the first blank-separated field of each line, not the whole line. The result therefore matches whole-line removal only when every line consists of a single field, or when the first field alone identifies a line.
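A short experiment shows how much the join field matters. The sample records below are made up, and using a tab as the separator is an assumption that only holds when the data itself contains no tabs.

```shell
#!/bin/sh
# Hypothetical sample data, already in sorted order.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf '%s\n' 'alice 30' 'bob 25' 'carol 41' > fileA_sorted.txt
printf '%s\n' 'bob 99' > fileB_sorted.txt

# Default: only the first blank-separated field is compared,
# so "bob 25" is dropped although file B holds "bob 99".
LC_ALL=C join -v 1 fileA_sorted.txt fileB_sorted.txt > by_field.txt

# With a tab as separator (assumes no tabs in the data), each whole
# line is a single field, so only identical lines would be dropped.
tab=$(printf '\t')
LC_ALL=C join -t "$tab" -v 1 fileA_sorted.txt fileB_sorted.txt > by_line.txt

cat by_field.txt   # alice 30, carol 41
cat by_line.txt    # alice 30, bob 25, carol 41
```

The second form behaves like a whole-line comparison, which is usually what removing the lines of file B from file A calls for.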
Using the grep Command
The grep command filters file A against a list of patterns taken from file B.
The basic syntax is:
grep -v -f fileB.txt fileA.txt > outputFile.txt
Here -f fileB.txt supplies the patterns, one per line, and -v inverts the match, so only the lines of fileA.txt that match no pattern are written to outputFile.txt.
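For exact line matching, you can use
grep -vxf fileB.txt fileA.txt > outputFile.txt
This command also uses -v to invert the match and -f to read the patterns from a file, and adds the -x option so that a pattern must match the whole line rather than any part of it. Adding -F as well (grep -vxFf) makes grep treat the patterns as fixed strings instead of regular expressions.
It's important to note that without -x a pattern removes every line that contains it, and that an empty line in file B then matches every line of file A, which leaves the output empty. Duplicate lines need no special care: every copy of a line listed in file B is removed, and every copy of a line not listed is kept.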
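A small worked example may make the difference between loose and exact matching concrete. The file contents and the loose.txt and exact.txt names are made up for illustration, and -F is added on the assumption that the lines of file B are literal text rather than regular expressions.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf '%s\n' 'apple' 'apple pie' 'a.c' 'abc' 'banana' > fileA.txt
printf '%s\n' 'apple' 'a.c' > fileB.txt

# Patterns are regular expressions matched anywhere in the line:
# "apple" also removes "apple pie", and "a.c" also removes "abc".
grep -v -f fileB.txt fileA.txt > loose.txt

# -F: treat patterns as literal strings; -x: match whole lines only.
grep -vxFf fileB.txt fileA.txt > exact.txt

cat loose.txt   # banana
cat exact.txt   # apple pie, abc, banana
```

Only the second command removes exactly the lines that appear in file B.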
Using the awk Command
You can use the awk command in Linux to remove the lines from file A that appear in file B.
The basic syntax is:
awk 'FNR==NR{a[$0];next} !($0 in a)' fileB.txt fileA.txt > outputFile.txt
This command first loads every line of fileB.txt into the array a, then prints the lines of fileA.txt that do not exist in fileB.txt into outputFile.txt.
You may also see the condition written the other way around:
awk 'NR==FNR{a[$0];next} !($0 in a)' fileB.txt fileA.txt > outputFile.txt
NR==FNR and FNR==NR are the same test, so this command behaves identically: it prints the lines from fileA.txt that do not exist in fileB.txt into outputFile.txt.
It's important to note that awk compares whole lines as literal strings and keeps the original order of file A. Duplicates are handled per line: every copy of a line found in file B is removed, and every copy of any other line is kept. One edge case is an empty fileB.txt, for which NR==FNR stays true while fileA.txt is read, so nothing is printed.
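The behavior with duplicates and with an empty file B can be checked on sample data. In this sketch the file contents and the empty.txt, wrong.txt, and right.txt names are made up, and the FILENAME==ARGV[1] variant is one possible guard rather than part of the original commands; it assumes the two file names differ.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf '%s\n' 'pear' 'apple' 'fig' 'pear' > fileA.txt
printf '%s\n' 'apple' > fileB.txt
: > empty.txt

# Order is preserved and both copies of "pear" are kept.
awk 'NR==FNR{a[$0];next} !($0 in a)' fileB.txt fileA.txt > outputFile.txt
cat outputFile.txt   # pear, fig, pear

# Empty first file: NR==FNR stays true for fileA.txt, so every
# line is stored and skipped, and nothing is printed.
awk 'NR==FNR{a[$0];next} !($0 in a)' empty.txt fileA.txt > wrong.txt

# Testing the file name instead of the record counters avoids that.
awk 'FILENAME==ARGV[1]{a[$0];next} !($0 in a)' empty.txt fileA.txt > right.txt
```

Here wrong.txt ends up empty, while right.txt is an unchanged copy of fileA.txt.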
Conclusion
There are several ways to remove the lines from file A that appear in file B in Linux, such as the grep, comm, join, sed, and awk commands. The grep command uses the -v option to invert the match and the -f option to read the patterns from a file; adding -x and -F restricts it to exact, literal line matches. The comm and join commands require both files to be sorted first, and join compares only the join field rather than the whole line. The sed command uses the -i option to edit the file in place, with /.../d deleting the lines that match a pattern. The awk command uses FNR==NR{a[$0];next} !($0 in a), or the equivalent NR==FNR form, to print the lines from file A that do not exist in file B. The tools differ in how they treat duplicate lines: grep and awk remove every copy of a listed line, while comm pairs duplicates one for one, so check the output if your files contain repeated lines.
