How to Exclude Directories While Using the grep Command


Overview

We often run the grep command to look for specific strings of text within files. It also offers options that refine the search, including one that excludes certain directories from a recursive search. This is useful when searching through large directory trees.

Grep can be used with the -r option, which searches a directory and all of its subdirectories, and with the --exclude-dir option, which skips the directories you name. (The -v option is unrelated: it inverts the match and prints the lines that do not match the pattern.)

We’ll discuss the different ways to achieve this.

Exclude Single Directory

A plain recursive search visits every subdirectory. For example:

grep -r 'pattern' /path/to/directory1

This will find matches in all files in the specified directory and any of its subdirectories. However, it won’t exclude anything.

Note that the -d flag does not exclude a directory by name. It takes an action (read, skip, or recurse) that tells grep how to handle directories in general. To exclude a single directory, use the --exclude-dir option instead. So, if you want to skip a directory named home, you could use:

grep -r --exclude-dir=home 'pattern' /path/to

We're going to create some files and folders to use as an illustration.

$ mkdir tdir1 tdir2 tdir3 logs apache-logs
$ echo "This is sample text from tdir1/file1.txt file" > tdir1/file1.txt
$ echo "This is sample text from tdir2/file2.txt file" > tdir2/file2.txt
$ echo "This is sample text from tdir3/file3.txt file" > tdir3/file3.txt
$ echo "This is sample text from logs/service.log file" > logs/service.log
$ echo "This is sample text from apache-logs/apache.log file" > apache-logs/apache.log

Let’s now look at the directory tree we just created −

$ tree -h .
.
├── [4.0K]  tdir1
│   └── [  45]  file1.txt
├── [4.0K]  tdir2
│   └── [  45]  file2.txt
├── [4.0K]  tdir3
│   └── [  45]  file3.txt
├── [4.0K]  logs
│   └── [  47]  service.log
└── [4.0K]  apache-logs
    └── [  51]  apache.log

5 directories, 5 files

We can use the --exclude-dir option of the grep command to exclude a directory:

$ grep -R "sample" --exclude-dir=tdir1
logs/service.log:This is sample text from logs/service.log file
tdir3/file3.txt:This is sample text from tdir3/file3.txt file
tdir2/file2.txt:This is sample text from tdir2/file2.txt file
apache-logs/apache.log:This is sample text from apache-logs/apache.log file

In the above example, the grep command searches for a pattern in all directories except tdir1.
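One detail that is easy to miss: assuming GNU grep, --exclude-dir is compared against the base name of each subdirectory, so a directory with the excluded name is skipped at any depth, not only at the top level. The sketch below illustrates this in a throwaway directory; the `demo` and `result` variables and the file names are made up for the illustration, and other grep implementations may behave differently.

```shell
# Build a small tree in a temporary directory (all names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/tdir1" "$demo/tdir2/tdir1"
echo "sample" > "$demo/tdir1/a.txt"
echo "sample" > "$demo/tdir2/b.txt"
echo "sample" > "$demo/tdir2/tdir1/c.txt"

# -l prints only the names of matching files; sort keeps the order stable.
# Both tdir1 and the nested tdir2/tdir1 are skipped.
result=$(cd "$demo" && grep -R -l --exclude-dir=tdir1 "sample" . | LC_ALL=C sort)
echo "$result"

# Clean up the temporary tree.
rm -rf "$demo"
```

Only the file in tdir2 itself should be listed, since both directories named tdir1 are excluded.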

Exclude Multiple Directories

To exclude more than one directory, repeat the --exclude-dir option once for each directory. The option takes a single glob pattern, not a regular expression, so joining directory names with the pipe character (|) does not work. Wildcards are allowed in the pattern; they are covered in the pattern matching section below.

$ grep -R "sample" --exclude-dir=tdir1 --exclude-dir=tdir2 --exclude-dir=tdir3
logs/service.log:This is sample text from logs/service.log file
apache-logs/apache.log:This is sample text from apache-logs/apache.log file

In the above example, the grep command searches for a pattern in all directories except tdir1, tdir2, and tdir3.

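You can use an alternative syntax to achieve the same result. In shells that support brace expansion, such as Bash and Zsh, we can list the directories in curly braces. The shell, not grep, expands the list into one --exclude-dir option per directory before grep runs.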

$ grep -R "sample" --exclude-dir={tdir1,tdir2,tdir3}
logs/service.log:This is sample text from logs/service.log file
apache-logs/apache.log:This is sample text from apache-logs/apache.log file

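Note that there shouldn’t be any spaces before or after the commas. The list also needs at least two entries: with a single entry, Bash passes the braces to grep literally and nothing is excluded.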
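To see what grep actually receives, we can let the shell expand the braces and print the result with echo instead of running grep. This is a small sketch that assumes Bash is installed; it calls bash explicitly because a plain POSIX sh may not perform brace expansion. The variable names `expanded` and `single` are illustrative.

```shell
# Ask bash to expand the brace list and print the resulting arguments.
expanded=$(bash -c 'echo --exclude-dir={tdir1,tdir2,tdir3}')
echo "$expanded"

# With a single entry there is nothing to expand, so the braces stay literal.
single=$(bash -c 'echo --exclude-dir={tdir1}')
echo "$single"
```

The first line printed shows three separate --exclude-dir options, while the second still contains the braces, which is why the brace form is only useful for two or more directories.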

Exclude Directories Using Pattern Matching

If we want to exclude a lot of directories at once, we can often match them with a single pattern. The --exclude-dir option takes a shell-style glob pattern, not a regular expression, and supports the following wildcard characters:

  • ? matches exactly one character

  • * matches zero or more characters

  • \ quotes a wildcard so that it is matched literally

Let’s use the pattern tdir? to exclude the tdir1, tdir2, and tdir3 directories. The pattern is quoted so that the shell passes it to grep unchanged:

$ grep -R "sample" --exclude-dir='tdir?'
logs/service.log:This is sample text from logs/service.log file
apache-logs/apache.log:This is sample text from apache-logs/apache.log file
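The difference between ? and * shows up as soon as a directory name has more than one trailing character. The following sketch, assuming GNU grep and using made-up directory names (`tdir10` is not part of the tree above), compares the two patterns in a temporary directory.

```shell
# Build a small tree in a temporary directory (all names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/tdir1" "$demo/tdir10" "$demo/logs"
echo "sample" > "$demo/tdir1/a.txt"
echo "sample" > "$demo/tdir10/b.txt"
echo "sample" > "$demo/logs/c.log"

# 'tdir?' matches tdir1 (one trailing character) but not tdir10 (two).
one_char=$(cd "$demo" && grep -R -l --exclude-dir='tdir?' "sample" . | LC_ALL=C sort)

# 'tdir*' matches any name that starts with tdir, so both are skipped.
any_len=$(cd "$demo" && grep -R -l --exclude-dir='tdir*' "sample" . | LC_ALL=C sort)

echo "$one_char"
echo "$any_len"

# Clean up the temporary tree.
rm -rf "$demo"
```

With tdir?, the file under tdir10 is still searched; with tdir*, only the file under logs remains.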

Let’s use the patterns logs* and *logs to exclude directories whose names either start or end with logs. The backslashes keep the shell from expanding the asterisks before grep sees them:

$ grep -R "sample" --exclude-dir={logs\*,\*logs}
tdir1/file1.txt:This is sample text from tdir1/file1.txt file
tdir3/file3.txt:This is sample text from tdir3/file3.txt file
tdir2/file2.txt:This is sample text from tdir2/file2.txt file

Conclusion

We discussed three practical ways to exclude directories during a recursive grep search: a single --exclude-dir option, several --exclude-dir options (or shell brace expansion), and glob patterns. These commands are handy in day-to-day work on a Linux system.

Updated on: 26-Dec-2022
