Filtering Files Copied When Using rsync on Linux


Introduction

The Linux command-line utility rsync is a flexible tool for synchronizing files and directories between computers and storage locations. It is commonly used for backups, file transfers and data replication. One of its most useful features is the ability to choose which files are transferred, mainly through include and exclude patterns that match file names and through size limits. In this article, we will look at how to use rsync filters to selectively copy files during directory synchronization on a Linux system.

Rsync Overview

Rsync is an efficient file transfer tool. It compares the source and destination and copies only the files that are new or have changed in the source, by default judging this from each file's size and modification time. This makes rsync well suited to transferring large amounts of data over a network, because it minimizes the amount of data sent. When run with the --delete option, rsync can also remove files from the destination that no longer exist in the source, which helps keep the two sides consistent.
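
The difference between a plain synchronization and one that also removes stale files can be tried in a throwaway sandbox. This is a minimal sketch, assuming rsync is installed; the directory and file names are invented for the demonstration.

```shell
# Throwaway sandbox; every name below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
echo "v1" > "$demo/source/keep.txt"
echo "v1" > "$demo/source/gone.txt"

# Initial copy: both files arrive in the destination.
rsync -a "$demo/source/" "$demo/destination/"

# Remove a file from the source and synchronize again without --delete.
rm "$demo/source/gone.txt"
rsync -a "$demo/source/" "$demo/destination/"
before_delete=$(ls "$demo/destination")   # gone.txt is still listed

# With --delete, files missing from the source are removed from the destination.
rsync -a --delete "$demo/source/" "$demo/destination/"
after_delete=$(ls "$demo/destination")    # only keep.txt remains

printf 'before --delete:\n%s\n\nafter --delete:\n%s\n' "$before_delete" "$after_delete"
```

Because --delete removes data, it is worth previewing such a run with -n (--dry-run) before using it on real directories.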

Using the Include Option

Rsync chooses which files to transfer by matching their names against filter rules. The include option selects files that match a pattern, such as a file extension. It works together with the exclude option, which skips files that match a pattern.

For example, suppose you want to transfer only the text files in the source directory. We will use the include option to select the files we want to transfer and the exclude option with the wildcard character to exclude all other files −

$ rsync --include='*.txt' --exclude='*' /source/* /destination/

In this example, the order of the options is important. The include and exclude options are filter rules, and rsync checks each file against them in the order given; the first rule that matches decides what happens to the file. So we first give the rule that matches the files we want to transfer and then exclude everything else. The patterns are quoted so that the shell passes them to rsync unchanged instead of expanding them against the current directory. When this command is run, all text files in the source directory are selected for transfer and all other files are excluded.
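
The ordering rule can be checked in a throwaway sandbox. The sketch below assumes rsync is installed; the directory and file names are invented, and it passes -r with a trailing-slash source (rather than a shell glob) so that rsync itself walks the directory.

```shell
# Throwaway sandbox; every name below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/right" "$demo/wrong"
touch "$demo/source/notes.txt" "$demo/source/app.log" "$demo/source/photo.jpg"

# Include first, then exclude: notes.txt matches the include rule and is copied.
rsync -r --include='*.txt' --exclude='*' "$demo/source/" "$demo/right/"

# Exclude first: '*' matches every name before the include rule is consulted,
# so nothing is copied.
rsync -r --exclude='*' --include='*.txt' "$demo/source/" "$demo/wrong/"

# right/ should contain only notes.txt; wrong/ should be empty.
ls "$demo/right" "$demo/wrong"
```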

We can also use multiple include options to transfer multiple file types. For example, if we want to transfer text and log files, we can use an include option for each, followed by the exclude option −

$ rsync --include='*.txt' --include='*.log' --exclude='*' /source/* /destination/
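
Before running a filtered transfer for real, the selection can be previewed with rsync's -n (--dry-run) option. A minimal sketch, assuming rsync is installed; the sandbox names are invented, and a trailing-slash source with -r is used so rsync walks the directory itself.

```shell
# Throwaway sandbox; every name below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
touch "$demo/source/notes.txt" "$demo/source/app.log" "$demo/source/photo.jpg"

# -n (--dry-run) reports what would be transferred without copying anything;
# -v prints the names of the selected files.
preview=$(rsync -rvn --include='*.txt' --include='*.log' --exclude='*' \
    "$demo/source/" "$demo/destination/")

# The listing should name notes.txt and app.log but not photo.jpg.
printf '%s\n' "$preview"
```

The destination stays empty after this run, because a dry run only reports what would happen.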

Recursing through Subdirectories

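In some cases, we may want to transfer files that match specific criteria in all subdirectories of the source directory. To do this, we can use the -r option to recurse into subdirectories, or the -a (archive) option, which implies -r and also preserves permissions, timestamps and other attributes. We also need the rule --include='*/' so that the directories themselves are not caught by the final exclude rule. For example, to transfer all text files in all subdirectories of the source directory, we can use the following command −

$ rsync -a --include='*/' --include='*.txt' --exclude='*' /source/* /destination/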

However, this command also recreates every subdirectory in the destination, including those that contain no matching files, so we add the --prune-empty-dirs (or -m) option to leave out directories that would otherwise be empty −

$ rsync -a --prune-empty-dirs --include='*/' --include='*.txt' --exclude='*' /source/* /destination/
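
The effect of pruning is easiest to see side by side. The following sketch assumes rsync is installed and uses an invented directory layout; it runs the same filter with and without -m and uses a trailing-slash source so rsync walks the tree itself.

```shell
# Throwaway sandbox; every name below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source/docs" "$demo/source/images" "$demo/plain" "$demo/pruned"
touch "$demo/source/docs/readme.txt" "$demo/source/images/logo.png"

# Without -m, images/ is recreated even though nothing inside it matches.
rsync -a --include='*/' --include='*.txt' --exclude='*' \
    "$demo/source/" "$demo/plain/"

# With -m (--prune-empty-dirs), directories that would end up empty are dropped.
rsync -a -m --include='*/' --include='*.txt' --exclude='*' \
    "$demo/source/" "$demo/pruned/"

# plain/ holds docs/readme.txt plus an empty images/; pruned/ holds only docs/readme.txt.
find "$demo/plain" "$demo/pruned" | sort
```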

Advanced Filtering

Besides name patterns, rsync can select files by size using the --min-size and --max-size options. It has no option that selects files by age, but it can read the list of files to transfer from another tool such as find.

For example, the following command synchronizes all files of 100 MiB or larger in the “/source/” directory to the “/destination/” directory −

$ rsync -av --min-size=100M /source/ /destination/
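
Size-based selection in rsync is done with the --min-size and --max-size options. To try the idea without creating very large files, the sketch below uses a much smaller threshold of 1 KiB; it assumes rsync is installed, and the file names and sizes are invented.

```shell
# Throwaway sandbox; every name and size below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
head -c 4096 /dev/zero > "$demo/source/big.bin"     # 4096 bytes
head -c 100  /dev/zero > "$demo/source/small.bin"   # 100 bytes

# --min-size=1K skips any file smaller than 1024 bytes,
# so only big.bin should be copied.
rsync -a --min-size=1K "$demo/source/" "$demo/destination/"

ls -l "$demo/destination"
```

The --max-size option works the same way in the other direction, skipping files larger than the given size.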

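As another example, the following command synchronizes all files modified within the last 30 days in the “/source/” directory to the “/destination/” directory. Because rsync has no option for this, find builds the list of files and rsync reads it through the --files-from option −

$ cd /source && find . -type f -mtime -30 -print0 | rsync -av --from0 --files-from=- ./ /destination/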
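
Selecting files by age is usually done by letting find produce the file list and handing it to rsync through --files-from. The sketch below is one way to try this safely; it assumes rsync, GNU find and GNU touch (for the '60 days ago' date syntax) are available, and all file names are invented.

```shell
# Throwaway sandbox; every name below is made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
touch "$demo/source/recent.txt"
touch -d '60 days ago' "$demo/source/old.txt"   # GNU touch date syntax

# find lists files modified less than 30 days ago, NUL-separated;
# rsync reads that list from standard input (--files-from=-, --from0)
# and copies exactly those files.
( cd "$demo/source" &&
  find . -type f -mtime -30 -print0 |
  rsync -a --from0 --files-from=- ./ "$demo/destination/" )

# Only recent.txt should appear in the destination.
ls "$demo/destination"
```

The NUL separators keep the pipeline safe for file names that contain spaces or newlines.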

Versions and Variants

It is important to note that the behavior and syntax of the rsync command may vary between versions. This tutorial is based on version 3.2.4. The latest version of rsync is available on the official website, https://rsync.samba.org. It is always a good idea to consult the documentation for the specific version you are using to ensure that the commands and options used in this tutorial work as intended.
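
To see which version is installed on a given system, rsync can report it directly. A small sketch, assuming rsync is on the PATH; the variable name is only illustrative.

```shell
# The first line of the --version output names the installed release.
installed=$(rsync --version | head -n 1)
printf '%s\n' "$installed"
```

The matching documentation for that release is available locally through man rsync.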

Conclusion

Rsync is a versatile tool for synchronizing files and directories on Linux systems. Its ability to select files by name pattern and size makes it useful for a wide variety of tasks, including backups, file transfers and data replication. The include option, in particular, lets users copy only certain types of files. With proper use of the include and exclude options, combined with recursion into subdirectories and size-based or list-based selection, rsync can be an essential tool in any Linux system administrator's toolkit.

Updated on: 20-Jan-2023
