Rsync File Matching
Rsync is a highly effective file synchronization tool. It is versatile in how it synchronizes a source file tree to a destination file tree: updating, adding, and deleting files. It uses file metadata and block-level diffing of contents to keep transfers efficient. Selecting the list of files to transfer can be difficult, for example when backing up a file tree where you want to exclude some paths.
Basics
When transferring files, you usually select a local directory and specify a remote directory to copy its contents into. The example below copies the contents of /home/tim/ on the local filesystem into /backup/tim_home on the server.
rsync -a /home/tim/ server_hostname:/backup/tim_home
Note
- tim_home will be created in the /backup directory on the server if it does not already exist.
- The trailing slash in the source path /home/tim/ is important: without it, tim_home/tim will be created on the server.
- rsync -a server_hostname:/backup/tim_home/ /home/tim is the reverse direction, remote to local.
Filter Rules
Filter rule matching is a big topic in rsync. The man page is useful, but it is very long and can be hard to read, so this is a discussion of the most common uses. Rules are made up of patterns that match paths (file or directory names) in a tree.
Include/Exclude
The -a flag to rsync is archive mode, which should almost always be set when syncing trees. It means that the whole tree of files and subdirectories under the source path is included.
To exclude a path, use an --exclude= argument with a pattern; this overrides the inclusion implied by -a. An exclude can in turn be overridden with --include=. For that to work, the --include= must precede the --exclude= on the command line, because rsync applies the first rule that matches a path, and every parent directory of the included path must itself be included. This is shown in the example below.
* and ** pattern operators
* and ** can be used in patterns as wildcards. * matches any run of characters but stops at a slash, so it stays within one path component. ** also matches slashes, so it can span several directory levels.
Transfer root
If a pattern starts with a /, it is anchored at the transfer root; otherwise it is matched against the end of the path. The Example section that follows uses patterns anchored at the transfer root.
Example
For example, let's say you have a directory /home/tim/project with a subdirectory editor that you want to keep, but you want to exclude all other files and directories in /home/tim/project from the home directory backup.
rsync -a --include=/project/editor** --exclude=/project/** /home/tim/ server:/backup/tim_home
The important parts:
- The include comes first.
- The exclude does not match the /project directory itself. If the exclude were --exclude=/project** instead, /project would be excluded even with the include rule. The include rule only matches the subdirectory, and all parents of the directories it matches must be included for it to take effect.
- The root of the patterns is from the perspective of the transfer. Since the source /home/tim/ has a trailing slash, the root is the contents of this directory.
Note
In this example, any other entry in /project whose name starts with "editor" would be matched too. This can be fixed by using --include=/project/editor --include=/project/editor/** instead.
Trying it out
The easiest way to try this out is to use the -v and -n options to rsync. -v adds verbosity, listing the files that matched, and -n does a dry run with no actual transfer. Create a small directory tree in a temporary directory and try syncing it while including and excluding different sets of files to see how this works.