Rsync File Matching
Rsync is a highly effective file synchronization tool. It is versatile in how it synchronizes a source file tree to a destination file tree: updating, adding and deleting files. It uses file metadata and block diffing of contents to provide efficient file transfers. Selecting the list of files to transfer can be difficult, for example when doing backups of file trees where you want to exclude some paths.
When transferring files, what you usually want to do is select a local directory and specify a remote directory to copy its contents into. The example below copies the contents of /home/tim/ on the local filesystem into /backup/tim_home on the server.
rsync -a /home/tim/ server_hostname:/backup/tim_home
- tim_home will be created on the server if it does not already exist.
- Note that the trailing slash in the source path /home/tim/ is important; without it, tim_home/tim will be created on the server.
- rsync -a server_hostname:/backup/tim_home/ /home/tim is the reverse, remote to local.
Filter rule matching is a big topic in rsync. Reading the man page can be useful, but it is very long and can be hard to read. This is a discussion of the most common uses. Rules are made up of patterns that match file paths (file or directory names) in a tree.
The -a flag to rsync is archive mode, which should almost always be set when syncing trees. It implies recursion, so the whole tree of files and subdirectories under the source path will be included.
To exclude a path, use an --exclude= argument with a pattern; this overrides the -a inclusion. An exclude can in turn be overridden with --include=. For that to work, the --include= must precede the --exclude= on the command line, and all parent directories of the path named in the rule must themselves be included. This is shown in the example below.
** pattern operators
The wildcards * and ** can be used in patterns to match parts of a path.
- * matches any run of characters but stops at slashes.
- ** matches any run of characters, including slashes.
If a pattern starts with a /, it is matched against the transfer root; otherwise it is matched against the end of the path. The example below is anchored to the transfer root.
For example, let’s say you have a directory /home/tim/project with a subdirectory editor that you want to select, but you want to exclude all other files and directories in /home/tim/project from the home directory backup.
rsync -a --include=/project/editor** --exclude=/project/** /home/tim/ server:/backup/tim_home
The important parts:
- The include comes first.
- The exclude does not match the /project directory itself. If the exclude pattern did match it, /project would be excluded even with the include rule, because the include rule only matches the subdirectory. All parents of the directories an include rule matches must be included for it to take effect.
- The root of the patterns is taken from the perspective of the transfer. Since the source /home/tim/ has a trailing slash, the root is the contents of this directory.
- Note that in this example another directory with “editor” as the prefix would be matched too; this could be fixed by making the include pattern more specific.
Trying it out
The easiest way to try this out is to use the -v and -n options to rsync. These add verbosity (v), which lists the files that matched, and do only a dry run (n) with no actual transfer, respectively. Create a small directory tree in a temporary directory and try to sync it while including and excluding different sets of files to see how this works.