Backup with rsync and git

Written by Federico E. Galli

Do you need to back up files between two servers, and do you want a capable backup that keeps a history of changes without resorting to complex software? A combination of rsync and git is the solution!

The easy way: Fast and simple.

Call the server to back up "remote" and the server that will receive the backup "local". Make sure rsync is installed on both and git on the local one.
Save the following script (edit the directories) and launch it!

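#!/bin/bash

# Work from the directory that contains this script.
SCRIPT=$(readlink -f "$0")
BASEDIR=$(dirname "$SCRIPT")
cd "$BASEDIR" || exit 1

# Keep the destination inside this script's directory: git only tracks files below it.
rsync -az --delete -e ssh rsync@remote.com:/dir/to/backup/ /local/copy
# Add as many directories as you wish.
# rsync -az --delete -e ssh rsync@remote.com:/other/dir/ /other/local/dir

# Create the git repository on the first run.
if [ ! -d ".git" ]; then
  git init .
fi

# Record additions, changes and deletions, using the timestamp as the commit message.
git add -A
git commit -m "$(date --rfc-3339=seconds)"
git gc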

Git is initialised automatically and keeps the history of the updates. You can call rsync as many times as needed to sync all the directories to back up. If you want, you can add the script to your cron jobs to automate the process.
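
One detail of running the script repeatedly: when nothing changed on the remote side, git commit has nothing to record and exits with a non-zero status. The following is a minimal sketch of one way to handle that, assuming git 1.8.5 or later (for the -C option); commit_if_changed is a hypothetical helper name, not something git provides.

```shell
#!/bin/bash
# Sketch: commit the backup directory only when the last sync changed something.
# commit_if_changed is a hypothetical helper name chosen for this example.

commit_if_changed() {
  local repo="$1"
  # Stage additions, modifications and deletions.
  git -C "$repo" add -A
  # "git diff --cached --quiet" exits 0 when nothing is staged for commit.
  if git -C "$repo" diff --cached --quiet; then
    echo "nothing changed"
  else
    git -C "$repo" commit -q -m "$(date --rfc-3339=seconds)" && echo "committed"
  fi
}

# Possible use at the end of the backup script:
#   commit_if_changed "$BASEDIR"
```

With this variant the history contains one commit per run that actually brought changes, instead of a failed commit attempt on quiet days.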

Not familiar with git? Read the third section!

The Cool way: Client-Server over VPN

Rsync is packaged for every Linux distribution, and a win32 port can be downloaded from the rsync website.

The main advantage of rsync over rcp is that rsync sends and receives only the parts of files that changed since the last replication and, with --delete, removes files from the destination host when they were deleted on the source host, keeping both hosts in sync. Rsync can also run as its own daemon, listening on TCP port 873: this is very useful in combination with a VPN.
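
The mirroring behaviour can be tried safely between two local directories before pointing rsync at a real server. This is only a sketch: mirror_dir is a hypothetical helper name, and the example shows the effect of --delete and of the trailing slash, not the network savings (for local copies rsync transfers whole files by default).

```shell
#!/bin/bash
# Sketch: mirror one local directory into another to observe rsync's behaviour.
# mirror_dir is a hypothetical helper name chosen for this example.

mirror_dir() {
  local src="$1" dst="$2"
  # The trailing slash on the source means "the contents of src", not src itself.
  # -a        preserve permissions, timestamps and symlinks, recurse into directories
  # --delete  remove files from dst that no longer exist in src
  # -i        itemize every change rsync makes
  rsync -a --delete -i "$src/" "$dst/"
}

# Possible session (the paths are arbitrary):
#   mkdir -p /tmp/src /tmp/dst
#   echo hello > /tmp/src/a.txt
#   mirror_dir /tmp/src /tmp/dst    # a.txt is copied
#   rm /tmp/src/a.txt
#   mirror_dir /tmp/src /tmp/dst    # a.txt is deleted from /tmp/dst as well
```

Because --delete removes files on the destination, it is worth adding -n (dry run) the first time a new pair of directories is used.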

Our two servers are connected through a VPN (OpenVPN, for example), and to get the best throughput we don't want scp/ssh traffic encapsulated inside the tunnel.
We also need to simplify the structure of our directories in order to keep our backup script clean.

On the remote server (the one to back up), create an rsyncd.conf file under /etc:

#/etc/rsyncd.conf
motd file = /etc/rsyncd.motd
read only = yes
list = yes
uid = nobody
gid = nobody

[intranet-users]
comment = homes directories
path = /home/intranet

[conf]
comment = etc directory
path = /etc

Remember to choose a UID and GID that are allowed to read the data to transfer.
Rsync is written by the Samba developers, so the structure of the configuration file is almost the same. In our example we define two shared directories to back up (as usual, replace them with the actual ones!).
Now launch the rsync daemon with

rsync --daemon
# or, depending on the distribution
/etc/init.d/rsync start

(on Debian-based distributions it may be necessary to edit /etc/default/rsync to enable it)

To check that everything is working:

rsync -avz rsync@remote.host.com::intranet-users /home/rsync/intranet

We should now find the files of the [intranet-users] share, defined in rsyncd.conf, in /home/rsync/intranet.

On the local machine, the one that will receive the backup, create the backup directory (e.g. /home/backup) and save the rsync.sh script in it. This is the script that runs rsync and git.

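#!/bin/bash

# Work from the directory that contains this script.
SCRIPT=$(readlink -f "$0")
BASEDIR=$(dirname "$SCRIPT")
cd "$BASEDIR" || exit 1

# The names after "::" are the share names defined in rsyncd.conf.
rsync -az --delete rsync@remote.host.com::conf etc/
rsync -az --delete rsync@remote.host.com::intranet-users intranet/

# Create the git repository on the first run.
if [ ! -d ".git" ]; then
  git init .
fi

# Record additions, changes and deletions, using the timestamp as the commit message.
git add -A
git commit -m "$(date --rfc-3339=seconds)"
git gc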

The rsync lines copy the files from the shares, while git init initialises the destination directory if that has not been done yet. Finally, git add -A stages new, modified and deleted files, and git commit records them with the timestamp as the message.


Finally, link the rsync.sh file into /etc/cron.daily (ln -s /home/backup/rsync.sh /etc/cron.daily/rsync) to schedule our backups with cron!

Basic use of Git:

Now that the backup system is active, let's see how to use git to find out when a file was modified or to read it in a previous state.

If you aren't git savvy the easiest way is to use a graphical program to browse git history: some alternatives are gitg for GNU/Linux systems, GitX for Mac OS and Git Extensions for MS Windows.

If you like the command line or already know git, here are some useful commands.

git log --oneline path/to/file  # show when a file was modified
git show <rev>:./path/to/file   # show a file as it was at a certain revision (the hash shown by git log)
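
The two commands above can be wrapped into small helpers that list the revisions of a file and copy an old version somewhere else, leaving the backup directory untouched. This is a sketch under assumptions: file_history and restore_file_at are hypothetical names, the paths in the usage comments are only examples, and git 1.8.5 or later is assumed for the -C option.

```shell
#!/bin/bash
# Sketch: inspect and extract old versions of a single file from the backup.
# file_history and restore_file_at are hypothetical helper names.

file_history() {
  local repo="$1" path="$2"
  # One line per revision that touched the file: short hash, date, commit message.
  git -C "$repo" log --format='%h %ad %s' --date=iso -- "$path"
}

restore_file_at() {
  local repo="$1" rev="$2" path="$3" out="$4"
  # "<rev>:<path>" names the file as stored in that revision;
  # without a leading "./" the path is relative to the top of the repository.
  git -C "$repo" show "$rev:$path" > "$out"
}

# Possible use (<rev> is a hash printed by file_history):
#   file_history /home/backup etc/hosts
#   restore_file_at /home/backup <rev> etc/hosts /tmp/hosts.old
```

Writing the old version to a separate file avoids changing anything inside the backup, so the next run of the backup script is not affected.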

If we want to take the whole directory back in time (if you are not too confident with git, make a backup copy first, and in any case remember to save somewhere the hash of the latest revision, obtained with "git log HEAD | head -n 1"):

git reset --hard <rev>  # rev is a revision found via git log

To bring everything back to the present, run the same command again using the revision saved previously.
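
The round trip described above can be sketched as a single helper that prints the current hash before moving, so the way back is never lost. Here git rev-parse HEAD is used instead of "git log HEAD | head -n 1" because it prints only the hash; rollback_to is a hypothetical helper name, and git 1.8.5 or later is assumed for the -C option.

```shell
#!/bin/bash
# Sketch: move the whole backup directory to another revision and remember
# where it was before. rollback_to is a hypothetical helper name.

rollback_to() {
  local repo="$1" rev="$2"
  # Print the hash of the current revision so the caller can keep it.
  git -C "$repo" rev-parse HEAD
  # Move the working directory (and the branch) to the requested revision.
  git -C "$repo" reset -q --hard "$rev"
}

# Possible use (<rev> is a revision found via git log):
#   latest=$(rollback_to /home/backup <rev>)        # go back in time
#   rollback_to /home/backup "$latest" >/dev/null   # return to the present
```

While the directory is rolled back, avoid running the backup script: a commit made on top of the old revision would leave the newer ones reachable only through the saved hash.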
