Peter's Blog

Redefining the Impossible

Backup Strategy


I decided to put a new backup strategy in place at work. I have a desktop PC running Windows and an Ubuntu server, and I wanted to back up my day-to-day work from Windows to the server. I wanted incremental backups so that I can backtrack through file history if necessary.

rsync is a nice utility that copies a set of files from one PC to another and works under Windows (via Cygwin) and Linux. It can copy over ssh, so I can use my ssh keys to avoid having to log into the server or put my password in scripts. However, it does not do incremental backups; it just duplicates.

rdiff-backup is a nice backup tool that can do cross-network incremental backups. It uses the rsync algorithm (via librsync), so it is very efficient. It is also easy to use: no weird command-line switches, just give it the names of the source and target directories. However, Windows support is not straightforward, because it relies on a Cygwin build of Python rather than the standard distribution.

So, as a compromise, I use both. I have set things up so that this runs every night when I go home:

cd c:\Projects
rsync -avz --exclude-from="rsync.cnf" -e ssh ./ pcw@rd-pcw2:Projects/ > backup.log
blat backup.log -to pcw@itl.co.uk

This copies files from my 'Projects' directory to the server. The "rsync.cnf" file lists the things to exclude from the copy, e.g.:

#
# Doxygen output files
#
- Doxygen/

#
# Anything downloaded
#
- Download/
- lstfiles/
- ofiles/
- *.bak
- *.Bak

#
# Anything generated by py2exe
#
- build/
- dist/

#
# Anything in a folder called Old
#
- Old/

#
# VC build directories
#
- Debug/
- Release/
- debug/
- release/

#
# Miscellaneous.
#
- *.obj
- *.tmp
- *.pyc
- setup/*.exe
- Output/setup.exe

After running this I use blat to email me what happened so I know it succeeded.
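The emailed log only shows what rsync printed, so it can help to record rsync's exit status in the log as well. A minimal sketch, assuming the job runs under Cygwin bash rather than a cmd batch file; `run_backup` is a hypothetical helper, and the rsync arguments in the usage comment are the ones from the script above:

```shell
#!/usr/bin/env bash
# Sketch: run a backup command, capture its output, and append a status line.
# run_backup is a hypothetical helper, not part of rsync or blat.

run_backup() {
    local log=$1; shift
    local status=0
    # Run the given command, sending stdout and stderr to the log file.
    "$@" > "$log" 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "BACKUP OK" >> "$log"
    else
        echo "BACKUP FAILED (exit status $status)" >> "$log"
    fi
    return "$status"
}

# Usage (not executed here):
# run_backup backup.log rsync -avz --exclude-from="rsync.cnf" -e ssh ./ pcw@rd-pcw2:Projects/
```

The last line of the mailed log then states whether the copy succeeded, rather than leaving that to be inferred from rsync's output.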

On the server I have crontab set up to run rdiff-backup every night after the files have been uploaded:

0 18 * * * rdiff-backup /home/pcw/Projects /home/pcw/Backup

This system gives me two full copies of my project files and incremental backups to boot.

Todo: rdiff-backup to a different disk, giving three copies.


9 Comments

Peter Says:

over 5 years ago

blat wants CRLFs in the input file and Cygwin rsync is giving it LFs, so I have changed the blatting as follows:

sed "s/$/\r/" backup.log | blat - -to pcw@itl.co.uk
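A quick way to check the line-ending conversion before piping it to blat is to look at the bytes. A minimal sketch, assuming GNU sed (which understands `\r` in the replacement text); `to_crlf` is a hypothetical helper name and the sample text is made up:

```shell
#!/usr/bin/env bash
# Sketch: convert LF line endings to CRLF and inspect the result.
# to_crlf is a hypothetical helper; it assumes GNU sed and LF-only input.

to_crlf() {
    # Append a carriage return at the end of every line.
    sed 's/$/\r/'
}

# Dump the converted bytes; each line should now end in \r \n.
printf 'sent 120 bytes\ntotal size is 4096\n' | to_crlf | od -c
```

Input that already has CRLF endings would get a second carriage return, so this is only meant for LF-only logs.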

Peter

Peter Says:

over 5 years ago

I needed to recover an old version of a file. The command:

rdiff-backup -r 2D Backup/694/Src/BootStrap/Main.c main.c

got back the version of main.c from 2 days ago (2D). Cool.

Peter

Peter Says:

over 4 years ago

For the record, I am of the opinion that source control systems are not primarily for backup; they are for source control. I only check changes into source control when they are tested and shippable. I don't want to pull stuff out of source control (especially someone else's stuff) and have to spend time bug fixing and getting it to build. The system I use here is for my daily backups when I go home, and it holds just my work in progress.

Peter

Peter Says:

over 4 years ago

Don't know about you, but when my backup tool does things like this:

  File "/usr/lib/python2.3/site-packages/rdiff_backup/restore.py", line 277, in get_diff
    mir_rorp.setfile(cls.rf_cache.get_fp(expanded_index))
  File "/usr/lib/python2.3/site-packages/rdiff_backup/restore.py", line 363, in get_fp
    rf = self.get_rf(index)
  File "/usr/lib/python2.3/site-packages/rdiff_backup/restore.py", line 348, in get_rf
    if not self.add_rfs(index): return None
  File "/usr/lib/python2.3/site-packages/rdiff_backup/restore.py", line 382, in add_rfs
    if Globals.process_uid != 0: self.perm_changer(temp_rf.mirror_rp)
  File "/usr/lib/python2.3/site-packages/rdiff_backup/restore.py", line 698, in __call__
    assert index > old_index, (index, old_index)
AssertionError: (('694', 'Src'), ('694', 'Src', 'BootStrap', 'dataman.P48'))

when asked to restore old files, I lose confidence in it and look for other solutions.

I used to use a system of rolling backups with zip files; I'll go back to that. rsync seems to be reliable, so I will stick with that. rdiff-backup goes.
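A rolling archive scheme of this kind could be sketched as follows. This is an illustration, not the original script: it uses tar.gz in place of zip, and `rolling_backup`, its arguments, and the archive naming are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: keep only the newest KEEP archives of a directory.
# rolling_backup SRC DEST KEEP [STAMP]   (hypothetical helper)
# tar stands in for zip; STAMP defaults to the current date and time.

rolling_backup() {
    local src=$1 dest=$2 keep=$3
    local stamp=${4:-$(date +%Y%m%d-%H%M%S)}
    mkdir -p "$dest"

    # Archive the contents of SRC with paths relative to SRC.
    tar -czf "$dest/backup-$stamp.tar.gz" -C "$src" .

    # Timestamped names sort chronologically, so the glob lists oldest first.
    local archives=("$dest"/backup-*.tar.gz)
    local excess=$(( ${#archives[@]} - keep ))
    local i
    for (( i = 0; i < excess; i++ )); do
        rm -f -- "${archives[i]}"
    done
}

# Usage (not executed here): keep the last 7 nightly archives.
# rolling_backup /home/pcw/Projects /home/pcw/Rolling 7
```

Each archive is a full copy, so this trades disk space for the simplicity of restoring with nothing more than tar.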

Peter

gihan Says:

over 4 years ago

rsync can do incremental backups. Check out the website: rsync.samba.org/examples.html

Peter Says:

over 4 years ago

I use rsync to mirror the PC working directory on a Linux server, then I run incremental backups on the server. I use this. I think this may be what you are alluding to; it uses hard links to avoid duplicate files. I don't think it works on Windows because Windows is lame and doesn't support hard links.
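The hard-link idea can be sketched on the Linux side as below. This is not necessarily what the linked tool does; it assumes GNU cp, and `take_snapshot` and the directory names are hypothetical. The rename step imitates rsync's default behaviour of writing a new file and renaming it over the old one.

```shell
#!/usr/bin/env bash
# Sketch: hard-link snapshot of a directory, assuming GNU cp.
# take_snapshot and the paths below are hypothetical.

take_snapshot() {
    local current=$1 snap=$2
    # -a preserves attributes, -l creates hard links instead of copying data.
    cp -al "$current" "$snap"
}

# Demo in a scratch directory.
work=$(mktemp -d)
mkdir "$work/current"
echo "version 1" > "$work/current/main.c"

take_snapshot "$work/current" "$work/snap.1"

# Both names point at the same inode, so the snapshot stores no extra data.
[ "$work/current/main.c" -ef "$work/snap.1/main.c" ] && echo "shared"

# Replace the file by rename: the snapshot keeps the old inode.
echo "version 2" > "$work/current/main.c.tmp"
mv "$work/current/main.c.tmp" "$work/current/main.c"

cat "$work/snap.1/main.c"    # prints "version 1"
cat "$work/current/main.c"   # prints "version 2"
```

Editing a file in place, rather than replacing it, would change the snapshot too, which is why the replace-by-rename behaviour matters here.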

Peter

Matt Says:

over 4 years ago

From what I can tell, NTFS does support hard links. I gave hard-linking a trial run on Windows XP with cp from UnxUtils (you can get it at sourceforge) by doing: #cp -al f:\source f:\destination and sure enough it worked, as best I can tell. If I remember correctly, there was about 80GB on the drive, and creating the hard links made the destination folder look like it was 80GB, but the drive usage only increased by a few MB. Currently, I'm working on a cwRsync/cp routine. BTW, cwRsync may support hard-linking on XP also: itefix.no/phpws/index.php?module=phpwsbb&PHPWSBB_MAN_OP=view&PHPWS_MAN_ITEMS=19

Peter Says:

over 4 years ago

I came across this just the other day: here is a utility to link directories. Not sure it works for linking files though.

Peter

Jovan Washington Says:

about 1 year ago

Peter,

Could you get in touch with me as soon as you get this message? I am interested in your reactions to a certain program.

Thank you, Jovan Washington Jovan@spideroak.com
