Peter's Blog

Redefining the Impossible

Backing up from Windows to a Linux box


Another backup setup is under construction. This time I'm backing up a project developed on a Windows box to a Linux server using rsync, with Ruby as the scripting language. If the script is invoked thus:

ruby Backup.rb

then it will merely update a simple backup of the source directory on the server. rsync is very fast: it only spends time on files that have actually changed, and for those files it only uploads the changed parts. I am using this on a 30M source archive and it takes less than a minute to update the backup. The backup files are just copies of the original files, so they are easy to browse, diff and restore.

If the script is invoked thus:

ruby Backup.rb --backup

then it will use rsync's magic --link-dest option to create a historical archive of backups. What the script will do is this:

  • Move the current main backup to an archive directory and give that archive directory a name that is date/time stamped.
  • Create a new backup. Where files have not changed from the previous backup, the new backup will contain a hard link to the existing copy of the file in the archive, rather than consume more file space. Only files that are new or have been changed take up additional space.
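The space saving comes from hard links: two directory entries that point at the same data on disk. Here is a minimal Ruby sketch of the idea, run locally in a temporary directory. The directory and file names are made up for illustration, and it assumes a filesystem that supports hard links.

```ruby
require "tmpdir"
require "fileutils"

# Returns [same_inode, link_count] for a file hard-linked from an
# "archive" directory into a "current" backup directory.
def hard_link_demo
  Dir.mktmpdir do |dir|
    archive = File.join(dir, "archive")   # hypothetical previous backup
    current = File.join(dir, "current")   # hypothetical new backup
    FileUtils.mkdir_p([archive, current])

    old_copy = File.join(archive, "main.c")
    new_copy = File.join(current, "main.c")
    File.write(old_copy, "int main() { return 0; }\n")

    # Roughly what happens for an unchanged file when --link-dest is
    # given: a second name for the same data, not a second copy.
    File.link(old_copy, new_copy)

    same_inode = File.stat(old_copy).ino == File.stat(new_copy).ino
    [same_inode, File.stat(new_copy).nlink]
  end
end

same_inode, links = hard_link_demo
puts "same inode: #{same_inode}, link count: #{links}"
```

Deleting an old archive later only removes one of the names; the data itself stays until the last link to it is gone.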

There is nothing in the script to limit the number of backups, but they can be pruned manually every few months as required.
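If pruning ever gets automated, one possible approach is to pick the oldest archives by name. This is only a sketch and not part of the script: the function name and the sample directory names are invented, and it assumes the archives use the '%Y-%m-%d-%H-%M' stamp, which sorts chronologically as plain text because every field is zero-padded.

```ruby
# Given the names found in the archive directory and how many archives
# to keep, return the names that could be deleted, oldest first.
def backups_to_prune(names, keep)
  # Ignore anything that is not a date/time-stamped archive name.
  stamped = names.grep(/\A\d{4}-\d{2}-\d{2}-\d{2}-\d{2}\z/)
  sorted = stamped.sort
  excess = sorted.length - keep
  excess > 0 ? sorted.first(excess) : []
end

# Made-up directory listing for illustration.
names = ["2009-03-01-09-30", "2009-01-15-18-00", "notes.txt", "2009-02-10-12-45"]
p backups_to_prune(names, 2)
```

The function only decides which names are candidates; actually removing them on the server is left out on purpose.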

This is the script:

#
# This script is used to back up the source code to an offsite server.
#
# If the parameter '--backup' is provided then the previous backup is
# first moved to a date/time-stamped archive directory.
#
strMeAtMine = "me@myserver.org"
strLocalFolder = "/cygdrive/c/projects/757/"
strTargetFolder = "/home/pcw/757"
strBackupFolder = "/home/pcw/Backup/757"

#
# If asked to back up then move the previous backup to a backup directory
# given a name related to date/time.
#
bBackup = ARGV.include?( '--backup')

if bBackup
  strTimeStamp = Time.now.strftime( '%Y-%m-%d-%H-%M')

  strBackupDir = "#{strBackupFolder}/#{strTimeStamp}"

  strCommand = "ssh #{strMeAtMine} mv #{strTargetFolder} #{strBackupDir}"

  # Warn if the move failed (for example on the very first run, when there
  # is no previous backup to archive).
  warn( "Could not archive previous backup") unless system( strCommand)
end

#
# Determine location of rsync excluded file list
#
strDir = File.dirname( __FILE__)
strExcludeFile = File.join( strDir, "rsync-exclude.txt")

#
# Build command to invoke in an array of strings since this allows me to comment what
# each parameter does.
#
strCommand = [ "rsync",
#              "-n",                                 # -n = dry run
               "-v",                                 # verbose
               "-a",                                 # archiving options
               "--delete",                           # delete files no longer used from target
               "--chmod=u=rwX",                      # set target file permissions
               "--exclude-from=#{strExcludeFile}",   # exclude rubbish
               "--delete-excluded",                  # delete excluded files from target
               "-e ssh",                             # use ssh tunnel
               strLocalFolder,                       # source
               "#{strMeAtMine}:\"#{strTargetFolder}\"" # target
             ]

if bBackup
  #
  # Backing up so instead of uploading everything again, link to files in the
  # backup directory where there are no changes. insert( -3, ...) places the
  # option just before the source and target arguments.
  #
  strCommand.insert( -3, "--link-dest=#{strBackupDir}")
end

system( strCommand.join(" "))
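One caveat with joining the array on spaces: a path containing a space would be split into two arguments by the shell. A possible hardening, sketched here with Ruby's standard Shellwords module rather than anything the script itself does, is to escape each argument before joining. The paths below are hypothetical.

```ruby
require "shellwords"

# Hypothetical arguments; the exclude-file path contains a space.
args = [
  "rsync",
  "-a",
  "--exclude-from=/cygdrive/c/my projects/rsync-exclude.txt",
  "/cygdrive/c/projects/757/",
  "me@myserver.org:/home/pcw/757"
]

naive = args.join(" ")          # plain join on spaces
safe  = Shellwords.join(args)   # each argument escaped for a POSIX shell

# Splitting the strings the way a POSIX shell would shows the difference.
puts Shellwords.split(naive).length   # one more than args.length
puts Shellwords.split(safe).length    # same as args.length
```

If the script were changed this way, the "-e ssh" entry would presumably need to become two separate entries, "-e" and "ssh", since it currently relies on the shell splitting it.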

Since this is a backup of files from a Windows box to Linux, I used "--chmod=u=rwX" to specify simple file permissions on the Linux end. Without this, rsync tended to create files with no access permissions for anyone.
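For anyone unfamiliar with the notation, u=rwX is a chmod-style symbolic mode: read and write for the owner, plus execute only where it makes sense. The sketch below models that rule on plain permission numbers as I understand it. The function name is made up, and it is an illustration of the rule rather than rsync's actual code.

```ruby
# Apply the symbolic mode u=rwX to a set of permission bits.
# mode      - incoming permission bits, e.g. 0o000
# directory - true if the entry is a directory
def apply_u_rwx(mode, directory)
  user = 0o600                                        # u=rw
  # Capital X: execute only for directories, or for files that
  # already have some execute bit set.
  user |= 0o100 if directory || (mode & 0o111) != 0
  (mode & 0o077) | user                               # group/other untouched
end

printf("%03o\n", apply_u_rwx(0o000, false))   # plain file with no permissions
printf("%03o\n", apply_u_rwx(0o000, true))    # directory with no permissions
```

So a file that would otherwise arrive with no permissions ends up readable and writable by its owner, and a directory also becomes searchable.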

The 'rsync-exclude.txt' file lives in the same directory as the script, and the script locates it relative to its own path (File.dirname( __FILE__)) rather than the current working directory. This is a list of stuff that doesn't need backing up:

/Downloads
*.map
Backup/*
Backups/*
*~
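The script finds this file by taking the directory part of `__FILE__`, which holds the path of the running script. The same lookup is sketched below as a small function so it can be tried with any path. The function name and the example path are made up.

```ruby
# Build the path of the rsync-exclude.txt that sits beside the given script.
def exclude_file_for(script_path)
  File.join(File.dirname(script_path), "rsync-exclude.txt")
end

# In the backup script this would be called as exclude_file_for(__FILE__).
puts exclude_file_for("/cygdrive/c/projects/757/Backup.rb")
```

This means the script can be started from any directory and still pick up the right exclude list.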

This uses the Cygwin version of rsync, which uses ssh to talk to the remote server. I have ssh set up with key files so I don't need to enter passwords.


Filed under: backup rsync ruby

Mark Says:

about 1 year ago

Have you considered using Unison for backup?

Peter Says:

about 1 year ago

In a word... yes. I even had a try with it recently but didn't like its default behaviour when comparing two new directories, one full and one empty. The default was to ask if I wanted to delete everything in the full one.

I prefer rsync because:

  • It is one-way and therefore predictable, and your source is safe.
  • The hard-link archive (--link-dest); I'm not sure Unison would do that.

Peter

Josef B. Says:

about 1 year ago

I also use rsync from Cygwin to back up my Windows box to a NAS. Excellent and fast. I was considering also trying to do incremental backups via rsync, but it seems like too much work to really get it right. So after some googling I found rdiff-backup. Pretty nice. The thing I didn't like was that it uses librsync and Python, which meant more Cygwin installs.

Another app I'm looking at is BackupPC.

-- Josef

Peter Says:

about 1 year ago

I tried rsync-backup a few years ago and kept getting internal errors and corrupt archives. As I say, that was years ago and no doubt the problems have been fixed, but rsync works for me.

Peter
