Backup of a Remote Drive
From Wiki99
Pull vs Push Remote Backup
Suppose you have the following situation. You have a machine, X, which has an attached eXternal backup drive. And you have a machine, W, which has an attached drive that is to be backed up. There are obviously two possible ways of running the backup:
- You can run the backup script on X, so that it pulls the data from W or
- You can run the backup script on W, so that it pushes the data to X.
rsync and our backup script can, with minor modifications, handle both situations. Which you use is a matter of choice. My opinion is that it is neater to keep the backup scripts for all volumes, along with the backupExcludes file, on the single computer to which the backup drive is attached, hence I prefer the pull model. But you may, for whatever reason, feel that push is more appropriate for your situation, so I'll discuss that briefly.
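As a rough illustration of the two models, the only thing that changes in the rsync invocation is which side carries the host prefix. The sketch below merely prints the two command shapes. The host names X.local and W.local, the destination path, and the backup_command helper are all placeholders of mine, not names used by the real scripts in this article.

```shell
# Print the shape of the rsync command for each model.
# X.local (machine with the backup drive), W.local (machine to be backed
# up) and the destination path are hypothetical placeholders.
backup_command() {
    mode=$1
    case "$mode" in
        pull) echo "rsync -a root@W.local:/ /Volumes/Backup1/Backups/W/0/" ;;
        push) echo "rsync -a / root@X.local:/Volumes/Backup1/Backups/W/0/" ;;
        *)    echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

backup_command pull   # run on X: remote source, local destination
backup_command push   # run on W: local source, remote destination
```

In both cases exactly one of the two path arguments is remote; the real scripts add many more options on top of this skeleton.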
rsync and ssh
In both cases, the easy way to run rsync over a network is to layer it on top of ssh. The ssh authentication that we've already set up means passwords are not necessary. There is an old-fashioned way of doing things that involves setting up a dedicated rsync server on the remote machine, but this should be of no interest to you. Running rsync over ssh is so natural that it is built into rsync; older versions of rsync required special arguments describing how to use ssh, but modern rsync will use it automatically if you indicate a network backup.
Setting up ssh for rsync
Obviously using ssh means that the remote machine contacted by rsync has to be running ssh, ie you must have switched on Remote Login in the Sharing preferences pane. Also, rsync expects that you have set up the public-private key stuff as we described (here) when talking about setting up mail.
You cannot, however, simply expect the ssh setup you created previously
for mail to work. The reason is that the setup you created for mail was
to allow you, as say user mjh on this local machine, to login as
say user mh on your server.
For push-mode backup the backup script will be running as root on the local
machine (so that it can read files with a variety of owners and
permissions), and the remote rsync also needs to run as root (so that it
can write out files with a variety of owners). This means that we now need an
ssh setup that will allow root on the local machine to login as
root on the remote machine.
For the same reasons (but backwards) for
pull mode backup we need an ssh setup that will allow root on the server
to login as root on the remote machine to be backed up.
This is no problem: we go through exactly the same steps as we did with mail; the only difference is where we store the various files. Recall that the home directory of root is /var/root.
So for a push mode setup you need to do something like:
sudo su
cd /var/root
mkdir .ssh
cd .ssh
ssh-keygen -t dsa
Wait a few seconds for the computer to generate the two files id_dsa and id_dsa.pub. Now copy id_dsa.pub over to the root home directory on the server. First we have to create a .ssh directory in that server root directory:
ssh root@myServer.local
pwd
(The pwd should be /var/root.)
mkdir .ssh
exit
Now perform the transfer
scp ~/.ssh/id_dsa.pub root@myServer.local:~/.ssh/authorized_keys
Whichever mode you use, the rule is the same: the key pair (id_dsa and id_dsa.pub) is created on the machine that runs the backup script, and id_dsa.pub is copied to the other machine, where it should be named authorized_keys. So if you are interested in pull rather than push, you simply reverse the roles of the machines involved.
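One thing to be aware of is that the scp command above replaces any authorized_keys file that already exists on the destination. If root on that machine already accepts other keys, you probably want to append instead. Here is a minimal sketch of how that might look; install_public_key is a hypothetical helper of mine, and it assumes you have already copied the .pub file to the machine where it is run.

```shell
# Append a public key to an authorized_keys file unless it is already there.
# $1 = public key file (one line, as written by ssh-keygen)
# $2 = the .ssh directory to install into
install_public_key() {
    pub_key_file=$1
    ssh_dir=$2
    key=$(cat "$pub_key_file") || return 1
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys" || return 1
    # -x matches whole lines, -F treats the key as a fixed string,
    # so only an exact duplicate is skipped.
    if ! grep -qxF -e "$key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

For root on the remote machine you would call it as something like install_public_key /tmp/id_dsa.pub /var/root/.ssh. Running it twice with the same key leaves a single entry.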
A Pull Script
#!/bin/bash
#This script performs a backup.
# The variables you'd need to set to modify it for your needs are clustered
# below.
#Two non-obvious things you should be aware of are:
# * The script must be run as superuser, ie sudo backupScript
# * It's not a good idea to run two instances of this script at the same
# time if they write to the same hard drive. This is because the diskutil
# stage of the backup, where the destination drive is checked for file
# system consistency, requires the drive to be unmounted, which you don't want
# to happen if the other script is busy writing to it.
# In theory two copies of the script that write to different drives should
# run in parallel without a problem, but I've never had a reason to try
# that so I can't promise anything.
#===============================================================================
HOME_DIR=/Users/mjh
RSYNC_LOCAL=$HOME_DIR/bin/rsync
POWERSHIFT=$HOME_DIR/bin/powershift
BACKUP_EXCLUDES=$HOME_DIR/bin/backup_excludes.txt
RSYNC_REMOTE=/Users/mh/bin/rsync
MAIL_ADDR=mjh@bluecloud.com
LOG_FILE=$HOME_DIR/Library/Logs/backup.log
#Using $$ below uses the process ID in the file name and thus makes it unique.
TMP_FILE=/tmp/myBak.$$.txt
BAK_NAME=Server
SRC_SERVER=root@myServer.local
SRC_DIR_REMOTE=/
SRC_DIR=$SRC_SERVER:$SRC_DIR_REMOTE
DST_VOL=/Volumes/Backup1
DST_DIR=$DST_VOL/Backups/server/
LINK_SRC_1=$DST_VOL/Backups/iMac/
#===============================================================================
#In the fine tradition of very limited BASH scoping, these power-related
# functions utilize two globals, SLEEP_TIME_BATTERY and SLEEP_TIME_AC
# to share data.
_GetPowerSettings()
{
#This function extracts, from the remote machine, the sleep times under
# both battery and AC power and stores them in the variables
# SLEEP_TIME_BATTERY and SLEEP_TIME_AC.
#Note that the documentation for pmset implies that there is a third
# sleep setting relevant to UPS's, but I don't have a UPS, and I can't
# find documentation about this. If you have a UPS, you should run
# pmset -g disk on your system, see if there is a UPS stanza, and make
# the appropriate modifications to the code below.
#The pmset -g disk command returns a block of data about power settings in
# the form
# Battery Power:
# acwake 0
# sleep 15
# AC Power:
# womp 1
# acwake 0
# sleep 300
# autorestart 1
# (with various fields omitted).
#The point is that there are multiple sections, one for Battery, one for AC,
# each with a sleep field, and we want to extract the appropriate sleep
# fields.
#The first block of awk below displays everything from the start specifier,
# eg /Battery Power:/, to the end specifier /END/.
# (Note that /END/ is simply a pattern specifying the word END, it is not
# some magic awk variable. I can't figure out how to tell awk the real end
# of the file, but using a pattern that does not match anywhere in the
# file works fine.)
# So in the first case the awk gets all the data returned by pmset, in the
# second case it only returns the second block, starting with "AC Power:"
#The grep then extracts the sleep lines from these blocks;
# in the first case there will be two sleep lines, one from Battery,
# one from AC. The head command passes on only the first sleep line.
#Finally the second awk returns the second field on the sleep line (the
# first field being the word sleep) which is the number we want.
SLEEP_TIME_BATTERY=`ssh $SRC_SERVER pmset -g disk | awk '/Battery Power:/,/END/' | grep "^ sleep" | head -n 1 | awk '{print $2}'`
SLEEP_TIME_AC=` ssh $SRC_SERVER pmset -g disk | awk '/AC Power:/, /END/' | grep "^ sleep" | head -n 1 | awk '{print $2}'`
}
_ChangePowerSettingsToNeverSleep()
{
#This function sets the remote machine to never sleep.
ssh $SRC_SERVER "pmset -b sleep 0"
ssh $SRC_SERVER "pmset -c sleep 0"
}
_RevertPowerSettings()
{
#This function uses the sleep settings we previously acquired to
# set the machine's sleep times back to what they were before we switched
# off sleep.
ssh $SRC_SERVER "pmset -b sleep $SLEEP_TIME_BATTERY"
ssh $SRC_SERVER "pmset -c sleep $SLEEP_TIME_AC"
}
#...............................................................................
SetRemoteMachineToNotSleep()
{
#This function uses our functions above to tell the remote machine not to
# sleep, but it also installs a trap handler so that when our script
# exits (under reasonable conditions) it will reset the sleep settings on
# the remote machine.
# Exiting under reasonable conditions means normal exits (either at the
# end of the script or by detecting an error), and some signals like
# command-C.
# Obviously we can do nothing about say a power outage; there are also
# some signals (like SIGKILL) that we simply can not catch.
_GetPowerSettings
trap "_RevertPowerSettings; exit" INT TERM EXIT
_ChangePowerSettingsToNeverSleep
}
#-------------------------------------------------------------------------------
ReportErrorAndExit()
{
#This function reports an error.
# It takes a compulsory argument, $1, a string that describes the error and
# an optional argument, $2.
# If $2 is anything, the error string is only logged to stderr, otherwise
# it is also logged to the log file.
# The error string is also mailed to $MAIL_ADDR.
#Set bash "word" separator to newline only.
# (If we didn't do this, the string argument passed in would not be
# treated as a single $1 argument.)
# Normally you'd want to restore this after you're done, but we're exiting
# at the end of this function so that's not necessary.
IFS=$'\n'
ERROR_STRING="BACKUP $BAK_NAME: $1"
if [[ $2 ]]; then
echo $ERROR_STRING >&2
else
echo $ERROR_STRING >&2
echo $ERROR_STRING >> $LOG_FILE
fi
mail -s $ERROR_STRING $MAIL_ADDR </dev/null &> /dev/null
exit 1
}
#-------------------------------------------------------------------------------
DirectoryExists()
{
#This function takes a full directory specifier (which might be a local
# directory or an ssh specified directory of the form
# machine_specifier:localDirectory) and makes the appropriate local or
# remote calls to test whether the directory exists.
#
#It takes a single argument, $1, which is the directory specifier.
#awk -F : splits text using : as a separator.
X=`echo $1 | awk -F : '{print $1}'`
if [[ $X == $1 ]]; then
#We are dealing with a local directory specifier
return `test -d $1`
else
lSRC_SERVER=$X
lSRC_DIR_REMOTE=`echo $1 | awk -F : '{print $2}'`
return `ssh $lSRC_SERVER test -d $lSRC_DIR_REMOTE`
fi
}
#-------------------------------------------------------------------------------
FileExists()
{
#This function takes a full file specifier (which might be a local
# file or an ssh specified file of the form
# machine_specifier:localFile) and makes the appropriate local or
# remote calls to test whether the file exists.
#
#It takes a single argument, $1, which is the file specifier.
#awk -F : splits text using : as a separator.
X=`echo $1 | awk -F : '{print $1}'`
if [[ $X == $1 ]]; then
#We are dealing with a local file specifier
return `test -e $1`
else
lSRC_SERVER=$X
lSRC_DIR_REMOTE=`echo $1 | awk -F : '{print $2}'`
return `ssh $lSRC_SERVER test -e $lSRC_DIR_REMOTE`
fi
}
#===============================================================================
#Test user is root
if [[ `id -u` != 0 ]]; then
ReportErrorAndExit "user is not root" DONT_LOG
fi
#...............................................................................
echo "============================================================" >> $LOG_FILE
echo `date` >> $LOG_FILE
echo "Start backup $BAK_NAME" >> $LOG_FILE
#...............................................................................
#Test that ssh is working properly
if ! ssh $SRC_SERVER "true"; then
ReportErrorAndExit "ssh connection to $SRC_SERVER is not working"
fi
#Test src volume exists
if ! ssh $SRC_SERVER "test -d $SRC_DIR_REMOTE"; then
ReportErrorAndExit "$SRC_DIR does not exist"
fi
#Test dst directory exists
if [[ ! -d $DST_DIR ]]; then
ReportErrorAndExit "$DST_DIR does not exist"
fi
#...............................................................................
#Force the backup drive to have permissions enabled
#This (helpfully non-documented, no-built in help --- thanks Apple) command will
# force permissions to be enabled for the backup drive.
# http://www.macosxhints.com/article.php?story=20020925051644480
vsdbutil -a $DST_VOL
#Test that the backup drive has permissions enabled (otherwise not only do we
# have the obvious problem of permissions not stored correctly), we also have
# the dest-link stuff won't generate correct links for anything with ownership
# that's not the current user.
PERMISSIONS_ENABLED_ON_BACKUP=`diskutil info $DST_VOL | grep "Owners" | awk '{print $2}'`
if [[ $PERMISSIONS_ENABLED_ON_BACKUP != "Enabled" ]]; then
ReportErrorAndExit "backup drive does not have permissions enabled"
fi
#-------------------------------------------------------------------------------
#Tell the remote machine not to go to sleep.
SetRemoteMachineToNotSleep
#-------------------------------------------------------------------------------
#Do the actual backup. If a prior backup exists, use that as a link source.
INITIAL_SIZE=`df -k $DST_VOL | grep "^/" | awk '{print $4}'`
INITIAL_SECONDS=`date "+%s"`
sync
ssh $SRC_SERVER sync
if ! DirectoryExists $DST_DIR/1; then
$RSYNC_LOCAL -axHEyW --delete --delete-after \
--delete-excluded --exclude-from=$BACKUP_EXCLUDES \
--ea-checksum \
--link-dest=$LINK_SRC_1/1/ \
--rsync-path=$RSYNC_REMOTE \
--stats --progress \
$SRC_DIR $DST_DIR/0/
else
$RSYNC_LOCAL -axHEyW --delete --delete-after \
--delete-excluded --exclude-from=$BACKUP_EXCLUDES \
--ea-checksum \
--link-dest=$LINK_SRC_1/1/ \
--link-dest=$DST_DIR/1/ \
--rsync-path=$RSYNC_REMOTE \
--stats \
$SRC_DIR $DST_DIR/0/ \
>>$LOG_FILE 2>&1
fi
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=1
# An error code of 24 (some files vanished before they could be transferred)
# is common and not worth treating as an error.
if [[ $RSYNC_ERROR_CODE == 24 ]]; then RSYNC_ERROR_CODE=0; fi
#Copy over the boot file (which has special permissions and can't be
# multiple-hard-linked to).
#Run this step twice to cope with either Intel or PPC boot.
if [[ $RSYNC_ERROR_CODE == 0 ]]; then
BOOT=/System/Library/CoreServices/BootX
if FileExists $SRC_DIR/$BOOT; then
$RSYNC_LOCAL -aEW --delete --delete-after \
--ea-checksum \
--rsync-path=$RSYNC_REMOTE \
$SRC_DIR/$BOOT $DST_DIR/0/$BOOT \
>>$LOG_FILE 2>&1
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=2
fi
fi
if [[ $RSYNC_ERROR_CODE == 0 ]]; then
BOOT=/System/Library/CoreServices/boot.efi
if FileExists $SRC_DIR/$BOOT; then
$RSYNC_LOCAL -aEW --delete --delete-after \
--ea-checksum \
--rsync-path=$RSYNC_REMOTE \
$SRC_DIR/$BOOT $DST_DIR/0/$BOOT \
>>$LOG_FILE 2>&1
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=3
fi
fi
#...............................................................................
FINAL_SIZE=`df -k $DST_VOL | grep "^/" | awk '{print $4}'`
FINAL_SECONDS=`date "+%s"`
let CHANGE_IN_SIZE=$(($INITIAL_SIZE - $FINAL_SIZE ))
let DURATION_SECONDS=$(($FINAL_SECONDS - $INITIAL_SECONDS))
let DURATION_HOURS=$(($DURATION_SECONDS/3600))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_HOURS*3600))
let DURATION_MINUTES=$(($DURATION_SECONDS/60))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_MINUTES*60))
echo
echo "Backup Duration hr min s =" $DURATION_HOURS $DURATION_MINUTES $DURATION_SECONDS >> $LOG_FILE
echo "Backup Change in size KB MB GB =" $CHANGE_IN_SIZE \
$(( ($CHANGE_IN_SIZE+512)/1024 )) \
$(( ($CHANGE_IN_SIZE+512*1024)/1024/1024 )) >> $LOG_FILE
if [[ $RSYNC_ERROR_CODE != 0 ]]; then
ReportErrorAndExit "*** rsync reported error $RSYNC_ERROR_CODE in phase $RSYNC_PHASE"
fi
#-------------------------------------------------------------------------------
#Proactively repair the backup volume
#1 Get the device node for the backup volume.
# We will need this later.
DST_VOLUME_DEV=`diskutil info $DST_VOL | grep "Device Identifier: " | awk '{ print $3 }'`
#2 Loop trying to unmount the backup volume.
# This may take a few tries because Spotlight may be busy indexing the volume.
COUNTER=0
while [[ $COUNTER -lt 3 ]]; do
diskutil unmount $DST_VOL &> /dev/null
UNMOUNT_ERROR_CODE=$?
if [[ $UNMOUNT_ERROR_CODE == 0 ]]; then
let COUNTER=3;
else
let COUNTER=$COUNTER+1
echo "Could not unmount. Waiting 60 seconds. Attempt $COUNTER of 3."
sleep 60
fi
done
#3 Once we have unmounted successfully, remount the drive
# We should now be cleared to run diskutil repairVolume without problems
# when the repair tries to unmount the volume.
diskutil mount $DST_VOLUME_DEV &> /dev/null
#...............................................................................
INITIAL_SIZE=`df -k $DST_VOL | grep "^/" | awk '{print $4}'`
INITIAL_SECONDS=`date "+%s"`
rm $TMP_FILE &> /dev/null
touch $TMP_FILE
tail -f $TMP_FILE &
diskutil repairVolume $DST_VOL &> $TMP_FILE
REPAIR_ERROR_CODE=$?
kill %1 #Kill the tail command above.
if [[ $REPAIR_ERROR_CODE != 0 ]]; then
cat $TMP_FILE >> $LOG_FILE
rm $TMP_FILE &> /dev/null
ReportErrorAndExit "*** diskutil reported error $REPAIR_ERROR_CODE"
else
rm $TMP_FILE &> /dev/null
fi
sync
#...............................................................................
FINAL_SIZE=`df -k $DST_VOL | grep "^/" | awk '{print $4}'`
FINAL_SECONDS=`date "+%s"`
let CHANGE_IN_SIZE=$(($INITIAL_SIZE - $FINAL_SIZE ))
let DURATION_SECONDS=$(($FINAL_SECONDS - $INITIAL_SECONDS))
let DURATION_HOURS=$(($DURATION_SECONDS/3600))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_HOURS*3600))
let DURATION_MINUTES=$(($DURATION_SECONDS/60))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_MINUTES*60))
echo "Repair Duration hr min s =" $DURATION_HOURS $DURATION_MINUTES $DURATION_SECONDS >> $LOG_FILE
echo "Repair Change in size KB MB =" $CHANGE_IN_SIZE \
$(( ($CHANGE_IN_SIZE+512)/1024 )) >> $LOG_FILE
#-------------------------------------------------------------------------------
#Shift all the backup names down by 1.
$POWERSHIFT $DST_DIR 5 >> $LOG_FILE 2>&1
echo "============================================================" >> $LOG_FILE
#===============================================================================
Discussion of the pull script
This script is obviously almost identical to the local backup script. What's different?
Script variables
remote rsync specifier
The rsync that will run here, on the machine running the script, is given by RSYNC_LOCAL. But this instance of rsync will start, and communicate with, an instance of rsync running on the remote computer. We need an argument, RSYNC_REMOTE that gives the path, on the remote machine, to that rsync.
remote machine and directory specifiers
Next we need a way to specify the remote server. This, in SRC_SERVER, follows the standard ssh specification: the user you want to login as, namely root, an @ sign, and then the name of the server you want to login to, myServer.local. Obviously your server will be named whatever you chose.
On the remote server we need a way to specify the volume we want to back up. This is again specified in SRC_DIR, which is now no longer just a directory but a concatenation of the machine specifier, a colon, and SRC_DIR_REMOTE. In this case we want to back up the root directory of the server, so the SRC_DIR_REMOTE specification is simply the root directory, ie /. If the server had another disk connected to it that we wished to back up, say, Mail, the directory specifier would be, just like in the local case, SRC_DIR_REMOTE=/Volumes/Mail.
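To make the concatenation concrete, here is a small sketch using the Mail example. The variable names are the ones from the pull script; spec_host and spec_path are hypothetical helpers of mine that show how such a specifier can be split again with plain parameter expansion. They only make sense for a specifier that actually contains a colon.

```shell
# Values following the pull script, with the Mail volume as the source.
SRC_SERVER=root@myServer.local
SRC_DIR_REMOTE=/Volumes/Mail
SRC_DIR=$SRC_SERVER:$SRC_DIR_REMOTE

# Split a "user@host:/path" specifier back into its two halves.
spec_host() { echo "${1%%:*}"; }   # text before the first colon
spec_path() { echo "${1#*:}"; }    # text after the first colon

echo "$SRC_DIR"         # root@myServer.local:/Volumes/Mail
spec_host "$SRC_DIR"    # root@myServer.local
spec_path "$SRC_DIR"    # /Volumes/Mail
```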
multiple link source specifiers
Finally we have the LINK_SRC_1 specifier. This has nothing intrinsically to do
with remote backups, it's just most likely to be used with them.
Recall everything we've discussed about how rsync uses hard links to prevent
having to store multiple copies of the same file in subsequent backups.
Now let's assume that you are doing backups of the boot drive of your iMac
into the folder $DST_VOL/Backups/iMac/. Some large part of that backup will
consist of MacOS X and various applications which are exactly the same
on the server. It seems silly to back up separate copies of those files
from the server when the server backup could, instead, create hard links
to the equivalent file in $DST_VOL/Backups/iMac/.
Newer versions of rsync (but not the version Apple ships with 10.4.8) handle this situation easily: you simply specify more than one --link-dest specifier. As each file is to be transferred to the backup drive, rsync looks through each of the --link-dest specifiers in order and, as soon as it finds a match, creates a hard link to that match.
You can have as many --link-dest specifiers as you like, but too many is counterproductive because the backup is slowed down by having to search through each of these directories looking for matches. It only makes sense to add an additional --link-dest specifier if there is a pretty good chance that it will contain files that both
- match what is in the source to be backed up AND
- do not match what is in the already specified --link-dest directories.
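If you end up with several candidate link sources, one way to keep the rsync command tidy is to generate the --link-dest options from a list and silently skip any directory that doesn't exist yet. This is only a sketch: link_dest_options is a hypothetical helper, not part of the scripts in this article, and the order in which you list the directories is the order in which rsync will search them.

```shell
# Print one --link-dest option per line for each candidate directory
# that actually exists. Candidates are given in search order.
link_dest_options() {
    for candidate in "$@"; do
        if [ -d "$candidate" ]; then
            printf '%s\n' "--link-dest=$candidate"
        fi
    done
}
```

You could then write something like $RSYNC_LOCAL -a $(link_dest_options $DST_DIR/1/ $LINK_SRC_1/1/) ... in the script. Note that the unquoted command substitution relies on none of the paths containing spaces.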
Tell the remote machine not to go to sleep
Obviously we don't want the remote computer to go to sleep during the backup process.
The script defines a collection of functions that, between them
- query the remote machine for its sleep times (under battery and AC)
- tell the remote machine to never sleep and
- install a handler that catches when the script exits (whether normally or through command-C) and tells the remote machine to reset its sleep times back to the earlier queried values.
If you are interested in the precise details of how this works, read the comments in the script.
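For readers who would rather see the parsing step in isolation, here is a stripped-down sketch. It is an alternative to the awk range trick used in the script, not the script's own code: sleep_time_for is a hypothetical helper, and the sample text is just the abbreviated pmset -g disk output quoted in the script comments, so treat the exact layout as an assumption.

```shell
# Extract the "sleep" value from one section of pmset-style output.
# $1 = section header (eg "AC Power:"); the text is read from stdin.
sleep_time_for() {
    awk -v header="$1" '
        /Power:/ { in_section = (index($0, header) > 0) }   # a section header line
        in_section && $1 == "sleep" { print $2; exit }      # first sleep line in it
    '
}

# Abbreviated sample, as quoted in the script comments.
PMSET_SAMPLE='Battery Power:
 acwake 0
 sleep 15
AC Power:
 womp 1
 acwake 0
 sleep 300
 autorestart 1'

printf '%s\n' "$PMSET_SAMPLE" | sleep_time_for "Battery Power:"   # 15
printf '%s\n' "$PMSET_SAMPLE" | sleep_time_for "AC Power:"        # 300
```

Against a real machine you would feed it ssh $SRC_SERVER pmset -g disk instead of the sample text.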
Testing that the src volume exists
We first test that ssh is working properly. With that done, we then test that the remote volume to be backed up exists. This is, conceptually, exactly the same as what we do for the local case, but the code to do so is slightly different. In particular, rather than using a built-in shell command, as we did for the local case, we've now defined a function that tests whether a directory exists which works whether the directory is local or remote.
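The local-or-remote test can be written quite compactly. The following is a sketch of the idea rather than the function from the script: directory_exists is my own name, and SSH_CMD is a hypothetical override that exists only so the remote branch can be exercised without a real server. Because ssh hands the remote command to a shell on the other end, a remote path containing spaces would need extra quoting that this sketch does not attempt.

```shell
# Return 0 if the directory named by $1 exists.
# $1 is either a local path or an ssh-style "user@host:/path" specifier.
directory_exists() {
    case "$1" in
        *:*) ${SSH_CMD:-ssh} "${1%%:*}" test -d "${1#*:}" ;;   # remote: ask the other machine
        *)   test -d "$1" ;;                                    # local: test directly
    esac
}
```

The exit status of the function is simply the exit status of test, whether it ran locally or on the far side of ssh, so it can be used directly in an if statement.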
rsync
rsync has two features to attempt to improve its performance over networks.
- The data to be transferred can be compressed. This is specified using the -z flag or equivalently the --compress specifier.
The most recent versions of rsync give you more control. Rather than using these two flags, you can use --compress-level=NUM, where NUM is an integer between 1 and 9. 1 requires the least CPU and gives the least compression, 9 requires the most CPU and gives the best compression. The default (what you get with -z) is 6.
Unfortunately neither the Apple rsync nor the rsync I recommended you download have this option yet, but hopefully they will acquire it soon.
- If a file to be backed up, and the matching file in a previous backup, are known to be different (for example they have different lengths or different timestamps), rsync can attempt to transfer just the parts of the file that are believed to be different from what already exists on the hard drive, not the whole file. The rsync running on the computer connected to the backup drive creates a full version of the file to be backed up from the pre-existing version on the drive and the difference data that is sent over the network.
This would obviously be a very neat trick if it worked as perfectly as I described. How exactly is one computer, with one copy of a file, supposed to tell how that differs from a copy of the file on some other computer without transferring one of the versions of the file between the two of them?
In very brief terms, the idea is that the two files are split into blocks, the blocks are hashed, the hash (but not the full block) is sent over the network, and blocks with matching hashes are assumed to be identical. There's a whole lot more to it, of course, and if you want technical details, you can read about them on the official rsync documentation web page. A summary can be found on Wikipedia.
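To get a feel for the block-hash idea, here is a deliberately crude toy. It is not rsync's algorithm: real rsync uses a rolling checksum together with a stronger hash and can match a block at any offset, whereas this sketch only compares blocks that sit at the same position in both files. block_hashes and count_differing_blocks are names I have made up for the illustration.

```shell
# Print one cksum line per fixed-size block of a file.
# $1 = file, $2 = block size in bytes
block_hashes() {
    file=$1; block_size=$2
    size=$(wc -c < "$file")
    blocks=$(( (size + block_size - 1) / block_size ))
    i=0
    while [ "$i" -lt "$blocks" ]; do
        dd if="$file" bs="$block_size" skip="$i" count=1 2>/dev/null | cksum
        i=$((i + 1))
    done
}

# Count the blocks whose hashes differ between two files.
# $1 = old file, $2 = new file, $3 = block size in bytes
count_differing_blocks() {
    old=$1; new=$2; bs=$3
    tmp_old=$(mktemp) && tmp_new=$(mktemp) || return 1
    block_hashes "$old" "$bs" > "$tmp_old"
    block_hashes "$new" "$bs" > "$tmp_new"
    # Line the two hash lists up side by side and count mismatched rows.
    paste -d '|' "$tmp_old" "$tmp_new" |
        awk -F '|' '$1 != $2 { n++ } END { print n + 0 }'
    rm -f "$tmp_old" "$tmp_new"
}
```

Only the hash lists would need to cross the network, which is the whole point: a block whose hash matches never has to be sent.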
The obvious question is whether to use one or both of these options.
The answer hinges on the relative speeds of your CPU, network and drive.
If your network is roughly the same speed as your hard drives,
then obviously neither of these options makes sense. If your machines are
connected via 100Mb/s ethernet, they are probably a good enough match to the
speed of your hard drives that these options don't make sense.
On the other hand, if you are running rsync over the public internet, you
may well be limited to something like 384kb/s, and both these options make
sense.
The interesting case is that of backing up at home over WiFi where the maximum speed (with 802.11g) of around 3MB/s is slow enough compared to the hard drives that compression might be worthwhile.
In this case the important issue is how much CPU is left on your computer after that allocated to ssh, rsync and the network and disk IO. To maximize this, the first step is, as I described in the earlier article, to set up ssh so that it is using arcfour as the encryption algorithm and thus using minimal CPU.
With that in place, run your backup over the network without compression and see what sort of network throughput you get. (Menumeters is useful for this purpose, though you can stick to using Activity Monitor.app if you prefer.) Wait for rsync to go through its initial stages of scanning disks and building internal data structures, until it actually starts copying over data. This may take thirty minutes or more depending on the size of your disks, but you will see the data transferred over the network jump from around 100KB/s (while the data structures are being built up) to around 2+MB/s.
You can kill the backup after you've had a few minutes of this to get a feel for the behavior, and try again, this time telling rsync to use compression, (ie using the -z flag). When I back up my portable (1GHz PPC 7450) it is very clear that in the no compression case the data flows out at around 2.5MB/s, with a fair bit of CPU left free, whereas in the compression case the data flows out at around 1MB/s with no CPU left free, which I interpret as the CPU being sufficiently slow that it can't compress at the rate needed to be worthwhile over a network connection of 3MB/s.
You could argue, reasonably, that a data rate of 1MB/s is still a win if that represents an uncompressed data transfer rate of greater than 3MB/s, ie a compression ratio, averaged over all files, of around 3x. With a stream of pure text files, you should get a compression ratio of around 4x or greater, so this would be a win, but on most people's machines, the bulk of the large files are going to be photos, audio files and movies, all of which are already compressed; the extra rsync compression takes time to process them but provides no compression gains. rsync would be greatly improved by having a mechanism by which one could specify that certain file types are already compressed and should not be compressed when transferred over the network, but as far as I can tell this is not an existing option. I have suggested it as a new feature and with luck it will appear in a new version.
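The break-even argument is just one multiplication: the effective rate with compression is the compressed rate on the wire times the average compression ratio, and compression wins only if that beats the uncompressed rate. A small sketch, where compression_wins is a hypothetical helper, the rates are the ones I measured on my portable, and the ratios are illustrative guesses:

```shell
# Succeed (exit 0) if compressing gives a higher effective throughput.
# $1 = rate without compression (MB/s)
# $2 = rate on the wire with compression (MB/s)
# $3 = average compression ratio (eg 4 for 4x)
compression_wins() {
    awk -v plain="$1" -v wire="$2" -v ratio="$3" \
        'BEGIN { exit !(wire * ratio > plain) }'
}

# Pure text at about 4x: 1 * 4 = 4MB/s effective, better than 2.5MB/s.
compression_wins 2.5 1 4   && echo "text: compress"
# Mostly photos and movies at about 1.1x: 1.1MB/s effective, worse.
compression_wins 2.5 1 1.1 || echo "media: don't compress"
```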
So my conclusion is that, at least for my portable with rsync as it currently operates, using the -z flag over 802.11g is not a good idea. I'm still not sure whether or not to use whole-file copying, or to allow the partial block comparison scheme for WiFi. My guess is that, at the end of the day, it's probably a win but a very minor one. There are very few large files I can think of on my system that are constantly being modified; the only ones that spring to mind are maybe the Mail, iTunes and iPhoto databases and various log files. For these files the non-matching block transfer scheme is probably a win, whereas for everything else it never kicks in (either the file is a small text file, or it has been created ab initio since the last backup).
A Push Script
#!/bin/bash
#This script performs a backup.
# The variables you'd need to set to modify it for your needs are clustered
# below.
#Two non-obvious things you should be aware of are:
# * The script must be run as superuser, ie sudo backupScript
# * It's not a good idea to run two instances of this script at the same
# time if they write to the same hard drive. This is because the diskutil
# stage of the backup, where the destination drive is checked for file
# system consistency, requires the drive to be unmounted, which you don't want
# to happen if the other script is busy writing to it.
# In theory two copies of the script that write to different drives should
# run in parallel without a problem, but I've never had a reason to try
# that so I can't promise anything.
#===============================================================================
HOME_DIR=/Users/mh
RSYNC_LOCAL=$HOME_DIR/bin/rsync
POWERSHIFT=$HOME_DIR/bin/powershift
BACKUP_EXCLUDES=$HOME_DIR/bin/backup_excludes.txt
RSYNC_REMOTE=/Users/mjh/bin/rsync
MAIL_ADDR=mh@bluecloud.com
LOG_FILE=$HOME_DIR/Library/Logs/backup.log
#Using $$ below uses the process ID in the file name and thus makes it unique.
TMP_FILE=/tmp/myBak.$$.txt
BAK_NAME=iBook_Push
SRC_DIR=/
DST_SERVER=root@iMac.local
DST_VOL=/Volumes/BackupDrive
DST_DIR_REMOTE=$DST_VOL/Backups/iBook/
DST_DIR=$DST_SERVER:$DST_DIR_REMOTE
LINK_SRC_IM=$DST_VOL/Backups/iMac/
LINK_SRC_SV=$DST_VOL/Backups/server/
#===============================================================================
ReportErrorAndExit()
{
#This function reports an error.
# It takes a compulsory argument, $1, a string that describes the error and
# an optional argument, $2.
# If $2 is anything, the error string is only logged to stderr, otherwise
# it is also logged to the log file.
# The error string is also mailed to $MAIL_ADDR.
#Set bash "word" separator to newline only.
# (If we didn't do this, the string argument passed in would not be
# treated as a single $1 argument.)
# Normally you'd want to restore this after you're done, but we're exiting
# at the end of this function so that's not necessary.
IFS=$'\n'
ERROR_STRING="BACKUP $BAK_NAME: $1"
if [[ $2 ]]; then
echo $ERROR_STRING >&2
else
echo $ERROR_STRING >&2
echo $ERROR_STRING >> $LOG_FILE
fi
mail -s $ERROR_STRING $MAIL_ADDR </dev/null &> /dev/null
exit 1
}
#-------------------------------------------------------------------------------
DirectoryExists()
{
#This function takes a full directory specifier (which might be a local
# directory or an ssh specified directory of the form
# machine_specifier:localDirectory) and makes the appropriate local or
# remote calls to test whether the directory exists.
#
#It takes a single argument, $1, which is the directory specifier.
#awk -F : splits text using : as a separator.
X=`echo $1 | awk -F : '{print $1}'`
if [[ $X == $1 ]]; then
#We are dealing with a local directory specifier
return `test -d $1`
else
lSRC_SERVER=$X
lSRC_DIR_REMOTE=`echo $1 | awk -F : '{print $2}'`
return `ssh $lSRC_SERVER test -d $lSRC_DIR_REMOTE`
fi
}
#-------------------------------------------------------------------------------
FileExists()
{
#This function takes a full file specifier (which might be a local
# file or an ssh specified file of the form
# machine_specifier:localFile) and makes the appropriate local or
# remote calls to test whether the file exists.
#
#It takes a single argument, $1, which is the file specifier.
#awk -F : splits text using : as a separator.
X=`echo $1 | awk -F : '{print $1}'`
if [[ $X == $1 ]]; then
#We are dealing with a local file specifier
return `test -e $1`
else
lSRC_SERVER=$X
lSRC_DIR_REMOTE=`echo $1 | awk -F : '{print $2}'`
return `ssh $lSRC_SERVER test -e $lSRC_DIR_REMOTE`
fi
}
#===============================================================================
#Test user is root
if [[ `id -u` != 0 ]]; then
ReportErrorAndExit "user is not root" DONT_LOG
fi
#...............................................................................
echo "============================================================" >> $LOG_FILE
echo `date` >> $LOG_FILE
echo "Start backup $BAK_NAME" >> $LOG_FILE
#...............................................................................
#Test that ssh is working properly
if ! ssh $DST_SERVER "true"; then
ReportErrorAndExit "ssh connection to $DST_SERVER is not working"
fi
#Test src directory exists
if ! DirectoryExists $SRC_DIR; then
ReportErrorAndExit "$SRC_DIR does not exist"
fi
#Test dst directory exists
if ! DirectoryExists $DST_DIR; then
ReportErrorAndExit "$DST_DIR does not exist"
fi
#...............................................................................
#Force the backup drive to have permissions enabled
#This (helpfully non-documented, no-built in help --- thanks Apple) command will
# force permissions to be enabled for the backup drive.
# http://www.macosxhints.com/article.php?story=20020925051644480
ssh $DST_SERVER vsdbutil -a $DST_VOL
#Test that the backup drive has permissions enabled. Otherwise, not only are
# permissions not stored correctly, but --link-dest also won't generate
# correct links for anything whose owner is not the current user.
PERMISSIONS_ENABLED_ON_BACKUP=`ssh $DST_SERVER diskutil info $DST_VOL | grep "Owners" | awk '{print $2}'`
if [[ $PERMISSIONS_ENABLED_ON_BACKUP != "Enabled" ]]; then
ReportErrorAndExit "backup drive does not have permissions enabled"
fi
#-------------------------------------------------------------------------------
#Do the actual backup. If a prior backup exists, use that as a link source.
INITIAL_SIZE=`ssh $DST_SERVER df -k $DST_VOL | grep "^/" | awk '{print $4}'`
INITIAL_SECONDS=`date "+%s"`
ssh $DST_SERVER sync
sync
if ! DirectoryExists $DST_DIR/1; then
$RSYNC_LOCAL -axHEy --delete --delete-after \
--delete-excluded --exclude-from=$BACKUP_EXCLUDES \
--ea-checksum \
--link-dest=$LINK_SRC_IM/1/ \
--link-dest=$LINK_SRC_SV/1/ \
--rsync-path=$RSYNC_REMOTE \
--stats --progress \
$SRC_DIR $DST_DIR/0/
else
$RSYNC_LOCAL -axHEy --delete --delete-after \
--delete-excluded --exclude-from=$BACKUP_EXCLUDES \
--ea-checksum \
--link-dest=$DST_DIR_REMOTE/1/ \
--link-dest=$LINK_SRC_IM/1/ \
--link-dest=$LINK_SRC_SV/1/ \
--rsync-path=$RSYNC_REMOTE \
--stats \
$SRC_DIR $DST_DIR/0/ \
>>$LOG_FILE 2>&1
fi
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=1
# An error code of 24 (some files vanished before they could be transferred)
# is common and not worth treating as an error.
if [[ $RSYNC_ERROR_CODE == 24 ]]; then RSYNC_ERROR_CODE=0; fi
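# For some reason, no matter what I do, this backup always complains about not
# being able to open "/Users/handleym/._Music" and "/Users/handleym/._.DS_Store".
# It seems better to suppress this and get the rest of the work done than to
# abort the whole backup. Note that code 23 (partial transfer due to error)
# covers any file that failed to transfer, so check the log for other failures.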
if [[ $RSYNC_ERROR_CODE == 23 ]]; then RSYNC_ERROR_CODE=0; fi
#Copy over the boot file (which has special permissions and can't be
# multiple-hard-linked to).
#Run this step twice to cope with either Intel or PPC boot.
if [[ $RSYNC_ERROR_CODE == 0 ]]; then
BOOT=/System/Library/CoreServices/BootX
if FileExists $SRC_DIR/$BOOT; then
$RSYNC_LOCAL -aEW --delete --delete-after \
--ea-checksum \
--rsync-path=$RSYNC_REMOTE \
$SRC_DIR/$BOOT $DST_DIR/0/$BOOT \
>>$LOG_FILE 2>&1
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=2
fi
fi
if [[ $RSYNC_ERROR_CODE == 0 ]]; then
BOOT=/System/Library/CoreServices/boot.efi
if FileExists $SRC_DIR/$BOOT; then
$RSYNC_LOCAL -aEW --delete --delete-after \
--ea-checksum \
--rsync-path=$RSYNC_REMOTE \
$SRC_DIR/$BOOT $DST_DIR/0/$BOOT \
>>$LOG_FILE 2>&1
RSYNC_ERROR_CODE=$?
RSYNC_PHASE=3
fi
fi
#...............................................................................
FINAL_SIZE=`ssh $DST_SERVER df -k $DST_VOL | grep "^/" | awk '{print $4}'`
FINAL_SECONDS=`date "+%s"`
let CHANGE_IN_SIZE=$(($INITIAL_SIZE - $FINAL_SIZE ))
let DURATION_SECONDS=$(($FINAL_SECONDS - $INITIAL_SECONDS))
let DURATION_HOURS=$(($DURATION_SECONDS/3600))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_HOURS*3600))
let DURATION_MINUTES=$(($DURATION_SECONDS/60))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_MINUTES*60))
echo
echo "Backup Duration hr min s =" $DURATION_HOURS $DURATION_MINUTES $DURATION_SECONDS >> $LOG_FILE
echo "Backup Change in size KB MB GB =" $CHANGE_IN_SIZE \
$(( ($CHANGE_IN_SIZE+512)/1024 )) \
$(( ($CHANGE_IN_SIZE+512*1024)/1024/1024 )) >> $LOG_FILE
if [[ $RSYNC_ERROR_CODE != 0 ]]; then
ReportErrorAndExit "*** rsync reported error $RSYNC_ERROR_CODE in phase $RSYNC_PHASE"
fi
#-------------------------------------------------------------------------------
#Proactively repair the backup volume
#1 Get the device node for the backup volume.
# We will need this later.
DST_VOLUME_DEV=`ssh $DST_SERVER diskutil info $DST_VOL | grep "Device Identifier: " | awk '{ print $3 }'`
#2 Loop trying to unmount the backup volume.
# This may take a few tries because Spotlight may be busy indexing the volume.
COUNTER=0
while [[ $COUNTER -lt 3 ]]; do
ssh $DST_SERVER diskutil unmount $DST_VOL &> /dev/null
UNMOUNT_ERROR_CODE=$?
if [[ $UNMOUNT_ERROR_CODE == 0 ]]; then
let COUNTER=3;
else
let COUNTER=$COUNTER+1
echo "Could not unmount. Waiting 60 seconds. Attempt $COUNTER of 3."
sleep 60
fi
done
#3 Once we have unmounted successfully, remount the drive.
# We should now be cleared to run diskutil repairVolume without problems
# when the repair tries to unmount the volume.
ssh $DST_SERVER diskutil mount $DST_VOLUME_DEV &> /dev/null
#...............................................................................
INITIAL_SIZE=`ssh $DST_SERVER df -k $DST_VOL | grep "^/" | awk '{print $4}'`
INITIAL_SECONDS=`date "+%s"`
rm $TMP_FILE &> /dev/null
touch $TMP_FILE
tail -f $TMP_FILE &
TAIL_PID=$!
ssh $DST_SERVER diskutil repairVolume $DST_VOL &> $TMP_FILE
REPAIR_ERROR_CODE=$?
kill $TAIL_PID #Kill the tail command above.
if [[ $REPAIR_ERROR_CODE != 0 ]]; then
cat $TMP_FILE >> $LOG_FILE
rm $TMP_FILE &> /dev/null
ReportErrorAndExit "*** diskutil reported error $REPAIR_ERROR_CODE"
else
rm $TMP_FILE &> /dev/null
fi
sync
#...............................................................................
FINAL_SIZE=`ssh $DST_SERVER df -k $DST_VOL | grep "^/" | awk '{print $4}'`
FINAL_SECONDS=`date "+%s"`
let CHANGE_IN_SIZE=$(($INITIAL_SIZE - $FINAL_SIZE ))
let DURATION_SECONDS=$(($FINAL_SECONDS - $INITIAL_SECONDS))
let DURATION_HOURS=$(($DURATION_SECONDS/3600))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_HOURS*3600))
let DURATION_MINUTES=$(($DURATION_SECONDS/60))
let DURATION_SECONDS=$(($DURATION_SECONDS-$DURATION_MINUTES*60))
echo "Repair Duration hr min s =" $DURATION_HOURS $DURATION_MINUTES $DURATION_SECONDS >> $LOG_FILE
echo "Repair Change in size KB MB =" $CHANGE_IN_SIZE \
$(( ($CHANGE_IN_SIZE+512)/1024 )) >> $LOG_FILE
#-------------------------------------------------------------------------------
#Shift all the backup names down by 1.
ssh $DST_SERVER $POWERSHIFT $DST_DIR_REMOTE 5 >>$LOG_FILE 2>&1
echo "============================================================" >> $LOG_FILE
#===============================================================================
You're probably a bit exhausted now, so I won't discuss this in detail. The main point is simply that in a number of situations where a task needs to be performed on the machine with the backup drive, for example attempting to unmount the backup drive and then run a file system check, the appropriate commands now have to be run through ssh rather than being run directly.
You will note also that I omitted the power management stuff from this script simply because I don't use push backup. If you want to use push backup and have a problem with one or the other machine going to sleep, use the power management code from the pull script.
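The "run it locally or run it through ssh" decision described above can be gathered into one small helper instead of being repeated in each function. What follows is only a minimal sketch, not part of the script above: the names SpecHost, SpecPath, RunOn and PathIsDirectory, and the SSH_CMD override variable, are all hypothetical and exist purely for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not in the original script) that split a
# "machine:path" specifier and run a command locally or through ssh.

# Print the machine part of "machine:path"; print an empty line for a local path.
SpecHost()
{
    case "$1" in
        *:*) printf '%s\n' "${1%%:*}" ;;
        *)   printf '\n' ;;
    esac
}

# Print the path part of "machine:path"; a local path is printed unchanged.
SpecPath()
{
    printf '%s\n' "${1#*:}"
}

# Run a command on the given host. An empty host means "run it here".
# SSH_CMD is an assumed hook for swapping the ssh binary (e.g. in a dry run).
# ssh joins its arguments with spaces and hands them to the remote shell,
# so paths containing spaces would need extra quoting.
RunOn()
{
    if [ -z "$1" ]; then
        shift
        "$@"
    else
        ${SSH_CMD:-ssh} "$@"
    fi
}

# Test whether a local or remote directory specifier names a directory.
PathIsDirectory()
{
    RunOn "$(SpecHost "$1")" test -d "$(SpecPath "$1")"
}
```

With helpers like these, a call such as `RunOn "$DST_SERVER" diskutil unmount "$DST_VOL"` would read the same in a push or a local script; setting the server variable to the empty string would make it run directly.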

