Quick and efficient backup methods – and what to do when it all goes wrong
You realise just how dependent you have become on your computer when things go terribly wrong. Your partitions won’t mount, your files seem corrupt or, worst of all, your entire hard drive seems to have become unreadable.
The first and most obvious piece of advice, as anyone will tell you, is to make regular backups. It is surprising how many people with years’ worth of valuable files have few or none of them backed up.
Here we will look at some preventative measures and – in case bad things do happen – some tips for disaster recovery.
rsync is a useful tool for duplicating files quickly and with minimal fuss. The copy it creates can be re-synchronised at any time, and only files that have changed are transferred. This makes it well suited to regular backups, where the aim is to guard against accidentally deleted files or to keep an older version of a file to fall back on. To copy the directory ‘documents’ to /backups you could run:
$ rsync -av documents /backups/
With this command rsync uses archive mode (-a), which preserves most of the files’ original attributes (including permissions, modification times and so on) and copies the directory recursively (including the documents directory itself) into /backups. The -v option makes rsync list each file as it is transferred. You could run this command weekly to maintain a regular copy of your files.
rsync was designed to work well over a network. If you have OpenSSH set up on both machines, put the remote machine’s name and a colon before the destination directory, for example:
$ rsync -av documents myserver:/backups
An alternative to copying individual files and directories is to make a raw copy of an entire hard drive or partition. This is easily done on all forms of Unix (including Mac OS X) using the standard ‘dd’ command, which reads raw bytes from any file (including devices) and writes them to any other file or device. Since only raw bytes are being read, dd doesn’t need to understand the layout of the data at all, so this method works with any kind of filesystem. For example, suppose /dev/hda1 is a 5GB partition containing a Linux ext3 filesystem:
# dd if=/dev/hda1 of=image.ext3
The dd command now reads every byte from the partition and copies it to a file called image.ext3 in the current directory. This is an exact clone, so copying the file back to the partition returns the data to the precise state it was in when the backup was made:
# dd if=image.ext3 of=/dev/hda1
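Because dd treats devices and ordinary files alike, the whole procedure can be rehearsed safely on a regular file before going near a real partition. The sketch below assumes a 1 MiB file of random bytes as a stand-in for /dev/hda1; the file names are invented for the example. It also sets a larger block size with bs=, which is usually faster than dd's default of 512 bytes, and checks the result with cmp.

```shell
#!/bin/sh
work=$(mktemp -d)

# Stand-in for a partition: 1 MiB of random bytes in an ordinary file.
dd if=/dev/urandom of="$work/fake-partition" bs=1024 count=1024 2>/dev/null

# Take the image, reading and writing in 64 KiB blocks.
dd if="$work/fake-partition" of="$work/image.raw" bs=64k 2>/dev/null

# cmp exits with status 0 only if the two files are byte-for-byte identical.
if cmp -s "$work/fake-partition" "$work/image.raw"; then
    echo "image matches source"
else
    echo "image differs from source" >&2
fi
```

The same cmp check can be run against a real device and its image, provided the partition has not been written to since the image was taken.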
You can duplicate an entire hard drive in this way. If you have two hard drives of exactly the same size, one on /dev/hda and the second on /dev/hdb, copying every last byte can be done the same way:
# dd if=/dev/hda of=/dev/hdb
A severe word of warning: be very careful when using the dd command. A mistyped command can result in a complete loss of data. If you accidentally typed /dev/hda2 instead of /dev/hda1 when restoring the image, your backup would be written to the second partition instead, completely overwriting whatever was there previously, without any warning.
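One way to reduce the risk of a slip is to wrap the restore in a small function that makes you type the target a second time. This is only a sketch: confirm_restore is a hypothetical helper written for this example, not something dd or any standard package provides.

```shell
#!/bin/sh
# confirm_restore IMAGE TARGET
# Hypothetical helper: writes IMAGE to TARGET with dd, but only after
# the target path has been typed again exactly.
confirm_restore() {
    image=$1
    target=$2

    if [ ! -r "$image" ]; then
        echo "cannot read image: $image" >&2
        return 1
    fi

    printf 'About to overwrite %s with %s.\n' "$target" "$image"
    printf 'Type the target path again to continue: '
    read -r answer

    if [ "$answer" != "$target" ]; then
        echo "Target not confirmed; nothing written." >&2
        return 1
    fi

    dd if="$image" of="$target"
}

# Example call (as root), shown as a comment so nothing runs by accident:
#   confirm_restore image.ext3 /dev/hda1
```

Retyping the path will not catch every mistake, but it does force a second look at the device name before any bytes are written.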
More efficient cloning
While dd is found on every Unix and Linux system and is a reliable way to back up your data, it has a major disadvantage for general use: every byte must be duplicated, whether it contains any valid data or not. If you want to back up a 100GB partition where only 20GB of the space is in use, you will still need 100GB of additional space to clone the data.
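A common partial workaround, which is not the method this article goes on to describe, is to pipe dd's output through gzip so that the image takes less room on disk. The sketch below again uses an ordinary file as a stand-in for a partition, with 1 MiB of random "used" data followed by 4 MiB of zeros. On a real partition the unused blocks often still hold old data rather than zeros, so the saving is likely to be smaller, and dd still has to read every byte.

```shell
#!/bin/sh
work=$(mktemp -d)

# Stand-in partition: 1 MiB of random data, then 4 MiB of zeros.
dd if=/dev/urandom of="$work/fake-partition" bs=1024 count=1024 2>/dev/null
dd if=/dev/zero bs=1024 count=4096 2>/dev/null >> "$work/fake-partition"

# With no of=, dd writes to standard output, so it can feed gzip directly.
dd if="$work/fake-partition" bs=64k 2>/dev/null | gzip -c > "$work/image.gz"

# Restore by reversing the pipeline: with no if=, dd reads standard input.
gunzip -c "$work/image.gz" | dd of="$work/restored" bs=64k 2>/dev/null

# Compare the sizes in bytes, then check the restored copy is identical.
wc -c "$work/fake-partition" "$work/image.gz"
cmp -s "$work/fake-partition" "$work/restored" && echo "restored copy matches"
```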
Partimage is a useful open-source application that helps to solve this problem. Unlike dd, partimage copies only used parts of the filesystem. To do this, it must obviously understand the filesystem it is copying, so it isn’t a simple raw copy like the one dd provides.
Partimage supports the popular Linux filesystems (ext2, ext3, reiserfs and so on) as well as FAT-formatted filesystems. It has limited support for NTFS. Partimage can also compress the data it backs up, which may reduce the size of the backup considerably and, on fast modern machines, makes little difference to the backup time.
Recent versions allow backing up over a network. Although it is a terminal text-based application, it is really no more difficult to use than a typical GUI program. If you’re using Ubuntu Dapper or Edgy, partimage can be installed from the Universe repository with:
$ sudo apt-get install partimage