Monday, October 06, 2008

When Time Machine fails: Seek-click-spindown-retry

My MacBook hard drive died recently. No warning - one minute fine, the next the familiar and nauseating seek-click-spindown-retry cycle I’ve come to dread. The dearly departed was a Seagate Momentus 120GB 5400 RPM drive. What I didn’t know at the beginning was that my Time Machine backup drive was faulty too.

My external half-terabyte drive stored 9 months of history as a Time Machine backup.

I’d googled the restore process and it seemed painless. Just boot off the Mac OS X installer disc, choose Utilities | Restore System From Backup and voilà - done!

Which, of course, wasn’t how it went at all.

First, the hard drive. I live on the eastern US coast and my local Apple store would only replace the hard drive with one the same size as my original MacBook’s, a measly 60 gig or something. They would also charge $350, so I forgot that idea immediately - it’s easy to put in a new, cheap 2.5” SATA drive yourself, although Apple won’t sell you one.

The only proviso: the bracket that holds the drive in its bay is attached to the hard drive with Torx T8 screws - you’ll need the right screwdriver.

I bought a 320GB 2.5" Seagate Momentus 7200 RPM drive (finally, plenty of space) and got it installed. Insert the Mac OS X 10.5 Leopard disc, reboot. I chose the menu option to restore from a Time Machine backup and, after some playing around with reboots (it seems my internal drive wasn’t recognized straight away), I got to the restore process.

Time Machine started reading from the most recent backup, got to 30% and then the backup drive stopped reading, spun down like someone had cut the power, then started back up again. The Time Machine drive had a mid-read hardware failure causing it to “reboot”. Of course, Time Machine couldn’t handle a violent read interruption and so gave the helpful “sorry, folks, reboot and try again” message.

What followed involved reinstalling OS X from scratch, then writing some ad-hoc bash scripts to get rsync to copy my home directory from the Time Machine backup disk to my new home directory. This needed to be done piecemeal, one folder at a time, as an iterative copy process to cater for the random drive brown-outs.

The script looked something like this:

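#!/bin/bash

# Keep retrying until rsync exits cleanly; each pass skips files already copied.
until rsync -av /Volumes/TimeMachineDrive/Backups.backupdb/My\ MacBook/Latest/Mac/Users/myuser/Photos/ ~/RestoreFolder; do
    # Give the flaky drive time to spin back up before the next attempt.
    sleep 10
done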

So after each failure the script waited 10 seconds for the drive to restart, and rsync picked up where it left off, skipping the files it had already copied.
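If the drive had been in worse shape, an open-ended loop could have spun forever, so a capped retry might be a safer variation. The following is only a sketch of that idea, not the script I actually ran: the function name retry_until_ok, the attempt limit and the folder list in the comment are all made up for illustration.

```shell
#!/bin/bash

# retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0, giving up
# (and returning 1) once MAX_ATTEMPTS tries have failed.
retry_until_ok() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Wait for the drive to spin back up before trying again.
        sleep "$delay"
    done
}

# Example use, one folder at a time (folder names are placeholders):
#   src="/Volumes/TimeMachineDrive/Backups.backupdb/My MacBook/Latest/Mac/Users/myuser"
#   for folder in Photos Documents Music; do
#       retry_until_ok 100 10 rsync -av "$src/$folder/" ~/RestoreFolder/"$folder"/
#   done
```

Copying one folder per rsync call keeps each run short, which matters when the drive may drop out at any moment.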

Although a broken drive isn’t Apple’s fault, it would have been nice if the restore process were a little more robust. I eventually got all my data back and replaced my defective Time Machine drive.

Kudos to Apple for providing an elegant folder-based backup solution accessible from the Finder and the command line. Without that, my life would have been a lot harder. The scripts above were trivial to write.
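Because each backup is just a date-stamped folder next to the Latest link, picking an older snapshot to restore from is a matter of listing a directory. Here is a rough sketch of that; the function name list_snapshots is made up, and it assumes the snapshot folder names begin with YYYY-MM-DD so that alphabetical order is also chronological.

```shell
#!/bin/bash

# list_snapshots BACKUP_ROOT
# Hypothetical helper: prints the dated snapshot folders under
# BACKUP_ROOT, oldest first. Anything not starting with a date
# (such as the Latest link) is ignored.
list_snapshots() {
    local root=$1 dir
    for dir in "$root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*/; do
        # If the pattern matched nothing it stays literal; skip it.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}

# Example use:
#   list_snapshots "/Volumes/TimeMachineDrive/Backups.backupdb/My MacBook"
```

Swapping Latest for one of the printed names in the rsync source path would restore from that snapshot instead.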

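The use of hard links by Time Machine is clever - files that haven’t changed since the previous backup are hard-linked rather than copied, so every snapshot looks like a complete copy while only changed files take up new space. It makes full use of existing operating system features, which is why ordinary tools like the Finder and rsync can read the backups.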
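To see why hard links make this work, here is a toy demonstration. It is not how Time Machine itself is implemented, just the underlying filesystem idea, and the snapshot-1 and snapshot-2 folder names are made up. Time Machine also hard-links whole unchanged directories, which plain ln won’t do, so this only shows the file-level case.

```shell
#!/bin/bash

# Hypothetical scratch layout standing in for two backup snapshots.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/snapshot-1" "$demo_dir/snapshot-2"

# The first snapshot holds the real file contents.
echo "holiday photo" > "$demo_dir/snapshot-1/photo.txt"

# The file is unchanged in the second snapshot, so link it instead of copying.
ln "$demo_dir/snapshot-1/photo.txt" "$demo_dir/snapshot-2/photo.txt"

# -ef is true when both names refer to the same underlying file.
if [ "$demo_dir/snapshot-1/photo.txt" -ef "$demo_dir/snapshot-2/photo.txt" ]; then
    echo "same file on disk"
fi

# Deleting the older snapshot's name leaves the data reachable via the newer one.
rm "$demo_dir/snapshot-1/photo.txt"
cat "$demo_dir/snapshot-2/photo.txt"
```

The last two lines are the important part: old snapshots can be thinned out without touching the files that newer snapshots still reference.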

Sometimes Apple does things right - this is an elegantly engineered solution, even if they did give the GUI designers WAY too much leeway... gimmicky, much?
