What Hard Drive Should I Buy?

Interesting hard drive statistics from cloud backup provider Backblaze. The once-infamous Hitachi “Deathstar” seems to have completely turned things around and is now the best of the three tested.

I usually tend to prefer WD drives given their impeccable history and easy RMA process, but now I at least have an option.

Offsite backups using CrashPlan – review

With World Backup Day in our rear-view mirror, the need to give our backups a second thought becomes utterly apparent. Most computer professionals probably have some kind of nagging voice inside their heads reminding them to create backups, which works fine to some extent, until they realize that all backups are in-house and will be lost in case of a fire.

Most people have no backup at all, however, and adding a cloud-based backup solution would greatly benefit these kinds of users. There are a lot of options though, and finding one that suits a particular need is not the easiest thing to accomplish.

Given that modern files are quite large, with photo libraries growing by 20 GB or more every year, having a solution with unlimited storage, or one that is as cheap as possible per gigabyte, is crucial. Not only that, setup has to be minimal and it should by default back up everything in the normal documents folder, music and other types of user-created content.

Needs

I have a DSLR camera that outputs raw files at about 10 MB per photo, which within a year amounts to 10 to 40 GB worth of pictures, so a remote backup solution with plenty of storage is desperately needed. Other types of media include captured HD video files of irreplaceable moments and purchased music, which together amount to hundreds of gigabytes worth of precious data.

This means that my storage needs are quite large and increasing by the day, so I need a service that is reasonably cheap and fast, and in addition reliable and as secure as possible. These demands might sound contradictory, but the perfect backup solution should encompass all these properties in some way.

I would also like a service which is reasonably priced for at least three computers backing up to the same account, and preferably five. That way, backing up my parents’ computer to the same account will be a breeze and come at no extra cost.

The whole reason for having a cloud based backup is to have my precious data available off-site, and to make things easier, the service should preferably have reasonable download and upload speeds and its agent should be able to operate without intervention when everything is configured and running.

Alternatives

When deciding to use a cloud based backup solution, there is a wide array of applications and services to consider. There are different types of backup services, and the most common ones are probably file synchronization services such as Dropbox and box.net.

While their goal is to synchronize files between different computers and other devices, they also have the ability to back up versions of files when they change or are deleted. This provides an excellent solution for sharing documents and other files when collaborating with other people, or when working on the same content using different devices. Storage is however not cheap if you plan to store more than a couple of gigabytes worth of data.

On the other side, there is backup software which usually does not have the file synchronization capability, but is more focused on keeping backups of your files, with no bells and whistles. The benefit of using something like this instead is that cloud space is usually cheaper, with many backup providers claiming “unlimited” space.

There are a lot of players in this market however, such as SpiderOak and Backblaze. While SpiderOak could possibly be a decent service, it would be too expensive for my storage needs. At the rate of $10 per 100 GB, with as many computers as you like, it is however an excellent service if your storage needs do not exceed that first tier of 100 GB.

Backblaze on the other hand has a native Mac client and offers an easy plan of $5 per computer and month for unlimited storage. One of the key features however is their restore service, which means that they can overnight you a hard drive or DVD with your data for a fast restore. There is just one problem with this service — the data on the chosen media is sent unencrypted!
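The price comparison above can be sketched as a small calculation. The function names are made up for illustration, and the sketch assumes that every started block of 100 GB costs another $10 and that both prices apply to the same billing period, neither of which is confirmed here.

```shell
# tiered_cost SIZE_GB
# Price in dollars when every started 100 GB tier costs $10 (assumption).
tiered_cost() {
    size_gb=$1
    tiers=$(( (size_gb + 99) / 100 ))   # round up to whole tiers
    echo $((tiers * 10))
}

# per_computer_cost COMPUTERS
# Price in dollars at a flat $5 per computer, regardless of storage used.
per_computer_cost() {
    echo $(($1 * 5))
}

tiered_cost 300        # prints 30
per_computer_cost 3    # prints 15
```

With a few hundred gigabytes spread over three computers, the flat per-computer price comes out ahead, while a single tier of 100 GB shared by many computers favors the tiered price.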

That brings me to the topic of security: none of the services above has (to my knowledge) support for using your own encryption key. This means having to trust the provider to keep your password and key secure, instead of knowing that your own encryption key never leaves your computer.

CrashPlan

Another option I considered was CrashPlan, which was featured on the World Backup Day website. Having never tried it or even heard of it before, I was reluctant to consider it. The client is also written in Java, making it easier to run on multiple platforms, but memory and performance issues are usually lurking.

The user interface is quite pleasing to the eye, and once the client is initially launched and an account is created, a backup of the home directory is started automatically. Most people would be satisfied with leaving the application in its default state, since their entire account would be backed up. There is however a lot more than meets the eye at first glance.

Destinations

The most prominent feature when starting the application is the destination selection, providing the ability to back up using different storage endpoints. While backing up to “CrashPlan Central” will cost you money, the other backup options are free.

If you have a friend running CrashPlan, you can add each other as destinations for the backups, giving both parties the benefits of off-site backups while still using the free version. You will however need to provide enough storage for each other’s backup needs, which is not free in itself.

The same procedure can be used between different computers within your own account. They can act as destinations as well, potentially providing you with off-site backups if you have computers at different physical locations.

Speed

As mentioned earlier, an online backup of a large data set requires plenty of bandwidth to work properly. After backing up a considerable amount of data to the CrashPlan servers, I noticed a big difference in how fast the various server nodes were able to receive the data.

Before measuring the upload speed, the settings for CPU and bandwidth usage were tweaked to allow maximum throughput. My internet link is a 100 Mbit fiber connection, so if there are any delays or bandwidth issues, they reside on the server side.

I started by backing up the music collection on my MacBook Pro, which uploaded at a fairly constant rate of 3.2 Mbit/s. Even though this was fairly slow, it was bearable, given my not so large music library on this particular computer.

Backing up the NAS was a completely different story, however. Another server was chosen as the target for the backup (this is done automatically), but this time around the throughput maxed out at about 700 kbit/s at times, which is terribly slow if the data to be backed up exceeds 100 GB, which it did in this case.
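To put these rates in perspective, here is a rough estimate of how long an initial upload takes. The helper name is made up for this sketch, and it assumes decimal gigabytes and a constant transfer rate, which the measurements above show is optimistic.

```shell
# transfer_hours SIZE_GB RATE_KBIT
# Prints the whole number of hours needed to upload SIZE_GB gigabytes
# (decimal, 10^9 bytes) at a sustained RATE_KBIT kilobits per second.
transfer_hours() {
    size_gb=$1
    rate_kbit=$2
    bits=$((size_gb * 8 * 1000000000))
    seconds=$((bits / (rate_kbit * 1000)))
    echo $((seconds / 3600))
}

transfer_hours 100 700     # NAS rate: prints 317 (roughly 13 days)
transfer_hours 100 3200    # MacBook rate: prints 69 (roughly 3 days)
```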

Security

Having a backup solution in the cloud inherently raises privacy and security concerns. A lot of people will be uneasy giving up their data to a third party without knowing their data is safe from prying eyes.

CrashPlan uses Blowfish with a 448 bit key to secure the data at rest, and the communication is additionally encrypted using normal SSL connections with AES and a 256 bit key. The Blowfish key is then escrowed together with your data on the CrashPlan servers, encrypted with your account password.

For most people, the above solution is perfect, given the simple nature of the setup. The end user never has to touch the encryption key or remember anything more complicated than their own account password. When restoring files on a new computer, it is just the matter of logging into the account and restoring the files from the server.

The downside of this solution is that there is no way to separate the computers associated with the account, meaning that any computer logged into the user account can restore any file from any other computer in the account.

There is another security mode which separates the encryption key from the user account. That way, you still have the CrashPlan user account, but the encryption key is protected with another password. The benefit of using this mode is that different computers can have different passwords, and thus separate encryption keys. This fixes the problem of every computer being able to access all information stored by the other computers associated with the account.

The third option is to provide the encryption key directly instead of using passwords to encrypt the key stored on the server. This means that it is impossible for someone without knowledge of the encryption key to decrypt the data. The downside is that the key needs to be kept secure, since it needs to be provided when doing a data restore. Having the key on paper in a safety deposit box or some other secure location will be necessary, since losing the key means that it will be impossible to decrypt the data on the CrashPlan servers.
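As an illustration of what a key of that size looks like, the sketch below generates 448 bits of random data and prints it as hexadecimal. The function name is made up, and the format in which CrashPlan expects a custom key is not covered here, so treat the hex encoding as an assumption rather than something to paste into the client.

```shell
# generate_key_hex
# Prints 56 random bytes (448 bits) as 112 hexadecimal characters.
generate_key_hex() {
    head -c 56 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
}

key=$(generate_key_hex)
echo "Generated a key of ${#key} hex characters"
```

Whatever the format, the key only exists where you put it, so print it or write it down before relying on it.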

Security conscious people will undoubtedly distrust the implementation of the client handling the encryption key. Who knows if the key is secretly transmitted to CrashPlan without the user’s knowledge?

Conclusion

Having started the trial of CrashPlan only a few days ago, I have yet to uncover any severe misbehavior or inconsistencies. It has been a fairly smooth ride so far setting up my own encryption key and backing up three computers.

There was however one weird kink when creating and using keys for encryption. When the key was created on the Windows platform, it could for some reason not be validated on the Mac, which at first made me doubt the service. However, when I created a new key on the Mac, it could successfully be used both on the Mac and in Windows, as well as my Linux server.

If you are planning to use CrashPlan on the Mac, you may experience an unusually high memory load, which is partly the result of CrashPlan being executed using 64 bit Java. There is a simple way to change to 32 bit execution however, which involves editing /Library/LaunchDaemons/com.crashplan.engine.plist and adding “-d32” to the ProgramArguments section. For other memory optimizations and a discussion, have a look at the Reduce memory usage thread on the CrashPlan forums.

Another thing which could be improved is the upload and download speed, which is abysmal compared to the available throughput. The speed when backing up my Mac seemed to stabilize at about 3.2 Mbit/s and the speed on my NAS is running at about 1 Mbit/s. Extremes ranged from about 500 kbit/s to 20 Mbit/s, which is basically all over the place. Not that this is usually a problem once the initial backup has completed, but it could be a lot faster. This is also one of the reasons I am hesitant to become a member once the trial has run out, but I may change my mind, since it is extremely convenient.

The other reason however, is privacy. While I am confident that CrashPlan does not “back up” my encryption key once I have chosen to use my own, there could be programming errors or other problems, exposing this key in some manner.

The alternative would be to set up a backup server at some location with plenty of disk space to mirror all my data, including changes made to files, using some kind of rsync snapshot solution. This requires a somewhat hefty investment on the hardware side however, while CrashPlan is ready to back up anything I throw at it.
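One common way to build such rsync snapshots is the --link-dest option, which hard-links unchanged files to the previous snapshot so that every snapshot looks complete but only changed files take up new space. The function below is a minimal sketch of that idea, not a finished backup tool; the function name, the directory layout and the “latest” symlink are all assumptions made for this example.

```shell
# snapshot SRC DEST_ROOT [NAME]
# Copies SRC into DEST_ROOT/NAME, hard-linking files that are unchanged
# since the previous snapshot, and points DEST_ROOT/latest at the result.
# NAME defaults to a timestamp.
snapshot() {
    src=$1
    dest_root=$2
    name=${3:-$(date +%Y-%m-%d_%H%M%S)}
    mkdir -p "$dest_root" || return 1
    if [ -e "$dest_root/latest" ]; then
        # A relative --link-dest is resolved against the destination directory.
        rsync -a --delete --link-dest="../latest" "$src/" "$dest_root/$name/" || return 1
    else
        rsync -a --delete "$src/" "$dest_root/$name/" || return 1
    fi
    ln -sfn "$name" "$dest_root/latest"
}

# Usage (paths are examples only):
#   snapshot "$HOME/Documents" /mnt/backup/documents
```

The destination could just as well be a remote server reached over SSH, which is what would turn this into an off-site backup.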

When the trial expiration starts to creep up, I will hopefully have some more insight into reasons to stay or quit. Until then, I am staying with CrashPlan.

Limit Time Machine disk usage on external drives

Time Machine is a very simple and elegant backup solution for Mac OS, with an intuitive restore browser. The problem with Time Machine however, is that it takes up all free disk space before starting to erase old backups. This is no problem if you have a dedicated Time Machine disk, but most people usually want to keep other things on the very same disk.

Time Machine uses different methods for network backup and local backup. One way of limiting remote backups is covered in an earlier article called “Create a fixed size network storage for Time Machine”, so this article will instead focus on limiting the disk usage on locally connected disks, such as USB or FireWire.

First make sure that you are using an HFS formatted disk, since we are going to resize the partition. Start Disk Utility and select your external disk from the left menu. Click the Partitions tab and you will be presented with your entire disk. Drag the bottom-right handle of the partition up and make it as small as you want your Time Machine to be. When you are satisfied with the new size, click the plus button at the bottom to add an additional partition to occupy the free space.

Disk Utility


Now open up the Time Machine preferences and select your disk!

Time Machine Preferences


The additional volume can be used to store anything you want. Just remember to eject the disk properly before you disconnect it from your Mac!

Safe document writing using Dropbox

Lots of people who write articles or create content in any form often find themselves generating lots of files. A writer will, for instance, probably have lots of article drafts lying around. Everyone has a different solution for revision control and backup, ranging from a simple manual file copy to a full-fledged revision control system such as Subversion.

For everyone else, there is a simple solution for keeping backups of your work in progress, as well as being able to retrieve any previous revision. In addition to all this, it even lets you sync files between multiple computers and access your files online from any computer with internet access.

I guess you know by now that I am talking about Dropbox, a service available for Windows, Mac and Linux. It installs a small application on your computer which monitors a configurable directory for changes and uploads them automatically to the Dropbox servers.

dropbox-revisions

The free version offers 2 GB of space, which should be enough for most people. For photographers and other people dealing with lots of large files, there is also a premium option available which gives you 50 GB for $99 per year.

The web interface is beautiful and makes it easy to navigate your Dropbox and download files. This is also the place for viewing older revisions of your files and for copying, renaming and deleting them.

dropbox-events

A very handy feature is the ability to share folders with other Dropbox users! If you are working together with other people in a project, just share a folder between you and everyone will instantly have access to all changes in the project folder – automatically.

There is even a way of sharing files with non-Dropbox users. There is a special folder in the root of the Dropbox named “Public”. Putting files here makes it possible to right-click on a file and copy a public URL for it. To let other people download the file, it’s just a matter of sharing the link with them. They can’t of course make changes to it, nor view its revision history.

Another special folder in the Dropbox root is the Photos folder, which creates instant photo albums for viewing on the web by anyone. This is definitely the easiest way of getting a photo album up on the web, since you only need to copy or move the pictures to this special folder on your computer – Dropbox does the rest.

All iPhone users out there, and possibly other phone owners, can access the iPhone web interface too for downloading files in the Dropbox. It is even possible to view the uploaded photo galleries.

dropbox-iphone

There is a tour available on the website which explains all features more in-depth.

Upcoming features include:

  • Timeline based undo
  • Online visualization for any file type
  • An iPhone application/interface that lets us download files of interest (pdf, docs, pictures..)
  • Watch any folder support (configurable per host)
  • Better shared folder controls (permissions, etc.)
  • Online edition for text files
  • Add friends
  • Improve Upload Speed
  • Group accounts

If you decide to give Dropbox a go, consider using my referral link when you sign up. That way, both you and I get additional storage for free!

Disclaimer: From this article it may seem like I work for Dropbox, but I don’t. I just like their service a lot!

Create a fixed size network storage for Time Machine

Time Machine is a backup program built into Mac OS X 10.5, Leopard. It saves all files on the computer to a USB or network drive, which can be used for restoration of individual files or the whole computer.

The normal behavior of Time Machine is to keep

  • hourly backups for the past 24 hours
  • daily backups for the past month
  • weekly backups until your backup disk is full
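The schedule above can be expressed as a simple rule on the age of a backup. This is only an illustration of the list, not how Time Machine is implemented; the function name is made up and a month is assumed to be 30 days.

```shell
# retention_tier AGE_HOURS
# Prints which tier a backup of the given age (in hours) falls into.
retention_tier() {
    age_hours=$1
    if [ "$age_hours" -le 24 ]; then
        echo hourly
    elif [ "$age_hours" -le $((24 * 30)) ]; then
        echo daily
    else
        echo weekly
    fi
}

retention_tier 5       # prints hourly
retention_tier 200     # prints daily
retention_tier 2000    # prints weekly
```

Nothing in the weekly tier ever expires on its own, which is why the backup keeps growing until the disk is full.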

It is the last point that might cause trouble, since many people share the drive with other types of data. There has to be some way to limit the size of the backup volume. This is my approach.

Preparing an image

The first step is to create an image to hold the backup filesystem. If you want this filesystem encrypted, have a look at Mounting encrypted volumes, otherwise just follow the steps below. The image will be created as /ext/timemachine.bin and it will be mounted in /ext/timemachine.mnt.

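# Create a sparse 250 GB image file (nothing is written, so this is instant)
dd if=/dev/zero of=/ext/timemachine.bin bs=1 count=0 seek=250G
# Attach the image to a loop device and create an ext3 filesystem on it
losetup /dev/loop1 /ext/timemachine.bin
mkfs.ext3 /dev/loop1
# Disable the periodic forced filesystem checks
tune2fs -c0 -i0 /dev/loop1
# Detach the loop device again and create the mount point
losetup -d /dev/loop1
mkdir /ext/timemachine.mnt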

The first thing is to create an image file, and using the dd command we create an empty 250GB file, which will contain the backups. The next step is to set up the image as a loop device, which makes it possible to create a filesystem on it and mount it as usual. loop1 is used here, but if you know that it is occupied, feel free to choose another device.
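An empty image like this does not have to occupy its full size on disk right away. A sparse file only allocates blocks as data is written to it, which the following sketch demonstrates on a small scale. The function name and the 100 MB size are chosen for the demonstration only, and how much space is really allocated depends on the underlying filesystem.

```shell
# make_sparse_image PATH SIZE
# Creates (or truncates) PATH as an empty file of SIZE without writing data.
make_sparse_image() {
    dd if=/dev/zero of="$1" bs=1 count=0 seek="$2" 2>/dev/null
}

image=$(mktemp)
make_sparse_image "$image" 100M

# The apparent size is the full 100 MB, the allocated size is close to zero.
echo "Apparent size: $(wc -c < "$image") bytes"
echo "Allocated:     $(du -k "$image" | cut -f1) KiB"
rm -f "$image"
```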

The next step is to edit /etc/fstab and add a line which will automatically mount the filesystem when the computer boots.

/ext/timemachine.bin /ext/timemachine.mnt ext3 loop=/dev/loop1 0 0

Then we will mount all filesystems and verify that it has indeed been mounted.

mount -a
df -h
/ext/timemachine.bin  248G  188M  235G   1% /ext/timemachine.mnt

There should be a line like the above if everything is working correctly. The last step is to set the correct permissions for the directory for your user.

chown -R joch /ext/timemachine.mnt/

Setting up the Samba share

To connect to the server, it is necessary to set up the Samba server. Create a share like the following in /etc/samba/smb.conf.

[tmbup]
comment = Time machine backups
path = /ext/timemachine.mnt
browseable = yes
read only = No
inherit permissions = no
guest ok = no
printable = no

Now just reload Samba and add a user if you have not done so before.

invoke-rc.d samba reload
smbpasswd -a joch

Setting up Time Machine

Connect to the share in Finder as usual.

Finder window

Open up the Time Machine preferences and click Change Disk. It should give you a dialog like this, and Time Machine should then be enabled.

Time Machine setup

Time Machine enabled

If you get the error “Time Machine Error: The backup disk image could not be created.”, you will need to do some magic on the server.

Time Machine error

You need to start the backup once again, but this time you will have to be quick and copy the directory it creates on the server. Once Time Machine has finished, the original directory will be deleted, so just copy the saved directory back to the same place.

cp -rp Johnnys\ MacBook\ Pro_001ec2123456.sparsebundle/ ..
# Wait until Time Machine has finished
cp -rp Johnnys\ MacBook\ Pro_001ec2123456.sparsebundle/ timemachine.mnt/

Now run the backup again, and it should complete successfully!

Time Machine run

This behaviour is very strange, but the above trick always solves the problem.

Synchronize your data using Grsync

Keeping your data synchronized with an external data storage is essential to keep your documents and other data secure. Rsync is a robust and popular tool for doing exactly this, so what better tool to use as your personal backup solution?

There are of course other tools for doing this such as Unison, which I wrote about earlier. Which tool you prefer to use for backing up your data is a matter of personal preference, as long as you actually use it. This article will not directly use the rsync tool, but instead discuss the GTK front-end, which gives the user access to the most usable functions and settings.

We will start by installing grsync with your favorite package manager. If you are using a Debian based distribution, just execute apt-get install grsync to get hooked up.

Next, we will initialize a directory with data and a directory to keep the backup. The backup directory should of course be located on an external disk, network drive or something other than the local computer.

$ mkdir -p sync/data sync/backup
$ echo "This is the contents of the first file" > sync/data/one.txt
$ echo "This is also some dummy content" > sync/data/two.txt

The time has now come to start grsync. Start by creating a new session by clicking add and figure out a name to describe your sync pair.

grsync.png

Browse to the source and destination directories to select them. Note that if you are synchronizing to a FAT, NTFS or other type of file-system not supporting Unix permissions, uncheck “preserve permissions”, since those depend on how the partition is mounted, and not the actual permissions.

Before executing the task, it might be wise to run the simulation to see possible problems or just to get reassurance of which files will be copied. When you are ready to start the sync, just press execute and hope for the best.

progress.png

The files should now hopefully be correctly synchronized to the sync/backup directory. You might also notice that the actual rsync command is displayed at the top. This command could be useful if you want to automate this process using cron or something similar.
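For reference, a command-line equivalent of the sync pair could look roughly like the sketch below. The exact options Grsync displays depend on which boxes are checked, so the plain -a (archive) mode used here is an assumption, and the function name is made up. The --dry-run option corresponds to the simulation mentioned earlier.

```shell
# sync_pair SRC DST [extra rsync options...]
# Copies the contents of SRC into DST in archive mode.
sync_pair() {
    src=$1
    dst=$2
    shift 2
    rsync -a "$@" "$src/" "$dst/"
}

# Usage:
#   sync_pair sync/data sync/backup --dry-run -v   # simulate, list changes
#   sync_pair sync/data sync/backup                # perform the sync
#
# Example crontab entry running the same sync every night at 02:00:
#   0 2 * * * rsync -a "$HOME/sync/data/" "$HOME/sync/backup/"
```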

To conclude, I have to say that Grsync is a very competent and easy to use tool, suitable for both beginners and more advanced users. The GUI looks polished and usable but will still give you detailed information if you want.