Backups

Let's talk about backups. Hardly anyone has adequate backups of their important files. Even IT professionals, who know better than anyone that backups matter and who should know how to do them, don't back up their systems appropriately.

A related, but somewhat different, problem is synchronizing your important files across multiple computers. I had used networked filesystems before Dropbox, but Dropbox has made synchronization so convenient that I would now hate to live without it.

What are my actual backup requirements? Can Dropbox solve my backup needs in addition to being a synchronization solution? What about other cloud storage providers, or less expensive options like Carbonite? Is a home-brew solution feasible? These are some questions I will try to answer, mostly for myself, here.

My backup requirements

There is definitely a working set of files that I want to see synced to every machine that I work on. Dropbox has taught me that. When using Dropbox, I know that my most important files are backed up because they are synchronized to multiple computers. As further insurance, Dropbox is storing them "in the cloud", just in case all of my computers go down. It seems like Dropbox has become a total solution here and I don't worry about these important files, or backing them up, at all any more.

There is another class of files, though, that Dropbox isn't right for. I have tens of gigabytes of family photos and videos that I need to back up better than I am doing currently. Part of the reason I haven't used Dropbox for these files is that I have hesitated to purchase a paid account, but you know what? I'm not going to let that be an issue anymore. I'm willing to pay $10 or $20 per month if that's the right solution for me. The rest of the reason, though, is that I don't really need all those files copied to all my computers. I don't even think I have space on some of my devices. These files, I pretty much just need them backed up. That's all.

Yet Dropbox has skewed my perspective on what I consider to be effective backup for any files. Carbonite and other "cloud-only" solutions just don't seem adequate to me anymore. Having my data on just one machine plus a copy in the cloud seems too brittle. I had Carbonite installed on my wife's parents' computer at one point and when their computer died, they couldn't recover their files, even with the assistance of Carbonite customer service. I was in Europe at the time, unable to mount a recovery effort of my own. Also, for some reason, I don't trust Carbonite's business model. $59/year for unlimited backup seems like it must break down at some point.

There are lots of ways to keep my photo and video file folders synced to another computer so I would at least have two copies of them at my house. That would be better than what I have now (which is nothing), but I have a paranoid side that would always worry about a fire, flood, or burglary that destroys all computers in the household. That's when the extra comfort of a "cloud" copy comes in handy. I don't trust the cloud enough to rely on it whenever one hard drive goes bad, but I can trust it as a second layer of backup in those black swan disaster scenarios when two hard drives go bad.

The bottom line is that I won't be completely comfortable with my backup situation until my photo and video files are stored in two places in my house and in the cloud. This leads me to conclude two things. First, maybe Dropbox will be a good solution for these backup needs as well. Second, the main cost for my backup system is going to be the cloud storage.

Cost

How much does it cost to store extra data at my house? Let's say a 1 TB drive uses 13 watts. Multiplied by 8,760 hours per year and $0.15 per kWh, that's about $17 per year (per TB). We'll convert the up-front cost of the drive (about $100) to an annuity of $5-10 per year so the total is $22-27 per year per TB, or roughly $0.02-0.03 per GB per year.
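Here is a quick check of that arithmetic, as a rough sketch. It assumes the drive runs around the clock for a 365-day year (8,760 hours) and treats 1 TB as 1,000 GB; the function name is just something I made up for the example.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year
GB_PER_TB = 1000           # assumption: decimal terabytes, as drive makers count them


def home_storage_cost_per_year(watts, price_per_kwh, drive_annuity):
    """Yearly cost in dollars of one always-on drive: electricity plus drive annuity."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * price_per_kwh + drive_annuity


# Figures from the paragraph above: 13 W drive, $0.15 per kWh, $5-10 per year annuity.
electricity = home_storage_cost_per_year(13, 0.15, 0)
low = home_storage_cost_per_year(13, 0.15, 5)
high = home_storage_cost_per_year(13, 0.15, 10)

print(f"electricity only: ${electricity:.2f} per year")
print(f"total per TB:     ${low:.2f} to ${high:.2f} per year")
print(f"total per GB:     ${low / GB_PER_TB:.3f} to ${high / GB_PER_TB:.3f} per year")
```

The electricity alone comes to a little over $17 per year, so the per-GB figure lands between two and three cents.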

How much does it cost to store data in the cloud? Dropbox costs $200 per year for 100 GB, or $2 per GB per year. Amazon S3 costs $0.14 per GB per month, or $1.68 per year per GB, or $1.116 per year per GB for the reduced redundancy option. (And data transfer in is free now.) Amazon Cloud Drive is about $1 per GB per year, although their current offer is to store all of your music for free. Apple iCloud is $2 per GB per year.

Wow. Storing the data myself is at least 30 times cheaper than storing it in the cloud. Yet for the amount of data I have, the difference in dollars isn't significant enough to make me engineer my own solution.
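The exact multiple depends on which provider you compare against. A small sketch, using only the per-GB prices quoted above and the rounded $0.03 home figure (the dictionary and function names are mine):

```python
HOME_COST = 0.03  # dollars per GB per year, the rounded home estimate from the text

# Dollars per GB per year, as quoted in the text.
cloud_costs = {
    "Dropbox (100 GB plan)": 200 / 100,
    "Amazon S3": 0.14 * 12,
    "Amazon S3 reduced redundancy": 1.116,
    "Amazon Cloud Drive": 1.00,
    "Apple iCloud": 2.00,
}


def cost_ratio(cloud_cost, home_cost=HOME_COST):
    """How many times more expensive the cloud price is than home storage."""
    return cloud_cost / home_cost


# Print cheapest first.
for name, cost in sorted(cloud_costs.items(), key=lambda item: item[1]):
    print(f"{name:30s} ${cost:5.2f}/GB/year  {cost_ratio(cost):4.0f}x home storage")
```

With these numbers the cheapest option is about 33 times the home cost and the most expensive about 67 times.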

Carbonite

Another thing about these numbers is they make me reconsider my dismissal of Carbonite's business model. Maybe the marginal cost for data is indeed so small that an unlimited data plan is feasible. And if I'm storing my files on at least two of my own computers, then I wouldn't mind so much using Carbonite merely as a second layer of backup. This suggests that I could use AeroFS (or an old-school solution like rsync) to synchronize all my folders onto one machine and then use one Carbonite account to back them all up for $59/year. But I guess it's not honest to back up multiple computers with one Carbonite account. They want me to pay $59 annually per computer. Then again, at that price, I could afford three Carbonite accounts for less than the price of one 100 GB Dropbox account.

Short-term Plan

I don't think I have 100 GB of data that needs to be backed up yet, so I'll probably go with the $20/month (100 GB) Dropbox plan. I'll install that main Dropbox account on two computers, then I'll create some free Dropbox accounts and use Dropbox shared folders to make sure they sync with my main account.

Technical details

Speaking of a home-brew solution, here are some interesting projects that could help you create your own backup solution:

Tarsnap - A very UNIXy solution. You run the "tarsnap" command on your important directories every night and it encrypts, compresses, and copies to the cloud only the bits of data that have changed since the last backup. Still, it costs $3.60 per GB per year. And lots of my files are probably incompressible.

Brackup - Similar to tarsnap, brackup will encrypt your files and only back up the bits of data that change each time. Written in Perl, it can send the files to lots of different places (backends) and you can always write some code if you need a custom "backend".
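To make "only back up the bits that changed" concrete, here is a minimal sketch of the general idea: split the data into chunks, name each chunk by its hash, and upload only chunks the store hasn't seen. This is not how tarsnap or brackup is actually implemented. The ChunkStore class is a hypothetical stand-in for a remote backend, the chunk size is tiny on purpose, and encryption and compression are left out.

```python
import hashlib

CHUNK_SIZE = 4  # tiny so the example is easy to follow; real tools use far larger chunks


def split_chunks(data, size=CHUNK_SIZE):
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Hypothetical stand-in for remote storage, keyed by chunk hash."""

    def __init__(self):
        self.chunks = {}
        self.uploaded_bytes = 0

    def put(self, chunk):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.chunks:  # only new chunks cost bandwidth
            self.chunks[digest] = chunk
            self.uploaded_bytes += len(chunk)
        return digest


def backup(data, store):
    """Store data and return a manifest: the ordered list of chunk hashes."""
    return [store.put(chunk) for chunk in split_chunks(data)]


def restore(manifest, store):
    """Rebuild the original bytes from a manifest."""
    return b"".join(store.chunks[digest] for digest in manifest)


store = ChunkStore()
v1 = b"aaaabbbbccccdddd"
m1 = backup(v1, store)
first_upload = store.uploaded_bytes

v2 = b"aaaabbbbXXXXdddd"  # one chunk changed since the first backup
m2 = backup(v2, store)
second_upload = store.uploaded_bytes - first_upload

print(first_upload, second_upload)
```

The first backup uploads all 16 bytes and the second uploads only the 4 bytes of the changed chunk, yet both versions can still be restored from their manifests. One limitation of fixed-size chunks: inserting a byte near the start of a file shifts every later chunk boundary, so everything after it looks new.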

A Perfect Solution

It would be great if someone could find some way to take advantage of the cost difference between storing my own data and storing it in the cloud. Say I installed a 1 TB hard drive at my house and used one-third of it for my own backup and dedicated two-thirds of it to storing backups for other people using some kind of peer-to-peer backup arrangement. It would, of course, have to encrypt my data before it gets sent. The big question mark is reliability. If my data is stored on computers that I don't control, then a program would have to continually monitor my backups to see if they disappear, and recreate them if they do.
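The monitoring part could look something like the sketch below. Everything here is hypothetical: Peer is an in-memory stand-in for someone else's computer, and the blob is assumed to be encrypted already. The check sends a fresh random nonce and asks each peer for the hash of the nonce plus the blob, so a peer can't pass by remembering an old answer after throwing the data away. It also assumes I still have my own copy, both to compute the expected answer and to re-upload.

```python
import hashlib
import os


class Peer:
    """Hypothetical remote machine that may lose data or go offline."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}
        self.online = True

    def store(self, key, blob):
        self.blobs[key] = blob

    def prove(self, key, nonce):
        """Answer a challenge with sha256(nonce + blob), or None if unavailable."""
        if not self.online or key not in self.blobs:
            return None
        return hashlib.sha256(nonce + self.blobs[key]).hexdigest()


def check_and_repair(key, blob, holders, spares, want=3):
    """Return peers with a verified copy, topped up from spares to `want` copies."""
    nonce = os.urandom(16)  # fresh for every check
    expected = hashlib.sha256(nonce + blob).hexdigest()
    good = [peer for peer in holders if peer.prove(key, nonce) == expected]
    for peer in spares:
        if len(good) >= want:
            break
        if peer.online and peer not in good:
            peer.store(key, blob)  # recreate the lost copy elsewhere
            good.append(peer)
    return good


peers = [Peer(f"peer{i}") for i in range(5)]
blob = b"already-encrypted backup bytes"
holders = peers[:3]
for peer in holders:
    peer.store("photos", blob)

holders[0].online = False  # this peer went offline
holders[1].blobs.clear()   # this peer lost the data

holders = check_and_repair("photos", blob, holders, spares=peers[3:])
print([peer.name for peer in holders])
```

Run on a schedule, a loop like this would keep the number of verified copies at the target as long as enough spare peers are reachable.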

Of course, if there were money or credits of some kind involved, then my compensation for storing other people's data could depend on the reliability of my storage and my network. I'd get paid for storing other people's data and I'd pay others to store my data. Then I could choose the proper price/reliability/redundancy arrangement for my needs.

Also Related: Distributed Hash Tables - A relatively recent invention that coordinates (in a peer-to-peer sense) the storing of data across lots of computers in a redundant, distributed way. Perhaps they could form the basis for some kind of P2P backup system.
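As a toy illustration of the placement idea behind a DHT, here is a sketch of consistent hashing: nodes and keys are hashed onto the same ring, and a key is stored on the next few nodes clockwise from its position. The node and key names are made up, and the sketch assumes every participant knows the full node list. A real DHT also has to route lookups without that global knowledge, which is not shown here.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a node name or key to a point on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, nodes):
        self.points = sorted((ring_position(node), node) for node in nodes)

    def holders(self, key, replicas=2):
        """Walk clockwise from the key's position and take the next nodes."""
        start = bisect.bisect_left(self.points, (ring_position(key), ""))
        count = min(replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]


nodes = ["alice", "bob", "carol", "dave", "erin"]  # hypothetical peers
ring = Ring(nodes)
keys = [f"chunk-{i}" for i in range(20)]

for key in keys[:3]:
    print(key, "->", ring.holders(key))

# If one node leaves, only the keys it was holding need a new home.
smaller = Ring([node for node in nodes if node != "carol"])
moved = [key for key in keys if ring.holders(key) != smaller.holders(key)]
print(len(moved), "of", len(keys), "keys need re-replication")
```

The appeal for backup is that placement is deterministic: anyone who knows the node list can work out where a chunk lives, and when a node disappears only the keys it held have to be copied somewhere new.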
