Can cloud computing be used as a backup system?

There are lots of good definitions for what cloud computing is, but this week’s question focuses more on whether or not cloud computing can provide a specific function in a specific scenario.  The answer is yes, maybe.

For some people, cloud computing is simply another way of saying that you’re storing information on someone else’s computer.  The real issue is bandwidth and the method of data transfer.  Andrew Tanenbaum said, “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway,” and as of 2007 FedExing hard drives was still faster than using the internet to transfer the information.  One of the recent technologies that helps make large data transfers more feasible is the adoption of ZFS replication.
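To put rough numbers on the station wagon comparison, here is a minimal back-of-the-envelope sketch.  The 10 TB data size, 100 Mbit/s uplink and 24-hour courier time are made-up assumptions for illustration, not measurements, and the function names are my own.

```python
def network_hours(size_bytes: float, link_mbps: float) -> float:
    """Hours needed to push size_bytes over a link of link_mbps (megabits/s)."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


def courier_mbps(size_bytes: float, transit_hours: float) -> float:
    """Effective bandwidth (megabits/s) of shipping size_bytes in transit_hours."""
    bits = size_bytes * 8
    return bits / (transit_hours * 3600) / 1_000_000


if __name__ == "__main__":
    ten_tb = 10e12  # assumed data size: 10 terabytes
    print(f"Over a 100 Mbit/s link: {network_hours(ten_tb, 100):.1f} hours")
    print(f"Overnight courier: {courier_mbps(ten_tb, 24):.1f} Mbit/s effective")
```

Under those assumptions the network transfer takes about 222 hours (a little over nine days), while the overnight box of drives works out to roughly 926 Mbit/s.  Plug in your own numbers; the crossover point depends entirely on your uplink.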

When talking about this problem for small amounts of data the answer is a very easy yes, but as the data size grows the question becomes more technically involved.  The syncing/offloading tool is crucial to a successful solution.  At home I use BitTorrent Sync to synchronize hard drives for backup purposes.  It provides an automated way of moving the data between machines that I control.  Syncthing hasn’t quite matured but should be mature enough to trust in production later this year.  ownCloud is mature enough for CERN’s dataset, but I’ve had issues with it in the past, possibly due to its integration of csync’s features.  After losing data I’m not ready to trust it again yet.  These tools are a bit different from Google Drive, Dropbox, Carbonite and the other tools on the market because they can be used on one’s own off-site systems and not just on someone else’s cloud.

For example, my brothers and I are spread across several time zones, and we’ve used these tools at times to synchronize a drop folder among our houses.  In those scenarios my brothers’ computers formed the cloud, not some external third-party service that routinely gets requests from the government to share information stored on its hard drives.  If it makes anyone feel better, Google for business does advertise HIPAA compliance, and they did side with Apple over the San Bernardino phone.  I bring this up to point out a key dimension of storing information in the cloud.  Unless it’s your cloud, you don’t truly control your data.

Bit for bit, the most efficient way I’ve been able to find to transfer the information over the internet is ZFS replication.  If the hard drive is encrypted then the transfer will be encrypted.  If the cloud you’re storing it to is yours then the data will be yours when you need it.  That being said, even the most ninja-like ZFS gurus still put hard drives in their cars from time to time.  So the answer is yes, but just out of curiosity, what sort of gas mileage do you get?
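For anyone who hasn’t seen ZFS replication, an incremental run boils down to piping `zfs send -i` into `zfs receive` on the other machine.  The sketch below only builds that command line as a string so you can inspect it before running anything.  The pool, snapshot and host names are hypothetical, and it assumes the two machines are connected over ssh.

```python
import shlex


def replication_pipeline(dataset: str, prev_snap: str, new_snap: str,
                         remote_host: str, remote_dataset: str) -> str:
    """Build (but do not run) an incremental ZFS replication pipeline.

    Only the blocks that changed between prev_snap and new_snap are sent.
    """
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Assumption: the receiving side is reached over ssh.
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    quote = lambda parts: " ".join(shlex.quote(p) for p in parts)
    return f"{quote(send)} | {quote(recv)}"


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(replication_pipeline("tank/data", "monday", "tuesday",
                               "backup.example.net", "vault/data"))
```

Building the string first instead of executing it is deliberate: a replication command pointed at the wrong dataset is not something you want to discover after the fact.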

As an aside, this subject is extremely relevant to the recent issues with hospitals becoming victims of ransomware.  A good backup and restore model can completely negate the ransomware model currently being used.

Scenario.  You run all of your data on servers running ZFS, set up to create snapshots at 15-minute intervals (think of a snapshot as a really efficient incremental backup: it only records the blocks that have changed and doesn’t rewrite all the data blocks).
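As a rough sketch of what a 15-minute schedule might look like, the helper below rounds the current time down to the interval and builds a `zfs snapshot` command from it.  The dataset name and the `auto-` naming scheme are my own assumptions; most people would let an existing snapshot scheduler handle this rather than roll their own.

```python
from datetime import datetime
from typing import List


def snapshot_name(dataset: str, now: datetime, interval_minutes: int = 15) -> str:
    """Name a snapshot after the start of the interval that 'now' falls in."""
    floored = now.replace(minute=now.minute - now.minute % interval_minutes,
                          second=0, microsecond=0)
    # Hypothetical naming scheme: <dataset>@auto-YYYYMMDD-HHMM
    return f"{dataset}@auto-{floored:%Y%m%d-%H%M}"


def snapshot_command(dataset: str, now: datetime) -> List[str]:
    """Argument list for 'zfs snapshot'; returned rather than executed."""
    return ["zfs", "snapshot", snapshot_name(dataset, now)]


if __name__ == "__main__":
    # "tank/data" is a made-up dataset name.
    print(" ".join(snapshot_command("tank/data", datetime.now())))
```

Run from cron or a timer every 15 minutes, this would give each snapshot a sortable, timestamped name, which matters when you later need to find the last one taken before something went wrong.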

The bad actor gets in through a vulnerability and encrypts the hard drive.  You catch it when you get an email a few hours later.  Business is halted for a few hours while you find the vulnerability he exploited to get in.  You close the vulnerability and restore from the latest known good snapshot.  No need to pay the ransom.  Only a few hours of data lost.  You’re welcome.
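Picking the “latest known good snapshot” is just a matter of finding the newest one taken before the break-in.  Here is a minimal sketch, assuming you already have the snapshot names with their creation times and a best guess at when the attacker got in.  The snapshot names and timestamps are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def latest_good_snapshot(snapshots: Dict[str, datetime],
                         compromised_at: datetime) -> Optional[str]:
    """Return the newest snapshot taken strictly before the compromise."""
    good = [(taken, name) for name, taken in snapshots.items()
            if taken < compromised_at]
    return max(good)[1] if good else None


def rollback_command(snapshot: str) -> List[str]:
    """Argument list for 'zfs rollback'; returned rather than executed.

    The -r flag also destroys any snapshots newer than the target,
    which here are the ones holding the encrypted data.
    """
    return ["zfs", "rollback", "-r", snapshot]


if __name__ == "__main__":
    # Made-up snapshot list and compromise time.
    snaps = {
        "tank/data@auto-20160301-1000": datetime(2016, 3, 1, 10, 0),
        "tank/data@auto-20160301-1015": datetime(2016, 3, 1, 10, 15),
        "tank/data@auto-20160301-1030": datetime(2016, 3, 1, 10, 30),
    }
    target = latest_good_snapshot(snaps, datetime(2016, 3, 1, 10, 20))
    if target:
        print(" ".join(rollback_command(target)))
```

If you aren’t sure exactly when the attacker got in, err on the early side.  Losing an extra 15 minutes of data beats restoring a snapshot that already contains the damage.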

Am I missing something?  Was this helpful? Feel free to leave a comment below.