2014/06/22

RAID++: So, you can't afford the extra cost of Data Protection at $0.10-$0.20 per GB?

Summary: You and your business probably now depend on computers and smartphones/tablets for most of your daily work and other activities. If you don't pay up-front to protect your data, you'll pay for it many times over at a later date, when, not if, you have a drive fail and lose all data.

When storage is $0.20/GB (or even $1/GB), wages are $35-60/hour, and it will take a minimum of 1 day to reconstruct lost data, more likely a week+, spending a little money up-front for Data Protection seems prudent to me.

The 1987 Berkeley RAID paper was written at a time when few people had PCs and storage cost $40,000/GB in current dollars. The economics of swapping space for computation were compelling at the time. Nowadays, very few people have even $1,000 invested in Disk Storage, let alone $250,000.

Good desktops or laptops are now available in the $500-$1,000 range, with Commodity Drives costing $0.04-$0.10/GB and Enterprise Drives $0.12-$0.65/GB, more for high-spec variants. Times are very different: raw prices have fallen 500,000-fold, Bit Error Rates (BER/UBER) are up ~100 times, Mean Time Between Failures (MTBF) has increased 10-100 fold, raw read/write rates have increased 100-300 times, while access times (rotation & seek) are only 2-5 times different. Estimated disk utilisation has changed as well: at some point after 2000 the average drive went from 90%-100% full to ~75%, at least for Desktops. This suggests that drives are now "Big Enough" and not a System Constraint, at least not for Capacity. The advent of affordable, large Flash Memory with reasonable read/write speeds and uniform access times has removed one of the big constraints of storage: random I/O per second.

Researching RAID designs, I was surprised by I.T. Professionals and home users alike who baulk at the cost of reasonable Data Protection: even $100 for a single USB drive is too much, and a 4-drive NAS unit (<$1,500) is definitely "too expensive". Do they have such volumes of data that the cost of extra drives is overwhelming? Or is the data worth so little, or did it cost so little, that it's not worth protecting?
Data stored on home and micro-business drives is likely to be of three types, each with a different cost for Data Loss events:
  • Owner created content
    • What would it cost to recreate the data?
      • For business records, recapturing client data can be difficult.
    • Is there an Opportunity cost if the data is lost?
    • Are there attributable use or sentimental costs for lost data or records?
  • Licensed content
    • It may not still be available.
    • There may be high additional license fees to replace from the owner/vendors.
  • Free (downloaded) content
    • May no longer be available on-line.
As well, there is a real, attributable cost to downloading files in Australia:
  • the cost per GB in the ISP plan, especially via Mobiles with punitive excess data charges, and
  • the time cost of locating then downloading new copies of files.
The easiest and fastest means to good Data Protection for low-end applications is to use one of the many Internet storage services. This option isn't available to most homes and micro-businesses.

In Australia, with its poor Broadband options, upload speeds are constrained to 0.5-1Mbps with ADSL2, approx 100-400MB/hour, depending on ISP congestion and the Network Contention Ratios dimensioned in. Uploading or backing up just 1GB could take 10 hours, while 1TB, a small drive, could take over a year at the same rate, if the link stays up and the error rate isn't too high. Until guaranteed, low-contention upload over 40Mbps is available in Australia, on-line backups using Internet Storage Services, even "differential" ones, are not viable. A backup is different to storing pictures from your smartphone: it's a comprehensive, structured and continuing "second copy" of your precious data, with searchable indexes and per-copy contents lists. A full copy of your data isn't useful if you can't locate, or name, the files and folders you need back.
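The upload times quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, 1GB is taken as 1000MB, and the sustained rates of 100 and 400 MB/hour are the ends of the range quoted above, not measurements.

```python
# Rough upload-time estimate for a slow ADSL2 uplink (illustrative figures only).

def upload_hours(size_gb: float, rate_mb_per_hour: float) -> float:
    """Hours to upload size_gb at a sustained rate, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / rate_mb_per_hour


def line_rate_mb_per_hour(mbps: float) -> float:
    """Theoretical MB/hour for a link speed in megabits/second, ignoring all overheads."""
    return mbps / 8 * 3600


# Theoretical ceiling for the quoted 0.5-1Mbps uplink speeds.
for mbps in (0.5, 1.0):
    print(f"{mbps} Mbps uplink: at most {line_rate_mb_per_hour(mbps):.0f} MB/hour")

# Time to upload 1GB and 1TB at each end of the assumed sustained-rate range.
for rate in (100, 400):
    hours_1gb = upload_hours(1, rate)
    days_1tb = upload_hours(1000, rate) / 24
    print(f"{rate} MB/hour: 1GB in {hours_1gb:.1f} hours, 1TB in {days_1tb:.0f} days")
```

At the slow end of the range, 1TB needs more than a year of continuous uploading. Even at the fast end it is still months.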

Whilst Internet Storage Services may offer lots of space and reasonable levels of data protection, they have two "drop-dead" problems:
  • sudden and complete Data Loss due to non-technical problems. If they go bust, one day all your Data is gone for good.
    • You no longer own the Data: it becomes part of the assets of the distressed business, to be sold to the highest bidder for any purpose they desire.
    • No vendor I've seen offers a dedicated, clear-title asset for drives, but I haven't looked hard. The closest might be the many "dedicated mac mini hosting" services.
  • Even if you have warning of impending collapse, the "rush on the bank" effect means nobody will be able to retrieve more than a few hundred MB of data, if anything.
    • Like a bank, if there's a whiff of trouble, everyone will want their money, or data, back, right now!
    • A business will only be in trouble if it's pushing the financial envelope, so it will minimise expenses, providing "just enough" bandwidth and small per-user quotas.
    • These two factors, correlated infinite demand and thin pipes, constitute a very effective Distributed Denial of Service attack (DDoS). As everyone rushes to get their data, they block access to the very thing they want. If the hosting service allows unlimited connections and hasn't taken steps to limit congestion, then the link will be permanently saturated, but no useful traffic will flow. All connections will time-out and restart.
Australians, and probably many people overseas, need to fully curate and maintain their own data holdings. If your business, or just personal tax/income records, are on a computer, you need a minimum of three independent copies of the data, at least one of which needs to be stored at least 16km (10 miles) away in a secure, clean, dry, cool location.

If you run just external drives and a software equivalent to Apple's "Time Machine", and you have 1TB of data, a portable USB drive will cost ~A$75-$100. You need three, and you need to replace them: drives have a 5-year design life. At best, that's $250 every 5 years, or $50/year, around 1 hour 20 minutes at your wage rate, or 1 min 30 seconds per week, just for the drives. You then need to cost your time in swapping around those drives and transporting

Backups are only useful if you can restore from them. Is any home user or micro-business really going to spend time every 6-12 months confirming all drives are readable and without error? That's at least a 2-4 hour task. At any hourly rate, it gets expensive, quickly.

If you think "I can just use Flash Drives or DVDs", think again. You'll still need at least two copies of your data on top of your main copy, and one must be off-site and well maintained. At $1/GB for Flash (times two), you're up for $2,000 for 1TB, or $100 to store just 50GB, one or two hours of video. DVDs, even at $0.25/disk, are the same price as Flash: $1/GB. Both Flash and writeable DVDs "fade" over time: Flash chips wear out (fail to store data) and lose charge, while the dyes in DVDs fade. It makes sense to write all your accounts/income data to a set of Flash drives that you rotate every day, and to regularly replace all those drives.

That's a per-machine cost and doesn't include smartphones, tablets and music or video collections.
Most households now have multiple devices, per person, and run an ethernet network, either wireless, hard-wired or both. Low-end NAS, or Network Attached Storage, devices that connect to home networks come out at around $100/drive-bay plus drives (~$250 for 2-bays and $400 for 4-bays).

Many NAS appliances also have a USB port to attach additional drives. I've no idea if they can be used to create copies for off-site storage.

Using 3.5" drives at $150-$250 each (commodity or NAS-grade) gives you a second copy of your data, shared across all machines, for $500-$1000. With a 5-year drive life and 3% Annualised Failure Rate (20% extra for failed drives over 5 years), it's $125 to $250/year: or 4-8 hours of wages per year (2-4 hours for a business).
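The yearly figures above can be reproduced with a short calculation. This is a sketch under the assumptions stated in the text: a 5-year drive life, a flat 20% allowance for failed drives, and hourly rates of $34.09 (take-home) and $59.09 (cost to a business) from the Wage Rate Calculations at the end of this post. The function names are illustrative only.

```python
# Yearly cost of a NAS-based second copy, expressed in hours of wages.

TAKE_HOME_RATE = 34.09   # $/hour, take-home rate (from the wage table in this post)
BUSINESS_RATE = 59.09    # $/hour, on-cost rate to a business (same table)


def annual_cost(purchase: float, life_years: float = 5,
                failure_allowance: float = 0.20) -> float:
    """Purchase price plus an allowance for failed drives, spread over the drive life."""
    return purchase * (1 + failure_allowance) / life_years


def hours_of_wages(cost: float, hourly_rate: float) -> float:
    """How many hours of work at hourly_rate it takes to pay cost."""
    return cost / hourly_rate


for purchase in (500, 1000):
    yearly = annual_cost(purchase)
    print(f"${purchase} outlay: ${yearly:.0f}/year, "
          f"{hours_of_wages(yearly, TAKE_HOME_RATE):.1f} hours take-home, "
          f"{hours_of_wages(yearly, BUSINESS_RATE):.1f} hours for a business")
```

The results, $120-$240 per year, come out slightly under the rounded $125-$250 used above, and the hours land close to the 4-8 (personal) and 2-4 (business) quoted.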

That's the trade-off: unknown time spent recreating your records and files after losing a disk, always at an inconvenient time & not one of your choosing, or trade a half-day of wages each year against that eventuality.

If you have a hobby where you store and use large amounts of data, say Photography or Video, the chances are, you're already using an internal or directly connected RAID device, just to store everything and to get reasonable speed. You've most likely already experienced a Data Loss Event and will have developed a backup regime to suit your needs and budget.



Wage Rate Calculations

average full-time wage: $75,000/year
naive rate: $35.93/hour [40 hour week, 52 weeks/year]

take-home: $56,250/year
take-home hourly rate: $34.09/hour [37.5 hour week + 20 days vacation, 10 days public holidays, 5 days sick leave]
cost to business salary: $118,177.50/year [30% extra "on-costs"]
per-position yearly salary: $153,630.75/year [100% coverage of a position for 50 hours/week service]
on-cost hourly rate to business: $59.09/hour [37.5 hour week etc]

37.5 hour week: 1650 hours/year [4 weeks vacation, 10 public holidays, 1 week sick leave]
52 weeks @ 40-hours: 2087 hours/year
52 weeks @ 10hr-shifts, 5 days/week: 2600 hours/year
multiplier: 1.5757 [from 37.5 hours to 10-hour/day 5days/week coverage]
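
For anyone wanting to redo these sums with their own salary, the rates above can be recomputed as below. This is a sketch, not a payroll method. The 75% take-home fraction is inferred from $56,250 / $75,000, and the $59.09 rate is read here as salary plus 30% on-costs spread over 1650 hours. Both readings are assumptions, as are the variable names.

```python
# Recompute the hourly rates from the yearly figures listed above.

SALARY = 75_000.0          # average full-time wage, $/year
HOURS_37_5_WEEK = 1650     # hours/year after vacation, public holidays, sick leave
HOURS_40_WEEK = 2087       # hours/year for 52 weeks at 40 hours
HOURS_COVERAGE = 2600      # hours/year for 10-hour shifts, 5 days/week
TAKE_HOME_FRACTION = 0.75  # assumption: inferred from $56,250 / $75,000
ON_COSTS = 0.30            # 30% extra "on-costs"

naive_rate = SALARY / HOURS_40_WEEK
take_home = SALARY * TAKE_HOME_FRACTION
take_home_rate = take_home / HOURS_37_5_WEEK
on_cost_rate = SALARY * (1 + ON_COSTS) / HOURS_37_5_WEEK
multiplier = HOURS_COVERAGE / HOURS_37_5_WEEK

print(f"naive rate:          ${naive_rate:.2f}/hour")
print(f"take-home:           ${take_home:,.0f}/year")
print(f"take-home rate:      ${take_home_rate:.2f}/hour")
print(f"on-cost rate:        ${on_cost_rate:.2f}/hour")
print(f"coverage multiplier: {multiplier:.4f}")
```

Results can differ from the listed figures in the last digit, because the figures above appear to be truncated and not rounded.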
