What do you get if you stick nine SSDs in a RAID 0 (striped)? “Battleship Mtron”. It reaches a sustained read speed of 830 MB/s (megabytes, not megabits, although also not mebibytes), held back from the theoretically achievable 1100 MB/s only by the “enterprise” RAID controller. They actually transferred 1 GB in under four seconds and booted Vista in ten seconds.
There will be a brief recess for drool wiping.
Here’s the crazy part: this article is one and a half years old, posted in December 2007. Although these were the lauded (and expensive: nine drives ran $7000) Mtron 16GB SSDs, we have better and cheaper drives today.
What else is interesting about this? Today’s SSDs have a bad reputation only because the cells have a limited lifetime, which is slowly being extended by advances in flash technology and better OS support. SSDs, like most things solid state, very rarely fail for no reason; instead, they run out. RAID 0 using HDDs was neat but dangerous, since the risk of losing the array grew with every drive added. RAID 0 using SSDs means that the data is moved faster and, although the whole setup still bears the burden of each SSD’s risk of sudden failure, each individual drive is far less likely to wear out, since it gets half (or less) of the write traffic.
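The trade-off above can be put in numbers. As a rough sketch, assuming drives fail independently and with the same probability over some fixed period (both are simplifications, and the per-drive rates below are made-up illustrations, not measured figures), the chance that a stripe set loses at least one drive, and with it the whole array, is 1 - (1 - p)^n:

```python
def raid0_failure_probability(per_drive_failure: float, drives: int) -> float:
    """Chance that at least one of `drives` drives fails.

    Assumes independent failures with the same probability per drive.
    In RAID 0 a single failed drive takes the whole array with it.
    """
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    return 1.0 - (1.0 - per_drive_failure) ** drives


if __name__ == "__main__":
    # Hypothetical per-drive failure rates, chosen only for illustration.
    for label, p in (("HDD (assumed 3%)", 0.03), ("SSD (assumed 1%)", 0.01)):
        for n in (1, 2, 9):
            risk = raid0_failure_probability(p, n)
            print(f"{label}, {n} drive(s): {risk:.1%} chance of losing the array")
```

The point is only the shape of the curve: the array risk climbs with every drive added, so a lower per-drive rate buys proportionally more headroom for wide stripes.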
According to this article, flash SSDs’ ‘limited lifetime’ is FUD. Would be nice to have the industry get to the bottom of it.
http://www.storagesearch.com/ssdmyths-endurance.html
By Glenn Rempe · 2009.05.30 18:31
I know that the effect is overstated because the lifetime is often given for each cell, and not every cell is written to as often. I have read somewhere, although I can’t find a source now, that a few percent are dedicated backup cells for when “primary” cells fail. In the face of both of these, the lifetime will in fact be dramatically longer than an HDD’s given non-exceptional and evenly distributed load.
Intel’s X-25 supposedly can last for five years provided that the load on the drive doesn’t exceed 100 GB daily (I think it’s the guarantee they give to the OEMs). With explicit recognition and support for SSD in modern operating systems, it seems that it’s likely a non-issue; but X-25 is a high-end drive that would still go under dramatically faster with heavy sustained loads (even if, in all honesty, I don’t think those that are likely to buy X-25s will buy just one and use it for five years). That leaves the question of how lower-end drives hold up, and actually of how high-end drives would hold up against heavy logging.
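To get a feel for what the quoted figure implies, here is a back-of-the-envelope sketch. It treats the “100 GB daily for five years” rating as a fixed total write budget, which is an assumption made for illustration and not something the vendor states; it also uses decimal units and ignores leap days. The 500 GB/day load is a hypothetical example of heavy logging.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def total_writes_tb(gb_per_day: float, years: float) -> float:
    """Total data written, in decimal terabytes, at a steady daily load."""
    return gb_per_day * DAYS_PER_YEAR * years / 1000.0


def years_until_exhausted(budget_tb: float, gb_per_day: float) -> float:
    """Years a fixed write budget lasts at a steady daily load.

    Assumes the rating behaves like a simple budget, which is a
    simplification for illustration only.
    """
    if gb_per_day <= 0:
        raise ValueError("gb_per_day must be positive")
    return budget_tb * 1000.0 / (gb_per_day * DAYS_PER_YEAR)


if __name__ == "__main__":
    budget = total_writes_tb(100, 5)
    print(f"100 GB/day for 5 years: {budget} TB written in total")
    # Hypothetical heavy load: five times the rated daily volume.
    print(f"Same budget at 500 GB/day: {years_until_exhausted(budget, 500)} years")
```

Under these assumptions the rating works out to 182.5 TB of writes, and a load five times heavier would use that up in one year instead of five.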
I’m mostly over my worries. I was a reluctant non-believer around two years ago since the capacity was not at all there (it still trails by a lot, but today’s disks are so ginormous that unless you’re editing HD videos you’ll likely be fine) and since the cell lifetime was shorter.
By Jesper · 2009.05.30 19:18
Yes, the “your SSD will wear out” claim is FUD, but concern over wear-leveling isn’t, if only because the wear-leveling and block erasure algorithms on cheap Taiwanese SSDs are terrible.
RAID is increasingly an awful idea (speaking as a guy with two large arrays (6 and 8 drives)) — not because of catastrophic failures, but because of pointless bullshit ones!
Random catastrophic failures in modern 1TB desktop drives are exceedingly rare, almost to the point of not being worth worrying about at all. Unfortunately, bullshit CRC and disk controller errors happen on a regular basis.
Only one in a billion block reads might result in a CRC error, but with multiple 1TB drives you have nearly a 100% chance of hitting one. It might not matter for your actual data that 512 bytes are missing (because who cares if a single frame of video is slightly molested?), but your hardware RAID controller is sure to freak the fuck out!
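The “nearly 100%” figure is easy to sanity-check. The sketch below assumes decimal terabytes (10^12 bytes), 512-byte blocks, errors that are independent from one block read to the next, and a single full read of every drive (as in an array rebuild); all four are simplifying assumptions, and only the one-in-a-billion rate and the drive counts come from the comment above.

```python
import math

BLOCK_BYTES = 512          # assumed block size
DRIVE_BYTES = 10 ** 12     # assumed 1 TB in decimal bytes
ERROR_RATE = 1e-9          # one CRC error per billion block reads


def chance_of_crc_error(drives: int, error_rate: float = ERROR_RATE) -> float:
    """Chance of at least one CRC error when every block is read once.

    Assumes independent errors per block read.
    """
    if drives < 1:
        raise ValueError("drives must be at least 1")
    blocks = drives * (DRIVE_BYTES // BLOCK_BYTES)
    # 1 - (1 - rate)^blocks, computed via log1p/expm1 to keep precision
    # when the rate is tiny and the block count is huge.
    return -math.expm1(blocks * math.log1p(-error_rate))


if __name__ == "__main__":
    for n in (1, 6, 8):
        print(f"{n} x 1TB: {chance_of_crc_error(n):.4%} chance of a CRC error")
```

Under these assumptions a single full pass over one drive already has a better-than-even chance of tripping an error, and with six or eight drives it is a near certainty.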
Disk controller errors are even more ridiculous: they manifest as random pauses that stop all IO for 100ms to several seconds. It’s annoying with a single drive, but with RAID the controller will consider the drive dead and degrade the array. For no good reason! Instead of working to preserve your data’s integrity, the controller has gone and fucked you over, because its presumptions as to how disks fail are way out of date. All the currently available drives >1TB do this pausing incessantly (it’s why they aren’t ‘enterprise’ yet), and all the non-enterprise SSDs do too.
Basically, striping a bunch of nice SSDs together will keep them from wearing out as soon, but your chance of a single pause wrecking the whole thing goes up with every drive you add; it’s just not worth it. The real trick is to keep the striping well above the block level (and the filesystem level, depending), so that the good intentions of your hardware and kernel don’t fuck everything up.
Both CRC errors and drive pauses happen to me multiple times a year, and often crop up again while rebuilding the array. I am never using RAID again — single disks are big enough now, so I’m just going to use them as independent volumes and use rsync. COW filesystems like ZFS and btrfs are a step in the right direction, but not all the way.
By Fred Blasdel · 2009.05.31 08:01
I can see how cheap RAID5 (and above) controllers could be unreliable, just as cheap SSDs could have unreliable wear-levelling. But isn’t RAID0 easy enough that at least the good controllers should have gotten it exactly right by now? Is it poor drives that generate CRC errors on read in the exceptional case?
Or is this really simply a case of big numbers making unlikely things very likely? Like how cosmic rays will eventually zap a bothersome number of bits on machines with enough memory, and therefore ECC RAM is not complete bullshit in boxes with, say, 32 GB.
By Jesper · 2009.05.31 10:46
I don’t have cheap RAID5 controllers — they’re both high end 3ware cards! The newer one at least has the option to ignore CRC errors during a rebuild.
Hardware RAID controllers are pretty universally stupid; they rarely even get RAID1 right! Almost none will parallelize seeks across the disks to boost read bandwidth. Linux’s md does this (and other seemingly obvious optimizations) perfectly.
It’s exactly a case of really big numbers with the CRC errors, where the size/density of the disks and the hostility of the universe combine in a swirl of entropy.
For the disk controller errors, it’s a classic case of software development FAIL. It’s not just the shitty Taiwanese shops churning out cheap SSDs either: all of Seagate’s recent very-large drives have had ridiculously awful firmware.
By Fred Blasdel · 2009.06.01 08:12