In a spin about not spinning …

Posted in General, Storage Applications, Platforms, Storage Interconnects & RAID, Storage Management, Application Environments, Advisor - Neil by Neil

I hope people appreciate how long it takes me to come up with these snappy headlines.

Just spent a few days at Computex in Taiwan … wading through a mountain of new equipment, motherboards, memory modules and especially, importantly and all-pervasively … Solid State Drives (SSDs).

They were everywhere. It seems that there are thousands of these things on the market, ranging from well-known to never heard of. Of course, they are flavour of the month and have a very large target audience, but it was amazing to see how many brands are on the market.

Now my focus is on storage, especially in the server arena. While I would have loved someone to give me one of these babies for my laptop, that didn’t seem to happen, so I whiled away my time asking questions of these manufacturers. Now to be fair, there were a lot of very pretty young ladies trying to sell product who I don’t think were product experts, but there were also the supposed “product experts” hanging around the back of the stand trying not to be bored to death by several hundred thousand tyre-kickers, so there was knowledge there if you asked hard enough questions.

Now what sort of questions would be “hard” questions for the SSD industry? Simply (a) are you using SLC or MLC technology, (b) what are your performance figures and (c) how long do you expect this drive to last in my laptop and an average database server (maybe a 50-user SQL environment).

Question A generally drew some blank looks, but reading the tech specs found the answer (which was, to a very large degree, MLC). Question B was easy … everyone has amazing figures on display (take all with a dose of salt) so there’s no shortage of amazing claims out there.

Question C had people running for cover. “Oh, we are not targeting the server market” was the well-practiced standard response from most manufacturers’ representatives. However, when we got to discussing their view of the future drive market, all seemed of the opinion that SSD would certainly take over the SAS market, and that SATA would come under threat as the size of SSDs increases and the price comes down.

Great, so I want to build a database server. Yesterday I would have used 4 x 15K RPM SAS drives in a RAID 10 configuration to get my best combination of read and write performance (especially on small writes) and pretty-well-sorted reliability. Today the “new age” drive industry would have me using 4 x SSD drives in the same config with amazing IOP and throughput figures.

So I build my system with SSD … how long will it last? I am aware of the long write-life of SLC technology, but it also has an amazing cost involved, so many people will not use those drives, but opt for what seems a better bargain (which happens to be MLC drives). Now I don’t believe MLC has anywhere near the write life of SLC, so exactly how long will this server last before things start going terribly wrong with my storage?

With my SAS investment I could easily expect 3 years, and more likely 5 years good use (on average) from my expensive, hot, expensive to run and heavy SAS drives, but what will MLC give me? That one I could not get an answer for.

Now the SSD drive experts and manufacturers out there will read this and immediately start yelling at me that I’m using the drive outside its intended usage pattern, and that I should be using SLC technology. However, these are the same people telling customers that they should use SSD for everything, omitting the relatively important information about drive life when selling the product, and making amazing claims to customers about size and speed while glossing over the underlying technology questions.

Don’t get me wrong … I think SSD is great, it’s here to stay and I want one sooner rather than later, but there are a lot of variables out there with these drives, and Joe Public doesn’t seem to have much of a clue.

It’s either “buyer beware” or do your research first.

Ciao
Neil

12 Responses to “In a spin about not spinning …”

  1. Ernst Lopes Cardozo Says:

    Neil,
    I have two questions. 1) Why would you need four SSDs instead of two? They should be fast enough without striping. 2) What do you expect to pay for the replacement drives when the first ones are exhausted after one or two years?

  2. Neil Says:

    Ernst,

    Why four SSD instead of two? Good question (in other words I’m stumped). You’ll probably need more than two to cover the capacity requirements of your database. In fact you’ll probably need a lot more than two to get close to the previous SAS capacities you are used to.

    As for performance? Again a good question. Will I get better performance from 4 SSDs than from 2 SSDs? In fact I think I will. The principle is the same as platter drives … the more spindles (or drives in the SSD case) you have, the less data in general you will write to each drive. Of course if you happen to write a small enough block of data that it fits on one stripe on one drive then there is no advantage. When someone gives me 4, 6, 8, 10 and 12 Intel X25E drives to play with I’ll tell you :-)

    As for “what do I expect to pay for replacement drives”? - zip. A quick check on Intel’s website reveals a 3-year warranty on these drives so they’re confident. As for the rest of the manufacturers it brings up a good point … users should be very careful as to what warranty the vendor is providing with these as yet unproven drives.

    Ciao
    Neil

  3. Pete Steege Says:

    Beautiful synopsis of the state of SSD today. It’s a great technology, but not yet a mature product. There’s a huge difference between these two things. SSD performance can’t make up for product immaturity. Those who use storage in substantial endeavors get this.

    The storage device industry is rapidly moving to make SSD into an IT-ready product, one that matches the compatibility, standards and capabilities of today’s disk drives.

    Not there yet.

  4. Neil Says:

    Pete,

    I agree with your points that SSD is not yet a mature technology. In fact it may never be “mature” in that the memory market has been one of the most dynamic segments of the industry for many years and SSD (basically) falls into that category.

    There are also now many more players in the “drive” industry with the advent of SSD so we are likely to continue to see a lot of innovation … some good, some bad, some etc.

    My only disagreement is with your last statement (slightly and very picky on my part) … I hope the storage device industry is aiming a lot higher than just matching compatibility, standards and capabilities … this is a chance for major advancement in storage, not just another player on the already-level playing field.

    Ciao
    Neil

  5. Ernst Lopes Cardozo Says:

    Neil,
    I would answer my own questions differently. Your 4-drive 15k RPM SAS RAID10 array would give a theoretical performance of 4 x 180 read IOPS and 2 x 180 write IOPS, 720/360. Current SSDs do 5 to 10 times that. So for performance we need no more than a single SSD. For security we might well choose 2 SSDs in a RAID1 configuration. As to capacity, the largest 15k RPM drives I’ve seen are 600 GB, so depending on the size of your database, it may be necessary to have multiple SSDs. 256 GB units are readily available and larger units, up to 1 TB, have been announced.
    The write endurance for current multi-level flash is 10,000 cycles per page. As geometry shrinks this number is decreasing to about 5000. Now let’s calculate how long it takes to exhaust our initial set of SSDs. Assuming a decent SSD controller, we have wear-leveling, so we can rewrite the entire unit 5000 times. That is 5000 times 256 GB, or 1280 TB.
    It takes some time to write that much. Worst case, your DB application does nothing but write as fast as your 4-drive SAS array would let it. That is 360 writes per second of say 8 kB each. My calculator tells me that this amounts to 85 TB per year (running 24×365), so even in this crazy scenario it will take 15 years to wear down my SSDs.
    As SSDs get larger, wear leveling makes them last longer, even if the per-page endurance goes down a bit.
    But should you fear the day that your SSDs are ‘consumed’? Not at all. Of course, not all pages in an SSD last exactly 5000 cycles. Some fail earlier, many fail later. If a page fails, it is remapped to one of the spares. The controller can tell exactly how many spares have been used. And since it must track the number of erase-write cycles of each page to do the wear-leveling, it knows how much life is remaining. I expect that information to become available, e.g. through SMART, so that you can replace your SSDs well before they run out.
    If you do that in say 5 years’ time (playing it very safely), what would the replacement drives cost? About $1. (30% price decline amounts to a factor 400 in 5 years.) Except that you probably have to settle for a slightly more expensive 10 TB unit. Replacement cost is just no issue.
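
The arithmetic in this comment can be sketched in a few lines of Python. This is only a rough model under the comment's own assumptions (180 IOPS per 15k drive, a 256 GB SSD, 5000 cycles, 8 kB writes, perfect wear-leveling); the function names are made up for illustration, and it uses decimal units throughout (1 kB = 1000 bytes), so it lands at roughly 91 TB per year and about 14 years rather than exactly 85 TB and 15 years. The gap most likely comes down to decimal versus binary units.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # running 24x365


def raid10_iops(drives, iops_per_drive):
    """Theoretical (read, write) IOPS of a RAID 10 array.

    Reads can be served by every drive; each write lands on both
    halves of a mirror, so only half the drives add write IOPS.
    """
    return drives * iops_per_drive, (drives // 2) * iops_per_drive


def write_budget_tb(capacity_gb, cycles):
    """Total data writable before wear-out, assuming ideal wear-leveling."""
    return capacity_gb * 10**9 * cycles / 10**12


def yearly_writes_tb(write_iops, write_size_kb):
    """Data written per year at a constant, write-only load."""
    return write_iops * write_size_kb * 1000 * SECONDS_PER_YEAR / 10**12


def years_to_wear_out(capacity_gb, cycles, write_iops, write_size_kb):
    """Years until the write budget is used up at the given load."""
    return write_budget_tb(capacity_gb, cycles) / yearly_writes_tb(
        write_iops, write_size_kb
    )


if __name__ == "__main__":
    reads, writes = raid10_iops(drives=4, iops_per_drive=180)
    print("RAID 10 IOPS (read/write):", reads, writes)
    print("Write budget (TB):", write_budget_tb(256, 5000))
    print("Writes per year (TB):", round(yearly_writes_tb(writes, 8), 1))
    print("Years to wear out:", round(years_to_wear_out(256, 5000, writes, 8), 1))
```

Changing the cycle count or capacity shows how sensitive the estimate is: halving the endurance halves the lifetime, while doubling the capacity doubles it.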

  6. Neil Says:

    Ernst, Wow … you’ve put a lot of thought into this, however:

    A lot of my issues surrounding the drives available today are based around the SLC/MLC technology used in those drives. What I took from my discussions with a major player (and I’m not going to get into brand discussions) is that SLC technology flies, lasts a long time, and its reads and writes beat platter drives in both streaming and random … in other words they are the duck’s guts. Their only problem? Price and capacity.

    Now Computex is not necessarily the best place for technical discussions, but it became very clear that MLC is the dominant technology being used in SSD because of those two issues … price and capacity. However, MLC has some pretty bad write and lifetime characteristics. Again based on discussions with my major player, they gave me MLC performance characteristics with IOPS actually lower than my SAS drives (by a very small amount), and enterprise write endurance a lot, lot less than the 1280 TB you calculated.

    Therefore, before we can have any real technical discussion on performance and life of drives we need to be very, very clear whether we are talking about SLC or MLC technology … and that was the initial bent of my blog … vendors out there are not clearly saying what they are selling.

    Ciao
    Neil

  7. Andrew Says:

    I have shared the same view for some time and have expressed this to partners and clients. In our data recovery lab we now are seeing an increase in failed SSD drives. People assume that because SSD drives no longer have any spinning platters, that therefore failure would not occur. Unfortunately this is not the case.

    Best,
    Andrew
    http://www.datanalyzers.com

  8. Neil Says:

    Andrew,

    “Seeing an increase in failed SSD drives” raises the question … is this because they are failing more often than their platter counterparts, or because there are more of them on the market to fail (than there were previously)?

    We’ll never really know the failure rates of these devices. It’s not something that the manufacturers are ever going to tell us and if they did we’d have to take the numbers with a large dose of salt (lies, damned lies and statistics).

    Just because something is faster does not necessarily make it better … just faster.

    Ciao
    Neil

  9. Ernst Lopes Cardozo Says:

    Neil, Andrew,
    What I was trying to say is: data on SSDs are just as vulnerable as on any other storage device, so backups, mirroring or parity are just as necessary as ever. The ‘wear’ factor though is controllable: it should not bite you economically or suddenly eat your data. My calculation was for MLC, using very worst case assumptions: 24×7 100% write at the speed of the ‘replaced’ disk configuration. Even then, the SSDs will last longer than the rest of the system. Plus, if they should wear out after a couple of years, the replacement drives will be so cheap that you shouldn’t care. Do you still worry today that the 128 MB USB stick you bought in 2005 for $100 will wear down?
    Of course we will know the true MTBF for SSDs, just as we got real-life numbers for disk drives from studies by Google and UCLA. No need to rely on data sheet numbers.
    SSDs are different from disk drives, not just drop-in replacements. We will have to adjust some of our habits and best practices. There will be less need to stripe data over many units to gain performance; reducing delays in the rest of the storage stack will become an issue. For databases, clustered servers with local SSDs are an attractive option, because of performance, cost and flexibility.

  10. Neil Says:

    Ernst,

    I love reading your comments, because you obviously think about things before you just blunder down on paper (I must learn to do that sometime). You have extremely valid points, and I again point out that I love this technology and desperately want someone to give me a few of these things to try.

    However … my issue is not so much with the technology, but with the way it’s marketed and sold. Is SSD the final answer? - No. Will it solve all problems? - No. Will it last very long in its current form? - I seriously doubt it.

    I was just reading over lunch about PCM (Phase Change Memory), which looks like it will have some serious impact on both memory and I think storage in years to come. This, and I’m sure many other examples of new technology, will mean that we keep moving forward with bigger, faster, smaller, lighter, cheaper (etc etc etc) storage devices.

    As long as manufacturers tell us the truth, market their products in the right place, are upfront about both the strengths and weaknesses of their products and don’t hype it too much (in other words cut the b/s), then we don’t have too much to fear from what’s coming along.

    Then again, without marketing teams I’d have nothing to write about.

    Ciao
    Neil

  11. Ernst Lopes Cardozo Says:

    Hi Neil,

    I need to correct my previous calculations on SSD durability. A smart lady explained to me the concept of “write amplification” in SSDs. If the application writes an 8k block, the SSD controller must update that 8k section of a much larger page. The page is the basic unit of the particular SSD that can be erased, sometimes called an ‘erase block’. So the controller reads the entire page, updates the 8k part, then erases and rewrites the full page. In effect, this 8k write has resulted in a much larger rewrite, hence the write amplification.

    The effect depends on both the page size of the device and the average size of write commands. For database applications, 8k seems a realistic write size. Page sizes for current SSDs are 128k or even larger. So the write amplification could well be 128/8=16. My previously calculated write endurance for the SSD drives of 1280 TB goes down to 80 TB and the life expectancy of 15 years becomes 11 months. But remember this was for a 100% write, 24×7 application. Under more realistic assumptions, this system would still run long enough to make replacement drives very cheap. So I maintain that it already makes sense to build database servers with internal SSDs instead of 15k rpm drives or SAN connections.

    Take care,
    Ernst
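
The write amplification correction above can be sketched the same way. This is a simplified worst-case model, assuming every host write forces a rewrite of one full erase block; real controllers buffer and combine writes, so actual amplification is usually lower. The function names are illustrative only, and the inputs (1280 TB raw endurance, 85 TB written per year) are taken from the earlier comment.

```python
def write_amplification(erase_block_kb, write_kb):
    """Worst-case amplification: one full erase block rewritten per host write.

    Writes at least as large as an erase block are treated as unamplified.
    """
    return max(1.0, erase_block_kb / write_kb)


def effective_endurance_tb(raw_endurance_tb, amplification):
    """Host data that can be written once amplification is accounted for."""
    return raw_endurance_tb / amplification


def lifetime_months(endurance_tb, yearly_writes_tb):
    """Months until the endurance is used up at a constant write load."""
    return 12 * endurance_tb / yearly_writes_tb


if __name__ == "__main__":
    wa = write_amplification(erase_block_kb=128, write_kb=8)
    endurance = effective_endurance_tb(raw_endurance_tb=1280, amplification=wa)
    print("Write amplification:", wa)
    print("Effective endurance (TB):", endurance)
    print("Lifetime (months):", round(lifetime_months(endurance, 85), 1))
```

With a 128k erase block and 8k writes the factor is 16, which is what turns the 15-year estimate into roughly 11 months under the same 100% write load.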

  12. Neil Says:

    Ernst,

    The devil is in the detail. Yes, I’d agree that using multiple SLC drives would probably be the best way to go for database servers today. I’d be mighty suspicious of the performance of MLC drives because of their slow random write speeds.

    Ironically, when using SLC drives “life” of the equipment is not an issue … they will last a long, long time. So back to my original bent … make sure you know what you are purchasing when buying SSDs. There are many, many variables out there.

    Ciao
    Neil
