

I lost a Terabyte!

Posted in General, Storage Management, Advisor - Joe Disher by Joe Disher

I was recently talking with a customer, and during the discussion they threw out the outrageous claim that when they formatted their NAS device with RAID5, they lost almost a Terabyte of storage capacity. Naturally, I was skeptical of this claim. How could that be? So I started to investigate.

The first thing I discovered was that this was a 3.2TB, 8-drive device, so the system was using 400GB drives. Well, that accounts for almost half of the lost space, since the equivalent of one drive’s entire capacity is used for RAID parity. What about the other 600GB? They had also allocated 10% of the usable space for snapshots. That’s less than 300GB, so I’ve still not accounted for more than 300GB… or have I?

The problem is all in the math. Most of us who have been in the storage industry know that the way the hard drive vendors “market” drive capacities is fundamentally inaccurate. If you check any drive manufacturer’s datasheet, you will see a footnote at the bottom stating that “1,000,000,000 bytes equals 1 Gigabyte”. Consumers generally don’t pay close attention to that fact, and on smaller drive capacities the difference just looks like a rounding error. Unfortunately, as drive capacities have gotten larger, the disparity between what is “marketed” and the real capacity of a drive has grown and become more noticeable.

Anyone who has spent as little as one week in a computer class knows that the above math doesn’t work. A refresher on what a Gigabyte really is:
1024 bytes = 1 Kbyte
1024 x 1024 bytes or 1,048,576 bytes = 1 Mbyte
1024 x 1024 x 1024 bytes or 1,073,741,824 bytes = 1 Gbyte

So, what’s the “real” capacity of a 400GB drive? Let’s do the math with the new knowledge that, for drive manufacturers, 1 GByte is really 1 Billion bytes:
A drive marketed as 400GB contains (about) 400,000,000,000 bytes
400,000,000,000 bytes / 1024 = 390,625,000 Kbytes
400,000,000,000 bytes / 1024 / 1024 = 381,469.726… Mbytes
400,000,000,000 bytes / 1024 / 1024 / 1024 = 372.529… Gbytes
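The same conversion can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name `marketed_gb_to_binary_gb` is made up for this example, and it assumes the drive holds exactly the marketed number of bytes:

```python
# Bytes per "marketed" (decimal) gigabyte and per binary gigabyte.
BYTES_PER_DECIMAL_GB = 1000 ** 3   # 1,000,000,000
BYTES_PER_BINARY_GB = 1024 ** 3    # 1,073,741,824


def marketed_gb_to_binary_gb(marketed_gb):
    """Convert a marketed capacity (10^9 bytes per GB) to 1024-based Gbytes."""
    total_bytes = marketed_gb * BYTES_PER_DECIMAL_GB
    return total_bytes / BYTES_PER_BINARY_GB


for size in (250, 400):
    print(f"{size}GB marketed = {marketed_gb_to_binary_gb(size):.3f} Gbytes")
```

For a 400GB drive this comes out to roughly 372.529 Gbytes, matching the hand calculation.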

Also, understand that the drive companies round up a touch, so the drive may well be even smaller than 372.52902984619140625 Gbytes. That’s at least a 27 Gbyte difference between the “real” capacity and the “marketed” capacity per drive.

For the NAS box of the customer I was talking to, this means they have at least 216 Gbytes (8 drives x 27 Gbytes) LESS raw capacity than they thought they were getting! Let’s do the math for this customer scenario knowing what we now know:
400GB = (about) 372.53GB
8 drive RAID 5 = (372.53 x 8 drives) - 372.53 (parity space) = 2607.71GB or 2.55TB
2607.71GB - 10% (260.77GB for snapshots) = 2346.94GB or 2.29TB
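Here is a rough Python sketch of the same scenario, so you can plug in your own numbers. The function name `raid5_usable_gb` and its parameters are invented for this example, and it assumes the simple model used above: one drive’s worth of capacity goes to parity, and the snapshot reserve is a fraction of what is left. It carries the unrounded per-drive value, so its result can differ from the hand arithmetic in the last digit or two.

```python
def raid5_usable_gb(drive_count, marketed_gb_per_drive, snapshot_fraction):
    """Estimate usable 1024-based Gbytes for a RAID 5 set with a snapshot reserve."""
    # Marketed GB (10^9 bytes) converted to 1024-based Gbytes.
    per_drive = marketed_gb_per_drive * 1000 ** 3 / 1024 ** 3
    # RAID 5 gives up the equivalent of one drive to parity.
    after_parity = per_drive * (drive_count - 1)
    # Reserve a fraction of the remaining space for snapshots.
    return after_parity * (1 - snapshot_fraction)


usable = raid5_usable_gb(drive_count=8, marketed_gb_per_drive=400, snapshot_fraction=0.10)
print(f"Usable: {usable:.2f} Gbytes, or {usable / 1024:.2f} Tbytes")
```

For the 8 x 400GB box with a 10% snapshot reserve, this lands at about 2.29 Tbytes.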

Now, I think I’ve solved the mystery! A 3.2TB (400GB x 8 drives) NAS box gives almost 1TB less usable capacity when using RAID5 and allocating 10% for snapshots.

Was this customer ripped off? No, it’s simply a matter of understanding the difference between the “marketed” drive capacity and the actual capacity available for use on a drive.

Can you imagine going to your local computer store and buying a 232.83064365386962890625 GByte drive? (BTW, that would be a 250GB drive.) I’m not saying that the hard drive manufacturers couldn’t find a more accurate way to market the drives, but unfortunately, this is one of those times where there isn’t much we can do but understand the math!

Blog ya later!

Joe

4 Responses to “I lost a Terabyte!”

  1. Bob Says:

    I think what makes this a bigger problem is the fact that the tools used to calculate space on an OS or on a RAID device itself all use the 1024 base math. If the Marketeers choose to use the 1000 base math, then maybe they could put in an enhancement request to use the 1000 base math in all the tools that check the size of the drives. This way the numbers would match up. This is more a perception problem. I was sold a 3.2 TB box and I expected 3.2*1024*1024*1024*1024 bytes of storage, but I only got 2.29*1024*1024*1024*1024 bytes of storage. I can understand the overhead for filesystem, and snapshot, and parity, but all the other space makes me question how this can happen. Now if my tools represented the space in the same way the marketers did, then I would be happy and on my way. Just a thought.

  2. JPuu Says:

    First of all, I like this site and the articles, thanks a lot!
    I read this article just recently and I found it very informative.
    After that I stumbled across a piece of information that is related to this. Tera is an SI prefix meaning 10^12, so a terabyte is 10^12 bytes. There are new standards (IEEE 1541/IEC 60027-2) that define prefixes for binary multiples. And there we can find that 2^10 bytes is a kibibyte (KiB), 2^20 is a mebibyte (MiB), 2^30 is a gibibyte (GiB) and 2^40 is a tebibyte (TiB)! I have never seen these new prefixes in use yet, what about you? But still, these HDD manufacturers seem to be right when they use 10^9 bytes for a GB!

  3. Joe Says:

    JPuu,

    Thanks for the info! I’m embarrassed to say that I had not heard of these terms before. I did find this link that talks about the difference between binary and decimal terminology. The fact that the new terminology has been officially recognized by the standards bodies yet not used by the manufacturers is surprising. I guess the marketing folks are afraid to confuse their customers with new terms.

    We’ll have to keep an eye on this one. I know I’ll try to start using the “ibi” renditions now. It sounds kinda funny to say that I just bought a 500 gibibyte drive - people might think I have a speech impediment! Of course, given the number of people I hear confusing the current lingo - “Tera” versus “Giga” versus “Mega” versus “Kilo” - I rather doubt that anyone will really notice!

    I’ll see if I can start a trend! ;-)

    Blog ya later!

    Joe

  4. Patrick Says:

    Just because the HD manufacturers want to scam us and increase their profit doesn’t mean we need to take it! It has happened before that a standard was thrown away under pressure from professionals. We just need to use the binary calculation of HD space. What are you going to do if tomorrow an idiot defines one byte as 3 bits and says that it’s equal to 1 terabyte? This is never going to happen because the difference is tooooooooo big, but anyway, what they have now done with the new standard is a pure scam, which is good for computer manufacturers because they can sell more HDs, and of course the HD manufacturers sell more too. Suppose that tomorrow the speed limit on the highway is suddenly 15 miles an hour because of a new law: you will see that the people have the power and that stupid speed limit quickly goes away, but meanwhile many people get fined for speeding and the government gets more money, a pure scam like the HD capacity calculation. Maybe if we (lots of IT professors and I) can bring the HD manufacturers before a court, we will; the only obstacle is that a USA court probably won’t punish the HD manufacturers, but in other countries it’s possible.
