Performance units of measure

Posted in Storage Interconnects & RAID, Advisor - Tom by Tom Treadway

Question to the Storage Advisors, from Prahlad: I am looking for advice on how to calculate throughput (MB/sec) from an IOPS value. Suppose I say I have a disk array with 14 disks with each giving approx 150 IOPS, how do you calculate MB/sec it can deliver? From what I know below is what I calculate.

IOPS per disk x No. of disks x Segment size = MB/sec

eg: 150 IO/sec x 14 drives x 32KB/IO = 67200KB/s, or 67MB/s.

Is this the correct way? Pls do let me know.

Prahlad, your math is correct. If each drive really could get 150 IOs per second with each IO being 32KB, then 67MB/s is the right answer.
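For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the same arithmetic. The function name is made up for illustration, and it assumes every drive really sustains the quoted IOPS at the quoted IO size, which the rest of this post questions.

```python
def array_throughput_kb_per_s(iops_per_drive, drives, io_size_kb):
    """Aggregate throughput in KB/s, assuming each drive sustains
    iops_per_drive at the given IO size (an optimistic assumption)."""
    return iops_per_drive * drives * io_size_kb

kb_per_s = array_throughput_kb_per_s(150, 14, 32)
print(kb_per_s, "KB/s")         # 67200 KB/s
print(kb_per_s / 1000, "MB/s")  # 67.2 MB/s
```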

But where did you get the 150 IOPS number? Did it come from the drive’s spec sheet? If so, it’s probably referring to the number of IOs that can be done in a second if the IO is really, really small, such as 512 bytes. In that case the figure is usually derived from average seek and rotation times, since the 512B data transfer time is insignificant.

Furthermore, are you trying to access these 32KB chunks sequentially or randomly? If it’s random then most of the access time is still seek and rotation. But if it’s sequential then you’re mostly limited by the data transfer rate to and from the media. And that number is between 70MB/s and 100MB/s depending on the drive.

I’m not sure if that answered your question, but hopefully it steered you in the right direction.

TT

9 Responses to “Performance units of measure”

  1. prahlad Says:

    Thanks for the answer, but how does the total MB/sec vary with random versus sequential IO?
    Can I say that the number of IOPS increases for sequential IO, and hence total MB/sec increases compared to random, since random IOs still have the disk seek and latency times whereas those are not much of an issue with sequential IO?
    Also, can you explain more about your first paragraph on the number of IOs done in a second and the 512 bytes?

    regds

  2. Tom Says:

    Prahlad,

    Let’s say a drive has a 9ms average seek time and a 10ms full rotation time. With each access having an average seek and half a rotation (5ms) the access time would be 14ms on average. 14ms is therefore the total time on average to access the desired data block. But it does not include the data transfer time. Let’s say that the block to be transferred is 512 bytes. With a media transfer rate of 100MB/s that would be 0.005ms - practically zero, allowing you to essentially ignore the data transfer time. If each IO takes 14ms, then you just invert that to get the IO rate - 71 IOPS.

    Now let’s increase the transfer length to 64KB, resulting in a data transfer time of 0.7ms assuming 100MB/s. That results in a total command (access plus transfer) time of 14.7ms, or 68 IOPS. That’s practically the same number as 71 IOPS, showing that the random IOP rate doesn’t vary much with transfer length.

    But let’s convert those two IOP rates, 71 and 68, to MB/s. If you have 71 IOs per second, each at 512 bytes, then you have 36,352 bytes per second, or a measly 0.036MB/s. But the 68 IOPS with 64KB transfers results in 4.5MB/s.
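The arithmetic in the last few paragraphs can be sketched in a few lines of Python. This is only an illustration under the same assumptions (9ms average seek, 10ms rotation, 100MB/s media rate, one drive, no queuing or caching). The function name is hypothetical, and it takes 1MB as 10^6 bytes and 64KB as 65,536 bytes.

```python
def random_io(seek_ms, rotation_ms, io_bytes, media_mb_per_s):
    """Return (iops, mb_per_s) for random IO on a single drive."""
    access_ms = seek_ms + rotation_ms / 2                   # average seek + half a rotation
    transfer_ms = io_bytes / (media_mb_per_s * 1e6) * 1000  # time on the media
    iops = 1000 / (access_ms + transfer_ms)                 # invert the per-IO time
    return iops, iops * io_bytes / 1e6

for size in (512, 64 * 1024):
    iops, mbps = random_io(9, 10, size, 100)
    print(f"{size:>6} B: {iops:5.1f} IOPS, {mbps:6.3f} MB/s")
```

The two rows come out at roughly 71 and 68 IOPS, and roughly 0.04MB/s and 4.5MB/s, matching the figures above to within rounding.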

    [Note: Since drive spec sheets like to show large numbers the IOP rate is typically measured with 512 byte transfers.]

    Those two numbers, 0.036MB/s and 4.5MB/s, are two orders of magnitude different, but they’re still relatively small numbers. The media rate of the drive is typically closer to 100MB/s - maybe 70MB/s on cheaper drives. And the only way to get throughput rates close to 100MB/s is to avoid seeks and rotations - which means reading or writing data sequentially.

    This is where I need to do a little hand-waving to avoid having this comment’s word count explode: Regardless of whether the sequential transfers are short or long, the drive will read the data from the media at the same rate - 100MB/s. And the drive’s read-ahead buffer should allow a track to be read in at the media rate and then served up in smaller chunks, like 512B. So with varying transfer sizes the transfer rate “should” remain constant, if you can assume that the overhead of each command is zero - but you can’t assume that. The OS driver, IO controller, and drive interface will all add latency that will be most noticeable with short transfers. However, it is typical for such short transfers to be combined into longer transfers, assuming that the command queue depth is high enough. To avoid going on and on, perhaps you can just squint a little and see where I’m going.

    So, to summarize, the transfer length with random IO will cause the IOPS to remain fairly flat while the transfer rate (MB/s) changes drastically. But the transfer rate with sequential IO is more closely tied to media rate, with IOPS changing drastically but transfer rate remaining somewhat constant (with squinting).
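To put rough numbers on the sequential half of that summary, here is a small Python sketch. It assumes a 100MB/s media rate and models command overhead as a single per-command delay. The `overhead_ms` parameter is a hypothetical stand-in for the driver, controller and interface latency mentioned above, not a measured value.

```python
def sequential_iops(io_bytes, media_mb_per_s=100, overhead_ms=0.0):
    """IOPS for back-to-back sequential transfers on one drive.
    overhead_ms is a hypothetical fixed cost added to every command."""
    transfer_ms = io_bytes / (media_mb_per_s * 1e6) * 1000
    return 1000 / (transfer_ms + overhead_ms)

for size in (512, 8 * 1024, 64 * 1024):
    ideal = sequential_iops(size)
    slowed = sequential_iops(size, overhead_ms=0.1)
    print(f"{size:>6} B: {ideal:>9.0f} IOPS = {ideal * size / 1e6:5.1f} MB/s ideal, "
          f"{slowed * size / 1e6:5.1f} MB/s with 0.1ms overhead")
```

With zero overhead the MB/s column stays at the media rate while IOPS swings by two orders of magnitude. Add even a small per-command cost and the short transfers fall well below the media rate, which is the part that needs the squinting.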

    I think that answers your question. If not, let me know. I’d be happy to clarify anything.

    TT

  3. Ali Says:

    What is a segment size or how do you measure it?

  4. Tom Says:

    Ali, when the original poster said “segment size”, I assumed he was referring to the data transfer size of each command.

    As far as measuring the size of the commands issued by your application or operating system, that’s not easy. If you’re using Windows I recommend software from a company called SysInternals. (They may have recently been bought, but you should still find them if you do a search.) The transfer sizes issued to the drive or RAID card are displayed in either filemon or diskmon - I forget which one shows this.

    Enjoy,
    TT

  5. Ali Says:

    Is there any default segment size for Microsoft SQL 2005? I am using RAID 10 on EMC CX700. I believe SQL writes in 8K chunks.

  6. Ali Says:

    I ran diskmon. The length for reads and writes shows 16. Is it 16KB then? I should be looking at the length field, right?

  7. Tom Says:

    Ali,

    Yes, I believe SQL’s chunk size is 8KB. And it looks like diskmon is confirming that. Even though the “length” field has no units, I’m 99% sure that it’s in blocks (512 bytes each) based on a quick test I ran on my laptop. The smallest number is 1 and the largest is 128, and I know that Windows does a lot of 512 byte and 64KB transfers, which would be 1 and 128 blocks, respectively.
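As a quick check of that reading, here is the conversion in Python. It assumes the length field counts 512-byte blocks, which is the inference made above and not something confirmed from the tool's documentation.

```python
BLOCK_BYTES = 512  # assumed unit of the "length" field (inferred, not documented here)

def length_to_kb(length_blocks):
    """Convert a length in blocks to KB."""
    return length_blocks * BLOCK_BYTES / 1024

for length in (1, 16, 128):
    print(length, "blocks =", length_to_kb(length), "KB")
```

Under that assumption a length of 16 is 8KB, and the observed minimum and maximum of 1 and 128 are 0.5KB and 64KB.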

    So as long as your stripe size is a few multiples of 8KB you won’t be having many stripe crossings. For example, with a 16KB stripe size, you’d have 50% of the commands crossing a stripe and therefore using twice the disk IOs. 32KB would have 25% crossing, 64KB 12.5% crossing, etc. 128KB is a good default for RAID-10. Since you’re mostly running random IO you don’t have to worry about keeping the stripe size low enough to create long coalesced IO out of short sequential IO.
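The crossing percentages can be reproduced with a short Python sketch. It assumes IO start offsets are uniformly spread over 512-byte sectors with no alignment to the stripe, and that the IO is no larger than the stripe. The function name is hypothetical. If the application always issues 8KB-aligned IOs and the stripe size is a multiple of 8KB, no IO would cross at all.

```python
def crossing_fraction(io_kb, stripe_kb, sector_bytes=512):
    """Fraction of IOs spanning two stripes, assuming start offsets are
    uniform over sectors and io_kb <= stripe_kb."""
    io_sectors = io_kb * 1024 // sector_bytes
    stripe_sectors = stripe_kb * 1024 // sector_bytes
    # An IO crosses when it starts in the last (io_sectors - 1) sectors of a stripe.
    return (io_sectors - 1) / stripe_sectors

for stripe_kb in (16, 32, 64, 128):
    print(f"{stripe_kb:>3}KB stripe: rule of thumb {8 / stripe_kb:.1%}, "
          f"sector-exact {crossing_fraction(8, stripe_kb):.1%}")
```

The rule of thumb (IO size divided by stripe size) gives the 50%, 25% and 12.5% quoted above. The sector-exact figure is slightly lower because an IO that starts exactly on a stripe boundary does not cross.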

    TT

  8. JPuu Says:

    Hi Tom,
    I am pretty confused with these IOPS figures. At StorageReview they benchmark top drives around 800..1000 IO/sec with their office oriented benchmark suite. Here you are discussing IOPS around 100. Is the similarity of these terms just a cruel coincidence with nothing in common? Is there any way of estimating real-life performance of writing/reading of 10GiB file with eg. RAID 5/10 ??

  9. Tom Says:

    JPuu, yeah, IOPs are confusing. When I quote IOPS I usually refer to random IOPS, where each IO has a seek and half a rotation. This is an accepted measure of a drive’s performance. Any other measure is arbitrary and based on a specific IO pattern.

    For example, if the access pattern was short sequential reads, where most of the IOs come out of the drive cache, then the IOPS rate can grow from 100 to close to 100,000! And in the case of the StorageReview numbers you mention above, the access pattern is probably some mix of sequential and random, short and long, read and write. Who knows.

    As far as reading and writing the 10GB file, it really depends on the transfer size. For example, does the OS access it 512 bytes at a time or 64KB at a time? That will make a HUGE difference in performance. There’s also the issue of fragmentation. Is the file spread across the disk making the IOs random, or is it located in one general region making the IOs sequential?
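To get a feel for how large that difference can be, here is a rough Python sketch for a single drive, not a RAID set. It reuses the example figures from earlier in this thread (14ms average access time, 100MB/s media rate) and treats the two extremes: a perfectly sequential file, and a file so fragmented that every IO pays a full seek and half a rotation. Real files land somewhere in between, and caching and queuing are ignored.

```python
FILE_BYTES = 10 * 10**9   # 10GB file (decimal gigabytes)
MEDIA_BPS = 100 * 10**6   # 100MB/s media rate (example figure)
ACCESS_S = 0.014          # 9ms seek + 5ms half rotation (example figure)

def sequential_seconds():
    """Best case: the whole file streams at the media rate."""
    return FILE_BYTES / MEDIA_BPS

def random_seconds(io_bytes):
    """Worst case: every IO pays a full access plus its own transfer time."""
    ios = FILE_BYTES / io_bytes
    return ios * (ACCESS_S + io_bytes / MEDIA_BPS)

print(f"sequential:     {sequential_seconds():>10.0f} s")
print(f"random, 64KB:   {random_seconds(64 * 1024):>10.0f} s")
print(f"random, 512B:   {random_seconds(512):>10.0f} s")
```

Under these assumptions the sequential case takes about 100 seconds, fully random 64KB IOs take over half an hour, and fully random 512-byte IOs take days.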

    Performance is tough. It’s easy to characterize disk and controller performance using specific metrics, like long sequential reads, short random writes, etc., but matching those up to real world access patterns is very, very tough.

    TT
