Using some common sense …
Posted in General, Storage Applications, Platforms, Storage Interconnects & RAID, Storage Management, Application Environments, Advisor - Neil by Neil
While in China I’ve come across a lot of customers building very large storage systems. Much of this is for video surveillance. They want a big, cheap bucket into which they will record hours and hours of surveillance from millions of cameras (and yes, there are millions of cameras in China).
Now “performance” is never seen as an issue when building these boxes. Capacity and price are the only real requirements. This brings about an issue for me where I totally disagree with my own engineering team. Now I’m supposed to toe the company line here and it will be interesting to see if the editor allows this one to slip through the net, but I have a real problem with the way these large “buckets” can be built.
We currently allow 32 drives to be used in a RAID 5 array. This means 32 x 2 TB drives in a single RAID 5 array. With one drive’s worth of space going to parity, the usable capacity will be around 62 TB (roughly 56 TiB in binary units). Now even on our 3 series card that works fine … when it is running fine, that is. However, this scares the living daylights out of me when it comes to a RAID rebuild. If one of these drives fails (which means “when”, not “if”), and the system is busy (many cameras recording at the same time), then the load on the system becomes quite ridiculous.
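As a quick sanity check on capacity, the parity arithmetic for the RAID levels mentioned in this post can be sketched as below. This is only a rough sketch: the function names are made up for illustration, the four-group layout for RAID 50/60 is an assumed example rather than a recommendation, and controller metadata overhead is ignored.

```python
# Parity drives consumed per RAID group for the levels mentioned in this post.
PARITY_PER_GROUP = {"5": 1, "6": 2, "50": 1, "60": 2}


def usable_tb(level: str, drives: int, drive_tb: float, groups: int = 1) -> float:
    """Usable capacity in decimal TB, ignoring controller metadata overhead.

    `groups` is the number of sub-arrays for RAID 50/60 (an assumed layout);
    it must be 1 for plain RAID 5/6.
    """
    if level in ("5", "6") and groups != 1:
        raise ValueError("RAID 5/6 is a single group")
    if drives % groups != 0:
        raise ValueError("drives must split evenly across groups")
    parity = PARITY_PER_GROUP[level] * groups
    return (drives - parity) * drive_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


if __name__ == "__main__":
    # 32 x 2 TB drives; RAID 50/60 split into 4 groups of 8 (assumed layout).
    for level, groups in (("5", 1), ("6", 1), ("50", 4), ("60", 4)):
        tb = usable_tb(level, drives=32, drive_tb=2.0, groups=groups)
        print(f"RAID {level:>2}, {groups} group(s): {tb:.0f} TB = {tb_to_tib(tb):.1f} TiB")
```

Running it for the 32-drive case shows how little capacity the safer layouts actually cost compared with a single RAID 5 array.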
The customer needs to (a) rebuild the array as quickly as possible and (b) not have any disruption to data capture during that time.
We can do the (b) bit … the cards can handle considerable throughput, but the (a) part on this size array is scary. I’m a big fan of multi-level arrays … RAID 50 or 60, or even just RAID 6 if you don’t want to go there. But leaving a single RAID 5 array of this size exposed to another drive failure for as long as the rebuild takes is very, very scary indeed, especially in the surveillance scenario where backups just don’t happen.
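To put a rough shape on that risk, here is a minimal sketch comparing the exposure of a degraded 32-drive RAID 5 array (any of the 31 remaining drives failing is fatal) with one degraded 8-drive group inside a RAID 50 (only the 7 remaining drives in that group matter). The 3% annual failure rate and the 24-hour rebuild window are assumptions picked purely for illustration, not measured figures, and the model treats drive failures as independent and ignores unrecoverable read errors, so it is optimistic.

```python
import math

HOURS_PER_YEAR = 8760


def p_fatal_failure(drives_at_risk: int, afr: float, rebuild_hours: float) -> float:
    """Chance that at least one of `drives_at_risk` fails during the rebuild.

    Assumes independent drives with a constant failure rate of `afr` per year.
    Real failures tend to cluster, so treat the result as optimistic.
    """
    rate_per_hour = afr / HOURS_PER_YEAR
    return 1.0 - math.exp(-drives_at_risk * rate_per_hour * rebuild_hours)


if __name__ == "__main__":
    AFR = 0.03          # assumed 3% annual failure rate (illustrative only)
    REBUILD_HOURS = 24  # assumed rebuild window (illustrative only)

    raid5 = p_fatal_failure(31, AFR, REBUILD_HOURS)   # 32-drive RAID 5, one failed
    raid50 = p_fatal_failure(7, AFR, REBUILD_HOURS)   # 8-drive group, one failed
    print(f"RAID 5  (31 drives at risk): {raid5:.4%}")
    print(f"RAID 50 ( 7 drives at risk): {raid50:.4%}")
```

The absolute numbers depend entirely on the assumed inputs; the point is that the exposure scales with both the number of drives at risk and the length of the rebuild, and a single wide RAID 5 array is bad on both counts.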
I can’t give you exact figures on rebuild times on these arrays. It depends on the speed of the RAID card, the speed of the disks, the number of drives in the array, the load on the system and so on, but you can generalise and say “it will be a long time”. So use some common sense when building these arrays and think about the consequences when things are not working correctly (i.e. in a failed-drive scenario). Make sure your system can cope with the load of rebuilding and capturing at the same time.
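While exact figures are not available, a lower bound is easy to sketch: the replacement drive has to be written from end to end, so the rebuild cannot finish faster than the drive size divided by the sustained rebuild rate. The rates below are assumptions chosen for illustration, not measurements from any particular card, and the rate a busy system actually sustains may be lower than any of them.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Lower-bound rebuild time in hours for one replacement drive.

    drive_tb is in decimal TB (10**12 bytes); rebuild_mb_per_s is the
    sustained rebuild rate in decimal MB/s (10**6 bytes per second).
    """
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = drive_tb * 1e12 / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed sustained rebuild rates in MB/s (illustrative only).
    for rate in (100.0, 50.0, 20.0, 10.0):
        print(f"2 TB drive at {rate:>5.0f} MB/s: {rebuild_hours(2.0, rate):6.1f} hours")
```

Every halving of the rebuild rate under load doubles the window during which the array is running without protection.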
At the very minimum have a hot spare in the system so the rebuild kicks off immediately that there is a failure. At best use multi-level arrays (50 or 60) or RAID 6.
Just don’t leave yourself swinging in the breeze any longer than you need to.
Ciao
Neil