There are several key things to consider when building a storage system. We decided to cover the most important ones and shed some light on the inner workings of a good storage system. The first topic we’ll cover is storage sizing and storage IOPS density per GB.
Many companies go out to buy a storage solution without understanding their needs or their use case well enough. Here is a recent example:
A company asks for 70 TB of usable storage for a virtualized environment. They are looking to get a solution which can do 10,000 storage IOPS.
Storage IOPS density and keeping your users’ sanity
10,000 IOPS on a 70 TB storage system works out to just 0.14 IOPS per GB (10,000 IOPS / 70,000 GB). Thus a typical VM with a 20-40 GB disk will get just 3 to 6 IOPS. Dismal. 50-100 IOPS per VM is a good target for VMs that are usable and not lagging. This will keep your users happy enough, instead of pulling their hair out.
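The density arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes decimal units (1 TB = 1,000 GB) and that IOPS are spread evenly across the capacity.

```python
GB_PER_TB = 1000  # assumption: decimal units, 1 TB = 1,000 GB


def iops_density(total_iops: float, usable_tb: float) -> float:
    """Return IOPS per GB of usable capacity."""
    return total_iops / (usable_tb * GB_PER_TB)


def iops_for_disk(density: float, disk_gb: float) -> float:
    """Return the IOPS share of a disk, assuming IOPS scale with disk size."""
    return density * disk_gb


# The example from the text: 10,000 IOPS on 70 TB usable.
density = iops_density(10_000, 70)
print(f"{density:.3f} IOPS/GB")  # 0.143 IOPS/GB

for disk_gb in (20, 40):
    share = iops_for_disk(density, disk_gb)
    print(f"{disk_gb} GB disk: {share:.1f} IOPS")
# 20 GB disk: 2.9 IOPS
# 40 GB disk: 5.7 IOPS
```

Running the same two functions against your own requirement is a quick sanity check before talking to a vendor.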
For reference, Google’s Standard SSD Persistent volumes come with 30 IOPS per GB, i.e. about 200 (!) times more than the density in the stated requirement: https://cloud.google.com/compute/docs/disks/performance#ssd-pd-performance
So a Google VM with 40 GB disk and 30 IOPS/GB will be able to peak at (maybe not sustain, though) 1,200 IOPS.
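As a rough illustration of that comparison, the snippet below computes the peak IOPS of a disk at a given density and the ratio between the reference density and the stated requirement. The helper names are hypothetical, the 30 IOPS/GB figure is simply the one quoted in the text, and decimal units (70 TB = 70,000 GB) are assumed.

```python
def peak_iops(disk_gb: float, iops_per_gb: float) -> float:
    """Peak IOPS of a disk whose limit scales linearly with its size."""
    return disk_gb * iops_per_gb


def density_ratio(reference_iops_per_gb: float,
                  total_iops: float,
                  capacity_gb: float) -> float:
    """How many times denser the reference is than total_iops / capacity_gb."""
    return reference_iops_per_gb * capacity_gb / total_iops


# 40 GB disk at the 30 IOPS/GB quoted in the text.
print(f"{peak_iops(40, 30):,} IOPS")  # 1,200 IOPS

# Reference density vs. the requirement of 10,000 IOPS on 70,000 GB.
print(f"{density_ratio(30, 10_000, 70_000):.0f}x")  # 210x
```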
Storage system sizing in virtualized environments
A system with 70 TB usable can easily store the data of 1,000 to 2,000 VMs. Sometimes much more, depending on the average disk size of a VM and the gains from space-saving features.
For the sake of exploring the boundaries, let’s assume that the average VM disk size is 40 GB and the storage solution has deep integration with the cloud management system which provisions the VMs, say OpenStack, CloudStack, OpenNebula or similar. With a good integration, we have measured a 2 to 5 times gain in logical-to-usable space. In other words, on 70 TB usable one can store from 140 TB to 350 TB of logical VM disks. This is 3,500 to 8,750 VMs on the original 70 TB usable!
If we take the 50 IOPS per VM mark, then we need a system which can deliver between 175,000 and 437,500 IOPS! Further, we should be looking at latency metrics, since a system which delivers this many storage IOPS but has a latency of over 0.5 milliseconds will deliver a bad user experience too. In fact, latency is the more important metric for many block-level storage workloads.
We’ll cover the latency aspect in a separate post.
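The sizing arithmetic in this section can be sketched as a short Python script. The function names are illustrative only; the inputs (70 TB usable, 40 GB average disk, 2x to 5x space-saving gain, 50 IOPS per VM) are the ones used in the text, and decimal units are assumed.

```python
GB_PER_TB = 1000  # assumption: decimal units, 1 TB = 1,000 GB


def vm_count(usable_tb: float, space_gain: float, avg_disk_gb: float) -> int:
    """Number of VM disks that fit, given a logical-to-usable space gain."""
    logical_gb = usable_tb * GB_PER_TB * space_gain
    return int(logical_gb // avg_disk_gb)


def required_iops(vms: int, iops_per_vm: int) -> int:
    """Total IOPS the system must deliver to give every VM its target."""
    return vms * iops_per_vm


for gain in (2, 5):
    vms = vm_count(usable_tb=70, space_gain=gain, avg_disk_gb=40)
    iops = required_iops(vms, iops_per_vm=50)
    print(f"{gain}x gain: {vms:,} VMs, {iops:,} IOPS")
# 2x gain: 3,500 VMs, 175,000 IOPS
# 5x gain: 8,750 VMs, 437,500 IOPS
```

Changing `iops_per_vm` to 100, the upper end of the target mentioned earlier, doubles both totals.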
Update: December 2018.
You can learn more about how latency affects storage operations (and all the applications and workloads in your cloud) from:
– Latency storage vs. Capacity storage – A Podcast by Intel and StorPool
– Webinar: Latency: #1 metric for your cloud
If you have any questions feel free to contact us at firstname.lastname@example.org