Look across history and you will find plenty of ideas that were originally dismissed as crazy and later turned out to be breakthroughs: Tesla’s wireless energy transfer system (mainstream adoption: 120 years later), Faraday’s electric generator (mainstream adoption: 50 years later), and Bell’s telephone (mainstream adoption: 26 years later).
Eighteen years ago, DataCore Software broke with the traditional ways of approaching storage, specifically the handling of I/O. It took nearly 14 years after DataCore’s founding for the industry to adopt what is today called Software-defined Storage (a.k.a. SDS). Although today, 18 years later, there are many competitors in the field, none has disrupted the traditional way of thinking as DataCore has. DataCore attacks the problem of I/O from the complete opposite direction, with staggering results.
[BEGIN SOAPBOX #1]
I don’t particularly like the term ‘Software-defined Storage’, mainly because it implies that prior to SDS, hardware alone drove storage. In reality, hardware devoid of any instruction set, whether firmware or software, is useless. Likewise, software without hardware to run on is just as useless. You need both hardware and software to do something useful. I like our CEO’s founding statement about DataCore’s vision better:
…Creating an enduring and dynamic ‘Software-driven Storage’ architecture liberating storage from static hardware-based limitations
– George Teixeira, CEO, 1998
Substituting Software-driven for Software-defined implies that both storage hardware and software work together yet in their own ways to achieve the end goal. As we will see next, the industry calls it “Software-defined Storage”, but it is still very much hardware-driven.
[END SOAPBOX #1]
FIRST: WHAT IS THE PROBLEM?
The problem is I/O. What is I/O? I/O is input and output, or the movement of data through a system. Each I/O takes the form of either a read or a write operation. In other words, the system is either retrieving information (reads) or committing changes to information (writes). The rate at which reads and writes occur is referred to as IOps (I/O operations per second).
[BEGIN SOAPBOX #2]
I could spend the rest of the day writing about what IOps are and what they are not, but I will keep it short. The IOps value by itself is completely meaningless. Without knowing whether the I/O is a read or a write, whether it is performed in a sequential or random pattern, what size the I/O is (referred to as block size), and what the latency of the I/O is, the value of IOps means literally nothing. At best, given exactly the same test conditions, the IOps value gives you a relative performance comparison between two systems. But identical test conditions are almost never the case. The only exception is an industry-standardized and audited benchmark called the SPC-1 from the Storage Performance Council. This is truly the only meaningful and consistent comparison of IOps you are likely to find. NOTE: The SPC-1 represents an intensive OLTP workload, similar to that of a database.
[END SOAPBOX #2]
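To make the soapbox point concrete, here is a minimal sketch of why an IOps number needs its block size and latency alongside it. The two systems and all of their figures are made up for illustration; the function names are my own. The second helper applies Little's law (average I/Os in flight = rate × latency).

```python
def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Bandwidth implied by an IOps figure at a given block size, in MiB/s."""
    return iops * block_size_kib / 1024


def outstanding_ios(iops: float, latency_ms: float) -> float:
    """Little's law: average number of I/Os in flight = rate x latency."""
    return iops * latency_ms / 1000


# Hypothetical systems: A posts ten times the IOps of B...
a = throughput_mib_per_s(iops=100_000, block_size_kib=4)
b = throughput_mib_per_s(iops=10_000, block_size_kib=64)

# ...yet B moves more data per second because its I/Os are larger.
print(f"A: {a} MiB/s")  # A: 390.625 MiB/s
print(f"B: {b} MiB/s")  # B: 625.0 MiB/s

# Sustaining 100,000 IOps at 0.5 ms latency needs about 50 I/Os in flight.
print(outstanding_ios(iops=100_000, latency_ms=0.5))  # 50.0
```

The same IOps headline can therefore describe very different workloads, which is why a fixed, audited workload is needed before two numbers can be compared.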
SECOND: WHAT IS THE SOLUTION?
When you boil this whole problem down, get at the core of the issue, and peel back all the layers of the onion, you are ultimately left with two fundamental solutions to the problem of I/O. You can either:
#1: Throw hardware at the problem. This includes the amount of hardware as well as the type of hardware (e.g. faster and more expensive disks such as SSD and flash). This approach is called Hardware Parallelization.
#2: Throw software at the problem. Not just any software, mind you, but software that is super intelligent and able to completely harness and fully exploit the power of the underlying hardware. And not just the disk hardware, but the CPU and memory resources as well. This approach is called I/O Parallelization.
Approach #1 attacks the I/O problem by pushing the problem down to the disk, to the slowest components in the entire system stack, furthest from the application. As you will see in a moment, this is not only ‘not efficient’, it is a tremendous waste of expensive resources.
Approach #2 attacks the I/O problem as soon as the I/O is encountered, at the fastest layer in the entire system stack (CPU and memory), closest to the application. As you will see in a moment, this is not only ‘extremely efficient’, but the end result is nothing short of astonishing, achieving something that in traditional terms is considered impossible (but remember, DataCore broke with tradition 18 years ago).
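The intuition behind handling I/O in parallel at the CPU layer can be shown with a toy model. This is purely illustrative and says nothing about how DataCore's engine is actually implemented: it assumes every request costs the same fixed service time, that workers never contend with each other, and the request count, service time, and worker counts are all invented.

```python
import math


def drain_time_us(requests: int, service_time_us: int, workers: int) -> int:
    """Time to finish a queue of equal-cost requests, `workers` at a time."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    rounds = math.ceil(requests / workers)  # batches of up to `workers` requests
    return rounds * service_time_us


def effective_iops(requests: int, service_time_us: int, workers: int) -> float:
    """Requests completed per second under the toy model above."""
    return requests * 1_000_000 / drain_time_us(requests, service_time_us, workers)


# Hypothetical queue: 1,000 requests at 100 microseconds each.
for workers in (1, 8):
    print(workers, drain_time_us(1000, 100, workers), effective_iops(1000, 100, workers))
# 1 100000 10000.0
# 8 12500 80000.0
```

In this idealized model the per-request service time never changes; only the number of requests handled at once does. Real systems fall short of this linear scaling because of contention and shared resources.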
WHAT DOES APPROACH #1 LOOK LIKE?
From a physical perspective, hardware parallelization tends to look like this:
First off, I want to make very clear that neither I nor DataCore is at war with our friends at Hitachi or any other hardware vendor. DataCore is software; we need hardware. This system (Hitachi VSP G1000) was chosen simply because it achieved similar performance (IOps and latency) and price/performance levels to what DataCore achieved on the SPC-1 benchmark. This system was priced at $2,003,803 and achieved 2,004,941 SPC-1 IOps, or $0.96 per SPC-1 IOps.
WHAT DOES APPROACH #2 LOOK LIKE?
From a physical perspective, I/O parallelization looks like this:
This system (DataCore Parallel Server running on a single Lenovo x3650 2U server with a 2U 24-bay storage array attached) was priced at $136,759 and achieved 1,510,090 SPC-1 IOps, or $0.09 per SPC-1 IOps.
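As a rough sanity check, the price/performance figures can be recomputed by dividing each quoted total price by the quoted IOps. The sketch below uses only the numbers given above; the dictionary layout and function name are my own. Note that the Hitachi figures as quoted divide out to roughly $1.00 rather than $0.96, so treat the published SPC-1 reports as the authority on the exact value.

```python
# Total price (USD) and SPC-1 IOps exactly as quoted in the text above.
systems = {
    "Hitachi VSP G1000": {"price_usd": 2_003_803, "spc1_iops": 2_004_941},
    "DataCore Parallel Server": {"price_usd": 136_759, "spc1_iops": 1_510_090},
}


def price_per_iops(price_usd: float, spc1_iops: float) -> float:
    """Dollars paid per SPC-1 IOps delivered."""
    return price_usd / spc1_iops


for name, s in systems.items():
    print(f"{name}: ${price_per_iops(s['price_usd'], s['spc1_iops']):.2f} per SPC-1 IOps")
# Hitachi VSP G1000: $1.00 per SPC-1 IOps
# DataCore Parallel Server: $0.09 per SPC-1 IOps
```

On the quoted figures, the DataCore configuration delivers about three quarters of the Hitachi system's IOps at under a tenth of its price.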
HOW DO THE LATENCIES COMPARE?
The first obvious set of distinctions between the two are costs and physical size. Let’s take a look at a third, less obvious distinction. One that is critical to your applications and users, more so than even IOps… Latency.
As you can see from the latency graph taken over each ramp phase of the benchmark (10% through 100%), Hitachi (as well as every other storage system tested on the SPC-1) falls over what I refer to as the interrupt cliff. What should really stand out to you, however, is the flat line that represents DataCore’s latency curve. Since DataCore is a real-time, non-interrupt-based, parallel I/O engine, you will not see the typical latency curves you see with other storage systems. Interestingly, this marks the first time a storage system has achieved a sub-100-microsecond response time at 100% load on the SPC-1.
THE BOTTOM LINE
DataCore, like Tesla, Faraday, and Bell, landed way ahead of its time in the industry. And, just like those early pioneers who faced naysayers and scrutiny from those around them, DataCore in the end prevailed and proved to the world that its way was the best way. The results don’t lie: the fastest storage platform with the lowest latency at the lowest cost in existence today, 10,000+ worldwide customers, 30,000+ worldwide deployments, 10th-generation proven technology, and over 18 years of development.
The bottom line is this: If you want a platform with rich enterprise features that delivers outstanding performance while saving lots of space and money, then DataCore is the answer. Otherwise, choose a traditional storage platform.
For more information about DataCore and its 10th-generation, award-winning storage software, check out this three-minute video:
or the DataCore website.