Is Your Storage Highly Available, Or Simply Fault-Tolerant? – Part 2

Introduction
In Part 1 of this series we reviewed the principles of high-availability and how it differs from fault-tolerance. In Part 2 we will discuss how high-availability is achieved and what other benefits this type of architecture can deliver.

Abstraction: The Key To True High-Availability
If you recall, high-availability is the combination of component-level and data-level redundancy. Component-level redundancy is fairly commonplace in contemporary infrastructures: everything from servers to storage offers it as an option, which is why I say this type of redundancy (i.e., fault-tolerance) should be the absolute minimum requirement; it is simply too easy to achieve. So now the question is, “how is data-level redundancy (i.e., high-availability) achieved?”.

As we discussed in Part 1, to attain true high-availability you need to meet all six principles of high-availability. However, you can’t start down that road until you have a system in place that abstracts away the underlying storage hardware while simultaneously providing synchronous mirroring of the data across that hardware. Abstraction is the baseline requirement for achieving not only synchronous mirroring, but every other principle of high-availability as well. You can read a great deal more about abstraction here.
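To make the abstraction idea concrete, here is a minimal sketch in Go. Every name in it (the Volume interface, the backend types) is hypothetical and for illustration only; the point is simply that once the upper layers speak to a generic block-volume contract, any vendor’s hardware can sit behind it.

```go
package storage

// Volume is the abstraction boundary: everything above this interface
// is hardware-agnostic. All names here are illustrative, not taken
// from any particular product.
type Volume interface {
	ReadAt(p []byte, off int64) (n int, err error)
	WriteAt(p []byte, off int64) (n int, err error)
}

// Two very different backends can satisfy the same contract.
type FibreChannelLUN struct{ /* vendor-specific handle would live here */ }
type ISCSITarget struct{ /* portal address, IQN, etc. */ }

// Stub implementations: a real driver would issue SCSI commands here.
func (l *FibreChannelLUN) ReadAt(p []byte, off int64) (int, error)  { return len(p), nil }
func (l *FibreChannelLUN) WriteAt(p []byte, off int64) (int, error) { return len(p), nil }
func (t *ISCSITarget) ReadAt(p []byte, off int64) (int, error)      { return len(p), nil }
func (t *ISCSITarget) WriteAt(p []byte, off int64) (int, error)     { return len(p), nil }
```

Nothing above the Volume interface needs to know, or care, which of those backends is actually doing the work; that indifference is the whole point of abstraction.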

I’m Abstracted, Now What?
Once the data has been decoupled from the hardware, we need a system that ensures the data synchronously coexists across the underlying storage hardware. Not surprisingly, the same system that abstracted the data away from the storage hardware should also provide the mirroring capability. It wouldn’t make much sense to go through all the pain of abstraction only to stop there, right? If you are familiar with enterprise storage systems, you are certainly by now saying to yourself, “all of this looks a lot like software-defined storage”, and you would be correct. One of the principles of software-defined storage is “Improve Data Service Availability”. You can read more about software-defined storage principles here.
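Continuing the hypothetical Go sketch from above, that combined role might look something like this: the virtualization layer exposes a single Volume while keeping two abstracted legs synchronously in step. This is a sketch of the principle, not any vendor’s implementation.

```go
// MirroredVolume presents one virtual volume backed by two
// independent, abstracted volumes (e.g., arrays from different
// vendors, or in different rooms).
type MirroredVolume struct {
	legs [2]Volume
}

// WriteAt is acknowledged only after BOTH legs have accepted the
// data; that is what makes the mirror synchronous.
func (m *MirroredVolume) WriteAt(p []byte, off int64) (int, error) {
	for _, leg := range m.legs {
		if _, err := leg.WriteAt(p, off); err != nil {
			// A production system would fault the failing leg and keep
			// serving from the survivor rather than fail the I/O.
			return 0, err
		}
	}
	return len(p), nil
}

// ReadAt can be served by either leg, since both hold identical data.
func (m *MirroredVolume) ReadAt(p []byte, off int64) (int, error) {
	return m.legs[0].ReadAt(p, off)
}
```

Because MirroredVolume itself satisfies the Volume interface, the layers above it never learn that a mirror exists at all, which is exactly the behavior we want from the abstracting system.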

The Mechanics of Synchronous Mirroring
Now we finally arrive at the “how” portion of this discussion. If you are like me, you are not satisfied with simply accepting that something works; you want to know how it works. Understanding how it works gives you a deeper appreciation for what is being accomplished, just as an artist appreciates a Picasso or a Rembrandt.

Let’s take a look at what needs to happen to achieve this synchronization:

[Figure: Sync Mirroring Mechanics]

A couple of things to point out here:

  • The high-speed RAM cache is vital to the process because it is the component that allows I/O to be received and acknowledged as quickly as possible on both storage virtualization engines.
  • The high-speed mirror path(s) should be able to utilize either Fibre Channel or iSCSI. Deploying iSCSI for the mirror paths should also allow the two virtualization engines to be separated by significant distances (up to approximately 100 km or so).
  • Since the data is synchronously mirrored to both nodes, it should be fully accessible on both nodes simultaneously. This eliminates the delays normally associated with LUN trespassing or migration when a failure occurs on either node.
  • Besides adding data redundancy, this design also greatly improves performance because of the additional channels, cache, and disks. Most systems today have the ability to load balance (or round-robin) their I/O requests across all available channels, yielding better overall performance; a sketch of that read path follows this list.
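That last bullet is easy to picture in code. Still using the hypothetical types from the earlier sketches, a round-robin read path might look like this:

```go
import "sync/atomic"

// MultipathVolume spreads read I/O across every available path in
// round-robin fashion; this is where the extra channels translate
// into extra performance.
type MultipathVolume struct {
	paths []Volume
	next  atomic.Uint64
}

func (m *MultipathVolume) ReadAt(p []byte, off int64) (int, error) {
	// Pick the next path; the atomic counter keeps the rotation safe
	// even when many I/O requests arrive concurrently.
	i := m.next.Add(1) % uint64(len(m.paths))
	return m.paths[i].ReadAt(p, off)
}
```

In the synchronous-mirror case, the paths would include channels to both virtualization engines, so reads are balanced across both nodes while writes still go to both.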

Let’s review how well we did in achieving the principles of high-availability:

✔   End-to-End Redundancy: achieved through component-level and data-level redundancy
✔   Subsystem Autonomy: no storage disk subsystem inter-dependencies; the systems are not aware of each other
✔   Subsystem Separation: achieved through long-distance mirror paths
✔   Subsystem Asymmetry: made possible through hardware abstraction and the administrator’s choice of hardware
✔   Subsystem Diversity: made possible through separation and the administrator’s choice of facility
✔   Polylithic Design: made possible through hardware abstraction and the administrator’s choice of hardware

Conclusion
If you have been reading my blogs of late, you will see a pattern emerging: once again, it all boils down to abstraction. The need to break away from being tightly coupled to the hardware is readily apparent. So if the only way to achieve true high-availability is through abstraction, and the only way to achieve abstraction is with software (which should be obvious by now), then considering software-defined storage solutions makes a lot of sense. That is precisely what we are seeing in the market today. Until next time…
