Over the last several months I have spoken with many clients interested in deduplication. There is good reason for this interest, but one aspect always gets more attention than the rest: whether a solution performs deduplication “inline” or “post-process”. The prevailing mindset in the industry, it would seem, is that inline is superior to post-process. Let’s pull back the covers to see whether there is any real truth to that. To ensure we are on the same page, let’s define these terms before proceeding.
Deduplication effectively gives you some percentage more usable storage capacity than the native capacity, although the gain varies widely with the data types involved. You can look at it either as a given amount of data consuming less space or, in normalized terms, as an increase in the total effective usable storage space. In other words, if you reduce your data to half of its original size, you have effectively doubled (2x) your usable storage capacity.
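The arithmetic behind that statement can be sketched in a few lines. This is only an illustration; the function names and the 100 TB figure are made up for the example and do not come from any product.

```python
def effective_capacity(native_tb: float, reduced_fraction: float) -> float:
    """Effective usable capacity when data shrinks to `reduced_fraction` of its original size."""
    if not 0 < reduced_fraction <= 1:
        raise ValueError("reduced_fraction must be in the range (0, 1]")
    return native_tb / reduced_fraction


def reduction_ratio(reduced_fraction: float) -> float:
    """Express the same reduction as an N:1 ratio (0.5 means 2:1)."""
    return 1 / reduced_fraction


def savings_percent(reduced_fraction: float) -> float:
    """Express the same reduction as a percentage of space saved."""
    return (1 - reduced_fraction) * 100


# Hypothetical 100 TB array whose data reduces to half its original size.
print(effective_capacity(100, 0.5))  # 200.0
print(reduction_ratio(0.5))          # 2.0
print(savings_percent(0.5))          # 50.0
```

Note that a 2:1 ratio and a 50% saving describe the same result, which is worth keeping in mind when comparing vendor claims that use different units.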
Inline refers to deduplicating ingress data before it is written to the destination device. Post-process refers to deduplicating ingress data after it has been written to the destination device.
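To make the two definitions concrete, here is a toy sketch of both approaches. It is not how any particular product works: the `BlockStore` class, its method names, and the use of SHA-256 fingerprints over fixed 4 KB blocks are all assumptions chosen for illustration.

```python
import hashlib


class BlockStore:
    """Toy block store that counts what is physically written at ingest time."""

    def __init__(self):
        self.blocks = {}        # fingerprint -> block data (deduplicated blocks)
        self.staged = []        # blocks written as-is, awaiting post-process dedup
        self.device_writes = 0  # blocks physically written during ingest
        self.peak_blocks = 0    # high-water mark of blocks consumed

    @staticmethod
    def _fingerprint(block: bytes) -> str:
        return hashlib.sha256(block).hexdigest()

    def _track_peak(self) -> None:
        used = len(self.blocks) + len(self.staged)
        self.peak_blocks = max(self.peak_blocks, used)

    def write_inline(self, block: bytes) -> None:
        """Inline: fingerprint in the write path, store only unseen blocks."""
        fp = self._fingerprint(block)
        if fp not in self.blocks:
            self.blocks[fp] = block
            self.device_writes += 1
        self._track_peak()

    def write_post_process(self, block: bytes) -> None:
        """Post-process: commit immediately, with no fingerprinting in the write path."""
        self.staged.append(block)
        self.device_writes += 1
        self._track_peak()

    def run_dedup_pass(self) -> None:
        """Later pass that collapses staged duplicates into unique blocks."""
        for block in self.staged:
            self.blocks.setdefault(self._fingerprint(block), block)
        self.staged.clear()


# Four ingress blocks, three of which are identical.
data = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
inline, post = BlockStore(), BlockStore()
for block in data:
    inline.write_inline(block)
    post.write_post_process(block)
post.run_dedup_pass()

print(len(inline.blocks), inline.device_writes, inline.peak_blocks)  # 2 2 2
print(len(post.blocks), post.device_writes, post.peak_blocks)        # 2 4 4
```

In this toy model both stores end up holding the same two unique blocks. The difference is where the work happens: the inline store fingerprints every block before acknowledging the write, while the post-process store writes more up front and temporarily consumes more space until the pass runs.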
First let’s look at why a vendor would choose one method over the other. Take all-flash vendors, for example, which always use inline deduplication. Without some sort of data reduction, the economics of all-flash systems are not nearly as attractive. Besides the need to reduce the $/GB of all-flash (which makes a lot of sense in this case), there is another issue that deduplication must address. This issue is related to an inherent disadvantage that all-flash solutions suffer from: write amplification.
Flash blocks that already contain data must be erased before they can be rewritten, and any valid data still in a block has to be moved elsewhere before the erase. As a result, a single ingress write can trigger many more reads and writes, which increases both response time and wear. This is where inline deduplication comes in. The best way to reduce write amplification (which cannot be totally eliminated) is to reduce the amount of ingress data to be written. For all-flash systems there simply is no other choice but to use inline.
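A rough way to see why reducing ingress data matters is to put numbers on it. The sketch below assumes a deliberately simplified model in which every block chosen for erasure still holds the same fraction of valid data; real flash controllers are far more sophisticated, and the function names and figures are illustrative only. If a block has P pages of which V are still valid, the V pages must be rewritten to free P - V pages for new data, so the device writes P pages for every P - V pages of host data.

```python
def write_amplification(valid_fraction: float) -> float:
    """Write amplification factor under the simplified model above.

    valid_fraction is V / P: the share of each erased block that still
    holds valid data and must be relocated first. WAF = P / (P - V).
    """
    if not 0 <= valid_fraction < 1:
        raise ValueError("valid_fraction must be in the range [0, 1)")
    return 1 / (1 - valid_fraction)


def flash_writes_gb(ingress_gb: float, dedup_ratio: float, valid_fraction: float) -> float:
    """GB physically written to flash after inline dedup shrinks the ingress data."""
    if dedup_ratio < 1:
        raise ValueError("dedup_ratio must be at least 1 (1 means no reduction)")
    return (ingress_gb / dedup_ratio) * write_amplification(valid_fraction)


# Hypothetical workload: 100 GB of ingress writes, erased blocks half full of valid data.
print(write_amplification(0.5))        # 2.0
print(flash_writes_gb(100, 1.0, 0.5))  # 200.0 (no deduplication)
print(flash_writes_gb(100, 2.0, 0.5))  # 100.0 (2:1 inline deduplication)
```

In this model the amplification factor itself does not change, which matches the point that it cannot be eliminated; what inline deduplication changes is the amount of data being amplified.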
Not surprisingly, however, there are costs involved. Placing another intensive operation in the I/O path before the data is committed to the device slows overall performance. This processing overhead, coupled with the reality that write amplification cannot be completely eliminated, leads to unpredictable performance characteristics, especially as the total amount of valid data on the system grows (which increases the metadata that needs to be tracked).
With systems that use post-process (mainly systems that are not all-flash arrays), the performance impact is almost entirely eliminated. I say “almost” because the deduplication has to happen at some point, and it does generate some additional load, albeit a small one. I say “small” because the system monitors overall activity and runs the deduplication when contention will be lowest. Interestingly, the resulting data reduction is at least as good as, if not better than, that of inline deduplication. Most importantly, the write commit response time seen by the application is not affected, since the data is committed immediately with no intermediate operation standing in the way. The user and application experience is therefore not degraded when the write is first issued.
The tradeoff is that capacity consumption is slightly higher for a period of time, until the deduplication process kicks in. In today’s world, where most shops have tens or hundreds of unused terabytes, this is increasingly a non-issue.
CONCLUSION AND RECOMMENDATIONS
It should be apparent by now that this is not really a question of “Good vs. Bad”. It is more a matter of necessity on the part of the vendor. But if we consider which method has the least negative impact on overall system operation, post-process would seem to have the upper hand.
On a related note, one thing I would highly recommend being careful of is promises about the actual data reduction ratio. Anyone who says they will reduce your data footprint without first knowing what the data consists of is lying to you. The only guaranteed data reduction method I know of is one that gives you 100% data reduction, and it’s called FORMAT! (Kidding, of course; please do not attempt this at home.)
Below is an example of Microsoft’s deduplication and compression ratios based on common file types:
| Scenario | Content | Typical Space Savings |
| User documents | Documents, photos, music, videos | |
| Deployment shares | Software binaries, cab files, symbols files | |
| Virtualization libraries | Virtual hard disk files | |
| General file share | All of the above | |
Great candidates for deduplication:
Folder redirection servers, virtualization depot or provisioning library, software deployment shares, SQL Server and Exchange Server backup volumes, VDI VHDs, and virtualized backup VHDs
Should be evaluated based on content:
Line-of-business servers, static content providers, web servers, high-performance computing (HPC)
Not good candidates for deduplication:
Virtualization hosts (running workloads other than VDI or virtualized backup), WSUS, servers running SQL Server or Exchange Server, files approaching or larger than 1 TB in size
** Random writes are detrimental to both the performance and the lifespan of flash devices. Look for systems that are able to sequentialize I/O, which helps to reduce the write-amplification effect.
** There are tools available that will estimate your data reduction savings prior to implementation. Microsoft includes one (DDPEval.exe) with Windows Server 2012 if the deduplication services are installed.
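For a feel of what such an estimate involves, here is a minimal sketch that walks a directory tree, splits each file into fixed-size chunks, and counts how many bytes are unique. This is not how DDPEval.exe works and will not reproduce its numbers; the function name `estimate_dedup_savings`, the 64 KB default chunk size, and the fixed-size chunking itself are assumptions made purely for illustration, and compression is ignored entirely.

```python
import hashlib
import os


def estimate_dedup_savings(root: str, chunk_size: int = 64 * 1024) -> dict:
    """Estimate dedup savings under `root` using fixed-size chunk fingerprints."""
    total_bytes = 0
    unique_chunks = {}  # chunk fingerprint -> chunk length in bytes

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while chunk := handle.read(chunk_size):
                        total_bytes += len(chunk)
                        fingerprint = hashlib.sha256(chunk).digest()
                        unique_chunks.setdefault(fingerprint, len(chunk))
            except OSError:
                continue  # skip files that cannot be opened or read

    unique_bytes = sum(unique_chunks.values())
    savings = 0.0 if total_bytes == 0 else 100 * (1 - unique_bytes / total_bytes)
    return {
        "total_bytes": total_bytes,
        "unique_bytes": unique_bytes,
        "savings_percent": savings,
    }
```

Calling `estimate_dedup_savings` on a representative sample of your own data, rather than trusting a quoted ratio, is the point of the exercise: the result depends entirely on what the data consists of.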