SIS Can Complement Block-Based Deduplication; Interview with CommVault's Simon Taylor Part II


Deduplication is currently one of the hottest topics in data protection, but it takes more than one form. The CommVault® Simpana® software suite implements deduplication as a Single Instance Store (SIS). In this iteration, SIS deduplicates archived and backed-up files at the file level and stores only one occurrence of each file. In part 2 of this interview series, Simon Taylor, CommVault Systems' Senior Director of Information Access and Management, elaborates on how Simpana leverages SIS for information search and data mobility, and on how this approach complements the block-based deduplication found on certain disk-based storage solutions.
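To make the file-level idea concrete, here is a minimal sketch of a single instance store in Python. It is an illustration only, not CommVault's implementation: the `SingleInstanceStore` class, its method names, the file paths and the choice of SHA-256 as the content fingerprint are all assumptions made for this example.

```python
import hashlib


class SingleInstanceStore:
    """Toy file-level deduplicating store (hypothetical, for illustration)."""

    def __init__(self):
        self.objects = {}   # content digest -> complete file bytes, stored once
        self.catalog = {}   # logical path -> content digest

    def put(self, path, data):
        # Fingerprint the whole file; identical files share one stored object.
        digest = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(digest, data)
        self.catalog[path] = digest
        return digest

    def get(self, path):
        # The file comes back intact from a single lookup.
        return self.objects[self.catalog[path]]

    def stored_bytes(self):
        return sum(len(data) for data in self.objects.values())


store = SingleInstanceStore()
report = b"quarterly report contents"
notes = b"meeting notes"

store.put("/users/alice/report.doc", report)
store.put("/users/bob/report-copy.doc", report)   # duplicate content
store.put("/users/bob/notes.txt", notes)

print(len(store.catalog), "paths,", len(store.objects), "stored objects")
```

Three paths are cataloged but only two objects are kept, because the two copies of the report share one stored instance. Each stored object is still a complete file, which is the property the interview below builds on.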

Jerome: I've examined CommVault Systems' implementation of SIS in the past but primarily looked at the benefits of SIS and deduplication from the perspective of capacity and space savings. What benefits does SIS offer when other factors like information search and data mobility are considered?

Simon: Regardless of which product (archiving or backup software) one uses within CommVault's Simpana suite, these products store data in the same object store. Storing data in this format consumes more space than deduplicating it at the block level. However, because each file is retained in its entirety and treated as an object, permissions such as read, write, delete and retention can be assigned to that object (or file). This creates some interesting possibilities from an information search and data mobility perspective.

Consider this first from an information search perspective. Data that is deduplicated (block-based or SIS) may be indexed when it is initially stored. However, files may need to be re-indexed later on (new laws, new search criteria, etc.). Where files are deduplicated at the block level, they may first have to be reconstituted before they can be re-indexed. With data stored in a SIS format, companies do not need to worry about the overhead and wait times that reconstituting the data would add to re-indexing.

Now consider it from a data mobility perspective. Most companies still use tape in some fashion in their data protection and archiving scheme. However, the problem with tape is that you cannot store deduplicated data on it. When deduplicated data is moved from disk to tape, it must again be reconstituted before it is written to tape. SIS avoids this scenario as well: because the file is stored in its entirety on disk, companies can more easily move data from disk to tape with minimal performance impact or wait times. In fact, companies can opt to migrate large files within their SIS repository to tape as a means to keep their costs down.
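The reconstitution step Simon describes can be sketched as follows. This is a simplified illustration under stated assumptions, not the design of any particular product: `BlockStore`, `index_terms`, the fixed 8-byte block size and the sample file are hypothetical, and real block-level deduplication typically uses far larger, often variable-sized, blocks.

```python
import hashlib

BLOCK_SIZE = 8  # unrealistically small, chosen to keep the example readable


class BlockStore:
    """Toy block-level deduplicating store (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = {}    # block digest -> block bytes, stored once
        self.recipes = {}   # file name -> ordered list of block digests

    def put(self, name, data):
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        self.recipes[name] = recipe

    def reconstitute(self, name):
        # Every block must be fetched and reassembled in order
        # before the file exists as a whole again.
        return b"".join(self.blocks[digest] for digest in self.recipes[name])


def index_terms(data):
    """Stand-in for an indexer: it needs the complete file contents."""
    return set(data.decode("utf-8").lower().split())


doc = b"retention policy for archived email and archived files"

block_store = BlockStore()
block_store.put("policy.txt", doc)

whole_files = {"policy.txt": doc}   # file kept intact, as in a SIS repository

# Re-indexing from the block store requires reassembly first ...
terms_from_blocks = index_terms(block_store.reconstitute("policy.txt"))
# ... while the intact file can be handed to the indexer directly.
terms_from_whole = index_terms(whole_files["policy.txt"])

print(terms_from_blocks == terms_from_whole)
```

Both paths yield the same index terms; the difference is the extra reassembly pass on the block-level side. In this sketch, copying the file to another medium such as tape would call for the same `reconstitute` step, whereas the intact copy could be streamed out as is.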

Jerome: So how do you explain the large number of partnerships that CommVault has with vendors that provide archiving and deduplicating products if SIS provides so many benefits?

Simon: CommVault currently has partnerships with many vendors that resell archiving and deduplicating solutions: Data Domain, EMC (Centera), HDS (HCAP), HP Integrated Archive Platform, IBM DR550, NetApp SnapLock, Permabit Enterprise Archive and Plasmon UDO Archive Appliance, just to name a few. Because of CommVault's architecture, we can support all of these platforms without spending a great deal of time integrating their products with ours. The architecture provides sufficient flexibility so that users can select whatever technology best fits their requirements.

Because of this, we see certain synergies emerging from these partnerships, as organizations today will typically implement a tiered infrastructure, with most companies using at least a couple of storage tiers. A specific benefit we see is companies using CommVault to facilitate the introduction of Green IT into their organization.

"Green IT" is becoming an important component of many companies' IT strategies. In the US, Green IT is primarily about saving power. For the rest of the world, saving power is part of the objective, but Green IT is also about social responsibility and achieving higher degrees of efficiency. These partner solutions allow companies to meet those objectives while they use CommVault as their primary means to search and move data across these different storage tiers.

Part 1 in this series took a look at the forthcoming paradigm shift that needs to occur in information management.

In part 3 of this three-part interview series, Simon discusses emerging challenges with Information Access and how evolving laws in the US and internationally are presenting new challenges for how archived data is managed, searched and retained.


Entry Sponsorship

This entry is sponsored by CommVault® Systems

About CommVault® Systems Blog

    CommVault® is determined to develop a better paradigm to manage data. A paradigm that would not attempt merely to "integrate" disparate solutions, but would spawn solutions designed to work together from a single, infinitely-adaptable code. A paradigm that would not merely address current data management needs, but that would anticipate and meet needs yet to come. The paradigm would be more accessible, adaptable, flexible and powerful than any data management solution to date. That paradigm is defined as Solving Forward. CommVault® Systems, Inc.

    DCIG is paid a fee by CommVault® Systems, Inc. in connection with this blog. CommVault® undertakes no obligation to update, correct or modify any statements contained in this blog; these statements represent the views and opinions of DCIG only.