Implement Data Deduplication on NetApp Volumes

Author: Jason Daunt
This is a proposal to evaluate and recommend the implementation of data deduplication on appropriate NetApp storage volumes to increase storage efficiency and maximize the return on the overall storage investment.
NetApp describes deduplication as a space-saving technology that increases storage efficiency by eliminating identical blocks of data and storing only unique data. The client currently uses data deduplication on approximately 9% of its volumes across 3 NetApp arrays. While data deduplication is not suitable for all data volumes, it is well suited to certain volume types. This document proposes that all volumes on the NetApp arrays be evaluated and that data deduplication be enabled on those that fit one of the following categories:
- File server/share
- Backup
- Virtualization (VMFS datastore)
If a volume needs further analysis to determine whether data deduplication will yield significant savings, the NetApp SSET (Space Savings Estimation Tool) can be used. It analyzes an online volume and estimates the overall savings that enabling deduplication would achieve.
Any volume that requires very high performance and/or very low response times will not be a candidate for data deduplication. Volumes that use operating system or application-level compression or encryption of any kind are also not suitable.
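To make the block-level idea concrete, the following is a minimal sketch of how a savings estimate can be computed: split the data into fixed-size blocks, fingerprint each block, and compare the number of unique fingerprints to the total. It is illustrative only and is not how SSET or the NetApp deduplication engine is implemented. The 4 KB block size, the use of SHA-256 as the fingerprint, and the function name estimate_dedupe_savings are assumptions made for this example.

```python
import hashlib

# Assumption for illustration: fixed 4 KB blocks.
BLOCK_SIZE = 4096


def estimate_dedupe_savings(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return the fraction of blocks that duplicate another block (0.0 to 1.0)."""
    total_blocks = 0
    unique_fingerprints = set()
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 stands in for whatever fingerprint a real system uses.
        unique_fingerprints.add(hashlib.sha256(block).digest())
        total_blocks += 1
    if total_blocks == 0:
        return 0.0
    return 1.0 - len(unique_fingerprints) / total_blocks


# Three identical blocks plus one different block: 2 unique out of 4.
sample = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
savings = estimate_dedupe_savings(sample)
print(f"Estimated savings: {savings:.0%}")
```

Data that is already compressed or encrypted tends to produce almost no repeated blocks, which is why such volumes would show an estimate near zero in a sketch like this.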
Alternative(s) Considered
Thin provisioning of both volumes and LUNs is already in use on the NetApp arrays and can work in conjunction with data deduplication. The same is true of data compression, which can work alongside dedupe and is a good option for backup volumes. An alternative is to maintain current operations and leave this storage efficiency feature largely unused. While this would eliminate any risk of performance impact from enabling deduplication on these volumes, it would also forgo the efficiency gain and the resulting improvement in overall storage utilization. Not recommended.
Performance Impact – Enabling data deduplication on NetApp volumes carries some overhead: a small impact on read operations, and CPU overhead on the NetApp filers while the deduplication operation runs. It is recommended to analyze the data and schedule deduplication operations accordingly. For example, if a volume is a suitable candidate for deduplication and the data change rate is low, a dedupe operation may only need to occur bi-weekly.
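The scheduling guidance above could be captured in a simple rule that maps a volume's change rate to a dedupe interval. The sketch below is one hypothetical way to express it. The thresholds (1% and 5% daily change), the interval values, the reading of "bi-weekly" as every 14 days, and the function name dedupe_interval_days are all assumptions for illustration; they are not NetApp recommendations and would need to be set from the analysis of the actual volumes.

```python
def dedupe_interval_days(daily_change_rate: float) -> int:
    """Suggest how many days to wait between dedupe runs for a volume.

    daily_change_rate is the fraction of the volume's data that changes
    per day (0.0 to 1.0). All thresholds below are hypothetical.
    """
    if not 0.0 <= daily_change_rate <= 1.0:
        raise ValueError("daily_change_rate must be between 0.0 and 1.0")
    if daily_change_rate < 0.01:
        return 14  # low change rate: every two weeks
    if daily_change_rate < 0.05:
        return 7   # moderate change rate: weekly
    return 1       # high change rate: daily, scheduled off-peak


# Hypothetical volumes and change rates, for illustration only.
for name, rate in [("archive_vol", 0.002), ("fileshare_vol", 0.03), ("vmfs_vol", 0.08)]:
    print(name, dedupe_interval_days(rate))
```

Keeping the rule in one place makes it easy to adjust the thresholds once real change-rate data has been collected for each array.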