One of the greatest operational challenges in modern data centers is copy management. The number of copies of data is proliferating. Part of the reason is the availability on all storage array platforms of snapshot technology, in particular space-efficient snapshots. Space-efficient snapshots were first introduced by NetApp, and are logical copies of data based on metadata held in WAFL (Write Anywhere File Layout). By taking just the delta changes between two snapshots, these enable much more efficient replication of data either locally or to remote sites. Another reason for so many copies is that while current disk drives have increased radically in density (with 4TB drives becoming the norm), the access density (the number of IOs and the amount of data that can be extracted from these drives in a unit of time) has remained the same or declined. A logical copy shares the same drives as its source, so to ensure copies of data can actually be used, physical copies of data have to be made. The average number of copies of data exceeds 10 in even a well-run data center.
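
The idea of replicating only the delta changes between two snapshots can be illustrated with a small sketch. It is not how WAFL or any particular array works: the fixed block size, the use of content hashes as the snapshot "metadata", and the function names `take_snapshot`, `snapshot_delta` and `replicate` are all assumptions made for illustration. Real arrays typically track changed blocks through pointer metadata rather than by hashing.

```python
import hashlib

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is readable


def split_blocks(data: bytes) -> list[bytes]:
    """Split a volume image into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def take_snapshot(data: bytes) -> dict[int, str]:
    """Record metadata only: one content hash per block index (illustrative)."""
    return {i: hashlib.sha256(b).hexdigest()
            for i, b in enumerate(split_blocks(data))}


def snapshot_delta(old: dict[int, str], new: dict[int, str]) -> list[int]:
    """Indices of blocks that were added or changed between two snapshots."""
    return sorted(i for i, digest in new.items() if old.get(i) != digest)


def replicate(replica: bytes, source: bytes, changed: list[int]) -> bytes:
    """Bring the replica up to date by shipping only the changed blocks."""
    blocks = split_blocks(replica)
    src_blocks = split_blocks(source)
    for i in changed:  # sorted, so new trailing blocks are appended in order
        if i < len(blocks):
            blocks[i] = src_blocks[i]
        else:
            blocks.append(src_blocks[i])
    # Truncate in case the source volume shrank (a simplification).
    return b"".join(blocks)[:len(source)]


v1 = b"AAAABBBBCCCCDDDD"        # volume at the time of the first snapshot
v2 = b"AAAAXXXXCCCCDDDDEEEE"    # one block overwritten, one block appended

snap1, snap2 = take_snapshot(v1), take_snapshot(v2)
changed = snapshot_delta(snap1, snap2)   # blocks 1 and 4 differ
remote = replicate(v1, v2, changed)      # remote copy now matches v2
```

In this toy case only two of the five blocks cross the wire, which is the efficiency the paragraph above describes. The replica is still a logical view of shared data, so the access-density limits on the underlying drives are unchanged.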

The cost implications of these copies of data are great. The challenges of managing all these copies are even greater. Finding snapshots uses the same principles as paper files: the newest one is the one on top, with the least amount of dust. In most organizations, the records of which snapshots exist, when they were taken, which developers or end-users have used them, whether they have been deleted, and the provenance of the snapshots used in downstream processing and data warehousing are extremely suspect. This leaves security and compliance less than adequate.

The solution put forward by software vendors is to keep track of all copies of data made, and of the usage made of every copy. The resultant metadata about the copies can be used to:

