Distributed backup is the key to ILM

Computer Technology Review, Nov, 2004 by Eran Farajun

ILM solutions can significantly reduce the cost and complexity of data storage, but to reap the greatest rewards, ILM relies on a backup system that is ILM-aware. ILM has two goals. One is to minimize administration costs. The other is to make the most efficient use of storage hardware. Without a backup architecture that maximizes or even enables ILM, these goals cannot be realized effectively.

The Case for ILM

Since enterprises are so dependent on information about their processes, products, customers and suppliers, data storage is a challenge for IT executives and storage administrators everywhere. Reliable and secure data storage is crucial to business continuity plans. Many industries, such as finance and health care, face new regulatory policies that mandate ever-increasing durations of data retention.

Because of the combination of more data and longer retention times, the cost of managing information throughout its lifecycle grows by as much as 20% to 30% per year, according to some estimates.

Though opinions vary, for the purposes of this article ILM will be defined as a data archiving process that automatically moves data to the most cost-effective storage media, based on predetermined policies of accessibility, security and long-term storage. Data is transferred automatically, with no manual intervention required, reducing hardware and real estate costs. As a result, ILM vendors promise a significant Return on Investment (ROI).

Archiving Versus Backup

All of an enterprise's data can be placed into one of two categories. Critical information is that which is needed for day-to-day operations and resides in the system's primary storage for fast access. Important information is the historical, legal and regulatory information that can safely be archived to secondary storage--lower cost disk or tapes stored offsite.

Critical data is typically accessed often. As a given file is accessed less and less frequently, however, it eventually changes from critical to important. If, as a matter of policy, a file ceases to be critical and becomes important after ninety days of inactivity, an ILM solution automatically archives it to secondary storage at that point, without any intervention by IT personnel. ILM solutions create a pointer or placeholder for every file moved to secondary storage. Should a user request the file after it has been archived (that is, if the important information becomes critical again), this placeholder points to the new location so that the system can retrieve the file and move it back to primary storage.
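The ninety-day policy and the placeholder mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions: storage tiers are modeled as in-memory dictionaries, time is counted in whole days, and every name (`Placeholder`, `apply_ilm_policy`, `read_file`) is hypothetical rather than taken from any ILM product.

```python
from dataclasses import dataclass

ARCHIVE_AFTER_DAYS = 90  # policy threshold from the example in the text


@dataclass
class Placeholder:
    """Hypothetical stub left on primary storage, pointing at the archived copy."""
    archived_name: str


def apply_ilm_policy(primary, secondary, last_access_day, today):
    """Archive every file that has been inactive for ARCHIVE_AFTER_DAYS or more."""
    archived = []
    for name in list(primary):
        content = primary[name]
        if isinstance(content, Placeholder):
            continue  # already archived on an earlier run
        if today - last_access_day[name] >= ARCHIVE_AFTER_DAYS:
            secondary[name] = content                          # move data to cheaper storage
            primary[name] = Placeholder(archived_name=name)    # leave a pointer behind
            archived.append(name)
    return archived


def read_file(primary, secondary, last_access_day, name, today):
    """Return file content, recalling it to primary storage if it was archived."""
    content = primary[name]
    if isinstance(content, Placeholder):
        content = secondary.pop(content.archived_name)  # follow the pointer
        primary[name] = content                          # important becomes critical again
    last_access_day[name] = today
    return content


# Illustrative data only: one stale file, one recently used file.
primary = {"q1_report.doc": b"figures", "inbox.pst": b"mail"}
secondary = {}
last_access_day = {"q1_report.doc": 0, "inbox.pst": 95}
moved = apply_ilm_policy(primary, secondary, last_access_day, today=100)
```

In this run only `q1_report.doc` is moved, because it has been idle for 100 days while `inbox.pst` was touched five days ago. A later call to `read_file` for the archived file follows the placeholder and restores the content to primary storage.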

Archiving data that is no longer needed for day-to-day operations by moving it to long-term storage is functionally distinct from backup operations, which protect operational, critical data before it can be archived.

One key failing of backup systems that are not ILM-aware is that they will continue to store backup files on tape or secondary disk, even though this data has been archived elsewhere. Since this secondary storage must still be managed, the overall return on the ILM investment will be considerably less than anticipated.

The Figure illustrates this process in a typical e-mail setup. This architecture includes a backup system that protects critical data on primary storage before it is archived to lower-cost disks or tape by an ILM solution. This traditional tape-based backup is the ILM solution's Achilles' heel when it comes to ROI.

The Problem With Backup

Typically, the backup saves files from primary storage to secondary storage on a daily basis. As long as a file remains critical (on primary storage) it will be backed up routinely--daily, in most enterprises. This means that the same file, often in multiple versions, is saved and stored many times, resulting in excessive hardware or media costs, administration time, and storage real estate, both onsite and offsite. A backup approach that is ILM-aware, and overcomes this problem, is Distributed Backup.
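A back-of-the-envelope calculation shows how quickly this duplication adds up. The sketch below assumes, purely for illustration, that every backup run keeps a full copy of the file and that no copy expires; real backup products often use incremental schemes, so the result is an upper bound rather than a measurement. The function name and the figures are hypothetical.

```python
def stored_backup_mb(file_size_mb, days_on_primary, backups_per_day=1):
    """Total MB held on backup media if each run keeps a full copy (assumption)."""
    return file_size_mb * days_on_primary * backups_per_day


# Hypothetical example: a 10 MB file that stays on primary storage for
# ninety days and is backed up once a day.
total_mb = stored_backup_mb(file_size_mb=10, days_on_primary=90)
copies = 90 * 1
```

Under these assumptions a single 10 MB file occupies 900 MB of backup media by the time it is archived, in ninety separate copies.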

[GRAPHIC OMITTED]

One advantage of using Distributed Backup in the ILM environment is that it eliminates the need for daily backups to tape, and the subsequent rotation, retrieval and storage of these tapes.

A Distributed Backup system collects the data to be backed up from LAN clients and sends it to offsite disk storage in a compressed and encrypted format. It also retrieves this data from offsite when it is needed for a restore. Because the process is fast and fully automated, backups can take place as often as desired.

ILM-aware Distributed Backup or, more simply, Backup Lifecycle Management (BLM), takes advantage of the ILM archive's placeholders to keep only one copy of the file on either backup or secondary storage--but not both. These placeholders help the backup determine which files have already been archived. This allows it to automatically remove them from the backup disks, freeing up storage space and eliminating file duplication.

When BLM recognizes a placeholder in the backup data received from the client, it knows that the associated file has been transferred to secondary storage. It therefore searches the backup disk for the original file, deletes it, and saves only the placeholder.
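The placeholder handling described above might look roughly like the following. It is a minimal sketch, not a description of any vendor's implementation: the backup disk is modeled as a dictionary keyed by file name, and `Placeholder` and `receive_backup` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placeholder:
    """Hypothetical marker sent by the client in place of an archived file."""
    archive_location: str


def receive_backup(backup_disk, client_data):
    """Apply one backup set from a client; return the bytes freed on the backup disk."""
    freed = 0
    for name, item in client_data.items():
        if isinstance(item, Placeholder):
            # The file now lives on secondary storage: drop the full copy
            # from the backup disk and keep only the placeholder.
            original = backup_disk.get(name)
            if isinstance(original, bytes):
                freed += len(original)
            backup_disk[name] = item
        else:
            # Still critical data on primary storage: back it up as usual.
            backup_disk[name] = item
    return freed


# Illustrative data only.
backup_disk = {"a.doc": b"12345", "b.doc": b"xy"}
client_data = {"a.doc": Placeholder("secondary/a.doc"), "b.doc": b"xyz"}
freed_bytes = receive_backup(backup_disk, client_data)
```

After this run the backup disk holds only the placeholder for `a.doc` and the current copy of `b.doc`, so the archived file exists in one place rather than two.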

 

Content provided in partnership with Thompson Gale