Understanding Online Archiving - Technology Information
Computer Technology Review, Jan, 2000 by Paul Wang
Online archiving provides more efficient and faster access, plus major disk space savings without performance penalties
System administrators have historically relied on offline archiving for data backup and storage. In a typical scenario, offline archiving is a manual process for moving data to media that are no longer connected to the system environment. When that data must be retrieved, a similar manual process has to be performed to bring it back into the system environment so that it can be used.
The manual archiving method has other drawbacks besides the time it takes. It impacts user productivity while system administrators wait for an operator to locate the right tape and load it. Locating the desired files can also be a challenge when there are potentially hundreds or even thousands of on-site and off-site tapes. Even once the files are found, the data on the tapes may be corrupted, because oxidation makes the shelf life of the tape medium uncertain. Finally, offline archiving increases management and helpdesk costs because it is a very manpower-intensive operation.
The other traditional data storage method, nearline archiving, involves moving the data to slower media such as robotic tape and laser or magneto-optical jukeboxes. Nearline archiving is also referred to as Hierarchical Storage Management (HSM). Retrieving data from nearline archiving devices is slow, but much faster than retrieving it from offline archives, since it is not a manual process.
An HSM system selects files through a policy procedure and archives them. The archiving is a multi-step process that includes compressing the data and then moving the files to the nearline storage device. Additionally, when a user or application attempts to access an archived file, a time lag occurs. The HSM finds the device and media where the file is located, and then instructs the device to load the appropriate media. Once the media is loaded, the HSM retrieves the file from it and decompresses it, at which point the file is available.
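The archive and recall steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular HSM product works: the `FileRecord` type, the idle-days policy, and the dictionary standing in for the nearline device are all hypothetical, and `zlib` is used simply as a convenient stand-in for whatever compression a real system applies.

```python
import zlib
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical description of a file on the online system."""
    name: str
    days_since_access: int
    data: bytes


def select_for_archive(files, min_idle_days):
    """Policy step: pick files that have been idle at least min_idle_days."""
    return [f for f in files if f.days_since_access >= min_idle_days]


def archive(record, nearline):
    """Compress the file, then move it to the nearline store.

    `nearline` is a plain dict standing in for a tape robot or jukebox.
    """
    nearline[record.name] = zlib.compress(record.data)


def recall(name, nearline):
    """Locate the archived file, retrieve it, and decompress it."""
    return zlib.decompress(nearline[name])
```

In this toy model, recall is instantaneous. In a real nearline system, the lookup inside `recall` is where the device must locate and load the media, which is the source of the time lag.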
Issues the system administrator faces in nearline archiving include configuring the system for optimum storage, that is, archiving only the least-needed data. Additionally, the HSM system must operate as desired without regularly degrading performance.
For example, let's say an HSM system is configured and files are migrated to nearline devices. A "performance hit" or lag time is incurred to access a particular file and bring it back to the online system. If the HSM system is not properly configured, one of two situations can occur. First, the system administrator does not archive enough data because he or she is not sure whether it will be needed or whether the lag time is acceptable. Second, too much data is archived, and lag time results each time an archived file is accessed.
A case in point is an application that requires a nearline-archived file every three months. On each occasion, this file is retrieved from a tape robotics system and brought back into the system, and lag time is incurred. Here's how this scenario plays out. After 60 days, this particular file is moved off the system and, 30 days later, it is moved back onto the system. Because this movement is so unproductive, most system administrators opt for the first extreme of not archiving enough data, due to the lag time issue.
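The back-and-forth movement in this scenario can be counted with a small simulation. The sketch below assumes a simple policy, not taken from any specific HSM product, in which a file is migrated once it has gone a fixed number of days without access and is recalled the next time it is accessed; the function and parameter names are hypothetical.

```python
def simulate(access_interval_days, idle_threshold_days, total_days):
    """Count migrations and recalls for one file under an idle-days policy.

    Assumes the file was last accessed on day 0 and is then accessed
    every access_interval_days.
    """
    migrations = 0
    recalls = 0
    archived = False
    idle = 0
    for day in range(1, total_days + 1):
        idle += 1
        if day % access_interval_days == 0:
            # The application needs the file today.
            if archived:
                recalls += 1       # lag time is incurred here
                archived = False
            idle = 0
        elif not archived and idle >= idle_threshold_days:
            migrations += 1        # policy moves the file to nearline
            archived = True
    return migrations, recalls
```

With a 90-day access interval and a 60-day threshold, `simulate(90, 60, 360)` returns `(4, 4)`: the file is moved off and back on four times in 360 days. Raising the threshold above the access interval, for example `simulate(90, 120, 360)`, returns `(0, 0)`, which avoids the lag but also means the file is never archived.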
Then there is the cost of nearline archiving, which is a highly complex system. Both the hardware and the software are expensive. However, the highest cost incurred with nearline archiving, or HSM, is management. HSM is complex to configure and to manage well. Without archiving data, though, system administrators will eventually run out of disk space. Each time this occurs, the system is brought down and new hardware is installed. Then it is configured and the data is reloaded. The downtime and management are very expensive. (This scenario assumes that the hardware was already purchased and delivered. If not, the cost of managing this system skyrockets.) Also, the more pieces of hardware there are, the greater the opportunity for failure. The Table shows that on average the disk drive Mean-Time-Between-Failure (MTBF) is five years for one disk. With 60 disks, the MTBF of the system as a whole is one month, and, with 180 disks, it is 10 days.
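The MTBF figures quoted above follow from simple division. The sketch below assumes that disks fail independently at a constant rate, so the expected time to the first failure among N disks is the single-disk MTBF divided by N. That model is an assumption made here for illustration; the article does not state how its figures were derived, and the function name is hypothetical.

```python
DAYS_PER_YEAR = 365


def system_mtbf_days(single_disk_mtbf_years, disk_count):
    """Approximate MTBF of a group of disks, in days.

    Assumes independent failures at a constant rate, so the failure
    rates of the individual disks simply add up.
    """
    return single_disk_mtbf_years * DAYS_PER_YEAR / disk_count
```

For a five-year single-disk MTBF, `system_mtbf_days(5, 60)` gives about 30.4 days (roughly one month) and `system_mtbf_days(5, 180)` gives about 10.1 days, in line with the figures above.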
Online Archiving
Online archiving for Unix and Windows NT environments is now emerging to resolve the storage and backup issues that plague system administrators. Online archiving refers to taking data that is not used on a regular basis and storing it efficiently on direct access systems: disk drives or enterprise storage systems connected via SCSI, fiber, or other cabling. Additional hardware is not required in an online archiving environment. More importantly, in addition to efficient data storage, the hallmark of online archiving is high-speed access when the data is needed. Key benefits to the system administrator are reduced backup time and reduced hard-drive requirements, which in turn translate into reduced management, maintenance, and support expenditures.



