Insuring The Reliability Of Fibre Channel RAID Storage - Industry Trend or Event

Computer Technology Review, Jan, 2000 by Kris Land

A major benefit of Storage Area Networks is fast "any to any" server or client access to RAID storage. In a mission-critical environment, this places emphasis on ensuring high availability of not only the data access paths, but also the RAID storage system itself.

Fortunately, standardized Fibre Channel layers define media and interface characteristics, as well as specifying highly reliable transmission protocols with low bit-error rates. SAN fabrics have evolved to include redundancies among switches and access paths, providing failover insurance against hardware problems.

From a hardware perspective, RAID systems typically include such high-availability features as redundancies, hot-swappability, and thermal management to dissipate heat build-up. Fibre Channel RAID systems with dual-loop architectures even provide protection against internal disk channel failures. Alarm systems and remote management capabilities further contribute to the reliability of today's RAID storage systems.

The storage industry has embraced traditional RAID levels (1, 3, 5) and variations thereof (0+1, 1+5, 6, etc.) as means of protecting critical information against the likelihood of disk drive failures. Typically, however, this protection is limited to a single drive failure (RAID 3 or 5). At most, protection against three concurrent inoperable drives is achieved, but at the cost of expensive mirroring. Even exotic arrays of this nature have limitations on the conditions under which drive failures can be sustained.
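The single-failure limit of parity-based levels such as RAID 3 and 5 can be illustrated with a minimal sketch. It assumes one XOR parity block per stripe; the function names (`xor_blocks`, `make_stripe`, `rebuild`) are hypothetical and are not taken from any particular RAID product.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, failed_index):
    """Rebuild the one missing block by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = make_stripe(data)

# One failed drive: the missing block is fully recoverable.
print(rebuild(stripe, 1) == data[1])

# Two failed drives (1 and 2): the survivors only yield the XOR of the two
# missing blocks, which is not enough to recover either one individually.
print(xor_blocks([stripe[0], stripe[3]]) == xor_blocks([stripe[1], stripe[2]]))
```

With a single parity block there is one equation per byte position, so only one unknown block can be solved for; a second concurrent failure leaves the stripe unrecoverable.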

LAND-5 has developed patented algorithms that allow a disk RAID array consisting of "N" drives to sustain operations even in the event of "M" drive failures, where 1 <= M < N. Called "eRAID," this breakthrough technology can be implemented with far fewer disk drives than mirroring while also yielding higher performance and enhanced reliability.
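To put the drive-count comparison with mirroring in rough numbers, the sketch below contrasts two quantities: the drives needed when every data drive is kept in M+1 full copies, and the lower bound that applies to any redundancy scheme (the survivors of M failures must still hold all the data). The article does not say how close eRAID comes to that bound, and the function names are hypothetical.

```python
def drives_with_mirroring(data_drives, failures):
    """Drives needed when each data drive is kept in (failures + 1) full copies."""
    return data_drives * (failures + 1)

def minimum_drives(data_drives, failures):
    """Lower bound for any scheme: the drives left after `failures` losses
    must still have room for all the data."""
    return data_drives + failures

# Assumed example: 8 drives' worth of data, tolerating 1 to 3 failures.
for m in (1, 2, 3):
    print(m, drives_with_mirroring(8, m), minimum_drives(8, m))
```

For eight drives of data, tolerating three arbitrary failures by mirroring alone takes 32 drives, while the bound for any scheme is 11.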

INTRODUCTION

With the growth of mission-critical information requiring twenty-four hour access, the reliability of storage systems is paramount. Downtime is extremely costly: while systems are unavailable, customers, vendors, employees, and prospects cannot conduct essential business or critical operations. There is a "lost opportunity" cost to storage failures, as well, in terms of business lost to competitors. Well-documented studies place the cost of downtime in the tens of thousands (or even millions) of dollars per hour.

Consider the recent problems with eBay, a major online auction Website with 2 million customers that suffered extended equipment crashes. The company, which saw its stock value slide by almost 20 percent, lost significant revenue over the three-day period. eBay warned that the latest 22-hour outage would knock between $3 million and $5 million off Q2 sales. However, the greater damage could be to eBay's reputation, especially if it continues to be plagued by outages. In a recent survey of consumers, Jupiter Communications found that 46 percent of online consumers leave a preferred site if they experience technical or performance problems.

The need for large amounts of reliable online storage is fueling demand for fault-tolerant technology. According to International Data Corporation, the market for disk storage systems last year grew by 12 percent, topping $27 billion. More telling than that figure, however, is the growth in capacity being shipped, which grew 103 percent in 1998. Much of this explosive growth can be attributed to the space-eating demands of endeavors such as year 2000 testing, installation of data-heavy enterprise resource planning applications, and the deployment of widespread Internet access.

The rising tide of Storage Area Networks (SAN) is fueled by the prospect of providing "any to any" high-performance access by networked servers and clients to critical information on a continuous basis. RAID storage is the underlying foundation of SAN technology, necessary to ensure that mission-critical data is available when needed. Access to online storage on a 24x7 basis is essential to most SAN configurations. Thus, the reliability of Fibre Channel RAID storage is, in a sense, the Achilles heel of a SAN fabric.

In examining SAN storage, attention is quickly focused on three elements that are essential to reliability:

* The error checking scheme inherent in the transmission protocol

* The reliability of the RAID storage unit itself

* The ability of the RAID storage system to withstand multiple drive failures

This article discusses each topic in turn. Industry answers exist for the first two subjects, but the storage community is still applying expensive "band-aids" in an attempt to overcome the inevitability of disk drive failures in large storage arrays.

TRANSMISSION PROTOCOLS

Fibre Channel has five layers: FC-0 through FC-4. The FC-0 layer defines the media and interface characteristics of full-duplex serial links between points. It lets Fibre Channel scale its signaling rates and define conforming cabling and connectors without affecting upper level protocols. As such, the FC-0 layer facilitates high-performance availability to Fibre Channel storage systems.
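The error checking built into the transmission protocol can be illustrated with a short sketch. Fibre Channel frames carry a 32-bit CRC; the code below uses Python's `zlib.crc32` as a stand-in and is not claimed to be bit-exact with Fibre Channel framing. The helper names and the sample payload are hypothetical.

```python
import zlib

def add_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload, as a sender would."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_crc(frame: bytes) -> bool:
    """Recompute the CRC at the receiver and compare it with the trailer."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received

frame = add_crc(b"example payload")

# Flip a single bit in the first byte to simulate a transmission error.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(check_crc(frame))      # intact frame passes the check
print(check_crc(corrupted))  # damaged frame is rejected
```

A receiver that finds a mismatch discards the frame, so corrupted data is not passed on to the storage protocol above it.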

 


Content provided in partnership with Thompson Gale