Back-end switching in storage server design: improves the performance and availability of storage systems - High Availability
Computer Technology Review, July, 2002 by Richard Lary
Moving from a shared back-end bus structure to a switch-based back-end structure in the design of RAID servers or file servers can significantly enhance the performance and availability of these systems.
Until recently, including a back-end switch as an essential component of a storage server was not practical: the power, cooling, packaging, and cost requirements of such a switch were prohibitive. Adding a back-end switch would also have required significant changes to the firmware of the storage server. Recent changes in back-end switch design, however, now make it practical and feasible.
To illustrate the benefits of adding a back-end switch to a storage server, we will first discuss the design issues of conventional storage servers. A storage server consists of one or more "controllers" that actually deliver the storage service, plus the packaging that holds those controllers and their back-end disks. Storage servers are designed to provide a variety of value-added services, but a primary goal they all share is to enhance the characteristics of their individual disk drives in the areas of performance (bandwidth, throughput, and latency) and RAS (reliability, availability, and scalability).
We will discuss storage server design issues using the example of modular RAID servers, which represent the majority of enterprise storage servers shipped worldwide. In general, the same design issues arise in monolithic RAID servers such as the EMC Symmetrix, HDS Lightning, and IBM Shark, and in enterprise file servers (NAS). We will then demonstrate how incorporating a back-end switch into the design of a storage server can lead to significant performance and availability improvements.
Modular RAID Server Design
Modular RAID servers are available from many manufacturers and they are all variants on a common theme. As shown in Figure 1, each controller in a modular RAID server consists of the following functional elements:
* Processor on a control bus with local memories for programs (program load memory) and control structures (control memory).
* Data bus, data/cache memory and data retention system.
* Host and disk interfaces.
* Cache mirror interface connecting to the other controller in the server.
* Parity computation logic.
All of these functional elements are integrated into a single controller that is replicated in its entirety to create a high-availability server. The controller's processing power, memory bandwidth, number of host interfaces, and disk interfaces are all fixed, although the amount of cache memory and the number of back-end disks in the server may be upgradeable.
Common elements of a modular RAID server include host and disk interfaces, processor and data bus, cache, modular disk packaging, and high availability mechanisms.
Host and disk interfaces: The most prevalent host interface found today in modern modular RAID controllers is Fibre Channel, and the most prevalent disk interface is FC-AL. Modular RAID controllers generally have two host interfaces. The number of disk interfaces determines the maximum number of disks the controller can connect to as well as how much performance the controller can get from requests that miss its cache.
Processor and data path design: The RAID controller processor has a private bus connecting to a memory that holds code and control structures, which ensures that the processor's memory traffic does not interfere with data traffic through the controller. The two controllers in a modern modular storage server have enough processing power to operate 120-500 of the fastest available back-end disks at full speed, and processing power is rarely a performance bottleneck in these controllers. The data bus connects the host interfaces, the disk interfaces, the data/cache memory, and the cache mirror interface in a manner that is optimized for burst data traffic. A pair of modern controllers has enough combined data bus and data memory bandwidth to transfer data between six 2Gbps back-end FC-AL buses and six 2Gbps front-end Fibre Channel fabrics at full speed, which removes internal bandwidth as a performance bottleneck.
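The figures in the paragraph above can be turned into a rough back-of-the-envelope calculation. The Python sketch below is illustrative only and is not taken from the article: it assumes a nominal payload rate of about 200 MB/s per 2Gbps Fibre Channel link, and the function names are hypothetical. It shows the aggregate bandwidth of six back-end buses and the share of that bandwidth available to each disk at the two ends of the 120-500 disk range.

```python
# Assumption (not from the article): a 2Gbps Fibre Channel link carries
# roughly 200 MB/s of payload in one direction.
MB_PER_SEC_PER_2G_LINK = 200


def aggregate_bandwidth_mb_s(num_links, per_link_mb_s=MB_PER_SEC_PER_2G_LINK):
    """Total nominal bandwidth of num_links links running in parallel."""
    return num_links * per_link_mb_s


def per_disk_share_mb_s(num_links, num_disks, per_link_mb_s=MB_PER_SEC_PER_2G_LINK):
    """Back-end bandwidth available per disk if all disks are busy at once."""
    return aggregate_bandwidth_mb_s(num_links, per_link_mb_s) / num_disks


if __name__ == "__main__":
    back_end = aggregate_bandwidth_mb_s(6)
    print(f"Six back-end buses: {back_end} MB/s nominal")
    for disks in (120, 500):
        share = per_disk_share_mb_s(6, disks)
        print(f"{disks} disks: {share:.1f} MB/s of bus bandwidth per disk")
```

Under these assumptions the six buses supply 1200 MB/s in total, which works out to 10.0 MB/s per disk with 120 disks and 2.4 MB/s per disk with 500 disks. The per-disk share of a shared bus shrinks as disks are added, even when the controllers themselves are not the bottleneck.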
With the recent and continuing improvements in commodity processor, memory, and bus performance, back-end bus efficiency and disk drive connectivity have become the most significant areas in which storage server designers can differentiate themselves on performance.
Cache: Controller cache provides two functions--read caching and write caching. Read caching improves latency and throughput by holding disk data that is anticipated to be read by applications. Write caching captures write data in the cache instead of writing it immediately to disk, thus providing the illusion of low-latency disk writes. Customers buy lots of controller cache believing it will help their application performance. The performance improvements on real-world workloads due to adding more than the minimum amount of controller cache, however, are far less than most customers (and some storage server designers) believe!
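The two cache functions described above can be sketched as a toy model. The Python code below is a minimal illustration, not the design of any particular controller: the class name, the least-recently-used eviction policy, and the dictionary standing in for the back-end disks are all assumptions made for the example. Reads that hit the cache avoid a disk access, and writes are acknowledged once the data is in cache and reach the disk only later.

```python
from collections import OrderedDict


class ControllerCache:
    """Toy model of a controller cache with read caching and write caching.

    Hypothetical illustration: 'disk' is a plain dict standing in for the
    back-end disks, and eviction is least-recently-used.
    """

    def __init__(self, disk, capacity):
        self.disk = disk              # block number -> data
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # block number -> (data, dirty flag)
        self.hits = 0
        self.misses = 0

    def _insert(self, block, data, dirty):
        self.blocks[block] = (data, dirty)
        self.blocks.move_to_end(block)
        while len(self.blocks) > self.capacity:
            old_block, (old_data, old_dirty) = self.blocks.popitem(last=False)
            if old_dirty:
                # Unwritten data must reach the disk before it is discarded.
                self.disk[old_block] = old_data

    def read(self, block):
        if block in self.blocks:
            self.hits += 1            # read hit: no disk access needed
            self.blocks.move_to_end(block)
            return self.blocks[block][0]
        self.misses += 1              # read miss: fetch from disk, then cache
        data = self.disk[block]
        self._insert(block, data, dirty=False)
        return data

    def write(self, block, data):
        # The write completes as soon as the data is in cache; the disk is
        # updated later, on eviction or on flush().
        self._insert(block, data, dirty=True)

    def flush(self):
        for block, (data, dirty) in list(self.blocks.items()):
            if dirty:
                self.disk[block] = data
                self.blocks[block] = (data, False)


if __name__ == "__main__":
    disk = {n: f"old-{n}" for n in range(4)}
    cache = ControllerCache(disk, capacity=2)
    cache.write(0, "new-0")
    print(disk[0])        # still the old data: the write is held in cache
    print(cache.read(0))  # served from cache
    cache.flush()
    print(disk[0])        # now updated on disk
```

The model also hints at why extra cache can help less than expected: a larger capacity only pays off when the workload actually re-reads blocks that are still resident.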