Achieving simplicity with clustered, virtual storage architectures

Computer Technology Review, Dec, 2004 by Rob Peglar

Since the invention of the disk drive in the 1950s, storage technology has evolved at an ever-increasing rate. Today, there is a wide range of storage choices, from the simple but notoriously unreliable 'Just a Bunch Of Disks', or JBOD, to the highly reliable but extremely complex monolithic arrays. Faced with this Hobson's choice, many of today's users are asking, "How can we have extremely reliable storage without the complexity?" In other words, are we bound to an ever-increasing spiral of complexity, or can storage be 'tamed'--be made simple--no matter what?

To answer that question, we examine two architectural techniques that are common in the blade server and application worlds--virtualization and clustering--and that are starting to enter the mainstream in the storage universe. But how is an enterprise to choose which combination of storage virtualization and clustering, if any, is applicable, especially for a universe of scalable blade server clusters acting as the base for highly resilient, powerful application clusters? Can complexity be managed, or must complexity be eliminated? This article will show that only the latter has business value, and that only the combination of virtualization and clustering can achieve simplicity; neither technique, by itself, can provide optimal reduction of complexity.

Virtualization Without Clustering

Virtualization of storage has many different manifestations. However, virtualization by itself is not the entire answer to optimal simplicity. If scalability is still a function of individual, non-clustered elements, and those elements must be managed as such, then optimal simplicity cannot be achieved. It is not sufficient to merely 'move' the complexity up one level; to achieve simplicity, the complexity must be eliminated. Still, since many non-clustered virtual storage arrays exist in the marketplace, and many storage professionals think of them as clustered, the approach is worth exploring.

Virtualization without clustering may be accurately described as a redundancy technique rather than a clustering technique. In particular, there is no decision-making during non-clustered controller failover or failback; there is by definition only one option. While virtual disks may be created and used, pairs of controllers do not operate in a clustered fashion. There may be multiple pairs, but there is no cluster. In other words, the virtualization does not extend to virtualizing controller elements across the infrastructure. This is a distinct inhibitor to achieving simplicity, and the impact on the business may be severe. For example, in a data recovery situation, non-clustered controller failover necessitates at best the physical relocation and complex reconnection of blade servers--i.e., physically moving the blade servers to the data--or at worst, the reverse replication of data, since the replicated data cannot be directly accessed--i.e., physically moving the data to the blade servers. Blade server reconnection or reverse replication may take hours or even days for large volumes of data. The expense and risk of lost time, lost productivity and lost transaction opportunity are significant in such situations.

Another factor that inhibits simplicity is that any blade server using the LUNs made visible by the pair of controllers must have host-resident failover software installed and running in order for dual-controller failover to succeed. Virtualized but non-clustered also means that a given set of physical disks is managed and accessed by one and only one fixed pair of controllers. In addition, any given LUN is managed and made accessible by one and only one controller. Failover (of a controller to its paired counterpart) may occur as a planned event, typically triggered by an administrator, or as an unplanned event, due to any failure between the blade server (initiator) and the controller. The failing controller redirects all traffic for its LUNs to its counterpart, and the blade server(s) using those LUNs must then be informed to alter their communication path(s), since the paired controller by definition has a different SAN address (e.g. a Fibre Channel worldwide name) than the failing controller. This process takes anywhere from several tens of seconds to minutes. The end result is that all blade server(s) that used LUN(s) from the failed controller now communicate with its paired counterpart. The reverse process, failback, is similar, although it is always initiated by administrator intervention.
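The failover sequence described above can be sketched as a toy model. This is only an illustration under stated assumptions, not any vendor's implementation: the class names (Controller, Host, ControllerPair), the WWN strings and the LUN labels are all hypothetical, and the timing of the process is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    wwn: str                                   # SAN address, e.g. a Fibre Channel worldwide name
    luns: set = field(default_factory=set)     # LUNs this controller currently owns
    healthy: bool = True


@dataclass
class Host:
    name: str
    has_failover_software: bool
    paths: dict = field(default_factory=dict)  # LUN -> WWN the host sends I/O to


class ControllerPair:
    """Hypothetical non-clustered pair: each LUN is owned by exactly one controller."""

    def __init__(self, a: Controller, b: Controller):
        self.a, self.b = a, b

    def partner_of(self, ctrl: Controller) -> Controller:
        return self.b if ctrl is self.a else self.a

    def fail_over(self, failed: Controller, hosts: list) -> None:
        # No decision-making: the only possible target is the paired counterpart.
        partner = self.partner_of(failed)
        failed.healthy = False
        moved = set(failed.luns)
        partner.luns |= moved
        failed.luns.clear()
        for host in hosts:
            for lun in moved:
                # Only hosts running failover software can switch to the new WWN.
                if lun in host.paths and host.has_failover_software:
                    host.paths[lun] = partner.wwn

    def can_access(self, host: Host, lun: str) -> bool:
        wwn = host.paths.get(lun)
        for ctrl in (self.a, self.b):
            if ctrl.wwn == wwn:
                return ctrl.healthy and lun in ctrl.luns
        return False


if __name__ == "__main__":
    a = Controller("wwn-a", {"lun0", "lun1"})
    b = Controller("wwn-b", {"lun2"})
    pair = ControllerPair(a, b)
    blade1 = Host("blade1", True, {"lun0": "wwn-a"})
    blade2 = Host("blade2", False, {"lun1": "wwn-a"})
    pair.fail_over(a, [blade1, blade2])
    print(pair.can_access(blade1, "lun0"), pair.can_access(blade2, "lun1"))
```

In this model the blade server without failover software keeps addressing the failed controller's WWN and loses access to its LUN, while the other blade server follows the LUN to the surviving controller.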

By definition, this architecture does not lend itself to reducing complexity. In fact, it forces the enterprise to decide, a priori, which blade server(s) to connect to which controller pair(s), and how many disk drives (and of what size and speed) to place behind each controller pair. Once selected, this cannot be changed without downtime and data loss. By definition, this process does not scale and is not optimal--it is a 'best guess'. Since the typical application and OS cannot tolerate a change in LUN address, as advertised by the controller, host software is required to facilitate failover. Without this software, paired failover would result in loss of access to data volumes.
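The cost of that a priori decision can be made concrete with a small sketch. It assumes a simplified model in which each blade server is bound to exactly one controller pair and capacity is counted in gigabytes; the names (PairCapacity, allocate, stranded_gb) and the figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairCapacity:
    name: str
    capacity_gb: int        # disks placed behind this controller pair up front
    allocated_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.allocated_gb


def allocate(pairs: dict, server_to_pair: dict, server: str, size_gb: int) -> bool:
    """Carve a LUN for `server`; only its own, pre-assigned pair may serve it."""
    pair = pairs[server_to_pair[server]]
    if size_gb > pair.free_gb:
        return False        # free space behind other pairs cannot be used
    pair.allocated_gb += size_gb
    return True


def stranded_gb(pairs: dict, server_to_pair: dict, server: str) -> int:
    """Free capacity that exists but sits behind pairs this server is not attached to."""
    own = server_to_pair[server]
    return sum(p.free_gb for name, p in pairs.items() if name != own)


if __name__ == "__main__":
    pairs = {"pair1": PairCapacity("pair1", 1000), "pair2": PairCapacity("pair2", 1000)}
    binding = {"blade1": "pair1", "blade2": "pair2"}   # the 'best guess' made up front
    print(allocate(pairs, binding, "blade1", 900))     # fits behind pair1
    print(allocate(pairs, binding, "blade1", 200))     # refused: only 100 GB left on pair1
    print(stranded_gb(pairs, binding, "blade1"))       # capacity blade1 cannot reach
```

Here the second request is refused even though the installation as a whole still has ample free capacity, because that capacity sits behind a controller pair the blade server was never connected to.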

Content provided in partnership with Thomson Gale