Roundtable: "switched on storage arrays": Part 2 of 3 - Connectivity

Computer Technology Review, July, 2003 by Mark Ferelli

Editor-in-chief Mark Ferelli recently joined a panel of experts to discuss some of the underlying factors driving the storage industry's move toward switched storage back-ends. Part 1 of the Roundtable appeared in the June issue of Computer Technology Review. Special thanks to Vixel Corporation for coordinating this event.

Mark Ferelli: How, in very real terms, would a switched architecture-enabled solution enable providers to add benefit to reliability, availability, and serviceability? Our favorite new acronym is RAS (Reliability-Availability-Serviceability) of storage architectures, and I'd like to identify some of the more serious RAS issues that a switched architecture can solve. Brian and Bob, I'd like you to start on this.

Brian Reed: Even though today's systems are highly reliable, there are some subtle problems that you see in shared-based architectures. In any type of shared-based architecture, every read/write request has to move through every individual drive before completion, which causes latency and congestion issues. And some of the biggest RAS issues you see are things like--even though Fibre Channel drives have dual ports--dual loop failures, because there are individual parts on a drive that can bring down both loops.

You have things like rogue and flaky drives, which haven't necessarily failed but may be about to fail. It's very difficult to diagnose those problems and, therefore, very hard to prevent them from happening. Switched-based architectures basically isolate drives and solve those types of problems today.

Mark Ferelli: Bob, what do you think?

Bob Rumer: There are a couple of things that haven't been mentioned yet--or haven't been directly called out. One of the main issues here that gives you the diagnostics benefit is that we're moving away from JBODs manufactured with PBCs--port bypass circuits. These are essentially gigabit analog muxes. When your signal is always in the serial domain, you rarely have access to that information, and diagnostics is an added cost. It is always fairly difficult to implement.

What we're seeing in a switched architecture is a SERDES-based disk array, and because they are SERDES-based we have parallel data going in and then coming out of the SERDES that can be monitored for a variety of diagnostics and capabilities. And that really is transforming the industry. Vitesse has provided an awfully large part of the world's PBCs. We'll never build another one again.

The second thing is that many of these systems are also deployed with more sophisticated enclosure management, meaning the CPU that accompanies the enclosure. Vitesse's emphasis has always been on in-band management that can run over Fibre Channel and initiate diagnostics to drives autonomously. So the entire level of sophistication of the whole system is going up, and the end user benefits as a whole.

Mark Ferelli: Bob, does the enclosure issue impact the total cost of ownership of the device?

Bob Rumer: The initial hardware cost is intact, but the total cost of ownership drops drastically with all the benefits that you've already heard, again, from our system friends.

Mark Ferelli: Speaking of systems, James, how do you at HP look at switched architectures for reliability, availability, and serviceability issues?

James Myers: There are two situations where we've seen it be really beneficial. The first is that before, when we used to have a couple hundred drives on the loop, a rogue hard drive going through a myriad of errors and error recovery could actually tie up, saturate, and just dominate the entire loop.

Now, any other production application that's trying to do I/Os to the remaining couple of hundred hard drives begins to just bog down; service times for users are impacted and, basically, it's a semi-failure situation for customers.

The other thing is, many customers need instantaneous growth in their environment and they would like to do that any time of the day, any day of the week.

With all of these couple of hundred drives on a Fibre Channel loop, when you interrupt the loop to do an addition, you basically still have access but you sort of cause things to bog down. And, again, customers don't want to have to wait until the wee hours of the morning to do capacity expansions and those sorts of things, so by having a switched backend architecture, you can be assured that you can add capacity on the fly, in real time, without any degradation to your application.

Mark Ferelli: Then you can take a real bite--or real byte, rather--out of a lot of the latency problems in terms of installation and implementation?

James Myers: Right. Exactly.

Mark Ferelli: Mark, where do you see the reliability issue sitting and the serviceability ones?

Mark Nossokoff: Yes, some of the errors that occur in arbitrated loop situations--when they occur--kind of get propagated down and around the loop, and it is hard to identify specifically where the individual problem is. By implementing these loop switches in the JBOD, we'll be able to really pinpoint where a rogue drive actually resides, and the drive can be more quickly diagnosed and more quickly serviced.


 


Content provided in partnership with Thomson Gale