High Performance Computing: past, present and future - Storage Networking

Computer Technology Review, Jan, 2004 by Bruce Moxon

Cluster computing has forever altered the landscape of High Performance Computing (HPC). From humble beginnings as part of a NASA project in the early 1990s, Beowulf clusters have now secured their place as the predominant high-performance computing architecture. Advances in scalable storage architectures promise further changes that will continue to drive supercomputing into more commercial IT organizations.

High Performance Computing: A Landscape in Transition

Only a few years ago, the typical supercomputer was built using custom silicon, proprietary high performance interconnects and specialized storage subsystems. Companies such as IBM, Amdahl, Cray and Fujitsu developed complex systems that took years to bring to market and cost millions to tens of millions of dollars. Such systems were available only to national and international government-funded research centers.

Compare that with the current trend in high performance computing: cluster computers based on personal computer architectures, commodity microprocessors, Gigabit Ethernet and standard networked storage architectures. These systems represent the growing wave of commodity supercomputers. They are developed and sold by some of the traditional high-performance computing vendors, such as IBM; by high-volume PC manufacturers, such as Dell and HP; and by a new breed of cluster computing vendors that includes Linux Networx, Rackable Systems and RackSaver. These systems can be acquired for as little as a hundred thousand dollars, making them available to a wide range of government, industry and academic institutions.

Cluster Computing: To Infinity and Beyond

The growth of cluster computing has been fueled by a number of key technology developments in recent years. First is the rapid advancement of CPU technology, in accordance with Moore's Law. Proprietary vector processors have given way, first to RISC-based processors employed in HPC systems of the 1980s and early 1990s, and more recently to commodity Intel processors whose integer and floating-point performance improvements have outpaced more specialized processors. Second is the commoditization of high performance networking technology, which provides the interconnection network that cluster nodes require to communicate with one another. Last is the maturation of the software infrastructure required to orchestrate the activities of hundreds or thousands of cluster nodes, making the task of effectively harnessing these hardware advancements accessible to a larger group of programmers.

This last area has been instrumental in simplifying the installation, management and use of cluster computing systems. Together, these software developments make up a layer known as cluster middleware. Cluster management and monitoring software is now a common component of cluster vendors' offerings. This software allows for remote management of the cluster computing resources, including provisioning of additional nodes, node imaging (remote loading of the operating system) and application provisioning. Distributed resource management (DRM) software, such as Platform Computing's LSF, provides a means for managing and monitoring the distribution of computing jobs across the cluster from a single point of control. Parallel programming libraries and compiler enhancements, such as MPI and OpenMP, together with distributed debugging and monitoring tools, support the development of parallel computing applications that can effectively exploit these cluster computer architectures.

The result of these activities has been a broadening of high-performance computing. Traditional scientific high-performance computing applications such as high-energy physics research, environmental science and weather prediction, aerospace engineering, seismic data analysis, and signal and image processing have been joined by a new wave of industrial applications. Drug discovery, circuit design, automobile design, financial analyses and digital media applications all benefit from the use of large cluster computing configurations.

Scalable Storage: The Final Frontier

These technology advances have removed many of the impediments encountered by early adopters of cluster computing technology. Still, one crucial development remains before cluster computing is fully accessible: scalable commodity storage architectures to complement the scalable computing and networking architectures (Figure 1).

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

Today's cluster computing approaches employ a scale-out or data parallel computing model. In this model, applications apply a 'divide-and-conquer' approach: the computing problem is decomposed into pieces by identifying data partitions, each of which defines an individual task. Each of those tasks is then distributed to one of the compute cluster nodes for processing. Program inputs and outputs are typically maintained in centralized datasets residing on long-term storage subsystems, so data parallel applications must typically scatter portions of the input dataset to each compute node, then gather partial results and combine them into the final output.
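The scatter and gather steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any cluster product: worker threads stand in for compute nodes, an in-memory list stands in for the centralized dataset, and the function names (scatter, node_task, gather, run_data_parallel) and the sum-of-squares workload are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(dataset, num_nodes):
    """Split the dataset into num_nodes contiguous partitions of near-equal size."""
    base, extra = divmod(len(dataset), num_nodes)
    partitions, start = [], 0
    for node in range(num_nodes):
        # The first `extra` nodes each take one additional element.
        end = start + base + (1 if node < extra else 0)
        partitions.append(dataset[start:end])
        start = end
    return partitions


def node_task(partition):
    """Work done independently on one node: a partial sum of squares (hypothetical workload)."""
    return sum(x * x for x in partition)


def gather(partial_results):
    """Combine the per-node partial results into the final output."""
    return sum(partial_results)


def run_data_parallel(dataset, num_nodes=4):
    partitions = scatter(dataset, num_nodes)
    # Assumption: threads stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_results = list(pool.map(node_task, partitions))
    return gather(partial_results)


if __name__ == "__main__":
    print(run_data_parallel(list(range(1, 101))))  # prints 338350
```

On a real cluster, the partitions would be read from shared storage and the tasks dispatched to separate machines, which is where the storage subsystem comes under load.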

 

Content provided in partnership with Thomson Gale