Overview of the QCDSP and QCDOC computers
IBM Journal of Research and Development, Mar-May 2005 by Boyle, P A, Chen, D, Christ, N H, Clark, M A, Et al
The QCDSP and QCDOC computers are two generations of multithousand-node multidimensional mesh-based computers designed to study quantum chromodynamics (QCD), the theory of the strong nuclear force. QCDSP (QCD on digital signal processors), a four-dimensional mesh machine, was completed in 1998; in that year, it won the Gordon Bell Prize in the price/performance category. Two large installations, of 8,192 and 12,288 nodes, with a combined peak speed of one teraflops, have been in operation since. QCD-on-a-chip (QCDOC) utilizes a six-dimensional mesh and compute nodes fabricated with IBM system-on-a-chip technology. It offers a tenfold improvement in price/performance. Currently, 100-node versions are operating, and there are plans to build three 12,288-node, 10-teraflops machines. In this paper, we describe the architecture of both the QCDSP and QCDOC machines, the operating systems employed, the user software environment, and the performance of our application, lattice QCD.
Introduction
In this paper we discuss two massively parallel computers that were designed and constructed for efficient, cost-effective calculations of physical systems subject to the strong nuclear force. While simulations of the strong nuclear force may seem a very esoteric goal, the techniques and underlying mathematics are common to a broad class of problems, particularly the most demanding problems in high-performance computing. Two numerical methods, whose use in our computations is detailed below, are the Metropolis algorithm for importance-sampling a high-dimensional integral (more than approximately 10^7 dimensions for QCD) and Krylov space inversion of a large sparse matrix.
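As an illustration of the second method, the following is a minimal sketch of a Krylov space solver (the conjugate gradient method) applied to a toy sparse operator. The operator here, a one-dimensional periodic nearest-neighbor stencil with a mass term, and all function names are assumptions chosen for illustration; they are not the matrix or the code used on QCDSP or QCDOC.

```python
import math

def apply_operator(x, mass_sq=0.5):
    """Toy sparse operator (an assumption, not the QCD matrix):
    (mass_sq + 2) * x[i] - x[i-1] - x[i+1] with periodic boundaries."""
    n = len(x)
    return [(mass_sq + 2.0) * x[i] - x[(i - 1) % n] - x[(i + 1) % n]
            for i in range(n)]

def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A given only
    as a function that applies A to a vector (matrix-free)."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    for iteration in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) <= tol * b_norm:
            return x, iteration
        beta = rr_new / rr
        p = [r_i + beta * p_i for r_i, p_i in zip(r, p)]
        rr = rr_new
    return x, max_iter
```

The solver touches the matrix only through `apply_a`, so each iteration costs one sparse matrix-vector product plus a few inner products.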
The strong nuclear force affects the protons and neutrons of atomic nuclei, the up and down quarks, which are constituents of protons and neutrons, and the four additional short-lived quarks (for a total of six quarks) that are known from experiment to exist. The theory of the strong nuclear force, known as quantum chromodynamics (QCD), is an elegant generalization of the theory of electromagnetism, and is accurately described by a simple Hamiltonian that can easily be written in closed form. Just as the photon mediates the electromagnetic interaction between electrically charged particles, QCD includes a similar mediating particle, called the gluon.
A fundamental difference between QCD and electromagnetism is that the gluon interacts with itself as well as with the quarks. (In electromagnetism, photons in a vacuum interact with one another only very weakly.) For low-energy systems, such as the proton itself, the quark-gluon interaction and the gluon self-interaction are very strong. This makes the equations of QCD highly nonlinear, and in this regime they are not amenable to analytic calculations. (At very high energies, analytic calculations can be performed reasonably accurately, and these are an important part of our belief that QCD is the correct theory of the strong interactions.)
Thus, to study QCD at lower energies (to understand the proton mass, decays of particles made of short-lived quarks, the effects of heating protons to high temperatures, and many other phenomena), we are led to numerical techniques that can deal with this nonlinearity. While QCD can be cast as an infinite number of coupled nonlinear differential equations, it is more easily handled numerically by Monte Carlo integration. The integration runs over all values of the quark and gluon fields in space-time, with each configuration of quarks and gluons weighted according to its classical action. This is the well-known Feynman path integral formulation of a quantum-mechanical system.
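The idea of sampling field configurations according to their action can be sketched with the Metropolis algorithm on a toy model. The model below is an assumption made for illustration: a single real field on a one-dimensional periodic lattice with a quadratic action and the weight exp(-S) of the Euclidean formulation. It is far simpler than QCD, and the function names are hypothetical.

```python
import math
import random

def action(phi, mass_sq=1.0):
    """Toy action (illustrative, not QCD): nearest-neighbor differences
    plus a mass term on a periodic 1D lattice."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        diff = phi[(i + 1) % n] - phi[i]
        total += 0.5 * diff * diff + 0.5 * mass_sq * phi[i] * phi[i]
    return total

def local_action_change(phi, i, new_value, mass_sq=1.0):
    """Change in the action when site i is set to new_value.
    Only the terms touching site i are needed (requires len(phi) >= 2)."""
    n = len(phi)
    left, right = phi[(i - 1) % n], phi[(i + 1) % n]

    def local(v):
        return (0.5 * (v - left) ** 2 + 0.5 * (right - v) ** 2
                + 0.5 * mass_sq * v * v)

    return local(new_value) - local(phi[i])

def metropolis_sweep(phi, rng, step=1.0, mass_sq=1.0):
    """Propose one update per site; accept with probability
    min(1, exp(-delta_S)). Returns the acceptance rate."""
    accepted = 0
    for i in range(len(phi)):
        proposal = phi[i] + rng.uniform(-step, step)
        delta = local_action_change(phi, i, proposal, mass_sq)
        if delta <= 0.0 or rng.random() < math.exp(-delta):
            phi[i] = proposal
            accepted += 1
    return accepted / len(phi)
```

Because the action is local, each proposed update needs only the neighboring sites, which is the property that lets this kind of calculation map onto a nearest-neighbor mesh of processors.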
To evaluate the Feynman path integral, we discretize space and time with a four-dimensional (4D) Cartesian grid. This grid, or lattice, gives the study of QCD with this technique its name, lattice QCD (LQCD). Only a small number of input parameters are needed: the quark masses and the strength of the coupling constant. When the grid spacing is made sufficiently small, QCD simulations should yield precise answers for a wide variety of physical phenomena. Because of its completeness and simplicity, and the precision of its numerical formulation, QCD is an attractive target for large-scale simulation and has received much attention as a grand challenge problem in scientific computing.
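To give a sense of scale, the sketch below indexes the sites of a 4D periodic grid and counts the real integration variables for the gluon field alone. It assumes the standard lattice gauge theory layout, in which one SU(3) matrix (eight real parameters) sits on each of the four forward links of every site. The 24^4 lattice size is a hypothetical example, not a size quoted in this paper.

```python
def site_index(coords, dims):
    """Lexicographic index of a site, with periodic wraparound."""
    index = 0
    for c, d in zip(coords, dims):
        index = index * d + (c % d)
    return index

def site_coords(index, dims):
    """Inverse of site_index for coordinates already inside the lattice."""
    coords = []
    for d in reversed(dims):
        coords.append(index % d)
        index //= d
    return tuple(reversed(coords))

def gauge_degrees_of_freedom(dims, links_per_site=4, reals_per_link=8):
    """Number of real gluon-field variables: sites * links * parameters.
    The defaults assume SU(3) link variables on a 4D lattice."""
    volume = 1
    for d in dims:
        volume *= d
    return volume * links_per_site * reals_per_link

# Hypothetical example: a 24^4 lattice has 331,776 sites and
# 331,776 * 4 * 8 = 10,616,832 real gluon variables, about 10^7.
```

Under these assumptions even a modest lattice gives an integral of roughly ten million dimensions, which is why importance sampling is used rather than direct quadrature.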
The characteristics of the problem mean that the minimal design parameters for an ideal QCD machine can be somewhat restrictive compared with those of the mythical general-purpose parallel machine:
* The problem naturally has a Cartesian structure, allowing for a simple nearest-neighbor mesh network.
* The only common non-nearest-neighbor communication is global summation.
* Both communication and memory access patterns are deterministic and amenable to both software and hardware prefetching.
Exploiting these key simplifications to obtain an advantage in both price and performance has been the raison d'etre for a number of specialized machines built in the U.S., Italy, and Japan, which we briefly discuss in the next section.
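The two communication patterns listed above can be sketched in a few lines. This is a serial simulation under stated assumptions: nodes are labeled by their mesh coordinates, the mesh wraps around periodically, and the global sum is formed by summing along one mesh axis at a time. The function names are hypothetical, and the sketch does not describe how the QCDSP or QCDOC hardware implements these operations.

```python
from itertools import product

def neighbors(coords, dims):
    """Nearest neighbors of a node: one step forward and one step back
    along each mesh axis, with periodic wraparound (an assumption)."""
    result = []
    for axis, size in enumerate(dims):
        for step in (+1, -1):
            shifted = list(coords)
            shifted[axis] = (coords[axis] + step) % size
            result.append(tuple(shifted))
    return result

def global_sum(values, dims):
    """Give every node the sum of all node values by summing along one
    axis at a time. `values` maps node coordinates to a number."""
    current = dict(values)
    for axis, size in enumerate(dims):
        updated = {}
        for coords in current:
            total = 0.0
            for k in range(size):
                other = list(coords)
                other[axis] = k
                total += current[tuple(other)]
            updated[coords] = total
        current = updated
    return current

def make_mesh_values(dims):
    """Hypothetical test data: node number i holds the value float(i)."""
    all_nodes = product(*[range(d) for d in dims])
    return {coords: float(i) for i, coords in enumerate(all_nodes)}
```

In a 4D mesh each node has eight neighbors, and after the axis-by-axis passes every node holds the same total, so the only non-nearest-neighbor operation reduces to a sequence of sums along mesh lines.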


