Overview of the QCDSP and QCDOC computers

IBM Journal of Research and Development, Mar-May 2005, by Boyle, P A, Chen, D, Christ, N H, Clark, M A, et al.

The QCDSP and QCDOC computers are two generations of multithousand-node, multidimensional mesh-based computers designed to study quantum chromodynamics (QCD), the theory of the strong nuclear force. QCDSP (QCD on digital signal processors), a four-dimensional mesh machine, was completed in 1998; in that year, it won the Gordon Bell Prize in the price/performance category. Two large installations (of 8,192 and 12,288 nodes, with a combined peak speed of one teraflops) have been in operation since. QCD-on-a-chip (QCDOC) utilizes a six-dimensional mesh and compute nodes fabricated with IBM system-on-a-chip technology. It offers a tenfold improvement in price/performance. Currently, 100-node versions are operating, and there are plans to build three 12,288-node, 10-teraflops machines. In this paper, we describe the architecture of both the QCDSP and QCDOC machines, the operating systems employed, the user software environment, and the performance of our application, lattice QCD.

Introduction

In this paper we discuss two massively parallel computers that were designed and constructed for efficient, cost-effective calculations of physical systems subject to the strong nuclear force. While simulations of the strong nuclear force may seem a very esoteric goal, the techniques and underlying mathematics are common to a broad class of problems and are particularly common among exceedingly demanding problems in high-performance computing. Two numerical methods, whose use in our computations is detailed below, are the Metropolis algorithm for importance-sampling a high-dimensional integral (more than approximately 10^7 dimensions for QCD) and Krylov space inversion of a large sparse matrix.
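As a rough illustration of Krylov space inversion, the sketch below uses the conjugate gradient method, which is one member of the Krylov family and is chosen here only as an assumption; the text does not say which Krylov solver is used. The operator is a hypothetical one-dimensional periodic lattice operator with a mass term, standing in for the much larger sparse matrices of lattice QCD. The names apply_operator and conjugate_gradient are illustrative.

```python
import math


def apply_operator(x, mass2):
    """Apply a sparse 1D periodic lattice operator to x.

    Diagonal entries are (2 + mass2); each site couples to its two
    nearest neighbours with weight -1. For mass2 > 0 the operator is
    symmetric positive definite. This toy operator is an assumption.
    """
    n = len(x)
    return [(2.0 + mass2) * x[i] - x[(i - 1) % n] - x[(i + 1) % n]
            for i in range(n)]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Only products A*p are needed, so A is never stored explicitly.
    Returns the solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, with x = 0 initially
    p = list(r)              # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    for iteration in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) <= tol * b_norm:
            return x, iteration
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter
```

Each iteration touches the matrix only through a nearest-neighbor operator application plus a few inner products, which is why this class of solver maps well onto a mesh network with a global sum.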

The strong nuclear force affects the protons and nucleons of atomic nuclei, the up and down quarks, which are constituents of protons and nucleons, and the four additional short-lived quarks (for a total of six quarks) that are known from experiment to exist. The theory of the strong nuclear force, known as quantum chromodynamics (QCD), is an elegant generalization of the theory of electromagnetism, and is accurately described by a simple Hamiltonian that can easily be written in closed form. Just as the photon mediates the electromagnetic interaction between electrically charged particles, QCD includes a similar mediating particle, called the gluon.

A fundamental difference between QCD and electromagnetism is that the gluon interacts with itself as well as with the quarks. (In electromagnetism, photons in a vacuum interact with one another only very weakly.) For low-energy systems, such as the proton itself, the quark-gluon interaction and the gluon self-interaction are very strong. This makes the equations of QCD highly nonlinear, and in this regime they are not amenable to analytic calculations. (At very high energies, analytic calculations can be performed reasonably accurately, and these are an important part of our belief that QCD is the correct theory of the strong interactions.)

Thus, to study QCD at lower energies (to understand the proton mass, decays of particles made of short-lived quarks, the effects of heating protons to high temperatures, and many other phenomena), we are led to numerical techniques that can deal with this nonlinearity. While QCD can be cast as an infinite number of coupled nonlinear differential equations, it is more easily handled numerically by Monte Carlo integration. To start out, the integration is over all values of quark and gluon fields in space-time, with each configuration of quarks and gluons weighted by its classical action. This is the well-known Feynman path integral formulation of a quantum-mechanical system.
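The idea of Monte Carlo integration over a path integral can be sketched on a far simpler system than QCD. The example below is a toy model under stated assumptions: a one-dimensional harmonic oscillator (mass and frequency set to 1) on a periodic lattice in Euclidean time, where each path is weighted by exp(-S) with S the discretized action. It uses a local Metropolis update; the function names, lattice spacing, step size, and sweep counts are all illustrative choices, not values from the paper.

```python
import math
import random


def local_action(x_i, left, right, a):
    """Terms of the discretized Euclidean action that depend on x_i."""
    kinetic = ((right - x_i) ** 2 + (x_i - left) ** 2) / (2.0 * a)
    potential = 0.5 * a * x_i ** 2
    return kinetic + potential


def metropolis_sweep(path, a, step, rng):
    """Propose a change at every site once; return the number accepted."""
    n = len(path)
    accepted = 0
    for i in range(n):
        left, right = path[(i - 1) % n], path[(i + 1) % n]
        proposal = path[i] + rng.uniform(-step, step)
        delta_s = (local_action(proposal, left, right, a)
                   - local_action(path[i], left, right, a))
        # Accept with probability min(1, exp(-delta_s)).
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            path[i] = proposal
            accepted += 1
    return accepted


def estimate_x_squared(n_sites=20, a=0.5, step=1.0,
                       n_therm=500, n_meas=4000, seed=1):
    """Estimate <x^2> and the acceptance rate by importance sampling."""
    rng = random.Random(seed)
    path = [0.0] * n_sites
    for _ in range(n_therm):          # discard early, unequilibrated paths
        metropolis_sweep(path, a, step, rng)
    total = 0.0
    accepted = 0
    for _ in range(n_meas):
        accepted += metropolis_sweep(path, a, step, rng)
        total += sum(x * x for x in path) / n_sites
    return total / n_meas, accepted / (n_meas * n_sites)
```

Calling estimate_x_squared() returns an estimate of the mean squared displacement, which for this toy model should lie near the continuum ground-state value of 0.5, up to discretization and statistical errors. The same accept/reject logic carries over in spirit to field configurations, where the number of integration variables is vastly larger.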

To evaluate the Feynman path integral, we discretize space and time with a four-dimensional (4D) Cartesian grid. This grid, or lattice, gives the study of QCD with this technique its name, lattice QCD (LQCD). Only a small number of input parameters are needed: the quark masses and the strength of the coupling constant. When the grid spacing is made sufficiently small, QCD simulations should yield precise answers for a wide variety of physical phenomena. Because of its completeness and simplicity, and the precision of its numerical formulation, QCD is an attractive target for large-scale simulation and has received much attention as a grand challenge problem in scientific computing.

The characteristics of the problem mean that the minimal design parameters for an ideal QCD machine can be somewhat restrictive compared with those of the mythical general-purpose parallel machine:

* The problem naturally has a Cartesian structure, allowing for a simple nearest-neighbor mesh network.

* The only common non-nearest-neighbor communication is global summation.

* Both communication and memory access patterns are deterministic and amenable to both software and hardware prefetching.
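The three properties listed above can be illustrated with a small sketch, written under assumptions of our own: a periodic 4D lattice is split into equal blocks over a 4D grid of nodes, each site has eight nearest neighbors, and the only non-local operation is a global sum. The lattice size, the node grid, and the helper names neighbors, owner_node, and global_sum are hypothetical and do not describe the actual QCDSP or QCDOC software.

```python
from itertools import product


def neighbors(site, dims):
    """Return the 2 * len(dims) nearest neighbours on a periodic lattice."""
    result = []
    for mu in range(len(dims)):
        for step in (+1, -1):
            shifted = list(site)
            shifted[mu] = (shifted[mu] + step) % dims[mu]
            result.append(tuple(shifted))
    return result


def owner_node(site, dims, node_grid):
    """Mesh coordinates of the node holding `site` (block decomposition).

    Assumes every lattice extent is divisible by the node-grid extent.
    """
    return tuple(x // (d // n) for x, d, n in zip(site, dims, node_grid))


def global_sum(values_by_node):
    """Combine per-node partial sums into one number."""
    return sum(sum(values) for values in values_by_node.values())


dims = (4, 4, 4, 8)          # illustrative lattice extents
node_grid = (2, 2, 2, 2)     # illustrative 4D mesh of 16 nodes

# Assign every lattice site to the node that owns its block.
local_sites = {}
for site in product(*(range(d) for d in dims)):
    local_sites.setdefault(owner_node(site, dims, node_grid), []).append(site)
```

Because a neighbor differs from its site in exactly one coordinate, its owner differs from the site's owner in at most one mesh direction, so every off-node access is a fixed, predictable nearest-neighbor transfer.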

Exploiting these key simplifications to obtain an advantage in both price and performance has been the raison d'être for a number of specialized machines built in the U.S., Italy, and Japan, which we briefly discuss in the next section.


 
