Self-adapting numerical software (SANS) effort
IBM Journal of Research and Development, Mar-May 2006 by Dongarra, J, Bosilca, G, Chen, Z, Eijkhout, V, Et al
The challenge for the development of next-generation software is the successful management of the complex computational environment while delivering to the scientist the full power of flexible compositions of the available algorithmic alternatives. Selfadapting numerical software (SANS) systems are intended to meet this significant challenge. The process of arriving at an efficient numerical solution of problems in computational science involves numerous decisions by a numerical expert. Attempts to automate such decisions distinguish three levels: algorithmic decision, management of the parallel environment, and processor-specific tuning of kernels. Additionally, at any of these levels we can decide to rearrange the user's data. In this paper we look at a number of efforts at the University of Tennessee to investigate these areas.
Introduction
The increasing availability of advanced-architecture computers is having a very significant effect on all spheres of scientific computation, including algorithm research and software development. In numerous areas of computational science-such as aerodynamics (vehicle design), electrodynamics (semiconductor device design), magnetohydrodynamics (fusion energy device design), and porous media (petroleum recovery)-production runs on expensive, high-end systems can last for hours or days. A major portion of this execution time is usually spent inside numerical routines, such as for the solution of large-scale nonlinear and linear systems that derive from discretized systems of nonlinear partial differential equations. Driven by the desire of scientists for ever higher levels of detail and accuracy in their simulations, the size and complexity of required computations is growing at least as fast as improvements in processor technology. Unfortunately, it is getting more difficult to achieve the necessary high performance from available platforms because of the specialized knowledge in numerical analysis, mathematical software, compilers, and computer architecture that is required, and because rapid innovation in hardware and systems software quickly makes performance-tuning efforts obsolete. Additionally, an optimal scientific environment would have to adapt itself dynamically to changes in the computational platform (e.g., network conditions) and the developing characteristics of the problem to be solved (e.g., during a time-evolution simulation).
With good reason, scientists expect their computing tools to serve them, not the other way around. It is not uncommon for applications that involve a large amount of communication or a large number of irregular memory accesses to run at 10% of peak performance or less. Were this gap to remain fixed, we could simply wait for Moore's law to solve our problems; however, it is growing. Our goal in the self-adapting numerical software (SANS) project is to address this widening gap. There are four challenges to closing it:
* Challenge 1: The complexity of modern machines and compilers is so great that very few people know enough, or should be expected to know enough, to predict the performance of an algorithm expressed in a high-level language; there are too many layers of translation from the source code to the hardware. Seemingly small changes in the source code can change performance greatly. In particular, where a particular piece of data resides in a deep memory hierarchy is both hard to predict and critical to performance.
* Challenge 2: The speed of innovation in hardware and compilers is so great that even if one knew enough to tune an algorithm for high performance on a particular machine and with a particular compiler, that work would soon be obsolete. Also, platformspecific tuning impedes the portability of the code.
* Challenge 3: The number of algorithms, or even algorithmic kernels in standard libraries [1], is large and growing too rapidly for the few experts to keep up with tuning-or to even know them all.
* Challenge 4: The need for tuning cannot be restricted to problems that can be solved by libraries in which all optimization is done at design time, installation time, or even compile time. In particular, sparse matrix computations require information about matrix structure for tuning, while interprocessor communication routines require information about machine size and the configuration used for a particular program run. It may be critical to use tuning information captured in prior runs to tune future runs.
In this paper we discuss our efforts to meet these challenges with the following approaches:
* Generic code optimization: a system for automatically generating optimized kernels.
* LAPACK for clusters: software that manages parallelism of dense linear algebra transparently to the user.
* SALSA (self-adaptive large-scale solver architecture): a system for picking optimal algorithms based on statistical analysis of the user problem.
* Fault-tolerant linear algebra: a linear algebra approach to fault tolerance and error recovery.
Most Recent Technology Articles
- INTERVIEW WITH BEN BUTTERS, DIRECTOR OF EUROPEAN AFFAIRS AT EUROCHAMBRES : "A PERFECT ROAD MAP FOR EU CLUSTERS DOES NOT EXIST".
- AGENDA.(Brief article)(Conference notes)
- FIGHT AGAINST INTERNET PIRACY.
- INTERNET : AUTHORS' SOCIETIES URGE ACTION AGAINST PIRACY.
- TELECOMMUNICATIONS : BUSINESSEUROPE HOSTILE TO FURTHER CONTRACTUAL OBLIGATIONS.(Brief article)
Most Recent Technology Publications
Most Popular Technology Articles
- BizRate to monitor in-store customer satisfaction for Office Depot stores - Market Intelligence
- Speed control of separately excited DC motor
- What is precision air conditioning and why is it necessary?
- Effects of creative, educational drama activities on developing oral skills in primary school children
- 3G: naughty or nice? PhoneErotica.com generates over 300 million hits per month, and rings up more minutes of use per month than MSN


