Self-adapting numerical software (SANS) effort

IBM Journal of Research and Development, Mar-May 2006 by Dongarra, J, Bosilca, G, Chen, Z, Eijkhout, V, Et al

The challenge for the development of next-generation software is the successful management of the complex computational environment while delivering to the scientist the full power of flexible compositions of the available algorithmic alternatives. Selfadapting numerical software (SANS) systems are intended to meet this significant challenge. The process of arriving at an efficient numerical solution of problems in computational science involves numerous decisions by a numerical expert. Attempts to automate such decisions distinguish three levels: algorithmic decision, management of the parallel environment, and processor-specific tuning of kernels. Additionally, at any of these levels we can decide to rearrange the user's data. In this paper we look at a number of efforts at the University of Tennessee to investigate these areas.

Introduction

The increasing availability of advanced-architecture computers is having a very significant effect on all spheres of scientific computation, including algorithm research and software development. In numerous areas of computational science-such as aerodynamics (vehicle design), electrodynamics (semiconductor device design), magnetohydrodynamics (fusion energy device design), and porous media (petroleum recovery)-production runs on expensive, high-end systems can last for hours or days. A major portion of this execution time is usually spent inside numerical routines, such as for the solution of large-scale nonlinear and linear systems that derive from discretized systems of nonlinear partial differential equations. Driven by the desire of scientists for ever higher levels of detail and accuracy in their simulations, the size and complexity of required computations is growing at least as fast as improvements in processor technology. Unfortunately, it is getting more difficult to achieve the necessary high performance from available platforms because of the specialized knowledge in numerical analysis, mathematical software, compilers, and computer architecture that is required, and because rapid innovation in hardware and systems software quickly makes performance-tuning efforts obsolete. Additionally, an optimal scientific environment would have to adapt itself dynamically to changes in the computational platform (e.g., network conditions) and the developing characteristics of the problem to be solved (e.g., during a time-evolution simulation).

With good reason, scientists expect their computing tools to serve them, not the other way around. It is not uncommon for applications that involve a large amount of communication or a large number of irregular memory accesses to run at 10% of peak performance or less. Were this gap to remain fixed, we could simply wait for Moore's law to solve our problems; however, it is growing. Our goal in the self-adapting numerical software (SANS) project is to address this widening gap. There are four challenges to closing it:

* Challenge 1: The complexity of modern machines and compilers is so great that very few people know enough, or should be expected to know enough, to predict the performance of an algorithm expressed in a high-level language; there are too many layers of translation from the source code to the hardware. Seemingly small changes in the source code can change performance greatly. In particular, where a particular piece of data resides in a deep memory hierarchy is both hard to predict and critical to performance.

* Challenge 2: The speed of innovation in hardware and compilers is so great that even if one knew enough to tune an algorithm for high performance on a particular machine and with a particular compiler, that work would soon be obsolete. Also, platformspecific tuning impedes the portability of the code.

* Challenge 3: The number of algorithms, or even algorithmic kernels in standard libraries [1], is large and growing too rapidly for the few experts to keep up with tuning-or to even know them all.

* Challenge 4: The need for tuning cannot be restricted to problems that can be solved by libraries in which all optimization is done at design time, installation time, or even compile time. In particular, sparse matrix computations require information about matrix structure for tuning, while interprocessor communication routines require information about machine size and the configuration used for a particular program run. It may be critical to use tuning information captured in prior runs to tune future runs.

In this paper we discuss our efforts to meet these challenges with the following approaches:

* Generic code optimization: a system for automatically generating optimized kernels.

* LAPACK for clusters: software that manages parallelism of dense linear algebra transparently to the user.

* SALSA (self-adaptive large-scale solver architecture): a system for picking optimal algorithms based on statistical analysis of the user problem.

* Fault-tolerant linear algebra: a linear algebra approach to fault tolerance and error recovery.

 

BNET TalkbackShare your ideas and expertise on this topic

Please add your comment:

  1. You are currently: a Guest |
  2.  

Basic HTML tags that work in comments are: bold (<b></b>), italic (<i></i>), underline (<u></u>), and hyperlink (<a href></a)

advertisement
advertisement
  • Click Here
  • Click Here
  • Click Here
advertisement

Content provided in partnership with ProQuest