Self-adapting numerical software (SANS) effort
IBM Journal of Research and Development, Mar-May 2006, by Dongarra, J., Bosilca, G., Chen, Z., Eijkhout, V., et al.
Trivially, the owner (or manager) of the system is interested in optimal resource utilization, while the user expects the shortest time to obtain the solution. Instead of aiming at the optimization of either the former (by maximizing memory utilization and sacrificing the total solution time by minimizing the number of processes involved) or the latter (by using all of the available processors), a benchmarking engineer would be interested in best floating-point performance.
Experimental results
Figure 3 illustrates how the time to solution is influenced by the aspect ratio of the logical process grid for a range of process counts. (Each processor was a 1.4-GHz AMD Athlon** with 2 GB of memory; the interconnect was Myricom Myrinet** 2000.) The data makes clear that it can be beneficial not to use all of the available processors for computation (the idle processors might be used, for example, for fault tolerance). This is especially true if the number of processors is a prime number, which forces a one-dimensional process grid and thus leads to very poor performance on many systems. It is unrealistic to expect that nonexpert users will make the correct decision in every case. Choosing well requires either expertise or relevant experimental data, and our experience suggests that a combination of both is probably needed to make good decisions consistently. As a side note, collecting the data for Figure 3 required a number of floating-point operations equivalent to computing the LU factorization of a square dense matrix of order almost 300k. Matrices of that size are usually suitable for supercomputers (the slowest supercomputer on the TOP500** [34] list that factored such a matrix was at position 16 in November 2002).
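The effect of a prime processor count on the available process grids can be made concrete with a short sketch. The Python code below is only an illustration and is not taken from LFC: the function names and the `max_idle` parameter are hypothetical, and the idea of preferring the most nearly square grid is a simplifying assumption. In practice the choice would be guided by measurements such as those in Figure 3.

```python
def process_grids(nprocs):
    """Return all (rows, cols) with rows <= cols and rows * cols == nprocs."""
    grids = []
    rows = 1
    while rows * rows <= nprocs:
        if nprocs % rows == 0:
            grids.append((rows, nprocs // rows))
        rows += 1
    return grids


def most_square_grid(nprocs):
    """Pick the factorization whose aspect ratio cols / rows is smallest.

    process_grids() lists grids by increasing rows, so the last entry
    is the one closest to square.
    """
    return process_grids(nprocs)[-1]


def candidate_grids(available, max_idle=3):
    """List the most square grid for each usable process count.

    max_idle is a hypothetical knob: how many processors we are willing
    to leave idle in exchange for a better-shaped grid.
    """
    candidates = []
    lowest = max(available - max_idle, 1)
    for used in range(available, lowest - 1, -1):
        rows, cols = most_square_grid(used)
        candidates.append((used, rows, cols, cols / rows))
    return candidates


if __name__ == "__main__":
    # 61 is prime, so the only grid that uses every processor is 1 x 61.
    for used, rows, cols, ratio in candidate_grids(61):
        print(f"{used} processes: {rows} x {cols} grid, aspect ratio {ratio:.2f}")
```

With 61 processors available, the only full-size option is a 1 × 61 grid, whereas idling a single processor allows a 6 × 10 grid. Whether the better-shaped grid actually wins depends on the machine and the solver, which is why the sketch only lists candidates rather than choosing one.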
Figure 4 shows the large extent to which the aspect ratio of the logical process grid influences another facet of numerical computation: per-processor performance of the LFC parallel solver. The plots in the figure show data for various numbers of processors (between 40 and 64) and consequently do not represent a function, because, for example, ratio 1 may be obtained with 7 × 7 and 8 × 8 process grids (within the specified range of the number of processors). The figure shows the performance of both parallel LU decomposition using the Gaussian elimination algorithm and parallel Cholesky factorization code. As a side note, the data is relevant only for the LFC parallel linear solvers that are based on the solvers from ScaLAPACK [26]; it would not be indicative of the performance of a different solver, such as HPL [35], which uses different communication patterns and consequently behaves differently with respect to the process grid aspect ratio.
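To see why a plot against aspect ratio over a range of processor counts is not a function, one can group every possible grid in that range by its ratio. The following Python sketch is illustrative only; it assumes the aspect ratio is defined as columns divided by rows with rows no larger than columns, which the text does not state explicitly, and the function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def grids_by_aspect_ratio(min_procs, max_procs):
    """Group all process grids in a range of process counts by aspect ratio.

    The ratio is taken as cols / rows with rows <= cols (an assumption),
    stored as an exact Fraction so that equal ratios compare equal.
    """
    by_ratio = defaultdict(list)
    for nprocs in range(min_procs, max_procs + 1):
        rows = 1
        while rows * rows <= nprocs:
            if nprocs % rows == 0:
                cols = nprocs // rows
                by_ratio[Fraction(cols, rows)].append((rows, cols))
            rows += 1
    return dict(by_ratio)


if __name__ == "__main__":
    ratios = grids_by_aspect_ratio(40, 64)
    # Ratio 1 is reached by two different grids in this range.
    print(ratios[Fraction(1)])  # [(7, 7), (8, 8)]
```

Any ratio that maps to more than one grid contributes several data points at the same horizontal position, which matches the 7 × 7 and 8 × 8 example given above.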
SALSA
Algorithm choice, the topic of this section, is an inherently dynamic activity in which the numerical content of the user data is of prime importance. Speaking abstractly, we could say that the need for dynamic strategies arises here from the fact that any description of the input space is of a very high dimension. As a corollary, we cannot hope to search this input space exhaustively, and we have to resort to some form of modeling of the parameter space.


