Self-adapting numerical software (SANS) effort
IBM Journal of Research and Development, Mar-May 2006 by Dongarra, J, Bosilca, G, Chen, Z, Eijkhout, V, Et al
ATLAS uses an orthogonal search [13]. For an optimization problem $\min f(x_1, x_2, \ldots, x_n)$, the parameters $x_i$ (where $1 \le i \le n$) are initialized with reference values. Proceeding from $x_1$ to $x_n$, orthogonal search performs a linear one-dimensional search for the optimal value of $x_i$, using the previously found optimal values for $x_1, x_2, \ldots, x_{i-1}$.
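The one-parameter-at-a-time procedure described above can be sketched in a few lines. This is a minimal illustration, not the ATLAS implementation: the function `orthogonal_search`, the parameter names `block` and `unroll`, and the cost function `toy_cost` are all hypothetical, and the sketch assumes that parameters not yet searched stay at their reference values.

```python
from typing import Callable, Dict, Sequence

Params = Dict[str, int]


def orthogonal_search(
    f: Callable[[Params], float],
    reference: Params,
    candidates: Dict[str, Sequence[int]],
) -> Params:
    """Minimize f one parameter at a time, in the order given by `candidates`."""
    best = dict(reference)  # every parameter starts at its reference value
    for name, values in candidates.items():
        best_value, best_cost = best[name], f(best)
        # Linear one-dimensional search over this parameter only.
        for value in values:
            trial = dict(best)
            trial[name] = value
            cost = f(trial)
            if cost < best_cost:
                best_value, best_cost = value, cost
        # Freeze the winner; later parameters are searched with it in place.
        best[name] = best_value
    return best


def toy_cost(p: Params) -> float:
    """Hypothetical stand-in for a measured run time (lower is better)."""
    return (p["block"] - 40) ** 2 + (p["unroll"] - 4) ** 2


result = orthogonal_search(
    toy_cost,
    reference={"block": 32, "unroll": 1},
    candidates={"block": [16, 24, 32, 40, 48, 56, 64], "unroll": [1, 2, 4, 8]},
)
```

With n parameters and at most m candidate values each, this evaluates on the order of n·m points rather than the m^n points of an exhaustive search. The trade-off is that interactions between parameters are only seen through the values already frozen.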
Applying simplex search to ATLAS
We have replaced the ATLAS global search with the modified Nelder-Mead simplex search and conducted experiments on four different architectures: 2.4-GHz Intel Pentium** 4, 900-MHz Intel Itanium** 2, 1.3-GHz IBM POWER4*, and 900-MHz Sun UltraSPARC**.
Given values for a set of parameters, the ATLAS code generator generates a code variant of matrix multiply. The code is executed with randomly generated 1000 × 1000 dense matrices as input. After execution of the search heuristic, the output is a set of parameters that gives the best performance for that platform. Figure 1 compares the total time spent by each of the search methods on the search itself. The Itanium 2 search time (for all search techniques) is much longer than those for the other platforms because we are using the Intel compiler, which, in our experience, takes longer to compile the same piece of code than the GNU compiler collection (GCC) used on the other platforms. Figure 2 shows the comparison of the performance of matrix multiply on different sizes of matrices using the ATLAS libraries generated by the simplex search and the original ATLAS search.
Empirical generic code optimization
Current empirical optimization techniques such as ATLAS and FFTW can achieve good performance because the algorithms to be optimized are known ahead of time. We are addressing this limitation by extending the techniques used in ATLAS to the optimization of arbitrary code. Since the algorithm to be optimized is not known in advance, it requires compiler technology to analyze the source code and generate the candidate implementations. The ROSE project [15, 16] from the Lawrence Livermore National Laboratory provides, among other benefits, a source-to-source code-transformation tool that can produce blocked and unrolled versions of the input code. In combination with our search heuristic and hardware information, we can use ROSE to perform empirical code optimization. For example, on the basis of an automatic characterization of the hardware, we direct their compiler to perform automatic loop blocking at varying sizes, which we can then evaluate to find the best block size for that loop. To perform the evaluations, we have developed a test infrastructure that automatically generates a timing driver for the optimized routine on the basis of a simple description of the arguments.
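As a rough illustration of evaluating loop blocking at varying sizes, the sketch below hand-writes a blocked matrix multiply and times it for several block sizes. It does not use ROSE or the authors' timing-driver generator; the functions `blocked_matmul`, `time_block_sizes`, and `best_block_size` are hypothetical names introduced only for this example.

```python
import random
import time
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def blocked_matmul(a: Matrix, b: Matrix, n: int, block: int) -> Matrix:
    """Multiply two n x n matrices using square tiles of side `block`."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # Work on one tile; min() handles n not divisible by block.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += aik * b[k][j]
    return c


def time_block_sizes(
    n: int, block_sizes: Sequence[int], repeats: int = 3
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) observed for each block size."""
    rng = random.Random(0)  # fixed seed so every variant sees the same input
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings: Dict[int, float] = {}
    for block in block_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return timings


def best_block_size(timings: Dict[int, float]) -> int:
    """Pick the block size with the smallest measured time."""
    return min(timings, key=timings.get)
```

A call such as `best_block_size(time_block_sizes(64, [8, 16, 32, 64]))` returns whichever size ran fastest on the machine at hand. In pure Python the interpreter overhead tends to mask cache effects, so the sketch shows the shape of the evaluation rather than realistic performance numbers.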
The generic code optimization system is structured as a feedback loop. The code is fed into the loop processor for optimization and separately fed into the timing driver generator, which generates the code that actually runs the optimized code variant to determine its execution time. The results of the timing are fed back into the search engine. On the basis of these results, the search engine may adjust the parameters used to generate the next code variant. The initial set of parameters can be estimated on the basis of the characteristics of the hardware (e.g., cache size).
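The feedback loop can be sketched as follows, under several assumptions that are not from the source: a single block-size parameter, a simple halve-or-double hill climb standing in for the search engine, and an initial estimate chosen so that three square tiles fit in the cache. The names `initial_block_from_cache`, `feedback_loop`, `generate_variant`, and `time_variant` are hypothetical; the last two are passed in as callables so that any code generator and timing driver could be plugged in.

```python
from typing import Callable, Dict, Tuple, TypeVar

Variant = TypeVar("Variant")


def initial_block_from_cache(cache_bytes: int, elem_bytes: int = 8) -> int:
    """Largest power-of-two block b with 3 * b * b * elem_bytes <= cache_bytes.

    Assumed heuristic: three b x b tiles should fit in the cache together.
    """
    block = 1
    while 3 * (2 * block) ** 2 * elem_bytes <= cache_bytes:
        block *= 2
    return block


def feedback_loop(
    generate_variant: Callable[[int], Variant],
    time_variant: Callable[[Variant], float],
    initial_block: int,
    max_iters: int = 20,
) -> Tuple[int, Dict[int, float]]:
    """Generate, time, and adjust until no neighboring block size is faster."""
    history: Dict[int, float] = {}

    def measure(block: int) -> float:
        # Generate and time each variant once; reuse earlier results.
        if block not in history:
            history[block] = time_variant(generate_variant(block))
        return history[block]

    current = initial_block
    measure(current)
    for _ in range(max_iters):
        neighbors = [b for b in (current // 2, current * 2) if b >= 1]
        candidate = min(neighbors, key=measure)
        if measure(candidate) >= measure(current):
            break  # timing feedback shows no improvement: stop adjusting
        current = candidate
    return current, history


# Usage with a deterministic stand-in for the timing driver.
start = initial_block_from_cache(32 * 1024)
best, seen = feedback_loop(
    generate_variant=lambda block: block,          # "variant" is just its block size
    time_variant=lambda v: abs(v - 64) + 1.0,      # fake timing, fastest at 64
    initial_block=start,
)
```

Injecting the generator and timer as functions keeps the search logic separate from the code transformation and measurement steps, which mirrors the separation between the loop processor, the timing driver, and the search engine described above.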



