On this page we overview software that we developed in this project. We have three types of projects: (a) program generators for particular problem domains, (b) hand-optimized software for certain problems, and (c) tools related to performance
We are doing research on language environments that support the constructions of program generators for performance. As an example, we built a Spiral prototype n Scala using LMS (staging). Doing so, various transformations and features can be implemented elegantly.
Go to the Spiral-Scala page.
The SPIRAL Code Generator is our flagship. Completely autonomous code generation, optimization, and platform adaptation for discrete signal transforms.
Go to the code generator site.
We developed a program generator, called LGen, for basic linear algebra computations. The design is very similar to the design of Spiral for transforms (see above).
Go to LGen site.
We provide an online generator for Viterbi decoders (in software) for convolutional codes.
Go to the Viterbi decoder site.
We have implemented a measurement framework to create roofline plots with measured performance data.
Go to the roofline site.
Also note our generalization of the rofline model.
We have developed a sorting library that adapts itself to platform characteristics and to the structure or statistics to the input to achieve better performance. Go to the sorting site.
We have developed a program generator for the all-pairs shortest path problem. It generates high performance scalar and vector code tuned to a given platform.
Go to the shortest path website.
We have developed algorithms and tools that generated multiplier blocks that multiply by one or several constants "multiplierless," that is, using only additions, subtractions, and shifts.
Go to the multiplierless website.
We developed a self-adaptable, easy-to-install package for efficently computing the Walsh-Hadamard transform on single processor and SMP platforms. The design of the package is based on an FFT package by Sebastian Egner. We use this package to experiment with new optimization techniques to be incorporated into the SPIRAL system (above).
Go to the WHT site.
We have developed a flexible C implementation of the fast hierarchical backprojection algorithm to investigate performance trade-offs with respect to runtime and accuracy.
Go to the backprojection site.
We have developed a tool to obtain high resolution performance profiles for a given program. For example, the number of cache misses during the execution recorded in very small time steps.
Go to the website.
We have generated software (for the DFT and other kernels) for the Cell BE chip multiprocessor.
Go to the Cell BE site.
We have optimized the recent new sparse FFT algorithm from Hassanieh et al. at MIT.
Go to the sparse FFT site.
We have developed a generalization of the roofline model to identify performance bottlenecks. Go to the generalized roofline page.