The backprojection operation arises from a formula for inverting the Radon transform. It is used in popular image reconstruction algorithms for computed tomography (CT), spotlight-mode synthetic aperture radar (SAR), and other imaging techniques. All these techniques have one thing in common: the acquired data represents, either directly or after some preprocessing, the Radon transform of the scanned image---a collection of image projections at various angles.
The image reconstruction problem, thus, equals the problem of inverting the Radon transform. The most notable reconstruction methods are Fourier-based techniques that rely on the slice-projection property of the Radon transform, and widely used filtered backprojection (FBP) algorithms.
In the FBP approach, each of the projections in the Radon transform is filtered with a windowed ramp filter and then backprojected to reconstruct the image. The backprojection is the costliest stage of the reconstruction with O(N^3) operations, N being the image size in pixels, whereas Fourier techniques require only O(N^2 log N) operations. Efficient implementation of the backprojection operation is therefore crucial for the overall performance of the algorithm.
The fast hierarchical backprojection algorithm (FHBP)  reduces the cost of the backprojection asymptotically to O(N^2 log N) by dividing the problem into smaller problems that require fewer projections for effective reconstruction. We provide a flexible C package that implements different variants of the FHBP algorithm, including the direct O(N^3) backprojection as a special case. The software allows the user to choose parameters that control the depth of the recursion, the amount of oversampling, and the choice of the interpolating function, all designed to exploit possible runtime/distortion tradeoffs.
 S. Basu and Y. Bresler, "O(N^2 log_2(N)) filtered backprojection reconstruction algorithm for tomography,'' IEEE Trans. on Image Processing, vol. 9, pp. 1760--1772, October 2000.
|Runtime and Performance in MFLOPS for different levels of recursion in the fast backprojection. The fastest runtime is obtained when recursing only up to image size 16 x 16 and to continue then with the direct backprojection. For this strategy, we achieve ~500 MFLOPS, about 15% of the peak performance (excluding vector instructions.
See the paper below for more experimental results.