The Spiral Viterbi Software Generator automatically generates high-performance software implementations for decoders for a large class of convolutional codes. The user specifies the code and the target architecture; the generator returns the source code (Intel compatible) for an encoder and a high-performance decoder using the Viterbi algorithm.
We reuse the infrastructure, the encoder, and the traceback of the decoder from Phil Karn's implementation (slightly modified). We only generated the (most costly) forward part of the Viterbi decoder.
Convolutional codes [2] are a common type of error-correcting code where encoding is traditionally done in hardware using a shifting register and modulo-2 adders. Every cycle, a new value enters the register and the modulo-2 additions of some values inside the register constitute the output bits. The inverse of the number of adders describes the ratio of meaningful data within the message and is called the rate r of the code. The length of the shifting register constitutes the "memory" of the code and is called the constraint-length, usually denoted as K. The actual wiring of the adders are called the polynomials. There are as many polynomials as the inverse of the rate and each one is an integer between 1 and 2^K - 1. Because of the finiteness of K, the encoding operation can be equivalently represented as a finite state machine (FSM) with 2^(K-1) states.
The Viterbi algorithm [3,4] implements a maximum likelihood decoder: Given a corrupted encoded bitstream, its purpose is to return the most probable original data. Using a representation called the Viterbi trellis, the algorithm maintains the cost of being in every state of the FSM based on the Hamming distance to the message. It resolves conflicts locally during the forward pass and reads off results during the traceback.
Input: parameters specifying the convolutional code and the target architecture
Output: C code, or C code with Intel compatible SSE vector instructions, for a Viterbi decoder for the specified code. Longer vector length means higher performance (see benchmark below).
Below we benchmark our generated implementations of Viterbi decoding for four different convolutional codes and for different degrees of SIMD vectorization (scalar, 4-way, 8-way, 16-way) against Phil Karn's implementation. The base line is Karn's C code, which is normalized to 1. Higher is better. A missing bar means that the vector length is not provided by Karn (e.g., 4-way for the Voyager code).
Blue = Karn's implementations, red = Spiral-generated implementations, darker = longer vectors = higher performance

This generator is part of the Spiral library generation system and was created by Frédéric de Mesmay based on earlier work by Srinivas Chellappa.
Contact: Frédéric de Mesmay, fdemesma (at) xxxece.cmu.edu (delete xxx)