
N=2ⁿ: Architecture

The architecture of Centar’s FFT circuits for power-of-two transform sizes is based on a new matrix-equation form of the discrete Fourier transform (DFT), which is derived using decimation in both time and frequency. For a transform length M, this form is

Y = W_b • (C_M1 X)

Z = C_M2 Y^t

where W_b is an M/4 x M/4 coefficient matrix, C_M1 is an M/4 x 4 coefficient matrix, X is a 4 x M/4 input matrix, C_M2 is a 4 x M/4 coefficient matrix, Z is a 4 x M/4 output matrix containing the transform outputs, and “•” denotes element-by-element multiplication. The matrix-matrix products C_M1 X and C_M2 Y^t involve only exchanges of real and imaginary parts plus additions, because every element of C_M1 and C_M2 is ±1 or ±j. The only complex multiplications are therefore the (M/4)² element-by-element products with W_b, compared with the M² required by the usual “Z=CX” form of the DFT. (For more details about the algorithmic basis see the Technology section.)
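As a rough numerical check of the matrix form above, the NumPy sketch below builds the four matrices and compares the result with a library FFT. The index mapping it uses (input index n = (M/4)·n1 + n2, output index k = k' + (M/4)·k3), the function name `dft_matrix_form`, and the restriction to M divisible by 16 are assumptions chosen to be consistent with the stated matrix dimensions. They are not taken from Centar's own derivation, which is described in the Technology section.

```python
import numpy as np

def dft_matrix_form(x):
    """Hypothetical sketch: M-point DFT via Y = W_b . (C_M1 X), Z = C_M2 Y^t.

    Assumes M is a multiple of 16 so that the cross term
    exp(-2*pi*i*(M/16)*n1*k3) in the index expansion equals 1.
    """
    x = np.asarray(x, dtype=complex)
    M = x.size
    if M % 16 != 0:
        raise ValueError("this sketch assumes M is a multiple of 16")
    q = M // 4

    # Powers of -j, looked up so that entries are exactly +/-1 or +/-j.
    unit = np.array([1, -1j, -1, 1j])
    kq = np.arange(q)
    k4 = np.arange(4)

    C_M1 = unit[np.outer(kq, k4) % 4]                  # (M/4) x 4
    C_M2 = unit[np.outer(k4, kq) % 4]                  # 4 x (M/4)
    W_b = np.exp(-2j * np.pi * np.outer(kq, kq) / M)   # (M/4) x (M/4)

    X = x.reshape(4, q)        # X[n1, n2] = x[(M/4)*n1 + n2]
    Y = W_b * (C_M1 @ X)       # the only general complex multiplies: (M/4)^2
    Z = C_M2 @ Y.T             # Z[k3, k'] holds output index k' + (M/4)*k3
    return Z.reshape(M)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
    print(np.allclose(dft_matrix_form(x), np.fft.fft(x)))
```

In this sketch the two `@` products stand in for the adder arrays, since their coefficients are only ±1 or ±j, while the `*` with W_b corresponds to the element-by-element multiply.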

Functionally, this matrix equation can be mapped to a simple systolic (pipelined) structure containing two 4×4 arrays of adders for the matrix-matrix products and a linear array of multipliers for the element-by-element multiplies, as shown in Fig. 1(a) and (b). Here, C_M1 and Z are stored internally (in RAMs), while X and C_M2 flow into the array in pipelined fashion from the boundaries. The length of the array is M/4, so it scales with the transform size M in the north-south direction, as shown in the figure. Each of the small boxes (processing elements, or “PEs”) contains either a complex adder or a complex multiplier, a couple of registers, and a small memory, and communicates locally with its neighbors.


Fig. 1. (a) Functional operation of the base-4 architecture (b=4) and (b) circuit implementation for M=16, showing inputs at different times t and internal matrix values.

The DFT is computed using the arrays in Fig. 1 based on the well-known row/column factorization N = N_r N_c, where N is the desired transform length and N_r and N_c are the numbers of rows and columns. This approach requires two sets of DFTs (each computed using the formula above): N_c transforms of length N_r (referred to as “column” transforms) and N_r transforms of length N_c (referred to as “row” transforms). Between the column and row transforms, each of the N intermediate values must be multiplied by a corresponding twiddle factor (W_N)^nk, n=0,1,...,N_c–1, k=0,1,...,N_r–1. (Without the twiddle multiplication, a 2‑D DFT is performed.) The column and row DFTs are performed sequentially on the same hardware array, with M=N_r for the column DFTs and M=N_c for the row DFTs.
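The row/column procedure can be illustrated with a short NumPy sketch. The function name `dft_row_column`, the row-major input layout, and the output ordering are assumptions made for illustration. `np.fft.fft` is used only as a stand-in for the short column and row transforms, which the hardware would compute with the matrix form given earlier.

```python
import numpy as np

def dft_row_column(x, n_r, n_c):
    """Hypothetical sketch of an N-point DFT with N = n_r * n_c.

    Assumed layout: A[r, c] = x[n_c * r + c]; result index k = k_r + n_r * k_c.
    """
    x = np.asarray(x, dtype=complex)
    N = n_r * n_c
    if x.size != N:
        raise ValueError("input length must equal n_r * n_c")

    A = x.reshape(n_r, n_c)

    # Step 1: N_c "column" transforms, each of length N_r.
    B = np.fft.fft(A, axis=0)

    # Step 2: twiddle factors (W_N)^(n*k), n = 0..N_c-1, k = 0..N_r-1.
    k = np.arange(n_r)[:, None]
    n = np.arange(n_c)[None, :]
    B = B * np.exp(-2j * np.pi * n * k / N)

    # Step 3: N_r "row" transforms, each of length N_c.
    C = np.fft.fft(B, axis=1)

    # C[k_r, k_c] is output k_r + n_r * k_c, so read out column by column.
    return C.T.reshape(N)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(256) + 1j * rng.standard_normal(256)
    print(np.allclose(dft_row_column(x, 16, 16), np.fft.fft(x)))
```

Omitting step 2 in this sketch leaves two back-to-back sets of 1-D transforms over the rows and columns of A, which is the 2-D DFT case noted in the text.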