1 fft processor_modified


  • 8/7/2019 1 FFT PROCESSOR_modified

    1/19

    ACCESSICLAB

    Graduate Institute of Electronics Engineering, NTU

    FFT VLSI Implementation

    VLSI Signal Processing

    1. Shousheng He and Mats Torkelson, "A new approach to pipeline FFT processor," Proc. IEEE IPPS, pp. 766-770, 1996.

    2. E. Bidet, D. Castelain, C. Joanblanq, and P. Senn, "A fast single-chip implementation of 8192 complex point FFT," IEEE J. Solid-State Circuits, pp. 300-305, March 1995.


    FFT Review

    $$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \quad k = 0, 1, \ldots, N-1$$

    with

    $$W_N = e^{-j(2\pi/N)}$$

    [Figure: 8-point radix-2 FFT signal flow graph. Inputs in bit-reversed order: G[0], G[4], G[2], G[6], G[1], G[5], G[3], G[7]. Outputs in natural order: X[0] to X[7]. Twiddle factors: W_N^0, W_N^1, W_N^2, W_N^3.]
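As a minimal sketch of the review above, the following Python (standard library only) evaluates the DFT definition directly and then with an in-place radix-2 decimation-in-time FFT that loads its input in bit-reversed order, as the 8-point flow graph does. The names `dft`, `bit_reverse`, and `fft_dit` are illustrative, and the sign convention W_N = e^{-j2π/N} is an assumption.

```python
import cmath

def dft(x):
    """Direct evaluation of X(k) = sum_n x(n) * W_N^(n*k); O(N^2) operations."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (e.g. 3 bits: 1 -> 4, 3 -> 6)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_dit(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    bits = N.bit_length() - 1
    assert 1 << bits == N, "length must be a power of two"
    # Load the input in bit-reversed order, like the left edge of the flow graph.
    a = [x[bit_reverse(i, bits)] for i in range(N)]
    size = 2
    while size <= N:                       # one pass per butterfly column
        half = size // 2
        for start in range(0, N, size):
            for j in range(half):
                w = cmath.exp(-2j * cmath.pi * j / size)   # twiddle W_size^j
                t = w * a[start + j + half]
                u = a[start + j]
                a[start + j] = u + t
                a[start + j + half] = u - t
        size *= 2
    return a                               # outputs in natural order
```

For N = 8, `[bit_reverse(i, 3) for i in range(8)]` gives the input order 0, 4, 2, 6, 1, 5, 3, 7 shown in the figure, and `fft_dit` agrees with `dft` up to rounding error.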


    Implementation: Two Extreme Methods

    Reuse Single Butterfly ------------------------ Fully Spread

    Slow ----------------- Speed ----------------- Fast

    Small ----------------- Area ----------------- Large

    Complicated ------------ Control --------------- Simple

    [Figure: the 8-point flow graph (inputs G[0], G[4], G[2], G[6], G[1], G[5], G[3], G[7]; outputs X[0] to X[7]) used to illustrate the two extremes.]
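To put rough numbers on the two extremes, the sketch below counts butterfly units and sequential butterfly passes for a radix-2 FFT. It assumes each unit finishes one butterfly per pass and ignores memory access and pipelining; the function name `extremes` and the dictionary keys are illustrative only.

```python
def extremes(N):
    """Butterfly hardware versus sequential passes for an N-point radix-2 FFT."""
    stages = N.bit_length() - 1            # log2(N), N assumed a power of two
    butterflies = (N // 2) * stages        # butterflies in the full flow graph
    return {
        # every butterfly of the flow graph gets its own hardware unit
        "fully_spread": {"butterfly_units": butterflies, "passes": 1},
        # one unit is reused for every butterfly in turn
        "single_butterfly": {"butterfly_units": 1, "passes": butterflies},
    }
```

For N = 8 the flow graph holds 12 butterflies, so the fully spread design needs 12 units and the single-butterfly design needs 12 passes through one unit.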


    Design Considerations

    System requirements, e.g., speed, area, power

    To trade off between these two cases, we need:

    More Processing Elements (PEs)

    Better Processing Element Utilization Rate

    Better Control Scheme


    FFT Processor: Block Diagram

    [Figure: block diagram with the blocks INPUT BUFFER, Processing Element (Butterfly), FFT RAM, COEF ROM, and CONTROL, and the signals DATA IN, DATA OUT, and CONTROL SIGNAL.]
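The blocks in this diagram can be mimicked in software. The sketch below is a behavioral model only, under several assumptions not stated on the slide: a single butterfly PE, a decimation-in-frequency schedule, in-place RAM updates, and output read back in bit-reversed address order. All function names are hypothetical.

```python
import cmath

def make_coef_rom(N):
    """COEF ROM: twiddle factors W_N^k for k = 0 .. N/2 - 1."""
    return [cmath.exp(-2j * cmath.pi * k / N) for k in range(N // 2)]

def control(N):
    """CONTROL: yield (addr_a, addr_b, rom_addr) for each butterfly, stage by stage."""
    half = N // 2
    while half >= 1:
        stride = N // (2 * half)           # ROM address step for this stage
        for start in range(0, N, 2 * half):
            for j in range(half):
                yield start + j, start + j + half, j * stride
        half //= 2

def butterfly(a, b, w):
    """Processing element: radix-2 DIF butterfly."""
    return a + b, (a - b) * w

def fft_processor(data_in):
    """Run one N-point transform; returns (data_out, butterfly_cycles)."""
    N = len(data_in)                       # N assumed a power of two, N >= 2
    ram = list(data_in)                    # INPUT BUFFER -> FFT RAM
    rom = make_coef_rom(N)
    cycles = 0
    for addr_a, addr_b, rom_addr in control(N):
        ram[addr_a], ram[addr_b] = butterfly(ram[addr_a], ram[addr_b], rom[rom_addr])
        cycles += 1
    bits = N.bit_length() - 1
    # DATA OUT: read the RAM in bit-reversed address order to get natural order.
    data_out = [ram[int(format(i, f"0{bits}b")[::-1], 2)] for i in range(N)]
    return data_out, cycles
```

With one PE the model spends (N/2)·log2(N) butterfly cycles per transform, which is the "reuse single butterfly" end of the speed/area trade-off.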


    Some Current Schemes

    Radix-2 Single-path Delay Feedback (N = 16)

    [Figure: four BF2 stages in series, with feedback delay lines of length 8, 4, 2, and 1.]

    Radix-2 Multi-path Delay Commutator (N = 16)
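A clock-by-clock sketch of the radix-2 single-path delay feedback idea may help here. It is a behavioral model under stated assumptions, not the circuit from the slide: decimation in frequency, the twiddle multiplier folded into each stage's output, and each stage's latency removed by discarding its first D outputs. The names `sdf_stage` and `r2sdf_fft` are illustrative.

```python
import cmath
from collections import deque

def sdf_stage(stream, D):
    """One radix-2 SDF stage with a D-word feedback FIFO (DIF)."""
    fifo = deque([0j] * D)
    out = []
    for t, x in enumerate(list(stream) + [0j] * D):   # D extra clocks flush the FIFO
        phase = t % (2 * D)
        stored = fifo.popleft()
        if phase < D:
            # First half of a block: store the input; emit the difference kept
            # from the previous block, multiplied by the twiddle W_{2D}^phase.
            fifo.append(x)
            out.append(stored * cmath.exp(-2j * cmath.pi * phase / (2 * D)))
        else:
            # Second half: butterfly between the stored and the incoming sample.
            fifo.append(stored - x)        # difference goes back into the FIFO
            out.append(stored + x)         # sum goes straight to the next stage
    return out[D:]                         # drop the D-clock stage latency

def r2sdf_fft(x):
    """Chain log2(N) stages with delays N/2, N/4, ..., 1; returns (X, delays)."""
    N = len(x)                             # N assumed a power of two, N >= 2
    stream, delays = list(x), []
    D = N // 2
    while D >= 1:
        stream = sdf_stage(stream, D)
        delays.append(D)
        D //= 2
    bits = N.bit_length() - 1
    X = [0j] * N
    for i, v in enumerate(stream):         # the pipeline emits bit-reversed order
        X[int(format(i, f"0{bits}b")[::-1], 2)] = v
    return X, delays
```

For N = 16 the delays come out as 8, 4, 2, 1, matching the figure, for a total of N - 1 = 15 words of feedback memory.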


    Some Current Schemes (cont.)

    Radix-4 Single-path Delay Feedback (N = 256)

    [Figure: four BF4 stages in series, each with feedback delay lines.]

    Radix-4 Single-path Delay Commutator (N = 256)

    [Figure: four BF4 stages with delay-commutators DC6x64, DC6x16, DC6x4, and DC6x1.]

    Radix-4 Multi-path Delay Commutator (N = 256)

    [Figure: four BF4 stages joined by C4 commutators, with delay lines of length 192, 128, 64, 48, 32, 16, 12, 8, 4, 3, 2, and 1.]


    Distinctive Merits of the Above

    The delay-feedback schemes are more efficient than the delay-commutator schemes in terms of memory utilization.

    Radix-4 has higher multiplier utilization; however, radix-2 has simpler butterflies (BFs), which are better utilized.


    Comparison

    Control Scheme: Simple ----------------------------------- Complex

    Processing Ability / Unit: Low ----------------------------------- High

    Radix / Speed: Low ----------------------------------- High

    Combine the advantages:

    Further decompose the high-radix PE


    Decomposition Method (1)

    Simply reuse the repeated micro unit.

    [Figure: a radix-4 PE built from one micro unit that is reused 4 times.]


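    Decomposition Method (2)

    From the algorithm level

    Applying a 3-dimensional index map:

    $$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N$$

    $$k = \left\langle k_1 + 2 k_2 + 4 k_3 \right\rangle_N$$

    where n1, n2 = {0, 1}; n3 = {0, ..., N/4 - 1}

    Summation over n1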


    Decomposition Method (2), cont.

    Summation over n2

    The remaining factor requires only real-imaginary swapping and sign inversion.
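The equations on these two slides did not survive extraction. As a sketch that follows the derivation in reference 1, writing n = (N/2)n1 + (N/4)n2 + n3 and k = k1 + 2k2 + 4k3 with k1, k2 in {0, 1}, the two summations can be reconstructed as below. The symbols B and H are borrowed from that paper and are an assumption about what the slides showed.

```latex
\begin{align}
% Summation over n_1: a radix-2 butterfly (BF2I)
B_{N/2}^{k_1}(m) &= x(m) + (-1)^{k_1}\, x\!\left(m + \tfrac{N}{2}\right) \\
X(k_1 + 2k_2 + 4k_3)
  &= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1}
     B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right)
     W_N^{\left(\frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)} \\
% Twiddle factor decomposition, using W_N^{N/4} = -j and W_N^{N} = 1
W_N^{\left(\frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)}
  &= (-j)^{n_2 (k_1 + 2k_2)}\; W_N^{n_3 (k_1 + 2k_2)}\; W_{N/4}^{n_3 k_3} \\
% Summation over n_2: a second radix-2 butterfly (BF2II)
H(k_1, k_2, n_3)
  &= B_{N/2}^{k_1}(n_3) + (-j)^{(k_1 + 2k_2)}\, B_{N/2}^{k_1}\!\left(n_3 + \tfrac{N}{4}\right) \\
X(k_1 + 2k_2 + 4k_3)
  &= \sum_{n_3=0}^{N/4-1} \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2k_2)} \right] W_{N/4}^{n_3 k_3}
\end{align}
```

The factor (-j)^{(k1 + 2k2)} takes only the values 1, -j, -1, and j, so it needs no multiplier. The only full complex multiplication left is W_N^{n3(k1 + 2k2)}, applied once after every two butterfly stages, and what remains is an N/4-point DFT over n3 that can be decomposed the same way.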


    Graphical Explanation (N = 16)

    [Figure: signal flow graph for N = 16, with the trivial multiplications marked.]


    Graphical Explanation (cont.)

    The equations are equivalent to the operations below.

    [Figure: a BF4 with its control input, and the equivalent cascade of BF2 I and BF2 II with control inputs.]


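    Circuit of BF2I

    [Figure: BF2I circuit. Inputs: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2). Outputs: Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2).]

    First N/2 cycles: the input is shifted into the feedback registers and the butterfly is idle.

    Second N/2 cycles: the butterfly combines the incoming data with the data stored in the feedback registers.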


    Circuit of BF2II

    [Figure: BF2II circuit. Inputs: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2). Outputs: Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2).]

    Swap Re & Im and sign inversion
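The two butterfly types can be checked numerically with a block-level sketch. It assumes the radix-2^2 decimation-in-frequency decomposition of reference 1, with outputs indexed as k = k1 + 2·k2 + 4·k3, and it works on whole blocks rather than streaming data as the circuits do. The function names are hypothetical.

```python
import cmath

def bf2i(x, k1):
    """First butterfly: B(m) = x(m) + (-1)^k1 * x(m + N/2), m = 0 .. N/2 - 1."""
    h = len(x) // 2
    s = -1 if k1 else 1
    return [x[m] + s * x[m + h] for m in range(h)]

def times_minus_j_pow(z, p):
    """Multiply z by (-j)^p using only Re/Im swaps and sign inversions."""
    for _ in range(p % 4):
        z = complex(z.imag, -z.real)       # (-j) * (a + jb) = b - ja
    return z

def bf2ii(b, k1, k2):
    """Second butterfly: H(n3) = B(n3) + (-j)^(k1 + 2*k2) * B(n3 + N/4)."""
    q = len(b) // 2
    return [b[n3] + times_minus_j_pow(b[n3 + q], k1 + 2 * k2) for n3 in range(q)]

def fft_radix22(x):
    """Radix-2^2 DIF FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N == 2:                             # leftover radix-2 stage when log2(N) is odd
        return [x[0] + x[1], x[0] - x[1]]
    X = [0j] * N
    for k1 in (0, 1):
        b = bf2i(x, k1)
        for k2 in (0, 1):
            h = bf2ii(b, k1, k2)
            # The only non-trivial multiplication: W_N^(n3 * (k1 + 2*k2)).
            tw = [h[n3] * cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / N)
                  for n3 in range(N // 4)]
            for k3, v in enumerate(fft_radix22(tw)):   # N/4-point DFT over n3
                X[k1 + 2 * k2 + 4 * k3] = v
    return X
```

`times_minus_j_pow` never multiplies: each step swaps the real and imaginary parts and negates one of them, which is the "swap Re & Im and sign inversion" that BF2II performs.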


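    Radix-2^2 Single-path Delay Feedback

    [Figure: FFT architecture using the above technique, for N = 256. The input x(n) passes through alternating BF2I and BF2II stages with feedback delay lines of length 128, 64, 32, 16, 8, 4, 2, and 1 to produce X(k). The twiddle factors W1(n), W2(n), and W3(n) are applied between stages, and the stages are controlled by bits 0 to 7 of a clk counter.]

    [Figure: the original architecture for N = 256, for comparison.]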


    Structural Advantages

    Radix-2^2 has the same multiplicative complexity as radix-4, but still retains the radix-2 BF structure.

    Only every second stage has a non-trivial multiplication.

    Control is simple: a synchronization controller and an address counter for the twiddle factors W.
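The structural comparison can be tabulated. The expressions below are the ones commonly quoted from the comparison table in reference 1 for complex multipliers, complex adders, and memory words; they are reproduced from memory here and should be checked against the paper before being relied on. The function name `requirements` is illustrative, and N is assumed to be a power of 4.

```python
def requirements(N):
    """Hardware counts per pipeline architecture for an N-point FFT (N = 4^s)."""
    s = (N.bit_length() - 1) // 2          # log4(N)
    return {
        "R2MDC":   {"multipliers": 2 * (s - 1), "adders": 4 * s, "memory": 3 * N // 2 - 2},
        "R2SDF":   {"multipliers": 2 * (s - 1), "adders": 4 * s, "memory": N - 1},
        "R4SDF":   {"multipliers": s - 1,       "adders": 8 * s, "memory": N - 1},
        "R4MDC":   {"multipliers": 3 * (s - 1), "adders": 8 * s, "memory": 5 * N // 2 - 4},
        "R4SDC":   {"multipliers": s - 1,       "adders": 3 * s, "memory": 2 * N - 2},
        "R2^2SDF": {"multipliers": s - 1,       "adders": 4 * s, "memory": N - 1},
    }

if __name__ == "__main__":
    for name, row in requirements(256).items():
        print(f"{name:8s} {row}")
```

For N = 256 the R2^2SDF row gives 3 multipliers and 255 memory words, consistent with the three twiddle factors W1(n), W2(n), W3(n) and the delay lines 128 + 64 + ... + 1 of the radix-2^2 architecture. The R4MDC and R4SDC memory figures (636 and 510) likewise match the delay-line lengths listed for those schemes.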


    Conclusions

    1. FFT applications: radar signal processing, fast convolution, spectrum estimation, and OFDM-based modulation/demodulation.

    2. Efficient VLSI architectures (parallel processing) are required for real-time processing.

    3. However, most systems still employ DSP processors (e.g., TI C3x/C5x) for computation, using fast algorithms such as the DIT and DIF FFT.

    4. VLIW (Very Long Instruction Word)-based processors (TI C6x) need new programming skills to utilize the two parallel MAC units.