ECE 695 Numerical Simulations Lecture 9: Fast Fourier Transforms Prof. Peter Bermel January 30, 2017

Page 1: ECE 695 - Lecture 9

ECE 695 Numerical Simulations

Lecture 9: Fast Fourier Transforms

Prof. Peter Bermel

January 30, 2017

Page 2: ECE 695 - Lecture 9

Outline

• Fourier Analysis

• Sampling Theorem

• Discrete Fourier Transforms

– Naïve approach

– Danielson-Lanczos lemma

– Cooley-Tukey algorithm

• Variations of DFTs

• Correlation Measurements

• Fourier Signal Processing

• FFTW

– Rationale

– Planning and Executing DFTs

– Application Examples

1/30/2017 ECE 695, Prof. Bermel 2

Page 3: ECE 695 - Lecture 9

Fourier Analysis

• Fourier transformation is a linear operation that maps time-series data into the Fourier-domain:

$$f(\omega) = \int_{-\infty}^{\infty} dt\, f(t)\, e^{i\omega t}$$

• Independent variable interpreted as frequency

• Inverse Fourier transform:

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, f(\omega)\, e^{-i\omega t}$$

Page 4: ECE 695 - Lecture 9

Sampling Theorem

• At finite sampling rates, need at least two data points per period to resolve a given frequency

• For a sample time interval Δ, the maximum measurable frequency is the Nyquist frequency $f_c = 1/(2\Delta)$

• Sampling theorem: a spectrum band-limited to $|f| < f_c$ is completely determined by data sampled at intervals of Δ
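As a small illustration of the Nyquist limit, the Python sketch below samples two cosines whose frequencies differ by exactly the sampling rate 1/Δ. The function names, the 1 s sampling interval, and the two test frequencies are assumptions made for this example. The two sets of samples coincide to rounding error, so the component above the Nyquist frequency cannot be told apart from its alias below it.

```python
import math

def nyquist_frequency(delta):
    """Highest frequency resolvable with sampling interval delta."""
    return 1.0 / (2.0 * delta)

def sample_cosine(frequency, delta, count):
    """Return `count` samples of cos(2*pi*f*t) taken at t = k*delta."""
    return [math.cos(2.0 * math.pi * frequency * k * delta) for k in range(count)]

delta = 1.0                            # sampling interval in seconds (assumed value)
f_c = nyquist_frequency(delta)         # 0.5 Hz
low = sample_cosine(0.1, delta, 16)    # below f_c
high = sample_cosine(0.9, delta, 16)   # above f_c; 0.9 = 1/delta - 0.1

# Largest sample-by-sample difference between the two sampled signals
max_difference = max(abs(a - b) for a, b in zip(low, high))
print(f_c, max_difference)
```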

Page 5: ECE 695 - Lecture 9

Discrete Fourier Transforms

• Accounts for finite time spacing of data points

• Gives rise to finite frequency spacing and maximum frequency (from sampling theorem)

• DFT defined by:

$$F(n) = \sum_{k=1}^{N} f(x_k)\, e^{2\pi j (x_k n / x_N)}$$

• For uniform time spacing Δ:

$$F(n) = \sum_{k=1}^{N} f_k\, e^{2\pi j k n / N}$$

Page 6: ECE 695 - Lecture 9

Discrete Fourier Transforms: Naïve Algorithm

• Rewrite DFT as:

$$F(n) = \sum_{k=1}^{N} f_k\, W^{nk}$$

• Where $W = e^{2\pi j / N}$

• Just perform sum (N operations) for each frequency (N times)

• Overall time scales as $N^2$
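The double loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code: the name `naive_dft` is hypothetical, and indices run from 0 to N-1 rather than 1 to N, which is equivalent once index N is identified with index 0 by periodicity.

```python
import cmath

def naive_dft(samples):
    """O(N^2) DFT with the sign convention W = exp(+2*pi*j/N).

    Indices run 0..N-1 here (an assumption of this sketch); the slides
    count 1..N, which is equivalent when index N is identified with 0.
    """
    n_points = len(samples)
    w = cmath.exp(2j * cmath.pi / n_points)
    return [
        # inner sum: N operations for this output frequency n
        sum(samples[k] * w ** (n * k) for k in range(n_points))
        for n in range(n_points)          # repeated for each of N frequencies
    ]

# A unit impulse at index 1 transforms to W^n, i.e. approximately 1, j, -1, -j
spectrum = naive_dft([0, 1, 0, 0])
print(spectrum)
```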

Page 7: ECE 695 - Lecture 9

Discrete Fourier Transforms: Naïve Algorithm

• Obtain same number of points in DFT as original series

• Symmetry properties same as for continuous FT

• For inverse Fourier transform, just use the inverse of W and sum over the frequency index n:

$$f_k = \frac{1}{N} \sum_{n=1}^{N} F(n) \left(\frac{1}{W}\right)^{nk}$$
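A quick round trip makes the inverse relation concrete. The sketch below is illustrative only: the function name `dft`, the `inverse` flag, and the 0-based indexing are assumptions of this example. The forward pass uses W, the inverse pass uses 1/W and divides by N, and the result has the same number of points as the input.

```python
import cmath

def dft(values, inverse=False):
    """Naive DFT using W = exp(+2*pi*j/N); the inverse uses 1/W and divides by N."""
    n_points = len(values)
    sign = -1 if inverse else 1
    w = cmath.exp(sign * 2j * cmath.pi / n_points)
    result = [
        sum(values[a] * w ** (a * b) for a in range(n_points))
        for b in range(n_points)
    ]
    if inverse:
        result = [value / n_points for value in result]
    return result

original = [1.0, 2.0, 0.5, -1.0, 3.0]          # arbitrary test data
recovered = dft(dft(original), inverse=True)   # forward, then inverse
round_trip_error = max(abs(r - o) for r, o in zip(recovered, original))
print(round_trip_error)
```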

Page 8: ECE 695 - Lecture 9

Fast Fourier Transforms: Danielson-Lanczos Lemma

• Based on Danielson-Lanczos lemma:

$$F(n) = \sum_{k=1}^{N} f_k\, W^{nk} = \sum_{k=1}^{N/2} f_{2k}\, W^{n \cdot 2k} + \sum_{k=1}^{N/2} f_{2k+1}\, W^{n(2k+1)} = F_e(n) + W^n F_o(n)$$

• Can be applied recursively
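Applied recursively, the lemma already gives a working FFT. The following Python sketch is one possible rendering under stated assumptions: the name `fft_recursive` is hypothetical, indices are 0-based, the length must be a power of two, and the sign convention W = exp(+2πj/N) is used. It also uses the fact that $W^{n+N/2} = -W^n$, so each even/odd pair yields two outputs.

```python
import cmath

def fft_recursive(values):
    """Radix-2 FFT via the Danielson-Lanczos split F(n) = F_e(n) + W^n F_o(n)."""
    n_points = len(values)
    if n_points == 0 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    if n_points == 1:
        return [complex(values[0])]      # FT of a single point is the point itself
    even = fft_recursive(values[0::2])   # F_e: even-indexed samples
    odd = fft_recursive(values[1::2])    # F_o: odd-indexed samples
    half = n_points // 2
    result = [0j] * n_points
    for n in range(half):
        twiddle = cmath.exp(2j * cmath.pi * n / n_points) * odd[n]  # W^n F_o(n)
        result[n] = even[n] + twiddle
        result[n + half] = even[n] - twiddle   # W^(n + N/2) = -W^n
    return result

print(fft_recursive([0, 1, 0, 0]))
```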

Page 9: ECE 695 - Lecture 9

Discrete Fourier Transforms: Cooley-Tukey Algorithm

• If one applies the D-L lemma recursively to a data set with $N = 2^m$ points, it reduces to the FT of a single point!

• Key is to order everything to keep track of where everything should go – then work backwards

Page 10: ECE 695 - Lecture 9

Discrete Fourier Transforms: Cooley-Tukey Algorithm

• Algorithm devised by Cooley and Tukey:

– Sort data into bit reversed order

– Perform FTs on lengths 1, 2, 4, 8, etc. with D-L lemma

• Operations for first step go as N

• Operations for second step go as N per cycle, with $\log_2 N$ cycles

• Overall time is $N \log_2 N$, considerably better than the naïve $N^2$ approach
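The two steps listed above (bit-reversed sort, then butterflies on lengths 2, 4, 8, ...) can be sketched iteratively. This is an illustrative Python version, not the original Cooley-Tukey code: the names `reverse_bits` and `fft_iterative` are hypothetical, indexing is 0-based, and the sign convention W = exp(+2πj/N) is assumed.

```python
import cmath

def reverse_bits(index, width):
    """Reverse the lowest `width` bits of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def fft_iterative(values):
    """In-order radix-2 FFT: bit-reversal sort, then log2(N) butterfly passes."""
    n_points = len(values)
    if n_points == 0 or n_points & (n_points - 1):
        raise ValueError("length must be a power of two")
    width = n_points.bit_length() - 1
    # Step 1: sort data into bit-reversed order (O(N))
    data = [complex(values[reverse_bits(i, width)]) for i in range(n_points)]
    # Step 2: combine transforms of length 1 into 2, 4, 8, ... (O(N) per pass)
    size = 2
    while size <= n_points:
        half = size // 2
        step = cmath.exp(2j * cmath.pi / size)
        for start in range(0, n_points, size):
            w = 1 + 0j
            for m in range(half):
                a = data[start + m]
                b = w * data[start + m + half]
                data[start + m] = a + b          # F_e + W^m F_o
                data[start + m + half] = a - b   # F_e - W^m F_o
                w *= step
        size *= 2
    return data

print([reverse_bits(i, 3) for i in range(8)])   # bit-reversed order for N = 8
print(fft_iterative([0, 1, 0, 0]))
```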

J.W. Cooley (IEEE Global History Network)

Page 11: ECE 695 - Lecture 9

Fast Fourier Transform Data

• Input data is uniformly spaced

• Output consists of rising positive frequencies, followed by negative frequencies decreasing in magnitude
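The output ordering can be made explicit by listing the frequency assigned to each bin. In this Python sketch the function name `dft_frequencies` and the example values are assumptions. For even N, the bin at N/2 sits at the Nyquist frequency and is labelled negative here by convention.

```python
def dft_frequencies(n_points, delta):
    """Frequency of each DFT output bin for sampling interval delta.

    The first ceil(N/2) bins hold non-negative frequencies in rising order;
    the remaining bins hold negative frequencies of decreasing magnitude.
    """
    spacing = 1.0 / (n_points * delta)   # frequency resolution 1/(N*delta)
    return [
        (n if n < (n_points + 1) // 2 else n - n_points) * spacing
        for n in range(n_points)
    ]

# 8 samples at delta = 0.5 s: spacing 0.25 Hz, Nyquist frequency 1.0 Hz
frequencies = dft_frequencies(8, 0.5)
print(frequencies)
```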

Page 12: ECE 695 - Lecture 9

Cooley-Tukey Algorithm

Page 13: ECE 695 - Lecture 9

Real FFTs

• For real functions, the general complex FFT procedure is wasteful

• Solutions:

– Pack twice as many FFTs into each calculation

– Reduce length by half, sort out result

– Use sine and cosine transforms

• Application: signal processing of experimental measurement data
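The first option, packing two real data sets into one complex transform, can be sketched as follows. The names `dft` and `two_real_dfts` are hypothetical, and the naive `dft` helper merely stands in for whatever complex FFT routine is available. The separation relies on the symmetry $X(N-n) = X(n)^*$ that holds for any real input.

```python
import cmath

def dft(values):
    """Naive complex DFT (stand-in for any complex FFT routine)."""
    n_points = len(values)
    w = cmath.exp(2j * cmath.pi / n_points)
    return [sum(values[k] * w ** (n * k) for k in range(n_points))
            for n in range(n_points)]

def two_real_dfts(x, y):
    """Transform two equal-length real sequences with a single complex DFT."""
    n_points = len(x)
    packed = dft([complex(a, b) for a, b in zip(x, y)])   # z = x + j*y
    x_hat, y_hat = [], []
    for n in range(n_points):
        mirrored = packed[-n % n_points].conjugate()      # conj(Z[N - n])
        x_hat.append((packed[n] + mirrored) / 2)          # transform of x
        y_hat.append((packed[n] - mirrored) / 2j)         # transform of y
    return x_hat, y_hat

x = [1.0, 2.0, 0.5, -1.0]
y = [0.0, -3.0, 4.0, 1.5]
x_hat, y_hat = two_real_dfts(x, y)
print(x_hat, y_hat)
```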

Page 14: ECE 695 - Lecture 9

Multidimensional FFTs

• Applications: image processing, band structures

• Definition:

$$F(n_x, n_y) = \sum_{k_x=1}^{N} \sum_{k_y=1}^{N} f_{k_x k_y}\, e^{2\pi j (k_x n_x + k_y n_y)/N}$$

• For FFT data in 2D or 3D, can efficiently perform FT in each dimension successively
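Because the 2D exponential factorizes, the double sum can be done as 1D transforms along one index followed by 1D transforms along the other. The Python sketch below illustrates this; the names `dft` and `dft_2d` are assumptions, and the naive 1D `dft` is only a placeholder for a real FFT.

```python
import cmath

def dft(values):
    """Naive 1D DFT with W = exp(+2*pi*j/N) (placeholder for a 1D FFT)."""
    n_points = len(values)
    w = cmath.exp(2j * cmath.pi / n_points)
    return [sum(values[k] * w ** (n * k) for k in range(n_points))
            for n in range(n_points)]

def dft_2d(grid):
    """2D DFT of grid[kx][ky], done one dimension at a time."""
    rows = [dft(row) for row in grid]                    # transform over ky
    columns = [dft(list(col)) for col in zip(*rows)]     # transform over kx
    return [list(row) for row in zip(*columns)]          # back to result[nx][ny]

grid = [[1.0, 2.0, 0.0],
        [0.5, -1.0, 3.0],
        [2.0, 0.0, 1.0]]
spectrum_2d = dft_2d(grid)
print(spectrum_2d[0][0])   # the (0, 0) bin is the sum of all entries
```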

M. Leistikow et al., Phys. Rev. Lett. 107, 193903 (2011).

Page 15: ECE 695 - Lecture 9

Correlation Measurements

• Application: ultrafast optics, quantum optics

• Correlation for discrete data defined by:

$$g_2(m) = \sum_{k=1}^{N} f_k\, h_{k+m}$$

• Autocorrelation: special case where 𝑓 = ℎ
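A direct evaluation of the correlation sum is short enough to write out. In the Python sketch below, the name `circular_correlation` and the periodic wrap-around of the index k+m are assumptions, since the definition above does not say how indices beyond N are handled. The example computes the autocorrelation of a period-4 signal.

```python
def circular_correlation(f, h):
    """g(m) = sum_k f[k] * h[(k + m) mod N], assuming periodic wrap-around."""
    n_points = len(f)
    return [
        sum(f[k] * h[(k + m) % n_points] for k in range(n_points))
        for m in range(n_points)
    ]

signal = [1.0, 0.0, -1.0, 0.0] * 3                 # period-4 signal, 12 samples
auto = circular_correlation(signal, signal)        # autocorrelation: f = h
print(auto)
```

For this input the autocorrelation peaks at m = 0 and repeats with the same period as the signal.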

From A.M. Weiner, Ultrafast Optics (2009).

Page 16: ECE 695 - Lecture 9

Correlation Measurements

• Autocorrelation is a powerful signature of the nature of one’s data set

• Largest value for m=0

• Pure noise: δ-function correlated

• Pure periodic signal: cross-correlation also has same period

• Most signals decay with characteristic correlation time 𝜏𝑐

Page 17: ECE 695 - Lecture 9

Time-domain data analysis

• Many PDE solvers produce a time series of data warranting spectral analysis

• Examples: finite-difference time domain, drift-diffusion models

Page 18: ECE 695 - Lecture 9

Signal Processing

• Most obvious approach: least-squares fit to FFT of time-series data

• Given a set of narrow Lorentzian peaks, should fit well, right? Problem solved!

Page 19: ECE 695 - Lecture 9

Signal Processing

• But what if the decay is slow, and unfinished?

• The FFT of the truncated time series will look significantly different from the desired Lorentzian spectrum

Page 20: ECE 695 - Lecture 9

Signal Processing

• An even greater challenge: what if you have two time decays with relatively close frequencies (this case is fairly common)?

• Can’t even detect the number of modes!

Page 21: ECE 695 - Lecture 9

Signal Processing

• Need to find an alternative strategy to straightforward FFTs

• Might want to add damping explicitly

• Most obvious approach known as decimated signal diagonalization

• One particularly useful approach devised by Mandelshtam is known as filter diagonalization

Page 22: ECE 695 - Lecture 9

Filter Diagonalization Method

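[ Mandelshtam, J. Chem. Phys. 107, 6756 (1997) ]

Given time series $y_n$, write:

$$y_n = y(n\Delta t) = \sum_k a_k\, e^{-i\omega_k n \Delta t}$$

…find complex amplitudes $a_k$ & frequencies $\omega_k$ by a simple linear-algebra problem!

Idea: pretend y(t) is the autocorrelation of a quantum system:

$$\hat{H}\,|\psi\rangle = i\hbar\,\frac{\partial}{\partial t}|\psi\rangle$$

say:

$$y_n = \langle\psi(0)|\psi(n\Delta t)\rangle = \langle\psi(0)|\hat{U}^n|\psi(0)\rangle$$

time-∆t evolution-operator:

$$\hat{U} = e^{-i\hat{H}\Delta t/\hbar}$$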

Page 23: ECE 695 - Lecture 9

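Filter-Diagonalization Method

[ Mandelshtam, J. Chem. Phys. 107, 6756 (1997) ]

$$y_n = \langle\psi(0)|\psi(n\Delta t)\rangle = \langle\psi(0)|\hat{U}^n|\psi(0)\rangle, \qquad \hat{U} = e^{-i\hat{H}\Delta t/\hbar}$$

We want to diagonalize U: eigenvalues of U are $e^{-i\omega\Delta t}$

…expand U in basis of $|\psi(n\Delta t)\rangle$:

$$U_{m,n} = \langle\psi(m\Delta t)|\hat{U}|\psi(n\Delta t)\rangle = \langle\psi(0)|\hat{U}^m\,\hat{U}\,\hat{U}^n|\psi(0)\rangle = y_{m+n+1}$$

$U_{mn}$ is given by the $y_n$’s: just diagonalize a known matrix!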

Page 24: ECE 695 - Lecture 9

Filter-Diagonalization Summary

[ Mandelshtam, J. Chem. Phys. 107, 6756 (1997) ]

$U_{mn}$ is given by the $y_n$’s: just diagonalize a known matrix!

A few omitted steps:

– Generalized eigenvalue problem (basis not orthogonal)

– Filter the $y_n$’s (Fourier transform): small bandwidth = smaller matrix (less singular)

• resolves many peaks at once

• # peaks not known a priori

• resolve overlapping peaks

• resolution >> Fourier uncertainty
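To make the linear-algebra step tangible, here is a deliberately tiny Python toy. It is not Mandelshtam's full method. It assumes exactly two noise-free modes with $y_n = \sum_k a_k e^{-i\omega_k n\Delta t}$, skips the Fourier filtering step entirely, and solves the 2×2 generalized eigenvalue problem $\det(U^{(1)} - z\,U^{(0)}) = 0$ by the quadratic formula, where $U^{(0)}_{mn} = y_{m+n}$ is the overlap matrix of the non-orthogonal basis and $U^{(1)}_{mn} = y_{m+n+1}$. The function name and the test values are hypothetical.

```python
import cmath

def two_mode_frequencies(y, dt):
    """Recover two complex frequencies from the four samples y[0..3].

    Assumes y[n] = a1*exp(-1j*w1*n*dt) + a2*exp(-1j*w2*n*dt) with no noise.
    Builds U0[m][n] = y[m+n] and U1[m][n] = y[m+n+1] (2x2 each) and solves
    det(U1 - z*U0) = 0, whose roots are z = exp(-1j*w*dt).
    """
    y0, y1, y2, y3 = y[:4]
    a = y0 * y2 - y1 * y1          # coefficient of z^2
    b = y1 * y2 - y0 * y3          # coefficient of z
    c = y1 * y3 - y2 * y2          # constant term
    root = cmath.sqrt(b * b - 4 * a * c)
    eigenvalues = [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    # z = exp(-1j*w*dt)  =>  w = 1j*log(z)/dt  (valid while |Re(w)|*dt < pi)
    return sorted((1j * cmath.log(z) / dt for z in eigenvalues),
                  key=lambda w: w.real)

dt = 0.1
modes = [(1.0, 10.0 - 0.05j), (0.7, 10.4 - 0.08j)]   # (amplitude, complex frequency)
samples = [sum(a * cmath.exp(-1j * w * n * dt) for a, w in modes)
           for n in range(4)]
recovered = two_mode_frequencies(samples, dt)
print(recovered)
```

In this example the two frequencies differ by 0.4, far less than the bin spacing 2π/(4Δt) ≈ 15.7 that a four-point DFT would offer, yet both are recovered from the four samples.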

Page 25: ECE 695 - Lecture 9

Rationale for FFTW

• In the past, most codes focused exclusively on data sets of length $2^m$

• The required padding can double the runtime

• Treating purely real data as complex can double the runtime

• Ignoring symmetry/anti-symmetry can double the runtime

• How do we account for all of these possibilities with a single software package?

Page 26: ECE 695 - Lecture 9

Planning in FFTW

• “Most people don’t plan to fail; they fail to plan” – John L. Beckley

• Planning our FFTs before we perform them can make an enormous difference

• FFTW uses a set of short codes, or “codelets,” which can be called as needed by the planner

• FFTW also compares the different possibilities using dynamic programming

Page 27: ECE 695 - Lecture 9

Planning in FFTW

• Execution time can be found in different ways:

– Estimate: uses heuristics to roughly determine it

– Measure: makes direct test runs with multiple candidate plans

• Execution time may not be directly related to the number of operations

• Instruction-level parallelism can play a critical role in enhancing performance – for example: SIMD

Page 28: ECE 695 - Lecture 9

Planning in FFTW

#include <fftw3.h>
...
{
    fftw_complex *in, *out;
    fftw_plan p;
    ...
    in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);

    /* FFTW_FORWARD: sign in exponent
       FFTW_ESTIMATE: method of estimating execution time */
    p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    ...
    fftw_execute(p);   /* repeat as needed */
    ...
    /* release the plan and the buffers when done */
    fftw_destroy_plan(p);
    fftw_free(in); fftw_free(out);
}

Page 29: ECE 695 - Lecture 9

1D Real DFTs

#include <fftw3.h>
...
{
    double *in, *final;
    fftw_complex *out;   /* r2c output holds N/2+1 complex values */
    fftw_plan p1, p2;
    ...
    /* FFTW_MEASURE: method of estimating execution time.
       It overwrites the arrays while planning, so fill `in` afterwards. */
    p1 = fftw_plan_dft_r2c_1d(N, in, out, FFTW_MEASURE);      /* transform forward */
    p2 = fftw_plan_dft_c2r_1d(N, out, final, FFTW_MEASURE);   /* transform back (un-normalized) */
    fftw_execute(p1);
    fftw_execute(p2);
    ...
}

Page 30: ECE 695 - Lecture 9

Multidimensional Real DFTs

#include <fftw3.h>
...
{
    double *in, *final;
    fftw_complex *out;   /* r2c output holds n0 x (n1/2+1) complex values */
    fftw_plan p1, p2;
    ...
    /* FFTW_PATIENT: method of estimating execution time */
    p1 = fftw_plan_dft_r2c_2d(n0, n1, in, out, FFTW_PATIENT);      /* 2D forward transform */
    p2 = fftw_plan_dft_c2r_2d(n0, n1, out, final, FFTW_PATIENT);   /* 2D backwards transform (un-normalized) */
    fftw_execute(p1);
    fftw_execute(p2);
    ...
}

Page 31: ECE 695 - Lecture 9

Multidimensional Complex DFTs

#include <fftw3.h>
...
{
    fftw_complex *in, *out, *final;
    fftw_plan p1, p2;
    ...
    /* FFTW_EXHAUSTIVE: method of estimating execution time.
       Complex plans also take the sign of the exponent as an argument. */
    p1 = fftw_plan_dft_2d(n0, n1, in, out, FFTW_FORWARD, FFTW_EXHAUSTIVE);      /* 2D forward transform */
    p2 = fftw_plan_dft_2d(n0, n1, out, final, FFTW_BACKWARD, FFTW_EXHAUSTIVE);  /* 2D backwards transform (un-normalized) */
    fftw_execute(p1);
    fftw_execute(p2);
    ...
}

Page 32: ECE 695 - Lecture 9

Learn from Your Experience

• Wisdom allows one to compute good plans once and save them to disk:

fftw_export_wisdom_to_filename("wise-dft.wis");

• Can then restore the wisdom next time with:

fftw_import_wisdom_from_filename("wise-dft.wis");

• While wisdom accumulates over time, one can discard it with:

fftw_forget_wisdom();

Page 33: ECE 695 - Lecture 9

Example: Beam Propagation

• Starting from the Helmholtz equation:

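$$-\nabla^2\psi = \left(\frac{n\omega}{c}\right)^2 \psi$$

• One can assume a solution of the form:

$$\psi = \phi\, e^{-j\beta z}$$

• Where $\phi$ is slowly varying, which gives rise to:

$$-\nabla^2\phi + 2j\beta\,\frac{\partial\phi}{\partial z} = k_\perp^2\,\phi, \qquad k_\perp^2 = \left(\frac{n\omega}{c}\right)^2 - \beta^2$$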

Page 34: ECE 695 - Lecture 9

Example: Beam Propagation

• BPM closely resembles the nonlinear Schrödinger equation, which describes a broad class of problems

• For now, we’ll focus on direct applications in optics

• Can solve in real-space or Fourier-space

Page 35: ECE 695 - Lecture 9

Example: Beam Propagation

% x, kx, n, od, nbar, k0, deltaz, etc. are assumed to be defined earlier
[xx,yy] = meshgrid([xa:del:xb-del],[1:1:zmax]);
mode = A*exp(-((x+x0)/W0).^2);      % Gaussian pulse
dftmode = fix(fft(mode));           % DFT of Gaussian pulse
zz = imread('ybranch.bmp','BMP');   % Upload image with the profile
% Diffraction phase, applied in Fourier space
phase1 = exp((i*deltaz*kx.^2)./(nbar*k0 + sqrt(max(0,nbar^2*k0^2 - kx.^2))));
for k = 1:zmax
    % Index-contrast phase and attenuation (od), applied in real space
    phase2 = exp(-(od + i*(n(k,:) - nbar)*k0)*deltaz);
    mode = ifft((fft(mode).*phase1)).*phase2;
    zz(k,:) = abs(mode);            % store field amplitude at step k
end

Page 36: ECE 695 - Lecture 9

Example: Beam Propagation

Page 37: ECE 695 - Lecture 9

Next Class

• Will discuss beam propagation method

• Recommended reading: Obayya, Chapter 2
