Page 1: Frequency Domain Predictive Modeling with Aggregated Datasanmi.cs.illinois.edu/documents/koyejo-freq-aggregate.pdf · 2017-02-18 · Time-limited Data In nite time series data are

Frequency Domain Predictive Modeling with Aggregated Data

Sanmi Koyejo

University of Illinois at Urbana-Champaign

Joint work with

Avradeep Bhowmik, Joydeep Ghosh

University of Texas at Austin

Images courtesy: Econintersect (BEA), NOAA, Blue Hill Observatory

Motivation

Data are often released in aggregated form in practice (Burrell et al., 2004; Lozano et al., 2009; Davidson et al., 1978)

Worse, sampling periods need not be aligned, and aggregation periods need not be uniform¹:

ratio of government debt to GDP reported yearly

GDP growth rate reported quarterly

unemployment rate and inflation rate reported monthly

interest rate, stock market indices and currency exchange rates reported daily

¹ Bureau of Labor Statistics, Bureau of Economic Analysis

Motivation - II

Naive fitting of aggregated data may result in ecological fallacy (Freedman et al., 1991; Robinson, 2009)

Reconstruction (before model fitting) is expensive and unreliable

Main Contribution

Model estimation procedure in the frequency domain

avoids input data reconstruction

achieves provably bounded generalization error.

Problem Setup

Features $x(t) = [x_1(t), x_2(t), \cdots, x_d(t)]$, targets $y(t)$

Weak Stationarity+

Zero-mean: $E[y(t)] = 0$

Finite variance: $E[y(t)^2] < \infty$

Autocorrelation function satisfies: $E[y(t)\,y(t')] = \rho(\|t - t'\|)$

same assumptions for $x(t)$

Residual process

Let $\varepsilon_\beta(t) = x(t)^\top \beta - y(t)$ be the residual error process of a linear model

Observe that $\varepsilon_\beta(t)$ is weakly stationary

Problem Setup - II

Performance measure is the expected squared residual error

$$L(\beta) = E\left[|\varepsilon_\beta(t)|^2\right] = E\left[|x(t)^\top \beta - y(t)|^2\right]$$

which is optimized as:

$$\beta^* = \arg\min_\beta L(\beta)$$
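
As a worked step (an addition, assuming the second-moment matrix $E[x(t)\,x(t)^\top]$ is invertible, which the slides do not state), setting the gradient of the expected squared residual to zero gives the usual normal equations:

```latex
\begin{aligned}
\nabla_\beta L(\beta) &= 2\,E\left[x(t)\left(x(t)^\top \beta - y(t)\right)\right] = 0 \\
\Rightarrow\quad \beta^* &= \left(E\left[x(t)\,x(t)^\top\right]\right)^{-1} E\left[x(t)\,y(t)\right]
\end{aligned}
```

If $x(t)$ and $y(t)$ are jointly weakly stationary, neither expectation depends on $t$, so $\beta^*$ is a single fixed vector.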

Data Aggregation in Time Series

[Figure: non-aggregated and aggregated versions of features X1, X2, X3 and target Y]

Data Aggregation in Time Series - II

Each coordinate of the feature set is aggregated

$$x_i[l] = \frac{1}{T_i} \int_{(l-1)T_i/2}^{lT_i/2} x_i(\tau)\, d\tau$$

Similarly, the targets are aggregated

$$y[k] = \frac{1}{T} \int_{(k-1)T/2}^{kT/2} y(\tau)\, d\tau$$

for $k, l \in \mathbb{Z} = \{\cdots, -1, 0, 1, \cdots\}$.
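
A minimal numerical sketch of this aggregation step, under simplifying assumptions that are not taken from the slides: the continuous signal is approximated on a fine grid, windows are contiguous and non-overlapping, and each aggregate is the window mean. The helper name `aggregate` and all parameter values are illustrative.

```python
import numpy as np

def aggregate(signal, dt, period):
    """Average a finely sampled signal over contiguous windows of length `period`.

    The window mean approximates (1/period) * integral of the signal over the window.
    """
    w = int(round(period / dt))          # fine-grid samples per window
    n_windows = len(signal) // w         # drop any incomplete trailing window
    return signal[: n_windows * w].reshape(n_windows, w).mean(axis=1)

dt = 0.01
t = np.arange(2000) * dt                 # 20 time units on a fine grid
x1 = np.sin(2 * np.pi * 0.2 * t)         # stand-in for a feature x_1(t)
y = 0.5 * x1                             # stand-in for the target y(t)

x1_agg = aggregate(x1, dt, period=2.0)   # feature aggregated with T_1 = 2
y_agg = aggregate(y, dt, period=5.0)     # target aggregated with T = 5

print(x1_agg.shape, y_agg.shape)         # the two series no longer share an index
```

Because the feature and the target end up with different lengths, they cannot be paired row by row, which is the misalignment the frequency-domain approach is meant to sidestep.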

Aggregation in Time and Frequency Domain

Fourier captures global properties of the signal

In time domain, convolution with square wave + sampling

$z(t)$ → convolution → sampling → $z[k]$

In frequency domain, multiplication with sinc function + sampling

$Z(\omega)$ → multiplication → sampling → $Z(\omega)$ of the aggregated signal
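
The sinc factor can be checked numerically. The sketch below assumes a unit-area rectangular window of width $T$ centered at zero (the averaging kernel), whose Fourier transform is $\sin(\omega T/2)/(\omega T/2)$. The names `box_transform` and `U` are illustrative, and note that `np.sinc(x)` is the normalized form $\sin(\pi x)/(\pi x)$.

```python
import numpy as np

def U(omega, T):
    """Fourier transform of a unit-area box of width T centered at 0."""
    return np.sinc(omega * T / (2 * np.pi))   # equals sin(wT/2) / (wT/2)

def box_transform(omega, T, n=20001):
    """Numerically integrate (1/T) * exp(-i*omega*t) over (-T/2, T/2)."""
    t = np.linspace(-T / 2, T / 2, n)
    dt = t[1] - t[0]
    f = np.exp(-1j * omega * t) / T
    # trapezoid rule written out explicitly
    return (f.sum() - 0.5 * (f[0] + f[-1])) * dt

T = 3.0
for omega in (0.0, 0.5, 1.0, 2.0):
    print(omega, box_transform(omega, T).real, U(omega, T))
```

The two columns should agree closely, and the factor shrinks as $\omega T$ grows, which is why longer aggregation periods attenuate high frequencies more strongly.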

Restricted Fourier Transform

For signal $z(t)$, the $T$-restricted Fourier Transform is defined as:

$$Z_T(\omega) = \mathcal{F}_T[z](\omega) = \int_{-T}^{T} z(t)\, e^{-\iota \omega t}\, dt$$

Equivalent to a full Fourier Transform if the signal is time-limited within $(-T, T)$

Always exists finitely if the signal $z(t)$ is finite
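
A numerical sketch of the $T$-restricted transform, assuming the signal is available on a uniform fine grid and using a plain trapezoid rule (an illustrative choice, not something the slides prescribe). The test signal $e^{-t^2/2}$ is effectively time-limited within $(-10, 10)$, so the result should be close to its full transform $\sqrt{2\pi}\,e^{-\omega^2/2}$.

```python
import numpy as np

def restricted_ft(z, t, omegas):
    """Approximate Z_T(w) = integral over (-T, T) of z(t) exp(-i w t) dt.

    `t` is a uniform grid spanning [-T, T]; `z` holds z(t) on that grid.
    """
    dt = t[1] - t[0]
    weights = np.full(t.shape, dt)
    weights[[0, -1]] = dt / 2                      # trapezoid end-point weights
    kernel = np.exp(-1j * np.outer(omegas, t))     # shape (len(omegas), len(t))
    return kernel @ (z * weights)

T = 10.0
t = np.linspace(-T, T, 4001)
z = np.exp(-t**2 / 2)
omegas = np.array([0.0, 0.5, 1.0, 2.0])

Z_T = restricted_ft(z, t, omegas)
Z_full = np.sqrt(2 * np.pi) * np.exp(-omegas**2 / 2)   # closed-form full transform
print(np.abs(Z_T - Z_full).max())
```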

Time-limited Data

Infinite time series data are not available; instead assume data are available between time intervals $(-T_0, T_0)$

We apply $T_0$-restricted Fourier transforms computed from time-limited data

Assume time-restricted Fourier transforms decay rapidly with frequency, e.g. the autocorrelation function is a Schwartz function (Terzioglu, 1969)

Thus, most of the signal power lies between frequencies $(-\omega_0, \omega_0)$
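
To make the band-concentration claim concrete, here is a small sketch for one rapidly decaying example chosen purely for illustration: the Gaussian $z(t) = e^{-t^2/2}$, whose power spectrum is $|Z(\omega)|^2 = 2\pi e^{-\omega^2}$. For this particular signal the fraction of power inside $(-\omega_0, \omega_0)$ equals $\mathrm{erf}(\omega_0)$; the function name `power_fraction` and the grid settings are assumptions.

```python
import math
import numpy as np

def power_fraction(omega0, n=200001, omega_max=20.0):
    """Fraction of the power of z(t) = exp(-t^2/2) inside (-omega0, omega0).

    Uses the closed-form power spectrum |Z(w)|^2 = 2*pi*exp(-w^2) and a
    Riemann sum on a wide grid standing in for the whole frequency axis.
    """
    w = np.linspace(-omega_max, omega_max, n)
    power = 2 * np.pi * np.exp(-w**2)
    inside = np.abs(w) < omega0
    return power[inside].sum() / power.sum()

for omega0 in (0.5, 1.0, 2.0, 3.0):
    print(omega0, power_fraction(omega0), math.erf(omega0))
```

For this example a modest cutoff already captures nearly all of the power, which is the property the algorithm relies on when it samples only frequencies in $(-\omega_0, \omega_0)$.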

Proposed Algorithm

Step I

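1 Input parameters $T_0$, $\omega_0$, $D$, aggregated data samples $x[k]$, $y[l]$

2 Sample $D$ frequencies uniformly between $(-\omega_0, \omega_0)$

$$\Omega = \{\omega_1, \omega_2, \cdots, \omega_D : \omega_i \in (-\omega_0, \omega_0)\}$$

3 For each $\omega \in \Omega$, compute $T_0$-restricted Fourier Transforms $X_{T_0}(\omega)$, $Y_{T_0}(\omega)$ from aggregated signals $x[k]$, $y[l]$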
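
Step I might be sketched as follows. This is a minimal illustration, assuming each aggregated sample is placed at the center of its window and the restricted transform is approximated by a Riemann sum with spacing equal to the aggregation period. The function names, the random seed, and the constant test signal are all hypothetical.

```python
import numpy as np

def sample_frequencies(omega0, D, rng):
    """Step 2: draw D frequencies uniformly from (-omega0, omega0)."""
    return rng.uniform(-omega0, omega0, size=D)

def restricted_ft_from_aggregates(samples, centers, period, omegas):
    """Step 3: Riemann-sum approximation of the T0-restricted transform.

    samples[k] is the aggregate whose window is centered at centers[k];
    each window contributes samples[k] * exp(-i*w*centers[k]) * period.
    """
    kernel = np.exp(-1j * np.outer(omegas, centers))
    return period * (kernel @ samples)

rng = np.random.default_rng(0)
T0, omega0, D = 50.0, 0.5, 8
omegas = sample_frequencies(omega0, D, rng)

period = 2.0
centers = np.arange(-T0 + period / 2, T0, period)   # window centers in (-T0, T0)
x_agg = np.ones_like(centers)                        # aggregates of x(t) = 1

X = restricted_ft_from_aggregates(x_agg, centers, period, omegas)
print(np.round(np.abs(X), 3))
```

For the constant signal the transform at $\omega = 0$ is simply the interval length $2T_0$, which gives a quick sanity check on the scaling.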

Step II

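Recall: $U_T$ is the Fourier transform of the square wave

4 Estimate non-aggregated Fourier transforms

$$\hat{X}_{i,T_0}(\omega) = \frac{X_{i,T_0}(\omega)}{U_{T_i}(\omega)}, \qquad \hat{Y}_{T_0}(\omega) = \frac{Y_{T_0}(\omega)}{U_T(\omega)}$$

where the numerators are the transforms computed from the aggregated signals in step 3

5 Estimate parameter $\beta$ as:

$$\hat{\beta} = \arg\min_\beta \frac{1}{|\Omega|} \sum_{\omega \in \Omega} E\left\|\hat{X}_{T_0}(\omega)^\top \beta - \hat{Y}_{T_0}(\omega)\right\|^2$$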
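
The whole procedure can be sketched end to end on synthetic data. Everything below is an illustrative assumption rather than the authors' implementation: the signals are sums of low-frequency cosines, aggregates are contiguous window means placed at window centers, the window transform is taken to be $\sin(\omega T/2)/(\omega T/2)$, the expectation is replaced by a single realization, and the complex least-squares problem is solved for a real $\beta$ by stacking real and imaginary parts.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T0, omega0, D = 0.01, 150.0, 0.6, 200
t = np.arange(int(round(2 * T0 / dt))) * dt - T0     # fine grid on [-T0, T0)

def smooth_signal():
    """Sum of five low-frequency cosines, so most power lies inside (-omega0, omega0)."""
    nu = rng.uniform(0.05, 0.5, size=5)
    phase = rng.uniform(0, 2 * np.pi, size=5)
    return np.cos(np.outer(t, nu) + phase).sum(axis=1)

beta_true = np.array([1.0, -2.0, 0.5])
x = np.stack([smooth_signal() for _ in beta_true], axis=1)     # shape (len(t), 3)
y = x @ beta_true + 0.1 * rng.standard_normal(len(t))

def aggregated_ft(signal, period, omegas):
    """Aggregate with the given period, then return the restricted transform (step 3)."""
    w = int(round(period / dt))
    agg = signal.reshape(-1, w).mean(axis=1)          # window means
    centers = t.reshape(-1, w).mean(axis=1)           # window centers
    return period * (np.exp(-1j * np.outer(omegas, centers)) @ agg)

def U(omega, period):
    """Transform of the unit-area averaging window: sin(wT/2) / (wT/2)."""
    return np.sinc(omega * period / (2 * np.pi))

omegas = rng.uniform(-omega0, omega0, size=D)         # step 2
periods_x, period_y = [0.5, 1.0, 2.0], 1.5            # misaligned aggregation periods

# Step 4: divide each aggregated transform by the matching window transform.
X_hat = np.stack([aggregated_ft(x[:, i], T_i, omegas) / U(omegas, T_i)
                  for i, T_i in enumerate(periods_x)], axis=1)
Y_hat = aggregated_ft(y, period_y, omegas) / U(omegas, period_y)

# Step 5: least squares over the sampled frequencies, stacking real and
# imaginary parts so that the fitted beta is real-valued.
A = np.vstack([X_hat.real, X_hat.imag])
b = np.concatenate([Y_hat.real, Y_hat.imag])
beta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
print(beta_hat)
```

With these settings the estimate should land close to `beta_true` even though the three features and the target are aggregated with different periods and are never reconstructed in the time domain.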

Generalization Analysis

Main result I

Theorem (Bhowmik, Ghosh, and Koyejo (2017))

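For every small $\xi > 0$, $\exists$ corresponding $T_0$, $D$ such that the estimate $\hat{\beta}$ satisfies:

$$E\left[|x(t)^\top \hat{\beta} - y(t)|^2\right] < (1 + \xi)\left(E\left[|x(t)^\top \beta^* - y(t)|^2\right]\right) + 2\xi$$

with probability at least $1 - e^{-O(D^2 \xi^2)}$

Thus, generalization error is bounded with sufficiently large $T_0$, $D$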

Aliasing Effects, Non-uniform Sampling

Signals not bandlimited ⇒ Aliasing

Errors are smallest for frequencies around 0

Non-uniform sampling leads to further error

Performance will depend on rapid decay of power spectral density
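
The aliasing effect can be seen in a small sketch (the frequencies and sampling period here are arbitrary illustrative values): once a signal is sampled with period $T$, a component at frequency $\nu$ is indistinguishable from one at $\nu - 2\pi/T$, so any power outside the band around 0 folds back onto low frequencies.

```python
import numpy as np

T = 2.0                               # sampling period
omega_s = 2 * np.pi / T               # sampling rate in rad per unit time
k = np.arange(-50, 51)                # sample indices
t_k = k * T                           # sampling instants

nu_high = 2.9                         # component above omega_s / 2
nu_alias = nu_high - omega_s          # where it appears after sampling

high = np.cos(nu_high * t_k)
alias = np.cos(nu_alias * t_k)

print(nu_alias, np.abs(high - alias).max())
```

The two sampled sequences agree up to floating-point error, so nothing computed from the samples alone can tell the two components apart.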

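Main result II

Non-uniform aggregation, Finite samples

Theorem (Bhowmik, Ghosh, and Koyejo (2017))

Let $\omega_i$, $\omega_y$ be the sampling rate for $x_i(t)$, $y(t)$ respectively. Let $\omega_s = \min\{\omega_y, \omega_1, \omega_2, \cdots, \omega_d\}$. Then, for small $\xi > 0$, $\exists$ corresponding $T_0$, $D$ such that the estimate $\hat{\beta}$ satisfies:

$$E\left[|x(t)^\top \hat{\beta} - y(t)|^2\right] < (1 + \xi)\left(E\left[|x(t)^\top \beta^* - y(t)|^2\right]\right) + 4\xi + 2e^{-O((\omega_s - 2\omega_0)^2)}$$

with probability at least $1 - e^{-O(D^2 \xi^2)} - e^{-O(N^2 \xi^2)}$

Generalization error can be made small if $T_0$, $D$ are high, $\omega_0$ is small, and the minimum sampling frequency $\omega_s$ is high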

Empirical Evaluation

Synthetic Data

Fig 1(a): No Discrepancy Fig 1(b): Low Discrepancy

Performance on synthetic data with varying $\omega_0$, and increasing sampling and aggregation discrepancy

Synthetic Data - II

Fig 1(c): Medium Discrepancy Fig 1(d): High Discrepancy

Performance on synthetic data with varying $\omega_0$, and increasing sampling and aggregation discrepancy

Las Rosas Dataset

Regressing corn yield against nitrogen levels, topographical properties, brightness value, etc.

UCI Forest Fires Dataset

Regressing burned acreage against meteorological features, relative humidity, ISI index, etc. on UCI Forest Fires Dataset

Comprehensive Climate Dataset (CCDS)

Regressing atmospheric vapour levels over continental United States vs readings of carbon dioxide levels, methane, cloud cover, and other extra-meteorological measurements

Conclusion

Additional Details

More detailed analysis (not shown) allows for more precise error control

Algorithm and analysis easily extend to multi-dimensional indexes e.g. spatio-temporal data using the multi-dimensional Fourier transform

number of frequency samples may depend exponentially on index dimension (typically < 4)

Extends to cases where aggregation and sampling period are non-overlapping.

Conclusion and Future Work

Proposed a novel procedure with bounded generalization error for learning with aggregated data

Significant improvements vs reconstruction-based estimation.

Future Work:

Exploit other frequency domain structure e.g. sparse spectrum to improve estimates.

Extensions to non-linear estimators

Questions?

For more details:

Bhowmik, A., Ghosh, J. and Koyejo, O., 2017. Frequency Domain Predictive Modeling with Aggregated Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS).

References

References I

Avradeep Bhowmik, Joydeep Ghosh, and Oluwasanmi Koyejo. Frequency domain predictive modelling with aggregated data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.

Jenna Burrell, Tim Brooke, and Richard Beckwith. Vineyard computing: Sensor networks in agricultural production. IEEE Pervasive Computing, 3(1):38-45, 2004.

James EH Davidson, David F Hendry, Frank Srba, and Stephen Yeo. Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom. The Economic Journal, pages 661-692, 1978.

David A Freedman, Stephen P Klein, Jerome Sacks, Charles A Smyth, and Charles G Everett. Ecological regression and voting rights. Evaluation Review, 15(6):673-711, 1991.

Aurelie C Lozano, Hongfei Li, Alexandru Niculescu-Mizil, Yan Liu, Claudia Perlich, Jonathan Hosking, and Naoki Abe. Spatial-temporal causal modeling for climate change attribution. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 587-596. ACM, 2009.

William S Robinson. Ecological correlations and the behavior of individuals. International Journal of Epidemiology, 38(2):337-341, 2009.

T Terzioglu. On Schwartz spaces. Mathematische Annalen, 182(3):236-242, 1969.