minimal-dispersion and maximum- likelihood predictors with ... · – minimal dispersion •...

67
Luis G. Crespo, Sean P. Kenny, and Daniel P. Giesy Dynamic Systems and Controls Branch NASA Langley Research Center, Hampton, VA, 23666 Minimal-Dispersion and Maximum- Likelihood Predictors with a Linear Staircase Structure

Upload: others

Post on 23-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

Luis G. Crespo, Sean P. Kenny, and Daniel P. Giesy

Dynamic Systems and Controls BranchNASA Langley Research Center, Hampton, VA, 23666

Minimal-Dispersion and Maximum-Likelihood Predictors with a Linear Staircase

Structure

Page 2: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

2

Outline

•  Problem statement •  Background •  Random predictor models •  Conclusions

Page 3: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

3

Problem Statement•  Goal: create a computational model of a Data Generating

Mechanism (DGM) given N input-output pairs D={x(i), y(i)}

Page 4: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

4 DGM is a deterministic function of 2 inputs without noise

Problem Statement: On the DGM

Page 5: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

5 Model form uncertainty vs. deterministic function + colored noise

Problem Statement: On the DGM

Page 6: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

6

Problem Statement•  Parametric models vs. non-parametric models •  This paper focuses on the parametric model

•  This form is implied by the superposition property of linear system theory

•  The calibration problem of interest is not standard since the calibrated variable is unobservable

Page 7: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

7

Outline

•  Problem statement •  Computational models •  Staircase variables •  Random predictor models •  Conclusions

Page 8: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

8

Computational Models•  Interval Predictor Models (IPM)

Page 9: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

9

Interval Predictor Models•  The output is an interval valued function of the input •  IPM considered here are given by

•  This leads to

where the IPM boundaries are known analytically •  Interval and functional representation •  The spread of the IPM is

Page 10: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

10

Interval Predictor Models•  IPMs are calculated by solving the convex program

Additional set of constraints

Page 11: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

11

Interval Predictor Models

np=10 N=1K

Page 12: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

12

Interval Predictor Models

np=20 N=1K

Page 13: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

13

Interval Predictor Models: Reliability•  Reliability of the Predictor: scenario theory enables

bounding the probability of a future observation falling outside the IPM: distribution-free, non-asymptotic

•  This is a probabilistic certificate of correctness prescribing the interplay between the amount of information available, the complexity of the model, a confidence parameter, and the reliability of the model

Page 14: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

14

Computational Models•  Random Predictor Models (RPM)

Page 15: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

15

Maximal-entropy Staircase RPM

The modality and skewness vary strongly with the input

Random Predictor Models

Page 16: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

16

Outline

•  Problem statement •  Computational models •  Staircase variables •  Random predictor models •  Conclusions

Page 17: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

17

Background

•  Hyper-parameters

•  Desired variables must match these constraints •  Only some are feasible •  Polynomial feasibility constraints:

Page 18: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

18

Background: 𝛉-Feasibility Equations

Distribution free

Page 19: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

19

•  A staircase random variables has a piecewise constant density function over a uniform partition of the domain that match the constraints imposed by

•  Staircases are found by solving the convex program

Staircase Variables

-1 -0.5 0 0.5 1

f z

0

0.5

1

-1 -0.5 0 0.5 1

f z

0

0.5

1

z-1 -0.5 0 0.5 1

f z

0

0.5

1

Page 20: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

20

•  Able to represent a wide range of density shapes by using different optimality criteria –  Max entropy –  Max likelihood –  Max degree of unimodality, etc

•  Able to represent most of the feasible space •  Low-computational cost: from convex optimization

Staircase Variables: Key Attributes

Page 21: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

21

Outline

•  Problem statement •  Computational models •  Staircase variables •  Random predictor models

–  Moment-matching –  Minimal dispersion

•  Conclusions

Page 22: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

22

Random Predictor Models•  The output is a random process •  RPM considered here are given by

•  Goal: given the data sequence D={x(i), y(i)} we want to characterize the distribution of p

x

y ⨯

⨯ ⨯

p1

p2

?

Page 23: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

23

Random Predictor Models•  Bayesian/Maximum likelihood approach

–  Pros: any model, any distribution –  Cons: expensive, tight to assumed distribution

x

y ⨯

⨯ ⨯

p1

p2

Page 24: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

24

Random Predictor Models•  Taking the expected value of the model equation we have

which can be combined to obtain the moment functions

Parameter independency is assumed hereafter

Page 25: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

25

Moment-Matching RPMs•  Idea: Find the moments of p leading to a prediction that

minimizes the offset between the predicted moments and the empirical moments

•  A sliding-window approach is used to estimate the empirical moments

•  The predicted moments, given by

depend upon the design variables:

Page 26: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

26

Moment-Matching RPMs•  Solution Approach: a sequence of optimization programs for

moments of increasing order. 1.  Solve for the mean

2.  Find a feasible support set using IPMs 3.  Solve for the variance

for

4.  Find a feasible support set using IPMs….

Page 27: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

27

Moment-Matching RPMs•  Outcome:

•  Advantage: approach is distribution-free: no need to assume a distribution for p upfront

•  Setting a particular uncertainty model: use staircase variables to realize the optimal moments

Page 28: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

28

Moment-Matching Example•  Goal: to characterize the unknown loading of a cantilever

beam from displacement measurements •  A datum in the sequence is a set of measurements

Page 29: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

29

Moment-Matching Example•  Basis chosen from Euler-beam theory:

Page 30: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

30

Moment-Matching Example

Skewed response

Page 31: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

31

Moment-Matching Example

Page 32: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

32

Minimal-Dispersion RPM•  Idea: find the moments of p leading to a prediction that

concentrates the response as close as possible to the data while enclosing it into a high-probability region (trade-off)

•  Solution approach: solve the optimization program

where

and the high-probability region is

Page 33: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

33

Minimal-Dispersion RPM•  Same outcome and advantage as the previous approach •  When to use: unimodal DGM •  Challenge: characterizing 𝛪𝛂 as a function of 𝛉 𝛂 as a function of 𝛉 as a function of 𝛉 In the paper we use:

but a better 𝛪𝛂 can be derived using regression/staircases 𝛂 can be derived using regression/staircases can be derived using regression/staircases

Page 34: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

34

Minimal-Dispersion: Example•  Consider the data-cloud, and an arbitrary basis

0 0.2 0.4 0.6 0.8 1x

-5

-4

-3

-2

-1

0

1

2

3

4

y

DataIPM

Page 35: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

35

Minimal-Dispersion: Example•  Resulting RPM

Page 36: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

36

Minimal-Dispersion: Example•  Distribution of the staircase parameters

p10 0.2 0.4 0.6

f p1

0

2

p2-0.2 0 0.2 0.4 0.6

f p2

00.51

p30 0.5 1

f p30

0.5

p40 0.5 1

f p4

0

1

p50 0.5 1 1.5

f p5

0

0.5

p60 0.5 1 1.5

f p6

0

0.5

p70 0.5 1

f p7

00.51

p80 0.5 1

f p8

00.51

p9-0.1 0 0.1 0.2 0.3 0.4

f p9

0

2

p10-0.2 0 0.2 0.4 0.6 0.8

f p10

00.51

Page 37: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

37

Conclusions

•  A framework for calibrating affine probabilistic models was developed

•  Technique is moment-based and distribution-free •  Computational demands are considerably lower than

maximum/likelihood based approaches •  Eliminates the need for assuming a distribution of the

uncertainty upfront •  Analytical propagation of moments is possible when

dependency is a known polynomial (we only did linear)

Page 38: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

38

Conclusions

•  Parameter dependencies can be accounted for (not done here, cumbersome)

•  All sources of uncertainty and error are lumped into the resulting characterization of p...

Page 39: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

Luis G. Crespo, Sean P. Kenny, and Daniel P. Giesy

Dynamic Systems and Controls BranchNASA Langley Research Center, Hampton, VA, 23666

Random Predictor Models with a Linear Staircase Structure

Page 40: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

40

Staircases

•  Consider a random variable z with probability density function (PDF) and support set

•  The central moments, defined as

are assumed to exist •  Goal: to calculate a random variable with a bounded

support given values for the first four moments

Page 41: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

41

𝛉-Feasibility-Feasibility

•  Does there exist a random variable that meets the constraints imposed by ?

•  Distribution-free vs. distribution fixed •  Such a random variable(s) exist if the set of

polynomial constraints is satisfied

[1] - Sharma 2015, ArXiv 1503.03786; Kumar 2012

Page 42: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

42

𝛉-Feasibility: equations-Feasibility: equations

Page 43: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

43

𝛉-Feasibility-Feasibility

•  Feasible domain

Page 44: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

44

𝛉-Feasibility: intersections-Feasibility: intersections

Page 45: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

45

𝛉-Feasibility-Feasibility

•  Feasible domain

•  This set is non-convex •  Standard random variables cannot realize most of Θ

Page 46: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

46

𝛉-Feasibility: intersections-Feasibility: intersections

Page 47: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

47

𝛉-Feasibility-Feasibility

•  Feasible domain

•  This set is non-convex •  Standard random variables cannot realize most of Θ •  There might exist infinitely many random variables

able to realize a feasible point

Page 48: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

48

𝛉-Feasibility-Feasibility

•  Feasible domain

•  This set is non-convex •  Standard random variables cannot realize most of Θ •  There might exist infinitely many random variables

able to realize a feasible point •  How to construct a family of random variables that

can realize most of Θ?

Page 49: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

49

Staircase random variables

•  Staircase variables have a piecewise constant PDF over a uniform partition of : nb bins

•  The PDF of a staircase variable is given by

where is given by

Page 50: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

50

Staircase random variables

•  Cost to be defined later •  Hyper-parameter: •  The above equation can be written as

Page 51: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

51

Staircase random variables

•  If the cost function is convex, calculating a staircase variable entails solving a convex optimization program: efficiently done for hundreds of thousands of constraints/design variables

•  This optimization problem might be infeasible: distribution-fixed

Page 52: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

52

Staircase variables: cost function

•  Does not affect staircase-feasibility •  Three classes considered

–  Maximal entropy

–  Minimal squared likelihood –  Optimal target matching

•  Other costs: max/min likelihood, min support, etc. •  Let’s explore their structure and dependencies

Page 53: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

53

Staircase random variables: nb

-1 -0.5 0 0.5 1f z

0

0.5

1

-1 -0.5 0 0.5 1

f z

0

0.5

1

z-1 -0.5 0 0.5 1

f z

0

0.5

1

Page 54: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

54

Staircase random variables: cost J

z-1 -0.5 0 0.5 1

f z

0

0.2

0.4

0.6

0.8

1

1.2 DGMMax EntropyMin Square LikelihoodPDF matching

Page 55: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

55

Staircase variables: worst-case variable

F

z

Pbox(J|θ)

1

Page 56: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

56

Staircase variables: worst-case PDF

F

z

P[ z > z1 | θ ] = [𝛂, 𝛃] 1

z1

𝛂

𝛃

Page 57: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

57

Staircase variables: feasibility

•  The staircase feasible space is defined as

•  How much of Θ can staircase variables represent?

Page 58: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

nb=10

Θ

Θ\S m

3

m2 m2 m2

Page 59: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

nb=10

Θ S

Θ\S m

3

m2 m2 m2

Page 60: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

nb=10

Θ S

Θ\S m

3

m2 m2 m2

Page 61: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

nb=10

nb=100

nb=200

Θ

Θ

Θ

S

S

S

Θ\S

Θ\S

Θ\S

m3

m3

m3

m2 m2 m2

Page 62: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

62

Staircase variables: feasibility

Page 63: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

63

Setting Target Functions

Page 64: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

64

Setting Target Functions

Page 65: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

65

Setting Target Functions

Page 66: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

66

Setting Target Functions

Page 67: Minimal-Dispersion and Maximum- Likelihood Predictors with ... · – Minimal dispersion • Conclusions . 22 Random Predictor Models • The output is a random process • RPM considered

67

-3 -2 -1 0 1 2 3x

0

0.5

1

1.5

2

y

True values Approximation

True Moments vs. Approximation

z

z

µ

m2

m3

m4