Lecture 3: Regression analysis & model fitting

Shane Elipot, The Rosenstiel School of Marine and Atmospheric Science, University of Miami

Created with Remark.js using Markdown + MathJax



References

[1] Bendat, J. S., & Piersol, A. G. (2011). Random Data: Analysis and Measurement Procedures (Vol. 729). John Wiley & Sons.

[2] Thomson, R. E., & Emery, W. J. (2014). Data Analysis Methods in Physical Oceanography. Newnes. dx.doi.org/10.1016/B978-0-12-387782-6.00003-X

[3] Press, W. H., et al. (2007). Numerical Recipes, 3rd edition: The Art of Scientific Computing. Cambridge University Press.

[4] von Storch, H., & Zwiers, F. W. (1999). Statistical Analysis in Climate Research. Cambridge University Press.

[5] Rawlings, J., Pantula, S. G., & Dickey, D. A. (1998). Applied Regression Analysis: A Research Tool, 2nd edition. Springer.

[6] Wunsch, C. (2006). Discrete Inverse and State Estimation Problems: With Geophysical Fluid Applications. Cambridge University Press.

[7] Fan, J., & Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. CRC Press.

Lecture 3: Outline

1. Introduction
2. Linear regression
3. Polynomial interpolation
4. Local polynomial modeling
5. A note on nonlinear modeling

1. Introduction


Introduction

Regression analysis consists in using mathematical expressions (that is, modeling, or modelling in the U.K.) to describe to some extent the behavior of a random variable (r.v.) of interest. This variable is called the dependent variable. The variables that are thought to provide information about the dependent variable, and are incorporated in the model, are called independent variables. The models used in regression analysis typically involve unknown constants, called parameters, which are to be estimated from the data.

The mathematical complexity of the model, and the degree to which it is realistic, depend on how much is known about the process and on the purpose of the regression analysis (and on the ability and knowledge of the scientist).

Most regression models that we will encounter are linear in their parameters. If they are not linear, they can often be linearized.

Critical thinking should be employed, as any model can be fitted to (or regressed against) any data.

Example

Daily atmospheric CO2 measured at Mauna Loa in Hawaii at an altitude of 3400 m. Data from Dr. Pieter Tans, NOAA/ESRL (www.esrl.noaa.gov/gmd/ccgg/trends/) and Dr. Ralph Keeling, Scripps Institution of Oceanography (scrippsco2.ucsd.edu).

Example

Determining the linear trend of this time series is an example of linear regression. Further modeling could include estimating the seasonal cycle of the time series, etc.

2. Linear regression


Simple linear regression

We are going to review the simplest linear model, involving one independent variable $x$ and one dependent variable $y$. In parallel we will also present the equations for a more general model relating $y$ to $p$ independent variables $x_1, x_2, \ldots, x_p$. Matlab uses notation that resembles the matrix formulas for the general (multivariate) linear model.

As an example, we will see that the least squares solution of the linear model is

$$\hat{\beta} = (X^T X)^{-1} X^T Y$$

which in Matlab can be written

B = (X'*X)^-1*X'*Y;

but is better coded as

B = (X'*X)\X'*Y;
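The same computation can be sketched in Python with NumPy (a hypothetical synthetic dataset, for illustration only). As with the two Matlab forms above, solving the normal-equations system directly is preferable to forming the explicit inverse:

```python
import numpy as np

# Synthetic data for the simple linear model Y = b0 + b1*X + noise
# (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])   # design matrix with an intercept column
beta_true = np.array([1.0, 2.0])
Y = X @ beta_true + 0.1 * rng.standard_normal(x.size)

# Normal-equations form with an explicit inverse (mirrors B = (X'*X)^-1*X'*Y)
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ Y

# Preferred: solve the linear system instead (mirrors B = (X'*X)\X'*Y)
beta_solve = np.linalg.solve(X.T @ X, X.T @ Y)

print(beta_solve)  # close to [1.0, 2.0]
```

For badly conditioned design matrices, `numpy.linalg.lstsq`, which uses an SVD-based solver, is more robust than either form.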



Simple linear regression

Previously, when examining a set of observations $Y_i$ of $y$, we assumed that the expectation, or true mean, was constant, i.e. $E[Y_i] = \mu_y$.

We now consider the case when the mean is a function of another variable, for example time. Linear regressions can consist in estimating the trend and the seasonal cycle of your time series.

In the case of the CO2 record, the mean is clearly not a constant: it is increasing every year, but also oscillating within each year.

Simple linear regression

The simplest model is that the true mean or expectation of $y$ changes at a constant rate as the value of $x$ decreases or increases:

$$E[Y_i] = \beta_0 + \beta_1 X_i, \quad i = 1, \ldots, n$$

where $\beta_0$ and $\beta_1$ are the parameters to estimate.

This model is applicable, for example, for estimating a linear trend of a time series, or any linear relationship between two r.v.s.


Simple linear regression

The observations of the dependent variable $y$ are looked at as individual realizations of the r.v.s $Y_i$ with population means $E[Y_i]$. The deviation of $Y_i$ from $E[Y_i]$ is taken into account by incorporating a random error $\epsilon_i$ in the linear model

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$$

The $\epsilon_i$ are assumed normally independent identically distributed (i.i.d.) r.v.s as $\epsilon_i \sim \mathcal{N}(0,\sigma)$. Since $\beta_0$ and $\beta_1 X_i$ are constant, $Y_i \sim \mathcal{N}(E[Y_i],\sigma)$.

In contrast, the observed values of $X$ are supposed to be free of errors, treated as constants.

General linear model (or multiple regression)

The general linear model with $p$ independent variables for observation $i$ is

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \epsilon_i$$

There are $p' = p + 1$ parameters to estimate: a constant ($\beta_0$) and $p$ factors $\beta_1, \ldots, \beta_p$. In matrix notation, for $n$ observations, we obtain the linear system

$$Y = X\beta + \epsilon$$

$$\underbrace{\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}}_{(n\times 1)} = \underbrace{\begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1p} \\ 1 & X_{21} & X_{22} & \cdots & X_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{np} \end{bmatrix}}_{(n\times p')} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}}_{(p'\times 1)} + \underbrace{\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}}_{(n\times 1)}$$
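To make the matrix notation concrete, here is a minimal NumPy sketch of assembling such a design matrix. The trend-plus-annual-cycle columns are a hypothetical choice echoing the CO2 example, not something specified in the slides:

```python
import numpy as np

# Hypothetical monthly time axis (in years) for a trend + annual-cycle model:
# Y_i = b0 + b1*t_i + b2*cos(2*pi*t_i) + b3*sin(2*pi*t_i) + eps_i
t = np.arange(0, 5, 1.0 / 12.0)   # 5 years of monthly samples
n = t.size

# Design matrix: a column of ones, then one column per independent variable
X = np.column_stack([
    np.ones(n),               # intercept, beta_0
    t,                        # linear trend
    np.cos(2 * np.pi * t),    # annual cycle, cosine part
    np.sin(2 * np.pi * t),    # annual cycle, sine part
])

print(X.shape)  # (n, p') with p' = p + 1 = 4
```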

General linear model

Each element $\beta_j$ of

$$\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}$$

is a partial regression coefficient that quantifies the change in the dependent variable $Y_i$ per unit change in the independent variable $X_{ij}$, assuming all other independent variables are held constant.


Simple linear model

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$$

If the $\epsilon_i$ were $0$ and the model were absolutely true, any two pairs of observations $(X_i, Y_i)$ would be enough to solve for the two unknown parameters $\beta_0$ and $\beta_1$.

Yet, because of errors, another method is used, called least squares estimation, which gives a solution, or estimates $(\hat\beta_0, \hat\beta_1)$, that leads to the smallest possible sum of squared deviations of the observations $Y_i$ from the estimates $\widehat{E[Y_i]}$ of their true means $E[Y_i]$.

Simple linear model: LS solution

Let $(\hat\beta_0, \hat\beta_1)$ provide the estimate of the true mean

$$\hat\beta_0 + \hat\beta_1 X_i = \widehat{E[Y_i]} \equiv \hat{Y}_i$$

such that the sum of squares of deviations from the mean

$$SS(\mathrm{Res}) = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$$

is minimized. $e_i$ is called the $i$-th observed residual.

General linear model

For the general model, in matrix notation,

$$\hat{Y} \equiv X\hat\beta$$

and the residuals are found in the $(n\times 1)$ vector

$$e = Y - \hat{Y} = Y - X\hat\beta$$

and the sum of squares of residuals is

$$SS(\mathrm{Res}) = e^T e = (Y - \hat{Y})^T (Y - \hat{Y}) = (Y - X\hat\beta)^T (Y - X\hat\beta)$$

which is a minimum for the choice of $\hat\beta$. How to find $\hat\beta$?

Simple linear model

The method to find the values $(\hat\beta_0, \hat\beta_1)$ that minimize $SS(\mathrm{Res})$ is classic. You take the derivatives of $SS(\mathrm{Res})$ with respect to each of the $p+1$ parameters and equate the results to zero. You obtain a system of $p+1$ equations with $p+1$ unknowns. For the simple linear model you obtain the normal equations

$$\begin{aligned} n\hat\beta_0 + \hat\beta_1 \sum_i X_i &= \sum_i Y_i \\ \hat\beta_0 \sum_i X_i + \hat\beta_1 \sum_i X_i^2 &= \sum_i X_i Y_i \end{aligned}$$

whose solution is

$$\hat\beta_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} = \frac{s_{xy}}{s_{xx}} = \frac{s_{xy}}{s_x^2}, \qquad \hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}$$
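A minimal NumPy sketch of this solution (hypothetical data), cross-checked against `numpy.polyfit`:

```python
import numpy as np

# Hypothetical data to illustrate the normal-equations solution
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

sxy = np.sum((x - x.mean()) * (y - y.mean()))   # numerator: sum of cross-deviations
sxx = np.sum((x - x.mean()) ** 2)               # denominator: sum of squared deviations

b1 = sxy / sxx                  # slope: s_xy / s_xx
b0 = y.mean() - b1 * x.mean()   # intercept: Ybar - b1 * Xbar

# Cross-check against NumPy's degree-1 polynomial fit (returns slope first)
b1_np, b0_np = np.polyfit(x, y, 1)
print(b1, b0)
```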

Simple linear model

The predicted values from the solution of the linear model are

$$\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i = \widehat{E[Y_i]}$$

$\hat{Y}_i$ can be interpreted as being both the estimate of the population mean of $y$ for a given value of $x$, and the predicted value of $y$ for a future value of $x$ which is $X_i$.

General linear model

For the multiple regression model, the normal equations are obtained from

$$\frac{\partial\, SS(\mathrm{Res})}{\partial \beta} = 0 \;\rightarrow\; X^T X \hat\beta = X^T Y$$

and the least squares (LS) solution is

$$\hat\beta = (X^T X)^{-1} X^T Y$$

The predicted values of $y$ are

$$\hat{Y} = X\hat\beta = X (X^T X)^{-1} X^T Y = PY$$

with $P = X (X^T X)^{-1} X^T$ called the projection matrix. This last expression shows that the estimated $\hat{Y}_i$ are linear functions of all the observed values $Y_i$.
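The projection matrix can be checked numerically. A small sketch (hypothetical data) verifying that it is symmetric and idempotent, as any projection matrix must be:

```python
import numpy as np

# Small hypothetical design matrix (intercept plus one regressor)
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])

# Projection ("hat") matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

# P is symmetric and idempotent: projecting twice changes nothing;
# its trace equals the number of fitted parameters p'
print(np.allclose(P, P.T), np.allclose(P @ P, P))
```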

Simple linear model

The observations of $y$ can now be written as the sum of the estimated population mean for a given value of $x$ and a residual

$$Y_i = \hat{Y}_i + e_i$$

The sum of the squares of the observations is

$$\sum_i Y_i^2 = \sum_i (\hat{Y}_i + e_i)^2 = \sum_i \hat{Y}_i^2 + \sum_i e_i^2$$

since it can be shown that $2\sum_i \hat{Y}_i e_i = 0$.

The sum of the squares of the observations is the sum of the squares "accounted for" by the model plus the sum of the squares "unaccounted for".



Simple linear model

Using $\bar{Y} = (1/n) \sum_i Y_i$, the decomposition of the sum of the squares can be used as follows

$$\begin{aligned} \sum_i Y_i^2 - n\bar{Y}^2 &= \sum_i \hat{Y}_i^2 - n\bar{Y}^2 + \sum_i e_i^2 \\ \sum_i (Y_i - \bar{Y})^2 &= \hat\beta_1^2 \sum_i (X_i - \bar{X})^2 + \sum_i e_i^2 \end{aligned}$$

What does this say? [Up to a factor $1/(n-1)$]

It approximately says that:

"The total variance from observations" = "variance from the regression" + "variance of the residuals"

In the model $Y_i = \beta_0 + \beta_1 X_i$, the regression part is $\beta_1 X_i$. $\beta_1$ is called the regression coefficient.

Coefficient of determination

From the linear model, we are interested in a quantity called the coefficient of determination

$$R^2 = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2} = 1 - \frac{\sum_i e_i^2}{\sum_i (Y_i - \bar{Y})^2}$$

For the simple (univariate) linear model,

$$R^2 = \frac{\hat\beta_1^2 \sum_i (X_i - \bar{X})^2}{\sum_i (Y_i - \bar{Y})^2} = \left(\frac{s_{xy}}{s_x^2}\right)^2 \frac{s_x^2}{s_y^2} = \frac{s_{xy}^2}{s_x^2 s_y^2} = \left(\frac{s_{xy}}{\sqrt{s_x^2 s_y^2}}\right)^2 = r_{xy}^2$$

$R^2$ is thus the square of the Pearson's correlation coefficient between $x$ and $y$.
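This identity is easy to verify numerically; a sketch with synthetic data (illustrative values only):

```python
import numpy as np

# Hypothetical data: check that R^2 from the fit equals the squared Pearson r
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
y = 2.0 * x + 0.3 * rng.standard_normal(x.size)

b1, b0 = np.polyfit(x, y, 1)
yhat = b0 + b1 * x
e = y - yhat

R2 = 1.0 - np.sum(e**2) / np.sum((y - y.mean())**2)
r = np.corrcoef(x, y)[0, 1]   # Pearson correlation coefficient

print(np.isclose(R2, r**2))
```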

Simple linear model

A traditional interpretation of $R^2$ is that it is a measure of the fraction of variance of the dependent variable $y$ explained by the independent variable $x$.

This is why the (square of the) Pearson correlation coefficient is very quickly interpreted as being a measure of the amount of variance explained between two variables.

As an example, if $r_{xy} = 0.7$ you will often read something like "x is able to explain 49% of the variance of y" (since $0.7^2 = 0.49$).


Simple linear model: uncertainties

In the model $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$, where we assumed that $\epsilon_i \sim \mathcal{N}(0,\sigma)$, we did not know the variance $\sigma^2$. An unbiased estimate of $\sigma^2$ is given by the residual mean square:

$$\hat\sigma^2 = \frac{\sum_i e_i^2}{n - (p+1)} \equiv s^2$$

This "mean" value is obtained by dividing the $SS(\mathrm{Res})$ by the number of degrees of freedom for the residuals, which is the number of data points ($n$) minus the number of parameters of the model ($p+1$).


Simple linear model: uncertainties

A number of formulas for the variance of the estimates can be derived and used for calculating CIs:

$$\begin{aligned}
\mathrm{Var}[\hat\beta_1] &= \frac{s^2}{\sum_i (X_i - \bar{X})^2} \\
\mathrm{Var}[\hat\beta_0] &= \left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2}\right] s^2 \\
\mathrm{Var}[\hat{Y}_i] &= \left[\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}\right] s^2 \\
\mathrm{Var}[\hat{Y}_0] &= \left[1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}\right] s^2
\end{aligned}$$

Under the null hypotheses $H_0: \beta_1 = m_1$, $\beta_0 = m_0$, $E[Y_i] = m_{Y_i}$, the statistics

$$\frac{\hat\beta_1 - m_1}{\sqrt{\mathrm{Var}[\hat\beta_1]}}, \qquad \frac{\hat\beta_0 - m_0}{\sqrt{\mathrm{Var}[\hat\beta_0]}}, \qquad \frac{\hat{Y}_i - m_{Y_i}}{\sqrt{\mathrm{Var}[\hat{Y}_i]}} \;\sim\; t(0, n-p-1)$$
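A sketch (synthetic data) checking that the scalar formula for $\mathrm{Var}[\hat\beta_1]$ agrees with the corresponding diagonal entry of the matrix formula $(X^T X)^{-1} s^2$ used for the general model:

```python
import numpy as np

# Hypothetical data; compare the scalar formula for Var[beta1_hat]
# with the diagonal of the matrix formula (X^T X)^{-1} s^2
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 30)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 0.5 * x + 0.2 * rng.standard_normal(x.size)

beta = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta
n, pp = X.shape
s2 = np.sum(e**2) / (n - pp)                    # residual mean square

var_b1_scalar = s2 / np.sum((x - x.mean())**2)  # scalar slide formula
var_beta_matrix = np.linalg.inv(X.T @ X) * s2   # general matrix formula

print(np.isclose(var_b1_scalar, var_beta_matrix[1, 1]))
```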


General linear model: uncertainties

For the general linear model, the formulas are

$$\begin{aligned}
P &= X (X^T X)^{-1} X^T \\
\mathrm{Var}[\hat\beta] &= (X^T X)^{-1} \sigma^2 \\
\mathrm{Var}[\hat{Y}] &= P \sigma^2 \\
\mathrm{Var}[\hat{Y}_0] &= [I + X_0 (X^T X)^{-1} X_0^T]\, \sigma^2 \\
\mathrm{Var}[e] &= (I - P)\, \sigma^2 \\
\hat\sigma^2 &= s^2 = e^T e / (n - p - 1)
\end{aligned}$$

Beware that the expressions above are matrices. As an example, for the simple linear model for which $p + 1 = 2$:

$$\mathrm{Var}[\hat\beta] = \begin{bmatrix} \mathrm{Var}(\hat\beta_0) & \mathrm{Cov}(\hat\beta_0, \hat\beta_1) \\ \mathrm{Cov}(\hat\beta_1, \hat\beta_0) & \mathrm{Var}(\hat\beta_1) \end{bmatrix}$$

This implies that the parameter estimates covary.

Simple linear model: uncertainties

Note how the CIs for the prediction $\hat{Y}_0$ for future values $X_0$ of $x$ are larger than the CIs for the prediction of the mean of $Y_i$. The variance of the prediction is the variance of estimating the mean plus the variance of the quantity estimated.

The CIs are the smallest for $X_0 = \bar{X}$.

In this case, I generated data and I had prescribed $\beta_0 = 0$, $\beta_1 = 0.8$, $\sigma^2 = 0.04$.




Linear model by least squares

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \epsilon_i$$

The method of least squares to find a solution to the general linear model is appropriate when four assumptions are valid: (1) the random errors $\epsilon_i$ are normally distributed, (2) independent, (3) with zero mean and constant variance $\sigma^2$, and (4) the $X_{ij}$ are observations of the $p$ independent variables measured without errors.

If we rely on a large number $n$ of data, the normal assumption may be invoked because of the CLT. Otherwise, Maximum Likelihood methods can be used. As an example see Elipot et al. (2016).

When the dependent variable observations are normally distributed but do not have the same variances, or errors, the method of weighted least squares can be implemented.

When the independent variables are actually not independent (because they may be correlated), the method of general least squares can be implemented. See references [5] and [6].

Linear model by weighted least squares

Let's assume that the variance of each $\epsilon_i$ (and thus of each $Y_i$) is $a_i^2 \sigma^2$, where $\sigma$ is a constant. As an example, some observations may have better accuracy than others.

Linear model by weighted least squares

We can consider the following rescaled model, dividing by $a_i$:

$$\frac{Y_i}{a_i} = \frac{1}{a_i}\beta_0 + \beta_1 \frac{X_{i1}}{a_i} + \beta_2 \frac{X_{i2}}{a_i} + \cdots + \beta_p \frac{X_{ip}}{a_i} + \frac{\epsilon_i}{a_i}$$

or

$$Y_i^* = \beta_0 X_{i0}^* + \beta_1 X_{i1}^* + \beta_2 X_{i2}^* + \cdots + \beta_p X_{ip}^* + \epsilon_i^*$$

Because the variance of the $\epsilon_i$ is $a_i^2\sigma^2$, the variance of the $\epsilon_i^*$ becomes $\sigma^2$. We can now use least squares to regress $Y_i^*$ on the $X_{ij}^*$.

Linear model by weighted least squares

The principle here is to assign the least amount of weight to the observations with the largest variance, or error. The weighting matrix is

$$W = \begin{bmatrix} 1/a_1 & 0 & \cdots & 0 \\ 0 & 1/a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/a_n \end{bmatrix}$$

Consider the general linear model equation left-multiplied by $W$:

$$WY = WX\beta + W\epsilon$$

which can be rewritten as

$$Y^* = X^*\beta + \epsilon^*$$

with $Y^* = WY$, etc.


Linear model by weighted least squares

The weighted least squares solution is

$$\begin{aligned}
\hat\beta &= (X' V^{-1} X)^{-1} X' V^{-1} Y \\
\mathrm{Var}[\hat\beta] &= (X' V^{-1} X)^{-1} \sigma^2 \\
\mathrm{Var}[\hat{Y}] &= X (X' V^{-1} X)^{-1} X' \sigma^2 \\
\mathrm{Var}[e] &= [V - X (X' V^{-1} X)^{-1} X']\, \sigma^2
\end{aligned}$$

with

$$V^{-1} = W'W, \qquad V = \begin{bmatrix} a_1^2 & 0 & \cdots & 0 \\ 0 & a_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n^2 \end{bmatrix}$$

In fact, the weighting matrix can have whatever coefficients you want! Here it is a special case that simplifies the form of the solution. See section 2.4 of reference [6].
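A NumPy sketch (hypothetical heteroscedastic data) verifying that the weighted least squares solution equals ordinary least squares applied to the rescaled variables:

```python
import numpy as np

# Hypothetical heteroscedastic data: noise standard deviation scales with a_i
rng = np.random.default_rng(3)
x = np.linspace(1.0, 10.0, 40)
a = x / x.mean()                          # assumed relative error scale a_i
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 1.5 * x + a * rng.standard_normal(x.size)

Vinv = np.diag(1.0 / a**2)                # V^{-1} = W'W with W = diag(1/a_i)

# Weighted least squares: beta = (X' V^{-1} X)^{-1} X' V^{-1} Y
beta_wls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

# Equivalent: ordinary least squares on the rescaled variables Y_i/a_i, X_ij/a_i
Xs, ys = X / a[:, None], y / a
beta_rescaled = np.linalg.lstsq(Xs, ys, rcond=None)[0]

print(np.allclose(beta_wls, beta_rescaled))
```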

3. Polynomial interpolation




Polynomial fitting

Fitting a polynomial function of an independent variable $x$ to a dependent variable $y$ is a linear regression problem which consists in estimating the coefficients of the polynomial

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \ldots + \beta_k x^k + \ldots$$

We will review two general cases. The first case is global polynomial fitting, where you are fitting a polynomial function that exactly estimates your data, maybe piecewise, in separate intervals. This polynomial is of maximum order the number of observations minus one, and is called an interpolating polynomial.

The second case is called local polynomial estimation, where you are fitting a polynomial in the vicinity, that is within a window, of a given value of $x$. This polynomial of arbitrary order approximates your data locally, and the solution is typically obtained by weighted least squares.

These two types of methods can be used in general to process your data, to either interpolate or grid your data.

Interpolating polynomial

Assume you have $N$ pairs of observations $(X_i, Y_i)$ and would like to interpolate $y$ for a given value of $x$.

There exists an interpolating polynomial of order $N-1$ given by the following Lagrange formula

$$P_{N-1}(x) = \sum_{k=1}^{N} \left( \prod_{\substack{j=1 \\ j \neq k}}^{N} \frac{x - X_j}{X_k - X_j} \right) Y_k$$

which passes through your data points, i.e. $P_{N-1}(X_i) = Y_i$.
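A direct, if numerically naive, implementation of the Lagrange formula (hypothetical data points; for serious work, barycentric forms are usually preferred):

```python
import numpy as np

def lagrange_eval(x, X, Y):
    """Evaluate the Lagrange interpolating polynomial P_{N-1} at x."""
    total = 0.0
    for k in range(len(X)):
        # Basis polynomial: product over j != k of (x - X_j) / (X_k - X_j)
        basis = 1.0
        for j in range(len(X)):
            if j != k:
                basis *= (x - X[j]) / (X[k] - X[j])
        total += basis * Y[k]
    return total

# Hypothetical data points; the interpolant must pass through each of them
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([1.0, 2.0, 0.5, 3.0, 2.5])
print([lagrange_eval(xi, X, Y) for xi in X])  # reproduces Y at the data points
```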

Interpolating polynomial

Example with $N = 5$.

Polynomial fitting

Alternatively, you can use least squares to fit a polynomial of any order equal to or less than $N-1$ with the model

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \ldots + \beta_{N-1} X_i^{N-1}$$

Interpolating polynomial

Beware that interpolating polynomials can quickly generate very large oscillations!

Same example as before, except that the original data point at $X_i = 2$ was moved to $2.9$.

Piecewise linear interpolation

A piecewise linear interpolation, or simply linear interpolation, consists in calculating the interpolating polynomial of order 1 over an interval $[X_k, X_{k+1}]$, i.e. with $N = 2$ points. The Lagrange formula gives

$$P_1(x) = \left(\frac{x - X_{k+1}}{X_k - X_{k+1}}\right) Y_k + \left(\frac{x - X_k}{X_{k+1} - X_k}\right) Y_{k+1}$$

which can be rearranged to give the linear interpolant

$$L_1(x) = Y_k + (x - X_k)\,\frac{Y_{k+1} - Y_k}{X_{k+1} - X_k}$$

In Matlab it is implemented by

yi = interp1(x,y,xi);
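A quick numerical check of the linear interpolant (hypothetical knots), compared with `numpy.interp`, NumPy's analogue of Matlab's `interp1`:

```python
import numpy as np

# Hypothetical knots; evaluate the linear interpolant L_1 inside [X_k, X_{k+1}]
Xk, Xk1 = 1.0, 3.0
Yk, Yk1 = 2.0, 6.0
x = 1.5

# L_1(x) = Y_k + (x - X_k) * (Y_{k+1} - Y_k) / (X_{k+1} - X_k)
L1 = Yk + (x - Xk) * (Yk1 - Yk) / (Xk1 - Xk)

# Same value from NumPy's piecewise linear interpolation
yi = np.interp(x, [Xk, Xk1], [Yk, Yk1])
print(L1, yi)  # 3.0 3.0
```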

Page 59: Lecture 3: Regression analysis & model fitting · Lecture 3: Regression analysis & model fitting Shane Elipot The Rosenstiel School of Marine and Atmospheric Science, University of

Piecewise linear interpolation: errors

The Lagrange formula gives you an easy way to estimate the interpolation error. If $\delta_k$ and $\delta_{k+1}$ are the errors or uncertainties for $Y_k$ and $Y_{k+1}$, since $P_1(x) = a_k Y_k + a_{k+1} Y_{k+1}$,

$$\delta_{P_1}(x) = \sqrt{a_k^2 \delta_k^2 + a_{k+1}^2 \delta_{k+1}^2}$$
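A hedged Python sketch of this error propagation, where the weights $a_k$ and $a_{k+1}$ are the order-1 Lagrange coefficients (example numbers are invented):

```python
import numpy as np

def linear_interp_with_error(Xk, Xk1, Yk, Yk1, dk, dk1, x):
    """P1(x) = a_k Y_k + a_{k+1} Y_{k+1}, with propagated uncertainty
    sqrt(a_k^2 dk^2 + a_{k+1}^2 dk1^2)."""
    ak = (x - Xk1) / (Xk - Xk1)
    ak1 = (x - Xk) / (Xk1 - Xk)
    p = ak * Yk + ak1 * Yk1
    err = np.sqrt(ak**2 * dk**2 + ak1**2 * dk1**2)
    return p, err

# At the midpoint a_k = a_{k+1} = 1/2, so the error is reduced
# relative to the data uncertainties.
p, err = linear_interp_with_error(0.0, 2.0, 1.0, 3.0, 0.2, 0.2, 1.0)
print(p, err)
```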



Hermite polynomial interpolation

The issue of large oscillations in interpolating can be reined in by using Hermite polynomials, which satisfy additional conditions on their derivatives at the data points,

$$P_n(X_k) = Y_k, \quad P_n^{(1)}(X_k) = d_k$$

where $d_k$ is to be specified. $P_n^{(\nu)}$ is the $\nu$-th derivative of $P_n$.

A popular Hermite polynomial is the shape-preserving piecewise cubic Hermite interpolating polynomial, or shape-preserving pchip, implemented in Matlab by

```matlab
yi = pchip(x,y,xi);
```
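SciPy provides a counterpart to Matlab's `pchip` (a hedged sketch with invented data):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([0.0, 1.0, 1.0, 3.0, 2.0])

pch = PchipInterpolator(X, Y)   # shape-preserving cubic Hermite
xi = np.linspace(0.0, 4.0, 41)
yi = pch(xi)

# The interpolant passes through the data points exactly,
# without the overshoot of a single high-order polynomial.
print(yi[:5])
```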


pchip example

A pchip polynomial is cubic (order 3) and its derivatives $d_k$, or slopes, at each data point are zero or the harmonic means of consecutive slopes:

$$\frac{1}{d_k} = \frac{1}{\delta_k} + \frac{1}{\delta_{k-1}} \quad \text{with} \quad \delta_k = \frac{Y_{k+1} - Y_k}{X_{k+1} - X_k}$$


Spline interpolation

Another popular interpolation method uses cubic splines, which are piecewise cubic interpolating polynomials with the constraint that the second derivative be continuous. It is implemented in Matlab as

```matlab
yi = spline(x,y,xi);
```
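SciPy's `CubicSpline` is the analogue of Matlab's `spline` (both default to not-a-knot end conditions, as far as I know; data here are invented). The continuity of the second derivative across interior knots can be checked numerically:

```python
import numpy as np
from scipy.interpolate import CubicSpline

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([0.0, 1.0, 1.0, 3.0, 2.0])

cs = CubicSpline(X, Y)   # piecewise cubic, C2-continuous

# Second derivative evaluated just left and right of interior knots:
left = cs(X[1:-1] - 1e-8, 2)
right = cs(X[1:-1] + 1e-8, 2)
print(np.max(np.abs(left - right)))  # near zero: second derivative is continuous
```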


Cubic interpolation

Another method using piecewise polynomials of order 3 is called cubic convolution and is described in detail in Keys (1981). This method is accessible in one or higher dimensions in Matlab as

```matlab
yi = interpn(x,y,xi,'cubic');
```


Some comments

Interpolating polynomials and splines are great sets of tools that allow you to quickly interpolate your data. Splines are not necessarily polynomials of order 3 and can be of greater order. There exists a very large body of literature dealing with splines.

We have dealt so far with methods of interpolation in one dimension, but these can be easily expanded to two or more dimensions, notably the linear and cubic methods.

Polynomial interpolation implies that you are exactly recovering your data, i.e. $P(X_i) = Y_i$. This implies that your data are effectively error free. We now relax this condition and review some principles of local polynomial modeling.


4. Local Polynomial Modeling


Polynomial by least squares

We saw earlier that we can use least squares to fit a polynomial of any order equal to or less than $N-1$ to your $N$ data points:

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \ldots + \beta_{N-1} X_i^{N-1}$$



Local polynomial fitting

Once again we attempt to estimate the value of a dependent variable $y$ given a value of the independent variable $x$. Here we follow closely reference [7]. The idea is to estimate an arbitrary function $m(x)$ and its derivatives, noted $m^{(1)}(x), m^{(2)}(x), \ldots, m^{(p)}(x)$, with the model

$$Y_i = m(X_i) + \sigma(X_i)\,\varepsilon_i$$

where $(X_i, Y_i)$ are observations, $\varepsilon_i$ is zero-mean unit-variance noise, and $\sigma^2(X_i)$ is the variance of $Y_i$ at $x = X_i$.

The function $m(x)$ is approximated locally by a polynomial of order $p$ by considering a Taylor expansion in the neighborhood of $x_0$ as

$$m(x) \approx m(x_0) + m'(x_0)(x - x_0) + \frac{m''(x_0)}{2!}(x - x_0)^2 + \ldots + \frac{m^{(p)}(x_0)}{p!}(x - x_0)^p = \beta_0 + \beta_1 (x - x_0) + \beta_2 (x - x_0)^2 + \ldots + \beta_p (x - x_0)^p$$


Local polynomial fitting

The function $m(x)$ is modeled locally as

$$m(x) = \sum_{j=0}^{p} \beta_j (x - x_0)^j$$

and $m^{(j)}(x_0) = j!\,\beta_j$. The estimates of $\beta_j$ of this polynomial are obtained for each location of interest $x_0$ by least squares fitting, minimizing the following expression

$$\sum_{i=1}^{n} \left\{ Y_i - \sum_{j=0}^{p} \beta_j (X_i - x_0)^j \right\}^2 K_h(X_i - x_0)$$

where

$$K_h(x) = \frac{1}{h} K\left(\frac{x}{h}\right)$$

and $K$ is called a kernel function, acting over a half-bandwidth $h$.
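This weighted least squares minimization can be sketched in Python/NumPy (a hedged illustration using an Epanechnikov kernel and a synthetic sine signal; not the lecture's code):

```python
import numpy as np

def local_poly_fit(X, Y, x0, p=1, h=0.5):
    """Minimize sum_i {Y_i - sum_j beta_j (X_i - x0)^j}^2 K_h(X_i - x0),
    with the Epanechnikov kernel K(u) = 0.75*(1 - u^2) for |u| < 1."""
    u = (X - x0) / h
    w = np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2) / h, 0.0)  # K_h weights
    A = np.vander(X - x0, N=p + 1, increasing=True)  # columns (X_i - x0)^j
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
    return beta  # beta[0] estimates m(x0); beta[1] estimates m'(x0)

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 2 * np.pi, 300))
Y = np.sin(X) + 0.1 * rng.standard_normal(X.size)
beta = local_poly_fit(X, Y, x0=np.pi / 2, p=1, h=0.5)
print(beta[0])  # should be close to sin(pi/2) = 1
```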


Local polynomial fitting

In this example, the unknown function giving $E[Y_i] = m(X_i)$ is estimated at $X = x_0$ using an order one polynomial, using data points within the orange window.


Local polynomial fitting

For fitting a polynomial to your data, a number of aspects need to be considered, all covered in detail in reference [7]:

1. Which order polynomial do you need? Are you trying to estimate the value of your unknown function only, or are you trying to estimate the $\nu$-th derivative as well? In this case, it is recommended that $p - \nu$ be an odd number.

2. What bandwidth $h$ do you need? It will depend on the density of your data, as well as on the order of the chosen polynomial. The choice of the bandwidth is a compromise between bias and variance of your estimate. Since you are trying to estimate $p + 1$ parameters by least squares, you should have at least that number of points in your window.

3. What shape should the kernel function have? Should it be uniform? Gaussian? Quadratic? A quadratic kernel called the Epanechnikov kernel is often recommended (see practical this afternoon!)


Local polynomial fitting

This figure shows an example of fitting a known function embedded in noise with a known variance. It shows the impact of the bandwidth and polynomial order on the bias and variance of the estimates.


Local polynomial fitting

A simpler smoother consists of estimating the function $m(x)$ as a polynomial of order $0$. The equivalent is called the Nadaraya-Watson kernel estimator, defined as

$$\hat{m}_h(x) \equiv \frac{\sum_{j=1}^{n} K_h(X_j - x)\, Y_j}{\sum_{j=1}^{n} K_h(X_j - x)}$$

The typical kernel functions used are the Gaussian kernel

$$K(z) = \left(\sqrt{2\pi}\right)^{-1} \exp(-z^2/2)$$

and the symmetric Beta family

$$K(z) = \frac{1}{B(1/2, \gamma + 1)} \left(1 - z^2\right)_+^{\gamma}, \quad \gamma = 0, 1, \ldots$$

where $B(z, w)$ is a complicated function (the Beta function) of no further interest here.
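The Nadaraya-Watson estimator is only a few lines in Python/NumPy (a hedged sketch with a Gaussian kernel and synthetic data):

```python
import numpy as np

def nadaraya_watson(X, Y, x, h):
    """Local constant (order 0) kernel estimate of m(x)."""
    z = (X - x) / h
    K = np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)  # Gaussian kernel
    w = K / h                                   # K_h(X_j - x)
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, 2 * np.pi, 400))
Y = np.cos(X) + 0.1 * rng.standard_normal(X.size)
print(nadaraya_watson(X, Y, np.pi, h=0.3))  # should be near cos(pi) = -1
```

Note the normalization $1/h$ in $K_h$ cancels in the ratio, so only the kernel shape and bandwidth matter.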


5. A note on nonlinear fitting



Nonlinear fitting

What does one do when the function you are trying to fit to your data is nonlinear in your parameters? As an example, you expect that a sinusoid function is a good model to describe the dependency of your dependent variable $y$ on the independent variable $x$, i.e.

$$y(x) = a \cos(x + \phi)$$

where $a$ is the amplitude and $\phi$ is the phase. In this case you're in luck because you can use trigonometric identities and write

$$y(x) = a \cos(\phi) \cos(x) - a \sin(\phi) \sin(x)$$

You have linearized your problem, and you are now faced with a multiple linear regression problem, estimating $a\cos(\phi)$ and $a\sin(\phi)$ as a function of observations of $y$, $\cos(x)$, and $\sin(x)$ (see practical this afternoon).
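The linearized sinusoid fit can be sketched in Python/NumPy (hedged; the true amplitude and phase below are invented test values):

```python
import numpy as np

# Synthetic data from y = a*cos(x + phi) with assumed a = 2, phi = 0.7
rng = np.random.default_rng(3)
a_true, phi_true = 2.0, 0.7
x = np.linspace(0, 2 * np.pi, 100)
y = a_true * np.cos(x + phi_true) + 0.05 * rng.standard_normal(x.size)

# Multiple linear regression: y = b1*cos(x) + b2*sin(x)
# with b1 = a*cos(phi) and b2 = -a*sin(phi)
A = np.column_stack([np.cos(x), np.sin(x)])
(b1, b2), *_ = np.linalg.lstsq(A, y, rcond=None)

a_hat = np.hypot(b1, b2)        # recover the amplitude
phi_hat = np.arctan2(-b2, b1)   # recover the phase
print(a_hat, phi_hat)
```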



Nonlinear fitting

What if you really cannot linearize your problem? As an example, it is often useful to model the lagged correlation function $\rho(\tau)$, as in Beal et al. (2015).


Nonlinear fitting

In this particular case, we assumed that the lagged correlation function for the along-shore component of velocity, as a function of separation distance (lag) $r$, was given by

$$\rho_h(r) = e^{-(r/r_h)^2} \cos\left(\frac{\pi r}{2 r_h}\right)$$

The goal here is to fit the data for the value of the parameter $r_h$, a spatial length scale. We apply the same principle of minimization, trying to find the value of $r_h$ minimizing

$$SS(\mathrm{Res}) = \sum_{i=1}^{N} \left( \rho_h(r_i) - \rho_i \right)^2$$

Since the problem cannot be put in linear form, the least squares method is not available. Instead you must rely on nonlinear optimization routines. As an example, Matlab can apply common algorithms via the function `fminsearch`.
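A hedged Python counterpart to this Matlab `fminsearch` workflow, using SciPy's scalar optimizer on the same residual sum of squares (the "observed" correlations and the true scale of 50 km are synthetic, not the Beal et al. data):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rho_model(r, rh):
    """rho_h(r) = exp(-(r/rh)^2) * cos(pi*r / (2*rh))"""
    return np.exp(-(r / rh)**2) * np.cos(np.pi * r / (2.0 * rh))

# Synthetic observed correlations with an assumed true scale rh = 50
rng = np.random.default_rng(4)
r = np.linspace(0.0, 150.0, 30)
rho_obs = rho_model(r, 50.0) + 0.02 * rng.standard_normal(r.size)

def ss_res(rh):
    # SS(Res) = sum_i (rho_h(r_i) - rho_i)^2
    return np.sum((rho_model(r, rh) - rho_obs)**2)

res = minimize_scalar(ss_res, bounds=(1.0, 200.0), method='bounded')
print(res.x)  # estimated rh, near the true value of 50
```

With more than one free parameter, `scipy.optimize.minimize(..., method='Nelder-Mead')` plays the role of `fminsearch`.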


Practical session

Please download data at the following link:

Please download the Matlab code at the following link:

Make sure you have installed and tested the free jLab Matlab toolbox from Jonathan Lilly at www.jmlilly.net/jmlsoft.html