Regression
Mohammad Tawfik #WikiCourses
http://WikiCourses.WikiSpaces.com
Regression/Curve Fitting
Mohammad Tawfik
Objectives
• Understanding the difference between regression and interpolation
• Knowing how to “best fit” a polynomial to a set of data
• Knowing how to use a polynomial to interpolate data
Measured Data
Polynomial Fit!
Line Fit!
Which is better?
Curve Fitting
• If the measured data are highly accurate and you need to estimate the values of the function between the given points, polynomial interpolation is the best choice.
• If the measurements are expected to be of low accuracy, or the number of measured points is too large, regression is the better choice.
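The difference between the two choices can be sketched in Python. NumPy is an assumption here (the slides name no software), and the seven data points are the ones used in the worked examples of this deck. A degree n-1 polynomial reproduces all n points, while a fitted line only follows their trend.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])

# Interpolation: a polynomial of degree n-1 passes through all n points.
interp_coeffs = np.polyfit(x, y, deg=len(x) - 1)
interp_residual = y - np.polyval(interp_coeffs, x)

# Regression: a straight line fitted in the least-squares sense.
line_coeffs = np.polyfit(x, y, deg=1)
line_residual = y - np.polyval(line_coeffs, x)

max_interp_error = np.max(np.abs(interp_residual))  # essentially zero
max_line_error = np.max(np.abs(line_residual))      # clearly non-zero
print(max_interp_error, max_line_error)
```

The interpolating polynomial chases every measurement, including its noise; the line does not, which is why it suits low-accuracy data.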
Regression
Why Regression?
• Measurements that we get from real situations are not usually consistent!
• The number of “pieces” of information that we can get about a certain project is HUGE
• You can NEVER measure exact values!
Measured Data
But how do you get the equation of a line that is “good” for all the data you have?
Equation of a Line: Revision
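$$y = a_0 + a_1 x$$
If you have two points
$$y_1 = a_0 + a_1 x_1$$
$$y_2 = a_0 + a_1 x_2$$
$$\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix}$$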
Solving for the constants!
$$a_0 = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} \quad \& \quad a_1 = \frac{y_2 - y_1}{x_2 - x_1}$$
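As a quick check of the two-point formulas, here is a minimal Python sketch. The function name `line_through` and the sample points are hypothetical, chosen only for illustration.

```python
def line_through(p1, p2):
    """Return (a0, a1) of the line y = a0 + a1*x through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # A vertical line cannot be written as y = a0 + a1*x.
        raise ValueError("x1 and x2 must differ")
    a1 = (y2 - y1) / (x2 - x1)            # slope
    a0 = (x2 * y1 - x1 * y2) / (x2 - x1)  # intercept
    return a0, a1

# Illustrative points (1, 3) and (3, 7), which lie on y = 1 + 2x.
a0, a1 = line_through((1.0, 3.0), (3.0, 7.0))
print(a0, a1)  # 1.0 2.0
```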
What if I have more than two points?
For every point
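$$\begin{aligned} y_1 &= a_0 + a_1 x_1 \\ y_2 &= a_0 + a_1 x_2 \\ &\;\;\vdots \\ y_n &= a_0 + a_1 x_n \end{aligned}$$
$$\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix}$$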
So, we may write the error vector
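$$\begin{Bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix} - \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix}$$
$$\{e\}_{n \times 1} = \{y\}_{n \times 1} - [A]_{n \times 2} \{a\}_{2 \times 1}$$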
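The error vector {e} = {y} - [A]{a} can be formed directly in code. The sketch below assumes NumPy; the four data points and the trial parameters are made up for illustration and are not from the slides.

```python
import numpy as np

# Illustrative measurements (hypothetical values).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8])

# Trial parameters {a} = {a0, a1}, i.e. the candidate line y = 0 + 1*x.
a = np.array([0.0, 1.0])

# [A] has one row [1, x_i] per measurement, so its shape is n x 2.
A = np.column_stack([np.ones_like(x), x])

# Error vector: one entry per measurement.
e = y - A @ a
print(e)
```

Each entry of `e` is the vertical distance between a measured value and the trial line at the same x.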
Square the error
$$\{e\} = \{y\} - [A]\{a\}$$
$$e^2 = \{e\}^T\{e\} = \{y\}^T\{y\} - \{y\}^T[A]\{a\} - \{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\}$$
Note: this is a scalar equation!
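$$e^2 = \{e\}^T\{e\} = \{y\}^T\{y\} - \{y\}^T[A]\{a\} - \{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\}$$
$$\{a\}^T[A]^T\{y\} = \{y\}^T[A]\{a\}$$
$$e^2 = \{y\}^T\{y\} - 2\{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\}$$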
Note: this is a quadratic equation in {a}!!!
$$e^2 = \{y\}^T\{y\} - 2\{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\}$$
To minimize the error in the above equation, we differentiate with respect to the parameters and set the result to zero
$$\frac{d(e^2)}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0$$
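The derivative of the squared error, -2[A]^T{y} + 2[A]^T[A]{a}, can be checked numerically. The sketch below assumes NumPy and uses arbitrary made-up data; it compares the analytic expression with a central finite-difference estimate at a trial {a}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative problem: 5 points, line model.
A = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y = rng.normal(size=5)
a = np.array([0.3, -0.7])  # trial parameter vector

def squared_error(params):
    """e^2 = {e}^T {e} with {e} = {y} - [A]{a}."""
    e = y - A @ params
    return e @ e

# Analytic gradient from the derivation.
analytic = -2 * A.T @ y + 2 * A.T @ A @ a

# Central finite differences along each parameter direction.
h = 1e-6
numeric = np.array([
    (squared_error(a + h * d) - squared_error(a - h * d)) / (2 * h)
    for d in np.eye(2)
])
print(analytic, numeric)
```

The two vectors agree to within rounding error, which is a useful sanity check before solving for the minimizing {a}.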
Solving the equation
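$$\frac{d(e^2)}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0$$
$$[A]^T_{2 \times n}[A]_{n \times 2}\{a\}_{2 \times 1} = [A]^T_{2 \times n}\{y\}_{n \times 1}$$
We get:
$$\{a\}_{2 \times 1} = \left([A]^T[A]\right)^{-1}_{2 \times 2} \left([A]^T\{y\}\right)_{2 \times 1}$$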
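The normal equations [A]^T[A]{a} = [A]^T{y} translate almost directly into code. The sketch below assumes NumPy; the function name `fit_normal_equations` and the test data are hypothetical. It solves the linear system with `np.linalg.solve` rather than forming the explicit inverse, which gives the same {a} but is the more usual numerical practice.

```python
import numpy as np

def fit_normal_equations(A, y):
    """Solve [A]^T [A] {a} = [A]^T {y} for the parameter vector {a}."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solving the system avoids computing the inverse of [A]^T [A].
    return np.linalg.solve(A.T @ A, A.T @ y)

# Hypothetical noise-free data lying exactly on y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x
A = np.column_stack([np.ones_like(x), x])

a = fit_normal_equations(A, y)

# NumPy's built-in least-squares solver, for comparison.
a_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, a_lstsq)
```

With exact data the fit recovers the generating line; with noisy data the same call returns the parameters that minimize the squared error.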
Example
• You are given the data below.
• Find the equation of the “best-fit” line $y = a_0 + a_1 x$.

x    y
1    0.5
2    2.5
3    2
4    4
5    3.5
6    6
7    5.5
Solution
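$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix}$$
$$[A] = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \quad \& \quad \{y\} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix}$$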
Solution
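$$[A]^T[A] = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} = \begin{bmatrix} 7 & 28 \\ 28 & 140 \end{bmatrix}$$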
Solution
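$$[A]^T\{y\} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \end{Bmatrix}$$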
Solution
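$$[A]^T[A]\{a\} = [A]^T\{y\}$$
$$\begin{bmatrix} 7 & 28 \\ 28 & 140 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \end{Bmatrix}$$
$$\begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 0.0714 \\ 0.8393 \end{Bmatrix}$$
$$y = 0.8393x + 0.0714$$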
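The line-fit example can be reproduced with a few lines of Python. NumPy is an assumption (the slides do not say which tool was used); the data is the seven-point table from the example.

```python
import numpy as np

# Data from the example: x = 1..7 and the measured y values.
x = np.arange(1.0, 8.0)
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5])

# [A] has a column of ones (for a0) and a column of x (for a1).
A = np.column_stack([np.ones_like(x), x])

AtA = A.T @ A   # [[7, 28], [28, 140]]
Aty = A.T @ y   # [24, 119.5]

a0, a1 = np.linalg.solve(AtA, Aty)
print(round(a0, 4), round(a1, 4))  # 0.0714 0.8393
```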
Example
• You are given the data below.
• Find the equation of the “best-fit” parabola $y = a_0 + a_1 x + a_2 x^2$.

x    y
1    0.5
2    2.5
3    2
4    4
5    3.5
6    6
7    5.5
Solution
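$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix}$$
$$[A] = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} \quad \& \quad \{y\} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix}$$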
Solution
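$$[A]^T[A] = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} = \begin{bmatrix} 7 & 28 & 140 \\ 28 & 140 & 784 \\ 140 & 784 & 4676 \end{bmatrix}$$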
Solution
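$$[A]^T\{y\} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{bmatrix} \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \\ 665.5 \end{Bmatrix}$$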
Solution
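$$[A]^T[A]\{a\} = [A]^T\{y\}$$
$$\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} -0.2857 \\ 1.0774 \\ -0.0298 \end{Bmatrix}$$
$$y = -0.0298x^2 + 1.0774x - 0.2857$$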
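The same procedure works for any polynomial degree: only the columns of [A] change. The sketch below assumes NumPy; the helper name `polyfit_normal` is hypothetical. It builds [A] with columns 1, x, x^2, ... and solves the normal equations, applied here to the parabola example.

```python
import numpy as np

def polyfit_normal(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns the coefficients in increasing order: a0, a1, ..., a_degree.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns of [A] are x**0, x**1, ..., x**degree.
    A = np.vander(x, degree + 1, increasing=True)
    return np.linalg.solve(A.T @ A, A.T @ y)

# Data from the example.
x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2, 4, 3.5, 6, 5.5]

a0, a1, a2 = polyfit_normal(x, y, degree=2)
print(round(a0, 4), round(a1, 4), round(a2, 4))  # -0.2857 1.0774 -0.0298
```

Calling the same function with `degree=1` gives the line fit of the previous example. For high degrees the matrix [A]^T[A] becomes poorly conditioned, so a dedicated least-squares routine is usually preferred over this direct approach.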