bruce mayer, pe licensed electrical & mechanical engineer bmayer@chabotcollege
Post on 18-Feb-2016
BMayer@ChabotCollege.edu • ENGR-25_Lec-19_Statistics-1.ppt1
Bruce Mayer, PE Engineering/Math/Physics 25: Computational Methods
Engr/Math/Physics 25
Chp 7 Statistics-1
Bruce Mayer, PE
Licensed Electrical & Mechanical Engineer
BMayer@ChabotCollege.edu
Learning Goals
Use MATLAB to solve Problems in
• Statistics
• Probability
Use Monte Carlo (random) Methods to Simulate Random processes
Properly Apply Interpolation or Extrapolation to Estimate values between or outside of known data points
Histogram
Histograms are COLUMN Plots that show the Distribution of Data
• Height Represents Data Frequency
Some General Characteristics
• Used to represent continuous grouped, or BINNED, data
– BIN ≡ SubRange within the Data
• Usually Does not have any gaps between bars
• Areas represent %-of-Total Data
HistoGram ≡ Frequency Chart
A HistoGram shows how OFTEN some event Occurs
• Histograms are often constructed using Frequency Tables
Histograms In MATLAB
MATLAB has 6 Forms of the Histogram Cmd
The Simplest Plot Statement: hist(y)
• Generates a Histogram with 10 bins
Example: Max Temp at Oakland AirPort in Jul-Aug08
TmaxOAK = [70, 75, 63, 64, 65, 66, 65, 65, 67, 78, 75, 73, 79, 71, 72, 67, 69, 69, 70, 74, 71, 72, 71, 74, 77, 77, 86, 90, 90, 70, 71, 66, 66, 72, 68, 73, 72, 82, 91, 82, 76, 75, 72, 72, 69, 70, 68, 65, 67, 65, 63, 64, 72, 70, 68, 71, 77, 65, 63, 69, 69, 67]
hist(TmaxOAK), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Oakland Airport - Jul-Aug08')
hist Result for Oakland
It was COLD in Summer 08
Bin Width = (91-63)/10 = 2.8 °F
[Figure: 10-bin histogram titled "Oakland Airport - Jul-Aug08"; x-axis: Max. Temp (°F), 60 to 95; y-axis: No. Days, 0 to 15]
Histograms In MATLAB
Next Example: Max Temp at Stockton AirPort in Jul-Aug08
The Plot Statement: hist(y)
• Generates a Histogram with 10 bins
TmaxSTK = [94, 98, 93, 94, 91, 96, 93, 87, 89, 94, 100, 99, 103, 103, 103, 97, 91, 83, 84, 90, 89, 95, 94, 99, 97, 94, 102, 103, 107, 98, 86, 89, 95, 91, 84, 93, 98, 104, 105, 107, 103, 91, 90, 96, 93, 86, 92, 93, 95, 95, 86, 81, 93, 97, 96, 97, 101, 92, 89, 92, 93, 94]
hist(TmaxSTK), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Stockton Airport - Jul-Aug08')
hist Result for Stockton
It was HOT in Summer 08
Bin Width = (107-81)/10 = 2.6 °F
[Figure: 10-bin histogram titled "Stockton Airport - Jul-Aug08"; x-axis: Max. Temp (°F), 80 to 110; y-axis: No. Days, 0 to 16]
hist Command Refinements
Adjust The number and width of the bins using
hist(y,N)
hist(y,x)
• Where
– N is an integer specifying the NUMBER of Bins
– x is a vector that Specs the CENTERs of the Bins
Consider Summer 08 Max-Temp Data from Oakland and Stockton
Make 2 Histograms
• 17 bins
• 60 °F → 110 °F by 2.5’s
hist Plots 17 Bins
>> hist(TmaxSTK,17), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Stockton, CA - Jul-Aug08')
>> hist(TmaxOAK,17), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Oakland, CA - Jul-Aug08')
[Figure: two 17-bin histograms. "Stockton, CA - Jul-Aug08" with x-axis Max. Temp (°F) from 80 to 110, and "Oakland, CA - Jul-Aug08" with x-axis Max. Temp (°F) from 60 to 95; both y-axes: No. Days, 0 to 10]
hist Plots Same Scale
>> x = [60:2.5:110];
>> hist(TmaxSTK,x), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Stockton, CA - Jul-Aug08')
>> hist(TmaxOAK,x), ylabel('No. Days'), xlabel('Max. Temp (°F)'), title('Oakland, CA - Jul-Aug08')
[Figure: histograms "Oakland, CA - Jul-Aug08" and "Stockton, CA - Jul-Aug08" drawn on the same scale; x-axis: Max. Temp (°F), 60 to 110; y-axis: No. Days, 0 to 16]
hist Numerical Output
hist can also provide numerical Data about the Histogram
n = hist(y)
• Gives the number of values in each of the (default) 10 Bins
For the Stockton data
>> k = hist(TmaxSTK)
k = 2 5 1 10 16 7 9 2 7 3
We can also spec the number and/or Width of Bins
>> k13 = hist(TmaxSTK,13)
k13 = 2 2 4 4 6 10 10 7 5 2 6 2 2
>> k2_5s = hist(TmaxOAK,x)
hist Numerical Output
Bin-Count and Bin-Locations (Frequency Table) for the Oakland Data
>> [u, v] = hist(TmaxOAK,x)
u = 0 3 11 7 15 9 6 4 1 2 1 0 3 0 0 0 0 0 0 0 0
v = 60.0000 62.5000 65.0000 67.5000 70.0000 72.5000 75.0000 77.5000 80.0000 82.5000 85.0000 87.5000 90.0000 92.5000 95.0000 97.5000 100.0000 102.5000 105.0000 107.5000 110.0000
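The bin counts and bin centers returned by hist can be stacked into a two-column frequency table. The following is a minimal sketch using the Oakland data and the 2.5 °F bin centers from the slides; the names freqTable and nDays are illustrative choices, not part of the lecture.

```matlab
% Oakland Airport daily maximum temperatures, Jul-Aug 2008 (°F), from the lecture
TmaxOAK = [70, 75, 63, 64, 65, 66, 65, 65, 67, 78, 75, 73, 79, 71, 72, 67, ...
           69, 69, 70, 74, 71, 72, 71, 74, 77, 77, 86, 90, 90, 70, 71, 66, ...
           66, 72, 68, 73, 72, 82, 91, 82, 76, 75, 72, 72, 69, 70, 68, 65, ...
           67, 65, 63, 64, 72, 70, 68, 71, 77, 65, 63, 69, 69, 67];

x = 60:2.5:110;               % bin centers: 60 °F to 110 °F in 2.5 °F steps
[u, v] = hist(TmaxOAK, x);    % u = counts per bin, v = bin centers

freqTable = [v(:), u(:)];     % column 1: bin center (°F), column 2: No. Days
nDays = sum(u);               % every day lands in exactly one bin

disp(freqTable(u > 0, :))     % show only the occupied bins
```

Because each observation is counted once, sum(u) should equal the number of days in the record (62 for July plus August).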
Histogram Commands - 1
Command      Description
bar(x,y)     Creates a bar chart of y versus x.
hist(y)      Aggregates the data in the vector y into 10 bins evenly spaced between the minimum and maximum values in y.
hist(y,n)    Aggregates the data in the vector y into n bins evenly spaced between the minimum and maximum values in y.
hist(y,x)    Aggregates the data in the vector y into bins whose center locations are specified by the vector x. The bin widths are the distances between the centers.
Histogram Commands - 2
Command              Description
[z,x] = hist(y)      Same as hist(y) but returns two vectors z and x that contain the frequency count and the 10 bin locations.
[z,x] = hist(y,n)    Same as hist(y,n) but returns two vectors z and x that contain the frequency count and the n bin locations.
[z,x] = hist(y,x)    Same as hist(y,x) but returns two vectors z and x that contain the frequency count and the bin locations. The returned vector x is the same as the user-supplied vector x.
Data Statistics Tool - 1
Make Line-Plot of Temp Data for Stockton, CA
Use the Tools Menu to find the Data Statistics Tool
Data Statistics Tool - 2
Use the Tool to Add Plot Lines for
• The Mean
• ±StdDev
Data Statistics Tool - 3
Quite a Nice Tool, Actually
The Result
The Avg Max Temp Was 96.97 °F
Probability
Probability ≡ The LIKELIHOOD that a Specified OutCome Will be Realized
• The “Odds” Run from 0% to 100%
Class Question: What are the Odds of winning the California MEGA-MILLIONS Lottery?
175 711 536 ... EXACTLY???!!!
To Win the MegaMillions Lottery
• Pick five numbers from 1 to 56
• Pick a MEGA number from 1 to 46
The Odds for the 1st ping-pong Ball = 5 out of 56
The Odds for the 2nd ping-pong Ball = 4 out of 55, and so On
The Odds for the MEGA are 1 out of 46
175 711 536 ... Calculated
Calc the OverAll Odds as the PRODUCT of each of the Individual OutComes
$$\text{Odds} = \frac{5}{56}\cdot\frac{4}{55}\cdot\frac{3}{54}\cdot\frac{2}{53}\cdot\frac{1}{52}\cdot\frac{1}{46} = \frac{5!\,51!}{56!}\cdot\frac{1}{46} = \frac{120}{21{,}085{,}384{,}320} = \frac{1}{175{,}711{,}536}$$
• This is Technically a COMBINATION
175 711 536 ... is a DEAL!
The ORDER in Which the Ping-Pong Balls are Drawn Does NOT affect the Winning Odds
If we Had to Match the Pull-Order:
$$\text{Odds} = \frac{1}{56}\cdot\frac{1}{55}\cdot\frac{1}{54}\cdot\frac{1}{53}\cdot\frac{1}{52}\cdot\frac{1}{46} = \frac{51!}{56!}\cdot\frac{1}{46} = \frac{1}{21{,}085{,}384{,}320}$$
• The denominator is 120X the Current one
• This is a PERMUTATION
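The combination and permutation odds can be checked with a few lines of MATLAB. This is a sketch only; the variable names are illustrative, and prod(52:56) is used in place of 56!/51! to stay within exact integer range.

```matlab
% Order does NOT matter (combination): choose 5 of 56, then 1 MEGA of 46
nCombos   = nchoosek(56, 5);     % number of distinct 5-ball sets
oddsCombo = nCombos * 46;        % 1-in-oddsCombo chance of winning

% Order DOES matter (permutation): 56*55*54*53*52 ordered draws, then 1 of 46
nPerms   = prod(52:56);          % same as 56!/51!
oddsPerm = nPerms * 46;          % 1-in-oddsPerm chance of winning

ratio = oddsPerm / oddsCombo;    % how much longer the ordered odds are (5! = 120)

fprintf('Combination: 1 in %d\n', oddsCombo);
fprintf('Permutation: 1 in %d\n', oddsPerm);
fprintf('Ratio:       %d\n', ratio);
```

The ratio of the two results is 5!, the number of orderings of the five drawn balls.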
Normal Distribution - 1
Consider Data on the Height of a sample group of 20 year old Men
Ht (in)   No.
64        1
64.5      0
65        0
65.5      0
66        2
66.5      4
67        5
67.5      4
68        8
68.5      11
69        12
69.5      10
70        9
70.5      8
71        7
71.5      5
72        4
72.5      4
73        3
73.5      1
74        1
74.5      0
75        1
We can Plot this Frequency Data using bar
>> y_abs=[1,0,0,0,2,4,5,4,8,11,12,10,9,8,7,5,4,4,3,1,1,0,1];
>> xbins = [64:0.5:75];
>> bar(xbins, y_abs), ylabel('No.'), xlabel('Height (Inches)'), title('Height of 20 Yr-Old Men')
Normal Distribution - 2
We can also SCALE the Bar/Hist such that the AREA UNDER the CURVE equals 1.00, exactly
The Game Plan for Scaling
• Calc the Height of Each Bar To Get the Total Area = [Bin Width] x [Σ(individual counts)]
• The individual Bar Area = [Bin Width] x [individual count]
• %-Area any one bar → [Bar Areas]/[Total Area]
Ht (in)   No.   Area (BW*No.)   No./TotArea
64        1     0.5             0.0200
64.5      0     0               0.0000
65        0     0               0.0000
65.5      0     0               0.0000
66        2     1               0.0400
66.5      4     2               0.0800
67        5     2.5             0.1000
67.5      4     2               0.0800
68        8     4               0.1600
68.5      11    5.5             0.2200
69        12    6               0.2400
69.5      10    5               0.2000
70        9     4.5             0.1800
70.5      8     4               0.1600
71        7     3.5             0.1400
71.5      5     2.5             0.1000
72        4     2               0.0800
72.5      4     2               0.0800
73        3     1.5             0.0600
73.5      1     0.5             0.0200
74        1     0.5             0.0200
74.5      0     0               0.0000
75        1     0.5             0.0200
Total           50.0
Normal Distribution - 3
We can Use bar to Plot the Scaled-Area Hist.
>> y_abs=[1,0,0,0,2,4,5,4,8,11,12,10,9,8,7,5,4,4,3,1,1,0,1];
>> xbins = [64:0.5:75];
>> TotalArea = sum(0.5*y_abs)
>> y_scale = 100*y_abs/TotalArea;
>> bar(xbins, y_scale), ylabel('Fraction (%/inch)'), xlabel('Height (inches)'), title('Height of 20 Yr-Old Men')
Normal Distribution - 4
This is a Good Time for a UNITS Check
• Remember, our GOAL → the Area Under the Curve = 1
Recall From the Plot the UNITS for the y-axis → %/inch (?)
The Units come from these MATLAB Statements
TotalArea = sum(0.5*y_abs)
• 0.5 is the Bin Width in INCHES
• So TotalArea is in inches•No.
Now y_scale
y_scale = 100*y_abs/TotalArea;
• Cont. on Next Slide
Normal Distribution - 5
The Units Analysis for y_scale
y_scale = 100*y_abs/TotalArea;
$$\text{y\_scale} = 100\% \cdot \frac{\text{No.}}{\text{inches}\cdot\text{No.}} \;\Rightarrow\; \text{y\_scale} \to \frac{\%}{\text{inch}}$$
Recall From MTH1 that for y = f(x) displayed in BAR Form the Area Under the Curve is the sum of the Individual Areas, each Hgt × BinWidth = y(x)·Δx
$$A_{crv} = \sum_{x = x_{lo}}^{x_{hi}} y(x)\,\Delta x$$
Normal Distribution - 6
In this Case
• y(x) → y_scale in %/inch
• Δx → Bin Width = 0.5 in inches
Then The Units Analysis for Our “integration”
$$A_{crv} = \sum_{x = x_{lo}}^{x_{hi}} y(x)\,\Delta x \;\to\; \frac{\%}{\text{inch}} \cdot 0.5\ \text{inch} \;\to\; \%$$
Check the integration
Ht (in)   No.   Area (BW*No.)   No./TotArea   BW*(No./TotArea)
64        1     0.5             0.0200        1.00%
64.5      0     0               0.0000        0.00%
65        0     0               0.0000        0.00%
65.5      0     0               0.0000        0.00%
66        2     1               0.0400        2.00%
66.5      4     2               0.0800        4.00%
67        5     2.5             0.1000        5.00%
67.5      4     2               0.0800        4.00%
68        8     4               0.1600        8.00%
68.5      11    5.5             0.2200        11.00%
69        12    6               0.2400        12.00%
69.5      10    5               0.2000        10.00%
70        9     4.5             0.1800        9.00%
70.5      8     4               0.1600        8.00%
71        7     3.5             0.1400        7.00%   ← Example
71.5      5     2.5             0.1000        5.00%
72        4     2               0.0800        4.00%
72.5      4     2               0.0800        4.00%
73        3     1.5             0.0600        3.00%
73.5      1     0.5             0.0200        1.00%
74        1     0.5             0.0200        1.00%
74.5      0     0               0.0000        0.00%
75        1     0.5             0.0200        1.00%
Total           50.0                          100.00%
Normal Distribution - 7 Example 71”
The 71” Bar Area = Hgt•Width:
$$A_{71,scl} = 14\,\frac{\%}{\text{inch}} \times 0.5\ \text{inches} = 7\%\ \text{(of the total area)}$$
Alternatively from the Absolute values
$$A_{71,abs} = 7\ \text{No.} \times 0.5\ \text{inches} = 3.5\ \text{No.}\cdot\text{inch}$$
• The Total Abs Area = 50 No.•inch
$$\frac{A_{71,abs}}{A_{all,abs}} = \frac{3.5\ \text{No.}\cdot\text{in}}{50\ \text{No.}\cdot\text{in}} = 7\%$$
Probability Distribution Fcn (PDF)
Because the Area Under the Scaled Plot is 1.00, exactly, The FRACTIONAL Area under any bar, or set-of-bars, gives the probability that any randomly Selected 20 yr-old man will be that height
e.g., from the Plot we Find
• 67.5 in → 8 %/in
• 68 in → 16 %/in
• 68.5 in → 22 %/in
Summing → 46 %/in
Multiply by the Uniform BinWidth of 0.5 in → 23% of 20 yr-old men are 67.25-68.75 inches tall
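The scaling and the 67.25-68.75 inch probability can be reproduced numerically. The sketch below reuses y_abs, xbins, TotalArea, and y_scale from the slides; binWidth, areaPct, inRange, and P_pct are names introduced here for illustration.

```matlab
% Height counts for the sample of 20 year old men (from the lecture)
y_abs = [1,0,0,0,2,4,5,4,8,11,12,10,9,8,7,5,4,4,3,1,1,0,1];
xbins = 64:0.5:75;                  % bin centers, inches
binWidth = 0.5;                     % inches

TotalArea = sum(binWidth*y_abs);    % No.*inch (50 for this data)
y_scale = 100*y_abs/TotalArea;      % bar heights in %/inch

% Units check: summing height*width over all bars must give 100 %
areaPct = sum(binWidth*y_scale);

% Probability of a height in the 67.5, 68 and 68.5 inch bins
inRange = (xbins >= 67.5) & (xbins <= 68.5);
P_pct = sum(binWidth*y_scale(inRange));

fprintf('Total area = %.2f %%\n', areaPct);
fprintf('P(67.25 to 68.75 in) = %.2f %%\n', P_pct);
```

The logical mask inRange selects bars by their center locations, so the covered height range extends half a bin width beyond the outermost centers.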
Random Variable
A random variable x takes on a defined set of values with different probabilities; e.g.,
• If you roll a die, the outcome is random (not fixed) and there are 6 possible outcomes, each of which occurs with equal probability of one-sixth.
• If you poll people about their voting preferences, the percentage of the sample that responds “Yes on Proposition 101” is also a random variable – the %-age will be slightly different every time you poll.
Roughly, probability is how frequently we expect different outcomes to occur if we repeat the experiment over and over (“frequentist” view)
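The frequentist view can be illustrated with a small Monte Carlo experiment: roll a simulated die many times and compare the observed relative frequencies with one-sixth. This is a sketch; the number of rolls N is an arbitrary assumption, and the results vary slightly from run to run because no random seed is fixed.

```matlab
N = 60000;                        % number of simulated rolls (assumed value)
rolls = randi(6, 1, N);           % uniformly distributed integers 1..6

relFreq = zeros(1, 6);            % observed relative frequency of each face
for k = 1:6
    relFreq(k) = sum(rolls == k) / N;
end

maxDev = max(abs(relFreq - 1/6)); % largest departure from the ideal 1/6

disp(relFreq)
fprintf('Largest deviation from 1/6: %.4f\n', maxDev);
```

Increasing N should shrink the deviation, roughly in proportion to 1/sqrt(N).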
Random variables can be Discrete or Continuous
Discrete random variables have a countable number of outcomes
• Examples: Dead/Alive, Red/Black, Heads/Tails, dice, counts, etc.
Continuous random variables have an infinite continuum of possible values.
• Examples: blood pressure, weight, Air Temperature, the speed of a car, the real numbers from 1 to 6.
Probability Distribution Functions
A Probability Distribution Function (PDF) maps the possible values of x against their respective probabilities of occurrence, p(x)
p(x) is a number from 0 to 1.0, or alternatively, from 0% to 100%.
The area under a probability distribution function curve is always 1 (or 100%).
Discrete Example: Roll The Die
x    p(x)
1    p(x=1) = 1/6
2    p(x=2) = 1/6
3    p(x=3) = 1/6
4    p(x=4) = 1/6
5    p(x=5) = 1/6
6    p(x=6) = 1/6
$$\sum_{\text{all } x} p(x) = 1$$
[Figure: bar plot of p(x) versus x; each of the six bars, x = 1 to 6, has height 1/6]
Continuous Case
The probability function that accompanies a continuous random variable is a continuous mathematical function that integrates to 1.
The Probabilities associated with continuous functions are just areas under a Region of the curve (→ Definite Integrals)
Probabilities are given for a range of values, rather than a particular value
• e.g., the probability of getting a math SAT score between 700 and 800 is 2%.
Continuous Case PDF Example
Recall the negative exponential function (in probability, this is called an “exponential distribution”):
$$f(x) = e^{-x}$$
This Function Integrates to 1 from zero to infinity, as required for all PDF’s
$$\int_0^{\infty} e^{-x}\,dx = \left.-e^{-x}\right|_0^{\infty} = 0 + 1 = 1$$
Continuous Case PDF Example
For example, the probability of x falling within 1 to 2:
$$P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left.-e^{-x}\right|_1^2 = -e^{-2} + e^{-1} = -0.135 + 0.368 = 0.233 \approx 23\%$$
The probability that x is any exact value (e.g.: 1.9976) is 0
• we can ONLY assign Probabilities to possible RANGES of x
• NO Area Under a LINE
[Figure: two plots of p(x) = e^{-x} versus x, starting at p(0) = 1; the first shades the area between x = 1 and x = 2, the second marks a single vertical line]
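Both integrals for the exponential PDF can be confirmed numerically. The sketch below assumes a MATLAB release that provides the integral function; the variable names are illustrative.

```matlab
f = @(x) exp(-x);                 % exponential PDF, valid for x >= 0

totalArea = integral(f, 0, Inf);  % area under the whole curve
P12 = integral(f, 1, 2);          % probability that 1 <= x <= 2
P12_exact = exp(-1) - exp(-2);    % closed-form value of the same integral

fprintf('Total area       = %.6f\n', totalArea);
fprintf('P(1 <= x <= 2)   = %.4f (numerical)\n', P12);
fprintf('P(1 <= x <= 2)   = %.4f (closed form)\n', P12_exact);
```

Shrinking the integration limits toward a single point drives the result toward zero, which is the numerical counterpart of "no area under a line".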
Gaussian Curve
The Man-Height HistoGram had some Limited, and thus DISCRETE, Data
If we were to Measure 10,000 (or more) young men we would obtain a HistoGram like this
[Figure: finely binned height histogram]
As We increase the number and fineness of the measurements The PDF approaches a CONTINUOUS Curve
Gaussian Distribution
A Distribution that Describes Many Physical Processes is called the GAUSSIAN or NORMAL Distribution
Gaussian (Normal) distribution
• Gaussian → famous “bell-shaped curve”
– Describes IQ scores, how fast horses can run, the no. of Bees in a hive, wear profile on old stone stairs...
• All these are cases where:
– deviation from mean is equally probable in either direction
– Variable is continuous (or large enough integer to look continuous)
Normal Distribution
Real-valued PDF: f(x) → −∞ < x < +∞
2 independent fitting parameters: µ, σ (central location and width)
Properties:
• Symmetrical about Mode at µ
• Median = Mean = Mode
• Inflection points at µ ± σ
Area (probability of observing event) within:
• ± 1σ = 0.683
• ± 2σ = 0.955
For larger σ, bell shaped curve becomes wider and lower (since area = 1 for any σ)
Normal Distribution
Mathematically
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$$
• Where
– σ² = Variance
– µ = Mean
The Area Under the Curve
$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = 1$$
68-95-99.7 Rule for Normal Dist
• 68% of the data lie within ±σ of the mean
• 95% of the data lie within ±2σ of the mean
• 99.7% of the data lie within ±3σ of the mean
[Figure: bell curve with the ±σ, ±2σ, and ±3σ intervals marked]
68-95-99.7 Rule in Math terms…
Using Definite-Integral Calculus
$$\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = 0.68 = 68\%$$
$$\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = 0.95 = 95\%$$
$$\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = 0.997 = 99.7\%$$
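The three definite integrals can be evaluated numerically as a check on the rule. This sketch assumes a standard normal (µ = 0, σ = 1), which is an arbitrary choice since the fractions do not depend on µ or σ, and it assumes the integral function is available.

```matlab
mu = 0;  sigma = 1;               % assumed parameters (standard normal)

% Normal PDF as an anonymous function of x
f = @(x) exp(-(x - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));

P = zeros(1, 3);                  % P(k) = area within +/- k*sigma of the mean
for k = 1:3
    P(k) = integral(f, mu - k*sigma, mu + k*sigma);
end

fprintf('Within 1 sigma: %.4f\n', P(1));
fprintf('Within 2 sigma: %.4f\n', P(2));
fprintf('Within 3 sigma: %.4f\n', P(3));
```

Changing mu or sigma to other values should leave the three fractions unchanged.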
How Good is the Rule for Real?
Check some example data:
The mean, µ, of the weight of a large group of women Cross Country Runners = 127.8 lbs
The standard deviation (σ) for this Group = 15.5 lbs
[Figure: histogram of runner weights; x-axis: POUNDS, 80 to 160; y-axis: Percent, 0 to 25; markers at 112.3, 127.8, and 143.3 lbs (µ ± 1σ)]
68% of 120 = .68 x 120 = ~ 82 runners
In fact, 79 runners fall within 1σ (15.5 lbs) of the mean
[Figure: histogram of runner weights; x-axis: POUNDS, 80 to 160; y-axis: Percent, 0 to 25; markers at 96.8, 127.8, and 158.8 lbs (µ ± 2σ)]
95% of 120 = .95 x 120 = ~ 114 runners
In fact, 115 runners fall within 2σ of the mean
[Figure: histogram of runner weights; x-axis: POUNDS, 80 to 160; y-axis: Percent, 0 to 25; markers at 81.3, 127.8, and 174.3 lbs (µ ± 3σ)]
99.7% of 120 = .997 x 120 = 119.6 runners
In fact, all 120 runners fall within 3σ of the mean
Estimating µ & σ (1)
The Location & Width Parameters, µ & σ, are Calculated from the ENTIRE POPULATION
• Mean, µ
$$\mu = \frac{1}{N}\sum_{k=1}^{N} x_k$$
• Variance, σ²
$$\sigma^2 = \frac{1}{N}\sum_{k=1}^{N} (x_k - \mu)^2$$
• Standard Deviation, σ = √(σ²)
For LARGE Populations it is usually impractical to measure all the x_k
In this case we take a Finite SAMPLE to ESTIMATE µ & σ
Estimating µ & σ (2)
Say we want to characterize Miles/Yr driven by Every Licensed Driver in the USA
We assume that this is Normally Distributed, so we take a Sample of N = 1013 Drivers
We Take the Mean of the SAMPLE
$$\bar{x} = \frac{1}{N}\sum_{k=1}^{N} x_k$$
Use the SAMPLE-Mean to Estimate the POPULATION-Mean
$$\mu \approx \bar{x} = \frac{1}{N}\sum_{k=1}^{N} x_k$$
Estimating µ & σ (3)
Now Calc the SAMPLE Variance & StdDev
$$\sigma^2 \approx S^2 = \frac{1}{N-1}\sum_{k=1}^{N} (x_k - \bar{x})^2$$
• Number decreased from N to (N – 1) To Account for case where N = 1
– In this case x-bar = x1, and the S² result (0/0) is meaningless
• standard deviation: positive square root of the variance
– small std dev: observations are clustered tightly around a central value
– large std dev: observations are scattered widely about the mean
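The sample formulas map directly onto MATLAB's built-in mean, var, and std, which use the (N – 1) divisor by default. The sketch below uses a small made-up sample x (hypothetical values, not lecture data) to compare the hand-coded formulas with the built-ins.

```matlab
x = [2 4 4 4 5 5 7 9];                   % hypothetical sample, for illustration only
N = numel(x);

xbar = sum(x)/N;                         % sample mean
S2 = sum((x - xbar).^2)/(N - 1);         % sample variance, (N - 1) divisor
S = sqrt(S2);                            % sample standard deviation

% Built-in equivalents; var and std divide by (N - 1) unless told otherwise
fprintf('mean: %.4f (formula)  %.4f (built-in)\n', xbar, mean(x));
fprintf('var : %.4f (formula)  %.4f (built-in)\n', S2, var(x));
fprintf('std : %.4f (formula)  %.4f (built-in)\n', S, std(x));

% Passing 1 as the second argument switches to the N divisor (population form)
popVar = var(x, 1);
```

For large N the two divisors give nearly the same result; the difference matters mainly for small samples.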
Sample Mean and StdDev
For a series of N observations, the most probable estimate of the mean µ is the average x̄ of the observations. We refer to this as the sample mean x̄ to distinguish it from the population mean µ.
$$\bar{x} = \frac{1}{N}\sum_i x_i \qquad \text{(Sample Mean)}$$
Calculate the Population Variance, σ², from:
$$\sigma^2 = \frac{1}{N}\sum_i (x_i - \mu)^2 = \frac{1}{N}\sum_i x_i^2 - \mu^2$$
But we cannot know the true population mean µ so the practical estimate for the sample variance and standard deviation would be:
$$\sigma^2 \approx s^2 = \frac{1}{N-1}\sum_i (x_i - \bar{x})^2 \qquad \text{(Sample Variance)}$$
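Error Function (erf) & Probability
Gauss’s Defining Eqn
$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-y^2}\,dy$$
This looks a lot Like the normal dist
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$$
Consider the Gaussian integral
$$I_G = \int \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$
Or
$$I_G = \frac{1}{\sigma\sqrt{2\pi}}\int e^{-\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)^2}\,dx$$
Now Let
$$y = \frac{x-\mu}{\sigma\sqrt{2}} \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{\sigma\sqrt{2}}$$
Or
$$dx = \sigma\sqrt{2}\,dy$$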
Error Function (erf) & Probability
Subbing for x & dx
$$I_G = \frac{1}{\sigma\sqrt{2\pi}}\int e^{-y^2}\,\sigma\sqrt{2}\,dy = \frac{1}{\sqrt{\pi}}\int e^{-y^2}\,dy$$
As
$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-y^2}\,dy$$
ReArranging
$$I_G = \frac{1}{\sqrt{\pi}}\int e^{-y^2}\,dy = \frac{1}{2}\cdot\frac{2}{\sqrt{\pi}}\int e^{-y^2}\,dy \;\to\; \frac{1}{2}\operatorname{erf}$$
Error Function (erf) & Probability
Now the Limits
Plotting
$$f(y) = e^{-y^2}$$
[Figure: plot of f(y) = exp(-y²) for y from -3 to 3; f(y) runs from 0 to 1 and peaks at y = 0]
This Fcn is Symmetrical about y = 0
Recall
$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-y^2}\,dy$$
And the erf properties
• erf(0) = 0
• erf(∞) = 1
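Error Function (erf) & Probability
By Symmetry about y = 0 for e^{-y²}
$$\frac{2}{\sqrt{\pi}}\int_{-\infty}^{0} e^{-y^2}\,dy = \frac{2}{\sqrt{\pi}}\int_{0}^{\infty} e^{-y^2}\,dy = 1$$
Thus
$$\frac{2}{\sqrt{\pi}}\int_{-\infty}^{B} e^{-y^2}\,dy = \frac{2}{\sqrt{\pi}}\int_{-\infty}^{0} e^{-y^2}\,dy + \frac{2}{\sqrt{\pi}}\int_{0}^{B} e^{-y^2}\,dy$$
So Finally integrating −∞ to B
$$\frac{2}{\sqrt{\pi}}\int_{-\infty}^{B} e^{-y^2}\,dy = 1 + \operatorname{erf}(B)$$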
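Error Function (erf) & Probability
Note That for a Continuous PDF
• Probability that x is Less or Equal to b
$$P(x \le b) = \int_{-\infty}^{b} f(x)\,dx$$
• Probability that x is between a & b
$$P(a \le x \le b) = \int_{a}^{b} f(x)\,dx$$
The probability for the Normal Dist
$$P(x \le b) = \int_{-\infty}^{b} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$
$$P(a \le x \le b) = \int_{a}^{b} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$
But
$$I_G = \int \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = \frac{1}{2}\operatorname{erf}\!\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)$$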
Error Function (erf) & Probability
If We Scale this Properly we can Cast these Eqns into the ½erf Form
$$P(x \le b) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{b-\mu}{\sigma\sqrt{2}}\right)\right]$$
$$P(a \le x \le b) = \frac{1}{2}\left[\operatorname{erf}\!\left(\frac{b-\mu}{\sigma\sqrt{2}}\right) - \operatorname{erf}\!\left(\frac{a-\mu}{\sigma\sqrt{2}}\right)\right]$$
MATLAB has the erf built-in, so if we have the sample Mean & StdDev We can Calc Probabilities for Normally Distributed Quantities
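The two ½erf formulas translate directly into MATLAB anonymous functions. The sketch below applies them to the runner-weight mean and standard deviation quoted earlier in the lecture, under the assumption that those weights are normally distributed; the function names Ple and Pab are illustrative.

```matlab
mu = 127.8;  sigma = 15.5;       % runner weights, lbs (from the lecture)

% P(x <= b) and P(a <= x <= b) for a normal distribution, via erf
Ple = @(b) 0.5*(1 + erf((b - mu)/(sigma*sqrt(2))));
Pab = @(a, b) 0.5*(erf((b - mu)/(sigma*sqrt(2))) - erf((a - mu)/(sigma*sqrt(2))));

P1 = Pab(mu - sigma,   mu + sigma);     % within 1 sigma: 112.3 to 143.3 lbs
P2 = Pab(mu - 2*sigma, mu + 2*sigma);   % within 2 sigma: 96.8 to 158.8 lbs
Phalf = Ple(mu);                        % half the population lies below the mean

fprintf('P(112.3 <= x <= 143.3) = %.4f\n', P1);
fprintf('P( 96.8 <= x <= 158.8) = %.4f\n', P2);
fprintf('P(x <= 127.8)          = %.4f\n', Phalf);
```

Multiplying P1 by the 120 runners gives the expected head count inside one standard deviation, which can be compared with the observed 79.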
All Done for Today
Gaussian? Or Normal?
Normal distribution was introduced by French mathematician A. De Moivre in 1733.
• Used to approximate probabilities of coin tossing
• Called it the exponential bell-shaped curve
1809, K.F. Gauss, a German mathematician, applied it to predict astronomical entities… it became known as the Gaussian distribution.
Late 1800s, most believed that the majority of physical data would follow the distribution, which came to be called the normal distribution
Recall De Moivre’s Theorem
$$z = R(\cos\theta + j\sin\theta) \;\Rightarrow\; z^k = R^k(\cos k\theta + j\sin k\theta)$$
Appendix
Basic Fitting Demo File
% Bruce Mayer, PE
% ENGR25 * 11Apr10
% file = Demo_Basic_Fitting_Stockton_Temps_1004.m
%
TmaxSTK = [94, 98, 93, 94, 91, 96, 93, 87, 89, 94, 100, 99, 103, 103, 103, 97, 91, 83, 84, 90, 89, 95, 94, 99, 97, 94, 102, 103, 107, 98, 86, 89, 95, 91, 84, 93, 98, 104, 105, 107, 103, 91, 90, 96, 93, 86, 92, 93, 95, 95, 86, 81, 93, 97, 96, 97, 101, 92, 89, 92, 93, 94]
Ntot = length(TmaxSTK)
nday = [1:Ntot];
plot(nday, TmaxSTK, '-dk'), xlabel('No. Days after 30Jun08'), ylabel('Max. Temp (°F)'), title('Stockton, CA - Jul-Aug08')
Normal or Gaussian?
Normal distribution was introduced by French mathematician A. De Moivre in 1733.
• Used to approximate probabilities of coin tossing
• Called it the exponential bell-shaped curve
1809, K.F. Gauss, a German mathematician, applied it to predict astronomical entities… it became known as the Gaussian distribution.
Late 1800s, most believed that the majority of data would follow the distribution, which came to be called the normal distribution
Carl Friedrich Gauss
NormalDist Data
Ht (in) No. Area (BW*No.) No./TotArea BW*(No./TotArea)
64 1 0.5 0.0200 1.00%
64.5 0 0 0.0000 0.00%
65 0 0 0.0000 0.00%
65.5 0 0 0.0000 0.00%
66 2 1 0.0400 2.00%
66.5 4 2 0.0800 4.00%
67 5 2.5 0.1000 5.00%
67.5 4 2 0.0800 4.00%
68 8 4 0.1600 8.00%
68.5 11 5.5 0.2200 11.00%
69 12 6 0.2400 12.00%
69.5 10 5 0.2000 10.00%
70 9 4.5 0.1800 9.00%
70.5 8 4 0.1600 8.00%
71 7 3.5 0.1400 7.00%
71.5 5 2.5 0.1000 5.00%
72 4 2 0.0800 4.00%
72.5 4 2 0.0800 4.00%
73 3 1.5 0.0600 3.00%
73.5 1 0.5 0.0200 1.00%
74 1 0.5 0.0200 1.00%
74.5 0 0 0.0000 0.00%
75 1 0.5 0.0200 1.00%
50.0 100.00%
SPICE Circuit