Data Exploration: Statistics (One Variable)
1. Basic Excel/MATLAB functions for data exploration
2. Measures of central tendency, distributions
   1. Mean
   2. Median
   3. Mode
3. Measures of spread
   1. Range
   2. Variance
4. Simple sampling
5. Example of sampling using Excel
1. Working with Data in Excel: Arithmetic
1. Summary Statistics in Excel (One Variable)
Use “Insert”, then “Function”, then “All” or “Statistical” to find an alphabetical list of functions.
1. Summary Statistics in Excel (Average)
1. Summary Statistics in Excel (Median)
1. Summary Statistics in Excel (Standard Deviation)
1. Summary Statistics in Excel (RAND and RANDBETWEEN)
1. Summary Statistics in Excel (Sort)
1. Summary Statistics in MATLAB

| Function | Description |
|---|---|
| max | Maximum value |
| mean | Average or mean value |
| median | Median value |
| min | Smallest value |
| mode | Most frequent value |
| std | Standard deviation |
| var | Variance, which measures the spread or dispersion of the values |
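For readers without MATLAB, the same summary statistics can be sketched with Python's standard library. This is only an illustration: the data values are made up, and the `summary` dictionary is a hypothetical name. Like MATLAB's `std` and `var` defaults, `statistics.stdev` and `statistics.variance` divide by n - 1.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# One entry per function named in the MATLAB table.
summary = {
    "max": max(data),
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "min": min(data),
    "mode": statistics.mode(data),
    "std": statistics.stdev(data),      # sample standard deviation (n - 1)
    "var": statistics.variance(data),   # sample variance (n - 1)
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```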
2. Distributions: Continuous Probability Distributions
Uniform Probability Distribution
Normal Probability Distribution
Exponential Probability Distribution
[Figure: density curves f(x) versus x for the uniform, normal, and exponential distributions]
Uniform Probability Distribution
A random variable is uniformly distributed whenever the probability is proportional to the interval’s length.
The uniform probability density function is:
$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a < x < b \\ 0 & \text{elsewhere} \end{cases}$$
where: a = smallest value the variable can assume, b = largest value the variable can assume
Uniform Probability Distribution
Expected Value of x
$$E(x) = \frac{a + b}{2}$$
Variance of x
$$\mathrm{Var}(x) = \frac{(b - a)^2}{12}$$
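As a quick check of the uniform formulas, here is a minimal sketch that evaluates the density, expected value, and variance. The function names and the endpoints a = 5 and b = 15 are assumptions chosen for illustration, not values from the slides.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution: 1/(b - a) for a < x < b, else 0."""
    return 1.0 / (b - a) if a < x < b else 0.0


def uniform_mean(a, b):
    """Expected value E(x) = (a + b) / 2."""
    return (a + b) / 2.0


def uniform_variance(a, b):
    """Variance Var(x) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12.0


# Hypothetical endpoints, for illustration only.
a, b = 5.0, 15.0
print("f(10) =", uniform_pdf(10.0, a, b))
print("f(20) =", uniform_pdf(20.0, a, b))
print("E(x)  =", uniform_mean(a, b))
print("Var(x)=", uniform_variance(a, b))
```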
Normal Probability Distribution Characteristics
The highest point on the normal curve is at the mean, which is also the median and mode.
Normal Probability Distribution Characteristics
The mean can be any numerical value: negative, zero, or positive.
[Figure: normal curves centered at x = -10, 0, and 20]
3. Normal Probability Distribution Characteristics
The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
[Figure: normal curves with $\sigma = 15$ and $\sigma = 25$]
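The "wider, flatter" claim can be checked numerically: the peak of the normal density at the mean is $1/(\sigma\sqrt{2\pi})$, so it drops as $\sigma$ grows. The sketch below is illustrative only. It assumes a mean of 0, and `normal_pdf` is a hypothetical helper name.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu = 0.0  # assumed mean; the comparison holds for any mean
for sigma in (15.0, 25.0):
    peak = normal_pdf(mu, mu, sigma)
    print(f"sigma = {sigma:>4}: peak height = {peak:.6f}")
```

The curve with the larger standard deviation has the lower peak, which is what makes it look flatter.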
Standard Normal Probability Distribution
Converting to the Standard Normal Distribution
$$z = \frac{x - \mu}{\sigma}$$
We can think of z as a measure of the number of standard deviations x is from $\mu$.
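A minimal sketch of the conversion to a standard normal value follows. The function name `z_score` is an assumption. The mean of 990 and standard deviation of 80 are the SAT population values quoted later in these slides, and the score of 1070 is a made-up input.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# mu and sigma from the SAT example later in the slides; x is hypothetical.
print(z_score(1070, 990, 80))  # one standard deviation above the mean
print(z_score(990, 990, 80))   # exactly at the mean
```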
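3. Normal Probability Distribution Characteristics
68.26% of values of a normal random variable are within $\mu \pm 1\sigma$.
95.44% of values of a normal random variable are within $\mu \pm 2\sigma$.
99.72% of values of a normal random variable are within $\mu \pm 3\sigma$.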
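These percentages can be reproduced, up to rounding, from the standard normal distribution: the probability of falling within k standard deviations of the mean is erf(k/√2). The sketch below uses only the standard library, and `prob_within` is a hypothetical helper name. Direct computation differs from the slide's figures in the last digit, presumably because the slide values were read from a rounded four-decimal table.

```python
import math


def prob_within(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.2f}%")
```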
4. Sampling and Sampling Distributions
Simple Random Sampling
Point Estimation
Introduction to Sampling Distributions
Sampling Distribution of $\bar{x}$
Sampling Distribution of $\bar{p}$
Other Sampling Methods
4. Simple Random Sampling
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
Finite populations are often defined by lists such as:
- Organization membership roster
- Credit card account numbers
- Inventory product numbers
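The definition of a simple random sample can be sketched with Python's `random.sample`, which draws n distinct items so that every subset of size n is equally likely. The roster below is a hypothetical ten-member list, and `simple_random_sample` is an illustrative name, not something from the slides.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical membership roster of N = 10 entries.
roster = [f"member-{i}" for i in range(1, 11)]
sample = simple_random_sample(roster, 3, seed=42)
print(sample)
```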
4. Point Estimation
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.
s is the point estimator of the population standard deviation $\sigma$.
$\bar{p}$ is the point estimator of the population proportion p.
Sampling Distribution of $\bar{x}$
Process of Statistical Inference
1. Population with mean $\mu$ = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean $\bar{x}$.
4. The value of $\bar{x}$ is used to make inferences about the value of $\mu$.
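One way to see why the sample mean supports inference about the population mean is to simulate repeated sampling. The sketch below is a rough illustration under stated assumptions: the population is synthetic (900 values drawn from a normal distribution with mean 990 and standard deviation 80, echoing the SAT example in these slides), and the number of repetitions (1,000) is arbitrary.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population of N = 900 scores (an assumption for illustration).
population = [rng.gauss(990, 80) for _ in range(900)]
mu = statistics.mean(population)

# Draw many simple random samples of n = 30 and record each sample mean.
sample_means = [
    statistics.mean(rng.sample(population, 30)) for _ in range(1000)
]

center = statistics.mean(sample_means)   # should sit close to mu
spread = statistics.stdev(sample_means)  # much smaller than the population spread

print(f"population mean      : {mu:.1f}")
print(f"mean of sample means : {center:.1f}")
print(f"std of sample means  : {spread:.1f}")
```

Individual sample means vary from sample to sample, but they cluster around the population mean.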
4. Simple Random Sampling
The population parameters of interest are the SAT scores and the percentage of students planning to live in dorms.
Now suppose that the necessary data on the current year’s applicants were not yet entered in the college’s database.
Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours.
She decides a sample of 30 applicants will be used.
The applicants were numbered, from 1 to 900, as their applications arrived.
4. Simple Random Sampling: Taking a Sample of 30 Applicants
Step 1: Assign a random number to each of the 900 applicants.
Excel’s RAND function generates random numbers between 0 and 1.
Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers.
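The two steps above can be sketched outside Excel as well. In this illustration, `rng.random()` plays the role of Excel's RAND, and sorting by the random number stands in for the ascending sort. The function name and the seed are assumptions.

```python
import random


def select_by_smallest_random(ids, n, seed=None):
    """Tag each id with a random number, then keep the n smallest."""
    rng = random.Random(seed)
    # Step 1: assign a random number in [0, 1) to each applicant.
    tagged = [(rng.random(), applicant) for applicant in ids]
    # Sort ascending by random number.
    tagged.sort()
    # Step 2: keep the applicants with the n smallest random numbers.
    return [applicant for _, applicant in tagged[:n]]


applicants = range(1, 901)  # applicants numbered 1 to 900
chosen = select_by_smallest_random(applicants, 30, seed=1)
print(chosen)
```

Because the random numbers are independent of the applicant numbers, this procedure yields a simple random sample.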
4. Using Excel to Select a Simple Random Sample
Excel Formula Worksheet

| Row | A: Applicant Number | B: SAT Score | C: On-Campus Housing | D: Random Number |
|---|---|---|---|---|
| 2 | 1 | 1008 | Yes | =RAND() |
| 3 | 2 | 1025 | No | =RAND() |
| 4 | 3 | 952 | Yes | =RAND() |
| 5 | 4 | 1090 | Yes | =RAND() |
| 6 | 5 | 1127 | Yes | =RAND() |
| 7 | 6 | 1015 | No | =RAND() |
| 8 | 7 | 965 | Yes | =RAND() |
| 9 | 8 | 1161 | No | =RAND() |
4. Using Excel to Select a Simple Random Sample
Excel Value Worksheet

| Row | A: Applicant Number | B: SAT Score | C: On-Campus Housing | D: Random Number |
|---|---|---|---|---|
| 2 | 1 | 1008 | Yes | 0.61021 |
| 3 | 2 | 1025 | No | 0.83762 |
| 4 | 3 | 952 | Yes | 0.58935 |
| 5 | 4 | 1090 | Yes | 0.19934 |
| 6 | 5 | 1127 | Yes | 0.86658 |
| 7 | 6 | 1015 | No | 0.60579 |
| 8 | 7 | 965 | Yes | 0.80960 |
| 9 | 8 | 1161 | No | 0.33224 |
4. Using Excel to Select a Simple Random Sample
Put Random Numbers in Ascending Order
Step 1 Select cells A2:D901
Step 2 Select the Data menu
Step 3 Choose the Sort option
Step 4 When the Sort dialog box appears: choose Random Numbers in the Sort by text box, choose Ascending, then click OK
Using Excel to Select a Simple Random Sample
Excel Value Worksheet (Sorted)

| Row | A: Applicant Number | B: SAT Score | C: On-Campus Housing | D: Random Number |
|---|---|---|---|---|
| 2 | 12 | 1107 | No | 0.00027 |
| 3 | 773 | 1043 | Yes | 0.00192 |
| 4 | 408 | 991 | Yes | 0.00303 |
| 5 | 58 | 1008 | No | 0.00481 |
| 6 | 116 | 1127 | Yes | 0.00538 |
| 7 | 185 | 982 | Yes | 0.00583 |
| 8 | 510 | 1163 | Yes | 0.00649 |
| 9 | 394 | 1008 | No | 0.00667 |
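Point Estimation
$\bar{x}$ as Point Estimator of $\mu$
$$\bar{x} = \frac{\sum x_i}{30} = \frac{29{,}910}{30} = 997$$
s as Point Estimator of $\sigma$
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = \sqrt{\frac{163{,}996}{29}} = 75.2$$
$\bar{p}$ as Point Estimator of p
$$\bar{p} = \frac{20}{30} \approx .67$$
Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates.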
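The point estimates can be recomputed from the totals quoted on this slide. The sketch below takes those totals as given: a sum of SAT scores of 29,910, a sum of squared deviations of 163,996, and 20 of the 30 sampled applicants wanting on-campus housing. The variable names are illustrative. Note that 20/30 evaluates to about .667.

```python
import math

n = 30                # sample size
sum_x = 29_910        # sum of the 30 sampled SAT scores (from the slide)
sum_sq_dev = 163_996  # sum of squared deviations from the sample mean (from the slide)
count_yes = 20        # sampled applicants wanting on-campus housing (from the slide)

x_bar = sum_x / n                    # point estimate of the population mean
s = math.sqrt(sum_sq_dev / (n - 1))  # point estimate of the population std. deviation
p_bar = count_yes / n                # point estimate of the population proportion

print(f"x_bar = {x_bar:.0f}")
print(f"s     = {s:.1f}")
print(f"p_bar = {p_bar:.3f}")
```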
Summary of Point Estimates Obtained from a Simple Random Sample

| Population Parameter | Parameter Value | Point Estimator | Point Estimate |
|---|---|---|---|
| $\mu$ = Population mean SAT score | 990 | $\bar{x}$ = Sample mean SAT score | 997 |
| $\sigma$ = Population std. deviation for SAT score | 80 | s = Sample std. deviation for SAT score | 75.2 |
| p = Population proportion wanting campus housing | .72 | $\bar{p}$ = Sample proportion wanting campus housing | .67 |
Other Sampling Methods
- Stratified Random Sampling
- Cluster Sampling
- Systematic Sampling
- Convenience Sampling
- Judgment Sampling
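The slides only name these methods. As a rough sketch, two of them can be illustrated using their usual textbook definitions, which are an assumption here: systematic sampling takes every k-th item after a random start, and stratified random sampling draws a simple random sample within each group. The function names, the strata, and the seeds are hypothetical.

```python
import random


def systematic_sample(items, n, seed=None):
    """Take every k-th item after a random start, with k = len(items) // n."""
    rng = random.Random(seed)
    k = len(items) // n
    start = rng.randrange(k)  # random start within the first k items
    return [items[start + j * k] for j in range(n)]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of fixed size from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }


applicants = list(range(1, 901))
systematic = systematic_sample(applicants, 30, seed=3)
print(systematic)

# Hypothetical strata: applicants split by housing preference.
strata = {"on_campus": list(range(1, 649)), "off_campus": list(range(649, 901))}
stratified = stratified_sample(strata, 5, seed=3)
print(stratified)
```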