Chapter 3: Some basic concepts of statistics

Upload: cricket

Post on 05-Jan-2016




Population versus Sample

Population
• Numbers that describe the population are called parameters
• Population mean is represented by $\mu$
• Population variance is represented by $\sigma^2$

Sample
• Numbers that describe the sample are called statistics
• Sample mean is represented by $\bar{y}$
• Sample variance is represented by $s^2$


Sample mean and variance

Use the following data set: 5, 9, 8, 7, 6, 5, 8, 4, 1

• Calculate sample mean:

• Calculate sample variance:

• Calculate sample standard deviation:
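The three calculations for the data set 5, 9, 8, 7, 6, 5, 8, 4, 1 can be sketched in R as follows. This is a minimal illustration, not part of the original slides; the variable names (y, y_bar, s2, s) are chosen here for convenience, and the variance uses the usual n - 1 divisor.

```r
# Data set from the slide
y <- c(5, 9, 8, 7, 6, 5, 8, 4, 1)
n <- length(y)

y_bar <- sum(y) / n                    # sample mean
s2    <- sum((y - y_bar)^2) / (n - 1)  # sample variance (divisor n - 1)
s     <- sqrt(s2)                      # sample standard deviation

# The built-in functions mean(y), var(y) and sd(y) give the same values
print(c(mean = y_bar, variance = s2, sd = s))
```

By hand, the sum of the data is 53 and the sum of squares is 361, so the mean is 53/9 (about 5.89), the variance is 55/9 (about 6.11), and the standard deviation is about 2.47.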


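Population Mean and Standard Deviation

• Population mean: $\mu = E(Y) = \sum_i y_i \, p(y_i)$

• Population variance: $\sigma^2 = E[(Y - \mu)^2] = \sum_i (y_i - \mu)^2 \, p(y_i)$; the population standard deviation is $\sigma = \sqrt{\sigma^2}$

Use the following information to calculate the population mean, variance and standard deviation:

Y    P(Y)
1    0.1
2    0.6
3    0.2
4    0.1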
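A short R sketch of this calculation, assuming the distribution reads Y = 1, 2, 3, 4 with probabilities 0.1, 0.6, 0.2, 0.1. The variable names are illustrative and not from the slides.

```r
# Probability distribution from the slide
y <- c(1, 2, 3, 4)
p <- c(0.1, 0.6, 0.2, 0.1)
stopifnot(isTRUE(all.equal(sum(p), 1)))  # probabilities must sum to 1

mu     <- sum(y * p)            # population mean, E(Y)
sigma2 <- sum((y - mu)^2 * p)   # population variance
sigma  <- sqrt(sigma2)          # population standard deviation

print(c(mean = mu, variance = sigma2, sd = sigma))
```

Working it by hand gives a mean of 2.3, a variance of 0.61, and a standard deviation of about 0.78.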


Sampling distribution

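• The sampling distribution of $\bar{y}$ is the distribution of all possible values of $\bar{y}$ from samples of size n (here, n = 50).

• $E(\bar{y}) = \mu$

• $\mathrm{Var}(\bar{y}) = \sigma^2/n$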
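The two results, that the mean of y-bar equals the population mean and its variance equals the population variance divided by n, can be checked by simulation. The sketch below is an illustration only: it assumes a population with values 1, 2, 3, 4 and probabilities 0.1, 0.6, 0.2, 0.1, independent draws, an arbitrary seed, and 10000 repetitions. None of these settings are prescribed by the slides.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed population distribution (illustrative)
y <- c(1, 2, 3, 4)
p <- c(0.1, 0.6, 0.2, 0.1)
mu     <- sum(y * p)
sigma2 <- sum((y - mu)^2 * p)

n    <- 50      # sample size, as on the slide
reps <- 10000   # number of simulated samples (assumption)

# Draw many independent samples of size n and keep each sample mean
ybar <- replicate(reps, mean(sample(y, n, replace = TRUE, prob = p)))

print(c(simulated_mean = mean(ybar), theory = mu))
print(c(simulated_var = var(ybar), theory = sigma2 / n))
```

The simulated values should be close to, but not exactly equal to, the theoretical ones.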


Section 3.3 Summarizing Information in Populations and Samples: The Finite Population Case

• If the population is infinitely large, sampling without replacement behaves like sampling with replacement: the selections are independent and the selection probabilities do not change

• However, if the population is finite, then the probabilities of selecting elements change as more elements are selected

(Example: rolling a die versus selecting cards from a standard 52-card deck)
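The die-versus-cards contrast can be made concrete with a small calculation. The events used here (two sixes in a row, two aces in a row) are chosen for illustration and do not come from the slides.

```r
# Die: every roll is independent, so P(six) is 1/6 on each roll
p_two_sixes <- (1/6) * (1/6)

# Cards, WITHOUT replacement: after one ace is removed,
# only 3 aces remain among 51 cards
p_two_aces_without <- (4/52) * (3/51)

# Cards, WITH replacement: the deck is restored, so the draws are independent
p_two_aces_with <- (4/52)^2

print(c(two_sixes = p_two_sixes,
        aces_without = p_two_aces_without,
        aces_with = p_two_aces_with))
```

The two card probabilities differ (1/221 versus 1/169) because removing a card changes the composition of a finite deck, whereas a die has no memory.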


Estimating the population total

• We will represent the total of a population as $\tau$ and the statistic that estimates it as $\hat{\tau}$

• More to come on this in the next few chapters


Sampling without replacement

• The same idea can be used with sampling without replacement, but the probabilities become more difficult to find (STT 315 helps you understand how to calculate these).


3.4 Sampling distribution

• In your introductory statistics class, you discovered that the sampling distribution of $\bar{y}$ is approximately normal (if n is large enough) with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.


Tchebysheff’s theorem

• If n is NOT large enough to assume the CLT and the population distribution is NOT normal, then we can still use Tchebysheff’s theorem to get a lower bound:

For any k ≥ 1, at least $1 - 1/k^2$ of the distribution will fall within k standard deviations of the mean (this is a LOWER BOUND!!). Therefore:

• within 1 standard deviation, at least 0% (not very useful)

• within 2 standard deviations, at least 75%

• within 3 standard deviations, at least 88.9% (8/9)
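The bound 1 - 1/k^2 can be tabulated and compared with what is actually observed in a skewed data set. The sketch below is an illustration: the gamma data, seed and sample size are assumptions, and the observed proportions are computed with the sample mean and sample standard deviation.

```r
k     <- c(1, 2, 3)
bound <- 1 - 1 / k^2   # Tchebysheff lower bounds: 0, 0.75, 8/9

set.seed(2)                                   # arbitrary seed
x <- rgamma(10000, shape = 0.5, scale = 9)    # strongly skewed data (assumption)

# Observed proportion of values within k standard deviations of the mean
observed <- sapply(k, function(kk) mean(abs(x - mean(x)) <= kk * sd(x)))

print(data.frame(k = k, bound = bound, observed = observed))
```

The observed proportions are never below the bound, and for most data sets they are well above it, which is why the theorem is described as a lower bound only.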


Finite population size

All the theory in the introductory statistics class (and so far in this class) assumes INDEPENDENT observations (an infinite population, or one so large that we can treat it as infinite).

What happens when this is not true?

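R code (sampling without replacement from a finite population of 80 values):

    # Finite population: 80 values drawn once from a gamma distribution
    x <- rgamma(80, shape = 0.5, scale = 9)
    hist(x)

    # 1000 sample means, each from n values sampled WITHOUT replacement from x
    x.bar.dist <- function(x, n) {
      xbar <- numeric(1000)
      for (i in 1:1000) {
        temp <- sample(x, n, replace = FALSE)
        xbar[i] <- mean(temp)
      }
      return(xbar)
    }

R code (sampling from the infinite gamma population):

    # 1000 sample means, each from n fresh independent gamma draws
    x.bar.dist1 <- function(n) {
      xbar <- numeric(1000)
      for (i in 1:1000) {
        temp <- rgamma(n, shape = 0.5, scale = 9)
        xbar[i] <- mean(temp)
      }
      return(xbar)
    }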
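One way to see what happens when the population is finite is to compare the spread of the sample means under the two schemes. The sketch below is a compact rewrite of the same idea as x.bar.dist and x.bar.dist1, using replicate instead of a loop. The sample size n = 40, the seed, and the number of repetitions are assumptions made here, not values from the slides. The theoretical comparison uses the standard finite population result Var(y-bar) = (S^2/n)(1 - n/N), where S^2 is the population variance computed with divisor N - 1.

```r
set.seed(3)                      # arbitrary seed
N <- 80; n <- 40; reps <- 5000   # n and reps are illustrative choices

pop <- rgamma(N, shape = 0.5, scale = 9)   # finite population of N values

# Sample means when sampling WITHOUT replacement from the finite population
xbar_finite <- replicate(reps, mean(sample(pop, n, replace = FALSE)))

# Sample means when drawing fresh values from the infinite gamma population
xbar_infinite <- replicate(reps, mean(rgamma(n, shape = 0.5, scale = 9)))

# Theoretical variances of the sample mean
var_finite_theory   <- var(pop) / n * (1 - n / N)  # finite population correction
var_infinite_theory <- 0.5 * 9^2 / n               # gamma variance is shape * scale^2

print(c(simulated = var(xbar_finite), theory = var_finite_theory))
print(c(simulated = var(xbar_infinite), theory = var_infinite_theory))
```

With half of the population sampled, the factor (1 - n/N) is 0.5, so the sample means from the finite population vary noticeably less than the independence formula would suggest.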


3.5 Covariance and Correlation

• Relationship between two random variables: covariance

• The covariance indicates how two variables “covary”

• Positive covariance indicates a positive association

• Negative covariance indicates a negative association

• Zero covariance indicates no linear association (NOT necessarily independence!!!)
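A small example shows why zero covariance does not imply independence. The data below are made up for illustration: y2 is completely determined by y1, yet the covariance is zero because the relationship is not linear.

```r
# Made-up data: y2 is an exact function of y1
y1 <- c(-2, -1, 0, 1, 2)
y2 <- y1^2

cov_y1_y2 <- cov(y1, y2)   # sample covariance
print(cov_y1_y2)
```

The positive and negative products of deviations cancel exactly, so the covariance is 0 even though knowing y1 tells you y2 with certainty.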


More on Covariance

• We calculate covariance by $\mathrm{Cov}(y_1, y_2) = E[(y_1 - \mu_1)(y_2 - \mu_2)]$.

• Look at graphs to discuss covariance (measure of LINEAR dependency)

• However, covariance depends on the scale of the two variables

• Correlation “standardizes” the covariance

• Correlation: $\rho = \mathrm{Cov}(y_1, y_2)/(\sigma_1 \sigma_2)$

• Note that $-1 \le \rho \le 1$
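The scale dependence of covariance, and the way correlation removes it, can be seen by changing the units of one variable. The heights and weights below are simulated, hypothetical data, and the seed and model are assumptions made only for this sketch.

```r
set.seed(4)   # arbitrary seed

# Hypothetical data: heights in metres and weights in kilograms
height_m  <- rnorm(100, mean = 1.7, sd = 0.1)
weight_kg <- 40 + 20 * height_m + rnorm(100, sd = 5)
height_cm <- 100 * height_m   # same heights, different units

# Covariance changes with the units (here by a factor of 100)
cov_m  <- cov(height_m, weight_kg)
cov_cm <- cov(height_cm, weight_kg)

# Correlation = covariance / (product of standard deviations); units cancel
r_m      <- cor(height_m, weight_kg)
r_cm     <- cor(height_cm, weight_kg)
r_manual <- cov_m / (sd(height_m) * sd(weight_kg))

print(c(cov_m = cov_m, cov_cm = cov_cm))
print(c(r_m = r_m, r_cm = r_cm, r_manual = r_manual))
```

The covariance is 100 times larger when height is recorded in centimetres, while the correlation is identical in both cases and lies between -1 and 1.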


3.6 Estimation

• Since we do not know the parameters, we estimate them with statistics!! If $\theta$ is the parameter of interest, then $\hat{\theta}$ is the estimator of $\theta$. We want the following properties to hold:

1. $E(\hat{\theta}) = \theta$

2. $V(\hat{\theta}) = \sigma^2_{\hat{\theta}}$ is small


Error of Estimation and Bounds

• The error of estimation is defined as $|\hat{\theta} - \theta|$

• Set a bound on this error of estimation (B) such that $P(|\hat{\theta} - \theta| < B) = 1 - \alpha$

The value of B (the bound) can be thought of as the margin of error. In fact, this is how confidence intervals are constructed (when the sampling distribution of the statistic is normally distributed).
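When the estimator is a sample mean with a normal sampling distribution and known standard deviation, the bound works out to B = z * sigma / sqrt(n), where z is the standard normal quantile for 1 - alpha/2. The sketch below checks by simulation that the error of estimation stays below B about 1 - alpha of the time. All numerical values (mu, sigma, n, alpha, the seed and the number of repetitions) are hypothetical and chosen only for illustration.

```r
set.seed(5)   # arbitrary seed

# Hypothetical population and design values
mu <- 10; sigma <- 3; n <- 25; alpha <- 0.05

# Bound on the error of estimation for the sample mean (normal case)
B <- qnorm(1 - alpha / 2) * sigma / sqrt(n)

# Simulate many sample means and record how often the error is below B
ybar     <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))
coverage <- mean(abs(ybar - mu) < B)

print(c(B = B, coverage = coverage, target = 1 - alpha))
```

The interval from y-bar minus B to y-bar plus B is the corresponding confidence interval for the mean, which is the sense in which B acts as a margin of error.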