se7204-bda-l2

Sampling Distributions JS-SSNCE

Upload: whosuresh

Post on 27-Nov-2015


DESCRIPTION

Presentation for the subject Big Data Analytics for PG SE

TRANSCRIPT

Page 1: SE7204-BDA-L2

Sampling Distributions

Page 2: SE7204-BDA-L2

Outline

Parameters and statistics

Statistical estimation and the law of large numbers

Sampling distributions

The sampling distribution of the sample mean

The central limit theorem <Later>

Page 3: SE7204-BDA-L2

Parameters and Statistics I

A parameter (population parameter)

– A number that describes the population

– Fixed but unknown

For example, the population mean is a parameter.

Page 4: SE7204-BDA-L2

Parameters and Statistics II

A statistic (sample statistic)

– A number that describes a sample

– Known after we take a sample

– Changes from sample to sample

– Used to estimate an unknown parameter

For example, the mean of the data from a sample is used to give information about the overall mean in the population from which that sample was drawn.
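As a rough illustration of the parameter/statistic distinction, the Python sketch below builds a made-up population, treats its mean as the parameter, and draws several simple random samples to show that the sample mean (a statistic) changes from sample to sample. The population values, the seed, and the sample size of 25 are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population of 10,000 values. In practice the population is
# usually too large to measure, which is why the parameter stays unknown.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: one fixed number describing the whole population.
mu = statistics.fmean(population)

# Statistic: computed from a sample, so it varies from sample to sample.
sample_means = [
    statistics.fmean(random.sample(population, 25)) for _ in range(5)
]

print(f"population mean (parameter): {mu:.2f}")
for i, xbar in enumerate(sample_means, start=1):
    print(f"sample {i} mean (statistic): {xbar:.2f}")
```

Each run with a different seed gives different sample means, while the population mean is computed once and does not depend on which samples are drawn.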

Page 5: SE7204-BDA-L2

Example

A survey conducted by a researcher in art education found that 17% of those surveyed had taken one course in dance in their life.

Q: Is the number 83% (= 100% - 17%) a statistic or a parameter?

Q: Is the unknown true percentage of American citizens who have taken at least one course in dance in their life a parameter or a statistic?

Page 6: SE7204-BDA-L2

Statistical estimation & the law of large numbers

• Random variables are used to estimate a population parameter. Because good samples are chosen randomly, statistics such as x̄ are random variables.

• The probability of any outcome of a random phenomenon is the proportion of times the outcome will occur in the long run. Thus, we can describe the behavior of a sample statistic by a probability model that answers the questions “What would happen if we did this many times?” and

• “What would happen if we took a large number of observations?”

Page 7: SE7204-BDA-L2

Example 10.2 (page 251)

• Here are the odor thresholds for ten randomly chosen subjects: 28 40 28 33 20 31 29 27 17 21. The mean is x̄ = 27.4. Since an SRS should represent the population, we expect x̄ to be close to the mean of the population.

• Q: Each sample from the same population will have a different mean x̄. Why is x̄ a reasonable estimate of the population mean?
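The sample mean quoted in Example 10.2 can be checked directly. This is a minimal sketch using only the ten odor thresholds listed on the slide; the variable names are illustrative.

```python
import statistics

# Odor thresholds for the ten subjects listed in Example 10.2.
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

# Sample mean: the sum of the observations divided by their count.
x_bar = statistics.fmean(thresholds)
print(f"n = {len(thresholds)}, sample mean = {x_bar:.1f}")  # prints 27.4
```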

Page 8: SE7204-BDA-L2

One answer is the law of large numbers.

Page 9: SE7204-BDA-L2

Example 10.3 (page 252): How sample means approach the population mean (µ = 25).
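The behavior described by the law of large numbers can be sketched with a small simulation. The population mean of 25 comes from Example 10.3. The Normal shape, the standard deviation of 7, and the seed are assumptions made only so the sketch can run.

```python
import random
import statistics

random.seed(1)  # arbitrary seed for a repeatable run

MU = 25.0    # population mean from Example 10.3
SIGMA = 7.0  # assumed population standard deviation (not given in the slides)

# Draw observations one at a time from the assumed population.
observations = [random.gauss(MU, SIGMA) for _ in range(10_000)]

# Mean of the first n observations for increasing n.
for n in (1, 10, 100, 1_000, 10_000):
    running_mean = statistics.fmean(observations[:n])
    print(f"n = {n:>6}: mean of first n observations = {running_mean:.3f}")
```

With small n the running mean can sit well away from 25. As n grows it settles closer and closer to the population mean.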

Page 10: SE7204-BDA-L2

Sampling distribution

The sampling distribution of a statistic (not parameter) is the distribution of values taken by the statistic (not parameter) in all possible samples of the same size from the same population.
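For a very small population the definition can be applied literally by listing every possible sample. The sketch below does this for a hypothetical five-value population and samples of size 2 drawn without replacement. Both choices are assumptions made for illustration, and the sample mean stands in for "the statistic".

```python
from collections import Counter
from itertools import combinations
from statistics import fmean

# Hypothetical population of five values (chosen only for illustration).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every possible sample of size n drawn without replacement.
samples = list(combinations(population, n))

# The statistic (here the sample mean) evaluated on each sample.
sample_means = [fmean(s) for s in samples]

# The sampling distribution: each value of the statistic with its
# relative frequency over all possible samples.
counts = Counter(sample_means)
for value in sorted(counts):
    print(f"mean = {value:4.1f}  proportion = {counts[value] / len(samples):.1f}")

print("mean of all sample means:", fmean(sample_means))
print("population mean:         ", fmean(population))
```

There are 10 possible samples here, and the mean of their sample means equals the population mean of 6.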

Page 11: SE7204-BDA-L2

Example 10.4 (page 254)

– What would happen in many samples?
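When listing all possible samples is not feasible, the question "what would happen in many samples?" can be approximated by simulation. The sketch below assumes a Normal population with mean 25 and standard deviation 7, samples of size 10, and 1,000 repetitions. None of these settings are stated on this slide.

```python
import random
import statistics
from collections import Counter

random.seed(2)  # arbitrary seed for a repeatable run

MU, SIGMA = 25.0, 7.0  # assumed population mean and standard deviation
N, REPS = 10, 1_000    # assumed sample size and number of repeated samples

# Take many samples of the same size and record each sample mean.
means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPS)
]

# Crude text histogram of the sample means, binned to the nearest integer.
bins = Counter(round(m) for m in means)
for value in sorted(bins):
    print(f"{value:>3} | {'*' * (bins[value] // 5)}")

print(f"center of the sample means: {statistics.fmean(means):.2f}")
print(f"spread of the sample means: {statistics.stdev(means):.2f}")
```

The histogram approximates the sampling distribution of the sample mean. Its center should sit near the population mean and its spread should be noticeably smaller than the population standard deviation.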

Page 12: SE7204-BDA-L2

Recall Some Features of the Sampling Distribution

For large samples, the sampling distribution of the sample mean will approximate a normal curve even if the population you started with does NOT look normal

The sampling distribution serves as a bridge between the sample and the population

Page 13: SE7204-BDA-L2

Mean of a sample mean x̄

Page 14: SE7204-BDA-L2

Standard Deviation of a sample mean x̄
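For reference, the standard results for the mean and standard deviation of x̄ can be sketched as follows. This assumes n independent observations x_1, ..., x_n from a population with mean µ and standard deviation σ. The subscripted symbols are notation introduced here, chosen to agree with the N(µ, σ/√n) form used later in the deck.

```latex
\mu_{\bar{x}}
  = E\!\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right]
  = \frac{1}{n}\cdot n\mu
  = \mu,
\qquad
\sigma_{\bar{x}}
  = \sqrt{\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)}
  = \sqrt{\frac{n\sigma^{2}}{n^{2}}}
  = \frac{\sigma}{\sqrt{n}}
```

The first result says x̄ is centered on the population mean. The second says its spread shrinks with the square root of the sample size.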

Page 15: SE7204-BDA-L2

Third Property: Sample Size and the Standard Deviation

The larger the sample size, the smaller the standard deviation of the mean x̄

Or

As n increases, the standard deviation of the mean decreases
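A small simulation can make this property concrete. The sketch below estimates the standard deviation of the sample mean for three sample sizes and compares each with σ/√n. The Normal population with mean 25 and standard deviation 7, the sample sizes, and the helper name `sd_of_sample_mean` are all assumptions for illustration.

```python
import math
import random
import statistics

random.seed(3)  # arbitrary seed for a repeatable run

MU, SIGMA = 25.0, 7.0  # assumed population mean and standard deviation
REPS = 2_000           # assumed number of repeated samples per sample size


def sd_of_sample_mean(n: int) -> float:
    """Estimate the SD of the sample mean from REPS simulated samples of size n."""
    means = [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(REPS)
    ]
    return statistics.stdev(means)


results = {n: sd_of_sample_mean(n) for n in (4, 16, 64)}
for n, simulated in results.items():
    theory = SIGMA / math.sqrt(n)
    print(f"n = {n:>2}: simulated SD = {simulated:.3f}, sigma/sqrt(n) = {theory:.3f}")
```

Because the standard deviation scales with 1/√n, quadrupling the sample size roughly halves the standard deviation of the mean.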

Page 16: SE7204-BDA-L2


Page 17: SE7204-BDA-L2

Sampling distribution of a sample mean

Definition: For a random variable x and a given sample size n, the distribution of the variable x̄, that is, the distribution of all possible sample means, is called the sampling distribution of the sample mean.

Page 18: SE7204-BDA-L2

Sampling distribution of the sample mean

Case 1. Population follows Normal distribution

– Draw an SRS of size n from the population.

– Repeat sampling.

– Population follows a Normal distribution with mean µ and standard deviation σ.

– Sampling distribution of x̄ follows a Normal distribution: N(µ, σ/√n).
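To see what N(µ, σ/√n) implies in practice, the sketch below uses `statistics.NormalDist` from the Python standard library to compare a single observation with the mean of a sample. The values µ = 25, σ = 7, n = 10, and the interval of ±2 around µ are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

MU, SIGMA = 25.0, 7.0  # assumed population mean and standard deviation
n = 10                 # assumed sample size

# Population distribution N(mu, sigma) and sampling distribution of the
# sample mean N(mu, sigma / sqrt(n)).
population = NormalDist(MU, SIGMA)
sample_mean = NormalDist(MU, SIGMA / math.sqrt(n))

# Probability of landing within 2 units of mu: one observation vs. the mean.
lo, hi = MU - 2, MU + 2
p_single = population.cdf(hi) - population.cdf(lo)
p_mean = sample_mean.cdf(hi) - sample_mean.cdf(lo)

print(f"SD of one observation: {population.stdev:.3f}")
print(f"SD of the sample mean: {sample_mean.stdev:.3f}")
print(f"P({lo} < x < {hi})     = {p_single:.3f}")
print(f"P({lo} < x-bar < {hi}) = {p_mean:.3f}")
```

The sample mean is much more likely than a single observation to fall close to µ, which is why it is the preferred estimate.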

Page 19: SE7204-BDA-L2

Example 10.5 (If the population distribution follows a Normal distribution, then so does the sample mean)