3 - inferential

Elisa Maietti


Inferential statistics

Background on Statistics – Inferential Statistics

Descriptive vs Inferential statistics

Descriptive statistics:
• It provides information about the data sample: it describes what we observe and highlights the main characteristics of the data
• What we can deduce from descriptive statistics relates only to the sample analysed

Inferential statistics:
• It tries to extend the results of the sample analysis to the population from which the sample was drawn
• The sample should be representative of the whole population

We use:
• descriptive statistics to simply describe what's going on in our data
• inferential statistics to make inferences from our data to more general conditions


Inferential statistics

Inferential statistics is used to quantify the probability that a result from the sample analysis holds for the entire population

Definitions:
• Parameter: unobservable population characteristic. Example: mean age of Italian women
• Statistic: estimate of the parameter computed on the data sample. Example: mean age of the women in the data sample

Every estimate made on a sample, even if the sample is representative of the whole population, differs from the real value of the parameter by a certain quantity called the sampling error (if we could observe the whole population, this error would not exist)

Inferential statistics is a set of techniques that allow us to use sample estimates to make inferences about population parameters


Population vs Sample

[Figure: population and sample, labelled "Descriptive statistics" and "Inferential statistics"]

Statistical Inference: methods

Estimation
• Point estimation: estimate of the parameter value (examples: mean, proportion)
• Interval estimation: estimate of an interval of possible values for the parameter

Hypothesis test
• Hypothesis on the value of a population parameter
• Use the analysis of the data sample to reject or not reject the hypothesis


Point estimation

Population parameters (population of size N):
µ = mean = ∑ Xi / N
σ² = variance = ∑ (Xi - µ)² / N
σ = standard deviation = √σ²
pk = proportion (kth category) = ∑ Xik / N

Sample estimates (sample of size n):
X̄ = sample mean = ∑ Xi / n
s² = sample variance = ∑ (Xi - X̄)² / (n-1)
s = sample standard deviation = √s²
fk = sample relative frequency = ∑ Xik / n

(Xik = 1 if observation i falls in category k, 0 otherwise)
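The sample estimates can be sketched in a few lines of Python. The data and the category "age ≥ 75" are hypothetical values chosen only for illustration; they are not taken from the slides.

```python
import math
import statistics

# Hypothetical sample of n = 8 ages (illustrative values only)
sample = [71, 68, 75, 80, 66, 73, 77, 70]
n = len(sample)

# Sample mean: sum of the observations divided by n
x_bar = sum(sample) / n

# Sample variance: squared deviations from the sample mean, divided by n-1
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance
s = math.sqrt(s2)

# The standard library gives the same mean and (n-1) variance
same_mean = math.isclose(x_bar, statistics.mean(sample))
same_var = math.isclose(s2, statistics.variance(sample))

# Relative frequency of a hypothetical category k: "age >= 75"
indicator = [1 if x >= 75 else 0 for x in sample]
f_k = sum(indicator) / n

print(x_bar, s2, s, f_k)
```

Note that `statistics.variance` uses the n-1 denominator, while `statistics.pvariance` uses N and corresponds to the population formula.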


Probability distribution

Probability distribution (of a variable): the set of probabilities associated with the possible values of the variable. Each value corresponds to an exact probability of occurrence.

A statistic is a measure computed on the data sample (e.g. the sample mean or the sample standard deviation):
• it is a random variable: it varies from sample to sample in a way that cannot be predicted with certainty
• it has a mean, a standard deviation, and a probability distribution
• the probability distribution of a statistic depends on the distribution of the variable of interest:
  - Continuous variable: Normal distribution and Student's t distribution
  - Dichotomous variable: Binomial distribution


Normal distribution

• the principal probability distribution
• remarkably useful because it has many properties that make it convenient to use
• many natural events have a probability distribution that approximates the normal distribution


Normal distribution: properties

It is also called the Gaussian distribution or the bell curve:
• Bell shape
• Symmetric around its mean µ: mean = median = mode
• Range in (-∞; +∞)

NB: the area under the curve is equal to 1, because it represents the total probability of all the possible values

Parameters:
µ = mean
σ = standard deviation


Normal distribution: parameters

Every normal distribution depends on its mean µ and its standard deviation σ. Normal distributions can differ in the mean, in the variance, or in both parameters, while keeping their properties:
• varying the mean: the curve is shifted along the x axis
• varying the standard deviation: the curve becomes wider and flatter (larger σ) or narrower and more peaked (smaller σ)


Standardization

The variable X has normal distribution with parameters µ and σ:

X ~ N (µ, σ²)

Standardization:

Z = (X - µ) / σ

The variable Z has normal distribution with parameters 0 and 1:

Z ~ N (0,1)
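A minimal sketch of standardization, assuming hypothetical values µ = 100 and σ = 15 (not from the slides). It subtracts the mean and divides by the standard deviation, then checks that a probability computed on X equals the one computed on Z.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Return the z-score of x: distance from the mean in standard deviations."""
    return (x - mu) / sigma

# Hypothetical parameters of X (illustrative only)
mu, sigma = 100.0, 15.0
x = 130.0

z = standardize(x, mu, sigma)

# NormalDist takes the standard deviation sigma, not the variance sigma^2
p_from_x = NormalDist(mu, sigma).cdf(x)   # P(X <= x)
p_from_z = NormalDist(0, 1).cdf(z)        # P(Z <= z)

print(z, p_from_x, p_from_z)
```

Because the two probabilities coincide, a single table (or function) for the standard normal is enough for any normal variable.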


Standard Normal distribution

The standard Normal distribution is the normal distribution with parameters:
µ = 0
σ² = 1

This is a well-known distribution used for many statistical tests. Its probability values are listed in the corresponding probability tables.


Student's t distribution

Continuous probability distribution

Properties:
1. Bell shaped
2. Range in (-∞; +∞)
3. Symmetry: mean = median = mode = 0
4. Heavier tails and a lower peak: higher variance than the standard normal
5. Depends on ν = n-1 degrees of freedom
6. As (n-1) → ∞ it approaches the standard Normal distribution

[Figure: density f(t) of Student's t (ν = 2) compared with the Gaussian; an upper-tail area of p = 0.1 is cut off at t = 1.89 for Student's t and at 1.28 for the Gaussian]
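The two cut-off values in the figure can be reproduced with a short sketch. It relies on the closed-form quantile and CDF that Student's t has in the special case ν = 2; for other degrees of freedom a statistical library or a table would be needed. The function names are illustrative.

```python
import math
from statistics import NormalDist

def t2_quantile(p):
    """Quantile of Student's t with 2 degrees of freedom (closed form, nu = 2 only)."""
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))

def t2_cdf(t):
    """CDF of Student's t with 2 degrees of freedom (closed form, nu = 2 only)."""
    return 0.5 + t / (2 * math.sqrt(t * t + 2))

p_upper = 0.1                                  # upper-tail area, as in the figure

t_crit = t2_quantile(1 - p_upper)              # about 1.89
z_crit = NormalDist().inv_cdf(1 - p_upper)     # about 1.28

print(round(t_crit, 2), round(z_crit, 2))
```

The t cut-off lies further from zero than the Gaussian one for the same tail area, which is what "heavier tails" means in practice.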


Binomial distribution

Discrete probability distribution used to estimate proportions

Definition: it describes the probability of the number of successes in a sequence of n independent yes/no experiments, each of which yields success (= yes) with probability p.

Parameters:
• n = number of independent trials
• p = probability of success for each trial

X ~ BINOM (n, p)

Properties:
• mean µ = np
• variance σ² = np(1-p)
• As n → ∞ the binomial approaches a normal distribution with the same mean and variance, so the standardized variable (X - np) / √(np(1-p)) approaches the standard normal
• The shape depends on the values of n and p (as p gets close to 0.5 the curve becomes symmetric)
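As a small check of the mean and variance formulas, the sketch below builds the binomial probabilities directly from the definition and compares their mean and variance with np and np(1-p). The values n = 10 and p = 0.3 are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ BINOM(n, p): k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical parameters (illustrative only)
n, p = 10, 0.3

# Probability of each possible number of successes 0..n
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the distribution itself
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)
print(var, n * p * (1 - p))
```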


Hypothesis test

• Hypothesis on the parameter value
• Use of sample estimates to test the hypothesis: comparison between the estimate and the hypothesized value in terms of probability

Every time we test a hypothesis we implicitly consider two alternatives:
• H0: the null hypothesis (the one we usually hope to reject)
• H1: the alternative hypothesis (the contrasting hypothesis that we accept when we reject H0)

Test on H0: decision probabilities

                              H0 true (H1 false)     H0 false (H1 true)
Reject H0 in favour of H1     α = type I error       1-β = power
Do not reject H0              1-α                    β = type II error

Right conclusions: not rejecting H0 when it is true (1-α) and rejecting H0 when it is false (1-β)


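Hypothesis test

Probability distribution of X̄ under H0 and H1

• H0: µ = 90
• H1: µ > 90

[Figure: distributions of X̄ under H0 (centred at µ = 90) and under H1, with the areas α (type I error) and β (type II error); vertical axis: probability]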


Test statistic and p-value

Definitions:
- Test statistic: a statistic with known probability distribution
- α: the probability of rejecting H0 when it is true (generally α = 0.05 or 0.01)
- p-value: the probability of obtaining the observed value of the test statistic, or a more extreme one, when H0 is true

Operatively: to test the null hypothesis H0, we compare the observed value of the test statistic (Zc) with its value at the α level (Zα):
• Zc more extreme than Zα under H0: reject H0
• Zc less extreme than Zα under H0: do not reject H0

or, equivalently, the p-value with α:
• p-value < α: reject H0
• p-value ≥ α: do not reject H0
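The two equivalent decision rules can be sketched for an upper-tailed test based on the standard normal. The function name and the observed values 2.10 and 1.20 are hypothetical, chosen only to show one rejection and one non-rejection.

```python
from statistics import NormalDist

def upper_tail_test(z_c, alpha=0.05):
    """Upper-tailed z test: return (z_alpha, p_value, reject_h0)."""
    std = NormalDist()                   # standard normal N(0, 1)
    z_alpha = std.inv_cdf(1 - alpha)     # critical value at level alpha
    p_value = 1 - std.cdf(z_c)           # P(Z >= z_c) under H0
    return z_alpha, p_value, p_value < alpha

# Hypothetical observed test statistics
z_alpha, p1, reject1 = upper_tail_test(2.10)
_, p2, reject2 = upper_tail_test(1.20)

# Comparing z_c with z_alpha gives the same decision as comparing p-value with alpha
same_decision_1 = (2.10 > z_alpha) == reject1
same_decision_2 = (1.20 > z_alpha) == reject2

print(reject1, reject2)
```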


Hypothesis test

Probability distribution of the test statistic Z under the null hypothesis H0

[Figure: two panels, Case 1 and Case 2, each showing the positions of Zα and Zc with α and the p-value as tail areas]

Case 1: Zc > Zα or alternatively α > p-value: H0 rejected
Case 2: Zc < Zα or alternatively p-value > α: H0 not rejected

One and two tailed test

• One-tailed test: it highlights differences in one precise direction
E.g. we want to know if the weight of babies with underweight mothers is lower than the weight of the other babies
H0: µ ≥ µ0
H1: µ < µ0
[Figure: distribution of X̄ under H0, with the rejection area α in one tail]

• Two-tailed test: it highlights any difference between expected and observed values
E.g. we want to know if the weight of babies with underweight mothers is different from the weight of the other babies
H0: µ = µ0
H1: µ ≠ µ0
[Figure: distribution of X̄ under H0, with a rejection area α/2 in each tail]
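How α is placed in the tails changes the critical value. A minimal sketch for a test statistic with standard normal distribution, with α = 0.05; the `p_value` helper and its `tail` argument are illustrative names, not part of the slides.

```python
from statistics import NormalDist

std = NormalDist()   # standard normal N(0, 1)
alpha = 0.05

# One-tailed test with H1: mu < mu0, all of alpha in the lower tail
z_lower = std.inv_cdf(alpha)            # about -1.645

# Two-tailed test, alpha/2 in each tail
z_two = std.inv_cdf(1 - alpha / 2)      # about 1.96 (reject if |z| exceeds it)

def p_value(z_c, tail):
    """p-value of an observed z for a 'lower', 'upper' or 'two' tailed test."""
    if tail == "lower":
        return std.cdf(z_c)
    if tail == "upper":
        return 1 - std.cdf(z_c)
    if tail == "two":
        return 2 * (1 - std.cdf(abs(z_c)))
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

print(round(z_lower, 3), round(z_two, 3))
```

For the same observed value, the two-tailed p-value is twice the one-tailed p-value in the direction of the observed difference.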


Parameter estimation: distribution

Population parameters
• Mean µ: normal and Student's t distribution
Example 1: we want to test if babies with underweight mothers have a lower birth weight than the others
Example 2: we want to estimate the mean age of the patients admitted to the hospital for heart attack

• Proportion p: binomial distribution
Example 3: we want to test if the percentage of underweight babies is higher among underweight mothers than among normal-weight mothers


Test on µ: distribution of X̄

The distribution of the sample mean X̄ depends on the distribution of X:
• If X is normally distributed in the target population, its sample mean X̄ has normal distribution with mean µ and variance σ²/n:

X ~ N (µ, σ²)  ⇒  X̄ ~ N (µ, σ²/n)

• If X is not normally distributed but the sample size is "large", by the central limit theorem:

(n → ∞)  X̄ ~ N (µ, σ²/n)

With n > 30 the approximation is good
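The central limit theorem can be illustrated with a small simulation. The sketch assumes a hypothetical non-normal variable, X uniform on (0, 1), with µ = 0.5 and σ² = 1/12; the sample size n = 40 and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)   # fixed seed so the simulation is reproducible

n = 40           # sample size (n > 30)
reps = 5000      # number of simulated samples (arbitrary choice)

# X ~ Uniform(0, 1) is not normal; its mean is 0.5 and its variance is 1/12
mu, sigma2 = 0.5, 1 / 12

# Draw many samples of size n and keep the mean of each one
sample_means = [
    sum(random.random() for _ in range(n)) / n
    for _ in range(reps)
]

# The sample means should be centred on mu with variance close to sigma^2 / n
mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print(mean_of_means, mu)
print(var_of_means, sigma2 / n)
```

A histogram of `sample_means` would also look bell shaped, even though X itself is flat.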


Test on µ: test statistic

To test H0 we need to compute the test statistic: a measure with known probability distribution.

If X̄ ~ N (µ, σ²/n), then

Z = (X̄ - µ) / (σ/√n) ~ N (0, 1)

Z is the test statistic, but to compute it we need to know the parameter σ²

When σ² is unknown, we can use its sample estimate s² and obtain the ratio:

(X̄ - µ) / (s/√n)

What is its probability distribution?

Student's t with ν = n-1 degrees of freedom



Test on µ: test statistics

A. X ~ N (µ, σ²) with known σ², test statistic:
   Z = (X̄ - µ) / (σ/√n) ~ N (0,1)

B. X ~ N (µ, σ²) with unknown σ², test statistic:
   (X̄ - µ) / (s/√n) ~ Student's t (n-1 df); as n → ∞, ~ N (0,1)

C. X does not have a normal distribution but n → ∞:
   (X̄ - µ) / (s/√n) ~ N (0,1)


Hypothesis test on µ: example

Example 1. We want to test if the mean weight of babies with underweight mothers is significantly lower than that of the other babies. We know that in the population of normal-weight mothers the mean weight of a baby is 3.3 kg, thus:

H0: µ = 3.3
H1: µ < 3.3
α = 0.05 and zα = -1.645

Sample statistics: n = 82, X̄ = 3.121, s = 6.811
σ not known but n > 30, hence zc ~ N (0,1)
zc = -2.155, corresponding p-value = 0.039

zc is a more extreme value than zα, or alternatively p-value < α

H0 rejected: the mean weight of babies with underweight mothers is significantly lower than that of the others

[Figure: standard normal density with zc = -2.15 and zα = -1.64 marked on the z axis; α and the p-value shown as lower-tail areas]



Mean standard error

• When we use the sample mean to estimate the population mean, it is reasonable to wonder how far the estimate is from the real value
• The mean standard error σ/√n gives an idea of this distance: the larger the sample size, the better the sample mean approximates the parameter value
• Fundamental assumption: the sample is randomly drawn from the population

Example 2. µ = mean age of patients with heart attack

n = 120, X̄ = 73.4, s² = 16.64 (s = 4.08)

s/√n = 0.37

Point estimate of µ: 73.4 with a standard error of 0.37

We are now interested in estimating an interval of possible values for µ
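The standard error of Example 2 can be recomputed from the sample variance with a short sketch. The `standard_error` helper is an illustrative name; the last lines show, under the assumption that s stays the same, how a sample four times larger halves the standard error.

```python
import math

def standard_error(s, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# Example 2: n = 120, sample variance 16.64
n = 120
s = math.sqrt(16.64)            # sample standard deviation, about 4.08
se = standard_error(s, n)       # about 0.37

# Same s with a sample four times larger (hypothetical): the error is halved
se_4n = standard_error(s, 4 * n)

print(round(s, 2), round(se, 2), round(se_4n, 2))
```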


Interval estimation

With an interval estimate we compute a range of possible values in which the population parameter is expected to lie

The width of the interval depends on the confidence level:

• the confidence levels typically used are 90%, 95% or 99%, which correspond to an error probability α of 0.10, 0.05 and 0.01 respectively

• a 95% confidence level means that 95% of the intervals computed on all the possible random samples drawn from the population contain the real value of the parameter

• the confidence level therefore expresses how confident we can be that the interval estimate computed on our sample includes the real value of the parameter


Confidence Interval

Operatively the confidence interval is defined by:

sample statistic ± margin of error

The margin of error is the product of the mean standard error and the value of the test statistic at the (1-α) level

95% CI for µ: X ~ N (µ, σ²) with unknown variance σ²

X̄ ± z(α/2) * s/√n

where z(α/2) ~ N (0,1) if n > 30, or ~ Student's t with df = (n-1) if n < 30

Example: n = 120, X̄ = 73.4, s/√n = 0.37, zα/2 = 1.96
95% CI: 73.4 ± 1.96*0.37 = [72.7; 74.1]
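A minimal sketch of the interval computation for the large-sample case (n > 30), where the standard normal quantile is used. The function name is illustrative; the input values are those of the example.

```python
from statistics import NormalDist

def z_confidence_interval(x_bar, se, confidence=0.95):
    """Large-sample CI for the mean: x_bar +/- z(alpha/2) * standard error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for a 95% level
    margin = z * se                           # margin of error
    return x_bar - margin, x_bar + margin

# Example: X̄ = 73.4, s/√n = 0.37
low, high = z_confidence_interval(73.4, 0.37)

print(round(low, 1), round(high, 1))
```

Raising the confidence level (for instance to 0.99) gives a larger quantile and therefore a wider interval.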


Relationship between CI and test

Test
• α, significance level of a test: the probability level at which we reject H0
Confidence interval
• (1-α), confidence level at which we compute the interval estimate

Considering the parameter θ: testing H0: θ = θ0 with α = 0.05 is the same as computing the 95% CI for θ and checking whether it contains the value θ0

Example 2. Mean age of people admitted to the hospital for a heart attack
n = 120, X̄ = 73.4, s/√n = 0.37, α = 0.05

Test H0: µ = 70, H1: µ > 70
Zc = (73.4 - 70) / 0.37 = 9.19, zα = 1.645, Zc > zα: H0 rejected

95% CI: 73.4 ± 1.96*0.37 = [72.7; 74.1]: the 95% CI does not contain the value 70
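Both routes of Example 2 can be checked with a short sketch: the test statistic against the critical value, and the hypothesized value against the confidence interval. Variable names are illustrative.

```python
from statistics import NormalDist

# Example 2: sample mean, standard error, hypothesized mean, significance level
x_bar, se, mu0, alpha = 73.4, 0.37, 70.0, 0.05
std = NormalDist()   # standard normal N(0, 1)

# Route 1: upper-tailed test of H0: mu = 70 against H1: mu > 70
z_c = (x_bar - mu0) / se
z_alpha = std.inv_cdf(1 - alpha)       # about 1.645
reject_h0 = z_c > z_alpha

# Route 2: 95% confidence interval and check whether it contains mu0
z_half = std.inv_cdf(1 - alpha / 2)    # about 1.96
ci = (x_bar - z_half * se, x_bar + z_half * se)
mu0_in_ci = ci[0] <= mu0 <= ci[1]

print(round(z_c, 2), reject_h0, mu0_in_ci)
```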


Proportion: test and confidence interval

X ~ BINOM (n, p), µ = np, σ² = np(1-p)

H0: p = p0, so under H0 X ~ BINOM (n, p0)

Test statistic: relative frequency f = x/n
p-value: probability under H0 that the proportion is equal to f or to a more extreme value; if p-value > α we do not reject H0

For n > 30, by the central limit theorem: (x - µ) / σ ~ N(0,1)

Test statistic (normal test):

z = (x - np0) / √(np0(1-p0)), Z ~ N(0,1)

Hence for n > 30 we can compute the interval estimate for p as:

100(1-α)% CI: f ± zα/2 * √(f(1-f)/n)


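Proportion test and 95% CI: example

Example 3. We want to test if the proportion of underweight babies is higher among underweight mothers than among the others

One-tailed test:

H0: p = 0.15, H1: p > 0.15, α = 0.05

n = 82, f = 14/82 = 0.171, z = 0.526, p-value = 0.300
p-value > α: H0 not rejected

Alternatively, 95% CI for p:
f = 0.171, n = 82, α = 0.05
95% CI: 0.171 ± 1.96 * √(0.171*(1-0.171)/82) = [0.089; 0.252]
The interval contains 0.15, consistent with not rejecting H0

[Figure: standard normal density with the observed z = 0.53 and zα = 1.645 marked on the z axis; α shown as the upper-tail area]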
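The numbers of Example 3 can be reproduced with the normal approximation. This is a sketch using only the values given in the example (n = 82, 14 underweight babies, p0 = 0.15); variable names are illustrative.

```python
import math
from statistics import NormalDist

# Example 3: sample size, observed successes, hypothesized proportion, level
n, x, p0, alpha = 82, 14, 0.15, 0.05
std = NormalDist()   # standard normal N(0, 1)

# Relative frequency observed in the sample
f = x / n

# Normal test statistic: (x - n*p0) / sqrt(n*p0*(1-p0))
z = (x - n * p0) / math.sqrt(n * p0 * (1 - p0))

# Upper-tailed p-value, since H1: p > 0.15
p_value = 1 - std.cdf(z)
reject_h0 = p_value < alpha

# 95% CI for p, built with the sample relative frequency f
z_half = std.inv_cdf(1 - alpha / 2)
margin = z_half * math.sqrt(f * (1 - f) / n)
ci = (f - margin, f + margin)

print(round(f, 3), round(z, 3), reject_h0)
print(ci)
```

The test uses p0 in the standard error because it is computed under H0, while the interval uses f because no value of p is assumed.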


Thanks for your attention