statistics, distributions and probabilityritesh.singh/lecture_1.pdf · statistics, distributions...

50
Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata Mohanpur Campus, Nadia SERC Prep School, IISER Bhopal in collaboration with Partha Konar Physical Research Laboratory Ahmedabad Satyaki Bhattacharya Saha Institute of Nuclear Physics Kolkata Ritesh Singh Statistics: Lecture –1 1 / 16

Upload: others

Post on 28-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Statistics, Distributions and Probability

Ritesh K. Singh

Indian Institute of Science Education & Research KolkataMohanpur Campus, Nadia

SERC Prep School, IISER Bhopal

in collaboration with

Partha Konar

Physical Research LaboratoryAhmedabad

Satyaki Bhattacharya

Saha Institute of Nuclear PhysicsKolkata

Ritesh Singh Statistics: Lecture –1 1 / 16

Page 2: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

What is Statistics?As a disciplineAs a quantifierData Modelling

ProbabilityHistograms as Probability DistributionProbability Theory

Ritesh Singh Statistics: Lecture –1 2 / 16

Page 3: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

Need for Statistics

It is often said that the language of science is mathematics. It couldwell be said that the language of experimental science is statistics. It isthrough statistical concepts that we quantify the correspondencebetween theoretical predictions and experimental observations.

Kyle Cranmer, 1503.07622

Ritesh Singh Statistics: Lecture –1 3 / 16

Page 4: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

Need for Statistics

... we quantify the correspondence between theoretical predictionsand experimental observations.

Positive outcomePeak in invariant mass distributionof two photons⇒

Higgs mass (in GeV) =125.09± 0.21(stat.)± 0.11(syst.)(ATLAS + CMS)

I Parameter estimationI Hypothesis testing

Ritesh Singh Statistics: Lecture –1 4 / 16

Page 5: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

Need for Statistics

... we quantify the correspondence between theoretical predictionsand experimental observations.

Positive outcomePeak in invariant mass distributionof two photons⇒

Higgs mass (in GeV) =125.09± 0.21(stat.)± 0.11(syst.)(ATLAS + CMS)

I Parameter estimationI Hypothesis testing

Ritesh Singh Statistics: Lecture –1 4 / 16

Page 6: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

Need for Statistics

... we quantify the correspondence between theoretical predictionsand experimental observations.

Null outcomeNo conclusive evidence of signalsof Higgs boson at LEP⇒

mH > 114.3 GeV at 95% C.L.Best fit for mH

Upper limit on mH (SM like)

I Parameter limitationI Hypothesis testing

Ritesh Singh Statistics: Lecture –1 5 / 16

Page 7: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

Why Statistics?

Need for Statistics

... we quantify the correspondence between theoretical predictionsand experimental observations.

Null outcomeNo conclusive evidence of signalsof Higgs boson at LEP⇒

mH > 114.3 GeV at 95% C.L.Best fit for mH

Upper limit on mH (SM like)

I Parameter limitationI Hypothesis testing

Ritesh Singh Statistics: Lecture –1 5 / 16

Page 8: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a discipline

What is Statistics?

Statistics is a discipline which is concerned with:

I designing experiments and other data collection(like simulations),

I summarizing information to aid understanding(in terms of quantifiers, estimators),

I drawing conclusions from data(translating quantifiers into model parameters),

I estimating the present or predicting the future(post-diction) or (pre-diction)

Ritesh Singh Statistics: Lecture –1 6 / 16

Page 9: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a discipline

What is Statistics?

Statistics is a discipline which is concerned with:

I designing experiments and other data collection(like simulations),

I summarizing information to aid understanding(in terms of quantifiers, estimators),

I drawing conclusions from data(translating quantifiers into model parameters),

I estimating the present or predicting the future(post-diction) or (pre-diction)

Ritesh Singh Statistics: Lecture –1 6 / 16

Page 10: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a discipline

What is Statistics?

Statistics is a discipline which is concerned with:

I designing experiments and other data collection(like simulations),

I summarizing information to aid understanding(in terms of quantifiers, estimators),

I drawing conclusions from data(translating quantifiers into model parameters),

I estimating the present or predicting the future(post-diction) or (pre-diction)

Ritesh Singh Statistics: Lecture –1 6 / 16

Page 11: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a discipline

What is Statistics?

Statistics is a discipline which is concerned with:

I designing experiments and other data collection(like simulations),

I summarizing information to aid understanding(in terms of quantifiers, estimators),

I drawing conclusions from data(translating quantifiers into model parameters),

I estimating the present or predicting the future(post-diction) or (pre-diction)

Ritesh Singh Statistics: Lecture –1 6 / 16

Page 12: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a discipline

What is Statistics?

Statistics is a discipline which is concerned with:

I designing experiments and other data collection(like simulations),

I summarizing information to aid understanding(in terms of quantifiers, estimators),

I drawing conclusions from data(translating quantifiers into model parameters),

I estimating the present or predicting the future(post-diction) or (pre-diction)

Ritesh Singh Statistics: Lecture –1 6 / 16

Page 13: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a quantifier

Statistics ≡ quantifiers

Let D = {x1, x2, ... , xN} be a large set of data.

I Histogram, frequencydistribution

I Cumulative distribution,I Median center of distributionI Mode, most frequentI Mean, center of mass

Ritesh Singh Statistics: Lecture –1 7 / 16

Page 14: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a quantifier

Statistics ≡ quantifiers

Let D = {x1, x2, ... , xN} be a large set of data.

I Histogram, frequencydistribution

I Cumulative distribution,

I Median center of distributionI Mode, most frequentI Mean, center of mass

Ritesh Singh Statistics: Lecture –1 7 / 16

Page 15: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a quantifier

Statistics ≡ quantifiers

Let D = {x1, x2, ... , xN} be a large set of data.

I Histogram, frequencydistribution

I Cumulative distribution,I Median center of distributionI Mode, most frequentI Mean, center of mass

Ritesh Singh Statistics: Lecture –1 7 / 16

Page 16: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a quantifier

Moments of distribution

MeanCenter of mass for a given data set

〈x〉 =1

N

N∑i=1

xi

Mean for a given histogram

〈x〉 =

(H∑i=1

wi

)−1 H∑i=1

wi xi

Here H is number of histogramsand wi is the frequency in the i th

histogram.

Higher Moments

For a given data set nth moment is:

〈xn〉 =1

N

N∑i=1

xni

and for a given histogram is:

〈xn〉 =

(H∑i=1

wi

)−1 H∑i=1

wi xni

The mean is the first (n = 1)moment. Other importantquantifier is related to 〈x2〉.

Ritesh Singh Statistics: Lecture –1 8 / 16

Page 17: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?As a quantifier

Moments of distribution

VarianceMean or average, 〈x〉 is a measureof center of the data. Variancemeasures the spread of dataaround the mean. Since we have:

〈(x − 〈x〉)〉 = 0

The variance is defines as

V = 〈(x − 〈x〉)2〉 = 〈x2〉 − 〈x〉2

Higher Moments

Show:I 〈(x − 〈x〉)3〉 =〈x3〉+ 2〈x〉3 − 3〈x〉〈x2〉Skewness

I 〈(x − 〈x〉)4〉 = 〈x4〉 − 3〈x〉4 +6〈x2〉〈x〉2 − 4〈x〉〈x3〉Kurtosis

Show these relations numericallyduring Tutorials.

Ritesh Singh Statistics: Lecture –1 9 / 16

Page 18: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?Data Modelling

Data Modelling ≡ Data Compression

I We have D = {x1, x2, ... , xN}, a large set of data points.I Chosing H number of bins, the data set can be converted into a

histogram (frequency distribution). Now a set of N data pointsare represented or modelled by set of 2H number, H for the bincenters and H for frequencies. This is the first step of modelling,where we replace a large set of data by its frequency distribution.

I Next level modelling involves finding a functional form of thefrequency distribution. Then we can replace the 2H numbersdescribing the histogram with few moments 〈xn〉, which areparameters of its functional approximation.

I Example: If the shape of the histogram is Gaussian, it isparametrized with the mean and the variance.

D ⇒ 〈x〉 ±√V = 〈x〉 ±

√〈x2〉 − 〈x〉2

Ritesh Singh Statistics: Lecture –1 10 / 16

Page 19: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?Data Modelling

Data Modelling ≡ Data Compression

I We have D = {x1, x2, ... , xN}, a large set of data points.

I Chosing H number of bins, the data set can be converted into ahistogram (frequency distribution). Now a set of N data pointsare represented or modelled by set of 2H number, H for the bincenters and H for frequencies. This is the first step of modelling,where we replace a large set of data by its frequency distribution.

I Next level modelling involves finding a functional form of thefrequency distribution. Then we can replace the 2H numbersdescribing the histogram with few moments 〈xn〉, which areparameters of its functional approximation.

I Example: If the shape of the histogram is Gaussian, it isparametrized with the mean and the variance.

D ⇒ 〈x〉 ±√V = 〈x〉 ±

√〈x2〉 − 〈x〉2

Ritesh Singh Statistics: Lecture –1 10 / 16

Page 20: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?Data Modelling

Data Modelling ≡ Data Compression

I We have D = {x1, x2, ... , xN}, a large set of data points.I Chosing H number of bins, the data set can be converted into a

histogram (frequency distribution). Now a set of N data pointsare represented or modelled by set of 2H number, H for the bincenters and H for frequencies. This is the first step of modelling,where we replace a large set of data by its frequency distribution.

I Next level modelling involves finding a functional form of thefrequency distribution. Then we can replace the 2H numbersdescribing the histogram with few moments 〈xn〉, which areparameters of its functional approximation.

I Example: If the shape of the histogram is Gaussian, it isparametrized with the mean and the variance.

D ⇒ 〈x〉 ±√V = 〈x〉 ±

√〈x2〉 − 〈x〉2

Ritesh Singh Statistics: Lecture –1 10 / 16

Page 21: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?Data Modelling

Data Modelling ≡ Data Compression

I We have D = {x1, x2, ... , xN}, a large set of data points.I Chosing H number of bins, the data set can be converted into a

histogram (frequency distribution). Now a set of N data pointsare represented or modelled by set of 2H number, H for the bincenters and H for frequencies. This is the first step of modelling,where we replace a large set of data by its frequency distribution.

I Next level modelling involves finding a functional form of thefrequency distribution. Then we can replace the 2H numbersdescribing the histogram with few moments 〈xn〉, which areparameters of its functional approximation.

I Example: If the shape of the histogram is Gaussian, it isparametrized with the mean and the variance.

D ⇒ 〈x〉 ±√V = 〈x〉 ±

√〈x2〉 − 〈x〉2

Ritesh Singh Statistics: Lecture –1 10 / 16

Page 22: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

What is Statistics?Data Modelling

Data Modelling ≡ Data Compression

I We have D = {x1, x2, ... , xN}, a large set of data points.I Chosing H number of bins, the data set can be converted into a

histogram (frequency distribution). Now a set of N data pointsare represented or modelled by set of 2H number, H for the bincenters and H for frequencies. This is the first step of modelling,where we replace a large set of data by its frequency distribution.

I Next level modelling involves finding a functional form of thefrequency distribution. Then we can replace the 2H numbersdescribing the histogram with few moments 〈xn〉, which areparameters of its functional approximation.

I Example: If the shape of the histogram is Gaussian, it isparametrized with the mean and the variance.

D ⇒ 〈x〉 ±√V = 〈x〉 ±

√〈x2〉 − 〈x〉2

Ritesh Singh Statistics: Lecture –1 10 / 16

Page 23: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?I Probability of age between

10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 24: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?I Probability of age between

10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 25: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a person

I What is the probability ofage below 15 years?

I Probability of age above 25year?

I Probability of age between10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 26: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?

I Probability of age above 25year?

I Probability of age between10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 27: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?

I Probability of age between10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 28: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?I Probability of age between

10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 29: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?I Probability of age between

10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 30: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Histogram ≡ Probabilities

A group of 50people hasfollowing agedistribution

Age #0− 5 2

5− 10 810− 15 1515− 20 1420− 25 725− 30 4

Randomly choose a personI What is the probability of

age below 15 years?I Probability of age above 25

year?I Probability of age between

10 and 20 years?

Ans: 50%, 8%, 58%

Probability density =probability/(bin width)

Ritesh Singh Statistics: Lecture –1 11 / 16

Page 31: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityHistograms as Probability Distribution

Probability Density Functions

I If p(x) is the PDF of x thenp(x)dx is the probability offinding the value between xand x + dx .

I∫Rdx p(x) = 1, R is the range

of variable x .I p(x) ≥ 0 for x ∈ R

Uniform (x , a, b), x ∈ [a, b]

1

b − a

Exponential (x , a), x ∈ [0,∞]

1

aexp(−x/a)

Gaussian (x , µ, σ), x ∈ [−∞,∞]

1√2πσ

exp

[(x − µ)2

2σ2

]

Ritesh Singh Statistics: Lecture –1 12 / 16

Page 32: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 33: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 34: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 35: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 36: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 37: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 38: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 39: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 40: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 41: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 42: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 43: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Kolmogorov axiom

Let S is the set of events. Let A andB be the subsets of S .

I For each A ⊂ S , there is areal-valued function P suchthat P(A) ≥ 0.

I P(S) = 1

I For A ∩ B = φP(A ∪ B) = P(A) + P(B)

These are Kolmogorov axioms forthe function P to be probability.

Many other properties ofprobability can be deduced fromthese.

I Probability of A or B isP(A ∪ B)

I Probability of A and B isP(A ∩ B)

I P(A) = 1− P(A), A iscomplement of A w.r.t. S

I P(A ∪ A) = 1

I P(φ) = 0

I If A ⊂ B, then P(A) ≤ P(B)

I P(A ∪ B) =P(A) + P(B)− P(A ∩ B)

Ritesh Singh Statistics: Lecture –1 13 / 16

Page 44: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Conditional Probability

If A and B are independent thenprobability of A and B

P(A ∩ B) = P(A) P(B)

I Month = April (say A) andyear being even (say B) isindependent of each other,since there is April in everyyear.

I 29 February (A) andeven/odd-ness of year (B) isnot independent.

Conditional probability of A givenB (with P(B) 6= 0) is given by

P(A|B) =P(A ∩ B)

P(B)

If A and B are independent

P(A|B) =P(A ∩ B)

P(B)= P(A)

i.e. When A and B areindependent, then the conditionalprobability of A given B does notdepend upon B at all.

Ritesh Singh Statistics: Lecture –1 14 / 16

Page 45: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Conditional Probability

If A and B are independent thenprobability of A and B

P(A ∩ B) = P(A) P(B)

I Month = April (say A) andyear being even (say B) isindependent of each other,since there is April in everyyear.

I 29 February (A) andeven/odd-ness of year (B) isnot independent.

Conditional probability of A givenB (with P(B) 6= 0) is given by

P(A|B) =P(A ∩ B)

P(B)

If A and B are independent

P(A|B) =P(A ∩ B)

P(B)= P(A)

i.e. When A and B areindependent, then the conditionalprobability of A given B does notdepend upon B at all.

Ritesh Singh Statistics: Lecture –1 14 / 16

Page 46: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Conditional Probability

If A and B are independent thenprobability of A and B

P(A ∩ B) = P(A) P(B)

I Month = April (say A) andyear being even (say B) isindependent of each other,since there is April in everyyear.

I 29 February (A) andeven/odd-ness of year (B) isnot independent.

Conditional probability of A givenB (with P(B) 6= 0) is given by

P(A|B) =P(A ∩ B)

P(B)

If A and B are independent

P(A|B) =P(A ∩ B)

P(B)= P(A)

i.e. When A and B areindependent, then the conditionalprobability of A given B does notdepend upon B at all.

Ritesh Singh Statistics: Lecture –1 14 / 16

Page 47: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Conditional Probability

If A and B are independent thenprobability of A and B

P(A ∩ B) = P(A) P(B)

I Month = April (say A) andyear being even (say B) isindependent of each other,since there is April in everyyear.

I 29 February (A) andeven/odd-ness of year (B) isnot independent.

Conditional probability of A givenB (with P(B) 6= 0) is given by

P(A|B) =P(A ∩ B)

P(B)

If A and B are independent

P(A|B) =P(A ∩ B)

P(B)= P(A)

i.e. When A and B areindependent, then the conditionalprobability of A given B does notdepend upon B at all.

Ritesh Singh Statistics: Lecture –1 14 / 16

Page 48: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Conditional Probability

If A and B are independent thenprobability of A and B

P(A ∩ B) = P(A) P(B)

I Month = April (say A) andyear being even (say B) isindependent of each other,since there is April in everyyear.

I 29 February (A) andeven/odd-ness of year (B) isnot independent.

Conditional probability of A givenB (with P(B) 6= 0) is given by

P(A|B) =P(A ∩ B)

P(B)

If A and B are independent

P(A|B) =P(A ∩ B)

P(B)= P(A)

i.e. When A and B areindependent, then the conditionalprobability of A given B does notdepend upon B at all.

Ritesh Singh Statistics: Lecture –1 14 / 16

Page 49: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

Bayes’ Theorem

Probability of A given B is

P(A|B) =P(A ∩ B)

P(B)

and probability of B given A is

P(B|A) =P(B ∩ A)

P(A)

Since P(A ∩ B) = P(B ∩ A)

P(A|B) =P(B|A) P(A)

P(B)

Let’s assume S is composed ofdisjoint subsets Ai , thus S = ∪iAi

Further

B = B∩S = B∩(∪iAi ) = ∪i (B∩Ai )

P(B) =∑i

P(B ∩ Ai )

=∑i

P(B|Ai )P(Ai )

P(A|B) =P(B|A) P(A)∑i P(B|Ai )P(Ai )

Ritesh Singh Statistics: Lecture –1 15 / 16

Page 50: Statistics, Distributions and Probabilityritesh.singh/Lecture_1.pdf · Statistics, Distributions and Probability Ritesh K. Singh Indian Institute of Science Education & Research Kolkata

ProbabilityProbability Theory

An example of Bayes’ Theorem

The probability of havingAIDS(prior to any tests) for a givenpopulationP(AIDS) = 0.001P(no AIDS) = 0.999

Consider a test with results + or −and conditional probabilities asP(+|AIDS) = 0.98P(−|AIDS) = 0.02

P(+|no AIDS) = 0.03P(−|no AIDS) = 0.98

Follwing questions can be askedand answered:

I What is the probability forfind +(−) result for thepopulation?0.03095 (0.96905)

I If the result is +, what is thelikelihood of person suffringfrom AIDS?0.03166

I If the result is −, what is thelikelihood of person stillsuffring from AIDS?2.06× 10−5

Ritesh Singh Statistics: Lecture –1 16 / 16