a very small intro to bayesian statistics · goals of bayesian statistics 1.set up a...

27
A very small intro to Bayesian Statistics Rosana Zenil-Ferguson, Will Freyman, and Jordan Koch University of Minnesota Botany 2018

Upload: others

Post on 21-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

A very small intro to Bayesian Statistics

Rosana Zenil-Ferguson, Will Freyman, and Jordan Koch

University of Minnesota

Botany 2018

Page 2: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Goals of Bayesian statistics

1. Set up a (probabilistic) model based on hypothesis ofinterest

2. Condition that model on observed data

3. Draw inferences, evaluate its fit and implicationsGelman et al. 2014 Bayesian Data Analysis. Third Edition

Page 3: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Goals of Bayesian statistics

1. Set up a (probabilistic) model based on hypothesis ofinterest

2. Condition that model on observed data

3. Draw inferences, evaluate its fit and implicationsGelman et al. 2014 Bayesian Data Analysis. Third Edition

Page 4: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Goals of Bayesian statistics

1. Set up a (probabilistic) model based on hypothesis ofinterest

2. Condition that model on observed data

3. Draw inferences, evaluate its fit and implicationsGelman et al. 2014 Bayesian Data Analysis. Third Edition

Page 5: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

In a Bayesian framework

We are always interested in knowing the posterior distribution

P (θ|D) ∝ P (D|θ) P (θ)

Page 6: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

What is a prior distribution P (θ)?

In simple terms it is our hypothesis

I It is subjective because it is an informed assumptionI We need clarify how it is set up (elicit priors)I We usually set our hypothesis via parameters that are

unknown and random

Page 7: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

What is a prior distribution P (θ)?

In simple terms it is our hypothesisI It is subjective because it is an informed assumption

I We need clarify how it is set up (elicit priors)I We usually set our hypothesis via parameters that are

unknown and random

Page 8: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

What is a prior distribution P (θ)?

In simple terms it is our hypothesisI It is subjective because it is an informed assumptionI We need clarify how it is set up (elicit priors)

I We usually set our hypothesis via parameters that areunknown and random

Page 9: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

What is a prior distribution P (θ)?

In simple terms it is our hypothesisI It is subjective because it is an informed assumptionI We need clarify how it is set up (elicit priors)I We usually set our hypothesis via parameters that are

unknown and random

Page 10: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Hypothesis: Red flowers evolve into purple and viceversa

θ = (qBR, qRB)

Page 11: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Hypothesis: Red flowers evolve into purple and viceversa

θ = (qBR, qRB)

Page 12: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

The prior distribution: P (θ)

By selecting a blue exponential faster than the red we areimplicitly saying that evolution from blue to red has

happened more frequently than red to blue

Page 13: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

The prior distribution: P (θ)

By selecting a blue exponential faster than the red we areimplicitly saying that evolution from blue to red has

happened more frequently than red to blue

Page 14: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

D is our dataWe go into our favorite herbarium, field site, or green houseand we collect color of multiple species

How do we integrate our model θ and our data D ?

Page 15: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood P (D|θ)

I Likelihood function: Probability of the sample given thehypothesis

I Is it a probability?

Yes for the sample. BUT NO! for theparameters

I In likelihood framework then the parameters are unknownbut fixed

I Implications: parameters do not have a probabilitydistribution, and it is more complicated to assess theiruncertainty

Page 16: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood P (D|θ)

I Likelihood function: Probability of the sample given thehypothesis

I Is it a probability?

Yes for the sample. BUT NO! for theparameters

I In likelihood framework then the parameters are unknownbut fixed

I Implications: parameters do not have a probabilitydistribution, and it is more complicated to assess theiruncertainty

Page 17: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood P (D|θ)

I Likelihood function: Probability of the sample given thehypothesis

I Is it a probability?

Yes for the sample. BUT NO! for theparameters

I In likelihood framework then the parameters are unknownbut fixed

I Implications: parameters do not have a probabilitydistribution, and it is more complicated to assess theiruncertainty

Page 18: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood P (D|θ)

I Likelihood function: Probability of the sample given thehypothesis

I Is it a probability? Yes for the sample. BUT NO! for theparameters

I In likelihood framework then the parameters are unknownbut fixed

I Implications: parameters do not have a probabilitydistribution, and it is more complicated to assess theiruncertainty

Page 19: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood P (D|θ)

I Likelihood function: Probability of the sample given thehypothesis

I Is it a probability? Yes for the sample. BUT NO! for theparameters

I In likelihood framework then the parameters are unknownbut fixed

I Implications: parameters do not have a probabilitydistribution, and it is more complicated to assess theiruncertainty

Page 20: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood is computationallychallenging

Thousands of optimizations to find maximum likelihoodestimates and confident intervals

Page 21: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Calculating the likelihood is computationallychallenging

Thousands of optimizations to find maximum likelihoodestimates and confident intervals

Page 22: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Where does the posterior come from?

Bayes theorem (conditional probabilities)

P (θ|D) =P (θ,D)

P (D)=P (D|θ)P (θ)

P (D)

Why do we ignore P (D) and put a symbol ∝?

P (θ|D) ∝ P (D|θ) P (θ)

Because P (D) is the probability of the sample and does notcontain information about θ

Page 23: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Where does the posterior come from?

Bayes theorem (conditional probabilities)

P (θ|D) =P (θ,D)

P (D)=P (D|θ)P (θ)

P (D)

Why do we ignore P (D) and put a symbol ∝?

P (θ|D) ∝ P (D|θ) P (θ)

Because P (D) is the probability of the sample and does notcontain information about θ

Page 24: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Where does the posterior come from?

Bayes theorem (conditional probabilities)

P (θ|D) =P (θ,D)

P (D)=P (D|θ)P (θ)

P (D)

Why do we ignore P (D) and put a symbol ∝?

P (θ|D) ∝ P (D|θ) P (θ)

Because P (D) is the probability of the sample and does notcontain information about θ

Page 25: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Making inferences with the posterior distribution

I The posterior distribution is probability.

I .Measures the uncertainty of the hypothesis after beingconfronted to data (update of my hypothesis)

I We need explore it thoroughly (MCMC quality).

Page 26: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Making inferences with the posterior distribution

I The posterior distribution is probability.I .Measures the uncertainty of the hypothesis after being

confronted to data (update of my hypothesis)

I We need explore it thoroughly (MCMC quality).

Page 27: A very small intro to Bayesian Statistics · Goals of Bayesian statistics 1.Set up a (probabilistic) model based on hypothesis of interest 2.Condition that model on observed data

Making inferences with the posterior distribution

I The posterior distribution is probability.I .Measures the uncertainty of the hypothesis after being

confronted to data (update of my hypothesis)I We need explore it thoroughly (MCMC quality).