a default bayesian hypothesis test for mediation

13
A default Bayesian hypothesis test for mediation Michèle B. Nuijten & Ruud Wetzels & Dora Matzke & Conor V. Dolan & Eric-Jan Wagenmakers # Psychonomic Society, Inc. 2014 Abstract In order to quantify the relationship between mul- tiple variables, researchers often carry out a mediation analy- sis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estima- tion and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301322, 2009), who outlined a Bayesian parameter estimation proce- dure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the JeffreysZellnerSiow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo tech- niques implemented in JAGS. All Bayesian tests are imple- mented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014). Keywords Bayes factor . Evidence . Mediated effects Mediated relationships are central to the theory and practice of psychology. In the prototypical scenario, a mediator ( M; e.g., knowledge of a healthy diet) transmits the effect from an inde- pendent variable (X; e.g., classroom instruction on a healthy diet) to a dependent variable (Y ; e.g., consumption of fruits and vegetables). Other examples arise in social psychology, where attitudes (X) cause intentions (M) and these intentions affect behavior (Y ; MacKinnon, Fairchild, & Fritz, 2007). To quantify such relationships between mediator, independent variable, and dependent variable, researchers often use a toolbox of popular statistical methods collectively known as mediation analysis. The currently available tools for mediation analyses are almost exclusively based on classical or frequentist statistics, featuring concepts such as confidence intervals and p values. Recently, Yuan and MacKinnon (2009) proposed a Bayesian mediation analysis that allows researchers to obtain a posterior distribution (and associated credible interval) for the mediated effect. This posterior distribution quantifies the uncertainty about the strength of the mediated effect under the assumption that the effect does not equal zero. This approach constitutes a valuable addition to the toolbox of mediation methods, but it specifically concerns parameter estimation and not hypothesis testing. As Yuan and MacKinnon stated in their conclusion: One important topic we have not covered in this article is hypothesis testing . . . Strict Bayesian hypothesis testing is based on Bayes factor, which is essentially the odds of the null hypothesis being true versus the alternative hypothesis being true, conditional on the observed data. The use of Bayesian hypothesis testing . . . would be a reasonable future research topic in Bayesian mediation analysis. Hence, the goal of this article is to add another statistical method to the toolbox of mediation analysisnamely, the Bayes factor hypothesis test alluded to by Yuan and MacKinnon (2009). In the development of this test, we have assumed a default specification of prior distributions based on the JeffreysZellnerSiow framework (Liang, Paulo, Molina, Electronic supplementary material The online version of this article (doi:10.3758/s13428-014-0470-2) contains supplementary material, which is available to authorized users. M. B. Nuijten Tilburg University, Tilburg, The Netherlands R. Wetzels : D. Matzke : E.<J. Wagenmakers (*) Department of Psychology, University of Amsterdam, Weesperplein 4, 1018 XA Amsterdam, The Netherlands e-mail: [email protected] C. V. Dolan VU University Amsterdam, Amsterdam, The Netherlands Behav Res DOI 10.3758/s13428-014-0470-2

Upload: eric-jan

Post on 25-Jan-2017

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: A default Bayesian hypothesis test for mediation

A default Bayesian hypothesis test for mediation

Michèle B. Nuijten & Ruud Wetzels & Dora Matzke &

Conor V. Dolan & Eric-Jan Wagenmakers

# Psychonomic Society, Inc. 2014

Abstract In order to quantify the relationship between mul-tiple variables, researchers often carry out a mediation analy-sis. In such an analysis, a mediator (e.g., knowledge of ahealthy diet) transmits the effect from an independent variable(e.g., classroom instruction on a healthy diet) to a dependentvariable (e.g., consumption of fruits and vegetables). Almostall mediation analyses in psychology use frequentist estima-tion and hypothesis-testing techniques. A recent exception isYuan and MacKinnon (Psychological Methods, 14, 301–322,2009), who outlined a Bayesian parameter estimation proce-dure for mediation analysis. Here we complete the Bayesianalternative to frequentist mediation analysis by specifying adefault Bayesian hypothesis test based on the Jeffreys–Zellner–Siow approach. We further extend this defaultBayesian test by allowing a comparison to directional orone-sided alternatives, using Markov chain Monte Carlo tech-niques implemented in JAGS. All Bayesian tests are imple-mented in the R package BayesMed (Nuijten, Wetzels,Matzke, Dolan, & Wagenmakers, 2014).

Keywords Bayes factor . Evidence .Mediated effects

Mediated relationships are central to the theory and practice ofpsychology. In the prototypical scenario, a mediator (M; e.g.,knowledge of a healthy diet) transmits the effect from an inde-pendent variable (X; e.g., classroom instruction on a healthy diet)to a dependent variable (Y; e.g., consumption of fruits andvegetables). Other examples arise in social psychology, whereattitudes (X) cause intentions (M) and these intentions affectbehavior (Y; MacKinnon, Fairchild, & Fritz, 2007). To quantifysuch relationships between mediator, independent variable, anddependent variable, researchers often use a toolbox of popularstatistical methods collectively known as mediation analysis.

The currently available tools for mediation analyses arealmost exclusively based on classical or frequentist statistics,featuring concepts such as confidence intervals and p values.Recently, Yuan and MacKinnon (2009) proposed a Bayesianmediation analysis that allows researchers to obtain a posteriordistribution (and associated credible interval) for the mediatedeffect. This posterior distribution quantifies the uncertaintyabout the strength of the mediated effect under the assumptionthat the effect does not equal zero. This approach constitutes avaluable addition to the toolbox of mediation methods, but itspecifically concerns parameter estimation and not hypothesistesting. As Yuan and MacKinnon stated in their conclusion:“One important topic we have not covered in this article ishypothesis testing . . . Strict Bayesian hypothesis testing isbased on Bayes factor, which is essentially the odds of the nullhypothesis being true versus the alternative hypothesis beingtrue, conditional on the observed data. The use of Bayesianhypothesis testing . . . would be a reasonable future researchtopic in Bayesian mediation analysis.”

Hence, the goal of this article is to add another statisticalmethod to the toolbox of mediation analysis—namely, theBayes factor hypothesis test alluded to by Yuan andMacKinnon (2009). In the development of this test, we haveassumed a default specification of prior distributions based onthe Jeffreys–Zellner–Siow framework (Liang, Paulo, Molina,

Electronic supplementary material The online version of this article(doi:10.3758/s13428-014-0470-2) contains supplementary material,which is available to authorized users.

M. B. NuijtenTilburg University, Tilburg, The Netherlands

R. Wetzels :D. Matzke : E.<J. Wagenmakers (*)Department of Psychology, University of Amsterdam, Weesperplein4, 1018 XA Amsterdam, The Netherlandse-mail: [email protected]

C. V. DolanVU University Amsterdam, Amsterdam, The Netherlands

Behav ResDOI 10.3758/s13428-014-0470-2

Page 2: A default Bayesian hypothesis test for mediation

Clyde, & Berger, 2008), promoted in psychology by JeffRouder, Richard Morey, and colleagues (Rouder & Morey,2012; Rouder, Morey, Speckman, & Province, 2012; Rouder,Speckman, Sun, Morey, & Iverson, 2009), as well as ourselves(Wetzels, Grasman, & Wagenmakers, 2012; Wetzels,Raaijmakers, Jakab, & Wagenmakers, 2009; Wetzels &Wagenmakers, 2012; for an alternative approach, see Semmens-Wheeler, Dienes, & Duka, 2013). In our opinion, the defaultspecification of prior distributions is useful because it provides areference analysis that can be carried out regardless of subjectiveconsiderations about the topic at hand. Of course, researchers whohave prior knowledge may wish to incorporate that knowledgeinto themodels to devise a more informative test (e.g., Armstrong& Dienes, 2013; Dienes, 2011; Guo, Li, Yang, & Dienes, 2013).Here, we focus solely on the default test as it pertains to theprototypical, single-level scenario of three variables.

The outline of this article is as follows. First, we brieflydiscuss the conventional frequentist tests and the existingBayesian mediation analysis proposed by Yuan andMacKinnon (2009). We then explain Bayesian hypothesis test-ing in general and introduce our default Bayesian hypothesistest for mediation.We illustrate the performance of our test witha simulation study and an example of a psychological study.Finally, we discuss software in which we implemented theBayesian methods for mediation analysis: the R packageBayesMed (Nuijten et al., 2014).

Frequentist mediation analysis

Consider a relation between an independent variable X and adependent variable Y (see Fig. 1a). In a linear regressionequation, such a relation can be represented as follows:

Y i ¼ β0 1ð Þ þ τX i þ ε 1ð Þ; ð1Þ

where subscript i identifies the participant, τ represents therelation between the independent variable X and the depen-dent variable Y, β0(1) is the intercept, and ε(1) is the residual.The effect of X on Y, path τ, is called the total effect.

The relation between X and Y can be mediated by variableM, which means that a change in X leads to a change in M,which then leads to a change in Y (see Fig. 1b, c). Theresulting mediationmodel can be represented by the followingset of linear regression equations:

Y i ¼ β0 2ð Þ þ τ 0X i þ βMi þ ε 2ð Þ; ð2Þ

Mi ¼ β0 3ð Þ þ αX i þ ε 3ð Þ; ð3Þwhere τ ′ represents the relation between X and Y afteradjusting for the effects of the mediator M, α represents therelation between X and M, and β represents the relationbetweenM and Y. Furthermore, ε(1), ε(2), and ε(3) are assumedto be conditionally normally distributed, independent,

homoskedastic residuals. Throughout the remainder of thisarticle, we focus on the standardized mediation model (i.e., amodel in which the variables are standardized) and refer to theregression coefficients α, β, and τ ′ as paths.

The product of α and β is the indirect effect, or the medi-ated effect, assuming that α and β are independent. Theremaining direct effect of X on Y is denoted with τ ′. If themediated effect differs from zero and τ ′ equals zero, the effectof X on Y is completely mediated byM (see Fig. 1c). If τ ′ has avalue other than zero, the relationship between X and Y is onlypartially mediated by M (see Fig. 1b).

A popular method to test for mediation is to test pathsα and β

simultaneously. The estimated indirect effect bαbβ is divided by itsstandard error, and the resulting Z statistic is compared with thestandard normal distribution to assess whether the effect is sig-nificantly different from zero—in which case, the null hypothesisof no mediation can be rejected.

There exist several ways to calculate the standard error ofbαbβ , but the one used in the Sobel test (Sobel, 1982) iscommonly reported:

bσbαbβ ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffibβ2bσ2

α þ bα2bσ2

β

r; ð4Þ

where bα and bβ are the point estimates of the regressioncoefficients of the mediated effect and bσa and bσβ theirstandard errors. The 95% confidence interval for the mediatedeffect is then given by bαbβ � 1:96� bσbαbβ .

One problem with the Sobel test is that it assumes asymmetrical sampling distribution for the mediated effect,whereas in reality this distribution is skewed (MacKinnon,Lockwood, & Hoffman, 1998). Consequently, the Sobel testhas relatively low power (MacKinnon, Warsi, & Dwyer,1995). A solution to this problem is to construct a confidenceinterval that takes the asymmetry of the distribution intoaccount (see, e.g., the product method of MacKinnon,Lockwood, Hoffman, & West, 2002) or the profile likelihoodmethod (see Venzon & Moolgavkar, 1988).

Our goal here is not to argue against frequentist statistics ingeneral, or p values in particular; for this, we refer the inter-ested reader to the following articles and references therein:Berger and Delampady (1987), Berger and Wolpert (1988),Dienes (2011), Edwards, Lindman, and Savage (1963),O’Hagan and Forster (2004), Rouder et al. (2012), Sellke,Bayarri, and Berger (2001), Wagenmakers (2007); Wetzelset al., (2011). Instead, our goal is to outline an additionalBayesian tool that can be used for mediation analysis. Theavailability of multiple tools is useful, not just because differ-ent situations may require different tools, but also becausethey allow a robustness check; if different tools yield opposingconclusions, the careful researcher does well to report theresults from both tests, indicating that the data are ambiguous

Behav Res

Page 3: A default Bayesian hypothesis test for mediation

in the sense that the conclusion depends on the analysismethod at hand.

An alternative: Bayesian estimation

Our end goal is to propose a Bayesian alternative for thefrequentist mediation test. Below, we consider the Bayesiantreatment of the mediation model in detail, but first we brieflydiscuss Bayesian inference in general terms. In the Bayesianframework, uncertainty is quantified by probability. Priorbeliefs about parameters are formalized by prior probabilitydistributions that are updated by the observed data to result inposterior beliefs or posterior distributions (Dienes, 2008;Kruschke, 2010; Lee & Wagenmakers, 2013; O’Hagan &Forster, 2004).

The Bayesian updating process proceeds as follows. First,before observing the data under consideration, the Bayesianstatistician assigns a probability distribution to one or moremodel parameters θ on the basis of his or her prior knowledge;hence, this distribution is known as the prior probabilitydistribution, or simply the prior, denoted p(θ). Next oneobserves data D, and the statistical model can be used tocalculate the associated probability of D occurring underspecific values of θ, a quantity known as the likelihood,denoted p(D|θ). The prior distribution p(θ) is then updated tothe posterior distribution p(θ|D) according to Bayes’ rule:

p θ Djð Þ ¼ p D θjð Þp θð Þp Dð Þ : ð5Þ

Note that the marginal likelihood p(D)= ∫p(D|θ)p(θ) dθfunctions as a normalizing constant that ensures that theposterior distribution will integrate to one. Because the nor-malizing constant does not contain θ, it is not important forparameter estimation, and Eq. 5 is often written as follows:

p θ Djð Þ∝p D θjð Þp θð Þ; ð6Þor in words:

Posterior Distribution∝Likelihood� Prior Distribution; ð7Þ

where ∝ means proportional to.In a Bayesian mediation analysis, the above updating prin-

ciple can be used to transition from prior to posterior

distributions for parameters α, β, and τ ′, as proposed byYuan and MacKinnon (2009). Their method allows the userto determine the posterior distribution of the indirect effectαβ, together with a 95 % credible interval. This interval hasthe intuitive interpretation that we can be 95 % confident thatthe true value of αβ resides within this interval.

The approach of Yuan and MacKinnon (2009) is appropri-ate when estimating the size of the mediated effect. However,in experimental psychology, the research question is oftenframed in terms of model selection or hypothesis testing; thatis, the researcher seeks to answer the question: “does the effectexist?” Parameter estimation and model selection have differ-ent aims and, depending on the situation at hand, one proce-dure may be more appropriate than the other. We contend thatthere are situations where a hypothesis test is scientificallyuseful (e.g., Iverson, Wagenmakers, & Lee, 2010; Rouderet al., 2009), and in what follows, we proceed to outline adefault Bayesian hypothesis test for mediation. In order tokeep this article self-contained, we will first introduce theprinciples of Bayesian hypothesis testing (Hoijtink, Klugkist,& Boelen, 2008; Myung & Pitt, 1997; Vandekerckhove,Matzke, & Wagenmakers, in press; Wagenmakers, Lodewyckx,Kuriyal, & Grasman, 2010).

Bayesian hypothesis testing

A Bayesian hypothesis test is a model selection procedurewith two models or hypotheses. Assume two competingmodels or hypotheses,ℳ0 andℳ1, with respective a prioriplausibility p(ℳ0) and p(ℳ1)=1−p(ℳ0). Differences inprior plausibility are often subjective but can be used toformalize the idea that extraordinary claims require extraordi-nary evidence (Lee & Wagenmakers, 2013, Chap. 7). Theratio p(ℳ1)/p(ℳ0) is known as the prior model odds. Thedata update the prior model odds to arrive at the posteriormodel odds, p(ℳ1|D)/p(ℳ0|D), as follows:

p ℳ1 Djð Þp ℳ0 Djð Þ ¼

p D ℳ1jð Þp D ℳ0jð Þ

p ℳ1ð Þp ℳ0ð Þ: ð8Þ

or in words:

Posterior Model Odds ¼ Bayes Factor � Prior Model Odds:

ð9Þ

a b c

Fig. 1 Diagram of the standardmediation model. a Directrelation between X and Y, panel. bPartial mediation. c Fullmediation. Diagonal arrowsindicate that the graphical node isperturbed by an error term

Behav Res

Page 4: A default Bayesian hypothesis test for mediation

Equation 8 shows that the change in model odds broughtabout by the data is given by the so-called Bayes factor(Jeffreys, 1961) which is the ratio of marginal likelihoods(i.e., normalizing constants in Eq. 5):

BF10 ¼ p D ℳ1jð Þp D

���ℳ0

� �: ð10Þ

The Bayes factor quantifies the weight of evidence forℳ1

versus ℳ0 that is provided by the data and, as such, itrepresents “the standard Bayesian solution to the hypothesistesting and model selection problems” (Lewis & Raffery,1997, p. 648) and “the primary tool used in Bayesian inferencefor hypothesis testing and model selection” (Berger, 2006, p.378).

When BF10>1, this indicates that the data are more likelyunderℳ1, and when BF10<1, this indicates that the data aremore likely under ℳ0. For example, when BF10=0.08, theobserved data are 12.5 times more likely under ℳ0 thanunder ℳ1 (i.e., BF01=1/BF10=1/.08=12.5). Note that theBayes factor allows researchers to quantify evidence in favorof the null hypothesis.

Even though the default Bayes factor has an unambiguousand continuous scale, it is sometimes useful to summarize theBayes factor in terms of discrete categories of evidentialstrength. Jeffreys (1961, Appendix B) proposed the classifi-cation scheme shown in Table 1. We replaced the labels“worth no more than a bare mention” with “anecdotal,” “de-cisive” with “extreme,” and “substantial” with “moderate.”These labels facilitate scientific communication but should beconsidered only as an approximate descriptive articulation ofdifferent standards of evidence.

Under equal prior odds, Bayes factors can be converted toposterior probabilities p(ℳ1|D)=BF10/(BF10+1). Thismeans that, for example, BF10=2 translates to p(ℳ1|D)=2/3.

Bayesian hypothesis test for mediation

The Bayesian hypothesis test for mediation contrasts the fol-lowing two models:

ℳ0 : αβ ¼ 0;ℳ1 : αβ≠0 :

ð11ÞObserve that ℳ1 entails that both α≠0 and β≠0, so that

BF10 can be obtained by combining the evidence for thepresence of the two paths. Furthermore, note that in thestandardized model, path α equals the correlation rXM andpath β equals the partial correlation rMY|X. This means thatwe can use the existing default Bayesian hypothesis tests forcorrelation and partial correlation (Wetzels & Wagenmakers,2012) and combine the evidence for the presence of theseparate paths to yield the overall Bayes factor for mediation.

The default JZS prior

The construction of good default priors is an active area ofresearch in Bayesian statistics (e.g., Consonni, Forster, & LaRocca, 2013; Overstall & Forster, 2010). Most work in thisarea has been done in the context of linear regression. It istherefore advantageous to formulate the tests for correlationand partial correlation in terms of linear regression, so thatexisting developments for the selection of default priors canbe brought to bear.

A popular default prior for linear regression is Zellner’s gprior, which includes a normal distribution on the regressioncoefficients α, Jeffreys’s (1961) prior on the precision ϕ (i.e.,a prior that is invariant under transformation), and a uniformprior on the intercept β0:

p α ϕ; g;Xjð Þ∼N 0;g

ϕXTX� �−1�

;

p ϕð Þ∝ 1

ϕ;

p β0ð Þ∝1;

ð12Þ

where X denotes the matrix of predictor variables and theprecision ϕ is the inverse of the variance. The coefficient gis a scaling factor and controls the weight of the prior relativeto the weight of the data. For example, if g=1, the prior hasexactly as much weight as the data, and if g=10, the prior hasone tenth of the weight of the data. A popular default choice isg=n, the unit information prior, where the prior has as muchinfluence as a single observation (Kass & Wasserman, 1995)and the behavior of the test becomes similar to that of BIC(Schwarz, 1978).

However, Liang et al. (2008) showed that the above spec-ification yields a bound on the Bayes factor, even when thereis overwhelming information supportingℳ1. This “informa-tion paradox” can be overcome by assigning the regression

Table 1 Evidence categories for the Bayes factor BF10 (Jeffreys, 1961)

Bayes Factor BF10 Interpretation

>100 Extreme evidence forℳ1

30–100 Very Strong evidence forℳ1

10–30 Strong evidence for ℳ1

3–10 Moderate evidence for ℳ1

1–3 Anecdotal evidence for ℳ1

1 No evidence

1/3–1 Anecdotal evidence for ℳ0

1/10–1/3 Moderate evidence for ℳ0

1/30–1/10 Strong evidence for ℳ0

1/100–1/30 Very Strong evidence forℳ0

<1/100 Extreme evidence forℳ0

Note. We replaced the labels “not worth more than a bare mention” with“anecdotal,” “decisive” with “extreme,” and “substantial” with“moderate.”

Behav Res

Page 5: A default Bayesian hypothesis test for mediation

coefficients a Cauchy prior instead of a normal prior (Zellner& Siow, 1980). Equivalently, this can be accomplished byassigning g from Eq. 12 an Inverse-Gamma(1/2,n/2) prior:

p α ϕ; g;Xjð Þ∼N 0;g

ϕXTX� �−1�

;

p gð Þ ¼ n=2ð Þ1=2Γ 1=2ð Þ g −3=2ð Þe−n= 2gð Þ;

p ϕð Þ∝ 1

ϕ:

ð13Þ

The above specification is known as the Jeffreys–Zellner–Siow, or JZS, prior. The JZS prior was adopted byWetzels andWagenmakers (2012) for the default tests of correlation andpartial correlation, and the same tests are used here to computethe Bayes factor for mediation. It should be stressed, however,that the framework is general and allows researchers to addsubstantive knowledge about the topic under study by chang-ing the prior distributions (e.g., Armstrong & Dienes, 2013;Dienes, 2011; Guo et al., 2013).

With the JZS tests for correlation and partial correlation inhand, we created the default Bayesian hypothesis test formediation in three steps, as described in the next paragraphs.

Step 1: evidence for path α

The first step in the hypothesis test for mediation is to establishthe Bayes factor for a correlation between X and M, path α(see Fig. 1). This test can be formulated as a comparisonbetween two linear models:

ℳ0 : M ¼ β0 þ ε;ℳ1 : M ¼ β0 þ αX þ ε;

ð14Þ

where ε is the normally distributed error term. The default JZSBayes factor quantifies the extent to which the data supportℳ1 with path α versus ℳ0 without path α, as follows(Wetzels & Wagenmakers, 2012):

BF10 ¼ BFα

¼ P D ℳ1jð ÞP D ℳ0jð Þ

¼ n=2ð Þ1=2Γ 1=2ð Þ �

Z ∞

01þ gð Þ n−2ð Þ=2� 1þ 1−r2

� �g

�− n−1ð Þ=2g −3=2ð Þe−n= 2gð Þ dg;

ð15Þwhere n is the number of observations and r is the samplecorrelation.

For the proposed mediation test, we have to multiply theposterior probabilities of paths α and β, since both indepen-dent paths need to be present for mediation to hold. Hence, weneed to convert the Bayes factor for path α to a posteriorprobability. Under the assumption of equal prior odds, thisconversion is straightforward:

p α≠0 Djð Þ ¼ BFα

BFα þ 1: ð16Þ

Step 2: evidence for path β

The second step in the hypothesis test for mediation is toestablish the Bayes factor for a unique correlation betweenM and Y (without any influence from X), path β (see Fig. 1).Again, this test can be formulated as a comparison betweentwo linear models:

ℳ0 : Y ¼ β0 þ τX þ ε;ℳ1 : Y ¼ β0 þ τ 0X þ βM þ ε;

ð17Þ

where ε is the normally distributed error term. The default JZSBayes factor quantifies the extent to which the data supportℳ1 with path β versus ℳ0 without path β, as in a test forpartial correlation (Wetzels & Wagenmakers, 2012):

BF10 ¼ BFβ

¼ P D ℳ1jð ÞP D

���ℳ0

� �

¼

Z ∞

01þ gð Þ n−1−p1ð Þ=2 � 1þ 1−r21

� �g

�− n−1ð Þ=2g −3=2ð Þe−n= 2gð Þ dgZ ∞

01þ gð Þ n−1−p0ð Þ=2 � 1þ 1−r20

� �g

�− n−1ð Þ=2g −3=2ð Þe−n= 2gð Þ dg

;

ð18Þwhere n is the number of observations, r1

2 and r02 represent the

explained variance of ℳ1 and ℳ0, respectively, and p1=2and p0=1 are the number of regression coefficients or paths inℳ1 and ℳ0, respectively.

As before, we can convert the Bayes factor for β to aposterior probability under the assumption of equal prior odds:

p β≠0 Djð Þ ¼ BFβ

BFβ þ 1: ð19Þ

Step 3: evidence for mediation

The third step in the hypothesis test for mediation is tomultiply the evidence for α with the evidence for β to obtainthe overall evidence for mediation:

Evidence for Mediation ¼ p α≠0 Djð Þ � p β≠0 Djð Þ: ð20Þ

The resulting evidence for mediation is a posterior proba-bility that ranges from zero when there is no evidence formediation at all to one when there is absolute certainty thatmediation is present. We can also express the evidence formediation as a Bayes factor through a simple transformation:

BFmed ¼ Evidence for Mediation

1−Evidence for Mediation; ð21Þ

where a BFmed>1 indicates evidence for mediation and BFmed<1 indicates evidence against mediation.

Note that we can multiply the posterior probabilities, be-cause the estimates of path α and β are uncorrelated. This canbe demonstrated by inspecting the relevant element of the

Behav Res

Page 6: A default Bayesian hypothesis test for mediation

inverse of the information matrix—that is, the matrix ofsecond order derivatives of the parameters α and β, withrespect to the log likelihood function. This can be donenumerically, since most SEM programs supply this matrix,and it can be done analytically. These results can be found inthe supplemental materials.

Testing for full or partial mediation

An optional fourth step in the hypothesis test for medi-ation is to assess the evidence for full versus partialmediation. The relation between X and Y is fully medi-ated by M when αβ differs from zero and the directpath between X and Y, path τ ′ is zero. The evidence forτ ′ can be assessed with the JZS test for partial correla-tion as we did for path β (see Eq. 18). Note, however,that the specification of the null model has changed:

ℳ0 : Y ¼ β0 þ βM þ ε;ℳ1 : Y ¼ β0 þ τ 0X þ βM þ ε:

ð22Þ

With this model specification, the default JZS Bayes factorquantifies the extent to which the data supportℳ1 with pathτ ′ versus ℳ0 without path τ ′.

As before, the resulting JZS Bayes factor for τ ′ can beconverted to a posterior probability:

p τ 0≠0 Djð Þ ¼ BFτ 0

BFτ 0 þ 1: ð23Þ

Together, the Bayes factor for τ ′ and the Bayes factorfor mediation indicate whether mediation is full or partial:If the Bayes factor for mediation is substantially largerthan one and the Bayes factor for τ ′ is substantiallysmaller than one, there is evidence for full mediation.On the other hand, if the Bayes factor for mediation issubstantially larger than one and the Bayes factor for τ ′ issubstantially greater than one, there is evidence for partialmediation.

Simulation study

In order to provide an indication of how the mediation testperforms, we designed a simulation study. The goal of thesimulation study was to confirm that the Bayes factor drawsthe correct conclusion: When mediation is present, we expectBFmed to be higher than 1; when mediation is absent, weexpect BFmed to be lower than 1.

Creating the data sets

We assessed performance of the test in different scenarios. Theparameters α and β could take the values 0, .30, and .70; τ ′

was fixed to zero. We did not vary τ ′ since it has no influenceon the Bayes factor for mediation, which only concerns theeffect αβ. Furthermore, we chose four sample sizes: N=20,40,80, and 160. The 3×3 parameter values combined withthe four sample sizes resulted in 36 different scenarios. Foreach scenario, we created the corresponding covariancematrix of X, Y, and M, all with a variance of one. Thisstandardization has no bearing on the results, since they arescale free. We then used the covariance matrix to generatefor each scenario N multivariate normally distributed valuesfor X, M, and Y.1

Results

Figure 2 shows the natural logarithm of the Bayes factors formediation in the different scenarios. The different shades ofgray of the panels show the strength of the mediation thatgoverned the generated data: the darker the gray, the strongerthe mediation. In the scenarios in which there was no media-tion (α=0 and/or β=0), the Bayes factors indicated moderateto very strong evidence for the null model, depending on thesample size. In the scenario of strong mediation (α=.7 andβ=.7), the Bayes factors quickly increase from anecdotalevidence (N=20) to moderate evidence (N=40) and furtheron to very strong and extreme evidence for mediation. In thescenarios of moderate mediation (α=.7 and β=.3 and viceversa), the Bayes factors start to indicate evidence for media-tion from sample sizes of around 60. In the scenario of weakmediation (α=.3 and β=.3), the mediation is too weak for theproposed test to detect it with small sample sizes. In thosescenarios, the test starts to indicate evidence for mediationonly from a sample size of around 80 onward. In summary, theproposed test can distinguish between no mediation and me-diation, provided that effect size and sample size are suffi-ciently large.

Discussion

The results from the simulation study confirm that the JZSBayesian hypothesis test for mediation performs as advertised:When mediation is absent, the test indicates moderate tostrong evidence against mediation, and when mediation ispresent, the test indicates evidence for mediation, providedthat effect size and sample size are sufficiently large. As wasexpected, the evidence for mediation increases with effect sizeand with sample size.

Even though the default test performs well in a qualitativesense, it has one shortcoming that remains to be addressed:

1 We generated data that covaried exactly according to the input covari-ance matrix. Because the covariances of the data were equal to thecovariances of the population, there was no need to control for randomsampling, and we simulated only one experiment per scenario. The fullsimulation code is available in the supplemental materials.

Behav Res

Page 7: A default Bayesian hypothesis test for mediation

With the proposed method, it is not possible to perform aone-sided test. This is regrettable because, in many situa-tions, the researcher has a clear idea on the direction of thepossible paths α, β, and τ ′. In order to perform a one-sidedBayesian hypothesis test, the prior needs to be restrictedsuch that it assigns mass to only positive (or negative)values. This is not possible in the mediation test as outlinedabove.

Extension to one-sided tests

As was mentioned above, our default prior on a regressioncoefficient is a Cauchy(0,1) distribution. This prior instanti-ates a two-sided test, since it represents the belief that theeffect is just as likely to be positive as negative. In manysituations, however, researchers have strong prior ideasabout the direction of the effect (Hoijtink et al., 2008). Inthe Bayesian framework, such prior ideas are directlyreflected in the prior distribution. More specifically, supposethat we expect path α to be greater than zero and we seek atest of this order-restricted hypothesis against the null

hypothesis that α is zero. For this we consider the followingthree hypotheses:

ℳ0 : α ¼ 0;ℳ1 : α∼Cauchy 0; 1ð Þ;ℳ2 : α∼Cauchyþ 0; 1ð Þ;where Cauchy+(0,1) indicates that α can take values only onthe positive side of the Cauchy(0,1) distribution (i.e., it is afolded Cauchy distribution).

The test of interest features the comparison between theone-sided hypothesis ℳ2 versus the null hypothesis ℳ0;that is, we seek the Bayes factor BF20. This Bayes factor canbe derived in many ways—for instance, using relativelystraightforward techniques such as the Savage-Dickey densityratio (Dickey & Lientz, 1970; Wagenmakers et al., 2010;Wetzels, Grasman, & Wagenmakers, 2010) or relatively intri-cate techniques such as the reversible jump MCMC (Green,1995). Here, we apply a different method that is possibly themost reliable and the least computationally expensive (Morey& Wagenmakers, 2014; Pericchi, Liu, & Torres, 2008). Thismethod takes advantage of the fact that we can easily calculatethe two-sided Bayes factor, BF10. With this Bayes factor in

−4

−2

0

2

4

α = 0, β = 0

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0.3, β = 0

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0.7, β = 0

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0, β = 0.3

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0.3, β = 0.3

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0.7, β = 0.3

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0, β = 0.7

20 40 80 160N

ln(B

Fm

ed)

−4

−2

0

2

4

α = 0.3, β = 0.7

20 40 80 160N

ln(B

Fm

ed)

−5

0

5

10

15

20

25

30

α = 0.7, β = 0.7

20 40 80 160N

ln(B

Fm

ed)

Fig. 2 Performance of the defaultJZS Bayesian hypothesis test formediation in different scenarios.Each panel shows the naturallogarithm of the Bayes factor formediation for different values ofα and β and different samplesizes. The white panelscorrespond to scenarios in whichthere is no mediation; the graypanels to scenarios in which thereis mediation. The darker thepanel, the stronger the mediationthat is present. The horizontaldotted line at zero indicates theboundary that separates evidencefor the null model (below the line)and evidence for the mediationmodel (above the line). Note thatthe scaling in the scenario ofstrong mediation is different fromthe other scenarios to give a moreadequate overview of the results

Behav Res

Page 8: A default Bayesian hypothesis test for mediation

hand, we only need to apply a simple correction to derive thedesired one-sided Bayes factor BF10. Specifically, note thatthe Bayes factor is transitive:

BF20 ¼ BF21 � BF10; ð24Þwhich is immediately apparent from its expanded form

p D ℳ2jð Þp D ℳ0jð Þ ¼

p D ℳ2jð Þp D ℳ1jð Þ �

p D ℳ1jð Þp D ℳ0jð Þ: ð25Þ

Thus, the desired one-sided test on α requires only BF21

and BF10. We already have access to BF10, and this leaves thecalculation of BF21—that is, the Bayes factor in favor of theorder-restricted model ℳ2 over the unrestricted model ℳ1.As was shown by Klugkist, Laudy, and Hoijtink (2005), thisBayes factor equals the ratio of two probabilities that can beeasily obtained: The first is the posterior probability that α>0,under the unrestricted model ℳ1; the second is the priorprobability that α>0, again under the unrestricted modelℳ1. Formally,

BF21 ¼ p α > 0 ℳ1;Djð Þp α > 0 ℳ1jð Þ : ð26Þ

Since the prior distribution is symmetric around zero, thedenominator equals .5, and Eq. 26 can be further simplified to

BF21 ¼ 2⋅p α > 0 ℳ1;Djð Þ ð27ÞOne straightforward way to determine p(α>0|ℳ1,D) is

(1) to use a generic program for Bayesian inference such asWinBUGS, JAGS, or Stan; (2) implementℳ1 in the programand collect Markov chain Monte Carlo (MCMC) samplesfrom the posterior distribution of α; (3) approximate p(α>0|ℳ1,D) by the proportion of posterior MCMC samples forα that are greater than zero.2

In our implementation of the one-sided mediation tests, wemake use of Eqs. 24 and 27. In order to obtain BF21, weimplemented the unrestricted models in JAGS (Plummer,2009). We confirmed the correctness of our JAGS implementa-tion by comparing the analytical results for the two-sided Bayesfactor BF10 against the Savage-Dickey density ratio results basedon theMCMCsamples from JAGS (seeAppendix 2). The JAGScode itself is provided in Appendix 1, since it allows researchersto adjust the prior distributions if they so desire. Finally, note thatour one-sided mediation test can incorporate order-restriction onany of the paths simultaneously.

Example: the firefighter data

To illustrate the workings of the various mediation tests, wewill apply them to the same example data Yuan and

MacKinnon (2009) used, concerning the PHLAME firefighterstudy (Elliot et al., 2007). In this study, it was investigatedwhether the effect of a randomized exposure to one of threeinterventions (X) on the reported eating of fruits and vegeta-bles (Y) was mediated by knowledge of the benefits of eatingfruits and vegetables (M; see Eqs. 1, 2, and 3). The interven-tions were either a “team-centered peer-led curriculum” or“individual counseling using motivational interviewing tech-niques,” both to promote a healthy lifestyle, or a controlcondition. The correlation matrix of the data is shown inTable 2.

The conventional approach: the frequentist product method

Yuan and MacKinnon (2009) first reported the results of theconventional frequentist product method mediation analysis.This method tests whether the indirect effect αβ differs sig-nificantly from zero. The estimate for αβ was .056 with astandard error of .026 (estimated with the Sobel method;Sobel, 1982), with the 95 % confidence interval (.013,.116)(see Table 3) (MacKinnon, Lockwood, &Williams, 2004; theinterval takes into account thatαβ is not normally distributed).Since the 95 % confidence interval does not include zero,frequentist custom suggests that the test provides evidencethat the effect of X on Y is mediated by M.

The Yuan and MacKinnon (2009) approach: Bayesianparameter estimation

Next, Yuan and MacKinnon (2009) reported the results of theirBayesian mediation analysis, which is based on parameterestimation with noninformative priors. The mean of the poste-rior distribution of αβ was .056 with a standard error of .027.The 95% credible interval forαβwas (.011,.118) (see Table 3).These Bayesian estimates are numerically consistent with thefrequentist results. It should be stressed, however, that the 95%credible interval does not allow a test. As summarized byBerger (2006), “Bayesians cannot test precise hypotheses usingconfidence intervals. In classical statistics one frequently seestesting done by forming a confidence region for the parameter,and then rejecting a null value of the parameter if it does not liein the confidence region. This is simply wrong if done in aBayesian formulation (and if the null value of the parameter isbelievable as a hypothesis)” (p. 383; see also Lindley, 1957;Wagenmakers & Grunwald, 2006).

2 The approximation can be made arbitrarily close by increasing thenumber of MCMC samples.

Table 2 Correlationmatrix of the PHLAMEfirefighter data (N=354)

X Y M

X 1.00 .08 .18

Y .08 1.00 .16

M .18 .16 1.00

Behav Res

Page 9: A default Bayesian hypothesis test for mediation

The Bayes factor approach: the default Bayesian hypothesistest

We will now consider the results of the proposed Bayesianhypothesis test with the default JZS prior setup. First, weestimated the posterior distribution of αβ, using the methodof Yuan and MacKinnon (2009), but now with the JZS priorinstead of a noninformative prior. The resulting posterior dis-tribution had a mean of .056 and a 95 % credible interval of(.012,.116) (see Table 3). This is consistent with the results ofboth the frequentist test and the Bayesian mediation estimationroutine of Yuan and Mackinnon. As was expected, the choiceof the JZS prior setup over a noninformative prior setup doesnot much influence the results in terms of parameter estimation.

The advantage of the JZS prior specification is that we canalso formally test whether the effect differs from zero. Ouranalytical test indicates that the Bayes factor for path α is10.06, which corresponds to a posterior probability of 10.06/(10.06+1)=.91. The Bayes factor for path β is 2.68, whichcorresponds to a posterior probability of 2.68/(2.68+1)=.73. Ifwe multiply these posterior probabilities, we obtain the poste-rior probability for mediation: .91×.73=.66.This posterior prob-ability is easily converted to a Bayes factor: .66/(1−.66)=1.94.Hence, the data are about twice as likely under the model withmediation than under the model without mediation. In terms ofJeffreys’s evidence categories, this evidence is anecdotal or “notworth more than a bare mention.”

It is also possible to include an order-restriction in the medi-ation model at hand. According to the theory, we expect apositive relation between the mediator “knowledge of the ben-efits of eating fruits and vegetables” and the dependent variable“the reported eating of fruits and vegetables,” or in other words,we expect path β to be greater than zero. If we implement thisorder-restriction, our test indicates a newBayes factor for path βof 5.33, with a corresponding posterior probability of 5.33/(5.33+1)=0.84. If we multiply the posterior probability of αwith the new posterior probability of β, we obtain the newposterior probability of mediation: .91×.84=.76, with a corre-sponding Bayes factor for mediation of .76/(1−.76)=3.17. Withthe imposed order restriction, the observed data are now aboutthree times as likely under the mediation model than under themodel without mediation, which according to the Jeffreys’evidence categories constitutes evidence for mediation on theborder between “anecdotal” and “moderate.”

R package: BayesMed

In order to make our default Bayesian hypothesis tests avail-able, we built the R package BayesMed (Nuijten et al., 2014).R is a free software environment for statistical computing andgraphics (R Core Team, 2012), which makes it a good plat-form for our test.

BayesMed includes both the basic test for mediation(jzs_med) and the accompanying tests for correlation (jzs_cor)and partial correlation (jzs_partcor), as well as the associatedSavage-Dickey density ratio versions (jzs_medSD,jzs_corSD, and jzs_partcorSD, respectively). Furthermore,we added the possibility of estimating the indirect effect αβ,based on the procedure outlined in Yuan and MacKinnon(2009), but with a JZS prior setup. Finally, we also includedthe firefighter data. The use of the tests and their options aredescribed in the help files within the package.

Concluding comments

We have outlined a default Bayesian hypothesis test for me-diation and presented an R package that allows it to be appliedeasily. This default test complements the earlier work by YuanandMacKinnon (2009) on Bayesian estimation for mediation.In addition, we have extended the default tests by allowingmore informative, one-sided alternatives to be tested as well.Nevertheless, our test constitutes only a first step.

A next step could be to extend the test to multiple mediatormodels. This should be relatively straightforward: The media-tion model (Eqs. 2 and 3) needs to be changed to allowmultiplemediators. Next, the presence of each path can still be assessedin the same way by calculating the Bayes factor for each path(see Steps 1 and 2 above) and combining the separate Bayesfactors into an overall Bayes factor for mediation.

Another extension could be to add a scaling factor to theJZS prior to adjust the spread of the prior distribution.3 At themoment, the prior includes a Cauchy(0,r=1), but a smaller orlarger r would make the prior smaller or wider, respectively.

Other avenues for further development include, but are notlimited to, the following: (1) integrate the estimation andtesting approaches by using the estimation outcomes fromearlier work as a prior for the later test (Verhagen &Wagenmakers, in press); (2) explore methods to incorporatesubstantive prior knowledge (e.g., Dienes, 2011); (3) extendthe test to interval null hypotheses—that is, null hypothesesthat are not defined by a point mass at zero but, instead, by apractically meaningful area around zero (Morey & Rouder,2011); and (4) generalize the methodology to more complexmodels such as hierarchical models or mixture models.

3 We thank an anonymous reviewer for pointing this out to us.

Table 3 Three estimates of the mediated effect bαbβ for the PHLAMEfirefighter data set with associated 95 % confidence/credible intervals

bαbβ CI95%

Frequentist product method .056 (.013, .116)

Yuan & MacKinnon (2009) .056 (.011, .118)

Default Bayesian hypothesis test .056 (.012, .116)

Behav Res

Page 10: A default Bayesian hypothesis test for mediation

As for all Bayesian hypothesis tests that are based on Bayesfactors, users need to realize that the test depends on the spec-ification of the alternative hypothesis. In general, it is a goodidea to conduct a sensitivity analysis and examine the extent towhich the outcomes are qualitatively robust to alternative plau-sible prior specifications (e.g., Wagenmakers, Wetzels,Borsboom, & van der Maas, 2011). Such sensitivity analysesare facilitated by our JAGS code presented in Appendix 1.

In sum, we have provided a default Bayesian hypoth-esis test for mediation. This test allows users to quantifystatistical evidence in favor of both the null hypothesis(i.e., no mediation) and the alternative hypothesis (i.e.,

full or partial mediation). The test also allows informa-tive hypotheses to be tested in the form of order-restrictions. Several extensions of the methodology arepossible and await future implementation.

Acknowledgements This research was supported by an ERC grantfrom the European Research Council. Conor V. Dolan is supported bythe European Research Council (Genetics of Mental Illness; grant num-ber: ERC–230374). Ruud Wetzels is supported by the Dutch nationalprogram COMMIT.

AppendixesAppendix 1. JAGS code

JAGS code for correlation

Behav Res

Page 11: A default Bayesian hypothesis test for mediation

JAGS code for partial correlation

Appendix 2. Testing the correctness of our JAGSimplementation

To assess the correctness of our JAGS implementation, wecompared the analytical results for the two-sided Bayes factoragainst the Savage-Dickey density ratio results based on theMCMC samples from JAGS. The distribution that fit theposterior samples best4 is the nonstandardized t-distributionwith the following density:

p x ν;μ;σjð Þ ¼Γ

ν þ 1

2

� Γ

ν2

� � ffiffiffiffiffiffiffiffiffiffiffiffiπνσð Þp 1þ 1

νx−μσ

� �2� − νþ1

2

; ð28Þ

with ν degrees of freedom, location parameter μ, and scaleparameter σ. With the samples of the parameter of interest, wecan estimate ν, μ, and σ and, thus, the exact shape of thedistribution and the exact height of the distribution at the pointof interest.

We checked the fit of this distribution and the performanceof the SD method in a small simulation study. We consideredthe following sample sizes: N = 20, 40, 80, or 160. Wesimulated correlational data by drawing N values for X froma standard normal distribution, and conditional on X, wesimulated values for Y according to the following equation:

4 We compared the fit of four distributions: a nonstandardized t-distribution, anormal distribution, a nonparametric distribution estimated with the splineinterpolation function splinefun in R, and a nonparametric distribution esti-mated with the R function logspline that also uses splines to estimate the logdensity. All four distributions fitted reasonably well: The Bayes factors of theanalytical test and the SD method are similar with all different posteriordistributions. All four distributions are therefore included in the R packageBayesMed and can be used when applying the SD method.

Behav Res

Page 12: A default Bayesian hypothesis test for mediation

Y i ¼ β0 þ τX i þ ε; ð29Þwhere the subscript i denotes subject i and τ represents therelation between X and Y. For each of the four sample sizes, wegenerated 100 data sets, in each of which τ was drawn from astandard uniform distribution.

Next, we tested the correlation in each data set with boththe analytical Bayesian correlation test and the SD methodwith the nonstandardized t-distribution and compared theresults. The results are shown in Fig. 3. The figure shows thatthe proposed SD method performs well: The Bayes factors ofthe analytical test and the SDmethod are similar for all samplesizes and correlations.

References

Armstrong, A. M., & Dienes, Z. (2013). Subliminal understanding ofnegation: Unconscious control by subliminal processing of wordpairs. Consciousness and Cognition, 22, 1022–1040.

Berger, J. O. (2006). Bayes factors. In S. Kotz, N. Balakrishnan, C. Read,B. Vidakovic, & N. L. Johnson (Eds.), Encyclopedia of statisticalsciences, vol. 1 (2nd ed., pp. 378–386). Hoboken, NJ: Wiley.

Berger, J. O., & Delampady, M. (1987). Testing precise hypotheses.Statistical Science, 2, 317–352.

Berger, J. O., &Wolpert, R. L. (1988). The likelihood principle (2nd ed.).Hayward (CA): Institute of Mathematical Statistics.

Consonni, G., Forster, J. J., & LaRocca, L. (2013). Thewhetstone and thealum block: Balanced objective Bayesian comparison of nestedmodels for discrete data. Statistical Science, 28, 398–423.

Dickey, J. M., & Lientz, B. P. (1970). Theweighted likelihood ratio, sharphypotheses about chances, the order of a Markov chain. Annals ofMathematical Statistics, 41, 214–226.

Dienes, Z. (2008). Understanding psychology as a science: An introduc-tion to scientific and statistical inference. New York: PalgraveMacMillan.

Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side areyou on? Perspectives on psychological. Science, 6, 274–290.

Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statisticalinference for psychological research. Psychological Review, 70,193–242.

Elliot, D. L., Goldberg, L., Kuehl, K. S., Moe, E. L., Breger, R. K., &Pickering, M. A. (2007). The phlame (promoting healthy lifestyles:Alternative models’ effects) firefighter study: Outcomes of twomodels of behavior change. Journal of Occupational andEnvironmental Medicine, 49(2), 204–213.

Green, P. J. (1995). Reversible jump Markov chain Monte Carlo compu-tation and Bayesian model determination. Biometrika, 82, 711–732.

Guo, X., Li, F., Yang, Z., & Dienes, Z. (2013). Bidirectional transferbetween metaphorical related domains in implicit learning of form-meaning connections. PLoS ONE, 8, e68100.

Hoijtink, H., Klugkist, I., & Boelen, P. (2008). Bayesian evaluation ofinformative hypotheses. New York: Springer.

Iverson, G. J., Wagenmakers, E. J., & Lee, M. D. (2010). A modelaveraging approach to replication: The case of prep. PsychologicalMethods, 15, 172–181.

Jeffreys, H. (1961). Theory of Probability (3rd ed.). Oxford, UK: OxfordUniversity Press

−2 0 2 4 6

−2

0

2

4

6

N = 20

ln(BF) analytical

ln(B

F)

SD

−2 0 2 4 6

−2

0

2

4

6

N = 40

ln(BF) analytical

ln(B

F)

SD

−2 0 2 4 6

−2

0

2

4

6

N = 80

ln(BF) analytical

ln(B

F)

SD

−2 0 2 4 6

−2

0

2

4

6

N = 160

ln(BF) analytical

ln(B

F)

SD

Fig. 3 Natural logarithm of theBayes factors for correlationobtained with analyticalcalculations (x axis) or obtainedwith the SD method based on anonstandardized t-distribution(y axis) for different sample sizes(N). The graphs show fewerpoints as the samples grow larger,because in these situations, thereare more extreme Bayes factorsthat fall outside the axis limits.Werestricted the graphs, since it ismost important that the lowerBayes factors lie on the diagonal;it is not important whether aBayes factor is 2,000 or 3,000,since it is overwhelming evidencein any case

Behav Res

Page 13: A default Bayesian hypothesis test for mediation

Kass, R. E., & Wasserman, L. (1995). A reference Bayesian testfor nested hypotheses and its relationship to the Schwarzcriterion. Journal of the American Statistical Association,90, 928–934.

Klugkist, I., Laudy, O., & Hoijtink, H. (2005). Inequality constrainedanalysis of variance: A Bayesian approach. Psychological Methods,10, 477.

Kruschke, J. K. (2010). Doing Bayesian data analysis: A tutorialintroduction with R and BUGS. Burlington, MA: AcademicPress.

Lee, M. D., & Wagenmakers, E. J. (2013). Bayesian modeling forcognitive science: A practical course. Germany: CambridgeUniversity Press.

Lewis, S. M., & Raftery, A. E. (1997). Estimating Bayes factors viaposterior simulation with the Laplace–Metropolis estimator.Journal of the American Statistical Association, 92, 648–655.

Liang, F., Paulo, R., Molina, G., Clyde, M. A., & Berger, J. O. (2008).Mixtures of g priors for Bayesian variable selection. Journal of theAmerican Statistical Association, 103, 410–423.

Lindley, D. V. (1957). A statistical paradox. Biometrika, 44, 187–192.

MacKinnon, D. P., Fairchild, A., & Fritz, M. (2007). Mediation analysis.Annual Review of Psychology, 58, 593.

MacKinnon, D. P., Lockwood, C. M., & Hoffman, J. (1998). A newmethod to test for mediation. Paper presented at the annual meetingof the Society for Prevention Research, Park City, UT.

MacKinnon, D. P., Lockwood, C., Hoffman, J., West, S., & Sheets, V.(2002). A comparison of methods to test mediation and otherintervening variable effects. Psychological Methods, 7, 83–104.

MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004).Confidence limits for the indirect effect: Distribution of theproduct and resampling methods. Multivariate BehavioralResearch, 39, 99–128.

MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation studyof mediated effect measures.Multivariate Behavioral Research, 30,41–62.

Morey, R. D., & Rouder, J. N. (2011). Bayes factor approachesfor testing interval null hypotheses. Psychological Methods,16, 406–419.

Morey, R. D., & Wagenmakers, E. J. (2014). Simple relation betweenone–sided and two–sided Bayesian point–null hypothesis tests.Manuscript submitted for publication.

Myung, I. J., & Pitt, M. A. (1997). Applying Occam’s razor in modelingcognition: ABayesian approach.Psychonomic Bulletin & Review, 4,79–95.

Nuijten, M. B., Wetzels, R., Matzke, D., Dolan, C. V., & Wagenmakers,E. J. (2014). BayesMed: Default Bayesian hypothesis tests forcorrelation, partial correlation, and mediation. R package version1.0. http://CRAN.R-project.org/package=BayesMed

O’Hagan, A., & Forster, J. (2004).Kendall’s advanced theory of statisticsvol. 2B: Bayesian inference (2nd ed.). London: Arnold.

Overstall, A. M., & Forster, J. J. (2010). Default Bayesianmodel determination methods for generalised linear mixedmodels. Computational Statistics & Data Analysis, 54,3269–3288.

Pericchi, L. R., Liu, G., & Torres, D. (2008). Objective Bayesfactors for informative hypotheses: “Completing” the infor-mative hypothesis and “splitting” the Bayes factor. In H.Hoijtink, I. Klugkist, & P. A. Boelen (Eds.), Bayesian eval-uation of informative hypotheses (pp. 131–154). New York:Springer Verlag.

Plummer, M. (2009). JAGS version 1.0. 3 manual. URL: http://www-ice.iarc.fr/~martyn/software/jags/jags_user_manual. pdf

R Core Team. (2012). R: A language and environment for statisticalcomputing []. Vienna, Austria. APACrefURL http://www.R-project.org/ ISBN 3-900051-07-0

Rouder, J. N., & Morey, R. D. (2012). Default Bayes factors for modelselection in regression.Multivariate Behavioral Research, 47, 877–903.

Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012).Default Bayes factors for ANOVA designs. Journal ofMathematical Psychology, 56, 356–374.

Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G.(2009). Bayesian t tests for accepting and rejecting the null hypoth-esis. Psychonomic Bulletin & Review, 16, 225–237.

Schwarz, G. (1978). Estimating the dimension of a model. Annals ofStatistics, 6, 461–464.

Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of p valuesfor testing precise null hypotheses. The American Statistician, 55,62–71.

Semmens-Wheeler, R., Dienes, Z., & Duka, T. (2013). Alcohol increaseshypnotic susceptibility. Consciousness and Cognition, 22(3), 1082–1091.

Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effectsin structural equation models. Sociological Methodology, 13, 290–312.

Vandekerckhove, J, Matzke, D., & Wagenmakers, E. J. (in press). Modelcomparison and the principle of parsimony. In J. Busemeyer, J.Townsend, Z. J. Wang, & A. Eidels (Eds.), Oxford handbook ofcomputational and mathematical psychology. Oxford UniversityPress.

Venzon, D., & Moolgavkar, S. (1988). A method for computing profile-likelihood-based confidence intervals. Applied Statistics, 37(1), 87–94.

Verhagen, J., &Wagenmakers, E. J. (in press). A Bayesian test to quantifythe success or failure of a replication attempt. Journal ofExperimental Psychology: General.

Wagenmakers, E. J. (2007). A practical solution to the pervasive prob-lems of p values. Psychonomic Bulletin & Review, 14, 779–804.

Wagenmakers, E. J., & Grünwald, P. (2006). A Bayesian perspective onhypothesis testing. Psychological Science, 17, 641–642.

Wagenmakers, E. J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010).Bayesian hypothesis testing for psychologists: A tutorial on theSavage–Dickey method. Cognitive Psychology, 60, 158–189.

Wagenmakers, E. J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J.(2011). Why psychologists must change the way they analyze theirdata: The case of psi. Journal of Personality and Social Psychology,100, 426–432.

Wetzels, R., Grasman, R. P. P. P., & Wagenmakers, E. J. (2010). Anencompassing prior generalization of the Savage–Dickey densityratio test. Computational Statistics & Data Analysis, 54, 2094–2102.

Wetzels, R., Grasman, R. P. P. P., &Wagenmakers, E. J. (2012). A defaultBayesian hypothesis test for ANOVA designs. The AmericanStatistician, 66, 104–111.

Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., &Wagenmakers, E. J. (2011). Statistical evidence in experimentalpsychology: An empirical comparison using 855 t tests.Perspectives on Psychological Science, 6, 291–298.

Wetzels, R., Raaijmakers, J. G. W., Jakab, E., & Wagenmakers, E. J.(2009). How to quantify support for and against the null hypothesis:A flexible WinBUGS implementation of a default Bayesian t test.Psychonomic Bulletin & Review, 16, 752–760.

Wetzels, R., &Wagenmakers, E. J. (2012). A default Bayesian hypothesistest for correlations and partial correlations. Psychonomic Bulletin &Review, 19, 1057–1064.

Yuan, Y., & MacKinnon, D. P. (2009). Bayesian mediation analysis.Psychological Methods, 14, 301–322.

Zellner, A., & Siow, A. (1980). Posterior odds ratios for selected regres-sion hypotheses. In J. M. Bernardo, M. H. DeGroot, D. V. Lindley,&A. F.M. Smith (Eds), Bayesian statistics (pp. 585–603). Valencia:University Press.

Behav Res