Multivariate Behavioral Research

ISSN: 0027-3171 (Print) 1532-7906 (Online) Journal homepage: http://www.tandfonline.com/loi/hmbr20

To cite this article: S. de Haan-Rietdijk, P. Kuppens, C. S. Bergeman, L. B. Sheeber, N. B. Allen & E. L. Hamaker (2017) On the Use of Mixed Markov Models for Intensive Longitudinal Data, Multivariate Behavioral Research, 52:6, 747-767, DOI: 10.1080/00273171.2017.1370364

To link to this article: https://doi.org/10.1080/00273171.2017.1370364

© 2017 The Author(s). Published with license by Taylor & Francis Group © S. de Haan-Rietdijk, P. Kuppens, C. S. Bergeman, L. B. Sheeber, N. B. Allen, and E. L. Hamaker

Published online: 28 Sep 2017.



On the Use of Mixed Markov Models for Intensive Longitudinal Data

S. de Haan-Rietdijk (a), P. Kuppens (b), C. S. Bergeman (c), L. B. Sheeber (d), N. B. Allen (e), and E. L. Hamaker (a, b)

(a) Methodology and Statistics, Faculty of Social and Behavioural Sciences, Utrecht University; (b) Department of Psychology, Faculty of Psychology and Educational Sciences, KU Leuven; (c) Department of Psychology, University of Notre Dame; (d) Oregon Research Institute; (e) Department of Psychology, University of Oregon

KEYWORDS
Intensive longitudinal data; state switching; mixed Markov model; latent Markov model; time series analysis

ABSTRACT
Markov modeling presents an attractive analytical framework for researchers who are interested in state-switching processes occurring within a person, dyad, family, group, or other system over time. Markov modeling is flexible and can be used with various types of data to study observed or latent state-switching processes, and can include subject-specific random effects to account for heterogeneity. We focus on the application of mixed Markov models to intensive longitudinal data sets in psychology, which are becoming ever more common and provide a rich description of each subject’s process. We examine how specifications of a Markov model change when continuous random effect distributions are included, and how mixed Markov models can be used in the intensive longitudinal research context. Advantages of Bayesian estimation are discussed and the approach is illustrated by two empirical applications.

Psychological researchers from various fields have used longitudinal research to study within-person processes that are characterized by switches between different states. Examples involve research into bipolar disorder, characterized by switches between manic and depressive states (Hamaker, Grasman, & Kamphuis, 2016); recovery and relapse as seen in addiction (Warren, Hawkins, & Sprott, 2003; Shirley, Small, Lynch, Maisto, & Oslin, 2010; Prisciandaro et al., 2012; DeSantis & Bandyopadhyay, 2011); state-dependent affect regulation (de Haan-Rietdijk, Gottman, Bergeman, & Hamaker, 2016) and various other approaches in modeling affect dynamics (Hamaker, Ceulemans, Grasman, & Tuerlinckx, 2015); catastrophe theory applied to stagewise cognitive development (Van der Maas & Molenaar, 1992); and switches in strategy use during cognitive task performance, with the speed-accuracy trade-off as a well-known example (Wagenmakers, Farrell, & Ratcliff, 2004).

One analysis approach that can be valuable and that has sometimes been used for such processes is Markov modeling, which can be used when a person alternates between discrete states. These states can be directly observed, but there are also latent Markov models (LMMs) in which a latent state variable is related to observed data. Since individual differences are to be expected in many psychological applications, a particularly promising framework is offered by the mixed Markov model, which includes continuous random effects (Altman, 2007; also see Seltman, 2002; Humphreys, 1998; Rijmen, Ip, Rapp, & Shaw, 2008a), and potentially covariates. This model is very suitable for the increasingly common intensive longitudinal data type (ILD; cf. Walls & Schafer, 2006), where 20 or more repeated measurements are obtained per individual, because such data offer a rich description of each individual process. In this paper, we focus on models with time-constant parameters and predictors, but note that ILD also lend themselves well to models involving time-varying parameters.

CONTACT S. de Haan-Rietdijk [email protected] Methodology and Statistics, Faculty of Social and Behavioural Sciences, Utrecht University, Utrecht TC, the Netherlands. Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/hmbr.

Markov models, including mixed Markov models, have been applied in various scientific fields, but there are relatively few examples of mixed Markov models in the psychological literature. This scarcity of applications may have to do with the models not being well-known to researchers in this field, or perhaps with perceived limitations or challenges in the implementation of mixed Markov models. We note that ILD are unique in the sense that they contain much information about each individual, so that models including individual differences in temporal dynamics become not only viable, but of prime interest. While ILD are also suitable for separate (N = 1) modeling of each individual person’s data, distinct advantages of the mixed Markov modeling approach are that we can borrow strength across persons, obtain estimates of the average population parameters, and include predictors to investigate their relationship with the individual differences. In this paper, we show how mixed Markov models can be specified and implemented with Bayesian estimation, and we illustrate the value of this approach for ILD in psychology by two empirical examples, one involving daily affect experience and the other family interactions. We also discuss the various ways that individual differences have been addressed in the Markov literature.

© S. de Haan-Rietdijk, P. Kuppens, C. S. Bergeman, L. B. Sheeber, N. B. Allen, and E. L. Hamaker. Published with license by Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/./), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

This paper is structured as follows: First, we present the observed and latent Markov models, and the inclusion of continuous random effects. This is followed by considerations about the implementation, particularly the reasons why Bayesian estimation seems promising in this context. Then we turn to the literature and discuss the various ways that researchers have approached between-person differences in the Markov modeling framework. After that, we analyze two empirical data sets for which Markov modeling can address unique hypotheses and complement other analysis approaches. We conclude the paper with a discussion of limitations and recommendations for researchers.

The Markov models

In this first section, we present a brief introduction to Markov models, in which we distinguish between two situations: Either researchers know the states of the system over time, or these states are assumed to underlie the data. In the first case, we can use what we call the observed Markov model (OMM), whereas in the latter, we need to use an LMM. In our discussion, we assume that the model is concerned with the states a person is in, but keep in mind that the unit of analysis could also be a dyad, family, company, group, or something else.

The observed Markov model (OMM)

When researchers know the state that a person is in at each occasion, they can use an OMM to investigate the temporal dynamics of the state-switching process. The states may correspond to the categories of a discrete observed variable, or they may be created by discretizing continuous (or multivariate) data, as our empirical application will illustrate. The OMM is also referred to as a manifest or simple Markov model or Markov chain (cf. Kaplan, 2008; Langeheine & Van De Pol, 2002).

With an OMM, we analyze the observed state transitions over time and estimate the transition probabilities. Figure 1 provides a graphical representation of an OMM, with, on the right-hand side, a hypothetical example based on alcohol use with two states. It can be seen that there are four possible transitions (including not switching).

Figure 1. Graphical representation of an OMM with two states. If a person is in a given state $i$ at time $t$, the parameters next to the arrows running from that state represent the average probabilities of being in each state $j$ at time $t + 1$. The right part portrays fictional parameter values concerning alcohol use.

The transition probabilities of a Markov model are denoted as a matrix in which the element $\pi_{ij}$ represents the probability of transitioning from state $i$ to $j$ between consecutive measurements (with $i \in [1, 2, \ldots, S]$ and $j \in [1, 2, \ldots, S]$). We can write $\pi_{ij} = p(s_{tn} = j \mid s_{(t-1)n} = i)$, where $s_{tn}$ represents the state of person $n$ at occasion $t$. The probability of being in this state is conditioned on the state at the previous occasion. The likelihood for the data ($s$) from time $t = 2$ onwards is then given by

$$f(s \mid \pi) = \prod_{t=2}^{T} \prod_{n=1}^{N} f(s_{tn} \mid \pi), \tag{1}$$

which is the product over all persons and all time points (starting at $t = 2$) of the categorical distribution for each individual observation, which is given by

$$f(s_{tn} \mid \pi) = \prod_{i=1}^{S} \prod_{j=1}^{S} (\pi_{ij})^{[s_{tn} = j] \cdot [s_{(t-1)n} = i]}. \tag{2}$$

Here, the expressions within square brackets evaluate to 0 or 1 when they are false or true, respectively, and $\sum_{j=1}^{S} \pi_{ij} = 1$ for every $i$. Due to this logical restriction, we only need to estimate $S - 1$ probabilities in each row, for a total of $S(S - 1)$ probabilities. Note that the number of probabilities to be estimated therefore increases quadratically with the number of states.
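As a rough illustration of the OMM likelihood in Equations (1) and (2), the following Python sketch counts observed transitions and row-normalizes them. This is a plain maximum-likelihood calculation under the assumption that all persons share one transition matrix; it is not the Bayesian procedure used in this paper. The function names, the 0-based state coding, and the toy sequences are hypothetical.

```python
import math

def count_transitions(sequences, n_states):
    """Count observed transitions i -> j over all persons (states coded 0..S-1)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][curr] += 1
    return counts

def estimate_transition_matrix(counts):
    """Row-normalize the counts so that each row of probabilities sums to 1."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # No transitions out of this state were observed; the uniform row
            # is an arbitrary placeholder, not an estimate.
            matrix.append([1.0 / len(row)] * len(row))
        else:
            matrix.append([c / total for c in row])
    return matrix

def log_likelihood(sequences, pi):
    """Log of Equation (1): sum of log pi_ij over all transitions from t = 2 onwards."""
    return sum(
        math.log(pi[prev][curr])
        for seq in sequences
        for prev, curr in zip(seq, seq[1:])
    )

# Toy data (made up): two persons, two states, five occasions each.
sequences = [[0, 0, 1, 1, 0], [1, 1, 1, 0, 0]]
counts = count_transitions(sequences, n_states=2)
pi_hat = estimate_transition_matrix(counts)
loglik = log_likelihood(sequences, pi_hat)
```

With two states, only one probability per row is free, matching the $S(S - 1)$ count given above.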

The latent Markov model (LMM)

Sometimes the state-switching process that researchers are interested in is not directly observed, or is observed with methods that are expected to involve substantial measurement error. In these situations, researchers can use the LMM, also known as the hidden Markov model (HMM). For quite a long time, there have been two largely separate literatures for HMMs and LMMs, originating from different disciplines, but the two names refer to the same model type (Visser, 2011; Zhang, Jones, Rijmen, & Ip, 2010).

Because an LMM is used to study a latent state-switching process, this model always needs to have a measurement (or conditional) model part, which links the unobserved states of the system to the observed outcome variable(s). We can distinguish between two broad scenarios for why and how this is done, which we will discuss in more detail below. In the first scenario, researchers have the option of applying an OMM directly to their observed categorical outcome (i.e., the state variable), but they prefer using an LMM to take measurement error into account. In contrast, in the second scenario, the LMM is the only one of the two models that can be applied to address the research questions at hand.

The first type of LMM is used when a state variable has been observed, but there is reason to expect substantial measurement error, which may cause bias in the estimated transition probabilities in an OMM (Vermunt, Langeheine, & Böckenholt, 1999). It may also be the case that multiple indicators of the state are available, in which case an OMM would have to be applied separately to each indicator. In an LMM application of this type, the number of latent states is given by the number of observed categories (per indicator), and the measurement model concerns the probabilities of correct classification versus misclassification for each true state and indicator. An example is found in Humphreys (1998), who analyzed three indicators of labor market status (employed or not employed) with a two-state LMM.

In the second type of LMM, the measurement model part does not serve merely to filter out measurement error from observed states, but instead it is used to relate the observed data to an underlying state-switching process with a different meaning. For example, Altman (2007) used a two-state LMM to study the observed lesion counts in multiple sclerosis patients during (latent) recovery and relapse states. In these kinds of LMMs, the number of latent states is chosen based on theory, or based on the fit of models with differing numbers of states. Furthermore, the exact specification of the model in these cases depends on the measurement level of the observed variable(s) and on how the states are assumed to influence them.

When it comes to differences between the interpretation of LMMs and OMMs, the study by Shirley et al. (2010) provides a nice example. As these researchers explain, applying an LMM to their alcoholism treatment trial data corresponds to a different and less rigid clinical understanding of what constitutes a “relapse” in alcoholism recovery, because the latent state in an LMM is differentiated from the observed categorical variable (drinking behavior). In contrast, an OMM would reflect the assumption that a change in the observed categories is what matters and what is relevant for clinical practice.

Apart from the fact that the number of states in an LMM does not always follow from the data, as it does in the OMM, the specification of the transition model is very similar, except that an LMM includes additional parameters $\pi_1$ representing the probabilities of starting out in the different states at the first occasion. There is no fixed format for the measurement part of an LMM, because it is a flexible model that can be applied to univariate or multivariate, and continuous or discrete data. Its interpretation depends on how the underlying states and their relationship with the observed data are conceptualized.
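To make the three ingredients of an LMM concrete (initial state probabilities, transition probabilities, and a measurement model), the sketch below evaluates the likelihood of one observed categorical sequence with the standard forward recursion. The forward recursion is a generic HMM technique brought in here for illustration and is not described in this paper; the function name and all parameter values are hypothetical, and the measurement model is assumed to be a simple misclassification table of the first LMM type.

```python
import itertools

def forward_likelihood(obs, initial, transition, emission):
    """Likelihood of an observed sequence under a latent Markov model.

    initial[i]       : probability of starting in latent state i
    transition[i][j] : probability of moving from latent state i to j
    emission[i][y]   : probability of observing category y in latent state i
    No rescaling is applied, so this sketch is only suitable for short series.
    """
    n_states = len(initial)
    # alpha[i] = p(observations so far, current latent state = i)
    alpha = [initial[i] * emission[i][obs[0]] for i in range(n_states)]
    for y in obs[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states)) * emission[j][y]
            for j in range(n_states)
        ]
    return sum(alpha)

# Made-up two-state example with some misclassification.
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.95, 0.05], [0.10, 0.90]]

likelihood = forward_likelihood([0, 0, 1], initial, transition, emission)

# Sanity check: the likelihoods of all possible length-3 sequences sum to 1.
total = sum(
    forward_likelihood(list(seq), initial, transition, emission)
    for seq in itertools.product([0, 1], repeat=3)
)
```

When the emission table is the identity matrix (no measurement error), the result reduces to the probability of the sequence under an observed Markov chain with the same initial and transition probabilities.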

Including random effects in Markov models

If we want to account for individual differences in dynamics, this can be done by including subject-specific random effects. Here, we follow Altman’s (2007) mixed LMM specification, modeling the logits (log-odds of the probabilities) instead of estimating the probabilities directly as described above. This makes it possible to use regression to predict the logits.

In a mixed LMM, the latent state at the first time point is still modeled using fixed probabilities. But the states from occasion $t = 2$ onwards are modeled, in both an OMM and LMM, using a categorical distribution where the transition probabilities depend on the individual $n$, so we get

$$f(s \mid \pi) = \prod_{t=2}^{T} \prod_{n=1}^{N} \prod_{i=1}^{S} \prod_{j=1}^{S} (\pi_{ijn})^{[s_{tn} = j] \cdot [s_{(t-1)n} = i]}. \tag{3}$$

The individual’s transition probabilities are derived from their logits $\alpha_{isn}$, using

$$\pi_{ijn} = \frac{\exp(\alpha_{ijn})}{\sum_{s=1}^{S} \exp(\alpha_{isn})}, \tag{4}$$

where

$$\alpha_{ijn} = \mu_{ij} + \varepsilon_{ijn} \tag{5}$$

for each person $n$ and each $i, j \in [1, \ldots, S]$, such that $\mu_{ij}$ is the average logit for transitioning from state $i$ to state $j$, and $\varepsilon_{ijn}$ is person $n$’s deviation from that average. The random deviations have a multivariate normal distribution with a zero mean vector of length $S(S - 1)$ and a covariance matrix $\Sigma_{\varepsilon}$ with $S(S - 1)$ rows and columns. For identification, we set both $\mu_{ij}$ and $\varepsilon_{ijn}$ equal to 0 whenever $i = j$, making stability the reference transition for the logits.


To predict some of the individual differences in switching probabilities, we can add one or more predictor variables to Equation (5). If we have one predictor, this gives us

$$\alpha_{ijn} = \mu_{ij} + \beta_{ij} \cdot x_n + \varepsilon_{ijn}. \tag{6}$$

For identification purposes, we also constrain $\beta_{ij}$ to be 0 whenever $i = j$. The logit specification above, which uses a reference category, can be used for ordinal and non-ordinal state variables, but for ordinal state variables, other alternatives, including local, global, and continuation logit specifications, may be more parsimonious and powerful (cf. Bartolucci, Farcomeni, & Pennoni, 2013). Note that the measurement model part in an LMM can also be specified to include random effect(s). The second empirical application in this paper illustrates a mixed LMM.
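As a small worked case of Equations (4) and (6), consider a model with $S = 2$ states and look at transitions out of state 1. This derivation is added for illustration and only uses the constraints stated above ($\alpha_{11n} = 0$); it is not taken from the empirical applications.

```latex
\begin{align*}
\pi_{12n} &= \frac{\exp(\alpha_{12n})}{\exp(\alpha_{11n}) + \exp(\alpha_{12n})}
           = \frac{\exp(\alpha_{12n})}{1 + \exp(\alpha_{12n})}, \\
\frac{\pi_{12n}}{\pi_{11n}} &= \exp(\alpha_{12n})
           = \exp(\mu_{12} + \beta_{12} \cdot x_n + \varepsilon_{12n}).
\end{align*}
```

Under this specification, a one-unit increase in $x_n$ multiplies person $n$'s odds of switching from state 1 to state 2 (rather than staying in state 1) by $\exp(\beta_{12})$, holding the random deviation fixed.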

Model estimation: A Bayesian approach

We now discuss the implementation of (mixed) Markov models, focusing on a few general considerations about Bayesian estimation, which is our preferred approach in this context.

Why Bayesian?

The literature contains many successful applications of Markov models implemented using classical (frequentist) estimation methods, and some of these involve LMMs with covariates, latent subgroups, or random effects to account for individual differences.¹ However, there have been conflicting opinions on the robustness of frequentist estimation for mixed LMMs, and in practice, many applications have used mixture LMMs (which will be discussed in the next section), rather than mixed LMMs. Seltman (2002) implemented a Bayesian mixed LMM with one random effect, and stated that the frequentist approach was intractable. Altman (2007) countered this claim by showing that classical estimation of several mixed LMMs, while computationally intensive, was feasible even with multiple random effects. However, she concluded that the computational burden may become prohibitive with four or more random effects. Regardless of the exact possibilities and limitations within the classical framework, we know that Bayesian estimation presents a highly flexible alternative approach.

¹ For an extensive discussion of classical estimation of Markov models, readers are referred to the textbooks by Bartolucci et al. () and MacDonald and Zucchini (), and examples of LMM applications involving various types of random effects, and their computational details, can also be found in Maruotti and Rocci (), Altman (), Jackson, Albert and Zhang (), and Crayen, Eid, Lischetzke, Courvoisier and Vermunt ().

Besides its robustness, three additional advantages of Bayesian estimation seem relevant for the use of mixed Markov models in psychology. First, it does not rely on asymptotic distributions, making it more appropriate for small samples. Importantly, in a multilevel model, the sample size at the second level is a separate issue (Hox, 2010). For instance, if there are only 30 persons in a data set, the sample size for estimating an average transition probability, a random effects variance, or a covariate effect on the transition logits is small, regardless of the length of the time series. A second feature of Bayesian methods that we believe is especially valuable for psychological research is Bayesian multiple imputation, which is a state-of-the-art technique for handling incomplete data without losing information or introducing bias (assuming, as most common approaches do, that the data points are missing at random; cf. Schafer & Graham, 2002). Third, we note that it is possible to extract the latent states from a fitted LMM without separate state decoding, and to obtain 95% credible intervals for various quantities, such as the transition probabilities in a mixed Markov model, where the original parameters are the less easily interpretable logits.
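The point about credible intervals for derived quantities can be illustrated with a short sketch: each posterior draw of a logit is transformed to a probability, and the interval is read off the transformed draws. The draws below are simulated from a normal distribution purely as a stand-in for real sampler output (for example from JAGS), and the two-state setting, the function names, and all numbers are assumptions made for the example.

```python
import math
import random

def inv_logit(a):
    """Two-state case of Equation (4): probability of switching given logit a."""
    return 1.0 / (1.0 + math.exp(-a))

def quantile(values, q):
    """Empirical quantile based on order statistics (no interpolation)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]

# Stand-in for posterior draws of the average logit mu_12 (simulated, not real output).
rng = random.Random(7)
logit_draws = [rng.gauss(-1.0, 0.3) for _ in range(4000)]

# Transform every draw, then summarize on the probability scale.
prob_draws = [inv_logit(a) for a in logit_draws]
ci_lower = quantile(prob_draws, 0.025)
ci_upper = quantile(prob_draws, 0.975)
median = quantile(prob_draws, 0.5)
```

Note that the resulting interval describes the switching probability implied by the average logit (a person whose random deviation is 0), which is generally not the same as the population-average probability.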

One key characteristic of Bayesian methods is that they require the specification of prior distributions for model parameters, indicating the range of plausible values that they can take. This feature can be used to incorporate prior information (from research) or beliefs into an analysis, but it is also possible to choose vague (low informative) prior distributions so that the model estimates reflect the information in the current data and not prior beliefs (cf. Lynch, 2007). While it is sometimes desirable to conduct sensitivity analyses to compare different priors and to evaluate whether a prior is undesirably informative about the model parameters, one can also consult the literature for known characteristics of different prior distributions and base the choice of priors on expert advice. In the empirical applications in this paper, we choose priors that are recommended in the literature and that are as vague as possible.

Bayesian LMMs

While OMMs are quite straightforward to implement, there are two points to consider about Bayesian LMMs, which could limit their usefulness in specific cases. First, there is no straightforward criterion for model fit or model selection that can be used to choose the number of latent states when that number is not known a priori. Researchers working in the frequentist framework often use the Akaike Information Criterion (AIC; Akaike, 1974) or Bayesian Information Criterion (BIC; Schwarz, 1978) for model comparison and for choosing the number of latent states (see e.g., Bauer & Curran, 2003; MacDonald & Zucchini, 2009; Nylund, Asparouhov, & Muthén, 2007; Bacci, Pandolfi, & Pennoni, 2014). In the Bayesian framework, the exact Bayes factor for model comparison is difficult to implement, and both exact and approximate Bayes factors (such as the BIC approximation, cf. Kass & Raftery, 1995) yield results that depend on a specific prior distribution (Frühwirth-Schnatter, 2006). Another commonly used criterion is the Deviance Information Criterion (DIC; Spiegelhalter, Best, Carlin, & Van Der Linde, 2002), but there is no consensus on the right way to define the DIC for multilevel models (Celeux, Forbes, Robert, & Titterington, 2006) or for models including discrete variables (Lunn, Spiegelhalter, Thomas, & Best, 2009). A formal approach, and one that is highly consistent with Bayesian philosophy, is to estimate the number of latent states by using Reversible Jump Markov Chain Monte Carlo estimation (RJMCMC; Green, 1995; also see Bartolucci et al., 2013). A downside of RJMCMC is that it requires custom programming and fine-tuning of algorithms.

Second, a potential issue that may arise during Bayesian estimation of LMMs (or other mixture models) is what has been referred to as label switching (Celeux, Hurn, & Robert, 2000; Frühwirth-Schnatter, 2006). Because the labels of the latent states (state “1,” “2,” and so on) are arbitrary, and because Bayesian estimation typically relies on iterative sampling from the posterior parameter distributions, it may happen that the labels are switched during the course of estimation. If this occurs, it can often (but not always) be seen from the output because the posterior distributions of the parameters will then be multimodal mixtures. For instance, if there are two states, the posterior distributions for the average transition logits $\mu_{12}$ and $\mu_{21}$ will each have the same two peaks, with (usually) one peak higher for one parameter and the other for the other parameter. As a result, the obtained posterior means, medians, and 95% CIs of the parameters will be meaningless.² In principle, the modes that are visible in the plotted posterior densities could be used as (approximate) point estimates of the parameters (Frühwirth-Schnatter, 2006), but this only gives us a point estimate, and we typically want to have some measure of the uncertainty about the parameter value.

² The reason that classical maximum likelihood estimation methods do not present this problem is that they search for a mode of the likelihood function conditional on a given labeling (Frühwirth-Schnatter, ).

A common strategy to prevent label switching, in practice, is to impose order restrictions on the parameters for different states to enforce a unique labeling (cf. Albert & Chib, 1993; Richardson & Green, 1997). However, this strategy has been criticized on various accounts (Celeux, 1998; Celeux et al., 2000; Stephens, 2000; Jasra, Holmes, & Stephens, 2005), the most important of which is that it can cause bias and inflate state differences. In fact, when the data provide little or no support for a distinction between two states, the posterior densities for those states’ parameters logically should overlap, but an order restriction will effectively obscure this “null” result by modifying the posterior and enforcing a difference between the two parameters. Thus, this approach is not recommended. It is better to allow the possibility of label switching, and if it occurs, to consider whether the cause may be overparameterization, which can result in highly similar parameters with overlapping posterior densities, or to consider whether frequentist estimation of that particular model is feasible.
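The consequence of label switching for posterior summaries can be mimicked with simulated draws. In the sketch below, the draws are generated from two normal distributions as a stand-in for sampler output, and the labels are assumed to swap exactly once, halfway through the run; real samplers may switch at irregular times. All names and values are hypothetical.

```python
import random

rng = random.Random(1)
n_draws = 2000

# Stand-in posterior draws (assumed values): mu_12 centred at -2, mu_21 at +1.
true_mu12 = [rng.gauss(-2.0, 0.2) for _ in range(n_draws)]
true_mu21 = [rng.gauss(1.0, 0.2) for _ in range(n_draws)]

# Suppose the state labels swap halfway through the run. With two states,
# swapping the labels exchanges the roles of mu_12 and mu_21, so the chain
# recorded under the name "mu_12" contains draws of both parameters.
switch_at = n_draws // 2
recorded_mu12 = true_mu12[:switch_at] + true_mu21[switch_at:]

def mean(values):
    return sum(values) / len(values)

posterior_mean = mean(recorded_mu12)

# Count how many recorded draws actually lie close to the reported mean.
near_mean = sum(abs(d - posterior_mean) < 0.3 for d in recorded_mu12)
```

In this setup the mean of the recorded chain falls between the two modes, in a region that contains hardly any draws, which is why such a summary does not describe either state.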

In some LMM applications, label switching is unlikely to occur. Jasra et al. (2005) have argued that label switching is actually a necessary condition for model convergence, because a lack of label switching can be taken as a sign that the sampler has not covered the whole posterior distribution. However, when the parameters associated with the different latent states are clearly separated, label switching is very unlikely to occur even after many iterations, and in this case, the absence of label switching need not indicate failed convergence. In other words, only if the uncertainty about the different states’ parameters causes substantial overlap in their posterior densities is label switching expected to happen predictably within a realistic number of sampling iterations. In practice, this means that LMMs which are used to filter out measurement error are very unlikely to present with label switching, since each latent state in such a model corresponds closely to an observed category, making for clearly distinguishable states. It is more of a risk for LMMs of the second type that we discussed, where there is no guarantee that all the latent states will be clearly separated.

Software implementations

Bayesian Markov models can be implemented in the easily accessible open-source programs OpenBUGS (Bayesian inference Using Gibbs Sampling; Lunn et al., 2009) and JAGS (Just Another Gibbs Sampler; Plummer, 2013a). Using Bayesian estimation with these programs is relatively straightforward. The user can focus on specifying the desired model and choosing the prior distributions, and the convergence of the model can be assessed using the plots and other output provided by the program. For the analyses in this paper, we use JAGS 4.0.0 in combination with R 3.2.2 (R Development Core Team, 2012) and the rjags package (Plummer, 2013b). By using this package, R can be used as the overarching program to prepare the data, run the analyses, check the convergence, and store and further process the model results. Our JAGS model syntax is provided online as supplementary material. A detailed discussion of Bayesian methods is beyond the scope of this paper, but the interested reader is referred to the books by Kruschke (2014) and Lynch (2007).

Approaches to individual differences in the Markov literature

In this section, we will look at the ways that Markov models have been used in the literature, specifically, at the different ways that researchers have accounted for individual differences in process dynamics. In psychological research with ILD, we expect individual differences in the process parameters, and the data contain enough information to allow these differences to be modeled. Therefore, from a substantive viewpoint, it is attractive to specify a mixed Markov model, as discussed in a previous section. However, as we will see, researchers have used a number of approaches for dealing with interpersonal variation in model parameters.

The first way that researchers have dealt with between-person differences is to include one or more covariates in the model, without allowing for residual unexplained individual variation. This approach allows researchers to specify a deterministic relationship between model parameters and a covariate (which may or may not be time varying). For example, Vermunt et al. (1999) applied LMMs and OMMs to analyze educational panel data concerning pupils’ interest in physics. They included the variables sex and school grades as covariates to account for gender differences and differences between high- and low-performing pupils in the probability of starting to take an interest in physics (or losing interest). Since the model did not include person-specific random effects or unexplained interpersonal differences, it only accounted for group differences between the specified groups but not for any other sources of variation between pupils. Some other examples of LMMs that include covariates but not random effects are found in Rijmen, Vansteelandt and De Boeck (2008b), Prisciandaro et al. (2012), and Wall and Li (2009).

The second approach to taking individual variation into account is to use a mixture Markov model (sometimes also called a mixed Markov model), which tends to be computationally easier, in the frequentist framework, than the mixed Markov model. A mixture Markov model distinguishes between a number of latent classes that differ from each other with regard to the model parameters, while within each class no individual differences between persons are allowed. For instance, Crayen et al. (2012), analyzing ambulatory assessment data, identified two latent classes that differed in their mood regulation pattern during the day. This approach is also used in Maruotti (2011), who provides an illustration of a mixture LMM applied to data concerning the relationship between patent counts and research and development expenditures, and in various examples in Bartolucci et al. (2013). What may be considered a disadvantage of the mixture Markov modeling approach is the assumption that there are a limited number of homogeneous latent classes, when many psychological differences between people are probably dimensional rather than categorical (Haslam, Holland, & Kuppens, 2012). It has been argued that mixture models need not reflect theoretical assumptions about discrete groups, but can serve as flexible non-parametric approximations of underlying continuous random effects (cf. Maruotti & Rydén, 2009; Maruotti & Rocci, 2012). Such an approach involves choosing the number of latent subgroups empirically, and in the frequentist framework, this can be done by using criteria such as the AIC or BIC (cf. Maruotti & Rocci, 2012). In the Bayesian framework, this approach would present a difficulty similar to choosing the number of latent states in an LMM, and it is not clear that a Bayesian mixture model is computationally easier than a mixed model.

The third approach, and the one that we focus on in this paper, is the inclusion of a continuous random effects distribution, by specifying a mixed Markov model as described in a previous section. We want a multilevel model that encapsulates a unique description of each person’s process, which, in the case of ILD, contains enough information that it could have been analyzed with an N = 1 model. It should be noted that a mixed Markov model with a normal distribution for the random transition logits is not the same as modeling each person’s data separately (and independently), because it involves a distributional assumption for the random effects. A normal distribution on log-odds parameters can correspond to various distributional shapes for the implied transition probabilities (due to the non-linear relationship between log-odds and probabilities), so that it is less restrictive an assumption than it may seem at first glance. Still, it is an assumption and it is worth pointing out that there are alternative, semi-parametric or non-parametric, approaches that involve less restrictive and more generalizable model specifications (cf. Maruotti & Rydén, 2009; Maruotti, 2011). Some examples of mixed LMM applications using continuous random effects are found in the studies by Altman (2007), Humphreys (1998), Rijmen et al. (2008a), DeSantis and Bandyopadhyay (2011), and Shirley et al. (2010). Of these, only Humphreys (1998) and Altman (2007) included random effects in both parts of the LMM. Rijmen and colleagues (2008) and DeSantis and Bandyopadhyay (2011) included random effects only in the measurement model. In contrast, Shirley and colleagues (2010) used random effects only in the transition model.
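The remark that a normal distribution on the log-odds scale can imply quite different shapes on the probability scale can be illustrated with a small simulation. The following is only a sketch under assumed values: the means, standard deviations, sample size, and function names are hypothetical and are not taken from the paper.

```python
import math
import random


def inv_logit(a):
    """Map a log-odds value to a probability (two-state case)."""
    return 1.0 / (1.0 + math.exp(-a))


def implied_probabilities(mean, sd, n=20000, seed=1):
    """Draw person-specific logits from N(mean, sd^2) and transform them."""
    rng = random.Random(seed)
    return [inv_logit(rng.gauss(mean, sd)) for _ in range(n)]


def share_extreme(probs, cut=0.1):
    """Fraction of persons whose probability is below cut or above 1 - cut."""
    return sum(p < cut or p > 1 - cut for p in probs) / len(probs)


# Hypothetical settings: same mean logit, different random-effect SDs.
narrow = implied_probabilities(mean=0.0, sd=0.5)  # probabilities cluster around 0.5
wide = implied_probabilities(mean=0.0, sd=3.0)    # much of the mass moves toward 0 and 1
print(share_extreme(narrow), share_extreme(wide))
```

With a small random-effect variance the implied probabilities stay close to the center, whereas a large variance pushes many of them toward the boundaries, which is one way to see why the normality assumption on the logits is less restrictive than it may appear.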


MULTIVARIATE BEHAVIORAL RESEARCH 753

It is possible to allow for interpersonal variation in both parts of an LMM. Bartolucci et al. (2013), in their book about LMMs, focus only on models with random effects in either model part, and they warn readers that having random effects in both model parts would likely make the model difficult to estimate and to interpret. We do not entirely agree with this advice; the studies of Humphreys (1998) and Altman (2007) are both convincing examples where mixed LMMs with random effects in both model parts were theoretically sensible as well as practically feasible. And in some psychological applications, it makes a lot of sense to expect individual differences both in the measurement part of the model and in the latent state-switching process. Although there may be limitations on the complexity of models that can realistically be estimated for a given data set, we think researchers should not exclude a priori the possibility of including random effects in both model parts of a mixed LMM. In our application of a mixed LMM in the next section, we fit a model with random transition logits as well as a random effect in the measurement model part.

Empirical applications

In this section, we apply mixed Markov models to two empirical data sets from intensive longitudinal research, for which it makes sense to focus on the dynamics of a process, and where between-person differences in those dynamics are expected and may be linked to other person characteristics. These analyses provide an illustration of how mixed Markov models can be applied in psychology, and how they can be implemented using Bayesian estimation. We use JAGS 4.0.0 (Plummer, 2013), R 3.2.2 (R Development Core Team, 2012) and the rjags package (Plummer, 2013).

An OMM for daily negative affect

The first data set involves daily self-report measures of negative affect (NA) obtained from the older cohort (ages 50 and higher) of the Notre Dame Study of Health & Well-Being (see Bergeman & DeBoeck, 2014; Whitehead & Bergeman, 2014). We are interested in the NA subscale of the PANAS (Watson, Clark, & Tellegen, 1988), which the participants filled out on 56 consecutive days. For our analysis, we select the N = 224 persons who had at most six missing values on this composite variable (the mean of the 10 specific NA items). Specifically, we want to study the regulation of NA and how it is related to trait neuroticism, which was also measured. The daily scores on the NA scale ranged from 1 (very little or no NA) to 5 (very intense NA) in 0.1 increments, but as noted before by Wang, Hamaker, and Bergeman (2012), some persons’ scores exhibited a floor effect because they repeatedly reported experiencing no NA (i.e., for each of the NA items on the scale they chose 1, the lowest possible value with the label “very slightly or not at all”). This indicates that, at least at the level of conscious experience, NA is often absent or barely noticeable for a substantial number of people.

If we want to account for this, modeling approaches that focus on the intensity of NA as a continuous variable may not be optimal, both from a substantive and a statistical point of view. Rather than focusing immediately on questions of affect intensity, we can distinguish between the propensity to experience any NA on the one hand, and the intensity of the affect, once it is experienced, on the other. A Markov model is appropriate for this data because it can be used to investigate the frequency of transitions between episodes with and without consciously experienced NA. We will specify a mixed OMM, which allows us to investigate individual differences in the transition probabilities, and to study their relationship with neuroticism.

Model

From the original continuous NA variable, we create a dichotomous variable indicating whether or not a person experienced NA at a given time point (i.e., the state is 1 when NA was the lowest possible value of 1, and the state is 2 when the NA value was larger than 1). This dichotomous variable is then used as the data for the OMM, so that we can investigate the transitions between experiencing and not experiencing NA. There was approximately 2.2% missingness, which is dealt with automatically in JAGS through multiple imputation. This modern approach to missing data amounts to imputing many different plausible values for the missing data points (namely, a new value in each iteration of the estimation procedure, conditional on the model parameter estimates in that iteration), such that the model estimates for the parameters end up accounting for the uncertainty about the missing values in the data (cf. Schafer & Graham, 2002). The variable neuroticism is centered and rescaled prior to the analysis, for computational reasons discussed below. The likelihood for the observed states starting at time point t = 2 is given by Equations 3, 4, and 6, with i, j taking on the values 1 and 2 and with neuroticism as the predictor x. Combining this all gives us the following equation:

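$$
f(s \mid P) = \prod_{t=2}^{56} \prod_{n=1}^{224} \prod_{i=1}^{2} \prod_{j=1}^{2} \frac{\exp(\mu_{ij} + \beta_{ij} \cdot x_n + \varepsilon_{ijn})^{[s_{tn}=j] \cdot [s_{(t-1)n}=i]}}{\exp(\mu_{i1} + \beta_{i1} \cdot x_n + \varepsilon_{i1n}) + \exp(\mu_{i2} + \beta_{i2} \cdot x_n + \varepsilon_{i2n})}, \tag{7}
$$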


where the expressions within square brackets form the exponent of the numerator and evaluate to 1 if they are true, and to 0 otherwise. The model parameters denoted by P are $\mu_{12}$, $\mu_{21}$, $\beta_{12}$, $\beta_{21}$, and the three unique elements of the random effects covariance matrix $\Sigma(\varepsilon)$.
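To make the structure of this likelihood concrete, the following sketch dichotomizes a short NA series and evaluates the log-likelihood contribution of one person, conditional on the first state. It is a minimal illustration and not the authors' JAGS code: the function names and example scores are hypothetical, the series is assumed to be complete (no missing days), and the logits for staying in a state are assumed to be fixed at zero as the reference category.

```python
import math


def dichotomize(na_scores):
    """State 1 if NA equals the scale minimum of 1, state 2 if NA is larger than 1."""
    return [1 if abs(score - 1.0) < 1e-9 else 2 for score in na_scores]


def transition_probs(mu12, mu21, beta12, beta21, x, eps12=0.0, eps21=0.0):
    """Person-specific 2x2 transition probabilities (staying logits assumed 0)."""
    a12 = mu12 + beta12 * x + eps12
    a21 = mu21 + beta21 * x + eps21
    p12 = math.exp(a12) / (1.0 + math.exp(a12))
    p21 = math.exp(a21) / (1.0 + math.exp(a21))
    return {(1, 1): 1.0 - p12, (1, 2): p12, (2, 1): p21, (2, 2): 1.0 - p21}


def log_likelihood(states, probs):
    """Sum of log transition probabilities from t = 2 onward."""
    return sum(math.log(probs[(i, j)]) for i, j in zip(states, states[1:]))


# Hypothetical daily NA scores for one person.
scores = [1.0, 1.0, 1.3, 2.1, 1.0]
states = dichotomize(scores)
# Intercepts set to the point estimates reported in the text; betas are hypothetical.
probs = transition_probs(mu12=-0.92, mu21=-0.17, beta12=1.0, beta21=-1.0, x=0.0)
print(states, log_likelihood(states, probs))
```

For a full sample, the person-specific contributions would be summed over persons, with each person's own covariate value and random deviations entering `transition_probs`.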

Priors

For the intercepts $\mu_{12}$ and $\mu_{21}$ and for the regression coefficients $\beta_{12}$ and $\beta_{21}$, we follow the approach for logit parameters recommended by Gelman, Jakulin, Pittau and Su (2008), by specifying independent Cauchy prior distributions (or, equivalently, Student’s t distributions with one degree of freedom) with scale parameter 10 for the intercepts and 2.5 for the regression coefficients, and location parameter 0 in all cases. In accordance with this advice, we rescaled neuroticism to have a mean of 0 and a variance of 0.25.
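The rescaling step and the two Cauchy scales can be sketched as follows. This is an illustration under assumptions: the raw scores are made up, the function names are hypothetical, and the sample standard deviation is used for the rescaling (the source does not say which estimator was used).

```python
import math
import statistics


def rescale(values, target_sd=0.5):
    """Center a covariate at 0 and rescale it to the target standard deviation."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return [(v - center) / spread * target_sd for v in values]


def cauchy_pdf(x, scale, location=0.0):
    """Density of a Cauchy distribution (Student's t with one degree of freedom)."""
    z = (x - location) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


raw = [18, 21, 24, 25, 27, 31, 33]  # hypothetical neuroticism sum scores
x = rescale(raw)
print(statistics.mean(x), statistics.variance(x))

# Prior density at a large logit value under the two scales used in the text.
print(cauchy_pdf(20.0, scale=10.0), cauchy_pdf(20.0, scale=2.5))
```

The wider scale for the intercepts leaves more prior mass on large absolute logits than the scale of 2.5 used for the regression coefficients.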

To determine whether there are non-negligible individual differences in the transition probabilities, we needed to ensure that the prior distribution for the two random logit variances is suitably uninformative. The common choice of a multivariate prior for a covariance matrix is the Inverse-Wishart (IW) distribution with an identity matrix and degrees of freedom equal to the number of random effects, but this prior can be undesirably informative and cause overestimation when the true value of a variance is smaller than approximately 0.1 (Schuurman, Grasman, & Hamaker, 2016). Therefore, we first specified a model with separate random effects, that is, a model in which the random effects are not correlated, but come from univariate normal distributions. This way, we could follow Gelman’s (2006) recommendation for a variance term in a hierarchical model, by specifying uniform prior distributions over the interval from 0 to 100 for each of the standard deviations $\sigma_{12}$ and $\sigma_{21}$. These priors cannot bias small variance estimates upwards, so we can detect it if one of the random effects variances is very close to zero, indicating that the model is overspecified. After finding that both of the variance estimates were well over 0.1, we switched to a model specification using the IW prior for multivariate random effects, which takes into account that the random effects will be correlated to some extent. Our JAGS model syntax for the final model with the multivariate random effects is part of the supplementary material accompanying this paper.

Results

After 10,000 burn-in iterations, we ran 50,000 additional iterations, using a thinning parameter of 10 so that only every 10th iteration was stored for inference (a common approach to reduce autocorrelation in the samples without increasing the demands on computer memory). We inspected the trace plots, which showed no trend, and the density plots, which looked unimodal, both indicating adequate convergence. Most importantly, we ran a separate chain with different (randomly generated) starting values, which arrived at the same estimates. This provides a stronger reassurance about convergence, because it shows that the estimates are independent of the starting values and are unlikely to reflect only a local maximum.³ The results below are based on the 5,000 stored samples of the first chain.

Table 1. Results for the level-2 parameters of the OMM. We use the posterior medians as point estimates, and the SD and 95% central credible intervals (CCIs) represent the uncertainty. Note that the parameters $\Sigma(\varepsilon)_{11}$ and $\Sigma(\varepsilon)_{22}$ are the variances of the random logit deviations $\varepsilon_{12}$ and $\varepsilon_{21}$, respectively, and $\Sigma(\varepsilon)_{12}$ is the covariance between these two random effects.

Empirical OMM estimates

Parameter                      Median    SD    95% CCI
$\mu_{12}$                     −0.92     .     [−., −.]
$\Sigma(\varepsilon)_{11}$     .         .     [., .]
$\beta_{12}$                   .         .     [., .]
$\mu_{21}$                     −0.17     .     [−., .]
$\Sigma(\varepsilon)_{22}$     .         .     [., .]
$\beta_{21}$                   − .       .     [−., −.]
$\Sigma(\varepsilon)_{12}$     − .       .     [−., −.]

Table 1 contains the estimates for the level-2 model parameters. We use the medians of the Bayesian posterior distributions as point estimates, and the SDs and 95% central credible intervals (CCIs) as indications for the uncertainty about the parameters (Lynch, 2007).
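The thinning and the posterior summaries described above can be sketched in a few lines. This is not the authors' R code: the function names are hypothetical, the draws are simulated stand-ins for MCMC output, and the interval bounds use a simple nearest-rank rule rather than any particular quantile definition.

```python
import random
import statistics


def thin(samples, step=10):
    """Keep every step-th draw of a chain."""
    return samples[step - 1::step]


def summarize(samples, level=0.95):
    """Posterior median, SD, and central credible interval (nearest-rank bounds)."""
    ordered = sorted(samples)
    n = len(ordered)
    lower = ordered[int(round((1 - level) / 2 * (n - 1)))]
    upper = ordered[int(round((1 + level) / 2 * (n - 1)))]
    return {
        "median": statistics.median(ordered),
        "sd": statistics.stdev(ordered),
        "cci": (lower, upper),
    }


rng = random.Random(42)
# Simulated stand-in for 50,000 post-burn-in draws of one parameter.
chain = [rng.gauss(-0.92, 0.15) for _ in range(50000)]
stored = thin(chain, step=10)
print(len(stored), summarize(stored))
```

Thinning 50,000 iterations by a factor of 10 leaves the 5,000 stored samples on which the reported summaries are based.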

³ Note that randomly generating all the starting values is not effective in all situations, because depending on the model complexity and on how wrong the starting values are, the sampling algorithm may get stuck and fail to converge to the posterior distribution. Convergence can be aided by choosing realistic starting values (for some parameters) based, for instance, on ML estimation of (a part of) the model, or a reasonable order of magnitude. In that case, to check that the results are independent of the starting values, one can still use different sets of starting values that reflect a range of possible values; for instance, a medium-sized positive regression effect and a medium-sized negative one both have a realistic order of magnitude, but are different enough in their interpretation.

The means for the transition logits, estimated at −0.92 and −0.17, correspond to transition probabilities of 0.28 and 0.46, respectively. This implies that a person with average neuroticism has a 0.28 chance of transitioning from the non-negative to the negative state from one day to the next, and a (1 − 0.46 =) 0.54 chance of staying in the negative state. The β parameters reflect the effect of trait neuroticism on the transition logits, and because we scaled neuroticism to have an SD of 0.5, each β represents the effect of a two-SD increase in neuroticism. Since the 95% CCIs for both parameters exclude zero, we can be fairly certain that higher scores on neuroticism predict a higher chance of starting to experience NA, as well as a higher chance of continuing to experience it. To gauge the relevance of these effects, we fill in different values for neuroticism (x) in Equation (4) and calculate the corresponding predicted transition probabilities. The result of this is presented in Figure 2, showing that the model implies substantial differences between more and less neurotic individuals, with more neurotic individuals having a higher propensity to experience (continued) NA. In addition, there are substantial unexplained personal differences in the transition probabilities, as indicated by the estimates for the variances of the logit deviation terms, given in Table 1. The distribution of the model-implied transition probabilities for the individual persons is presented in Figure 3, showing that $\pi_{12}$ ranges from 0.01 to 0.96 and $\pi_{21}$ ranges from 0.004 to 0.99, and that there is a negative correlation between the two probabilities. This shows that some individuals rarely started to experience NA, while others rarely stopped experiencing it. The dots in the middle of the plot represent persons who switch more frequently between episodes in which they do and do not report experiencing NA.

Figure 2. Predicted transition probabilities for individuals with varying levels of trait neuroticism (M = 24.45, SD = 5.42 in the sample), based on the point estimates (posterior medians) of the model parameters given in Table 1. It can be seen that individuals with higher trait neuroticism are more likely to start experiencing negative affect (i.e., their $\pi_{12}$ is larger) and, once this happens, they are also less likely to stop experiencing it (i.e., their $\pi_{21}$ is smaller).
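The conversion from transition logits to probabilities used in this passage can be checked directly. In the sketch below, the two intercepts are the point estimates quoted in the text; the regression coefficients are hypothetical placeholders (chosen only to have the reported signs), and the helper names are not from the paper.

```python
import math


def inv_logit(a):
    """Two-state case: probability of leaving the current state given its logit."""
    return 1.0 / (1.0 + math.exp(-a))


mu12, mu21 = -0.92, -0.17  # posterior medians quoted in the text
p12, p21 = inv_logit(mu12), inv_logit(mu21)
print(round(p12, 2), round(p21, 2), round(1 - p21, 2))  # 0.28 0.46 0.54

beta12, beta21 = 1.0, -1.0  # hypothetical values, for illustration only


def predicted(x):
    """Predicted (pi12, pi21) for a rescaled neuroticism score x (SD = 0.5)."""
    return inv_logit(mu12 + beta12 * x), inv_logit(mu21 + beta21 * x)


for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    pi12, pi21 = predicted(x)
    print(x, round(pi12, 2), round(pi21, 2))
```

Because neuroticism was rescaled to an SD of 0.5, a step of 1.0 in `x` corresponds to a two-SD difference on the original scale.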

Figure 3. Scatterplot of the transition probabilities for the individual persons in the data, obtained by using Equations () and () with the estimated model parameters. The plot shows that there is substantial interpersonal variance, and that those persons with a higher probability of transitioning into the negative state ($\pi_{12}$) usually also had a lower probability of transitioning out of it ($\pi_{21}$).

Model fit

As we discussed in a previous section, the fit of a Bayesian Markov model cannot be assessed by a standard fit criterion. However, we can use posterior predictive checks to evaluate how well the model captures specific aspects of the data. Shirley et al. (2010) demonstrated several such procedures for assessing the fit of Markov models, and here we use a very similar approach. The general idea of a posterior predictive check is that the sampled parameter values can be used to simulate new data under the assumption that the model is true, and then the distribution of a certain statistic for the simulated data can be compared with the same statistic for the actual, empirical data. If the empirical data is “extreme” compared to the model-generated data sets, this indicates that the model does not adequately capture an aspect of the empirical data, or in other words, that there is misfit. A nice feature of this procedure is that uncertainty about the model parameters is taken into account, because different samples from the posterior distribution of the parameters are used to generate the simulated data.

In a first check, we assessed the model fit with regard to the proportion of days on which participants experienced NA, that is, the proportion of days spent in state 2. We used the posterior samples of the fixed parameters together with the empirical neuroticism scores and starting states (on which the model estimates are conditioned), to simulate random transition logits and “observed” time series for a new “sample” of participants in each of the 5,000 replications. For each simulated time series, we calculated the proportion of days spent in state 2, and then for each replication, we calculated the mean and standard deviation of this proportion over persons. The mean and standard deviation for the empirical dataset were then compared with the posterior predictive distributions of these statistics.

Figure 4. Results of posterior predictive check for the OMM. The red lines represent the empirical mean and SD of the proportion of days that participants experienced NA. The histograms represent the model predictions, which take into account the uncertainty about the estimated model parameters.
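The logic of this first check can be sketched as follows. This is a simplified illustration, not the procedure actually run for the paper: the posterior draws, starting states, and observed statistic are hypothetical, the covariate is omitted, and the two random logits are drawn independently rather than from a bivariate distribution.

```python
import math
import random
import statistics


def inv_logit(a):
    return 1.0 / (1.0 + math.exp(-a))


def simulate_states(p12, p21, start, n_days, rng):
    """Simulate a two-state chain; p12 = P(1 -> 2), p21 = P(2 -> 1)."""
    states = [start]
    for _ in range(n_days - 1):
        if states[-1] == 1:
            states.append(2 if rng.random() < p12 else 1)
        else:
            states.append(1 if rng.random() < p21 else 2)
    return states


def proportion_in_state2(states):
    return sum(s == 2 for s in states) / len(states)


def replicate(draw, start_states, n_days, rng):
    """One replication: new random logits and a new series for every person."""
    mu12, mu21, sd12, sd21 = draw
    props = []
    for start in start_states:
        p12 = inv_logit(rng.gauss(mu12, sd12))
        p21 = inv_logit(rng.gauss(mu21, sd21))
        props.append(proportion_in_state2(simulate_states(p12, p21, start, n_days, rng)))
    return statistics.mean(props), statistics.stdev(props)


def tail_share(replicated, observed):
    """Share of replicated statistics at least as large as the observed one."""
    return sum(r >= observed for r in replicated) / len(replicated)


rng = random.Random(2017)
# Hypothetical posterior draws (mu12, mu21, sd12, sd21); not the paper's samples.
draws = [(-0.92 + rng.gauss(0, 0.1), -0.17 + rng.gauss(0, 0.1), 1.5, 1.5)
         for _ in range(200)]
start_states = [1, 2] * 20   # hypothetical starting states for 40 persons
replicated_means = [replicate(d, start_states, 56, rng)[0] for d in draws]
observed_mean = 0.45         # hypothetical empirical statistic
print(tail_share(replicated_means, observed_mean))
```

A tail share close to 0 or 1 would mark the observed statistic as extreme relative to the model's predictions, whereas a value in between is in line with adequate fit for that statistic.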

The results of the first check are illustrated in Figures 4 and 5. The predictive distribution of the mean proportion of days spent in state 2 ranges approximately from 0.35 to 0.55 and it can be seen in Figure 4 that the mean for the empirical data falls in the middle part of the distribution. Similarly, the SD of the proportion of days spent in state 2 for the empirical data falls in the middle part of the posterior predictive distribution, which ranges approximately from 0.28 to 0.36. Together these two graphs indicate that the model fit, in terms of capturing this aspect of the empirical data, is adequate, because the empirical data is not at all “extreme” under the posterior predictive distribution. Figure 5 shows the posterior predictive distribution of the proportion of days spent in state 2 for individual persons, instead of focusing on the between-persons mean or SD. This figure shows that the posterior predictive distribution closely corresponds to the observed distribution in the empirical data. The model underestimates the occurrence of (nearly) constant NA experience, but overall it captures the diversity in how often participants reported experiencing NA quite well.

Figure 5. Distribution of the proportion of days spent in state 2 for the empirical (in red) or model-predicted (in gray) time series for the participants. The close overlap indicates adequate model fit. The model slightly underestimates the occurrence of (nearly) constant NA experience, as evidenced by the larger red bar at the extreme right of the graph.

Figure 6. Results of posterior predictive check for the OMM. The red lines represent the empirical mean and SD of the number of state switches in 56 days. The histograms represent the model predictions, which take into account the uncertainty about the estimated model parameters.

In a second check, we used the same simulated data, but we focused on the number of state switches during 56 days. As can be seen in Figures 6 and 7, the mean and SD of the empirical sample again fall near the middle of the posterior predictive distributions, and the distributions for individual persons also overlap closely between the model prediction and the empirical data. All of this indicates that the model fit is also adequate with regard to this statistic. Note that the empirical distribution has a high peak representing people who switched states 0–2 times, and the model’s posterior predictive distribution captures this closely.
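The statistic used in this second check is simple to compute from a state sequence. The sketch below uses hypothetical helper names and made-up series; it only shows how switches per person could be counted and tabulated before comparing the empirical and simulated distributions.

```python
from collections import Counter


def count_switches(states):
    """Number of time points at which the state differs from the previous one."""
    return sum(a != b for a, b in zip(states, states[1:]))


def switch_distribution(series_by_person):
    """Tabulate how many persons show each number of switches."""
    return Counter(count_switches(s) for s in series_by_person)


# Three hypothetical persons observed for 56 days each.
observed = [
    [2] * 56,              # never switches
    [1, 2] * 28,           # switches every day
    [1] * 30 + [2] * 26,   # switches once
]
print(switch_distribution(observed))
```

Applying the same two functions to every simulated data set gives the posterior predictive distribution of this statistic.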

Figure 7. Distribution of the number of state switches in the empirical (in red) or model-predicted (in gray) time series for the participants. The close overlap shows that the model adequately captures the diversity among the study participants in how often they switched states.

A way to illustrate (more than evaluate) the fit of the model, again following the example of Shirley and colleagues (2010), is to look at model-generated state trajectories for a few persons and compare them with their actual observed state trajectory. Figure 8 shows observed and predicted data for three persons characterized by high, low, and average state switching, respectively. The predicted time series in each case was generated using the median of the posterior samples for that person’s transition logits, together with their observed starting state and their neuroticism score. Because our model did not include time-varying predictors, a person’s predicted state at a specific day is not particularly meaningful or indicative of model fit, but rather the overall pattern of the predicted data (e.g., the switching frequency and the length of time spent in a particular state) should match that of the observed time series. If the model had included one or more time-varying predictors or if the transition logits were non-constant over time, then it would have been appropriate to check whether the predicted states matched the observed states at (roughly) the same day. From the three examples in Figure 8, it is clear that our model fits varying data patterns well: It can account for people who never switched states in the 56 days of the study, as well as for people who switched occasionally or almost constantly (or who were more likely to make one transition than the opposite one). For the first person, the model predicts that they are always in state 2 (experiencing NA) and this matches their observed data. The model predicts that the second and third person spend 59% and 25% of the days in state 2, respectively, which is close to the observed values of 57% and 27%. And the number of state switches in the predicted data for the second and third person is 34 or 17, respectively, versus 32 or 13 observed switches. We can also see that the model correctly predicts that the third person goes through long stretches of time in state 1, while rarely staying in state 2 for more than one or two days. The fact that the overall patterns in these predicted time series look similar to the observed data patterns illustrates that the model is flexible and that it fits the empirical data well.

Figure 8. Observed (left) and predicted (right) states for three persons with widely differing switching rates. In generating the predicted data, the posterior medians were used as point estimates of the transition logits. Note that the specific day at which a person is predicted to switch states is not indicative of model fit, since no time-varying predictors were used; it is the overall pattern (switching frequency, time spent in specific states) that should match between the observed and predicted data.

Conclusion

By using the OMM with a covariate and random effect, we had a simple and intuitive way of analyzing these data that accounted for the meaningful zero inflation and addressed the question of whether trait neuroticism is related to affect experience over time, as reflected in the propensity of an individual to report experiencing NA at all. As expected, we found that more neurotic persons are more likely to start experiencing NA, and less likely to stop experiencing it. The additional unexplained individual differences we found illustrate the importance of allowing for random effects in the model. Posterior predictive checks of the model fit indicated that the random effects enabled the model to adequately capture the wide range of diversity in NA experience among individual persons.

An LMM for observed family interactions

We now turn to our second data set, which comes from Kuppens, Allen and Sheeber (2010), who observed adolescents (N = 141; 94 females, 47 males; mean age = 16 years) and their parents while they participated in various 9-minute discussion tasks designed to elicit different interactions and emotions. The conversation task that we focus on here involved discussing positive and negative elements in the upbringing of the adolescent, and therefore it was expected to evoke both positive and negative interactions. The researchers used the Living in Family Environments coding system (LIFE; Hops, Biglan, Tolman, Arthur, & Longoria, 1995) to code the behavior of each individual during the conversation in real time. The codes were then summarized in a time series variable, categorizing the behavior of each family member during each second (T = 539) as either angry, dysphoric, happy, or neutral (for more details, see Kuppens et al., 2010). We approach the resulting data set as a trivariate categorical time series, where the behavior of the mother, the father, and the adolescent are treated as indicators that reflect the family’s state at each time point. As such, in the transition model, we have at level 1 the time points, and at level 2 the family. In addition to the observational behavior data, the researchers measured whether the adolescents met criteria for clinical depression (there were 72 depressed and 69 non-depressed adolescents).

For our analysis, we used a mixed LMM that allows us to study the temporal dynamics of the family interaction and the between-family differences therein, as well as look at stable differences in observed behavior between individual adolescents, specifically between depressed and non-depressed adolescents.

Sheeber et al. (2009) and Kuppens et al. (2010) analyzed these data with a focus on the adolescents, using various specialized multilevel regression techniques to study, for example, the relationship between depression and emotional reactivity of the adolescents. However, as an alternative and supplementary approach, here we model the behavior of the family, considering the interaction between family members and analyzing them as a system; this is where the LMM comes in as a suitable model. When a family is participating in an interaction task designed to evoke some difficult emotions, it makes sense to expect that the family will switch between positive and negative (and neutral) interaction states. We expect that families differ in this regard, with some families being more prone to conflict than others, and we want to use a model that can take this into account through random effects in the transition model part. At the same time, we expect that the specific behavior of depressed adolescents may differ from that of non-depressed adolescents. A mixed LMM can incorporate both of these expectations, if it has random effects in the transition model part and random effects plus the predictor depression in the measurement model equations for the adolescent’s behavior. We would expect that depressed adolescents are more likely to behave dysphorically or angrily and less likely to behave happily, controlling for family states. In other words, we expect to find stable differences in behavior between depressed and non-depressed adolescents that cannot be explained away by differences in family dynamics.

Some of the adolescents participated in the task with only one parent (42 participated with only their mother, and 4 with only their father). In our analyses, we only use the data for those families (N = 95) where two parents participated in the task, because it seems reasonable to expect that interactions between an adolescent and one parent may differ qualitatively from interactions between an adolescent and two parents, especially given the focus of the discussion task (namely, the upbringing of the adolescent). Furthermore, there may be differences between single-parent and two-parent families that would make it inappropriate to treat them as interchangeable “units” within the same multilevel model.⁴ The proportion of missing values for the selected families was 0.01%, 0.10%, and 0.32% for the adolescents, mothers, and fathers, respectively. These few missing values were handled in JAGS using Bayesian multiple imputation, in the same way as in the OMM discussed above.

⁴ Results for an analysis including all families are available on request from the corresponding author. There were small differences in the results (with one behavior probability differing by ., but most by less than .–.) which could, however, simply result from sampling variation and loss of data. Overall, the conclusions from the two analyses are similar.

Model

First, we compared LMMs with two and three states, in both cases with random effects in the transition model, but only fixed effects in the measurement model. As we discussed earlier, there is no straightforward statistical criterion to decide on the number of latent states in a Bayesian LMM, so we compared the estimates for the two models to determine which one offered the most useful substantive description of the data. We concluded that the three-state model made more sense, because the three states could be clearly interpreted as corresponding to positive, negative, and neutral family interactions, whereas the two-state model seemed to lump together neutral and negative interactions, making it less insightful. Hence, we focus here on the specifications and results for the three-state LMM.⁵

After deciding on the number of latent states, we added random effects in the measurement model for the adolescent’s behavior, to allow for differences in the behavior of the adolescents. These random effects are state-independent, implying that some adolescents are always more likely to act angrily than others, or that some are always more likely to act happily than others, and so on. We also included depression as a binary predictor in the measurement model for the adolescents, to reflect our hypothesis about stable differences in behavior between depressed and non-depressed adolescents.

Conditional on the latent states (s), the distribution for the observed data is given by

$$f(Mo, Fa, Ad \mid M, F, A, s) = \prod_{b=1}^{4} \prod_{n=1}^{N} \prod_{t=1}^{T} (M_{sb})^{[Mo_{tn}=b]} \cdot (F_{sb})^{[Fa_{tn}=b]} \cdot (A_{sbn})^{[Ad_{tn}=b]}, \tag{8}$$

where b stands for the behavior category, and N and T are the number of families and time points, respectively. The parameters $M_{sb}$, $F_{sb}$, and $A_{sbn}$ refer to the probabilities of behavior b given the current family state s; note that the probabilities $A_{sbn}$ are person-specific. The expressions within square brackets evaluate to 1 if they are true and to 0 otherwise. The full likelihood of the data and the latent family states together is given by

$$f(Mo, Fa, Ad, s \mid M, F, A, \pi, \pi_1) = f(Mo, Fa, Ad \mid M, F, A, s) \cdot f(s \mid \pi, \pi_1), \tag{9}$$

⁵ Results for the two-state model are available on request from the corresponding author.


where the distribution of the latent states, conditional on the transition probabilities ($\pi$) and the starting state probabilities ($\pi_1$), is specified as

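$$f(s \mid \pi, \pi_1) = \prod_{n=1}^{N} \prod_{i=1}^{3} \prod_{j=1}^{3} (\pi_{1j})^{[s_{1n}=j]} \prod_{t=2}^{T} (\pi_{ijn})^{[s_{tn}=j] \cdot [s_{(t-1)n}=i]}, \tag{10}$$

with

$$\pi_{ijn} = \frac{\exp(\alpha_{ijn})}{\sum_{s=1}^{3} \exp(\alpha_{isn})} \tag{11}$$

and

$$\alpha_{ijn} = \mu_{ij} + \varepsilon_{ijn} \tag{12}$$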

for each family n and each i, j ∈ {1, 2, 3}, such that $\mu_{ij}$ is the average logit for the transition from state i to state j, and $\varepsilon_{ijn}$ is family n’s deviation. For identification, we set $\mu_{ij} = \varepsilon_{ijn} = 0$ whenever i = j, with the result that staying in the same state is always the reference transition for the logits. Thus, only 3 · 2 = 6 means and deviations need to be estimated, and the six deviations have a multivariate normal distribution with a zero mean vector of length 6, and a 6 × 6 covariance matrix $\Sigma^{(\varepsilon)}$.
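As a minimal sketch of Equations 11 and 12, the following Python function turns the logits of one family into a transition probability matrix, with the diagonal logits fixed at zero so that staying in the same state is the reference transition. The function name and all numerical values are hypothetical and serve only as an illustration; they are not estimates from the analysis.

```python
import math

def transition_matrix(mu, eps):
    """Row-wise softmax of alpha[i][j] = mu[i][j] + eps[i][j] (Eqs. 11-12).

    Diagonal logits are forced to 0, which makes "staying in state i"
    the reference transition for every row.
    """
    n_states = len(mu)
    pi = []
    for i in range(n_states):
        alpha = [0.0 if i == j else mu[i][j] + eps[i][j] for j in range(n_states)]
        denom = sum(math.exp(a) for a in alpha)
        pi.append([math.exp(a) / denom for a in alpha])
    return pi

# Hypothetical average logits (mu) and one family's deviations (eps).
mu = [[0.0, -3.0, -5.0],
      [-3.0, 0.0, -3.5],
      [-5.0, -3.0, 0.0]]
eps = [[0.0, 0.4, -0.2],
       [-0.3, 0.0, 0.5],
       [0.1, -0.6, 0.0]]

pi_n = transition_matrix(mu, eps)  # pi_n[i][j]: probability of moving from state i+1 to j+1
```

Because every off-diagonal logit in this example is negative, each row assigns its largest probability to staying in the current state, which mirrors the pattern one would expect for short coding intervals.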

In the measurement model part, the behaviors (b) 1, 2, 3, and 4 represent angry, dysphoric, happy, and neutral behavior, respectively. The probabilities $A_{sbn}$ for the adolescents’ observed behavior are derived from person-specific logits, that is,

$$A_{sbn} = \frac{\exp(\alpha_{sbn})}{\sum_{c=1}^{4} \exp(\alpha_{scn})}. \tag{13}$$

These logit parameters are modeled as

$$\alpha_{sbn} = \mu_{sb} + \beta_b \cdot x_n + \varepsilon_{bn}, \tag{14}$$

for each state s and for each behavior b, where $\mu_{sb}$ represents the logit intercept and $\beta_b$ is the coefficient of the centered binary predictor depression ($x_n$), which is independent of the state, and $\varepsilon_{bn}$ represents the stable (state-independent) individual deviation in the logits for behavior b. To identify the measurement model part, we specify $\varepsilon_{4n} = \mu_{s4} = \beta_4 = 0$ for each family state s, making neutral behavior the reference category for the (log) odds. The three random deviations have a multivariate normal distribution with zero means and a 3 × 3 covariance matrix $\Sigma^{(beh)}$.
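The measurement model can be illustrated with a short Python sketch. It computes the adolescent's behavior probabilities from Equations 13 and 14, with neutral behavior as the reference category, and the log of a single factor of Equation 8 for one family at one time point. All names and numbers are hypothetical; in particular, the centered predictor values ±0.5 assume equally sized groups, which is not stated in the text.

```python
import math

def adolescent_probs(mu_s, beta, x_n, eps_n):
    """Behavior probabilities A_sbn for one adolescent in one state (Eqs. 13-14).

    mu_s, beta and eps_n hold the values for behaviors 1-3 (angry,
    dysphoric, happy); the logit of behavior 4 (neutral) is fixed at 0.
    """
    alpha = [mu_s[b] + beta[b] * x_n + eps_n[b] for b in range(3)] + [0.0]
    denom = sum(math.exp(a) for a in alpha)
    return [math.exp(a) / denom for a in alpha]

def log_lik_time_point(M_s, F_s, A_sn, mo, fa, ad):
    """Log of one factor of Eq. 8, given the family's current state.

    M_s, F_s, A_sn are length-4 probability lists for that state;
    mo, fa, ad are the observed behavior codes (1-4).
    """
    return math.log(M_s[mo - 1]) + math.log(F_s[fa - 1]) + math.log(A_sn[ad - 1])

# Hypothetical parameter values, for illustration only.
mu_s = [-1.5, -1.0, -2.5]   # logit intercepts in one state
beta = [0.2, 0.5, -0.1]     # depression coefficients
eps_n = [0.3, -0.2, 0.0]    # one adolescent's stable deviations

p_depressed = adolescent_probs(mu_s, beta, 0.5, eps_n)       # centered x_n for a depressed adolescent
p_non_depressed = adolescent_probs(mu_s, beta, -0.5, eps_n)  # centered x_n for a non-depressed adolescent
```

Summing `log_lik_time_point` over all families and time points, using the probabilities of the state each family occupies at each time point, would give the logarithm of Equation 8.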

Priors

An uninformative Dirichlet prior distribution with hyperparameters [1, 1, 1] was used for the initial state probabilities, and similarly, uninformative Dirichlet priors with hyperparameters [1, 1, 1, 1] were used for the mothers’ and fathers’ probabilities of exhibiting the four behaviors conditional on each latent family state. Similar to our specifications for the OMM, for the intercepts and regression coefficients (in both model parts), we specified independent Cauchy prior distributions with scale parameter (σ) 10 for the intercepts and 2.5 for the regression coefficients, and with location parameter 0 in all cases. We centered the predictor depression to have a mean of 0, but did not rescale it, in line with the advice of Gelman and colleagues (2008) concerning binary predictors. As we did in the OMM application, we first fitted a model with univariate random effects to avoid the risk of overestimating very small variances, and when we found that the variance estimates were all well over 0.1, we switched to a multivariate specification that allows the random effects within each model part to be correlated. The JAGS model syntax for the final model, that is, for the three-state mixed LMM with multivariate random effects, is provided in the Appendix.

Results

After a burn-in period of 10,000 iterations, we used 50,000 additional iterations and a thinning parameter of 5 (storing every fifth value). Based on the trace plots, the unimodal density plots, and a second chain using different (random) starting values, which arrived at a different state labeling but the same model results, we concluded that the convergence of the model was adequate. The labeling of the states appeared to remain constant over all iterations within a chain, which makes sense as the states were well separated. The results presented below are based on the 10,000 stored samples of the first chain.
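The number of stored samples follows directly from the sampling settings: 50,000 post-burn-in iterations with a thinning parameter of 5 leave 50,000 / 5 = 10,000 draws. The small Python sketch below makes this bookkeeping explicit. The function name is hypothetical, and the assumption that the first stored draw is the fifth iteration after burn-in is ours; the exact indexing used by JAGS may differ.

```python
def stored_iterations(n_burnin, n_sampling, thin):
    """Return the 1-based iteration numbers kept after burn-in and thinning.

    Assumes (for illustration) that every `thin`-th iteration after
    the burn-in period is stored.
    """
    return [n_burnin + k for k in range(thin, n_sampling + 1, thin)]

kept = stored_iterations(n_burnin=10_000, n_sampling=50_000, thin=5)
n_kept = len(kept)  # number of stored samples per chain
```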

To interpret the three family states, we look at the fixed parameter estimates for the measurement model part, which are given in Table 2. We see that the first state is the only one where there is a high likelihood of observing happy behavior (see the third column), so we refer to it as the positive state. The second state we interpret as the neutral state, because it is almost always characterized by neutral behavior of the parents, although the adolescents’ behavior varies a bit more. The third state has the highest probabilities of angry and dysphoric behavior of all three states, so it can be interpreted as a state of negative family interaction.

Table 2. Point estimates for the probabilities of different behavior types of each family member, conditional on the state the family is in. The probabilities for the average depressed and non-depressed adolescents are derived from the posterior medians of the average logits and the regression coefficients.

Behavior probabilities in state 1 (positive interaction)

Family member                       Angry   Dysphoric   Happy   Neutral
Average non-depressed adolescent    .       .           .       .
Average depressed adolescent        .       .           .       .
Mother                              .       .           .       .
Father                              .       .           .       .

Behavior probabilities in state 2 (neutral interaction)

Family member                       Angry   Dysphoric   Happy   Neutral
Average non-depressed adolescent    .       .           .       .
Average depressed adolescent        .       .           .       .
Mother                              < .     < .         < .     .
Father                              < .     .           < .     .

Behavior probabilities in state 3 (negative interaction)

Family member                       Angry   Dysphoric   Happy   Neutral
Average non-depressed adolescent    .       .           .       .
Average depressed adolescent        .       .           .       .
Mother                              .       .           < .     .
Father                              .       .           < .     .

Looking at the average transition probabilities, given in Table 3, it can be concluded that the probabilities of staying in the same state are quite high for all three states. This makes sense when considering the short coding intervals, because family interaction states will tend to last longer than a single second. We also note that the average transition probabilities, unsurprisingly, indicate that families are unlikely to switch directly between positive and negative interactions (states 1 and 3), that is, they are more likely to make this transition by moving through a neutral interaction (state 2) first. In addition to the average transition probabilities, we have the estimated variance terms from $\Sigma^{(\varepsilon)}$ that provide a measure of how much the individual families differ in their transition probabilities. These estimates range from 0.28 (for the transition from state 2 to 3) to 1.09 (for the transition from state 3 to 2), but it is difficult to gauge the extent of the differences in probabilities between families from these logit variances. Therefore, it may be more insightful to derive the transition probabilities for the individual families, for the transitions with the smallest logit variance (the transition from state 2 to 3) and the largest logit variance (the transition from state 3 to 2). For the first one, we find that the minimum and maximum estimated probabilities for a family to make this transition (i.e., for switching from the neutral to the negative interaction state) are 0.014 and 0.102, respectively. For the second one, the smallest estimated probability of making this transition (i.e., of switching from a negative to a neutral interaction) is 0.004, while the highest probability is 0.279. These numbers give an indication of how the estimated random logit distribution translates into varying transition probabilities across families. Clearly, there are non-negligible differences in interaction dynamics between the families.

Table 3. Point estimates for the fixed effects in the transition model part. For the sake of interpretability, we present the implied probabilities, not the logits themselves. The value in row i, column j represents the probability, for an average family, of transitioning from state i to j.

Average transition probabilities

Transition to:   State 1   State 2   State 3
State 1          .         .         .
State 2          .         .         .
State 3          .         .         .
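To give a rough feel for how a logit variance maps onto a spread in probabilities, the following Python sketch shifts one transition logit by two standard deviations in either direction and recomputes the implied probability. The two variances (0.28 and 1.09) are the estimates reported above; the mean logits, the function names, and the choice of ±2 SD are hypothetical. The sketch also holds the other logits in the row at their means, which ignores the correlations between random effects that the fitted model allows.

```python
import math

def row_probs(logits):
    """Softmax over one row of transition logits."""
    denom = sum(math.exp(a) for a in logits)
    return [math.exp(a) / denom for a in logits]

def prob_range(mean_logits, j, variance, k=2.0):
    """Probability of moving to state index j when its logit lies k SDs
    below and above the mean, with the other logits kept at their means."""
    sd = math.sqrt(variance)
    result = []
    for shift in (-k * sd, k * sd):
        logits = list(mean_logits)
        logits[j] += shift
        result.append(row_probs(logits)[j])
    return tuple(result)

# Hypothetical mean logits for one row; the reference (staying) logit is 0.
mean_row = [-3.0, 0.0, -3.0]

low_small, high_small = prob_range(mean_row, j=2, variance=0.28)  # smallest reported variance
low_large, high_large = prob_range(mean_row, j=2, variance=1.09)  # largest reported variance
```

With identical mean logits, the larger variance produces a visibly wider interval of transition probabilities, which is the qualitative point made in the text.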

To address our hypothesis about depressed and non-depressed adolescents, we inspect the behavior probabilities for the average adolescent in each group (based on the parameters $\mu_{sb}$ and $\beta_b$ of Equation 14), which are reported in Table 2. Although the differences in probabilities are modest, we see that depressed adolescents are, on average, more likely to act angrily or dysphorically than non-depressed adolescents. Another way of investigating differences between the two groups is to inspect the 95% CIs of the β regression coefficients. For example, the 95% CI for $\beta_1$ is [−0.32, 0.77], and since zero is included in this interval, we cannot conclude that the odds of angry versus neutral behavior were different for depressed adolescents. However, the conclusion is different for the odds of dysphoric versus neutral behavior, because the 95% CI for $\beta_2$ is [0.14, 0.94], indicating that depressed adolescents were more likely to exhibit dysphoria than non-depressed adolescents. The odds of happy versus neutral behavior were not clearly different between the two groups, with $\beta_3$ having a 95% CI of [−0.47, 0.33]. Note that depressed adolescents were less likely to behave neutrally, based on the model-implied probabilities given in Table 2, but we cannot use a 95% CI to evaluate this difference directly, as neutral behavior was the reference category for the odds. Furthermore, we note that there was substantial variation of the behavior probabilities across adolescents, on top of the differences that were predicted by depression. The three variance parameters of the measurement model logits were 2.21, 1.55, and 0.93, and these are quite large even in comparison to the largest values contained in the 95% CIs for the three β parameters described above. Therefore, our results indicate that depression only explained a modest part of the person-specific variability in behavior of the adolescents.
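Because the predictor was centered but not rescaled, a one-unit difference in $x_n$ corresponds to the contrast between depressed and non-depressed adolescents, so exponentiating a β coefficient gives an odds ratio relative to neutral behavior. Since the exponential function is monotone, the interval endpoints can be transformed directly. A minimal Python sketch, using the three 95% CIs reported above (the dictionary keys and function name are our own labels):

```python
import math

# 95% CIs of the logit-scale regression coefficients, as reported in the text.
ci_logit = {
    "angry vs. neutral (beta_1)": (-0.32, 0.77),
    "dysphoric vs. neutral (beta_2)": (0.14, 0.94),
    "happy vs. neutral (beta_3)": (-0.47, 0.33),
}

def odds_ratio_interval(lower, upper):
    """Transform a logit-scale interval into an odds-ratio interval."""
    return math.exp(lower), math.exp(upper)

ci_odds_ratio = {name: odds_ratio_interval(*ci) for name, ci in ci_logit.items()}

for name, (lo, hi) in ci_odds_ratio.items():
    excludes_one = lo > 1.0 or hi < 1.0
    print(f"{name}: OR interval [{lo:.2f}, {hi:.2f}], excludes 1: {excludes_one}")
```

On the odds-ratio scale, an interval that excludes 1 corresponds to a logit-scale interval that excludes 0. For dysphoric versus neutral behavior the interval is roughly 1.15 to 2.56, whereas the other two intervals contain 1.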

Model fit

To assess the fit of the model, we used posterior predictive checks similar to the first two checks we used for the OMM. Since the latent states in an LMM cannot be directly compared to “true” states in the data (as we did for the states in the OMM in Figure 8), the posterior predictive checks here focus on the observable outcomes predicted by the LMM and known for the empirical sample: the affective behavior of family members.

We first checked whether the model adequately describes the proportion of negative behavior for mothers, fathers, and adolescents, distinguishing between depressed and non-depressed adolescents. In a similar way as we did for the OMM, we used the samples from the posterior distribution of the fixed effects and the random effects mean and covariance matrix to generate new random effects from the model, and new “observed” behavior for simulated families based on those random effects, conditional on the depression (or non-depression) of the observed adolescents. In other words, we generated data that reflect what kind of behavior we would expect to observe if the model is true and if we observed different samples of families like the ones in the empirical study.
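The data-generating step of such a check can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used for the analysis: all simulated families share one hypothetical parameter set, whereas the actual check draws new random effects for every simulated family and uses a different posterior sample for each replicated data set. All names, parameter values, the number of families, and the series length are hypothetical.

```python
import random
import statistics

def draw_category(probs, rng):
    """Draw an index 0..len(probs)-1 with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def simulate_family(pi1, pi, emission, T, rng):
    """Simulate T latent states and one family member's behaviors."""
    states, behaviors = [], []
    s = draw_category(pi1, rng)
    for t in range(T):
        if t > 0:
            s = draw_category(pi[s], rng)
        states.append(s)
        behaviors.append(draw_category(emission[s], rng))
    return states, behaviors

def proportion_negative(behaviors):
    """Share of time points with angry (0) or dysphoric (1) behavior."""
    return sum(b in (0, 1) for b in behaviors) / len(behaviors)

def replicate_statistics(n_families, T, pi1, pi, emission, rng):
    """Mean and SD (over simulated persons) of the proportion of negative behavior."""
    props = [proportion_negative(simulate_family(pi1, pi, emission, T, rng)[1])
             for _ in range(n_families)]
    return statistics.mean(props), statistics.stdev(props)

# Hypothetical parameter values (states: 0 positive, 1 neutral, 2 negative).
pi1 = [1 / 3, 1 / 3, 1 / 3]
pi = [[0.95, 0.04, 0.01], [0.03, 0.94, 0.03], [0.01, 0.05, 0.94]]
emission = [[0.02, 0.03, 0.60, 0.35],   # columns: angry, dysphoric, happy, neutral
            [0.03, 0.07, 0.05, 0.85],
            [0.30, 0.30, 0.02, 0.38]]

rng = random.Random(1)
replicates = [replicate_statistics(20, 200, pi1, pi, emission, rng) for _ in range(50)]
```

The empirical mean and SD would then be compared with the distribution of the replicated means and SDs; a statistic that falls in the tails of that distribution points to misfit on that aspect of the data.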

Figure 9 shows the results for the posterior check of the mean and standard deviation (over persons) of the proportion of negative behavior, for each family member separately. Note that the distributions for the two groups of adolescents are more spread out because the model estimates for either group are based on less information than the estimates concerning the parents’ behavior. As the figure shows, the mean proportions of negative behavior for mothers, fathers, and both groups of adolescents are captured well by the model, because in each case the statistic for the empirical data falls in the middle part of the posterior predictive distribution. When it comes to the standard deviation, the model fit is adequate for the adolescents, indicating that the random effects allow the model to capture the individual differences between adolescents in how often they showed angry or dysphoric (negative) behavior. The standard deviation for the observed mothers and fathers was larger than expected under the posterior predictive distribution, indicating that the model underestimates the diversity between individual parents in how often they behaved negatively. This indicates that additional random effects for the behavior of the parents might have improved the model further, but note that this would make the model very complicated and possibly empirically underidentified.

Figure 9. Results of posterior predictive check for the LMM. The red lines represent the empirical mean and SD of the proportion of negative affective behavior. The histograms represent the model predictions, taking into account sampling variation as well as uncertainty about estimates.

Figure 10. Distribution of the proportion of negative affective behavior over persons, in the empirical data (in red) and in the model-predicted data (in gray).

Figure 10 shows the distribution of the proportion of negative behavior for individual people in the empirical sample (in red) and in the posterior predictive data (in gray), and this makes clear that the misfit for the mothers and fathers results from “extremes” in the empirical sample: parents who displayed negative behavior very rarely (if at all), or very often, while the model predicts more moderate proportions of negative parent behavior. This misfit may occur because some families spent more time in negative interaction states than expected under the model, or because some individual parents deviated from the expected behavior. Overall the fit of the model is adequate, but we have to be cautious in interpreting the results given that we know the model does not describe all families equally well. In an in-depth empirical study, a result like this could be a reason to look closer at the data, and possibly other characteristics, of the individual families who are not described as well by the model as others.


Figure 11. Results of posterior predictive check for the LMM. The red lines represent the empirical mean and SD of the proportion of happy family interactions (moments when at least two family members behaved happily). The histograms represent the model predictions, taking into account sampling variation as well as uncertainty about estimates.

For a second posterior predictive check of the model fit, we used the same posterior predictive data, but focused on the proportion of “happy family interactions.” We define these as interactions where at least two of the three family members exhibited happy behavior. Figure 11 shows the results for the mean and SD over families of the proportion of happy interactions. It appears that neither the mean nor the SD of the empirical sample is extreme under the model prediction, indicating adequate fit with regard to this data characteristic.
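The test statistic used in this second check is simple to compute from the coded behavior series. A minimal Python sketch, assuming the behavior coding introduced earlier (3 = happy) and three equally long series per family; the function name and the toy data are hypothetical:

```python
HAPPY = 3  # behavior code for happy behavior (1 angry, 2 dysphoric, 3 happy, 4 neutral)

def proportion_happy_interactions(mo, fa, ad):
    """Share of time points at which at least two of the three family
    members (mother, father, adolescent) show happy behavior."""
    if not (len(mo) == len(fa) == len(ad)):
        raise ValueError("all three behavior series must have the same length")
    hits = sum(((m == HAPPY) + (f == HAPPY) + (a == HAPPY)) >= 2
               for m, f, a in zip(mo, fa, ad))
    return hits / len(mo)

# Toy series of five time points for one family.
mo = [3, 3, 4, 1, 3]
fa = [3, 4, 4, 2, 3]
ad = [4, 3, 3, 1, 3]

prop_happy = proportion_happy_interactions(mo, fa, ad)  # 3 of 5 time points qualify -> 0.6
```

Applying the same function to the empirical families and to each simulated family yields the quantities whose mean and SD are compared in the check.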

If we consider the distribution for the individual families in Figure 12, we see that there is some misfit due to an underestimation of the number of families who rarely or never displayed happy interactions according to our criterion. Similar to the first check, we can interpret these results as showing that the model does a good job of describing the average family in the sample and of capturing a large part of the inter-family differences, but it does not do a perfect job of describing the most “extreme” families in the sample. Therefore, our results should be interpreted with caution, and in an empirical study, it would make sense to further examine what other variables might be related to such differences.

Figure 12. Distribution of the proportion of happy interactions for empirical (in red) or model-predicted (in gray) families. In the empirical data, there were more families at the left end of the distribution, i.e., families where it rarely (or never) occurred that two family members behaved happily at the same time.

Conclusion

Our analysis has illustrated that by applying a mixed LMM to this problem, we can investigate inter-family and inter-individual differences in a single analysis, by allowing for random effects in the transition model and in the (adolescent’s) measurement model. In this analysis, we used a predictor variable in one part of the model, but a nice feature of the LMM is that predictors can potentially be used in both model parts, to represent diverse psychological accounts of how other characteristics relate to the process dynamics.

Substantively, our results indicated that depressed adolescents are more likely to behave angrily or dysphorically, compared with non-depressed adolescents. These differences cannot simply be attributed to possible different family dynamics, as those were to a large extent taken into account by the random effects in the transition part of our LMM. In conclusion, this example shows that an LMM can be used to address various, quite specific hypotheses using ILD, by offering the chance to study the temporal dynamics of a latent process while allowing for between-person (or between-unit) variation of several kinds. We also illustrated how posterior predictive checks can be used in the Bayesian model fitting context to investigate how well the model captures particular aspects of the empirical data, indicating where and how there is misfit of the model to the data.

Label switching was not an issue for our LMM application. The main practical challenges of the (mixed) LMM framework that we encountered in this analysis were the lack of an accessible, formal procedure for choosing the number of latent states empirically in the Bayesian context, and the computational intensity that made the analysis time consuming (i.e., taking several days), but we note that the latter issue is heavily dependent on the size of the data set and on the complexity of a given model formulation. It can be expected that further technological developments will increase computing speed so that the computational demand of Bayesian estimation of mixed LMMs will become less relevant for future applications. Furthermore, JAGS as well as other software packages that can implement (some) Bayesian models, such as Mplus (Muthén & Muthén, 1998–2015), are constantly being developed so that more efficient estimation algorithms for Markov models, or even RJMCMC implementations for determining the number of states in an LMM, are expected to become available and easily accessible.⁶ There is also ongoing research in the area of Bayesian model evaluation and comparison which leads to promising new approaches for assessing model fit from various perspectives (e.g., Vehtari, Gelman, & Gabry, 2017).

⁶ Stan (Carpenter et al., 2017) is another recently developed Bayesian language that can be used with R, but it cannot fit models with discrete parameters, such as latent Markov models or other mixture models.

Discussion

In this paper, we set out to introduce the relatively unknown class of Markov models to psychological researchers, and to provide an introduction to how mixed Markov models may be used for the analysis of ILD. We have discussed how mixed Markov models can be specified, and have argued that while frequentist estimation has been used for these and similar models, Bayesian estimation has several advantages that are relevant in this research context. Two empirical applications demonstrated how mixed Markov models can be applied in a Bayesian framework and what kind of research questions they can address. Note that these analyses served as basic illustrations of the modeling approach, and did not involve in-depth or formal consideration of alternative models, or follow-up analyses of individual time series.

Markov models appear to be very flexible and can suit different types of data and research questions in the context of studying temporal processes. Mixed Markov models are especially promising, in our view, because they can be used to study individual differences in dynamics through the inclusion of person-specific random effects. In this paper, we discussed that the implementation of some Markov models, namely LMMs and especially mixed LMMs, can be computationally demanding and that the Bayesian approach, while advantageous in some regards, does not offer a straightforward way to choose the number of latent states empirically. Despite these limitations, this modeling framework provided useful insights for both of our empirical data sets, and we feel that it deserves more consideration in intensive longitudinal research in psychology. We hope that this paper contributes to (further) familiarizing psychological researchers with the possibilities offered by Markov modeling.

At this point, we consider a few limitations of the presented framework along with possibilities for adapting the general approach to different research situations. First, in the models that we presented, the state transition probabilities were constant over time. This assumption makes sense in the context of ILD analysis, where the focus is typically on short-term reversible variability in stationary processes, rather than on an expected pattern of growth or decline. Note, however, that it is possible to relax this model assumption in situations where the parameters describing the process are expected to change over time, or where a continuous observed variable shows a time trend above and beyond the fluctuations explained by state switching. The specification of time-varying parameters is covered in Bartolucci et al. (2013), along with other possibilities such as specifying a measurement model for ordinal observed categories, or dealing with the scenario where a specific state transition is theoretically impossible. MacDonald and Zucchini (2009) also present several variations of LMM applications. While allowing the same model parameter to be different for each person and each time point necessarily leads to an unidentified model, it should be possible to combine some degree of time heterogeneity, which could involve a time-varying predictor, with person-specific random effects or with discrete components as in the mixture Markov model. As a more general matter, it should be noted that the Markov models treated in this paper represent but one instance of a broad and diverse class of modeling approaches and that Markov state switching can also be combined with other models, such as autoregressive models for ILD, to allow for time-varying parameters in those models. For example, Markov-switching vector-autoregressive models are treated in detail in Fiorentini, Planas, and Rossi (2016), and a Bayesian approach for such models is discussed in Harris (1997). General overviews of various Markov switching and state-space models are given in Frühwirth-Schnatter (2006) and Kim and Nelson (1999).

A second consideration is that the Markov models discussed in this paper come with the assumption of equally spaced measurements. This assumption will be violated when data is collected using the Experience Sampling Method (ESM; Larson & Csikszentmihalyi, 1983), where the time intervals between measurements have varying lengths. If a discrete-time Markov model is fitted to such a data set, it can be expected that the estimates of the transition probabilities are somewhat more noisy and that, consequently, there is less power to detect covariate effects. A better alternative for unequally spaced measurements may be to switch to a continuous-time formulation of the Markov model, where the temporal structure of the state-switching process is captured in a transition intensity matrix instead of a probability matrix (cf. Karlin & Taylor, 1975). Continuous-time Markov models can be implemented in JAGS using the inbuilt dmstate distribution (Plummer, 2013), or with classical estimation using the R package msm (Jackson, 2013), but the latter cannot (yet) handle random effects in the transition model.
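To illustrate how a transition intensity matrix handles unequal spacing, the sketch below uses the simplest case of a two-state continuous-time Markov chain, for which the transition probabilities over an interval of length Δt have a well-known closed form. The two-state restriction, the function name, and the intensity values are our own simplifying assumptions; the three-state application in this paper would require a matrix exponential of the intensity matrix instead.

```python
import math

def two_state_transition_probs(q12, q21, dt):
    """Transition probability matrix over an interval of length dt for a
    two-state continuous-time Markov chain with intensities q12 (1 -> 2)
    and q21 (2 -> 1); requires q12 + q21 > 0."""
    total = q12 + q21
    decay = math.exp(-total * dt)
    p12 = q12 / total * (1.0 - decay)
    p21 = q21 / total * (1.0 - decay)
    return [[1.0 - p12, p12],
            [p21, 1.0 - p21]]

# Hypothetical intensities (per hour) and unequal measurement intervals (hours).
q12, q21 = 0.2, 0.5
intervals = [0.5, 2.0, 1.25]
matrices = [two_state_transition_probs(q12, q21, dt) for dt in intervals]
```

The same two intensities generate a different transition probability matrix for every interval length, so unequally spaced measurements need no special treatment beyond supplying each Δt.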

An ongoing discussion in longitudinal research, more generally, focuses on how to determine an optimal time lag between measurements for modeling particular relationships (for example, Reichardt, 2011; Dormann & Griffin, 2015). When the time intervals between measurements are short, there could be relationships between measurements farther apart, and this could be investigated by using higher order models, where the current state is modeled as depending on multiple preceding states (MacDonald & Zucchini, 2009). When measurement intervals are outside an optimal range (whether too short or too long), relationships or lagged effects may be obscured or spuriously induced, so it is important to consider in the design phase of a study how quickly and how often the state switches and effects of interest are expected to occur. For instance, while mood fluctuations may take place over hours or days, behavioral observation needs to use a suitably fine scale to capture real-time interactions. In the context of the optimal lag problem, it is again relevant to note that continuous-time modeling has been proposed as a way to avoid the lag-dependency inherent in all discrete-time modeling of equally spaced data (Deboeck & Preacher, 2016).
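The idea of a higher order model can be made concrete with a small Python sketch for an observed (not latent) state sequence, in which the probability of the current state is conditioned on the two preceding states. The estimator below uses plain relative frequencies from a single sequence, which is a deliberately simple stand-in and not the Bayesian mixed-model estimation used in this paper; the function name and the toy sequence are hypothetical.

```python
from collections import defaultdict

def second_order_estimates(states):
    """Estimate P(s_t | s_{t-2}, s_{t-1}) by relative frequencies
    in one observed state sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev2, prev1, cur in zip(states, states[1:], states[2:]):
        counts[(prev2, prev1)][cur] += 1
    probs = {}
    for history, row in counts.items():
        total = sum(row.values())
        probs[history] = {cur: c / total for cur, c in row.items()}
    return probs

# Toy sequence of observed states for one person.
sequence = [1, 1, 2, 1, 1, 2, 2, 1, 1, 1]
second_order = second_order_estimates(sequence)
```

Equivalently, a second-order chain can be viewed as a first-order chain on pairs of consecutive states, which is why the number of transition parameters grows quickly with the order of the model.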

In conclusion, although there are some unresolved questions, we believe that intensive longitudinal methods combined with Markov modeling techniques can be of great value for psychological research, as they enable us to study the dynamics of state-switching processes and to shed light on between-person differences in these dynamics. We expect that further developments in software and computing power, as well as in statistical approaches to Bayesian model evaluation, will contribute to solving current challenges in applying Bayesian mixed LMMs and facilitate easier implementation of Markov models.

Article Information

Conflict of Interest Disclosures: Each author signed a form for disclosure of potential conflicts of interest. No authors reported any financial or other conflicts of interest in relation to the work described.

Ethical Principles: The authors affirm having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: This work was supported by Grant VIDI 452-10-007 from the Netherlands Organization for Scientific Research, and the empirical data sets used in this study had been previously collected with the support of grants 1 R01 AG023571-A1-01 from the National Institute on Aging and NIMH R01 MH65340 from the National Institute of Mental Health.

Role of the Funders/Sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgments: The authors would like to thank the editor and the reviewers at Multivariate Behavioral Research for their comments on prior versions of this manuscript. The ideas and opinions expressed herein are those of the authors alone, and endorsement by the authors’ institutions or the Netherlands Organization for Scientific Research or the National Institute on Aging or the National Institute of Mental Health is not intended and should not be inferred.

ORCID

P. Kuppens http://orcid.org/0000-0002-2363-2356

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. doi: 10.1109/TAC.1974.1100705

Albert, J. H., & Chib, S. (1993). Bayes inference via Gibbs sampling of autoregressive time series subject to Markov mean and variance shifts. Journal of Business & Economic Statistics, 11(1), 1–15. doi: 10.1080/07350015.1993.10509929

Altman, R. M. K. (2007). Mixed hidden Markov models. Journal of the American Statistical Association, 102(477), 201–210. doi: 10.1198/016214506000001086

Bacci, S., Pandolfi, S., & Pennoni, F. (2014). A comparison of some criteria for states selection in the latent Markov model for longitudinal data. Advances in Data Analysis and Classification, 8(2), 125–145. doi: 10.1007/s11634-013-0154-2

Bartolucci, F., Farcomeni, A., & Pennoni, F. (2013). Latent Markov models for longitudinal data. Boca Raton, FL: Chapman & Hall/CRC.

Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes. Psychological Methods, 8(3), 338. doi: 10.1037/1082-989X.8.3.338

Bergeman, C. S., & Deboeck, P. R. (2014). Trait stress resistance and dynamic stress dissipation on health and well-being: The reservoir model. Research in Human Development, 11(2), 108–125. doi: 10.1080/15427609.2014.906736

Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M. A., Guo, J., Li, P., & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1–32. doi: 10.18637/jss.v076.i01

Celeux, G. (1998). Bayesian inference for mixture: The label switching problem. In R. Payne & P. Green (Eds.), Compstat: Proceedings in Computational Statistics 13th Symposium held in Bristol, Great Britain, 1998 (pp. 227–232). Heidelberg: Physica-Verlag HD.

Celeux, G., Forbes, F., Robert, C. P., & Titterington, D. M. (2006). Deviance information criteria for missing data models. Bayesian Analysis, 1(4), 651–673. doi: 10.1214/06-BA122

Celeux, G., Hurn, M., & Robert, C. P. (2000). Computational and inferential difficulties with mixture posterior distributions. Journal of the American Statistical Association, 95(451), 957–970. doi: 10.1080/01621459.2000.10474285

Crayen, C., Eid, M., Lischetzke, T., Courvoisier, D. S., & Vermunt, J. K. (2012). Exploring dynamics in mood regulation – Mixture Latent Markov modeling of ambulatory assessment data. Psychosomatic Medicine, 74(4), 366–376. doi: 10.1097/PSY.0b013e31825474cb

de Haan-Rietdijk, S., Gottman, J. M., Bergeman, C. S., & Hamaker, E. L. (2016). Get over it! A multilevel threshold autoregressive model for state-dependent affect regulation. Psychometrika, 81, 217–241. doi: 10.1007/s11336-014-9417-x

Deboeck, P. R., & Preacher, K. J. (2016). No need to be discrete: A method for continuous time mediation analysis. Structural Equation Modeling: A Multidisciplinary Journal, 23(1), 61–75. doi: 10.1080/10705511.2014.973960

DeSantis, S. M., & Bandyopadhyay, D. (2011). Hidden Markov models for zero-inflated Poisson counts with an application to substance use. Statistics in Medicine, 30(14), 1678–1694. doi: 10.1002/sim.4207

Dormann, C., & Griffin, M. A. (2015). Optimal time lags in panel studies. Psychological Methods, 20(4), 489–505. doi: 10.1037/met0000041

Fiorentini, G., Planas, C., & Rossi, A. (2016). Skewness and kurtosis of multivariate Markov-switching processes. Computational Statistics & Data Analysis, 100, 153–159. doi: 10.1016/j.csda.2015.06.009

Frühwirth-Schnatter, S. (2006). Finite mixture and Markov switching models. New York, NY: Springer Science and Business Media.

Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis, 1(3), 515–534. doi: 10.1214/06-BA117A

Gelman, A., Jakulin, A., Pittau, M. G., & Su, Y. S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360–1383. doi: 10.2307/30245139

Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4), 711–732. doi: 10.2307/2337340

Hamaker, E. L., Ceulemans, E., Grasman, R. P. P. P., & Tuerlinckx, F. (2015). Modeling affect dynamics: State of the art and future challenges. Emotion Review, 7(4), 316–322. doi: 10.1177/1754073915590619

Hamaker, E. L., Grasman, R. P. P. P., & Kamphuis, J. H. (2016). Modeling BAS dysregulation in bipolar disorder: Illustrating the potential of time series analysis. Assessment, 23(4), 436–446. doi: 10.1177/1073191116632339

Harris, G. R. (1997). Regime switching vector autoregressions: A Bayesian Markov chain Monte Carlo approach. Proceedings of the 7th International AFIR Colloquium, Cairns, Australia, vol. 1, pp. 421–451.

Haslam, N., Holland, E., & Kuppens, P. (2012). Categories versus dimensions in personality and psychopathology: A quantitative review of taxometric research. Psychological Medicine, 42(5), 903–920. doi: 10.1017/S0033291711001966

Hops, H., Biglan, A., Tolman, A., Arthur, J., & Longoria, N. (1995). Living in family environments (Life) coding system: Manual for coders (revised). Eugene, OR: Oregon Research Institute.

Hox, J. J. (2010). Multilevel analysis. Techniques and applications (2nd ed.). Oxford: Routledge.

Humphreys, K. (1998). The latent Markov chain with multivariate random effects. Sociological Methods & Research, 26(3), 269–299. doi: 10.1177/0049124198026003001

Jackson, C. H. (2014). Multi-state modelling with R: The msm package. Version 1.3.2. [Computer software manual]. Retrieved from https://cran.r-project.org/package=msm

Jackson, J. C., Albert, P. S., & Zhang, Z. (2015). A two-state mixed hidden Markov model for risky teenage driving behavior. The Annals of Applied Statistics, 9(2), 849–865. doi: 10.1214/14-AOAS765

Jasra, A., Holmes, C. C., & Stephens, D. A. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statistical Science, 20(1), 50–67. doi: 10.2307/20061160

Kaplan, D. (2008). An overview of Markov chain methods for the study of stage-sequential developmental processes. Developmental Psychology, 44(2), 457–467. doi: 10.1037/0012-1649.44.2.457

Karlin, S., & Taylor, H. M. (1975). A first course in stochastic processes. New York, USA: Academic Press.

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430), 773–795. doi: 10.1080/01621459.1995.10476572

Kim, C.-J., & Nelson, C. R. (1999). State-space models with regime switching: Classical and Gibbs-sampling approaches with applications (2nd ed.). Cambridge, MA: MIT Press.

Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Cambridge, MA: Academic Press.

Kuppens, P., Allen, N. B., & Sheeber, L. B. (2010). Emotional inertia and psychological maladjustment. Psychological Science, 21(7), 984–991. doi: 10.1177/0956797610372634

Langeheine, R., & Van de Pol, F. (2002). Latent Markov chains. In J. A. Hagenaars & A. L. McCutcheon (Eds.), Applied latent class analysis (pp. 304–341). Cambridge: Cambridge University Press.

Larson, R., & Csikszentmihalyi, M. (1983). The experience sampling method. In H. T. Reis (Ed.), Naturalistic approaches to studying social interaction. New directions for methodology of social and behavioral science, Vol. 15. San Francisco: Jossey-Bass, 41–56.

Lunn, D., Spiegelhalter, D., Thomas, A., & Best, N. (2009). The BUGS project: Evolution, critique and future directions. Statistics in Medicine, 28(25), 3049–3067. doi: 10.1002/sim.3680

Lynch, S. M. (2007). Introduction to applied Bayesian statistics and estimation for social scientists. New York, NY: Springer Verlag.

MacDonald, I. L., & Zucchini, W. (2009). Hidden Markov models for time series: An introduction using R. Boca Raton, FL: Chapman & Hall/CRC.

Maruotti, A. (2011). Mixed hidden Markov models for longitudinal data: An overview. International Statistical Review, 79(3), 427–454. doi: 10.1111/j.1751-5823.2011.00160.x

Maruotti, A., & Rocci, R. (2012). A mixed non-homogeneous hidden Markov model for categorical data, with application to alcohol consumption. Statistics in Medicine, 31(9), 871–886. doi: 10.1002/sim.4478

Maruotti, A., & Rydén, T. (2009). A semiparametric approach to hidden Markov models under longitudinal observations. Statistics and Computing, 19(4), 381–393. doi: 10.1007/s11222-008-9099-2

Muthén, L. K., & Muthén, B. O. (1998–2015). Mplus user’s guide (7th ed.), [Computer software manual]. Los Angeles, CA. Retrieved from https://www.statmodel.com/download/usersguide/Mplus%20user%20guide%20Ver_7_r6_web.pdf.

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14(4), 535–569. doi: 10.1080/10705510701575396

Plummer, M. (2013a). JAGS Version 3.4.0 user manual [Computer software manual]. Retrieved from http://mcmc-jags.sourceforge.net/.

Plummer, M. (2013b). Package ‘rjags’. The Comprehensive R Archive Network [Computer software manual]. Retrieved from http://cran.r-project.org/.

Prisciandaro, J. J., DeSantis, S. M., Chiuzan, C., Brown, D. G., Brady, K. T., & Tolliver, B. K. (2012). Impact of depressive symptoms on future alcohol use in patients with co-occurring bipolar disorder and alcohol dependence: A prospective analysis in an 8-week randomized controlled trial of acamprosate. Alcoholism: Clinical and Experimental Research, 36(3), 490–496. doi: 10.1111/j.1530-0277.2011.01645.x

R Development Core Team (2012). R: A Language and Environment for Statistical Computing [Computer software manual]. Vienna, Austria. Retrieved from http://www.R-project.org/.

Reichardt, C. S. (2011). Commentary: Are three waves of data sufficient for assessing mediation? Multivariate Behavioral Research, 46(5), 842–851. doi: 10.1080/00273171.2011.606740

Richardson, S., & Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59(4), 731–792. doi: 10.1111/1467-9868.00095

Rijmen, F., Ip, E. H., Rapp, S., & Shaw, E. G. (2008a). Qualitative longitudinal analysis of symptoms in patients with primary and metastatic brain tumours. Journal of the Royal Statistical Society: Series A (Statistics in Society), 171(3), 739–753. doi: 10.1111/j.1467-985X.2008.00529.x

Rijmen, F., Vansteelandt, K., & De Boeck, P. (2008b). Latent class models for diary method data: Parameter estimation by local computations. Psychometrika, 73(2), 167–182. doi: 10.1007/s11336-007-9001-8

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7(2), 147–177. doi: 10.1037/1082-989X.7.2.147

Schuurman, N. K., Grasman, R. P. P. P., & Hamaker, E. L. (2016). A comparison of Inverse-Wishart prior specifications for covariance matrices in multilevel autoregressive models. Multivariate Behavioral Research, 51(2–3), 185–206. doi: 10.1080/00273171.2015.1065398

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464. doi: 10.1214/aos/1176344136

Seltman, H. J. (2002). Hidden Markov models for analysis of biological rhythm data. In C. Gatsonis, R. E. Kass, B. Carlin, A. Carriquiry, A. Gelman, I. Verdinelli, & M. West (Eds.), Case studies in Bayesian statistics (Vol. V, pp. 397–405). New York, NY: Springer.

Sheeber, L. B., Allen, N. B., Leve, C., Davis, B., Shortt, J. W., & Katz, L. F. (2009). Dynamics of affective experience and behavior in depressed adolescents. Journal of Child Psychology and Psychiatry, 50(11), 1419–1427. doi: 10.1111/j.1469-7610.2009.02148.x

Shirley, K. E., Small, D. S., Lynch, K. G., Maisto, S. A., & Oslin, D. W. (2010). Hidden Markov models for alcoholism treatment trial data. The Annals of Applied Statistics, 4(1), 366–395. doi: 10.2307/27801591

Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society. Series B, Statistical Methodology, 64(4), 583–639. doi: 10.1111/1467-9868.00353

Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 62(4), 795–809. doi: 10.1111/1467-9868.00265

Van der Maas, H. L., & Molenaar, P. C. (1992). Stagewise cognitive development: An application of catastrophe theory. Psychological Review, 99(3), 395–417. doi: 10.1037/0033-295X.99.3.395

Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. doi: 10.1007/s11222-016-9696-4

Vermunt, J. K., Langeheine, R., & Bockenholt, U. (1999). Discrete-time discrete-state latent Markov models with time-constant and time-varying covariates. Journal of Educational and Behavioral Statistics, 24(2), 179–207. doi: 10.3102/10769986024002179

Visser, I. (2011). Seven things to remember about hidden Markov models: A tutorial on Markovian models for time series. Journal of Mathematical Psychology, 55(6), 403–415. doi: 10.1016/j.jmp.2011.08.002

Wagenmakers, E.-J., Farrell, S., & Ratcliff, R. (2004). Estimation and interpretation of 1/f^α noise in human cognition. Psychonomic Bulletin & Review, 11(4), 579–615. doi: 10.3758/BF03196615

Wall, M. M., & Li, R. (2009). Multiple indicator hidden Markov model with an application to medical utilization data. Statistics in Medicine, 28(2), 293–310. doi: 10.1002/sim.3463

Walls, T. A., & Schafer, J. L. (2006). Models for intensive longitudinal data. USA: Oxford University Press.

Warren, K., Hawkins, R. C., & Sprott, J. C. (2003). Substance abuse as a dynamical disease: Evidence and clinical implications of nonlinearity in a time series of daily alcohol consumption. Addictive Behaviors, 28(2), 369–374. doi: 10.1016/S0306-4603(01)00234-9

Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063–1070. doi: 10.1037/0022-3514.54.6.1063

Whitehead, B. R., & Bergeman, C. S. (2014). Ups and downs of daily life: Age effects on the impact of daily appraisal variability on depressive symptoms. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 69(3), 387–396. doi: 10.1093/geronb/gbt019

Zhang, Q., Snow Jones, A., Rijmen, F., & Ip, E. H. (2010). Multivariate discrete hidden Markov models for domain-based measurements and assessment of risk factors in child development. Journal of Computational and Graphical Statistics, 19(3), 746–765. doi: 10.1198/jcgs.2010.09015
