
THEORY TESTING: COMBINING PSYCHOMETRIC META-ANALYSIS AND STRUCTURAL EQUATIONS MODELING

PERSONNEL PSYCHOLOGY 1995, 48


CHOCKALINGAM VISWESVARAN
Florida International University

DENIZ S. ONES
Department of Management

University of Houston

This paper presents an overview of a useful approach for theory testing in the social sciences that combines the principles of psychometric meta-analysis and structural equations modeling. In this approach to theory testing, the estimated true score correlations between the constructs of interest are established through the application of meta-analysis (Hunter & Schmidt, 1990), and structural equations modeling is then applied to the matrix of estimated true score correlations. The potential advantages and limitations of this approach are presented. The approach enables researchers to test complex theories involving several constructs that cannot all be measured in a single study. Decision points are identified, the options available to a researcher are enumerated, and the potential problems as well as the prospects of each are discussed.

Over the years the importance of theory testing has been increasingly emphasized (e.g., Campbell, 1990; Schmidt, 1992; Schmitt & Landy, 1993). This is consistent with the prediction of Schmidt and Kaplan (1971) that as a nascent field matures, scientists, unencumbered by the need to constantly prove the value of their profession to the general society and in the pantheon of sciences, devote more attention to the explanation of the processes underlying the observed relationships and engage more frequently in explicitly articulating the theories that guide their practice. To explicate the underlying processes and theories which

Both authors contributed equally; order of authorship is arbitrary. An earlier version of this paper was presented in J. S. Phillips (Chair), Some problems and innovative solutions in structural equations modeling used for management theory building. Symposium conducted at the 54th annual meeting of the Academy of Management, Dallas, TX. We thank Frank Schmidt for his collaboration on an earlier manuscript. We also acknowledge Jack Hunter for his pioneering work on combining meta-analysis and path analysis. We thankfully acknowledge the contributions of three anonymous reviewers; this manuscript has greatly benefited from their extensive comments.

Correspondence and requests for reprints should be addressed to Chockalingam Viswesvaran, Department of Psychology, Florida International University, Miami, FL 33199 or e-mail at: [email protected].

COPYRIGHT © 1995 PERSONNEL PSYCHOLOGY, INC.



guide a scientific field of inquiry, scientists need to: (a) specify the important constructs, (b) specify the relationships between the constructs of interest, (c) develop operational measures for the constructs, (d) collect data, (e) estimate the covariation across the measures, and (f) empirically test whether the specified model fits the collected data.

The purpose of this paper is to describe a heuristic framework for theory testing that combines the principles of psychometric meta-analysis and structural equations modeling. The suggested procedure has emerged over the past 10 years, and our objective in this paper is to provide a presentation of the ideas and issues in one place. We outline the general steps involved in the suggested procedure of combining psychometric meta-analysis and structural equations modeling, discuss some potential problems and pitfalls to avoid in implementing this procedure to test theories, and clarify some popular misconceptions about the proposed framework.

A major advantage of combining psychometric meta-analysis and structural equations modeling in theory testing is that not all relationships specified by a theory need to be included in each primary study. For example, 10 studies might report the relationship between two constructs A and B; 10 other studies could report the relationship between B and C; and 5 other studies could report the correlations between A and D, B and D, and C and D. The true score correlations between A, B, C, and D can then be meta-analytically estimated and used to test a theory involving all four constructs, although no individual study has included all four constructs.
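The pooling of scattered study results into a single matrix can be sketched in a few lines of code. The study records, sample sizes, and correlations below are entirely hypothetical, and only bare sample-size weighting is shown; the psychometric artifact corrections central to Hunter and Schmidt's method are omitted for brevity.

```python
import numpy as np

# Hypothetical study records: (construct pair, sample size, observed r).
# Note that no single study includes all four constructs A, B, C, D.
studies = [
    ("AB", 120, 0.42), ("AB", 80, 0.38),
    ("BC", 150, 0.51), ("BC", 60, 0.47),
    ("AD", 100, 0.25), ("BD", 100, 0.33), ("CD", 100, 0.29),
    ("AC", 90, 0.36),
]

names = "ABCD"
R = np.eye(4)
for pair in {s[0] for s in studies}:
    ns_rs = [(n, r) for p, n, r in studies if p == pair]
    # Sample-size-weighted mean correlation for this construct pair
    # (corrections for unreliability and other artifacts omitted)
    rbar = sum(n * r for n, r in ns_rs) / sum(n for n, _ in ns_rs)
    i, j = names.index(pair[0]), names.index(pair[1])
    R[i, j] = R[j, i] = round(rbar, 3)
print(R)
```

The resulting symmetric matrix can then serve as input to structural equations modeling even though its cells come from non-overlapping samples.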

There have been some previous studies in the literature, albeit too intermittent given the potential we see for this approach, that have combined meta-analysis and path analysis to test substantive theories. Hunter (1983) employed this approach to test the relationships among cognitive ability, job knowledge, and overall job performance. This framework was also used by Schmidt, Hunter, and Outerbridge (1986) to test a causal model involving general mental ability, experience, and supervisory ratings of performance. Premack and Hunter (1988) tested a theory of voting behavior, cumulating results across 14 studies, and then using the matrix of estimated true score correlations in path analysis. Hom, Caranikas-Walker, Prussia, and Griffeth (1992) used this strategy to test alternate models of withdrawal behaviors. Peters, Hartke, and Pohlmann (1985) used the approach to test a contingency theory of leadership. Brown and Peterson (1993) used this approach to test the antecedents and consequences of job satisfaction in sales personnel. Ones (1993) used this approach to establish the construct validity of personality and integrity measures, whereas Viswesvaran (1993) employed this approach to test a theory of job performance.


TABLE 1
Summary of Steps in the Proposed Approach for Theory Testing

Measurement Model
1. Identify important constructs and relationships
2. Identify different measures used to operationalize each construct
3. Obtain all studies reporting either (a) correlations between conceptually distinct operational measures or (b) artifact information on any of the conceptually distinct operational measures (identified in Step 2)
4. Conduct psychometric meta-analyses and estimate the true score correlations between the measures (identified in Step 2)
5. Use factor analysis to test the measurement model

Causal Model
6. Estimate the correlations between the constructs (forming composites of the different operationalizations of the same construct)
7. Use path analysis with the estimated true score correlations to test the proposed theory

These examples, however, have not comprehensively articulated a clear and coherent set of principles that researchers interested in combining psychometric meta-analysis and structural equations modeling should take into consideration. The various principles and issues involved are scattered in different articles. There is a need to bring together the issues and principles in one source. Further, there are some issues (e.g., empty cells, sample sizes varying across cells, ill-defined matrices, the communalities to use in the diagonals) that have never been discussed in the early work combining psychometric meta-analysis and structural equations modeling to test theories. This paper is the first to address these issues.

In the following sections, we first turn to a discussion of the steps involved in the framework combining psychometric meta-analysis and structural equations modeling when the aim is to test substantive theories. We provide illustrative examples of each step from the literature for the suggested procedure. Finally, we analyze the potential problems and limitations that can be encountered in applying this procedure in theory testing, and explore the options available to researchers to resolve them.

Steps Involved in the Suggested Procedure

The general steps involved in the proposed procedure for theory testing are summarized in Table 1. The first step is to identify the different constructs to be included in the theory being tested. The second step is to identify the different operationalizations used in the literature for each construct identified in the first step. The third step is to locate all individual studies that report either a correlation among and between any of the different operationalizations or artifact information on the operational


measures identified in the second step. The fourth step is to use psychometric meta-analysis to estimate the true score correlations between the different operational measures identified in the second step. The fifth step is to use factor analysis to test the measurement model for each construct identified in the first step (using the true score correlations estimated between the operational measures identified for that construct in the second step). Assuming the measurement model is found adequate, the sixth step is to estimate the true score correlations between the constructs using composites of the different operationalizations for each construct. The seventh and final step is to use path analysis to test theories and explain the estimated true score correlations among the constructs identified in the first step.

Identifying Constructs and Operational Measures for Each Construct (Steps 1 & 2)

The first step involves the identification of relevant constructs and the hypothesized relationships among them. Although this step appears to be similar to those found in traditional approaches to theory testing, an important advantage is that the researcher is not constrained by the constructs for which he or she can collect data in a primary study. The ability of meta-analysis to synthesize data across studies has the potential to enable the researcher to include many more constructs. This translates to an ability to test complex and interrelated theories involving many constructs, and has the promise of providing a fuller and more complete understanding of the phenomenon under investigation.

The second step is to identify for each construct all operational measures that have been used by previous researchers. When results are cumulated across studies, care should be taken to specify conceptually distinct constructs and conceptually distinct operational measures of each construct. Identifying the different constructs, and different operationalizations of each of those constructs, has to be guided by theory. Up to a point, it can be proposed that the more fine-grained, narrow, and explicitly defined the measures are, the greater the conceptual clarity and interpretability of empirical results. However, any advantage in interpretability of empirical results that narrowly defined measures have over more broadly defined measures is offset by considerations of availability of data and robustness of resulting estimates. When results are cumulated across studies, intercorrelations between some of the narrowly defined measures may not be available, thus necessitating the analysis at a level at which the measures are defined more broadly. Also, the usefulness of a construct for making generalizable inferences is likely to increase when the constructs are defined more broadly. This


is essentially the bandwidth-fidelity dilemma discussed in the literature (e.g., Cronbach, 1960).¹

Further, it is important to note that measures that assess different constructs (or are construed as conceptually distinct operational measures) from one theoretical perspective may assess the same construct from the perspective of another theory (Hunter & Schmidt, 1990, p. 448). Consider, for example, a researcher who wants to answer the question of whether smoking cessation methods, regardless of the type, are effective in comparison to the effectiveness of other psychotherapy interventions. The researcher can treat the results from studies using different methods of smoking cessation as similar to each other. But to examine whether nicotine chewing gum is more effective than smoke aversion methods, the researcher has to analyze the studies examining the effectiveness of these two methods (nicotine chewing gum and smoke aversion techniques) individually. In fact, Viswesvaran and Schmidt (1992) used meta-analytic procedures to address both questions; across all cessation methods the average successful quit rate was 25%, and success rates varied from 7% to 48% across 15 different cessation programs. Thus, whether two constructs are conceptually distinct (or whether two operational measures are to be construed as conceptually distinct) depends on the purpose of the analysis. It is important to note that this is a concern in all meta-analyses, and is not an issue peculiar to this framework of conducting path analysis on a matrix of meta-analyzed correlations. Nevertheless, the researcher using this framework for theory testing has to exercise caution so that the statistical analyses conducted are rooted in appropriate theoretical frameworks.

Hunter and Schmidt (1990) suggest that, at least initially, meta-analysis in a given area should probably be narrow and focused enough to correspond to the major constructs recognized by researchers in that area. Then, as understanding develops, later meta-analyses may become broader in scope. Consider the following sequence of events in the history of personnel selection. Initially, researchers were interested in predicting specific job performance dimensions. It was widely believed that the validity of a predictor varied across situations and criteria (i.e., different determinants explained different performance dimensions in different situations). Thus, it was reasonable to examine the specific criteria as different constructs. At this point, meta-analytic cumulation focused on the specific criteria as distinct constructs, and the broadly defined construct of job performance, with the various specific criterion measures as construct-deficient measures of this broad construct, was ignored as irrelevant. Theories of work performance had to be developed for each specific criterion and no general theory of work was feasible. As evidence accumulated that the validity of many predictors generalized across situations (e.g., Pearlman, Schmidt, & Hunter, 1980) and criteria (Nathan & Alexander, 1988), interest shifted to the development of theories of work (e.g., Campbell, Gasser, & Oswald, in press; Schmidt & Hunter, 1992) that focused on the general factor of job performance that is common to all the specific criteria. At this stage, it becomes reasonable to ask whether the different criteria are manifestations of a general construct, and if so, theory building should focus on this broader construct of job performance.

¹The bandwidth-fidelity dilemma has been misconstrued in the existing literature. As described by Cronbach (1960), the dilemma is that in psychological testing there is an inevitable tradeoff between attaining a high degree of precision in measurement of any one attribute or characteristic and obtaining information about a large number of characteristics. But there is nothing inherent in broad constructs that precludes high-fidelity assessment of broad constructs (Ones & Viswesvaran, in press; F. L. Schmidt, personal communication, June, 1995). The breadth of constructs to be used is dictated by: (a) the phenomenon that is to be predicted or explained; and (b) whether narrower conceptualizations of broad constructs, although postulated to be conceptually distinct, can be operationally defined so as to be empirically and operationally distinct. Bandwidth and fidelity are, in fact, independent dimensions.

Most of the previous studies combining meta-analysis and structural equations modeling (e.g., Brown & Peterson, 1993; Peters et al., 1985) as a data analytic technique in theory building have focused on well-defined constructs (i.e., constructs that have been defined in the literature). The care that needs to be exercised at this stage is illustrated in Viswesvaran (1993). To test the structure of job performance, he had to define conceptually distinct job performance dimensions. Viswesvaran identified 10 conceptually distinct dimensions and explicitly stated the definition of each. In other words, the boundaries of each of the 10 job performance dimensions were delineated. For example, leadership was defined as the ability to inspire, to bring out extra performance in others, to motivate others to scale great heights, and to have a stature in one's profession. Illustrative examples included performance appraisal statements such as "gets subordinates to work efficiently," "stimulates subordinates effectively," and "maintains authority easily and comfortably." Effort was defined as the amount of work an individual expends in striving to do a good job. Initiative, attention to duty, alertness, resourcefulness, enthusiasm about work, industriousness, earnestness at work, persistence in seeking goals, dedication, personal involvement in the job, and effort and energy expended on the job were some other terms used to define this dimension of job performance. These two definitions were then used to identify which studies were relevant for inclusion.

A similar approach was taken by Ones (1993) in clustering the different personality scales into one of the Big Five dimensions of personality.


Ones used definitions of the Big Five factors to decide which of the numerous studies reporting correlations between personality scales should be included in assessing the correlation between any two of the Big Five factors of personality. Other previous efforts in combining psychometric meta-analysis and structural equations modeling (e.g., Hom et al., 1992; Hunter, 1983) have primarily confined themselves to well-defined constructs such as job knowledge and absenteeism. However, it is important to note that even the previous studies that focused on well-established constructs, constructs for which there was a consensus on what studies to include in meta-analysis, have relied on implicit definitions of their constructs in deciding which studies to include in the meta-analysis (although they did not explicitly define their constructs).

Locating Relevant Studies (Step 3)

Once the different operationalizations have been identified for each construct, the next step is to identify all relevant studies that provide either (a) correlations between the operational measures of interest, or (b) artifact information on the operational measures identified. Hunter and Schmidt (1990, p. 490) classify the different methods of locating research studies under three categories: examining indices to documents, searching existing bibliographies, and querying scholars who might be familiar with appropriate studies. Cooper (1984) provides a thorough discussion of questions and issues in searching for studies. A key distinction to note here is the separation between the question of methodological weaknesses and the question of relevant and irrelevant studies. Eliminating studies on the basis of methodological weaknesses is problematic on two grounds. First, as pointed out by Cooper (pp. 63-65), intercoder agreement on the methodological quality of studies is very low, illustrating the subjectivity of such assessments. Haring et al. (1981) found that intercoder agreement was lowest in coding judgmental items such as the methodological quality of studies. Hattie and Hansford (1984) as well as Jackson (1980) present data that further underscore the subjectivity in excluding studies based on methodological quality. Second, instead of eliminating studies on subjective premises, the researcher could test the effect of methodological quality on the inferred conclusions. Use of psychometric meta-analysis enables the researcher to test the hypothesis of methodological inadequacy after accounting for the effects of statistical artifacts and substantive moderators.
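The point about testing, rather than assuming, the effect of methodological quality can be illustrated with a minimal subgroup comparison. The study records and the quality codes below are hypothetical, and only bare sample-size-weighted means are computed; a full treatment would also compare subgroup variances after removing sampling error.

```python
# Sketch: instead of excluding "weak" studies a priori, code quality
# as a moderator and compare meta-analytic means across subgroups.
studies = [
    {"n": 120, "r": 0.40, "quality": "high"},
    {"n": 80,  "r": 0.44, "quality": "high"},
    {"n": 60,  "r": 0.35, "quality": "low"},
    {"n": 150, "r": 0.41, "quality": "low"},
]

def weighted_mean_r(subset):
    # Sample-size-weighted mean observed correlation for a subgroup
    return sum(s["n"] * s["r"] for s in subset) / sum(s["n"] for s in subset)

for level in ("high", "low"):
    sub = [s for s in studies if s["quality"] == level]
    print(level, round(weighted_mean_r(sub), 3))
```

If the subgroup means (and residual variances) are similar, methodological quality is not operating as a moderator, and the studies can be pooled.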

The question of relevant and irrelevant studies is more subtle, and the answer to the question of whether a particular study is relevant depends on the specific hypotheses, theories, and purposes of the investigator.


This is essentially linked to the steps where conceptually distinct constructs (Step 1 in Table 1) and conceptually distinct operational measures (Step 2 in Table 1) were identified based on theoretical considerations. Thus, whether or not a study is relevant for inclusion in the meta-analysis is a theoretical question. The question of methodological rigor, however, is an empirical issue that can be tested using meta-analysis.

Obtaining all studies reporting either (a) correlations between conceptually distinct operational measures, or (b) artifact information on any of the conceptually distinct measures (Step 3) is a standard procedure in all meta-analyses (this step is not unique to the framework proposed here). As such, all eight previous studies that have used variations of the procedure presented in this paper to test complex theories combining psychometric meta-analysis and structural equations modeling are illustrative of this step. Furthermore, all meta-analyses conducted to date have had to make decisions regarding the relevant studies to include (e.g., McDaniel, Whetzel, Schmidt, & Maurer, 1994; Pearlman et al., 1980; Rosenthal & Rubin, 1978; Smith & Glass, 1977).

Estimating the True Score Correlations Between the Conceptually Distinct Operational Measures (Step 4)

Once all relevant studies have been identified, psychometric meta-analysis (Schmidt et al., 1993) can be used to estimate the true score correlations among and between the different operationalizations. In most cases, researchers may not have the necessary information to correct each observed correlation individually. Artifact distributions will probably have to be employed to correct for some artifacts, such as unreliability in the measures being correlated. A discussion of issues involved in constructing artifact distributions for unreliability corrections is presented in the Appendix. Other statistical artifacts such as dichotomization can be corrected individually, because the information required for dichotomization corrections is usually available in the individual studies.
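A minimal sketch of an artifact-distribution correction for unreliability, using hypothetical reliability values: following the Hunter-Schmidt logic, the mean observed correlation is divided by the product of the mean attenuation factors (the means of the square roots of the reliabilities of the two measures).

```python
import numpy as np

# Hypothetical inputs to an artifact-distribution correction.
r_bar = 0.25                         # sample-size-weighted mean observed r
rxx = np.array([0.80, 0.70, 0.75])   # reliability distribution, measure x
ryy = np.array([0.60, 0.55, 0.65])   # reliability distribution, measure y

a = np.sqrt(rxx).mean()              # mean attenuation factor for x
b = np.sqrt(ryy).mean()              # mean attenuation factor for y
rho = r_bar / (a * b)                # estimated mean true score correlation
print(round(rho, 3))
```

The same attenuation factors are also used, in the full procedure, to correct the variance of observed correlations before moderator analyses.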

In cumulating results across studies, it is possible that some studies may report correlations between measures that have been specified in Step 2 (see Table 1) as denoting the same operational measure (i.e., conceptually not distinct). The researcher then may need to form linear composites across the conceptually similar (as specified in Step 2 in Table 1) measures in individual studies. Linear composites across conceptually similar measures define the variance in the measure as the variance common across all individual conceptually similar measures. Composite correlations are more construct valid (Jensen, 1980), and the use of composite correlations does not distort (Hunter & Schmidt, 1990)


the sampling error variance estimates in a meta-analysis. Note that using the component correlations individually as independent estimates, or using the average of the component correlations in a meta-analysis, will distort the sampling error variance estimate.

We encourage researchers utilizing the theory testing framework outlined in this manuscript to compute and use composite correlations, as the composite measures of any given construct are more construct valid than the individual component measures. For computing composite correlations, item-level data are not required; intercorrelations among the components suffice.² Granted that most studies use one of the operationalizations of a construct (i.e., most of the studies report cross-correlations), it is still possible that the meta-analyst will find a few studies that report correlations among different operationalizations of a construct. This will enable the computation of a composite correlation. Further, as suggested by an anonymous reviewer, subject matter experts (e.g., Schmidt, Hunter, Croll, & McKenzie, 1983) can be used to estimate the missing correlation if no study is found reporting a correlation between two operationalizations. Generalizability coefficients can also be fruitfully employed here. Consider a hypothetical case. We have a construct that has been measured with an infinite number of operational measures. Ghiselli, Campbell, and Zedeck (1981) point out that a linear composite of the infinite number of operational measures can be obtained based on the average intercorrelation between the operational measures. Now suppose we have n components that constitute the composite being estimated. Of the n(n − 1)/2 intercorrelations, some are missing. It is possible to compute a generalizability coefficient using the average of the available intercorrelations. Of course, this is valid only to the extent that the average of the available intercorrelations precisely estimates the average intercorrelation among an infinite number of operational measures of the intended construct.

²Item-level data are not required to compute composite correlations. The intercorrelations among the component measures suffice for the computation of a composite correlation. In fact, in a meta-analytic framework, all the intercorrelations need not be estimated in one single primary study. For example, let us say the correlation between two constructs, X and Y, is being investigated. Let X be measured by two measures x1 and x2. Similarly, let Y be measured by two measures y1 and y2. Now let us say 10 studies report the correlation between x1 and y1, 20 studies between x1 and y2, 15 studies between x2 and y1, and 25 studies between x2 and y2. Also, say that 10 of these studies report a correlation between x1 and x2, and a dozen or so studies between y1 and y2. A 4 x 4 matrix of meta-analyzed correlations can be constructed. Now, the composite correlation of (x1 + x2) with (y1 + y2) is expressed as

(r_x1y1 + r_x1y2 + r_x2y1 + r_x2y2) / √[(2 + 2r_x1x2)(2 + 2r_y1y2)]

Note, however, that the above expression based on correlations makes use of standardized scores on the measures. When the scores are not standardized, the use of the variance-covariance matrix is warranted. Nunnally (1978), in his chapter on linear composites, elaborates on this procedure. It is also possible to assign differential weights to each component in the composite. In addition, based on the same principle, a composite correlation can also be computed when there are more than two operationalizations of a construct.
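Assuming standardized component measures and unit weights, a composite correlation can be computed directly from the matrix of component intercorrelations; the matrix values below are hypothetical (components ordered x1, x2, y1, y2, with x1 and x2 measuring one construct and y1 and y2 the other).

```python
import numpy as np

def composite_correlation(R, x_idx, y_idx):
    """Unit-weighted composite correlation between the composite of the
    measures in x_idx and the composite of the measures in y_idx, given
    a correlation matrix R of standardized component measures."""
    R = np.asarray(R)
    cross = R[np.ix_(x_idx, y_idx)].sum()   # sum of cross-correlations
    var_x = R[np.ix_(x_idx, x_idx)].sum()   # composite variance of x (n_x + 2 * within-x r's)
    var_y = R[np.ix_(y_idx, y_idx)].sum()   # composite variance of y
    return cross / np.sqrt(var_x * var_y)

# Hypothetical 4 x 4 matrix of meta-analyzed correlations (x1, x2, y1, y2)
R = np.array([
    [1.00, 0.60, 0.30, 0.25],
    [0.60, 1.00, 0.35, 0.40],
    [0.30, 0.35, 1.00, 0.70],
    [0.25, 0.40, 0.70, 1.00],
])
print(round(composite_correlation(R, [0, 1], [2, 3]), 3))
```

Because the function sums full submatrices, the same code handles more than two operationalizations per construct; differential weights would require weighting the rows and columns before summing.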

Obtaining an estimate of a true score correlation is a standard step in all meta-analyses. The framework outlined here is applicable regardless of the meta-analytic procedure used. As such, all previous studies that have used variations of the procedure presented in this paper to test complex theories combining psychometric meta-analysis and structural equations modeling are illustrative of this step. For example, Brown and Peterson (1993) estimated the true score correlations between role conflict, role ambiguity, job satisfaction, and organizational commitment. Schmidt et al. (1986) obtained estimates of the true score correlations between ability, supervisory ratings of job performance, job knowledge, and work sample scores. However, composite correlations are not used in all studies. Ones (1993) and Viswesvaran (1993) were the only two studies among the eight examples to use composite correlations. Peters et al. (1985) used average correlations when faced with conceptual replications.

Testing the Measurement Model (Step 5)

Once the true score correlations between the conceptually distinct measures identified in the second step in Table 1 have been estimated using psychometric meta-analysis, factor analysis can be employed to test the measurement model. Here the different operationalizations identified for each construct are hypothesized to be caused by the latent variable denoting that construct.

The adequacy of the measurement model may be assessed by (a) the product rule, and (b) the parallelism test (Hunter & Gerbing, 1982). The product rule states that the correlation between any two measures of the same construct is the product of the factor loadings of the two measures on the underlying construct. The parallelism test stipulates that different measures of the same construct will have similar patterns of correlations with measures of other constructs. Other traditional overall fit indices are also appropriate (Jöreskog & Sörbom, 1986). Illustrative examples of this step can be found in Ones (1993) and Viswesvaran (1993). Hom et al. (1992) and Brown and Peterson (1993) combined the testing of a measurement model with path analysis, as is customary with most applications of LISREL. The other four studies combining psychometric meta-analysis and structural equations modeling have focused on path analysis and have not explicitly tested a measurement model.
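The product rule can be sketched numerically with hypothetical loadings and correlations: the model-implied correlation between two measures of the same construct is the product of their factor loadings, and the residuals against the observed (meta-analyzed) matrix indicate how well the one-factor measurement model holds.

```python
import numpy as np

# Hypothetical factor loadings of three measures on one construct
loadings = np.array([0.9, 0.8, 0.7])

# Product rule: implied r(i, j) = loading_i * loading_j (off-diagonal)
implied = np.outer(loadings, loadings)
np.fill_diagonal(implied, 1.0)

# Hypothetical meta-analytically estimated true score correlations
observed = np.array([
    [1.00, 0.73, 0.64],
    [0.73, 1.00, 0.55],
    [0.64, 0.55, 1.00],
])

residuals = observed - implied   # small residuals support the model
print(np.round(residuals, 2))
```

The parallelism test would extend this check by comparing each measure's column of correlations with measures of other constructs.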


Path Analysis (Steps 6 & 7)

Once an adequate fit has been established for the measurement model, the true score correlations between the constructs of interest to the theory being investigated have to be estimated. To obtain these estimates, the estimated true score correlations between the operationalizations of the different constructs are needed. Linear composites can be formed of the different operationalizations of each construct, and the true score correlations between the different constructs can be estimated. This constitutes the sixth step in Table 1. Path analysis can then be applied to the matrix of estimated true score correlations between constructs to test competing theories.
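For unit-weighted composites, the correlation between two constructs' composites can be computed directly from the intercorrelations among their operationalizations (the standard composite-correlation result; see, e.g., Nunnally, 1978). A sketch with invented matrices:

```python
import numpy as np

# Correlation between unit-weighted sum-composites of the operationalizations
# of two constructs. All matrices below are hypothetical illustrations.

def composite_r(Rxx, Ryy, Rxy):
    """r between sum-composites: Rxx and Ryy are within-construct correlation
    matrices (unities on the diagonal); Rxy holds the cross-correlations."""
    return Rxy.sum() / np.sqrt(Rxx.sum() * Ryy.sum())

Rxx = np.array([[1.0, 0.6], [0.6, 1.0]])     # two measures of construct X
Ryy = np.array([[1.0, 0.5], [0.5, 1.0]])     # two measures of construct Y
Rxy = np.array([[0.30, 0.25], [0.28, 0.33]]) # cross-construct correlations
r_comp = composite_r(Rxx, Ryy, Rxy)
```

Because the diagonal unities enter the denominator, the composite correlation exceeds the average cross-correlation, reflecting the greater construct coverage of the composite.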

Different estimation methods have been used in structural equations modeling (e.g., centroid factor analysis, maximum likelihood estimation), both in testing the measurement model and in testing the structural model (i.e., path analysis). The procedure outlined and recommended in this paper, which combines psychometric meta-analysis and structural equations modeling, is applicable regardless of the estimation method used in structural equations modeling. An interesting example is Hom et al. (1992), who tested four alternate models of withdrawal behaviors using the maximum likelihood estimation procedures of EQS (Bentler, 1989). Peters et al. (1985) did not resort to any parameter estimation procedures but computed partial correlations based on meta-analyzed correlations to test a contingency theory of leadership. Hunter (1983) is a good example of using path analysis with a matrix of meta-analyzed correlations, although he did not employ maximum likelihood parameter estimation methods. Schmidt et al. (1986) provide yet another example of this step.

Problems and Prospects in the Proposed Approach

The use of meta-analysis for theory building and explanation (Cook et al., 1992) has enabled researchers to address comprehensive and socially meaningful phenomena. Unencumbered by the need to abstract a simple system from a complex context, and with the effects of statistical artifacts greatly mitigated, the researcher is able to see broader patterns in the relationships between constructs (Lipsey & Wilson, 1993). Frameworks have been developed to apply the principles of meta-analysis to cumulate research studies utilizing a variety of research designs (e.g., Becker & Hedges, 1992). Using psychometric meta-analysis to estimate the true score correlations across constructs (analyzed in several different individual studies), and then employing a two-stage process that tests a measurement model in the first stage and a path model in the second stage, results in a powerful approach for theory building. In the previous section, we outlined the heuristics of this approach.

In applying the theory-testing procedure outlined here, the researcher is likely to encounter some decision points and potential problems unique to this framework. These decision points necessitate judgment calls on the part of the researcher. Judgment calls are inherent in all research projects, and there are several accounts of the judgment calls involved in a meta-analysis (e.g., Campion, 1993; Hattie & Hansford, 1984; Wanous, Sullivan, & Malinak, 1989). Given that many meta-analysis books detail the decisions to be made in a meta-analysis, we focus only on the potential limitations and issues unique to the framework presented in this paper.

First, the researcher may have to contend with the problem of empty cells (i.e., no study reports a correlation between two constructs of interest). Consider the case of testing a theory involving 25 constructs. To test this theory, we need estimates of 300 true score correlations. A literature search locates 500 studies, and each study reports correlations between some of the 25 constructs. However, in trying to fill the 300 cells of the estimated true score correlation matrix, we find that 10 cells are empty. That is, no study is located that reports a correlation between the constructs needed for those 10 correlations.

The options available to the researcher are: (a) design a primary study with a sufficiently large sample size, so that the effects of sampling error are reduced, to obtain stable estimates of the correlations not reported in the literature; (b) use the average (across all cells) correlation in the empty cells; (c) look for patterns of correlations and impute values in the missing cells of the matrix; and (d) modify the test of the theory to include only the constructs for which a full matrix of estimated true score correlations is available in the literature. A final option, as suggested by an anonymous reviewer, is to use subject matter experts (SMEs) to estimate the missing correlations (e.g., Schmidt et al., 1983). The SMEs may estimate the missing values so as to satisfy the constraints (e.g., the triangular inequality; McNemar, 1962, p. 162) implied by the other available correlations.³ The research on missing values and imputation techniques can also provide guidance here (Raymond & Viswesvaran, 1993). An interesting possibility is to use regression approaches to examine patterns of correlations for imputation. The use of the average correlation is appropriate only when the focus is on testing for a general factor; it is inappropriate when the relationships between variables within a nomological network are being estimated. The relative attractiveness of each option will probably depend on the proportion of empty cells in the original matrix as well as on the ease of data collection. Monte Carlo studies are needed to investigate the boundary conditions for each option. Ones (1993) collected primary data (option a), Viswesvaran (1993) used the average correlation (option b), and Hom et al. (1992) confined themselves to testing theories for which full matrices were available (option d) when faced with empty cells in applying the framework. To date, none of the other options has been utilized.

³Specifically, McNemar (1962, p. 167) presents the constraints that correlations computed on the same sample should satisfy. In a correlation matrix based on the same sample, the magnitude of one correlation places constraints on the magnitudes of other correlations. When true score correlations are estimated from several samples, these constraints may not be satisfied. In that case, the eigenvalues computed from such ill-defined matrices with relaxed constraints can become negative, and statistics derived under those conditions are likely to be distorted (Tabachnick & Fidell, 1989, p. 65).
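The McNemar-style constraint can be made concrete: given two same-sample correlations r(x,y) and r(x,z), the admissible range for the third correlation r(y,z) is r(x,y)r(x,z) ± sqrt((1 − r(x,y)²)(1 − r(x,z)²)). An SME-supplied value for a missing cell should fall within these bounds. The known correlations below are hypothetical:

```python
import math

# Admissible range for a missing correlation r(y,z), given the two known
# same-sample correlations r(x,y) and r(x,z).

def admissible_range(r_xy, r_xz):
    """Bounds that keep a 3x3 correlation matrix internally consistent."""
    slack = math.sqrt((1 - r_xy**2) * (1 - r_xz**2))
    return r_xy * r_xz - slack, r_xy * r_xz + slack

lo, hi = admissible_range(0.6, 0.5)   # hypothetical known correlations
```

Screening SME estimates (or imputed values) against these bounds guards against assembling a matrix that cannot have arisen from any single sample.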

A second potential problem that the researcher may have to address is that the sample size used to estimate the true score correlations varies across the correlations. Each cell in the matrix of true score correlations is obtained from a separate meta-analysis. The number of studies that contribute to each meta-analysis will vary, and so will the total sample size across these studies. The sample size in the cells of the matrix could vary from the tens to the thousands. When a matrix of correlations is subjected to structural equations modeling, the sample size is usually used to test the significance of the path coefficients or to establish confidence intervals for each estimated parameter, as well as to estimate some model fit indices (e.g., chi-squared values). Several researchers (e.g., Carver, 1978, 1993; Cohen, 1994; Kish, 1959; Oakes, 1986; Rozeboom, 1960; Schmidt, in press) argue for abandoning the use of significance testing. It is important to note, however, that the criticism of significance testing (e.g., Cohen, 1994; Schmidt, in press) concerns the inappropriate use of standard errors; the construction of confidence intervals based on appropriate standard errors is altogether a different matter. Standard error estimates are needed to construct confidence intervals. One potential problem in the framework proposed here involves the estimation of the standard error associated with each path coefficient. What sample size should a researcher specify in conducting a path analysis with a matrix of meta-analyzed correlations?

One approach is to use the harmonic mean of the sample sizes across the different cells of the intercorrelation matrix to test the precision of parameter estimates. Use of the harmonic mean is consistent with the literature on unweighted analysis of variance and with the overall degree of precision existing in the available data. Another approach is to rely on constructive replications (from several meta-analyses, either from the same domain with a different set of primary studies or from meta-analyses of related domains). We also urge scientists to synthesize their findings with the existing web of knowledge in assessing the magnitude of the path coefficients. Previous efforts in combining meta-analysis and structural equations modeling have either (a) not computed sampling error estimates for path coefficients because the correlations were assumed to be population values, or (b) used the same sample size in each cell so as to apply existing approaches to computing the sampling error estimates. For example, Schmidt et al. (1986) did not compute the sampling error of path coefficients but used in their meta-analytic cumulation all available studies, which resulted in unequal sample sizes in the cells of the intercorrelation matrix submitted to structural equations modeling. Such an approach assumes that the estimated true score correlations represent population values. Hom et al. (1992), on the other hand, included in their meta-analytic cumulation only those studies that assessed all variables included in the path models. Thus, in the Hom et al. study, the same sample size was used in the estimation of each meta-analyzed correlation, enabling them to apply existing approaches to compute sampling error estimates.

³(continued) Because eigenvalues can be construed as representing the variance explained by a given factor, negative eigenvalues are similar to obtaining negative variance components in analysis of variance.
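The harmonic-mean option can be computed in one line; the cell sample sizes below are invented:

```python
# Harmonic mean of the per-cell sample sizes: one candidate "N" to specify
# when the cells of a meta-analytic correlation matrix rest on different
# total sample sizes.

def harmonic_mean(ns):
    """Harmonic mean of a list of positive sample sizes."""
    return len(ns) / sum(1.0 / n for n in ns)

n_harm = harmonic_mean([150, 600, 2400])   # hypothetical cell sample sizes
```

Because the harmonic mean is dominated by the smallest cells, it yields a conservative effective sample size relative to the arithmetic mean.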

The third question that confronts the researcher combining psychometric meta-analysis and path analysis to test theories is determining what values to use in the diagonal of the matrix. In a meta-analytically obtained matrix that is to be subjected to path analysis, what communalities are appropriate?

When estimated true score correlations are used in a path analysis, one might question whether ones should be used in the diagonal. In traditional analysis, reliability estimates provide an upper bound, because a variable cannot correlate more highly with another variable than with itself (i.e., the reliability coefficient). Researchers using the framework presented here for theory testing could consider the use of squared multiple correlations in the diagonal as one viable option; Hom et al. (1992) employed this option. Another viable option is the use of reliability-corrected generalizability coefficients in the diagonal (Ones, 1993; Viswesvaran, 1993). The answer to this question will depend on the type of reliability coefficient (Schmidt & Hunter, in press) used in the meta-analysis (i.e., whether specific error variance, transient error variance, and variance due to random response are assigned to error variance or to true variance).
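If squared multiple correlations are chosen for the diagonal, they can be obtained from the inverse of the correlation matrix via the standard identity SMC_i = 1 − 1/(R⁻¹)_ii. A sketch with an invented matrix:

```python
import numpy as np

# Squared multiple correlation of each variable with all remaining variables,
# computed from the diagonal of the inverted correlation matrix.

def smc(R):
    """SMC_i = 1 - 1 / (R^-1)_ii for each variable i."""
    inv = np.linalg.inv(R)
    return 1.0 - 1.0 / np.diag(inv)

R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])   # hypothetical true score correlations
diag_vals = smc(R)
```

Each SMC equals the R² from regressing that variable on the others, so this option replaces the unities with the variance each construct shares with the rest of the matrix.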


A fourth concern⁴ in using this framework is that, given a fair amount of variability in the mean correlation (after accounting for variation due to statistical artifacts), the bivariate relationship (captured by the mean correlation) in a cell could be moderated by a third variable (perhaps even one of the other variables included in the path model). Finkelstein, Burke, and Raju (in press) have presented equations for computing the variance associated with the estimated true score correlation (see also Hunter & Schmidt, 1990), a variance that is not due to sampling error but reflects the variance in the true score correlations for each cell of the matrix that is input into the path analysis. Under these circumstances, the test of the path model may be affected.

One approach to this fourth issue is to conduct moderator analyses until the variability in the estimated true score correlations (i.e., the bivariate relationships) is reduced to zero. All moderators thus identified should be included in the path analysis. Another approach a researcher can take is to subject a matrix of intercorrelations composed of the lower bounds of the 90% confidence intervals of the meta-analyzed correlations to structural equations modeling; similarly, an intercorrelation matrix of the corresponding upper bounds can be subjected to structural equations modeling. The fit of these two models can be compared to the fit obtained using the mean correlations.⁵
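The two bounding matrices can be assembled as sketched below. We assume, as one common normal-theory choice, a multiplier of 1.645 for a 90% interval around each mean true score correlation; the multiplier, the means, and the SDs are all illustrative assumptions:

```python
import numpy as np

# Build lower-bound and upper-bound correlation matrices around the
# meta-analyzed means, for the sensitivity analysis described in the text.

Z90 = 1.645   # assumed normal-theory multiplier for a 90% interval

def bound_matrices(R_mean, SD_rho):
    """Lower/upper matrices: mean correlation -/+ Z90 * SD of true score r."""
    lower = R_mean - Z90 * SD_rho
    upper = R_mean + Z90 * SD_rho
    np.fill_diagonal(lower, 1.0)   # keep unities on the diagonal
    np.fill_diagonal(upper, 1.0)
    return lower, upper

R_mean = np.array([[1.0, 0.40], [0.40, 1.0]])   # hypothetical mean rho matrix
SD_rho = np.array([[0.0, 0.05], [0.05, 0.0]])   # hypothetical SDs of rho
R_lo, R_hi = bound_matrices(R_mean, SD_rho)
```

Fitting the same path model to R_lo, R_mean, and R_hi shows how sensitive the substantive conclusions are to the residual variability in each cell.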

Finally, researchers employing the theory-testing procedure presented in this paper may face an increased likelihood of encountering correlation matrices that are ill-defined (for which inverses cannot be computed); in other words, the matrix of estimated true score correlations fails to be positive definite. Another way of stating this argument in practical terms is that when data are combined from different samples, (a) the constraints on the covariances between variables may be violated, thereby resulting in (b) nonpositive definite matrices. For example, the McNemar (1962, p. 162) rules may not be satisfied.

⁴In structural equations models, either with latent or with observed variables, interactions are not easily specified, because current implementations (LISREL, EQS) do not allow the direct specification of interactions. Interactions are easier to specify in path analyses that use separate regression equations to estimate path coefficients, because interaction terms are easily included in the linear regression model; however, path coefficients estimated using separate regressions are known to be biased. Similarly, attempts to test interactions by analysis of variance of manifest variables test the interactions out of the context of the full model (and further fail to correct for measurement error). The key point to note is that this fourth limitation is not unique to our framework but pertains to all applications of path analysis.

⁵This approach can be generalized by combining the principles of path analysis and operations research with inequality constraints. In other words, the structural equations are the objective functions. The objective is to estimate the path coefficients and the corresponding point estimates of the mean correlations so as to minimize the standard error. The bivariate correlations and the path coefficients to be estimated in the path analysis are the unknowns, the variables whose values are to be determined so as to minimize the objective function (i.e., minimize the structural equation standard error).

It is important to note, however, that this is only a potential problem; there is no empirical evidence that it has been realized in any of the real-world applications of the proposed framework. Further, Little and Rubin (1987), in comparing pairwise and listwise deletion of cases, state that pairwise deletion results in different sample sizes for each correlation, but as long as the process generating the missing observations is ignorable, pairwise deletion will not result in nonpositivity of the intercorrelation matrix. Further, as the sample size increases, the likelihood of encountering negative eigenvalues with pairwise deletion decreases. Because meta-analytically estimated true score correlations are typically based on large samples, this concern with nonpositivity may not stand up to close scrutiny. Of the previous studies that have combined psychometric meta-analysis and structural equations modeling to test theories, Hom et al. (1992) used listwise deletion and included in their meta-analytic cumulation only those studies that assessed correlations between all variables included in the path models. On the other hand, Hunter (1983), Ones (1993), and Viswesvaran (1993), by including in their meta-analytic cumulation all available studies, exploited the full power and potential of this procedure to test complex theories involving several variables and competing causal mechanisms, the data for all of which cannot possibly be obtained in any single primary study. None of these applications encountered nonpositivity.
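A quick way to detect the ill-defined case before fitting any model is to inspect the eigenvalues of the assembled matrix. The matrix below is deliberately constructed to violate the same-sample constraints:

```python
import numpy as np

# Positive-definiteness check for a meta-analytically assembled correlation
# matrix: any negative eigenvalue signals an ill-defined matrix.

def is_positive_definite(R):
    """True when the smallest eigenvalue of symmetric R is positive."""
    return bool(np.linalg.eigvalsh(R).min() > 0)

# Invented matrix whose cells could not all come from one sample:
R_bad = np.array([[ 1.0, 0.9, -0.9],
                  [ 0.9, 1.0,  0.9],
                  [-0.9, 0.9,  1.0]])
```

Running this check on the final matrix, before submitting it to structural equations modeling, catches the nonpositivity problem discussed above at the cost of one function call.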

Several avenues of fruitful research regarding the use of this technique can be identified, including (a) conducting a variety of simulation studies to investigate the conditions under which the analysis works properly, (b) finding the minimum number of correlations and sample sizes needed for robust estimates from meta-analyses, (c) examining what variance in the sample size per correlation is acceptable, and (d) investigating whether the analysis is robust to the wide variety of violations of assumptions that may occur in real data. The proposed framework will be a useful addition to the methodological arsenal of scientists to the extent that future simulations and applications bear out the robustness and reliability of these procedures. Perhaps the interest kindled by the publication of this manuscript will spur researchers to conduct the suggested simulations and to undertake studies combining psychometric meta-analysis and structural equations modeling.

⁵(continued) The most likely values of the path coefficients, as well as the associated matrix of mean intercorrelations, are to be estimated. The point estimate from this operations research formulation is the estimate of the bivariate relationship when the best fit for the multivariate analysis of the hypothesized path model is considered. Further, this point estimate is consistent with the variability associated with the estimated mean. In any event, the viability of the approach outlined in this footnote to the fourth limitation is a fruitful avenue for future research.

In conclusion, this paper presents an approach for theory testing that uses the principles of psychometric meta-analysis to quantitatively estimate the true score correlations between constructs, and then uses these estimated true score correlations with structural equations modeling to test theories. By including linear composites of different operational measures for each construct, this framework enables the researcher to employ more construct-valid measures. More important, the data-synthesizing capabilities of meta-analysis facilitate the testing of realistic and meaningful theories involving several constructs that are not all measured in the same individual study. We believe that combining psychometric meta-analysis and structural equations modeling facilitates building theories of work behavior that capture the richness and complexity of real-world phenomena, a richness and complexity that cannot be captured in individual studies.

REFERENCES

Becker BJ, Hedges LV. (1992). Special issue on meta-analysis [Editorial]. Journal of Educational Statistics, 17, 277-278.

Bentler PM. (1989). Theory and implementation of EQS: A structural equations program [Computer software]. Los Angeles: BMDP Statistical Software.

Brown SP, Peterson RA. (1993). Antecedents and consequences of salesperson job satisfaction: Meta-analysis and assessment of causal effects. Journal of Marketing Research, 30, 63-77.

Burke MJ. (1994). An empirical estimation of the effect of second order sampling error on ASVAB-training proficiency validity estimates. Brooks Air Force Base, TX: Armstrong Laboratory.

Campbell JP. (1990). The role of theory in industrial and organizational psychology. In Dunnette MD, Hough LM (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 39-73). Palo Alto, CA: Consulting Psychologists Press.

Campbell JP, Gasser MB, Oswald FL. (in press). The substantive nature of job performance variability. In Murphy KR (Ed.), Individual differences and behavior in organizations. San Francisco, CA: Jossey-Bass.

Campion MA. (1993). Article review checklist: A criterion checklist for reviewing research articles in applied psychology. PERSONNEL PSYCHOLOGY, 46, 705-718.

Carver RP. (1978). The case against statistical significance testing. Harvard Educational Review, 48, 378-399.

Carver RP. (1993). The case against statistical testing revisited. Journal of Experimental Education, 61, 287-292.

Cohen J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.

Cook TD, Cooper HM, Cordray DS, Hartmann H, Hedges LV, Light RJ, Louis TA, Mosteller F. (1992). Meta-analysis for explanation: A casebook. New York: Sage.

Cooper HM. (1984). The integrative research review: A systematic approach. Beverly Hills, CA: Sage.

Cronbach LJ. (1960). Essentials of psychological testing (2nd ed.). New York: Harper & Row.

Finkelstein LM, Burke MJ, Raju NS. (in press). Age discrimination in simulated employment contexts: An integrative analysis. Journal of Applied Psychology.

Ghiselli EE, Campbell JP, Zedeck S. (1981). Measurement theory for the behavioral sciences. New York: W. H. Freeman.

Haring MJ, Okun MA, Stock WA, Miller W, Kinney C, Ceurvorst WR. (1981, April). Reliability issues in meta-analysis. Paper presented at the annual meeting of the American Educational Research Association, Los Angeles.

Hattie JA, Hansford BC. (1984). Meta-analysis: A reflection on problems. Australian Journal of Psychology, 36, 239-254.

Hom PW, Caranikas-Walker F, Prussia GE, Griffeth RW. (1992). A meta-analytical structural equations analysis of a model of employee turnover. Journal of Applied Psychology, 77, 890-909.

Hunter JE. (1983). A causal analysis of cognitive ability, job knowledge, job performance, and supervisor ratings. In Landy FJ, Zedeck S, Cleveland J (Eds.), Performance measurement and theory (pp. 257-266). Hillsdale, NJ: Erlbaum.

Hunter JE, Gerbing DW. (1982). Unidimensional measurement, second-order factor analysis, and causal models. In Staw BM, Cummings LL (Eds.), Research in organizational behavior (Vol. 4, pp. 267-320). Greenwich, CT: JAI Press.

Hunter JE, Schmidt FL. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.

Jackson GB. (1980). Methods for integrative reviews. Review of Educational Research, 50, 438-460.

Jensen AR. (1980). Bias in mental testing. New York: Free Press.

Joreskog KG, Sorbom D. (1986). LISREL 6 user's guide: Analysis of linear structural relationships by the method of maximum likelihood (4th ed.). Uppsala, Sweden: University of Uppsala.

Kish L. (1959). Some statistical problems in research design. American Sociological Review, 24, 328-338.

Lipsey MW, Wilson DB. (1993). The efficacy of psychological, educational, and behavioral treatment: Confirmation from meta-analysis. American Psychologist, 48, 1181-1209.

Little RJA, Rubin DB. (1987). Statistical analysis with missing data. New York: Wiley.

McDaniel MA, Whetzel DL, Schmidt FL, Maurer SD. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79, 599-616.

McNemar Q. (1962). Psychological statistics (3rd ed.). New York: Wiley.

Mosier CI. (1943). On the reliability of a weighted composite. Psychometrika, 8, 161-168.

Nathan BR, Alexander RA. (1988). A comparison of criteria for test validation: A meta-analytical investigation. PERSONNEL PSYCHOLOGY, 41, 517-535.

Nunnally JC. (1978). Psychometric theory. New York: McGraw-Hill.

Oakes M. (1986). Statistical inference: A commentary for the social and behavioral sciences. New York: Wiley.

Ones DS. (1993). The construct validity of integrity tests. Unpublished doctoral dissertation, University of Iowa, Iowa City.

Ones DS, Viswesvaran C. (in press). Bandwidth-fidelity dilemma in personality measurement for personnel selection. Journal of Organizational Behavior.

Pearlman K, Schmidt FL, Hunter JE. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373-406.

Peters LH, Hartke DD, Pohlmann JT. (1985). Fiedler's contingency theory of leadership: An application of the meta-analysis procedures of Schmidt and Hunter. Psychological Bulletin, 97, 274-285.

Premack SL, Hunter JE. (1988). Individual unionization decisions. Psychological Bulletin, 103, 223-234.

Rajaratnam N, Cronbach LJ, Gleser GC. (1965). Generalizability of stratified-parallel tests. Psychometrika, 30, 39-56.

Raju NS, Burke MJ, Normand J, Langlois GM. (1991). A new meta-analytic approach. Journal of Applied Psychology, 76, 432-446.

Raymond MR, Viswesvaran C. (1993). Least squares models to correct for rater effects in performance assessment. Journal of Educational Measurement, 30, 253-268.

Rosenthal R, Rubin DB. (1978). Issues in summarizing the first 345 studies of interpersonal expectancy effects. Behavioral and Brain Sciences, 3, 410-415.

Rozeboom WW. (1960). The fallacy of the null-hypothesis significance test. Psychological Bulletin, 57, 416-428.

Schmidt FL. (1992). What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist, 47, 1173-1181.

Schmidt FL. (in press). Quantitative methods and cumulative knowledge in psychology: Implications for the training of researchers. Psychological Methods.

Schmidt FL, Hunter JE. (1992). Development of a causal model of processes determining job performance. Current Directions in Psychological Science, 1, 89-92.

Schmidt FL, Hunter JE. (in press). Measurement error in industrial/organizational psychology research: Lessons from 26 research scenarios. Psychological Methods.

Schmidt FL, Hunter JE, Croll PR, McKenzie RC. (1983). Estimation of employment test validities by expert judgment. Journal of Applied Psychology, 68, 590-601.

Schmidt FL, Hunter JE, Outerbridge AN. (1986). The impact of job experience and ability on job knowledge, work sample performance, and supervisory ratings of job performance. Journal of Applied Psychology, 71, 432-439.

Schmidt FL, Kaplan LB. (1971). Composite versus multiple criteria: A review and resolution of the controversy. PERSONNEL PSYCHOLOGY, 24, 419-434.

Schmidt FL, Law K, Hunter JE, Rothstein HR, Pearlman K, McDaniel M. (1993). Refinements in validity generalization methods: Implications for the situational specificity hypothesis. Journal of Applied Psychology, 78, 3-12.

Schmitt N, Landy FJ. (1993). The concept of validity. In Schmitt N, Borman WC (Eds.), Personnel selection in organizations (pp. 275-309). San Francisco: Jossey-Bass.

Smith ML, Glass GV. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752-760.

Tabachnick BG, Fidell LS. (1989). Using multivariate statistics (2nd ed.). New York: Harper Collins.

Viswesvaran C. (1993). Modeling job performance: Is there a general factor? Unpublished doctoral dissertation, University of Iowa, Iowa City.

Viswesvaran C, Schmidt FL. (1992). A meta-analytic comparison of the effectiveness of smoking cessation methods. Journal of Applied Psychology, 77, 554-561.

Wanous JP, Sullivan SE, Malinak J. (1989). The role of judgment calls in meta-analysis. Journal of Applied Psychology, 74, 259-264.


APPENDIX

Constructing Artifact Distributions for Unreliability Corrections

In constructing the artifact distributions to correct for unreliability, care should be taken to include the appropriate reliability estimates. The appropriate reliability coefficient depends on the relationship of interest and the manner in which the observed correlation is obtained. For example, if the researcher is interested in the relationship between age and managerial motivation, and if the observed correlation is obtained between age and a measure of managerial motivation as captured in a sentence completion test (with objective scoring keys), then the appropriate reliability coefficient will be one that assigns variance specific to the items in the sentence completion test, variance due to transient responses to the test across administrations, and variance due to random response, to error variance (Schmidt & Hunter, in press).

The artifact distributions could be conceptualized either as a distribution of population artifacts (Hunter & Schmidt, 1990) or as a distribution of sample-based estimates (Raju, Burke, Normand, & Langlois, 1991). The procedure suggested here to test complex theories, combining psychometric meta-analysis and structural equations modeling, will work in both cases, and the seven steps will be identical. The difference lies only in the formulae used to cumulate the observed correlations and to estimate the true score correlation as well as the standard deviation of the estimated true score correlation (see Raju et al., 1991, for formulae). If meta-analytic researchers are interested in obtaining a sample-based reliability estimate when data on only the mean and standard deviation are available, a procedure developed by Burke (1994) can be utilized. Burke developed a procedure to compute sample-based KR-21 reliability estimates using the mean, standard deviation, and number of items. Although KR-21 underestimates KR-20 and standardized alpha, it does provide a useful sample-based estimate of reliability. When meta-analytic researchers are faced with a situation where only descriptive statistics are available in a study, Burke's procedure (assuming that the scoring of the items does not preclude it) could be profitably employed. (We thank an anonymous reviewer for bringing this useful technique to our attention.)
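The KR-21 computation on which Burke's procedure builds requires only the number of (dichotomously scored) items, the mean total score, and the variance of total scores. A sketch with hypothetical values:

```python
# KR-21 reliability estimate from summary statistics alone, for a test of k
# dichotomously scored items with total-score mean and variance as given.

def kr21(k, mean, variance):
    """KR-21 = (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

rel = kr21(k=20, mean=12.0, variance=16.0)   # hypothetical test statistics
```

Because KR-21 assumes items of equal difficulty, it is a lower bound on KR-20; that conservatism is the price of needing only descriptive statistics.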

In computing the reliability of a composite, three options are available: standardized alpha, stratified alpha, and the Mosier reliability estimate. First, the researcher can compute a standardized alpha to estimate the reliability of the composite. When the meta-analytic researcher has information only on the means, standard deviations, and number of items for each component, an estimate of the reliability of the composite measure can be obtained by employing stratified alpha (Rajaratnam, Cronbach, & Gleser, 1965). To apply stratified alpha, the meta-analytic researcher needs a reliability estimate for each component, where each component reliability can be estimated by KR-21 (Burke, 1994). Both the standardized alpha and the stratified alpha assign the variance specific to components to error variance. The Mosier reliability estimate (Mosier, 1943), on the other hand, is an estimate of the reliability of a composite measure when the specific variance associated with each component is to be treated as true variance. Because reliability is the ratio of true to observed variance, the Mosier reliability of a composite will be higher than the standardized alpha. A standardized alpha can be computed from the intercorrelations among the components that make up the composite. A Mosier reliability estimate can be obtained from the intercorrelations among the components and the reliability of each component. However, if only the means, variances, and numbers of items are available, KR-21 can be computed as an approximate estimate of alpha for each component, and a stratified alpha can be estimated for the composite, but Mosier reliability estimates cannot be computed. See also Hunter and Schmidt (1990, pp. 460-462) for further computational details on Mosier reliability.
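Two of these options can be sketched directly from their definitions for standardized, unit-weighted components. The component intercorrelations and reliabilities below are invented; stratified alpha follows the same composite form as the Mosier estimate but with component alphas in the reliability slot:

```python
import numpy as np

# Composite reliability: standardized alpha needs only the component
# intercorrelations; the Mosier estimate also uses each component's
# reliability, so component-specific variance counts as true variance.

def standardized_alpha(R):
    """Standardized alpha from the component correlation matrix R."""
    k = R.shape[0]
    r_bar = (R.sum() - k) / (k * (k - 1))     # mean inter-component correlation
    return k * r_bar / (1 + (k - 1) * r_bar)

def mosier(R, reliabilities):
    """Mosier (1943) reliability of a unit-weighted composite of
    standardized components with correlation matrix R."""
    comp_var = R.sum()                         # variance of the sum score
    error = sum(1 - r for r in reliabilities)  # per-component error variance
    return 1 - error / comp_var

R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.45],
              [0.4, 0.45, 1.0]])              # hypothetical intercorrelations
alpha = standardized_alpha(R)
mos = mosier(R, reliabilities=[0.8, 0.75, 0.7])
```

Consistent with the text, the Mosier estimate exceeds the standardized alpha for the same composite, because the specific variance of each component is credited to true variance rather than to error.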