    Scaling and Dimensional Analysis

    William Jacoby Michigan State University

    The terms “scaling” and “dimensional analysis” refer to a wide variety of research strategies and procedures. The common element among them is that they all seek to provide quantitative and/or geometric representations of the internal structure in a set of data. Researchers apply these techniques for three main reasons: (1) simple data reduction: summarizing a large set of variables with a smaller number of composite measures; (2) examining dimensionality: testing the underlying sources of variation in a dataset; and (3) measurement: obtaining empirical representations of the underlying (and usually unobservable) dimensions, which can be employed as analytic variables in other statistical procedures. On a less formal note, researchers will often find that dimensional analysis is very beneficial for “conceptualizing” the contents of their data. In addition, these techniques usually provide visual displays that are very useful for presenting analytical results to other people. Thus, for a variety of reasons, scaling and dimensional analysis are useful additions to the social scientist's “repertoire” of research strategies.
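To make the three purposes above concrete, here is a minimal sketch in Python using principal components on simulated data. It is an illustration only, not one of the workshop's handouts: the data, the function name `principal_components`, and the choice of NumPy are all assumptions made for this example. The eigenvalues speak to dimensionality, the first component serves as a composite measure (data reduction), and the component scores act as an empirical stand-in for the unobserved dimension (measurement).

```python
import numpy as np

def principal_components(data):
    """Return eigenvalues, eigenvectors, and component scores.

    Rows of `data` are observations and columns are variables.
    (Hypothetical helper written for this illustration.)
    """
    # Standardize each variable so the analysis uses the correlation matrix.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues in ascending order; reorder so the largest is first.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Project the standardized data onto the components.
    scores = z @ eigenvectors
    return eigenvalues, eigenvectors, scores

# Simulated (assumed) data: 200 respondents answer six items that all
# reflect a single unobserved trait plus independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
items = np.column_stack(
    [latent + rng.normal(scale=0.7, size=200) for _ in range(6)]
)

eigenvalues, eigenvectors, scores = principal_components(items)

# Dimensionality: how much of the total variance does each component carry?
share = eigenvalues / eigenvalues.sum()
print("Proportion of variance by component:", np.round(share, 2))

# Data reduction and measurement: one composite score per respondent.
composite = scores[:, 0]
print("Correlation of composite with the simulated trait:",
      round(abs(np.corrcoef(composite, latent)[0, 1]), 2))
```

Because the sign of an eigenvector is arbitrary, the sketch reports the absolute correlation between the composite and the simulated trait. With real data the underlying trait is of course unobserved, which is exactly why the scaling model is needed.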

    READING MATERIAL

    Unfortunately, there is no single textbook that covers all of the topics in this course. In addition, many of the texts that are available have certain drawbacks that limit their usefulness for our purposes: they tend to be very expensive; they usually assume a high level of mathematical sophistication; they often contain sections that are out of date. Because of these considerations, we will rely primarily on several shorter works. Students should consider purchasing at least some of the following texts (although I strongly recommend waiting until after the first workshop session before doing so):

    Readings will also be taken from the following works:

    SOFTWARE CONSIDERATIONS

    With very few exceptions, the methods covered in this workshop are computationally intensive. Therefore, appropriate software is required to perform most of the analyses. Fortunately, most of the widely available statistical packages (e.g., STATA, SAS, SPSS, SYSTAT) contain routines for carrying out the major techniques (e.g., factor analysis, multidimensional scaling, correspondence analysis). However, a few special-purpose programs will also be used for particular applications. These will be introduced in class as necessary, and they will all be available on the ICPSR Summer Program network. Handouts and examples will generally present analyses in STATA (with some exceptions where necessary). Participants are also encouraged to try performing the analyses in the R statistical computing environment.

    TOPICS AND READING ASSIGNMENTS

    The topics covered in this workshop fall within three major sections, although they are not identified as such in the syllabus.

    The first section covers several scaling strategies appropriate for analyzing data that are obtained from a "dominance" process. That is, each datum indicates the extent to which a stimulus object exceeds (or fails to exceed) some standard of comparison (e.g., another stimulus object or a unit mark on a measurement scale). Specific procedures to be covered in this part of the workshop include summated rating scales, cumulative scaling, and factor analysis.

    The second section moves on to methods for dealing with "proximity" data. This is information that indicates how "close" or "similar" one object is to another. Here, the workshop will focus on unfolding models, multidimensional scaling, and correspondence analysis.

    The third section covers data theory and some general considerations related to dimensional analysis. Here, there is less focus on specific scaling techniques and more attention to a general framework for integrating the material that has already been covered. This last part of the workshop is particularly important for understanding when different scaling techniques can and should be employed in empirical research. Ideally, it will also leave workshop participants with a relatively optimistic view of the nature, quality, and potential for accurate measurement of important concepts in the social and behavioral sciences.

    In the outline, entries marked with a double asterisk should be considered essential readings. Those with a single asterisk are recommended works. Unmarked entries are supplemental readings that generally cover specific aspects of the respective topics in greater detail.

    I. Introduction and Some General Considerations

    ** Jacoby (1991), Chapters 1 and 2.
    ** Jacoby, William G. (1999) “Levels of Measurement and Political Research: An Optimistic View.” American Journal of Political Science 43: 271-301.
    ** Young and Hamer (1987), Chapter 3.
    ** Weller and Romney (1990), Chapter 1.
    ** Lattin et al. (2003), Chapter 1.
    ** Bartholomew et al. (2002), Chapter 1.

    II. Classification and Clustering: A Very Brief Introduction
    ** Bartholomew et al. (2002), Chapter 2.
    ** Bailey (1994), Chapters 1-3.

    III. Summated Rating Scales

    A. Scale Construction
    ** McIver and Carmines (1981), pp. 22-26.
    * DeVellis (1991), Chapters 2 and 5.
    Spector, Paul E. (1992) Summated Rating Scale Construction. Sage University Paper.

    B. Scale Assessment and Reliability
    ** McIver and Carmines (1981), pp. 26-40.
    * DeVellis (1991), Chapter 3.
    * Bollen (1989), Chapter 6, especially pp. 206-223.
    * Traub, Ross E. (1994) Reliability for the Social Sciences: Theory and Applications.

    C. Magnitude Scaling

    ** Lodge, M. and B. Tursky (1981) “On the Magnitude Scaling of Political Opinion in Survey Research.” American Journal of Political Science 25: 376-419.
    ** Jacoby (1991), pp. 53-58.

    IV. The Cumulative Scaling Model in a Single Dimension

    A. Guttman Scaling

    B. Mokken Scaling

    Scale and Parametric Item Response Theory.” Political Analysis 11: 139-163.
    ** Jacoby (1991), pp. 44-46.
    ** Mokken, Robert J. and Charles Lewis (1982) “A Nonparametric Approach to the

    ** Sijtsma, K.; P. Debets; I.W. Molenaar (1990) “Mokken Scale Analysis for

    * Gillespie, Michael; Elizabeth M. Tenvergert; Johannes Kingsma (1987) “Using

    C. Rasch Models

    * Meijer, Rob R.; Klaas Sijtsma; Nico G. Smid (1988) “Theoretical and Empirical

    V. Preparation for Multidimensional Models

    A. Brief Overview of Matrix Algebra

    B. Vector Geometry and Linear Models

    ** Lattin et al. (2003), pp. 19-32.
    ** Wickens (1995), Chapters 1-5.

    C. The Basic Structure of a Matrix

    VI. Multidimensional Summaries of Multivariate Data

    A. Principal Component Analysis
    ** Bartholomew et al. (2002), Chapter 5.
    ** Lattin et al. (2003), Chapter 4.
    ** Weller and Romney (1990), Chapter 3.
    * Dunteman (1989), Chapters 1-6, 8.

    B. The Biplot: Simultaneous Graphical Representation of Variables and Observations

    ** Jacoby, William G. (1998) Statistical Graphics for Visualizing Multivariate Data. Sage Publications. Chapter 7.

    VII. Factor Analysis

    A. The Common Factor Model

    B. Estimation of the Factor Model

    D. Construction of Factor Scales

    E. Principal Components Analysis Compared to Factor Analysis

    F. Introduction to Confirmatory Factor Analysis

    VIII. Vector Analysis of Preferences

    IX. Spatial Distance Models for Analyzing Proximity Data

    X. The Unidimensional Unfolding Model and Related Approaches

    A. Unfolding Analysis

    B. Proximity and Parallelogram Scaling

    * Cliff, Norman; Linda M. Collins; Judith Zatken; Dannie Gallipeau; Douglas J.

    * Hoijtink, Herbert (1991) “The Measurement of Latent Traits by Proximity Items.”

    * Hoijtink, Herbert and Ivo W. Molenaar (1994) “An Item Response Model with Single Peaked Item Characteristic Curves: The PARELLA Model.” Quality & Quantity 28: 99-116.

    XI. Multidimensional Scaling: Basic Models and Procedures

    A. Classical Multidimensional Scaling

    B. Weighted Multidimensional Scaling

    ** Lattin et al. (2003), pp. 235-243.
    ** Kruskal and Wish (1978), pp. 60-73.
    ** Arabie, Carroll, and DeSarbo (1987), pp. 1-53.
    * Davison (1983), Chapter 6.

    XII. Multidimensional Scaling: Additional Considerations

    A. Interpretation of Multidimensional Scaling Results

    B. Data for Multidimensional Scaling Analyses

    C. Hypothesis Testing and Confirmatory Multidimensional Scaling

    XIII. Multidimensional Unfolding

    ** Poole (2005), Chapters 2-5.
    * Carroll, J.D. (1972) “Individual Differences and Multidimensional Scaling.” In

    * Poole, Keith T. (2000) “Nonmetric Unfolding of Binary Choice Data.” Political

    Political Analysis 9 (Summer 2001). A Special Issue on “Estimating Legislators’ Preferences with Roll Call Data.”

    XIV. Variations on Multidimensional Scaling Models and Estimation Procedures

    XV. Optimal Scoring Approaches

    A. Correspondence Analysis

    ** Weller and Romney (1990), Chapters 5-8.
    ** Dunteman (1989), Chapter 13.
    ** Young (1994).
    * Greenacre (1984), Chapters 3-7.

    B. The Alternating Least Squares, Optimal Scaling Strategy

    * De Veaux, Richard D. (1990) “Finding Transformations for Regression Using the

    XVI. General Theoretical Concerns

    A. Comparison of Scaling Strategies and Relations Between Methods

    Cheung, K. C. and L. C. Mooi (1994) “A Comparison Between the Rating Scale Model and Dual Scaling for Likert Scales.” Applied Psychological Measurement 18: 1-13.

    B. Measurement Theory

    C. Data Theory

    D. Dimensionality
