lanning icps: community structure of jpsp


Upload: kevin-lanning

Post on 27-Jul-2015



Kevin Lanning

Florida Atlantic University, USA

narrative and code at http://bit.ly/StructureOfPsychology

The network structure of social and personality psychology: a bibliometric approach

abstract / introduction

The structure of social-personality psychology includes (but is not limited to) constructs, scholars, papers and the links among them. This project is a case study of part of this network, the 2014 volumes of Journal of Personality and Social Psychology (JPSP). Using techniques borrowed from contemporary bibliometrics and data science, I find (a) that the network cannot be simply or easily parsed into discrete Aristotelian regions, but that (b) a model which allows communities to overlap illuminates core concepts and their relationships. I also (c) examine the sections of the journal and find (d) that there is no clear trend indicating that the three sections – or the two areas of personality and social psychology – are either converging or growing apart.

I used bibliometric couplings (shared references) to explore relations among the 118 papers published in JPSP in 2014. These were linked by 7,248 references (i.e., all of those which included DOI identifiers). Citation information was manually extracted from the proprietary PsycInfo database, then examined using open source software. The network was derived using Gephi, communities were extracted using CFinder, and text analysis was performed using the R text mining (tm) package. Much of the code and a narrative describing this project are posted on GitHub at http://bit.ly/StructureOfPsychology.

JPSP Section(s)            Density (%)
w/in Attitudes (a)         39.1
w/in Interpersonal (i)     26.9
w/in Personality (p)       24.3
Between a & i              21.2
Between a & p              14.4
Between i & p              15.5
Between (a & i) & p        15.0
All sections               20.6

JPSP papers and citations constitute a bipartite (two-mode), directed space which can be represented as [118 sources -> 7,248 target references <- 118 sources]. I reduced this citation network to the single-mode structural network of [118 source articles – 118 source articles] shown in Figure 1. Here, all 118 papers are connected in a dense space of 1,421 edges, forming a (very) small world in which papers are separated by an average of 1.9 links.
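The reduction from the two-mode network to the single-mode network can be illustrated with a minimal sketch. It is not the code used in the project (the network here was derived in Gephi); the function name `coupling_network` and the toy DOIs are hypothetical, and the sketch assumes only that each paper is represented by the set of references it cites.

```python
from itertools import combinations


def coupling_network(references_by_paper):
    """Project a paper -> references mapping onto a paper-paper network.

    Returns {(paper_a, paper_b): number of shared references} for every
    pair of papers that shares at least one reference.
    """
    edges = {}
    for a, b in combinations(sorted(references_by_paper), 2):
        shared = len(set(references_by_paper[a]) & set(references_by_paper[b]))
        if shared > 0:
            edges[(a, b)] = shared
    return edges


# Hypothetical toy data: four papers and the references they cite.
toy_references = {
    "paper_A": {"doi:1", "doi:2", "doi:3"},
    "paper_B": {"doi:2", "doi:3", "doi:4"},
    "paper_C": {"doi:4"},
    "paper_D": {"doi:9"},
}

edges = coupling_network(toy_references)
for pair, weight in edges.items():
    print(pair, weight)
```

In this toy case paper_A and paper_B share two references, paper_B and paper_C share one, and paper_D remains unconnected. The edge weight corresponds to the edge thickness described for Figure 1.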

JPSP is split into three sections. Within this network, articles are more closely connected to papers in the same section than to papers in different sections (Table 1). But attempts to partition the network into simple, discrete communities lacked robustness, with results dependent on initial random seeds. This failure reflects the lack of clear demarcations between areas of scholarship (e.g., Campbell, 1969), and opens the door to a richer approach to community structure.

methods

a structural network

communities

is social-personality a single area?

discussion

If the publication of Allport (1937) marked the birth of personality psychology, and Mischel (1968) marked its (partial) demise, then the 1981 partitioning of JPSP into separate sections marked a resurrection of sorts for the study of personality as a separate endeavor. Today, the autonomy of personality psychology may be questioned: On the one hand, there are few graduate programs in personality psychology proper, reducing the field to an optional appendage at the end of ‘social-.’ On the other, research in the area continues to prosper.

Should ‘social-personality’ be construed as a single area, as two separate ones (social and personality), or as three (the three sections of JPSP)? A weak form of validity for the two- and three-area models would require that the connectedness (or density) of papers within areas is greater than that between areas; a strong form would require that a particular model is superior to any other, including one with overlapping categories. The citation data in JPSP can be used to help evaluate these claims, and provide support for the weak form of the two- and three-area models. That is, papers in the personality section are more frequently connected with each other (24%) than with papers in the social sections (15%). Further, papers within each of the three areas are more closely connected than are papers between areas (Table 1).

In order to provide some context for this effect, I compiled reference data for four additional occasions, covering a total of 25 years of the development of the field. The most visible trend is that there has been greater connectedness over time for all sections of the journal, an effect which is at least partially due to increased network size as more references with DOI identifiers have become available. This effect notwithstanding, two additional findings are of note. First, over this period, the connectedness of the personality section with the remaining sections has been consistently lower than connections between and within the remaining sections. Second, there is no evidence that personality and social psychology are converging; rather, the distance between them may be increasing. Such an effect, if real, may not be an echo of the acrimony of an earlier time, but a manifestation of the general trend towards differentiation in science.

A citation is a social act, as is a written word or the act of reading a scientific paper (or poster). Knowledge is a social product, a connection in a complex network. Yet the network that is social-personality psychology is at present poorly understood, mapped by anecdote and tradition rather than by strong data.

The present study applies diverse methods to the understanding of this network and provides new connections between psychology and data science, such as that between psychological categories and network communities. It is limited in that it thus far focuses on papers and citations within a single journal, and primarily for a single year. With additional data, a category structure could be achieved which could take us beyond arbitrary keywords, increase synergistic connections among us, and foster the creation of intellectual and social capital.


extraction

interpretation

Word clouds are shown in Figure 3. These are individually coherent, particularly given the relatively small number of papers upon which they are based. Collectively, the word clouds attest to the depth and diversity of contemporary work in social and personality psychology.

In the top section, Category 1 (existence) includes terms linked primarily with the literature in terror management (denial, existential, mortality, salience, ego, body), as well as humility, appearance, and scheduling. Terms in Category 3 (love and marriage) include not just partners, wives, and husbands, but also unresponsive, guilt and hurt, reflecting a concern not just with interpersonal processes but more specifically with problems in dyadic relationships. Category 2 (personality measurement) includes terms depicting both measurement (dimensions, indices, correlates, situation, and taxonomy) and content (autism, parenting, and adjustment).

In the bottom section, Category 4 (social games) includes aggression, cooperation, and regret, while Category 5 (free will) is concerned with morality, generosity, and punishment. These two communities are not directly linked, though each overlaps with Category 6 (perspective taking), which includes terms related to wealth and social class. Category 7 includes core concepts from social cognition such as implicit, entity, and framing, while Category 8 (labeled 0 in Figure 3) appears concerned with social distance, including terms such as movement, approaching, aversion and trust.

As with many other concepts and categories, scholarly communities are defined by family resemblance rather than by a set of individually necessary and jointly sufficient attributes (Rosch & Mervis, 1975). That is, there are typically no methods, theories, etc. which are shared by and uniquely characteristic of all papers within a particular area of scholarship. Rather, membership in categories is graded, and an exemplar (here, a paper) may belong to more than one category (area of scholarship).

These principles can largely be instantiated by developing a bottom-up, agglomerative structure based on k-cliques, i.e., groups of k nodes in which every node is connected to every other; a community is the union of k-cliques that can be reached from one another through adjacent cliques sharing k − 1 nodes. Following Palla et al. (2005), I examined the community structure for a series of models at varying levels of k (minimum clique size) and w (minimum link strength for inclusion in the analysis). The solution presented here (k = 4, w = 2) is a narrow one, emphasizing coherence over comprehensiveness, which places 57 of the 118 papers into 8 communities. These communities are linked into two superordinate groupings, one comprising 36 papers in three areas, the other 21 papers in five smaller areas (see Figure 2).
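The logic of k-clique community extraction can be sketched in a few lines. This is an illustrative brute-force version of clique percolation in the spirit of Palla et al. (2005), not the CFinder implementation used here; the function name `k_clique_communities` and the toy edge list are hypothetical, and the enumeration of all k-node subsets is only practical for small networks.

```python
from itertools import combinations


def k_clique_communities(weighted_edges, k=4, w=2):
    """Return overlapping communities as a list of node sets.

    weighted_edges: {(node_a, node_b): link strength}
    k: clique size; w: minimum link strength for a link to be kept.
    """
    # Keep only links whose strength is at least w.
    neighbors = {}
    for (a, b), weight in weighted_edges.items():
        if weight >= w:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    # Enumerate every k-clique by brute force (small networks only).
    cliques = [
        frozenset(group)
        for group in combinations(sorted(neighbors), k)
        if all(v in neighbors[u] for u, v in combinations(group, 2))
    ]

    # Merge cliques that are adjacent, i.e. share exactly k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)


# Hypothetical toy network, analysed with k = 3 to keep it small.
toy_edges = {
    ("a", "b"): 2, ("a", "c"): 3, ("b", "c"): 2,
    ("b", "d"): 2, ("c", "d"): 4,
    ("d", "e"): 2, ("d", "f"): 2, ("e", "f"): 5,
    ("a", "f"): 1,  # below w = 2, so it is dropped
}
communities = k_clique_communities(toy_edges, k=3, w=2)
print(communities)
```

In the toy network, node d belongs to both communities, which is the kind of overlap that a discrete partition cannot represent. Nodes that fall in no k-clique are left unassigned, which is consistent with only a subset of papers being placed into communities.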

Several of the eight communities are essentially subsets of the three sections of the journal: The 15 papers in Community 2 are all drawn from Personality and Individual Differences. In Community 4, 15 of 19 papers are in Interpersonal Relations and Group Processes; while in Community 5, all 5 papers are drawn from the Attitudes and Social Cognition section.

Text analysis. In order to achieve a closer and more rigorous identification of the eight communities, I examined the raw text of the 57 papers. I first removed common stopwords (the), terms judged to be nondiscriminating (abstract), common names, and terms which were used fewer than five times or in only one of the eight corpora (hexacohh). I then double-centered the matrix of words × communities to recover words which are particularly characteristic of each of the eight categories.
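Double-centering can be illustrated with a small sketch. It assumes the matrix holds raw term counts (whether the project used raw counts or a transformed frequency is not stated here), and the function name `double_center` and the toy counts are hypothetical. Each cell has its row mean and column mean subtracted and the grand mean added back, so that terms which are frequent everywhere no longer dominate.

```python
def double_center(matrix):
    """Subtract row and column means and add back the grand mean."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [
        sum(matrix[i][j] for i in range(n_rows)) / n_rows
        for j in range(n_cols)
    ]
    grand_mean = sum(row_means) / n_rows
    return [
        [
            matrix[i][j] - row_means[i] - col_means[j] + grand_mean
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


# Hypothetical toy counts: rows are words, columns are three communities.
words = ["mortality", "partner", "study"]
counts = [
    [9, 1, 2],     # frequent mainly in community 0
    [1, 8, 0],     # frequent mainly in community 1
    [10, 10, 10],  # frequent everywhere, hence not diagnostic
]
centered = double_center(counts)

# Most characteristic word per community: largest centered value in a column.
top_words = [
    words[max(range(len(words)), key=lambda i: centered[i][j])]
    for j in range(len(counts[0]))
]
print(top_words[:2])
```

Although "study" has the highest raw count in every column, it does not come out as the characteristic term of the first two communities once the matrix is centered.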

Figure 1. A JPSP structural network. Nodes represent papers published in JPSP in 2014. Size reflects PageRank; color corresponds to the JPSP section in which the paper appeared. Edge thickness corresponds to the number of references shared by papers. Nodes are labeled by author name, section abbreviation, volume and first page (e.g., Smith_J.a.107.1). The spatial layout was derived using Gephi’s ForceAtlas2 algorithm. A nonlinear spline function has been applied to node size to highlight central papers.

Figure 2. Eight communities of scholarship in JPSP, derived from k-clique analysis. Papers are linked if they share two or more references in common (w = 2). Communities are built from adjacent cliques of four mutually linked papers (k = 4). Links between communities give rise to a hierarchical structural network.

Figure 3. The most characteristic terms in eight communities of JPSP papers. Size of communities is roughly proportional to number of papers, and links between communities indicate overlap (see Figure 2). Within communities, the size of terms is proportional to their differential frequency or diagnosticity. Clockwise from top, the categories include Love and marriage (3), Personality measurement (2), Social distance (0), Social cognition (7), Free will (5), Perspective taking (6), Social games (4), and Mortality and existence (1).

Table 1. Connections within and between papers in JPSP sections. A typical paper in JPSP:Attitudes shares one or more references with 39% of the other papers in Attitudes, 21% of the papers in Interpersonal, and 14% in Personality.
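The densities in Table 1 can be read as the share of possible paper pairs that are actually linked. The sketch below shows one way to compute them; the function name `density` and the toy section memberships are hypothetical, and it assumes a pair counts as linked when it shares at least one reference. As a check on this reading, 118 papers yield 118 × 117 / 2 = 6,903 possible pairs, and 1,421 / 6,903 ≈ 20.6%, the value reported for all sections.

```python
from itertools import combinations, product


def density(linked_pairs, group_a, group_b=None):
    """Share of possible pairs that are linked.

    With one group, all pairs within the group are considered;
    with two groups, all pairs that cross between them.
    """
    linked = {frozenset(pair) for pair in linked_pairs}
    if group_b is None:
        pairs = list(combinations(sorted(group_a), 2))
    else:
        pairs = list(product(sorted(group_a), sorted(group_b)))
    if not pairs:
        return 0.0
    hits = sum(frozenset(pair) in linked for pair in pairs)
    return hits / len(pairs)


# Hypothetical toy data: two sections and three linked pairs.
attitudes = {"a1", "a2", "a3"}
personality = {"p1", "p2"}
links = [("a1", "a2"), ("a2", "a3"), ("a1", "p1")]

within_a = density(links, attitudes)
within_p = density(links, personality)
between_ap = density(links, attitudes, personality)
print(round(within_a, 3), round(within_p, 3), round(between_ap, 3))

# Overall density implied by the figures reported for the 2014 network.
overall = 1421 / (118 * 117 / 2)
print(round(100 * overall, 1))
```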

Figure 4. Density of connections among JPSP sections over time. [Chart: x-axis, year (1994, 1999, 2004, 2009, 2014); y-axis, density (0.0% to 30.0%); series: Within all sections, Between the two social areas, Between all sections, Between personality and social.]