The Impact of Communication Structure and Interpersonal Dependencies on Distributed Teams

Timothy La Fond¹, Dan Roberts¹, Jennifer Neville¹, James Tyler², Stacey Connaughton³

¹Department of Computer Science, ²Department of Psychological Sciences, ³Brian Lamb School of Communication

Purdue University, West Lafayette, IN USA

{tlafond, droberts, neville, tyler, sconnaug}@purdue.edu

Abstract: In the past decade, we have witnessed an explosive growth of the Web, online communities, and social media. This has led to a substantial increase in the range and scope of electronic communication and distributed collaboration. In distributed teams, social communication is thought to be critical for creating and sustaining relationships, but there is often limited opportunity for team members to build interpersonal connections through face-to-face interactions. Although social science research has examined some relational aspects of distributed teams, this work has only recently begun to explore the potentially complex relationship between communication, interpersonal relationship formation, and the effectiveness of distributed teams. In this work, we analyze data from an experimental study comparing distributed and co-located teams of undergraduates working to solve logic problems. We use a combined set of tools, including statistical analysis, social network analysis, and machine learning, to analyze the influence of interpersonal communication on the effectiveness of distributed and co-located teams. Our results indicate there are significant differences in participants’ self- and group perceptions with respect to: (i) distributed vs. co-located settings, and (ii) communication structures within the team.

I. INTRODUCTION

In the past decade, we have witnessed an explosive growth in the use of the Web, online communities, and social media. This has led to widespread adoption of multiple modalities for electronic communication, which in turn has made it easier to form virtual teams and collaborate in a geographically distributed manner. Social communication is thought to be critical for creating and sustaining relationships in these distributed teams. However, while social science research has examined some relational aspects of distributed teams, the approaches that examine the impact of communication on team effectiveness are limited in their focus either on modeling simple aggregate statistics (e.g., communication frequency), or individual dyadic relationships (e.g., between a single pair of individuals). In this work, we combine the use of statistical analysis, social network analysis, and machine learning to analyze the influence of interpersonal communication on the effectiveness of distributed and co-located teams.

Communication is often described as the glue that holds distributed teams together, and lack of personal relationships can have a major impact on team dynamics in distributed settings [1]. Although the relational aspects of teams are acknowledged as important, theoretical models that capture these complex patterns still remain to be developed [2]. Moreover, to date social science research has often neglected to fully examine the influence of social communication on interpersonal relationships among team members [3], and the work that does exist has produced rather mixed results. Specifically, some evidence suggests that team effectiveness is challenged because geographic dispersion constrains the interpersonal communication among team members [4]. Other work has analyzed the effect of remote communication on groups by categorizing the communication text itself [5], and the results indicated that distributed communication was more task-oriented and thus more conducive to problem-solving. Additional research has failed to find a negative relationship between team distribution and a variety of interpersonal indices [6]. The tenor of previous research findings reveals the potentially complex relationship between communication, interpersonal relationship formation, and the effectiveness of distributed teams. To tease apart how communication functions in such a context, however, and to understand its impact on the overall effectiveness of a team, requires the combined analysis of both interactions among team members and their individual activity/performance, traits, and perceptions.

In this work, we analyze data from a laboratory study on small cooperative teams conducted at Purdue University, comparing distributed and co-located teams. In the experiment, small teams of three to four undergraduate students were given a logic problem to solve as a group in a 45-minute session. In the first phase of the experiment, the teams were distributed and the participants communicated online through a chat room. In the second phase of the experiment, the teams were co-located and the participants communicated face-to-face in the same room. Each participant answered a survey after the session, in which they evaluated the performance and characteristics of their teammates, their own performance, and the performance of the group as a whole. In addition, the survey assessed the participant’s communication aggressiveness, communication anxiety, and self-esteem. Transcripts of the chat room conversations were recorded, and videos of the face-to-face encounters were manually transcribed into a similar format.

Our analysis focuses on two aspects of the experimental data: determining what factors are predictive of the participants’ evaluations of group performance, and discovering communication structures in the stream of communications among the participants. From the transcripts of the conversations we can determine how frequently each member spoke (or posted), as well as who they were replying to. This facilitates representing the group communication as a weighted graph, where the number of messages sent between two members is summarized as a weight on each edge. We use this structure to identify group types and determine member roles within those groups. We combine the communication information with the survey information (i.e., personal, teammate, and group impressions) and employ analytic methods to compare the properties of distributed and face-to-face groups.

Our main findings include:

• We identify three types of groups (hub, outlier, equal) and four individual roles (hub, spoke, outlier, equal).

• Individuals identified as hubs are more assertive than those in other roles, while outliers are less assertive.

• In co-located and distributed settings, outlier individuals rate themselves lower on almost every category of positive evaluation (compared to other roles). However, in co-located teams, outliers also report significantly higher levels of communication aggressiveness.

• Participants in co-located settings rate their groups more highly in terms of productivity, effectiveness, trust, cohesion, and satisfaction than individuals in distributed groups. Moreover, participants in equal-type groups rate their group significantly higher on cohesion compared to participants in other (co-located) group types. They also report significantly higher levels of self-likability.

• We learned statistical models to predict individual ratings of group productivity, effectiveness, trust, cohesion, and satisfaction; the models for predicting group trust are most accurate overall. However, we observe significant differences between the models for distributed and co-located settings: there is a larger improvement over baseline for predicting (i) productivity in distributed settings, and (ii) effectiveness in co-located settings.

• Gender is a significant feature in the predictive models for distributed teams but not in co-located teams. For predicting group productivity specifically, outgoing evaluations (i.e., ratings of teammates) of respect and productivity are more important in the distributed setting, while outgoing evaluations of task-orientation and likability are more important in the co-located setting. For predicting group effectiveness, both outgoing and self evaluations of productivity are more important in the distributed setting.

II. DATA DESCRIPTION

We analyzed data gathered at Purdue University for an experiment in small-group team communication, which consisted of two phases. During the experiment, students were assigned to teams of three to four individuals. Participants in Phase I communicated in a distributed fashion using an online chat room (in geographically dispersed locations), and participants in Phase II communicated face-to-face in the same room. We will hereafter refer to Phase I as Chat and Phase II as Face-to-face. In Chat, there were 79 groups of size three and 48 groups of size four, while Face-to-face had 27 groups of size three and 35 groups of size four. In both settings, each team was given a logic problem to solve as a group over a 45-minute time period. An example of one such puzzle: given a set of names, occupations, companies, and constraints, match each person to their correct occupation and company.

After the session, participants evaluated the traits and performance of each member (including themselves), as well as the performance of the group as a whole. The set of survey questions assessed participants on: competence, conflict (with the team), dominance, level of emotion (nervousness), involvement, likability, productivity, respectability, task-orientation, and trustworthiness. The participants also evaluated the group as a whole on: cohesion, effectiveness, productivity, trustworthiness, and satisfaction. In addition, individuals answered a set of questions that assessed their level of communication anxiety, communication aggressiveness, and self-esteem. Each question used a rating scale of 1-7, except for the self-esteem questions, which used a scale of 1-5.

Transcripts of the group conversations during the session were recorded: the chat room interactions were recorded electronically; the face-to-face interactions were videotaped. The transcripts of Face-to-face sessions tended to be longer than those of Chat, with the participants of Face-to-face speaking an average of 31.2 times while the participants of Chat posted an average of 8.6 messages.

The survey data can be represented as a collection of attributed graphs, one for each team. Each graph $G = \{V, E, X, Y\}$ consists of nodes $V$ (one for each group member and the group itself) and edges $E$ among all pairs of nodes. Then $x^k_{i \to j} \in X$ is the evaluation node $i$ gave regarding node $j$ in terms of feature $k$, and $y^m_i \in Y$ is the evaluation node $i$ gave the group as a whole on feature $m$. Note that $X$ includes self-evaluations. Specific features for $X$ and $Y$ are listed in Table I. Each feature is calculated as a sum of several survey questions (e.g., self-esteem is determined from 10 questions).

Figure 1 illustrates a graph for one group, from the point of view of a single participant. Each Person’s evaluations (for self, incoming, and outgoing) consist of a feature vector X containing the set of numerical features listed under Individual Features in Table I. The group evaluations consist of a feature vector Y containing the set of numerical features listed under Group Features in Table I. In Section IV, we report the results of learning statistical models to predict a Person’s group evaluation based on the individual’s self evaluations, the incoming evaluations from their teammates, the outgoing evaluations of their teammates, as well as the self-evaluations and group evaluations of the teammates.

III. COMMUNICATION PATTERNS

We supplemented the survey data with weights on the edges $E$ derived from the conversation transcripts, where $w_{i \to j}$ for $e_{ij} \in E$ is the number of replies node $i$ sent to node $j$ during the course of the conversation. Then the complete representation of each group is $G = \{V, E, W, X, Y\}$, which also includes the message frequencies $W$.

TABLE I
DATA FEATURES

Group Features
Cohesion                 Satisfaction
Effectiveness            Trust
Productivity

Individual Features
Competence (to work)     Respect
Conflict                 Task-Orientation
Dominance                Trust
Emotion                  Communication Aggressiveness (α)
Involvement              Communication Anxiety (α)
Liking                   Self-Esteem (Rosenberg) (α)
Productivity
(α) self-report features only

[Figure 1: diagram with nodes Person, Teammate, Teammate, and Group; edges labeled Self eval, Group eval, Outgoing eval, Incoming eval, Teammate self eval, and Teammate group eval.]

Fig. 1. Illustration of the types of survey questions reported by each individual and their teammates.

We construct the weights $W$ by considering each message in the transcript to be a reply directed towards the immediately preceding message, with the following exceptions: individuals are assumed not to reply to their own messages or to messages of trivial length (here we define ‘trivial’ to be messages of three words or fewer). Each message is therefore treated as a reply to the most recent non-trivial message from another participant. With this approach, we transform the transcript into a stream of messages $M$ where every message $m^t_{i \to j}$ is a reply from node $i$ directed toward node $j$ that occurs at time $t$. The weights on the directed graph can be easily computed from this stream as $w_{i \to j} = \sum_{t=1}^{|M|} m^t_{i \to j}$. Figure 2 shows a snippet from one of the transcripts and its resulting message stream. Note that the participant labeled IL receives no replies during this period, as they contribute only trivial one-word responses.
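The reply rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `reply_stream` and `edge_weights`, the whitespace-based word count, and the `(speaker, text)` input format are all assumptions made for the example. The sample input is the dialogue fragment shown in Fig. 2.

```python
from collections import Counter

TRIVIAL_MAX_WORDS = 3  # messages of three words or fewer count as "trivial"


def reply_stream(transcript):
    """Turn [(speaker, text), ...] into a list of (sender, target) replies."""
    stream = []
    history = []  # (speaker, is_trivial) for every earlier message
    for speaker, text in transcript:
        # Walk backwards to the most recent non-trivial message by someone else.
        target = next(
            (prev for prev, trivial in reversed(history)
             if prev != speaker and not trivial),
            None,
        )
        if target is not None:
            stream.append((speaker, target))
        # Assumption: words are whitespace-separated tokens.
        history.append((speaker, len(text.split()) <= TRIVIAL_MAX_WORDS))
    return stream


def edge_weights(stream):
    """w[(i, j)] = number of replies that i directed at j."""
    return Counter(stream)


# The dialogue fragment from Fig. 2 (text copied as shown, typos included).
fig2 = [
    ("AS", "i don't know if this helps you guys but for..."),
    ("QL", "Frank West, purchaser (Maximus), Tuesday"),
    ("AS", "i don't knpw how you got that, but BRILLIANT!"),
    ("QL", "Cythia Wild, Traffic controller (Metropolis), Thursday"),
    ("IL", "up"),
    ("QL", "Mary Prime, Design engineer (Sierra computing)..."),
    ("IL", "yup"),
]
stream = reply_stream(fig2)
weights = edge_weights(stream)
```

On this fragment the sketch reproduces the stream column of Fig. 2: the opening message has no target, and IL never appears as a target because both of its messages are trivial.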

Transcript                                                     Stream
AS: i don’t know if this helps you guys but for...             AS
QL: Frank West, purchaser (Maximus), Tuesday                   QL → AS
AS: i don’t knpw how you got that, but BRILLIANT!              AS → QL
QL: Cythia Wild, Traffic controller (Metropolis), Thursday     QL → AS
IL: up                                                         IL → QL
QL: Mary Prime, Design engineer (Sierra computing)...          QL → AS
IL: yup                                                        IL → QL

Fig. 2. Example of dialogue fragment and corresponding reply stream representation. Note: IL does not receive replies as their messages are trivial.

[Figure 3: two three-axis plots, (a) Chat and (b) Face-to-Face, with axes Person 1, Person 2, and Person 3, each ranging from 0.1 to 0.9.]

Fig. 3. Group communication patterns based on incoming message weights: (a) groups from Chat, (b) groups from Face-to-face. Blue circles are outlier groups, red crosses are hubs, and black triangles are equal. Persons 1, 2 and 3 are chosen in descending order of outgoing message frequency for consistency.

We also define additional measures that further aggregate the information in the message stream. We define an “outgoing message weight” and “incoming message weight” score for each node that represents a node’s prominence as a generator or receiver of replies. The outgoing weight $O_i$ and incoming weight $I_i$ are calculated as follows:

$$O_i = \frac{\sum_{j \neq i}^{V} w_{i \to j}}{\sum_{j}^{V} \sum_{k \neq j}^{V} w_{j \to k}} \qquad I_i = \frac{\sum_{j \neq i}^{V} w_{j \to i}}{Z \cdot \sum_{j \neq i}^{V} \sum_{k \neq j}^{V} w_{j \to k}} \tag{1}$$

Here $Z = \sum_{i}^{V} I_i$ is a normalization constant that ensures the incoming message weights sum to one. The outgoing weight $O_i$ is simply the fraction of messages that an individual produces. The incoming weight $I_i$ is the fraction of the incoming messages a node receives over the sum of outgoing messages across every other node in the group; this requires normalization as nodes cannot be the target of their own messages and so the denominator varies for each node. The resulting measure can be viewed as the ratio of incoming messages a node receives vs. the number of messages the node could have received but were sent to some other node.
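A small numeric sketch may help make Eq. (1) concrete. The code below is illustrative only: the function name `message_weights` is hypothetical, and it reads $Z$ as the sum of the unnormalized incoming ratios, which is an interpretation of the definition rather than something the text states explicitly. The reply counts are those of the Fig. 2 fragment.

```python
def message_weights(w, members):
    """Return (O, I) dicts following Eq. (1); w maps (sender, target) -> count."""
    sent = {i: sum(c for (s, _), c in w.items() if s == i) for i in members}
    received = {i: sum(c for (_, t), c in w.items() if t == i) for i in members}
    total = sum(sent.values())

    # O_i: share of all replies in the group that member i produced.
    outgoing = {i: sent[i] / total if total else 0.0 for i in members}

    # Unnormalized I_i: replies received over replies sent by everyone else.
    raw = {}
    for i in members:
        others = total - sent[i]
        raw[i] = received[i] / others if others else 0.0

    # Assumption: Z is the sum of the unnormalized ratios, so that I sums to one.
    z = sum(raw.values())
    incoming = {i: raw[i] / z if z else 0.0 for i in members}
    return outgoing, incoming


# Reply counts for the Fig. 2 fragment: QL->AS three times, AS->QL once, IL->QL twice.
w = {("QL", "AS"): 3, ("AS", "QL"): 1, ("IL", "QL"): 2}
O, I = message_weights(w, ["AS", "QL", "IL"])
```

Under these assumptions IL has a nonzero outgoing weight (it sends two of the six replies) but an incoming weight of zero, which is the kind of separation the incoming weight is meant to expose.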

By assigning a message weight to each member of a group, we can represent the communication pattern of a group as a point in a three- or four-dimensional space. Figure 3 shows the distribution of the incoming message weights of the chat room and face-to-face groups of size three. Each axis represents the incoming message weight of an individual, where persons 1-3 are the members of the group, labeled in descending order by the number of messages they send. We plot the incoming message weight rather than the outgoing message weight since $I_i$ separates the nodes into more distinct categories; in particular, nodes that generate many trivial one- or two-word messages may have a very low $I_i$ even if their outgoing message weight appears reasonable.

Using the incoming and outgoing weights as a guideline, we define classifications for the group communication structures as well as roles that the group members play. We divide the groups into three categories: Outlier, Hub, and Equal. Outlier groups have an individual who both talks infrequently and receives extremely little communication from others in the group. Hub groups have an individual who is the main focus of the conversation, with both a large incoming message weight and usually a large outgoing message weight as well. The role of this primary individual is referred to as a Hub; the other members of the group are called Spokes. Spokes tend to communicate at an average outgoing rate but usually direct their messages towards the Hub rather than other members of the team. The third group type is an Equal group, where all members communicate at a similar rate. Figure 4 shows the three different group types and the role of each member.

TABLE II
DISTRIBUTION OF GROUP TYPES

           Chat Size 3   Chat Size 4   F2F Size 3   F2F Size 4
Hub        34            37            15           18
Outlier    18            27            8            25
Equal      25            5             8            9

TABLE III
DISTRIBUTION OF ROLE TYPES

           Chat Size 3   Chat Size 4   F2F Size 3   F2F Size 4
Hub        58            88            30           61
Outlier    26            52            13           34
Spoke      72            116           26           77
Equal      75            20            24           36

[Figure 4: three diagrams. (a) Hub pattern: Hub, Spoke, Spoke. (b) Outlier pattern: Hub, Outlier, Spoke. (c) Equal pattern: Equal, Equal, Equal.]

Fig. 4. Communication patterns determined by frequency thresholds.

We define classification rules in terms of thresholds on the incoming message weights. Assuming that members all generate the same number of messages and distribute them equally across the group gives an incoming weight for each node of $1/|g_i|$, where $|g_i|$ is the size of $i$’s group; we refer to this weight as the baseline weight. If an individual’s incoming message weight is greater than 125% of the baseline, we categorize the node as a Hub; similarly, individuals with less than 50% of the baseline are categorized as Outliers. After identifying the Outliers and Hubs we assign a type to each group. If a group contains an Outlier node it is categorized as an Outlier-type group; if there are no Outliers but there is a Hub node then the group is categorized as a Hub type; if there are no Hubs or Outliers then the group is categorized as an Equal type. After classifying the group we assign a role to the remaining nodes, which had a message weight closer to the baseline. If the individual is a member of an Equal group they are categorized as Equals; otherwise they are Spokes. Although Equal and Spoke nodes may have similar incoming message weights, we treat them as distinct roles because they operate in different overall communication structures.
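The threshold rules above translate directly into code. The following is a minimal sketch; the function name `classify_group` and the dictionary input format are assumptions made for illustration, while the 125% and 50% thresholds and the precedence of Outlier over Hub come from the text.

```python
HUB_FACTOR = 1.25      # more than 125% of the baseline weight -> Hub
OUTLIER_FACTOR = 0.50  # less than 50% of the baseline weight -> Outlier


def classify_group(incoming):
    """Map {member: incoming weight} to (group type, {member: role})."""
    baseline = 1.0 / len(incoming)  # weight each member would get if all were equal
    roles = {}
    for member, weight in incoming.items():
        if weight > HUB_FACTOR * baseline:
            roles[member] = "Hub"
        elif weight < OUTLIER_FACTOR * baseline:
            roles[member] = "Outlier"

    # Group type: an Outlier takes precedence over a Hub; otherwise Equal.
    if "Outlier" in roles.values():
        group_type = "Outlier"
    elif "Hub" in roles.values():
        group_type = "Hub"
    else:
        group_type = "Equal"

    # Remaining members are Equals in an Equal group and Spokes otherwise.
    default_role = "Equal" if group_type == "Equal" else "Spoke"
    for member in incoming:
        roles.setdefault(member, default_role)
    return group_type, roles


# Hypothetical incoming weights for a three-person group.
example_type, example_roles = classify_group({"AS": 0.375, "QL": 0.625, "IL": 0.0})
```

With a baseline of 1/3, the example group is labeled an Outlier group containing one Hub, one Outlier, and one Spoke, the same composition as the Outlier pattern in Fig. 4(b).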

The distribution of the group types across both phases and group sizes is shown in Table II. Applying a chi-square test to Table II shows that while the distribution of group types is not significantly different between the two communication settings, it is different for the two sizes of groups. Groups of size three have a tendency to be Hubs or Equal, while in groups of size four Outliers are more prominent (p ≪ 0.01). This pattern occurs despite the fact that a group of size four has an even more extreme threshold for nodes to be considered outliers. This may be because it is harder for each member to get an opportunity to speak in a larger group, so individuals who are not assertive will not get a chance to communicate. Applying a chi-square test to the node roles in Table III gives a similar result: the number of nodes in each role is not significantly different between Face-to-face and Chat, but groups of size four have significantly fewer Equals and more Outliers compared to groups of size three (p ≪ 0.01).
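As a rough check of this comparison, a Pearson chi-square test can be run on the counts in Table II. The sketch below is not the authors' procedure: collapsing the table by setting and by group size is an assumption about how the comparison is set up, and the function names are hypothetical. It avoids external libraries by using the fact that a chi-square variable with exactly two degrees of freedom has upper-tail probability exp(-x/2).

```python
import math

# Group-type counts from Table II: (Chat size 3, Chat size 4, F2F size 3, F2F size 4).
TABLE_II = {
    "Hub": (34, 37, 15, 18),
    "Outlier": (18, 27, 8, 25),
    "Equal": (25, 5, 8, 9),
}


def chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_two_dof(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 d.o.f."""
    return math.exp(-stat / 2.0)


# Assumption: collapse Table II once by setting and once by group size.
by_setting = [(a + b, c + d) for a, b, c, d in TABLE_II.values()]  # Chat vs. F2F
by_size = [(a + c, b + d) for a, b, c, d in TABLE_II.values()]     # size 3 vs. size 4

stat_setting, dof_setting = chi_square(by_setting)
stat_size, dof_size = chi_square(by_size)
p_setting = p_value_two_dof(stat_setting)
p_size = p_value_two_dof(stat_size)
```

Under this collapsing, both tables are 3 × 2 and therefore have two degrees of freedom, so the closed-form p-value applies. A general-purpose routine from a statistics library would be the more usual choice for tables of other shapes.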

A. ANOVA Analysis

After categorizing the participants into Roles and the groups into Types, we analyzed the participants’ self- and group evaluations to assess if there is any relationship between the reported scores and the types of observed communication structures. At the same time, we tested whether significant differences occur across the two communication settings (i.e., phases), to assess whether individuals perceive distributed teams differently from face-to-face teams. To do this, we used 2-way ANOVA tests using Role (R) and Phase (P) as factors over individuals, and Type (T) and Phase (P) over groups.

Table IV shows the ANOVA results for Role and Phase. The main columns of the table report the average ratings for individuals in each setting (Chat, Face-to-face) and role (hub, outlier, spoke, equal). Bold values represent ratings that are significantly different across the groups, with an up- or down-arrow indicating the direction of difference. The R, P, and R × P columns report the F-values from the ANOVA tests for each factor and their interaction, respectively. A notation of * indicates a score that is significant at p < 0.1, ** indicates p < 0.05, and *** indicates p < 0.01. The rows of the table can be roughly separated into three categories: individual performance evaluations (Dominance through Conflict), individual social characteristics (Communication Anxiety through Rosenberg Self-Esteem), and group evaluations (Group Effectiveness through Group Satisfaction). In this table, we report self-evaluations; incoming evaluations are highly correlated with self-evaluations and so have been omitted for space.

The most obvious pattern in Table IV is the low self-evaluations for Outlier individuals. Since these individuals were not involved in the group discussions, it is perhaps reasonable that they would rate their contributions to the team lower than the other team members. Hub nodes rate themselves significantly higher on Dominance (in both settings), as well as Productivity and Competence (in Chat). Since these individuals were more central to the conversation, it also seems reasonable that they would rate themselves higher in terms of assertiveness and dominance. Interestingly, however, the effect seems larger in the Chat setting than in the Face-to-face setting. Spoke nodes and Equal nodes do not exhibit a significant difference in their self-evaluations, with the exception of a higher than average Likability self-evaluation for Equal nodes in face-to-face settings.

TABLE IV
ROLE AND SETTING ANOVA

                              Hub             Outlier         Spoke           Equal           F-value
                              Chat    F2F     Chat    F2F     Chat    F2F     Chat    F2F     R          P          R×P

Dominance                     ⇑4.67   ⇑4.53   ⇓3.38   ⇓3.10    3.82    3.85    4.07    4.01   40.36***   0.94       0.54
Involvement                    5.55    5.75   ⇓4.63   ⇓4.44    5.25    5.41    5.47    5.56   28.9***    1.92       0.91
Productivity                  ⇑6.02    6.23   ⇓5.21   ⇓5.23    5.67    5.80    5.72    6.15   17.7***    5.78**     0.85
Respect                        6.0     6.04   ⇓5.61   ⇓5.76    6.01    6.18    6.02    6.26   6.12***    4.09**     0.33
Competence                    ⇑5.83    5.82   ⇓4.97   ⇓5.10    5.52    5.59    5.54    5.78   10.75***   0.92       0.29
Trust                          6.20    6.23   ⇓5.73   ⇓5.95    6.18    6.14    6.14    6.30   7.43***    1.04       0.91
Liking                         6.05    6.07   ⇓5.74   ⇓5.68    6.06    5.94    5.85   ⇑6.18   5.10***    0.10       2.49*
Emotion                        5.73    5.70   ⇓5.15   ⇓5.49    5.68    5.69    5.53    5.73   6.20***    1.37       1.20
Conflict                       1.60    1.59    1.87    1.79    1.73    1.62    1.85    1.64   2.46*      2.23       0.36
Task-Oriented                  5.81    5.97   ⇓5.06   ⇓5.68    5.68    5.85    5.84    5.88   10.85***   8.64***    2.17*

Communication Anxiety          2.0     1.90   ⇑2.15   ⇑2.22    2.10    2.16    2.07    2.00   8.37***    0.11       1.81
Communication Aggressiveness  ⇑2.25    2.04   ⇑2.30   ⇑2.28   ⇑2.34    2.20   ⇑2.29    2.05   3.36**     20.15***   1.34
Rosenberg Self-Esteem          3.52    3.64   ⇓3.36   ⇓3.39    3.55    3.53    3.48    3.58   2.80**     1.45       0.63

Group Effectiveness            5.17   ⇑5.65    5.04   ⇑5.47    5.19   ⇑5.55    5.18   ⇑5.73   0.51       19.74***   0.16
Group Productivity             5.04   ⇑5.55    4.97   ⇑5.36    5.11   ⇑5.46    5.04   ⇑5.71   0.46       27.72***   0.64
Group Trust                    5.52   ⇑5.98    5.33   ⇑5.81    5.61   ⇑5.98    5.59   ⇑6.08   0.10       37.29***   0.18
Group Cohesion                 3.95   ⇑4.52    3.99   ⇑4.30    4.07   ⇑4.40    3.89   ⇑4.93   0.42       32.34***   2.69**
Group Satisfaction             5.03   ⇑5.58    4.90   ⇑5.34    5.03   ⇑5.43    5.09   ⇑5.81   1.31       29.60***   0.59

TABLE V
GROUP TYPE AND SETTING ANOVA

                  Hub             Outlier         Equal           F-value
                  Chat    F2F     Chat    F2F     Chat    F2F     T          P          T×P

Task-Oriented      5.70    5.86   ⇓5.47    5.86    5.84    5.88   2.53*      9.36***    1.75
Involvement        5.30    5.43   ⇓5.14   ⇓5.29    5.47    5.56   3.77**     2.33       0.97
Liking             6.03    5.99    5.95    5.89    5.85   ⇑6.18   0.37       0.22       3.06**

Group Cohesion     3.98    4.57    4.06    4.30    3.89   ⇑4.93   0.44       32.8***    4.68***
Group Trust        5.61    6.10   ⇓5.40   ⇓5.80    5.59    6.08   3.76**     41.0***    0.20

For the social characteristic self-evaluations, the Outliers tend to rate themselves higher in Communication Anxiety and Communication Aggressiveness, and lower in Self-Esteem. So in addition to rating themselves lower on contributions to the team, Outliers also report more anxiety, exhibit more aggressiveness, and report lower self-esteem. In terms of phase, the members of Chat teams also tend to report higher levels of Communication Aggressiveness, which may be a result of perceived communication difficulties in distributed settings.

For the group performance features, the evaluations are clearly split by phase: participants in the Face-to-face groups rate the performance of the group significantly higher in every category. This may also indicate the difficulties inherent to communication and cooperation in distributed settings.

Table V shows a similar analysis using Group Type and Phase as factors. For space considerations, we omit the features that exhibit a difference only across phases, which can be observed in the previous table. Since Hub and Outlier groups comprise a set of individuals with different node roles, the distinction between the roles is lost and the ratings mostly converge to similar distributions. However, there are a few features that exhibit significant differences across group Type. Members of Outlier groups rate themselves lower with respect to Involvement and Group Trust, and in the Chat phase the individuals in these groups also rate themselves lower on Task-Orientation. Members of Equal groups, on the other hand, rate themselves higher with respect to Likability and Group Cohesion, but only in the Face-to-face phase.

Although it is clear that the team communication setting (i.e., distributed vs. co-located) has the largest effect on individual evaluations of group performance, we also investigated second-order associations between individual/team-member evaluations and assessments of group performance. In the next section we report the results of learning models to predict the individual ratings of group performance in each setting, and we use these models in a set of hypothesis tests to establish the relationship between individual characteristics/perceptions and group evaluation.

[Figure 5: bar chart. x-axis: Group Evaluation Variable (Effective, Productivity, Trust, Cohesion, Satisfaction); y-axis: MSE, from 0.0 to 2.5; series: Chat Baseline, Chat Regression, F2F Baseline, F2F Regression.]

Fig. 5. MSE of linear regression models when predicting group evaluation variables, compared to baseline of the population mean (blue = chat, red = face-to-face; lighter colors = baseline).

IV. PREDICTING GROUP EVALUATIONS

In order to determine which factors had the greatest influence on group performance we used two machine learning techniques, linear regression and decision trees, to predict the way individuals rate their groups. We considered each of the group evaluation features as a target classification variable: productivity, effectiveness, trust, satisfaction, and cohesion. The predictor variables included all other available features, including self-, incoming, and outgoing evaluations; the group evaluations of teammates; communication anxiety, communication aggressiveness, and Rosenberg self-esteem scores; calculated message weights; and gender. By determining the most important features for predicting an individual’s evaluation of group performance, we can begin to understand which aspects of the individual (and team) are correlated with perceptions of team performance and compare any differences in important predictors across the Chat and Face-to-face settings.

A. Linear Regression

Linear regression models are a subset of statistical regression models wherein a scalar target variable is represented by a linear combination of predictor variables. In addition to its use as a predictive model, linear regression can be used to evaluate the strength of the relationship between the target variable and the predictor variables by analyzing the vector of estimated feature coefficients. This is the application of linear regression that we use: for each group evaluation variable we learn two models, one for the Chat setting and one for the Face-to-face setting. We then investigate whether different features are important across the two settings.

Figure 5(a) shows the average performance of the learned regression models compared to a baseline model that uses the population mean for prediction. We use ten-fold cross-validation, measure mean squared error (MSE) of the predictions, and report average MSE across the ten folds. There are two models that exhibit varying performance gains across the settings. In the Face-to-face setting, the model for group effectiveness shows a much larger improvement over baseline than the same model for the Chat setting. For group productivity this effect is reversed: the model in the Chat setting shows a larger improvement over baseline than the Face-to-face model.
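The evaluation protocol described here (ten-fold cross-validation, average MSE, comparison against a mean-prediction baseline) can be sketched as follows. This is an illustration under simplifying assumptions, not the study's code: the function names are hypothetical, and to stay dependency-free the regression uses a single predictor in closed form, whereas the models in the paper use many features.

```python
import random


def kfold_mse(X, y, fit, k=10, seed=0):
    """Average held-out MSE over k folds; fit(X, y) must return a predict function."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        errors = [(predict(X[i]) - y[i]) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / k


def fit_mean(X, y):
    """Baseline: always predict the mean of the training targets."""
    mean = sum(y) / len(y)
    return lambda row: mean


def fit_simple_linear(X, y):
    """Least-squares line on the first column of X (one predictor, for brevity)."""
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
    return lambda row: my + slope * (row[0] - mx)
```

Calling `kfold_mse(X, y, fit_mean)` and `kfold_mse(X, y, fit_simple_linear)` on the same data gives the two quantities compared in Fig. 5 for one target variable; the gap between them is the improvement over baseline.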

TABLE VI
COEFFICIENTS OF THE LINEAR REGRESSION

Group Effectiveness
Chat                                                  Face-to-face
Variable                          Coef.   ∆Err        Variable                          Coef.   ∆Err
Outgoing Productivity             0.47    0.16        Outgoing Productivity             0.51    0.05
Self-Productivity                 0.23    0.04        Self-Productivity                 0.31    0.03
Teammate Group Eff.               0.27    0.06        Teammate Self-Comp.               -0.14   0.004†
Gender                            0.22    0.01

Group Productivity
Chat                                                  Face-to-face
Variable                          Coef.   ∆Err        Variable                          Coef.   ∆Err
Outgoing Productivity             0.40    0.07        Outgoing Productivity             0.47    0.18
Self-Productivity                 0.21    0.03        Self-Productivity                 0.42    0.06
Teammate Group Prod.              0.172   0.01        Teammate Self-Comp.               -0.23   0.03†
Outgoing Likability               0.18    0.007†      Incoming Involvement              -0.16   0.05†
Gender                            0.24    0.01
Self-Conflict                     -0.13   0.007

Group Trust
Chat                                                  Face-to-face
Variable                          Coef.   ∆Err        Variable                          Coef.   ∆Err
Self-Trust                        0.28    0.03        Outgoing Trust                    0.57    0.18
Outgoing Respect                  0.18    0.016       Self-Conflict                     -0.28   0.01
Outgoing Productivity             0.20    0.018       Self-Productivity                 0.20    0.04
Outgoing Trust                    0.25    0.012       Teammate Self-Prod.               0.17    0.004†
Self-Likability                   0.14    0.002†

Group Cohesion
Chat                                                  Face-to-face
Variable                          Coef.   ∆Err        Variable                          Coef.   ∆Err
Outgoing Productivity             0.38    0.05        Self-Involvement                  0.42    0.12
Outgoing Likability               0.36    0.04        Out. Task-Oriented                0.44    0.07
Gender                            0.25    0.01        Inc. Message Weight Max Change    -6.95   0.03
Self-Task-Oriented                0.15    0.006†      Gender                            0.43    0.03
Outgoing Conflict                 0.23    0.03
Self-Dominance                    0.11    0.006†

Group Satisfaction
Chat                                                  Face-to-face
Variable                          Coef.   ∆Err        Variable                          Coef.   ∆Err
Outgoing Productivity             0.42    0.05        Outgoing Respect                  0.46    0.11
Outgoing Likability               0.30    0.02        Self-Productivity                 0.33    0.06
Outgoing Dominance                0.15    0.005†      Self-Likability                   0.38    0.05
                                                      Self-Respect                      -0.28   0.02

To determine the features with the strongest association to group evaluations, we first examine the features that the linear regression models determine to be significant. The significant coefficients (p < 0.05) learned for the model over each setting are shown in Table VI. The italicized features are those that are unique to one of the settings. As expected, many of the significant variables are predictors across both phases. However, some variables are only significant in a single setting; for example, gender shows up more frequently in the Chat models. We also note that, with the exception of outgoing productivity, the significant features in the Face-to-face setting tend to be self-evaluations rather than outgoing (i.e., teammate) evaluations, while the reverse is true in the Chat setting.

The features with negative coefficients also indicate interesting aspects of the models. Two of the models have negative associations between self evaluations of conflict and group performance, which indicates the importance of group harmony on effectiveness. Two of the models also have negative associations between teammate evaluations of competence and group performance. However, while these competence evaluations were deemed to be significant with respect to the fit of the regression models, our randomization tests (described next) show they do not have a significant impact on the overall accuracy of the model.

The p-value for a coefficient in a regression model tests whether the coefficient is significantly different from zero. While a small p-value suggests that a feature is important to the model, we can use more targeted randomization tests to assess the impact the feature has on the accuracy of the model. Randomization tests are a hypothesis testing technique in which some associations in the data are eliminated through permutation; by repeating this permutation through simulations, the null distribution of the data can be estimated empirically [7], [8].

In our analysis, we permuted the significant features one at a time by randomly shuffling the values in the data column for that feature. From the permuted data, we learned a new regression model. Note that the permutation process destroys any association between the feature values and the class label, thus rendering the feature useless to the model. By repeating this permutation/learning process multiple times, we can generate a null distribution of the model MSE when the feature under consideration has no association with the class label. We then perform a significance test to determine if the MSE using the true values of the feature is significantly less than the MSE under the null hypothesis. The magnitude of the improvement over the null is a direct measure of the impact of the feature on the accuracy of the model. This provides a better indication of the importance of the feature to the model than the magnitude of coefficients or their p-values.
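The permutation procedure described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: it assumes ordinary least squares with an intercept, measures MSE on the training data, and uses a one-sided empirical p-value with a +1 correction. The function names (`fit_ols`, `mse`, `permutation_test`) and the synthetic data are hypothetical.

```python
import random

def fit_ols(X, y):
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    preds = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def permutation_test(X, y, col, n_perm=200, seed=0):
    """Return (delta_mse, p_value) for shuffling feature column `col`."""
    rng = random.Random(seed)
    X = [list(r) for r in X]
    true_mse = mse(X, y, fit_ols(X, y))
    null_mses = []
    for _ in range(n_perm):
        shuffled = [r[col] for r in X]
        rng.shuffle(shuffled)  # destroys the feature/label association
        Xp = [r[:col] + [v] + r[col + 1:] for r, v in zip(X, shuffled)]
        null_mses.append(mse(Xp, y, fit_ols(Xp, y)))  # relearn the model
    delta = sum(null_mses) / n_perm - true_mse
    # One-sided test: how often is the null MSE as low as the true MSE?
    p_value = (1 + sum(m <= true_mse for m in null_mses)) / (n_perm + 1)
    return delta, p_value

# Synthetic data (assumption): feature 0 drives y, feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(60)]
y = [2.0 * a + data_rng.gauss(0, 0.5) for a, _ in X]
delta_signal, p_signal = permutation_test(X, y, col=0)
delta_noise, p_noise = permutation_test(X, y, col=1)
```

In this toy setting, shuffling the informative column should raise the MSE far more than shuffling the noise column, which is the contrast the ∆MSE values are meant to capture.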

The change in MSE that we measured with the randomization tests is reported as ∆MSE values in Table VI (the ∆Err column), with a † indicating when the difference is not significant. Features that have significant coefficients in the regression model but an insignificant impact on the model’s MSE correspond to redundant information in the data: when the feature information is destroyed through permutation, an equally accurate model can still be learned with some other combination of features. Features that result in a large ∆MSE carry information that is not only important to the model but also unique.

Note that the ∆MSE can also reveal differences in otherwise similar regression models. The weights learned for the model predicting effectiveness are very similar in magnitude between the Chat and Face-to-face data; however, the Chat data exhibits a much greater MSE loss when the outgoing productivity features are permuted. In addition, the outgoing productivity and self-productivity features have similar weights in the Face-to-face model of group productivity, but permuting the outgoing productivity feature has a larger impact on the MSE. Comparing with Figure 5(a) shows that the models that rely strongly on a single feature are also the ones that have lower performance overall.

B. Decision Trees

The second model we applied in our analysis was a decision tree. Decision tree algorithms learn models recursively by repeatedly partitioning on the feature that best splits the data with respect to the class label [9]. To construct a binary class label, we labeled the data instances with the highest third of group evaluations as positive examples and the lowest third as negative examples, and we discarded the middle third. In this experiment, the examples from both phases were combined and used in both the training and test sets. Figure 6(a) shows the accuracy of the decision trees compared to a baseline classifier, which uses the majority class for prediction. We report the average results from ten-fold cross validation.
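The label construction and the majority-class baseline can be illustrated with a short sketch. It assumes that thirds are taken by rank and that ties are broken by sort order; the paper does not specify either detail. The function names and example scores are hypothetical.

```python
def tercile_labels(scores):
    """Label the top third 1, the bottom third 0, and the middle third None."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    third = len(scores) // 3
    labels = [None] * len(scores)
    for i in order[:third]:
        labels[i] = 0                      # lowest third: negative examples
    for i in order[len(scores) - third:]:
        labels[i] = 1                      # highest third: positive examples
    return labels

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common retained label."""
    kept = [lab for lab in labels if lab is not None]
    positives = sum(kept)
    return max(positives, len(kept) - positives) / len(kept)

# Hypothetical group-evaluation scores for nine participants.
scores = [3.0, 4.5, 2.0, 5.0, 4.0, 1.5, 3.5, 4.8, 2.5]
labels = tercile_labels(scores)
baseline = majority_baseline_accuracy(labels)
```

Under these assumptions the two retained classes are the same size, so the majority baseline in the sketch is 0.5; with ties or a different split rule the baseline could differ.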

To analyze the importance of different features in the decision tree models, we performed a similar set of randomization tests as with the linear regression. By permuting each significant feature, we can observe the improvement in the classification accuracy of the model over the model learned with the null distribution (of the feature). Figure 6(b, c) shows the improvement for each feature separately. The features are sorted in increasing order of importance in the decision tree model. The combined data set was used in the training sets to learn the models. However, the accuracy of the models was measured on test sets that comprise the examples from (i) the Chat setting only, (ii) the Face-to-face setting only, and (iii) the combined data set. This allows for a comparison of the important predictive features with respect to each phase. Features that show a significant difference between Chat and Face-to-face are marked with * (p < 0.1) or ** (p < 0.05).
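A rough sketch of the per-setting evaluation follows. It makes several assumptions that are not taken from the paper: a one-level decision tree (a stump) stands in for the full decision tree learner, the feature is shuffled in the training data only, and the test data are left intact. All names and the synthetic data are hypothetical.

```python
import random

def fit_stump(X, y):
    """One-level tree: (feature, threshold, label-if-above) with best training accuracy."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({r[j] for r in X}):
            for above in (0, 1):
                acc = sum((above if r[j] > t else 1 - above) == lab
                          for r, lab in zip(X, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, above)
    return best[1:]

def accuracy(stump, X, y):
    j, t, above = stump
    return sum((above if r[j] > t else 1 - above) == lab
               for r, lab in zip(X, y)) / len(y)

def accuracy_change_by_phase(train_X, train_y, test_X, test_y, test_phase,
                             col, n_perm=20, seed=0):
    """Mean drop in test accuracy, per test subset, when `col` is shuffled in training."""
    rng = random.Random(seed)
    subsets = {"Both": list(range(len(test_y)))}
    for name in sorted(set(test_phase)):
        subsets[name] = [i for i, p in enumerate(test_phase) if p == name]

    def score(model, idx):
        return accuracy(model, [test_X[i] for i in idx], [test_y[i] for i in idx])

    true_model = fit_stump(train_X, train_y)
    change = {name: score(true_model, idx) for name, idx in subsets.items()}
    for _ in range(n_perm):
        shuffled = [r[col] for r in train_X]
        rng.shuffle(shuffled)
        Xp = [r[:col] + [v] + r[col + 1:] for r, v in zip(train_X, shuffled)]
        null_model = fit_stump(Xp, train_y)  # model learned under the null
        for name, idx in subsets.items():
            change[name] -= score(null_model, idx) / n_perm
    return change

data_rng = random.Random(7)

def make_rows(n):
    """Synthetic assumption: feature 0 determines the label only in Chat rows."""
    X, y, phase = [], [], []
    for i in range(n):
        row = [data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
        ph = "Chat" if i % 2 == 0 else "F2F"
        label = int(row[0] > 0) if ph == "Chat" else data_rng.randint(0, 1)
        X.append(row)
        y.append(label)
        phase.append(ph)
    return X, y, phase

train_X, train_y, _ = make_rows(120)
test_X, test_y, test_phase = make_rows(120)
change = accuracy_change_by_phase(train_X, train_y, test_X, test_y,
                                  test_phase, col=0)
```

Because the model is trained on the combined data but scored separately on each subset, a feature that matters mostly in one setting shows a larger accuracy change on that setting's test examples.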

For some features there is a significant change in classification accuracy between the phases. This indicates that while the feature may be important in both phases, its importance in one phase is greater than in the other. Examples of this include outgoing evaluations of task-orientation and likability (more important in Face-to-face), and outgoing evaluations of productivity (more important in Chat).

The results of the significance analysis have some parallels with the linear regression. For example, in the model predicting effectiveness, the largest loss in classification accuracy occurs for Chat when the outgoing productivity evaluations are permuted. However, a notable difference is that outgoing evaluations of task-orientation appear as important features in the decision tree models while they do not appear as significant in the regression models. This may indicate that task-orientation is more important for distinguishing high and low ratings of group performance but is not as helpful for predicting the middle range of ratings.

V. RELATED WORK

While the operation of virtual teams has been a focus of social science research, other approaches have only utilized a narrow range of available interpersonal communication data. For example, Chudoba et al. [10] and Jarvenpaa et al. [4] attempt to find factors which indicate better group performance in a virtual setting but do not compare to a face-to-face setting. These papers have focused primarily on discovering the effects of the virtual group setting without necessarily comparing virtual groups to face-to-face ones. Examples of papers that directly compare the two settings are Kerr and Murthy [11], McDonough et al. [12], and Wilson et al. [13]. These papers still rely on metrics that evaluate the group as a whole rather than considering a more detailed view of interpersonal relations and the group structure.

[Figure 6. Panel (a), Prediction accuracy: classification accuracy (0.0 to 1.0) of the decision tree and the baseline on the combined data for Productivity, Effectiveness, Trust, Cohesion, and Satisfaction. Panel (b), Group Productivity: change in classification accuracy for Chat, F2F, and Both; features: Out Trust, *Out Productivity, **Out Liking, *Out Respect, **Out Task-Oriented, Out Involvement, In Weight Variance. Panel (c), Group Effectiveness: change in classification accuracy for Chat, F2F, and Both; features: Out Trust, **Self Productivity, **Out Productivity, *Out Competence, **Out Task-Oriented, Out Conflict, Rosenburg.]

Fig. 6. (a) Accuracy of the decision tree models for predicting group evaluations on the combined dataset (dark green); default accuracy is shown in light green. (b-c) Mean change in classification accuracy of decision trees after randomly permuting the most important features for each class label. The change in accuracy was measured on the entire dataset (green) and on the distinct phases (blue for Chat, red for Face-to-face).

Other work focuses on analyzing the textual content of the group communication. In Jonassen and Kwon [5], the authors define a pattern of phases based on the sentence types that individuals use at different times. This has a similar goal to the temporal analysis component of this paper, although we relied on message frequency counts rather than sentence classifications that may be considered subjective.

Communication structure has been considered previously. In Cataldo and Ehrlich [14], the authors examine the relationship between the communication structure of product development teams and their productivity. Specifically, they define communication structure to be the presence or absence of a hierarchical structure as measured by hierarchy metrics proposed in [15]. The authors then analyze the relationship between these structure metrics and the performance of groups.

VI. CONCLUSION

The interpersonal dependencies and communication dynamics that drive successful teams in face-to-face and distributed settings are still far from being completely understood. In this paper, we have presented multiple approaches for the analysis and comparison of small groups. Standard regression and decision tree models, supplemented with hypothesis testing through randomization tests, revealed several characteristics of groups that indicate higher evaluations of team performance. In addition, we showed that communication streams can be used to identify types of groups, as well as individual roles in those groups, based on the overall structure of team communication. ANOVA tests indicate there are significant differences in participants’ self- and group perceptions across both distributed and co-located teams, and across different communication structures.

ACKNOWLEDGEMENTS

This research is supported by NSF under contract number SES-0823313. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation hereon.

REFERENCES

[1] S. L. Connaughton and J. A. Daly, “Leadership in the new millennium: Communication beyond temporal, spatial, and geographical boundaries,” Communication Yearbook, vol. 29, pp. 187–213, 2005.

[2] S. W. J. Kozlowski and D. R. Ilgen, “Enhancing the effectiveness of work groups and teams,” Psychological Science in the Public Interest, vol. 7, no. 3, pp. 77–124, 2006.

[3] J. B. Walther and U. Bunz, “The rules of virtual groups: Trust, liking, and performance in computer mediated communication,” JCM, vol. 55, no. 4, pp. 828–846, 2005.

[4] S. L. Jarvenpaa, K. Knoll, and D. E. Leidner, “Is anybody out there? Antecedents of trust in global virtual teams,” JMIS, vol. 14, pp. 29–64, 1998.

[5] D. Jonassen and H. Kwon, “Communication patterns in computer mediated versus face-to-face group problem solving,” Educational Technology Research and Development, vol. 49, no. 1, pp. 35–51, 2001.

[6] P. Hinds and M. Mortensen, “Understanding conflict in geographically distributed teams: The moderating effects of shared identity, shared context, and spontaneous communication,” Organization Science, vol. 16, pp. 290–307, 2005.

[7] D. Jensen, J. Neville, and M. Rattigan, “Randomization tests for relational learning,” Department of Computer Science, University of Massachusetts Amherst, Tech. Rep. 03-05, 2003.

[8] T. LaFond and J. Neville, “Randomization tests for distinguishing social influence and homophily effects,” World Wide Web Conference, 2010.

[9] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81–106, 1986.

[10] K. M. Chudoba, E. Wynn, M. Lu, and M. B. Watson-Manheim, “How virtual are we? Measuring virtuality and understanding its impact on a global organization,” ISJ, vol. 15, pp. 279–306, 2005.

[11] D. Kerr and U. Murthy, “Divergent and convergent idea generation in teams: A comparison of computer-mediated and face-to-face communication,” Group Decision and Negotiation, vol. 13, pp. 381–399, 2004.

[12] E. McDonough, K. Kahn, and G. Barczak, “An investigation of the use of global, virtual, and collocated new product development teams,” JPIM, vol. 18, no. 2, pp. 110–120, 2001.

[13] J. M. Wilson, S. G. Straus, and B. McEvily, “All in due time: The development of trust in computer-mediated and face-to-face teams,” Organizational Behavior and Human Decision Processes, pp. 16–33, 2006.

[14] M. Cataldo and K. Ehrlich, “The impact of communication structure on new product development outcomes,” CHI, 2012.

[15] D. Krackhardt, “Graph theoretical dimensions of informal organizations,” Computational Organization Theory, pp. 89–111, 1994.
