AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES
Anatoliy Gruzd [email protected]
Dissertation Defense, April 1, 2009
2
Online Social Networks
http://www.visualcomplexity.com/vc
• Email networks
• Forum networks
• Blog networks
• Friends’ networks on MySpace, Facebook, etc.
• Networks of like-minded people on
3
Users’ contributions and networks are growing daily!
Source: IDC white paper, “The Diverse and Exploding Digital Universe,” sponsored by EMC, March 2008.
Usenet newsgroups: 4.6 terabytes of text *daily*
Blogs: 900,000 new blogs *daily*
Emails: 100 billion emails *daily*
4
Users’ contributions and networks are growing daily!
Usenet newsgroups: 4.6 terabytes of text *daily*
Blogs: 900,000 new blogs *daily*
Emails: 100 billion emails *daily*
• What are the group’s interests and priorities?
• How and why does one online community emerge while another dies?
• How do people agree on common practices and rules in an online community?
• How are knowledge and information shared among group members?
5
© kelleyw
Automated Discovery of Social Networks
6
• Research Goal
– Use computers to discover online social networks automatically
• Case Study
– Discussion forums in online classes
Automated Discovery of Social Networks
7
Research Questions
• Extracting Social Networks from Forum Postings
Question 1: What content-based features of postings help to uncover nodes and ties between group members?
8
Extracting Social Networks from Forum Postings Approach 1: Chain Network (Reply-to)
Posting header:
FROM: Sam
REFERENCE CHAIN: Gabriel
Content:
“Nick, Gina and Gabriel: I apologize for not backing this up with a good source, but I know from reading about this topic that …”
Source: Posting header
Method: Connects a sender to the previous poster in the thread
Discovered Tie(s): Sam -> Gabriel
Possible Missing Connections:
• Sam -> Nick
• Sam -> Gina
• Nick <-> Gina
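The reply-to rule on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the dissertation's implementation: the slides do not specify a data format, so the `sender` and `reference_chain` fields are hypothetical, as is the assumption that the last entry of the reference chain is the previous poster. Skipping self-replies is also an assumption.

```python
from collections import Counter

def chain_network(postings):
    """Build a directed chain (reply-to) network as a Counter of (sender, target) ties.

    Assumption: each posting is a dict with a 'sender' string and a
    'reference_chain' list ordered oldest-first. Both field names are
    hypothetical, chosen for illustration only.
    """
    ties = Counter()
    for post in postings:
        chain = post.get("reference_chain", [])
        if not chain:
            continue  # thread-starting posting: no previous poster to connect to
        previous_poster = chain[-1]
        if previous_poster != post["sender"]:  # assumption: ignore self-replies
            ties[(post["sender"], previous_poster)] += 1
    return ties

postings = [
    {"sender": "Gabriel", "reference_chain": []},
    {"sender": "Sam", "reference_chain": ["Gabriel"]},
]
ties = chain_network(postings)
print(dict(ties))  # {('Sam', 'Gabriel'): 1}
```

Because only the header is read, Nick and Gina never appear in the result even though Sam addresses them by name, which is exactly the gap listed under Possible Missing Connections.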
9
Extracting Social Networks from Forum Postings Approach 2: Name Network
Method:
• Connect the sender to people mentioned in the message
• Connect people whose names co-occur in the same message(s)
Discovered Tie(s):
Ann -> Steve
Ann -> Natasha
Steve <-> Natasha
FROM: Ann
“Steve and Natasha, I couldn't wait to see your site.
I knew it was going to [be] awesome!”
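The two connection rules of the name network can be illustrated with a short Python sketch. It assumes the names in the message have already been extracted and matched to participants; the helper name `name_network_ties` and its return shape are hypothetical, not taken from the dissertation.

```python
from itertools import combinations

def name_network_ties(sender, mentioned_names):
    """Return (directed, undirected) tie sets for one posting.

    Assumption: 'mentioned_names' has already been extracted from the text
    and resolved to participants; this helper is hypothetical.
    """
    names = sorted(set(mentioned_names) - {sender})
    # Rule 1: the sender is connected to each person mentioned.
    directed = {(sender, name) for name in names}
    # Rule 2: people whose names co-occur in the message are connected.
    undirected = {frozenset(pair) for pair in combinations(names, 2)}
    return directed, undirected

directed, undirected = name_network_ties("Ann", ["Steve", "Natasha"])
print(sorted(directed))
print([sorted(pair) for pair in undirected])
```

For Ann's posting this yields the three ties listed on the slide: Ann -> Steve, Ann -> Natasha, and Steve <-> Natasha.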
10
Extracting Social Networks from Forum Postings Approach 2: Name Network
Step 1. Automatically find all personal names in the postings
• Compare each word from the posting against a dictionary of all names collected from the US Census data
• Find names that are NOT in the name dictionary (e.g., international names, informal names and nicknames) using contextual and structural information about words, such as
– Capitalization
– Context words
– Position in text
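A rough Python sketch of Step 1 follows. It is only an illustration under stated assumptions: `NAME_DICTIONARY` is a tiny stand-in for the Census-derived dictionary, `GREETING_WORDS` is an invented list of context words, and only the capitalization and context-word cues are covered (position in text is left out).

```python
import re

# Assumption: a tiny stand-in for the Census-derived name dictionary.
NAME_DICTIONARY = {"steve", "natasha", "sam", "gina", "nick"}
# Assumption: illustrative context words that often precede a name.
GREETING_WORDS = {"hi", "hello", "dear", "thanks"}

def find_names(text):
    """Return candidate personal names in order of first appearance."""
    tokens = re.findall(r"[A-Za-z']+", text)
    found = []
    for i, token in enumerate(tokens):
        in_dictionary = token.lower() in NAME_DICTIONARY
        capitalized = token[0].isupper()
        after_greeting = i > 0 and tokens[i - 1].lower() in GREETING_WORDS
        # Accept a capitalized dictionary hit, or an unknown capitalized
        # word that directly follows a greeting word.
        if capitalized and (in_dictionary or after_greeting) and token not in found:
            found.append(token)
    return found

names = find_names("Hi Anatoliy, I agree with Steve and natasha about the site.")
print(names)  # ['Anatoliy', 'Steve']
```

In this example "Anatoliy" is not in the dictionary but is picked up from its context, while the lowercase "natasha" is skipped because the sketch requires capitalization.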
11
Extracting Social Networks from Forum Postings Approach 2: Name Network
Step 2. Connect a sender of the posting to all names discovered in the previous step
EXAMPLE
From: [email protected] (= Wilma)
Reference Chain: [email protected], [email protected]
Hi Dustin, Sam and all, I appreciate your posts from this and last week […]. I keep thinking of poor Charlie who only wanted information on “dogs”. […] Cheers, Wilma.
Discovered Tie(s):
Wilma – Dustin
Wilma – Sam
Wilma – Charlie
Challenges to overcome:
– One person can have many names
– Many people can have the same name
– Names can belong to students in the class and outsiders
Solution: Name alias resolution
Dustin – Sam – Charlie
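The slide names alias resolution as the solution without describing it, so the following Python sketch is only one plausible way to address the three challenges. The alias table, the example.edu addresses, and the rule of preferring a candidate who already appears in the thread are all assumptions made for illustration.

```python
def resolve_alias(name, alias_table, thread_participants):
    """Map a discovered name to one participant id, or None if unresolved.

    Assumptions: 'alias_table' maps a lowercased name or nickname to the set
    of email addresses known to use it; 'thread_participants' lists the
    emails seen in the posting's reference chain. Both are hypothetical.
    """
    candidates = alias_table.get(name.lower(), set())
    if not candidates:
        return None  # likely an outsider, not a class member
    if len(candidates) == 1:
        return next(iter(candidates))
    # Several people share the name: prefer one already active in the thread.
    in_thread = candidates & set(thread_participants)
    if len(in_thread) == 1:
        return next(iter(in_thread))
    return None  # still ambiguous; leave unresolved rather than guess

# Hypothetical data: two students share the name "Sam", and one of them
# also goes by "Samuel" (one person, many names).
alias_table = {
    "sam": {"sam1@example.edu", "sam2@example.edu"},
    "samuel": {"sam1@example.edu"},
    "dustin": {"dustin@example.edu"},
}
thread = ["dustin@example.edu", "sam2@example.edu"]
resolved = {n: resolve_alias(n, alias_table, thread) for n in ["Dustin", "Sam", "Charlie"]}
print(resolved)
```

Here "Sam" resolves to the Sam who posted in the thread, and "Charlie" stays unresolved because no class member is known by that name.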
12
Research Questions
• Extracting Social Networks from Forum Postings
• Evaluating Name Networks
Question 2: How are the proposed name networks similar to or different from networks derived from other methods?
13
Evaluating Name Networks
Networks compared: Name Network vs. Chain Network (both from forum postings) vs. Self-Reported Network (from survey)
Comparison Procedure:
• QAP correlations
• Exponential random graph models (p* models)
• Manual exploration using network visualization
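As a rough illustration of the first comparison step, a QAP correlation can be sketched in plain Python: correlate the off-diagonal cells of two adjacency matrices, then relabel the nodes of one matrix at random many times to see how often a correlation at least as large arises by chance. The function names, the permutation count, and the toy matrices below are assumptions; the study itself presumably relied on standard network-analysis software.

```python
import random
from math import sqrt

def offdiag_pearson(a, b, order):
    """Pearson correlation of off-diagonal cells of a and of b relabeled by 'order'."""
    n = len(a)
    xs, ys = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                xs.append(a[i][j])
                ys.append(b[order[i]][order[j]])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)  # assumes neither matrix is constant

def qap_correlation(a, b, permutations=1000, seed=0):
    """Return (observed r, share of permutations with r >= observed)."""
    n = len(a)
    rng = random.Random(seed)
    observed = offdiag_pearson(a, b, list(range(n)))
    at_least = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute rows and columns of b together
        if offdiag_pearson(a, b, order) >= observed:
            at_least += 1
    return observed, at_least / permutations

# Toy 4-person adjacency matrices (invented, not from the dataset).
name_net = [[0, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
chain_net = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
r, p = qap_correlation(name_net, chain_net, permutations=200)
print(round(r, 3), p)
```

Permuting rows and columns together keeps each network's structure intact while breaking the link between the two labelings, which is why QAP is preferred over an ordinary significance test for dyadic data.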
14
Evaluating Name Networks Data collection
Dataset
Classes: 6
School year: Spring 2008
Duration of each class: 15 weeks
No. of students per class: 15 – 28
Data source:
• Bulletin board messages
• Online questionnaire
Response rate: 54%-86% (63%)
[Bar charts: No. of all postings per class (axis 0 to 2000) and No. of students per class (axis 0 to 30), for Class #1 through Class #6]
15
Evaluating Name Networks Online Questionnaire
Section 1. Students’ perceived social structures
Sample question: I learned a lot about the subject matter from this person …
0 – never; 1 – rarely; 2 – for some of the course; 3 – during most of the course; 4 – throughout the whole course
Section 2. Influential members of the class
Indicate five students who you consider most important or influential in this class regarding each of the following types of interaction: (1) Providing information; (2) Promoting discussion; (3) Giving help; (4) Making class fun
Section 3. Interactions in the class as a whole
Sample question: I felt that the class worked together …
[ Based on C. Haythornthwaite’s 1999 LEEP study protocol ]
16
Evaluating Name Networks Example: YouTube comments
[Figure: Chain Network (fewer connections) vs. Name Network (more connections)]
17
Evaluating Name Networks Results from Online Learners Dataset
Name networks provide on average 40% more information about social ties in a group as compared to Chain networks
“New” Info (considering only the 40%):
• 82%: An addressee has not posted to the thread
• 18%: An addressee is not the most recent poster
• 70%: Thread-starting posting
• 30%: A subsequent posting in the thread
Name Network vs. Chain Network: QAP correlation ~ 0.5
18
Evaluating Name Networks Results from Online Learners Dataset
Structurally, the name and self-reported networks are far more similar.
Based on p* models, the self-reported network is almost twice as likely to share the same ties with the name network as with the chain network.
[Figure: Chain Network, Name Network, and Self-Reported Network (friends’ network for one of the classes)]
19
Research Questions
• Extracting Social Networks from Forum Postings
• Evaluating Name Networks
• Identifying Social Relations in Name Networks
Question 3: What types of social relations do name networks include?
20
Identifying Social Relations in Name Networks Results
• The following social relations were found by the “name network” method
Learning ● Collaborative Work ● Help
21
Identifying Social Relations in Name Networks Results
• The following social relations were found by the “name network” method
Learning ● Collaborative Work ● Help
– Postings that show attention to subject matter discussed by someone else
“… it made me think of the faceted catalogs' display that Karen posted ”
22
Identifying Social Relations in Name Networks Results
• The following social relations were found by the “name network” method
Learning ● Collaborative Work ● Help
– Organizing group work, taking a leadership role
“ Some quick poking around shows that Steve and myself are here in Champaign, [...] and Nicole is in Chicago. [...] does anyone have a strong desire to be our contact person to the administrators ”
23
Identifying Social Relations in Name Networks Results
• The following social relations were found by the “name network” method
Learning ● Collaborative Work ● Help
– A reference to an event or interaction that happened outside the bulletin board
“ Anne and I have been corresponding via e-mail and she reminded me that we should be having discussion here ”
24
Identifying Social Relations in Name Networks Results
• The following social relations were found by the “name network” method
Learning ● Collaborative Work ● Help
– Postings that ask others for help
“ [Instructor’s name] if you see this posting would you please clarify for us ”
25
Using the results in the learning context
• Identify students who might need extra attention/help from the instructor
• Discover if lectures or other class materials were unclear
• Identify peer-help
• Find active group members who often take a leadership role in a group
[Figure: network diagram with nodes labeled Student, Instructor, and Group Leader]
26
Contributions of the Research
1. Development of a novel approach (name network) for content-based, automated discovery of social networks from threaded discussions in online communities and a framework for evaluating this new approach
– The “name network” method can be used
• to transform even unstructured Internet data into social network data;
• where more traditional methods for data collection on social networks such as surveys are too costly or not possible;
27
Contributions of the Research (cont.)
2. Empirical comparison of name networks to chain and self-reported networks using data collected from 6 online classes
3. Demonstration that the proposed automated approach for collecting social network data is a viable alternative to the costly and time-consuming collection of self-reported networks
4. Demonstration of how name networks can be used to study online classes and assess collaborative learning
5. Development of the ICTA web-based system for content and network analysis (http://textanalytics.net)
28
http://TextAnalytics.net
29
Limitations
• The ‘name network’ method
– is more expensive computationally than the ‘chain network’ method
– uses an email address as a unique identifier of a participant
– relies only on postings that include personal names (on average only about 25-30% of all postings)
30
Future Research
• Study other types of online communities
• Study online communities using multiple data sources such as forums, chats, wikis, etc.
• Develop automated techniques to identify types of social relations and social roles
AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES
Anatoliy Gruzd [email protected]
April 1, 2009
Contributions
• Developed the Name Network method and evaluated it in the context of e-learning
• Identified types of social relations in Name Networks
• Developed ICTA – a web-based system for content and network analysis (http://textanalytics.net)