Oxford Digital Humanities Summer School

Post on 06-May-2015


(Social) Network Analysis

Scott A. Hale, Oxford Internet Institute

http://www.scotthale.net/

17 July 2014

What are networks?

Networks (graphs) are sets of nodes (vertices) connected by edges (links, ties, arcs).

Additional details

Whole vs. ego: whole networks have all nodes within a natural boundary (platform, organization, etc.). An ego network has one node and all of its immediate neighbors.

Edges can be directed or undirected and weighted or unweighted.

Additionally, networks may be multilayer and/or multimodal.
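The distinctions above (directed vs. undirected, weighted vs. unweighted, whole vs. ego) can be sketched in NetworkX roughly as follows. The names and weights are made-up illustration data, and reading a weight as "number of mentions" is only an assumption for the example.

```python
import networkx as nx

# Hypothetical directed, weighted network: an edge u -> v with weight w
# could mean "u mentioned v w times".
g = nx.DiGraph()
g.add_weighted_edges_from([
    ("Alan", "Bob", 3),
    ("Bob", "Alan", 1),
    ("Bob", "Carol", 2),
    ("Carol", "Denise", 5),
])

# Direction matters: Carol -> Denise exists, Denise -> Carol does not.
has_forward = g.has_edge("Carol", "Denise")
has_reverse = g.has_edge("Denise", "Carol")

# Ego network of Bob: Bob plus his immediate neighbors.
# undirected=True treats incoming and outgoing edges alike.
ego = nx.ego_graph(g, "Bob", radius=1, undirected=True)
ego_nodes = sorted(ego.nodes())

# Dropping direction merges Alan -> Bob and Bob -> Alan into one edge.
undirected = g.to_undirected()

print("Ego network of Bob:", ego_nodes)
print("Undirected edge count:", undirected.number_of_edges())
```

Here the whole network is the four-node graph g, while the ego network keeps only Bob and the nodes one step away from him.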


Why?

Characterize network structure

How far apart / well-connected are nodes?

Are some nodes at more important positions?

Is the network composed of communities?

How does network structure affect processes?

Information diffusion

Coordination/cooperation

Resilience to failure/attack

A network

First questions when approaching a network

What are edges? What are nodes?

What kind of network?

Inclusion/exclusion criteria

Network data repositories

http://www.diggingintodata.org/Repositories/tabid/167/Default.aspx

http://datamob.org

http://snap.stanford.edu/data

http://www-personal.umich.edu/~mejn/netdata

Python resources

tweepy: Package for the Twitter stream and search APIs (only Python 2.7 at the moment)

Search and stream API example code, along with code to create a mentions/retweet network, is at https://github.com/computermacgyver/twitter-python

Two versions of Python:

2.7.x – many packages, issues with non-English scripts

3.x – fewer packages, but excellent handling of international scripts (Unicode)
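As a rough illustration of what creating a mentions network involves, the sketch below builds a directed, weighted graph from a handful of tweets. It is not the code from the repository above: the sample tweets, the regular expression, and the helper name build_mention_network are all assumptions made for this example, and no Twitter API call is made.

```python
import re
import networkx as nx

# Hypothetical sample data: (author, tweet text) pairs. In practice these
# would come from the Twitter API; here they are hard-coded.
tweets = [
    ("alice", "Great talk by @bob and @carol!"),
    ("bob", "Thanks @alice"),
    ("alice", "@bob see you tomorrow"),
]

# Simplified pattern for @-mentions (an assumption, not Twitter's exact rule)
MENTION = re.compile(r"@(\w+)")

def build_mention_network(tweets):
    """Return a directed graph with an edge author -> mentioned user.

    The edge attribute "weight" counts how often the mention occurred.
    """
    g = nx.DiGraph()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if g.has_edge(author, mentioned):
                g[author][mentioned]["weight"] += 1
            else:
                g.add_edge(author, mentioned, weight=1)
    return g

mentions = build_mention_network(tweets)
for source, target, data in mentions.edges(data=True):
    print("{0} -> {1} (weight {2})".format(source, target, data["weight"]))
```

Nodes are users and edges are mentions, which answers the "what are nodes, what are edges" question for this kind of data. A retweet network could be built the same way from (retweeter, original author) pairs.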

NetworkX

http://networkx.github.io/

Package to represent networks as Python objects

Convenient functions to add, delete, iterate nodes/edges

Functions to calculate network statistics (degree, clustering, etc.)

Easily generate comparison graphs based on statistical models

Visualization

Alternatives include igraph (available for Python and R)
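One way to use the comparison graphs mentioned above is to generate a random graph with the same number of nodes and edges as the observed network and compare a statistic between the two. The sketch below assumes the karate club network bundled with NetworkX as a stand-in for real data and an arbitrary seed; a real analysis would usually average over many random graphs.

```python
import networkx as nx

# Observed network: Zachary's karate club, a small social network that
# ships with NetworkX (used here only as a stand-in for real data).
observed = nx.karate_club_graph()
n = observed.number_of_nodes()
m = observed.number_of_edges()

# Comparison graph: a random graph with the same number of nodes and
# edges. The seed (an arbitrary choice) makes the run repeatable.
random_graph = nx.gnm_random_graph(n, m, seed=42)

observed_clustering = nx.average_clustering(observed)
random_clustering = nx.average_clustering(random_graph)

print("observed clustering:", round(observed_clustering, 3))
print("random clustering:  ", round(random_clustering, 3))
```

If the observed value is far from what random graphs of the same size produce, the structure is unlikely to be explained by density alone.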

Gephi

Open-source, cross-platform GUI

Primary strength is to visualize networks

Basic statistical properties are also available

Alternatives include NodeXL, Pajek, GUESS, NetDraw, Tulip, and more

Network measures

With many nodes, visualizations are often difficult or impossible to interpret. Statistical measures can be very revealing, however.

Node-level

Degree (in, out): How many incoming/outgoing edges does a node have?

Centrality (next slide)

Constraint

Network-level

Components: Number of disconnected subsets of nodes

Density: (observed edges) / (maximum number of edges possible)

Clustering coefficient: (closed triplets) / (connected triples)

Path length distribution

Distributions of node-level measures
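The network-level measures above can be worked through by hand on a four-node graph and checked against NetworkX. This sketch assumes an undirected graph without self-loops, so the maximum number of edges is n(n-1)/2, and it uses the global clustering coefficient, which NetworkX calls transitivity.

```python
from collections import Counter
import networkx as nx

g = nx.Graph()
g.add_edges_from([("Alan", "Bob"), ("Carol", "Denise"),
                  ("Carol", "Bob"), ("Carol", "Alan")])

n = g.number_of_nodes()
m = g.number_of_edges()

# Density: observed edges / maximum number of edges possible
density_by_hand = m / (n * (n - 1) / 2)

# Global clustering coefficient: closed triplets / connected triples.
# Each triangle contributes three closed triplets.
triangles = sum(nx.triangles(g).values()) // 3
connected_triples = sum(d * (d - 1) // 2 for _, d in g.degree())
clustering_by_hand = 3 * triangles / connected_triples

# Components: number of disconnected subsets of nodes
n_components = nx.number_connected_components(g)

# Path length distribution: shortest path length -> number of node pairs
distribution = Counter(
    dist
    for source, lengths in nx.shortest_path_length(g)
    for target, dist in lengths.items()
    if source < target  # count each unordered pair once
)

print("density:", density_by_hand, "vs", nx.density(g))
print("clustering:", clustering_by_hand, "vs", nx.transitivity(g))
print("components:", n_components)
print("path lengths:", dict(distribution))
```

Note that nx.average_clustering(g) averages the per-node clustering coefficients instead and generally gives a different number from the global ratio.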

Centrality measures

Degree

Closeness: Measures the average geodesic distance to ALL other nodes. Informally, an indication of the ability of a node to diffuse a property efficiently.

Betweenness: Number of shortest paths the node lies on. Informally, the betweenness is high if a node bridges clusters.

Eigenvector: A weighted degree centrality (inbound links from highly central nodes count more).

PageRank: Not strictly a centrality measure, but similar to eigenvector centrality, modeled as a random walk with a teleportation parameter.
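To see how these measures can disagree, the sketch below computes four of them on a small made-up graph: two triangles joined through a single bridging node "E". The graph and the max_iter value are assumptions chosen for illustration.

```python
import networkx as nx

# Two triangles joined through a single bridging node "E" (hypothetical data)
g = nx.Graph()
g.add_edges_from([
    ("A", "B"), ("B", "C"), ("A", "C"),   # first cluster
    ("F", "G"), ("G", "H"), ("F", "H"),   # second cluster
    ("C", "E"), ("E", "F"),               # E bridges the clusters
])

measures = {
    "degree": nx.degree_centrality(g),
    "closeness": nx.closeness_centrality(g),
    "betweenness": nx.betweenness_centrality(g),
    "eigenvector": nx.eigenvector_centrality(g, max_iter=1000),
}

# Show the scores of the bridge node next to one of its neighbors
for name, scores in measures.items():
    print("{0:12s} E={1:.3f}  C={2:.3f}".format(name, scores["E"], scores["C"]))
```

In this graph E has only two edges, fewer than C or F, yet it has the highest betweenness and closeness because every shortest path between the two clusters passes through it. PageRank can be computed with nx.pagerank(g, alpha=0.85), where 1 - alpha plays the role of the teleportation probability; recent NetworkX releases may need SciPy installed for that call.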

NetworkX: Nodes

import networkx as nx

g=nx.Graph() #A new (empty) undirected graph

g.add_node("Alan") #Add one new node

g.add_nodes_from(["Bob","Carol","Denise"]) #Add three new nodes from list

#Nodes can have attributes

#(g.nodes[...] needs NetworkX 2.0+; 1.x releases used g.node[...])

g.nodes["Alan"]["gender"]="M"

g.nodes["Bob"]["gender"]="M"

g.nodes["Carol"]["gender"]="F"

g.nodes["Denise"]["gender"]="F"

for n in g:

    print("{0} has gender {1}".format(n,g.nodes[n]["gender"]))

NetworkX: Edges

#Interesting graphs have edges

g.add_edge("Alan","Bob") #Add one new edge

#Add two new edges

g.add_edges_from([["Carol","Denise"],["Carol","Bob"]])

#Edge attributes

g["Alan"]["Bob"]["relationship"]="Friends"

g["Carol"]["Denise"]["relationship"]="Friends"

g["Carol"]["Bob"]["relationship"]="Married"

#New edge with an attribute

g.add_edges_from([["Carol","Alan",{"relationship":"Friends"}]])

NetworkX: Edges

for e in g.edges(): #edges_iter() was removed in NetworkX 2.0

    n1=e[0]

    n2=e[1]

    print("{0} and {1} are {2}".format(n1,n2,g[n1][n2]["relationship"]))

NetworkX: Measures

g.number_of_nodes()

g.nodes(data=True)

g.number_of_edges()

g.edges(data=True)

print(g) #Short summary in recent releases; older releases used nx.info(g), removed in NetworkX 3.0

nx.density(g)

nx.number_connected_components(g)

nx.degree_histogram(g)

nx.betweenness_centrality(g)

nx.clustering(g)

nx.clustering(g, nodes=["Bob"])

NetworkX: Visualize or save

#Save g to the file my_graph.graphml in graphml format

#prettyprint will make it nice for a human to read

nx.write_graphml(g,"my_graph.graphml",prettyprint=True)

#Layout g with the Fruchterman-Reingold force-directed

#algorithm and save the result to my_graph.png

#with_labels will label each node with its id

import matplotlib.pyplot as plt

nx.draw_spring(g,with_labels=True)

plt.savefig("my_graph.png")

plt.clf() #Clear plot

NetworkX: Odds and ends

#Read a graph from the file my_graph.graphml in graphml format

g=nx.read_graphml("my_graph.graphml")

#Create an (empty) directed graph

g=nx.DiGraph()

See http://networkx.github.io/documentation/latest/reference/index.html for many more commands. Note that some commands are only available on directed or undirected graphs.
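As an example of a command that is only available on one kind of graph, the sketch below calls nx.number_connected_components on a small directed graph, which NetworkX rejects, and then shows the directed counterparts. The three edges are made-up illustration data.

```python
import networkx as nx

d = nx.DiGraph()
d.add_edges_from([("Alan", "Bob"), ("Carol", "Bob"), ("Bob", "Denise")])

# Directed graphs distinguish incoming from outgoing edges
bob_in = d.in_degree("Bob")
bob_out = d.out_degree("Bob")

# number_connected_components is defined for undirected graphs only
try:
    nx.number_connected_components(d)
    directed_supported = True
except nx.NetworkXNotImplemented:
    directed_supported = False

# Directed counterparts
weak = nx.number_weakly_connected_components(d)
strong = nx.number_strongly_connected_components(d)

# Or drop the direction first and use the undirected function
undirected_components = nx.number_connected_components(d.to_undirected())

print("Bob in/out degree:", bob_in, bob_out)
print("Undirected-only call works on DiGraph:", directed_supported)
print("Weak / strong components:", weak, strong)
```

Weakly connected components ignore edge direction, while strongly connected components require a directed path in both directions, so the two counts can differ considerably.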

Resources

Newman, M.E.J., Networks: An Introduction

Kadushin, C., Understanding Social Networks: Theories, Concepts, and Findings

De Nooy, W., et al., Exploratory Social Network Analysis with Pajek

Shneiderman, B., and Smith, M., Analyzing Social Media Networks with NodeXL
