A Graph-Based Approach to Link Prediction in Social Networks Using a Pareto-Optimal Genetic Algorithm
Jeff Naruchitparames
University of Nevada, Reno - CSE
CS 790: Complex Networks, Fall 2010




biological / social


‣ Social networks = very dynamic, heterogeneous

‣ Dynamic, judgmental environment

‣ Affect friendships over time


‣ 1-2 hop distance only

‣ Friend-of-friend


‣ Multiple hops; >1

‣ Structural; purely graph-based

‣ No explicit correlation between potential friends...


‣ Silva et al., A Graph-based Recommendation System Using Genetic Algorithms, 2010


Friends-of-Friends

2 hops

Filter Order


Filtering

“It’s more probable that you know a friend of your friend than any other random person”

Mitchell M., Complex Systems: Network Thinking, 2006.
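The friend-of-friend filter can be illustrated with a short sketch. It assumes the network is stored as a dictionary of adjacency sets; the function name `friends_of_friends` and the toy graph are made up for illustration and are not taken from the slides.

```python
from collections import Counter

def friends_of_friends(graph, user):
    """Map each node exactly 2 hops from `user` to its number of mutual friends."""
    direct = graph[user]
    counts = Counter()
    for friend in direct:
        for candidate in graph[friend]:
            # Skip the user and anyone who is already a direct friend.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    return dict(counts)

# Hypothetical undirected friendship graph (adjacency sets).
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}
candidates = friends_of_friends(graph, "ann")
print(candidates)
```

Ranking candidates by their mutual-friend count is one simple way to order the 2-hop neighborhood before any further filtering.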


Indexes


What’s missing?

‣ Heterogeneity

‣ Human behavior and preferences

‣ Multiple hops


My approach

Pretty much a filtering problem...


My approach

‣ Components (for filtering)

‣ Betweenness centrality

‣ Community detection

‣ Clique Percolation Method (CPM)

‣ Friends of friends

‣ 10-dimensional Pareto-optimal genetic algorithm


Betweenness Centrality
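The transcript does not say how betweenness centrality was computed or how its scores fed into the filter. As a rough sketch only, the following assumes an unweighted, undirected friendship graph stored as adjacency sets and uses Brandes' algorithm; the function name and the toy graph are illustrative.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph (adjacency sets)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search, counting shortest paths (sigma) from the source.
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}

# Hypothetical path graph a - b - c: only b lies between two other nodes.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
scores = betweenness_centrality(path)
print(scores)
```

Nodes with high scores sit on many shortest paths between other users, which is one plausible reason to use the measure when filtering candidates.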


Community Detection
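The deck names the Clique Percolation Method (CPM) as its community detection step but gives no implementation details. The sketch below is a brute-force illustration of the general idea, assuming k = 3 and a small graph of adjacency sets: find every k-clique, treat two cliques as adjacent when they share k - 1 nodes, and merge adjacent cliques into communities. The function name and the toy graph are assumptions, and this enumeration is only practical for very small graphs.

```python
from itertools import combinations

def k_clique_communities(graph, k=3):
    """Brute-force clique percolation: union of k-cliques that share k-1 nodes."""
    nodes = sorted(graph)
    cliques = [
        frozenset(group)
        for group in combinations(nodes, k)
        if all(v in graph[u] for u, v in combinations(group, 2))
    ]
    # Union-find over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return list(communities.values())

# Hypothetical graph: triangles 1-2-3 and 2-3-4 share an edge; triangle 5-6-7
# hangs off node 4 by a single edge, so it forms its own community.
graph = {
    1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6},
}
communities = k_clique_communities(graph, k=3)
print(sorted(sorted(c) for c in communities))
```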


‣ Remove duplicates

‣ Remove our test cases

‣ (More on this later...)


The Genetic Algorithm Part


Pareto Fronts
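For readers unfamiliar with the term, a Pareto front is the set of points that no other point dominates. The sketch below assumes every objective is to be maximized and uses made-up two-dimensional points; the deck's own objectives are the ten features listed later.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical two-objective scores.
points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(points)
print(front)
```

Here `(2, 2)` and `(1, 1)` drop out because `(3, 3)` beats them on both objectives, while the remaining three points each win on at least one objective.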


The Features

1. # of shared friends

2. location

3. age range

4. general interest

5. music

6. attended same events

7. groups

8. movies

9. education

10. religion/politics


Pareto Optimality

‣ Localized to implementation of selection

‣ Feature subset selection

‣ We want to find the combination of feature subsets that gives the best solutions for determining friendships


Pareto Optimality and Feature Subset Selection

      F1  F2  F3  F4  F5  F6  F7  F8  F9  F10
C1     0   1   0   1   0   0   0   1   1    0
C2     1   1   0   0   0   1   0   1   0    1
...
Cn     0   0   0   0   1   0   0   1   0    0


A Point System

      F1  F2  F3  F4  F5  F6  F7  F8  F9  F10
U1     -   3   -  11   -   -   -  20  44    -
U2     -   1   -  13   -   -   -  31   9    -
...
Un     -  10   -  14   -   -   -  49  61    -
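In the point table, the dashes in every user row fall exactly where chromosome C1 on the feature subset slide has a 0 bit, which suggests that a chromosome acts as a mask over each user's feature points. The sketch below illustrates that reading; the function names are made up, and the zeros standing in for unselected features are placeholders because the slides do not show those values.

```python
def apply_mask(chromosome, scores):
    """Keep a score where the chromosome bit is 1; hide it (None) where the bit is 0."""
    return [score if bit else None for bit, score in zip(chromosome, scores)]

def selected_total(masked):
    """Sum the points of the features the chromosome selected."""
    return sum(score for score in masked if score is not None)

# Chromosome C1 from the feature subset slide.
c1 = [0, 1, 0, 1, 0, 0, 0, 1, 1, 0]
# Row U1 from the point table; the 0 entries are hypothetical placeholders
# for features whose points are not shown on the slide.
u1 = [0, 3, 0, 11, 0, 0, 0, 20, 44, 0]

masked = apply_mask(c1, u1)
print(masked, selected_total(masked))
```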


Pareto Optimality

‣ Compare with the test cases we removed earlier...

‣ For all chromosomes in population, do:

    ‣ If ALL test cases ≥ optimal Pareto front

        ‣ Calculate fitness

        ‣ Good to go

    ‣ Else

        ‣ Calculate fitness

        ‣ Continue onto next chromosome


Fitness Function

$$\sum_{i=1}^{n} \sum_{j=1}^{10} p_i \ln(f_j)\, p_{i-1}$$


Continuing on with the Evolutionary Process

‣ Apply fitness proportional selection

‣ Randomly select 2 parents to mate

‣ Apply 1-point crossover (82% chance)

‣ Bit mutation (0.05% chance)

‣ Do this until ALL test cases better than Pareto front OR fitness does not improve for 5 consecutive generations
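The selection, mutation, and stall-based stopping steps above can be sketched as follows. The 0.05% mutation rate and the 5-generation limit come from the slide; everything else (function names, the roulette-wheel form of fitness proportional selection, non-negative fitness values, and applying the mutation rate per bit) is an assumption made for illustration.

```python
import random

MUTATION_RATE = 0.0005  # 0.05% chance, applied per bit (assumption)
STALL_LIMIT = 5         # generations without improvement before stopping

def select_parent(population, fitnesses, rng):
    """Fitness proportional (roulette wheel) selection; assumes fitnesses >= 0."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return chromosome
    return population[-1]  # guard against floating-point round-off

def mutate(chromosome, rng, rate=MUTATION_RATE):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

def has_stalled(best_per_generation, limit=STALL_LIMIT):
    """True if the best fitness did not improve over the last `limit` generations."""
    if len(best_per_generation) <= limit:
        return False
    recent = max(best_per_generation[-limit:])
    earlier = max(best_per_generation[:-limit])
    return recent <= earlier

rng = random.Random(0)
population = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 0]]
parent = select_parent(population, [2.0, 5.0, 1.0], rng)
child = mutate(parent, rng)
print(parent, child, has_stalled([1, 2, 3, 3, 3, 3, 3, 3]))
```

A seeded `random.Random` instance is passed in explicitly so that runs are repeatable while experimenting with the rates.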


1-Point Crossover
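A minimal sketch of 1-point crossover on the binary 10-feature chromosomes: pick one cut point, then swap the tails of the two parents. The 82% default rate is the one stated in the deck; the function name and the choice to return unchanged copies when crossover does not fire are assumptions.

```python
import random

def one_point_crossover(parent_a, parent_b, rng, rate=0.82):
    """Swap the tails of two equal-length chromosomes at one random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if rng.random() >= rate:
        # No crossover this time: children are copies of the parents.
        return parent_a[:], parent_b[:]
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the chromosome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Chromosomes C1 and C2 from the feature subset slide.
c1 = [0, 1, 0, 1, 0, 0, 0, 1, 1, 0]
c2 = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]
child_a, child_b = one_point_crossover(c1, c2, random.Random(0), rate=1.0)
print(child_a, child_b)
```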


Conclusion

‣ Complex network theory + genetic algorithm + social theory

‣ Betweenness centrality

‣ Community detection

‣ Clique Percolation Method

‣ Binary 10-dimensional Pareto-optimal genetic algorithm

‣ Dominant, fitness proportional selection

‣ Several levels of filtering and selection (aka filtering ☺)


Future Work

‣ Better fitness function (need to ask sociologists)

‣ Weighted chromosome for Pareto optimization (as opposed to binary)

‣ Prove all this stuff actually works (sociology standpoint??)

‣ Parallelize or GPU-ize the code (it’s in Python)
