http://www.lirmm.fr/~fjourdan/Publication/WSPresentation.pdf

A scalable force-directed method for the visualization of large graphs

Fabien Jourdan, Guy Melançon
fjourdan@lirmm.fr, melancon@lirmm.fr

LIRMM: Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier

CNRS - Université Montpellier II

Département Informatique Fondamentale et Applications

Plan

1. Introduction
2. Force-directed layout methods
   – Description
   – Algorithm
3. Partitioning nodes
   – Observation
   – Spreading activation metric
   – Extracting nodes by layers
4. The Layout algorithm
   – Virtual graphs
   – Algorithm
   – Complexity of the algorithm
5. Experimentation
   – Energy computation
   – Tests
6. Conclusion and future work

1. Introduction: Graph Visualization

Definition 1. Graph visualization is the subfield of Information Visualization dealing with relational data.

– Areas of application
  • social network visualization
  • class browsers (object-oriented systems)
  • metabolic pathways (post-genomic data)
  • etc.

– Graph Visualization challenges
  • choosing an abstraction of a graph that provides a starting point for exploration
  • providing a good view of a graph while keeping the computing time low
  • reducing the amount of data displayed without omitting key information

2. Force-directed layout methods: Description

– Physical metaphor
  • Nodes → physical bodies
  • Edges → attractors (e.g. springs with a preferred length)
  • Between neighbours → attraction forces
  • Between each pair of nodes → repulsion forces
  • Attempt to find a minimum-energy configuration of this system

– Advantages
  • Require no specific graph properties
  • Respect aesthetic criteria such as edge length, space coverage...
  • Good readability

2. Force-directed layout methods

Data: A graph $G = (V, E)$
Result: A drawing of the graph

give every node a random position
repeat a linear number of times
(stop when the system energy gets under a lower bound)
begin
    for each node $v$ do
1       compute a vector $\vec{F}_v$ obtained by summing up all attractive and repulsive forces acting on $v$
    for each node $v$ do
2       apply the previously computed vector $\vec{F}_v$ to obtain the node's new position
end

Its a priori time complexity is $O(N^3)$, where $N$ is the number of nodes: a linear number of iterations, each summing forces over all pairs of nodes.

→ Too slow to be embedded in an interactive environment.
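The loop above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the force formulas (repulsion $k^2/d$, attraction $d^2/k$) and the cooling "temperature" that caps each move are borrowed from the Fruchterman-Reingold family as an assumption, and the names `force_directed`, `k` and `temperature` are hypothetical. The energy-based stopping test from the slide is left out.

```python
import math
import random

def force_directed(nodes, edges, k=1.0, temperature=0.1, seed=0):
    """Return {node: (x, y)} after a linear number of relaxation rounds."""
    rng = random.Random(seed)
    pos = {v: (rng.random(), rng.random()) for v in nodes}
    neighbours = {v: set() for v in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    for _ in range(len(nodes)):          # "repeat a linear number of times"
        # Step 1: sum all forces acting on each node (N - 1 terms per node).
        force = {}
        for v in nodes:
            fx = fy = 0.0
            for u in nodes:
                if u == v:
                    continue
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                magnitude = k * k / d              # repulsion (assumed form), every pair
                if u in neighbours[v]:
                    magnitude -= d * d / k         # attraction (assumed form), neighbours only
                fx += dx / d * magnitude
                fy += dy / d * magnitude
            force[v] = (fx, fy)
        # Step 2: move each node along its force, capped by the temperature.
        for v in nodes:
            fx, fy = force[v]
            length = math.hypot(fx, fy)
            if length > 0:
                scale = min(length, temperature) / length
                pos[v] = (pos[v][0] + fx * scale, pos[v][1] + fy * scale)
        temperature *= 0.95              # cooling schedule (assumption)
    return pos
```

The three nested loops (rounds, nodes, other nodes) are what make the cost cubic in the number of nodes.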

3. Partitioning nodes

Nodes and edges in the figure on the right have been colored according to their associated Spreading Activation value.

3. Partitioning nodes: Spreading activation

– Signals are sent through the network (the graph)
– How does the signal spread through this graph?
– Hogg and Huberman mathematical model (1987):
  • Nodes are assigned a sequence of values computed iteratively following a simple recurrence:
      [recurrence (1) lost in extraction]
  • Hogg and Huberman give conditions under which this iterative process converges
  • They evaluate the speed of convergence as a function of the parameters of the recurrence
– Nodes with the highest spreading activation values should provide a skeleton for the graph
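As an illustration of this kind of iterative process, here is a small Python sketch. The recurrence used is an assumption, not the formula from the slide: each node keeps a fraction `1 - gamma` of its value, receives `alpha` times the sum of its neighbours' values, and gets a constant input from `source`. The names `spreading_activation`, `alpha`, `gamma` and `source` are hypothetical.

```python
def spreading_activation(neighbours, source, alpha=0.1, gamma=0.5, iterations=200):
    """Iterate a(t) = source + (1 - gamma) * a(t-1) + alpha * (sum over neighbours of a(t-1)).

    `neighbours` maps each node to a list of adjacent nodes.
    The form of the recurrence is an assumption made for this sketch.
    """
    act = {v: 0.0 for v in neighbours}
    for _ in range(iterations):
        new = {}
        for v, nbrs in neighbours.items():
            incoming = sum(act[u] for u in nbrs)
            new[v] = source.get(v, 0.0) + (1 - gamma) * act[v] + alpha * incoming
        act = new
    return act


# A star graph: the hub ends up with the highest value.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
values = spreading_activation(star, {v: 1.0 for v in star})
```

With these parameters the iteration settles because the decay outweighs the amount spread to neighbours; with a larger `alpha` or a denser graph it may diverge, which is why conditions for convergence matter.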

3. Partitioning nodes: Extracting nodes by layers

– Partition:
  Let $G = (V, E)$ be a graph and let $m : V \to [a, b]$ be the map assigning to each node its spreading activation value.
  Let $I_1, \ldots, I_k$ denote a partition of $[a, b]$ into $k$ consecutive and distinct intervals.
  Then, we define the partition $V_1, \ldots, V_k$ associated with $I_1, \ldots, I_k$ by
      $V_i = \{ v \in V : m(v) \in I_i \}$ for $i = 1, \ldots, k$.
– The size of the extracted layers of nodes is imposed by the limitations of the force-directed method
– Partition depth has to be controlled
– Layers of size [expression lost in extraction]
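The definition of the layers can be sketched as follows. The function name `extract_layers` and the representation of the intervals by their interior boundaries (`cuts`) are assumptions made for the example; intervals are taken half-open, with each boundary belonging to the interval on its right.

```python
from bisect import bisect_right

def extract_layers(metric, cuts):
    """Group nodes by the interval their metric value falls into.

    `metric` maps node -> value; `cuts` is the increasing list of
    interior boundaries c_1 < ... < c_{k-1}, giving k intervals.
    Returns a list of k layers, from the lowest interval to the highest.
    """
    layers = [[] for _ in range(len(cuts) + 1)]
    for node, value in metric.items():
        # bisect_right counts the boundaries <= value, i.e. the interval index.
        layers[bisect_right(cuts, value)].append(node)
    return layers


layers = extract_layers({"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.3}, [0.3, 0.6])
```

Since nodes with the highest values are meant to form the skeleton, a layout would presumably start from the last layer returned here.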

3. Partitioning nodes: Partitioning the interval of values

– Consider the statistical distribution of values in the graph
– A statistical density function is associated with the spreading activation metric
      [definition lost in extraction]
– It is possible to compute a unique inverse value for this function
  (simple adjustments are needed to compute an inverse value in the general case)
– The partition into intervals must satisfy a condition expressed with this function
      [condition lost in extraction]

[Table; row labels lost in extraction]
500    1000   1500   2000   2500   5000   10000
12     22     30     40     45     83     154
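One way to read the inverse value mentioned above is as a quantile of the empirical distribution of metric values. The sketch below follows that reading and is only an assumption about the intent: it chooses interval boundaries so that each layer holds about `max_size` nodes. The names `size_bounded_cuts` and `max_size` are hypothetical.

```python
def size_bounded_cuts(values, max_size):
    """Return increasing boundaries so that each interval holds about max_size values.

    Boundaries are read off the sorted values (the empirical inverse
    distribution). Repeated values can make the layers uneven.
    """
    ordered = sorted(values)
    cuts = []
    for i in range(max_size, len(ordered), max_size):
        # Skip a boundary that would duplicate the previous one.
        if not cuts or ordered[i] > cuts[-1]:
            cuts.append(ordered[i])
    return cuts


cuts = size_bounded_cuts([5, 3, 1, 7, 2, 6, 4], 3)
```

When all values are distinct, every interval except possibly the last contains exactly `max_size` values.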

3. Partitioning nodes: Filtration

– Gajer, Goodrich and Kobourov (2000)
– Not based on numerical values but on neighborhood relationships
– Used to speed up force-directed layout methods for large graphs
– Their approach only works well for graphs with strong regularities (a grid, for instance)

[Figure labels: vertex of degree 3, vertex of degree 4, vertex of degree 2]

4. The Layout algorithm: Introduction

– Assume that the spreading activation metric has been computed on the whole graph
– Consider an instance of a force-directed layout algorithm:
      it takes a graph and returns positions for every node
– Consider a modified instance of this algorithm:
      it takes a graph together with a set of fixed nodes and returns positions for every non-fixed node
  • Fixed nodes do not move but act on the non-fixed nodes
  • Non-fixed nodes can move

4. The Layout algorithm: Virtual graphs

– Let $U \subseteq V$ be a subset of nodes and consider its induced subgraph
– Goal: take into account paths between two nodes in $U$ when running the modified force-directed algorithm
– The edges of the virtual graph are only used to induce additional attractive forces between nodes in $U$ and will not be drawn in the final layout for $G$

4. The Layout algorithm

Data: A graph $G = (V, E)$
Result: A drawing of the graph $G$

begin
1   Compute a partition $V_1, \ldots, V_k$ of $V$
    Compute the virtual graph associated with $V_1$
    Run the force-directed algorithm on this virtual graph
    Record the positions of the nodes in $V_1$ and mark them as fixed
2   for each subsequent level $V_i$ do
        Compute the induced subgraph for this level
        Compute the associated virtual graph
        Run the modified force-directed algorithm on it, keeping the fixed nodes in place
        Record the positions of the newly placed nodes and mark them as fixed
    return the resulting drawing of $G$
end
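The layer-by-layer scheme can be sketched in Python as below. This is a simplified illustration under stated assumptions: virtual graphs are left out entirely, every already placed node acts on the moving ones, and the force formulas and cooling schedule are the same Fruchterman-Reingold style choices as in any basic spring embedder. The names `relax` and `layered_layout` are hypothetical.

```python
import math
import random

def relax(pos, movable, edges, rounds=50, k=1.0, temperature=0.1):
    """Move only `movable` nodes; every node already in `pos` exerts forces."""
    placed = list(pos)
    nbrs = {v: set() for v in placed}
    for u, v in edges:
        if u in nbrs and v in nbrs:      # ignore edges to nodes not placed yet
            nbrs[u].add(v)
            nbrs[v].add(u)
    for _ in range(rounds):
        disp = {}
        for v in movable:
            fx = fy = 0.0
            for u in placed:
                if u == v:
                    continue
                dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d - (d * d / k if u in nbrs[v] else 0.0)
                fx += dx / d * f
                fy += dy / d * f
            disp[v] = (fx, fy)
        for v, (fx, fy) in disp.items():
            length = math.hypot(fx, fy)
            if length > 0:
                s = min(length, temperature) / length
                pos[v] = (pos[v][0] + fx * s, pos[v][1] + fy * s)
        temperature *= 0.95

def layered_layout(layers, edges, seed=0):
    """Place layers one after the other; earlier layers stay fixed."""
    rng = random.Random(seed)
    pos = {}
    for layer in layers:
        for v in layer:
            pos[v] = (rng.random(), rng.random())
        relax(pos, layer, edges)         # nodes of earlier layers are never moved
    return pos
```

Each call to `relax` only loops over the nodes of one layer, so the cost of a step depends on the layer size rather than on the size of the whole graph.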

4. The Layout algorithm: Complexity of the algorithm

– Theorem 1. The algorithm runs in time [bound lost in extraction].
– Claim 1. The partition $V_1, \ldots, V_k$ can be computed in linear time $O(N)$.
– Claim 2. Let $V_i$, $V_{i+1}$ be two consecutive layers in the partition for $V$. Under the assumption that $|E| = O(N)$, the induced subgraph on these layers is computed in linear time $O(N)$. This also holds true for [expression lost in extraction]. Moreover, the additional time to compute the virtual subgraph is constant on average. Thus the virtual graph can be computed in time $O(N)$ as well.

5. Experimentation: Energy computation

– Basalaj PhD thesis (2000)
– Computing the energy associated with a drawing $D$ measures how close the Euclidean distances between points in $D$ are to the actual distances between the corresponding nodes in the graph.
  • $d_D(u, v)$ denotes the Euclidean distance for a drawing $D$
  • $d_G(u, v)$ denotes the distance in $G$
      [formula (2) lost in extraction]
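A measure of this kind can be sketched in Python. The exact formula (2) is not reproduced here; the sketch assumes a normalized stress, the sum of squared differences between drawn and graph distances divided by the sum of squared graph distances, which is a common choice but only an assumption about the slide. Graph distance is taken as the number of edges on a shortest path, and the names `graph_distances` and `energy` are hypothetical.

```python
import math
from collections import deque
from itertools import combinations

def graph_distances(neighbours, source):
    """Breadth-first search: number of edges from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in neighbours[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def energy(neighbours, pos):
    """Normalized stress of a drawing (assumed form); 0 means a perfect fit."""
    dist = {v: graph_distances(neighbours, v) for v in neighbours}
    num = den = 0.0
    for u, v in combinations(neighbours, 2):
        if v not in dist[u]:
            continue                      # disconnected pairs are skipped (assumption)
        d_graph = dist[u][v]
        d_draw = math.dist(pos[u], pos[v])
        num += (d_draw - d_graph) ** 2
        den += d_graph ** 2
    return num / den if den else 0.0


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = energy(path, {"a": (0, 0), "b": (1, 0), "c": (2, 0)})
collapsed = energy(path, {"a": (0, 0), "b": (0, 0), "c": (0, 0)})
```

A path drawn on a straight line with unit spacing reproduces all graph distances, while a drawing that collapses every node onto one point scores much worse.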

5. Experimentation: Tests

– Tests on a variety of graphs
– Java API Royere

Graphs                  Algorithm 2   GEM      Random   Ring
|V|     |E|     Ref
59      87      1       0.049         0.0139   0.349    0.388
216     528     3       0.112         0.154    0.203    0.224
250     490     4       0.220         0.130    0.223    0.236
296     824     5       0.210         0.132    0.213    0.228
300     639     6       0.217         0.169    0.228    0.243
350     668     7       0.206         0.135    0.220    0.233
400     992     8       0.200         0.130    0.220    0.224
450     865     9       0.191         0.165    0.219    0.234
500     784     10      0.203         0.196    0.214    0.233
550     1172    11      0.190         0.128    0.210    0.221
600     1182    12      0.175         0.189    0.210    0.221

6. Conclusion and future work

Conclusion
– Using a partition based on metric values
– [result lost in extraction]

Future work
– The drawing could serve as the input for an animated force-directed layout
– Choose another metric
