
Dense Subgraphs on Dynamic Networks

Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Amitabh Trehan

(Presented by Anisur Molla)

Density

Network density is probably the most fundamental network metric for understanding how networks tick....

Third Degree Centrality (blog), June 16, 2011

Sparse but Dense

• While a graph may be globally sparse, it often still has dense substructures
  – These provide topological characteristics that are often important to understand

• It is important to study how to find dense subgraphs in large graphs
  – The World Wide Web
  – Search query-document click logs
  – Social networks
  – Telephone call logs
  – Peer-to-peer backbone networks

What do dense structures reveal?

• Web social network communities (potentially hidden)
• Friend groups / shared interest groups

• Good structures to study for “cohesive” webpages
• Helpful for identifying similar webpages
• Potentially helps spam detection

• Network backbones in peer-to-peer networks
• Understand connectivity structure

• User behavior/interest analysis from click logs

Properties of Density

• Largely “robust” to graph alterations
  – Small changes in the graph (such as edge additions/deletions) only marginally affect the density, so it is “smooth” in this regard

• Relatively stable for dynamic graphs

• Measures a “local” structural property that often reveals local and global topological insights

• Some variants of the problem are poly-time solvable
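The robustness claim above can be made concrete with a short calculation. The notation is an assumption for illustration: ρ(S) = |E(S)|/|S| is the density of a node set S (edges per node, which matches the 16/11 example later in the deck), and E'(S) is the edge set induced by S after at most r edge insertions or deletions.

```latex
\[
\rho(S) = \frac{|E(S)|}{|S|},
\qquad
\bigl|\rho'(S) - \rho(S)\bigr|
  = \frac{\bigl|\,|E'(S)| - |E(S)|\,\bigr|}{|S|}
  \le \frac{r}{|S|}.
\]
```

So r edge changes move the density of any fixed set S by at most r/|S|, which is small whenever S is large compared to r.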

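Density

• Density of graph G(V,E): density(G) = |E| / |V|

• Density of subgraph S (= induced density on G): density(S) = |E(S)| / |S|, where E(S) is the set of edges of G with both endpoints in S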
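As a minimal sketch of the two definitions above, assuming density means edges per node (consistent with the 16/11 value shown for an 11-node example later in the deck): the function name `induced_density` and the sample graph are illustrative, not taken from the slides.

```python
def induced_density(edges, subset):
    """Density of the subgraph induced by `subset`: |E(S)| / |S|.

    `edges` is an iterable of (u, v) pairs with no duplicates.
    """
    subset = set(subset)
    if not subset:
        return 0.0
    # Count only edges with both endpoints inside the subset.
    inside = sum(1 for u, v in edges if u in subset and v in subset)
    return inside / len(subset)


# Hypothetical example: a 4-clique on {1, 2, 3, 4} plus a pendant node 5.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
whole = induced_density(edges, {1, 2, 3, 4, 5})   # 7 edges / 5 nodes
clique = induced_density(edges, {1, 2, 3, 4})     # 6 edges / 4 nodes
```

Here the whole graph has density 7/5 while the clique has density 6/4, so the densest subgraph is a proper subset of the graph.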

Our problem

• Efficient distributed algorithms for discovering densest subgraphs / bounded-size densest subgraphs

• Maintaining the subgraphs when edges change (dynamic graphs)

Our Dynamic Model

• Initial graph over n nodes

• Edge dynamic model

• At each time step, the adversary may add or remove up to r edges

• Constraint: the “dynamic diameter” is bounded by D

• After the adversarial action, nodes communicate with their direct neighbors under the CONGEST model
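The edge-dynamic step described above could be sketched as follows. This is only an illustration of the rate bound r: the function name `apply_step` is hypothetical, edges are stored as sorted tuples by assumption, and the dynamic-diameter constraint is not modeled.

```python
def apply_step(edges, additions, removals, r):
    """Apply one adversarial time step to an undirected edge set.

    Raises ValueError if the adversary tries to change more than r edges.
    """
    if len(additions) + len(removals) > r:
        raise ValueError("adversary exceeded the rate bound r")

    def norm(edge):
        # Store each undirected edge as a sorted tuple.
        return tuple(sorted(edge))

    new_edges = {norm(e) for e in edges}
    new_edges -= {norm(e) for e in removals}
    new_edges |= {norm(e) for e in additions}
    return new_edges


# Hypothetical usage: one step that removes one edge and adds another (r = 2).
before = {(1, 2), (2, 3)}
after = apply_step(before, additions=[(3, 1)], removals=[(1, 2)], r=2)
```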

Distributed CONGEST Model

• Synchronous communication “rounds”
• A node can exchange messages with each of its neighbors in each round
• Each message should be O(log n) bits in size (bandwidth restriction)
• Objective: minimize the number of “rounds” (time complexity)
• The model is a well-studied theoretical abstraction, motivated by peer-to-peer networks
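To illustrate what a round looks like in this model, here is a small centralized simulation, offered as a sketch rather than as part of the authors' algorithms. Every node floods the largest node ID it has seen, and each message carries a single ID, which fits in O(log n) bits. The function name `flood_max_id` and the global stopping check are assumptions made for the simulation; real nodes cannot observe global convergence, which is one reason knowledge of the diameter D is useful.

```python
def flood_max_id(adj):
    """Simulate synchronous flooding of the maximum node ID.

    `adj` maps each node to the set of its neighbors.
    Returns (value known at each node, number of rounds with any change).
    """
    known = {v: v for v in adj}
    rounds = 0
    while True:
        # One round: each node reads the value sent by each neighbor
        # (one ID per edge per direction) and keeps the maximum.
        updated = {v: max([known[v]] + [known[u] for u in adj[v]]) for v in adj}
        if updated == known:
            return known, rounds
        known = updated
        rounds += 1


# Hypothetical example: a path 1 - 2 - 3 - 4, whose diameter is 3.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
known, rounds = flood_max_id(path)
```

On this path the value 4 needs three rounds to reach node 1, which matches the diameter.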

Additional details

• Algorithms run continuously to maintain the approximations at all times

• Self-awareness: Nodes are aware of whether they are part of the output subgraph

• Nodes need knowledge of the dynamic diameter D

• Cost is measured in approximation guarantees as well as time bounds

Related Work

• Lots of work on finding size-bounded dense subgraphs in the classical setting
  – NP-hard with a size restriction (poly-time solvable otherwise)
  – No approximation scheme for size-exactly-k or size-at-most-k (and no constant factor known)
  – Khuller-Saha and Andersen-Chellapilla gave constant-factor algorithms for size at-least-k

• Some of our algorithms are based on Khuller-Saha

• Surprisingly, no work in the distributed (CONGEST model) and dynamic settings

Related Work - 2

• Lots of work on dynamic networks
  – Notable recent model for edge alteration by Kuhn-Lynch-Oshman (stability property with T-interval connectivity)
  – Our model is slightly different (though similar graphs are generated)

• Lots of graph problems studied in the CONGEST model (Peleg)
  – Very fast distributed approximation algorithms studied
  – Densest subgraph falls under the category of “global problems”, so it needs Ω(D) rounds
  – For several global problems, there is a recent lower bound (Das Sarma et al.)

• Densest subgraph problem is one for which this lower bound does not hold

Our Results: Densest Subgraph Problem

We give a distributed algorithm that, for any dynamic graph with dynamic diameter D and rate r, w.h.p. (2+ε)-approximates the densest subgraph at that time, given that the max density is at least a certain threshold

Our Results: Densest Subgraph Problem (static graphs)

We give a distributed algorithm that obtains a (2+ε)-approximation w.h.p. in O(D log n) rounds of the CONGEST model

Note: First known algorithm for this problem in the CONGEST model

Our Results: At-least-k Densest Subgraph

We give an algorithm that w.h.p. (3+ε)-approximates the at-least-k densest subgraph at that time, given that the max at-least-k density is at least a certain threshold

Centralized version is known to be NP-hard

(Subgraph should have at least k vertices)

Our Results: At-least-k Densest Subgraph (static graphs)

A distributed algorithm that obtains w.h.p. a (3+ε)-approximation in O(D log n) rounds of the CONGEST model

Algorithm for Densest Subgraph

Previous 2-Approximation Algorithm [Khuller-Saha’09]

• Iteratively remove the lowest degree node and keep track of density

Slides idea borrowed from Sergei Vassilvitskii “MapReduce Algorithmics” presented at “Large-Scale Distributed Computation” Seminar 2011, Japan

Densities after each removal (figure legend: red = lowest degree, gray = deleted)

1) 16/11
2) 15/10
3) 14/9
4) 13/8
5) 12/7
6) 10/6
7) 9/5
8) 6/4
9) 3/3
10) 1/2
11) 0

2-approximate density = largest density (here 9/5)



• Inefficient to implement even on static distributed networks (needs Ω(n) rounds)
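The one-node-at-a-time peeling described above can be sketched in a few lines. This is a centralized illustration, not a distributed implementation; the function name `greedy_peel` and the sample graph are assumptions. The loop runs once per node, which is the source of the linear number of steps noted in the last bullet.

```python
def greedy_peel(nodes, edges):
    """Remove a lowest-degree node repeatedly; return the best density seen.

    `edges` is a list of distinct undirected (u, v) pairs.
    Returns (best density, node set that achieved it).
    """
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    m = len(edges)
    best_density, best_set = 0.0, set()
    while adj:
        current = m / len(adj)            # density = edges / nodes
        if current > best_density:
            best_density, best_set = current, set(adj)
        v = min(adj, key=lambda x: len(adj[x]))   # a lowest-degree node
        for u in adj[v]:
            adj[u].discard(v)
        m -= len(adj[v])                  # its incident edges disappear
        del adj[v]
    return best_density, best_set


# Hypothetical example: a 4-clique on {1, 2, 3, 4} plus a pendant node 5.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
density, members = greedy_peel([1, 2, 3, 4, 5], edges)
```

On this input the pendant node is removed first, and the remaining 4-clique (density 6/4) is the best subgraph recorded.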

Our (2+ε)-approximation algorithm

• Iteratively remove all nodes whose degree is lower than (1+ε) times the current average degree, and keep track of density

(figure legend: red = degree lower than average, gray = deleted)

Step   Density   Average degree
1)     16/11     32/11
2)     9/5       18/5
3)     3/3       6/3
4)     0         0

(2+ε)-approximate density = largest density (here 9/5)




• Can be implemented on static distributed networks in O(D log n) rounds

• A similar idea was independently discovered by Bahmani, Kumar, and Vassilvitskii [VLDB’12] for solving this problem in the streaming and MapReduce models.
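The batch-removal idea can be sketched centrally as follows. The removal threshold used here, degree below (1+ε) times the current average degree, is an assumption based on the slide legend ("degree lower than average") and the (2+ε) guarantee; the function name `batch_peel`, the value of ε, and the sample graph are illustrative. In a distributed setting each pass would additionally need the node and edge counts to be aggregated over the network.

```python
def batch_peel(nodes, edges, eps=0.05):
    """Remove all low-degree nodes in each pass; return the best density seen.

    Assumes eps > 0. Returns (best density, node set, number of passes).
    """
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best_density, best_set, passes = 0.0, set(), 0
    while adj:
        n = len(adj)
        m = sum(len(nb) for nb in adj.values()) // 2
        if m / n > best_density:
            best_density, best_set = m / n, set(adj)
        if m == 0:
            break                          # only isolated nodes remain
        threshold = (1 + eps) * 2 * m / n  # (1 + eps) * average degree
        low = {v for v in adj if len(adj[v]) < threshold}
        for v in low:
            for u in adj[v]:
                if u not in low:
                    adj[u].discard(v)
        for v in low:
            del adj[v]
        passes += 1
    return best_density, best_set, passes


# Hypothetical example: a 4-clique on {1, 2, 3, 4} plus a pendant node 5.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
density, members, passes = batch_peel([1, 2, 3, 4, 5], edges, eps=0.05)
```

Because a minimum-degree node is always below the threshold when ε > 0, every pass removes at least one node, and a standard counting argument suggests that a constant fraction of nodes goes in each pass, so the number of passes is logarithmic in n rather than linear.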

Extension to Dynamic Networks

• Continuously maintain the (2+ε)-approximations at all times

• Assumption: Density of densest subgraph is never too low

• Nodes need knowledge of the dynamic diameter D (time to broadcast one message)

• Our algorithm only needs to approximately count the number of edges and nodes in the network!

• Self-awareness: Nodes are aware they are part of certain dense subgraphs

Extension to At-Least-k Densest Subgraphs

• Simultaneously maintain the (3+ε)-approximate solution to the at-least-k densest subgraph problem, for all k.
  – Same assumptions
  – Without further communication, can, e.g., answer a query “Give the approximate density of the densest subgraph of size at least n/10”.
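One way to picture how such a query could be answered locally is a lookup over the sizes and densities recorded during peeling. The sketch below is a simplification under stated assumptions: the function name `best_density_at_least` is hypothetical, the history reuses the (size, density) pairs from the batch-removal example earlier in the deck, and the lookup by itself does not establish the approximation guarantee claimed for the authors' algorithm.

```python
def best_density_at_least(history, k):
    """Largest recorded density among subgraphs with at least k nodes.

    `history` is a list of (number of nodes, density) pairs, one per pass.
    Returns None if no recorded subgraph is large enough.
    """
    candidates = [density for size, density in history if size >= k]
    return max(candidates) if candidates else None


# (size, density) pairs from the batch-removal example: 16/11, 9/5, 3/3.
history = [(11, 16 / 11), (5, 9 / 5), (3, 3 / 3)]
answer_k4 = best_density_at_least(history, 4)    # the 5-node subgraph qualifies
answer_k6 = best_density_at_least(history, 6)    # only the full graph qualifies
answer_k12 = best_density_at_least(history, 12)  # nothing is large enough
```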

Conclusions and Future Work

• We provide distributed approximation algorithms for the densest and at-least-k densest subgraph problems in the CONGEST model, for both static and dynamic cases.

• While most graph problems are hard to approximate even in the static case, density is a useful exception

• Can this be extended to node deletions or other density definitions? Can the upper bounds be improved, or a lower bound provided?

Thank you

Image: [email protected] , under creative commons license
