
Compressed Sensing of Correlated Social Network Data

Network Science Term Project. Unpublished preliminary work. Do not distribute.

Abstract

In this paper, we present preliminary work on compressed sensing of social network data and identify its non-trivial sparse structure. A feature basis learned from data samples and the inherent sparse structure induced by graphs are combined into a tensor-based sparse representation of high-dimensional social network data. By solving the Tensor Compressed Sensing Problem, we can effectively recover correlated social data from a few samples. An efficient greedy algorithm, called Tensor Matching Pursuit (TMP), is also proposed to handle the computational intractability of big data. We extensively test our algorithm and implementations on social network datasets from Twitter, Facebook, and Weibo. The results show that our approach robustly outperforms the baseline algorithms and hence, in a sense, captures the sparsity of social network data better.

1. Introduction

This paper considers the data gathering problem for large-scale social networks, where each user is modeled with a vector of features such as education, hobby, opinion, etc. Social networks such as Twitter and Weibo are pervasive nowadays and are extensively involved in network analysis, data mining (Russell, 2011), and machine learning.

In general, social network data is acquired in an accumulative fashion, with user profiles and features collected and stored independently. As we know, however, social network data is far from independent (Anagnostopoulos et al., 2008). We therefore ask the fundamental question: can one essentially reduce the number of samples required for social network analysis with hopefully little accuracy tradeoff?

The answer lies in the haystack of data correlations in social networks. In particular, two representative types of correlations are frequently encountered, which we highlight as follows:

• Social Correlation. Your friend and you tend to like the same TV show. Social correlation, such as social influence (Anagnostopoulos et al., 2008) or information cascading (Easley & Kleinberg, 2010), characterizes the coordination of people's behavior and features over a connected graph component.

• Feature Correlation. Being a geek is being cool. The high-dimensional features of each entity could have correlations among multiple dimensions.

In contemporary signal processing and machine learning, the silver bullet for recovering correlated data is compressed sensing (Candes & Wakin, 2008). This paper presents our preliminary attempt to apply compressed sensing techniques to social network data gathering. Our general idea is depicted in Figure. Instead of collecting the feature vectors of the entire dataset, we can randomly sample a subset of the feature entries and later use sparse recovery algorithms to acquire the rest.

Two major contributions are made in this paper. First, a novel sparse representation that simultaneously deals with both social and feature correlations is introduced by exploring the idea of hierarchical sparse modeling. Second, an efficient implementation of the sparse recovery algorithm under the new representation is devised that allows for fast solving of our optimization problem. The algorithms are extensively tested on several datasets, and the results show the advantage of our approach over baseline algorithms.

The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 establishes the compressed sensing framework for social graphs with various types of data gathering strategies. Section 4 introduces the combination of feature basis learning and diffusion wavelets in a hierarchical style for the sparse representation of social networks. Section 5 presents two efficient implementations of our proposed algorithm. Section 6 demonstrates the test results on three datasets. Finally, Section 7 concludes this paper and discusses potential future work.

2. Related Work

Over the past decades, network science research has become increasingly popular. The mainstream splits into two parts: network modeling (e.g., the small-world model (Newman, 2000)) and network analysis (e.g., the link prediction problem (Liben-Nowell & Kleinberg, 2007)). These research directions share a common feature: dealing with big data. For instance, Kwak et al. crawled the entire Twitter network and gathered the data for analysis (Kwak et al., 2010).

The field of compressed sensing grew out of work by Candes, Romberg, Tao, and Donoho (Donoho, 2006). Recent years have witnessed the advancement of compressed sensing in both theory and application. Baraniuk et al. give a simple proof of the restricted isometry property (RIP) of random matrices (Baraniuk et al., 2008). Candes et al. consider signal recovery from highly incomplete information (Candes et al., 2006).

Compressed sensing has found great application in multiple fields such as image processing, signal processing, audio analysis, and sensor networks. The first application of compressed sensing techniques to network data gathering dates back to Luo et al. (Luo et al., 2009), who construct a sensor network with a sink collecting compressed measurements, which is equivalent to a random matrix projection. Xu et al. (Liwen, 2013) consider more general compressed sparse functions for the sparse representation of signals over graphs.

In terms of sparse representations, dictionary learning originates from efforts to reproduce V1-like visual neurons through sparse coding (Lee et al., 2007). Mairal et al. proposed online dictionary learning methods, which lead to efficient computation of sparse coding (Mairal et al., 2009). Diffusion wavelets were first invented by Maggioni in 2004 (Mahadevan & Maggioni, 2006) and have found application in compressed sensing over graphs (Coates et al., 2007).

The typical numerical approach to solving compressed sensing problems is l1 minimization. Nevertheless, several "greedy pursuit" algorithms have been proposed. Pursuit algorithms date back to 1974 (Friedman & Tukey, 1974). The classic Matching Pursuit algorithm was proposed by Mallat and Zhang in 1993 (Mallat & Zhang, 1993). There is now a family of "greedy pursuit" algorithms, and some theoretical guarantees have been established (Lu & Do, 2008).

3. Compressed Social Data Gathering

Compressed sensing, also known as compressive sensing, is a signal processing technique for efficiently acquiring and reconstructing a signal by finding solutions to underdetermined linear systems (Candes & Wakin, 2008).

$$\vec{m} = M\Phi\vec{x} \tag{1}$$

This takes advantage of the signal's sparsity or compressibility (representability by a few non-zero coefficients $\vec{x}$) in some domain $\Phi$, allowing the entire signal $\Phi\vec{x}$ to be determined from relatively few measurements $\vec{m}$. A compressed data gathering scheme should cleverly devise the measurement matrix so as to maximize information expressiveness. We propose two appealing measurement schemes for compressed social data gathering in this section.

3.1. Sampling

According to the traditional theory of compressed sensing, we need to choose at least $Ck\log(n/k)$ samples for the ideal recovery of $k$-sparse signals, where $C$ is a small constant. In the setting of graph compressed sensing, we hence need to select $\Theta(k\log(n/k))$ feature entries at various vertices to achieve compressed sampling. For simplicity, we just sample the data uniformly at random.
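To make the scheme concrete, the following is a minimal sketch (not the paper's code) of drawing roughly $Ck\log(n/k)$ feature entries uniformly at random from an n × d node-feature matrix; the helper name and the constant C = 4 are our own assumptions.

import numpy as np

def sample_entries(X, k, C=4, rng=np.random.default_rng(0)):
    # X: (n x d) node-feature matrix; k: assumed sparsity level.
    # Hypothetical helper illustrating Section 3.1, not the paper's code.
    n, d = X.shape
    m = min(n * d, int(C * k * np.log(n / k)))  # number of sampled entries
    idx = rng.choice(n * d, size=m, replace=False)
    mask = np.zeros(n * d, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(n, d)
    # Observed entries are kept; the rest are left for sparse recovery.
    return np.where(mask, X, np.nan), mask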

3.2. Message Propagation

On social networks, messages such as "tweets" and "microblogs" are propagated along the edges from vertex to vertex. When a message m passes a vertex v (representing a client or a person), the properties of m could be influenced by the properties of the vertex v. Measuring the changes on a message may help us learn more about the graph's features, and can thus, in a way, be viewed as measuring information during the compressed sensing process.

Suppose that each vertex u has a high-dimensional feature vector $v_u$. Also, for each message m spread in the network, there is a corresponding vector $v_m \in \mathbb{R}^l$ that describes the properties of this message; $v_m$ can be changed during propagation. Since traditional compressed sensing considers problem settings with linear transformations, we can assume that $v_m$ is changed linearly according to the weight of an edge as it passes that edge. More precisely, suppose that for each edge $e = (s,t)$ there is a vector $\alpha_e \in \mathbb{R}^l$ that represents the weight values of the different properties of this edge; then we can assume that



$$v'_{m,i} = \frac{v_{s,i} + \alpha_i v_{m,i}}{1 + \alpha_i}, \quad 1 \le i \le l. \tag{2}$$

With this assumption, we measure social networks through the paths of messages' propagation: we measure the difference in a message's property vector after it passes an edge, and from these differences we can construct a set of linear equations for solving for the properties of the vertices.
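As an illustration, here is a small sketch of one such propagation measurement under equation (2); it assumes the edge weights α are known, and the function names are ours, not the paper's.

import numpy as np

def pass_edge(v_m, v_s, alpha):
    # Equation (2): each property i of the message becomes
    # (v_s[i] + alpha[i] * v_m[i]) / (1 + alpha[i]) after crossing the edge.
    return (v_s + alpha * v_m) / (1.0 + alpha)

def measure_path(v_m0, vertex_features, edge_alphas):
    # Propagate a message along a path and record the per-edge differences;
    # each difference is one linear measurement of the visited vertex features.
    v_m, diffs = v_m0.copy(), []
    for v_s, alpha in zip(vertex_features, edge_alphas):
        v_next = pass_edge(v_m, v_s, alpha)
        diffs.append(v_next - v_m)  # the observable change on this edge
        v_m = v_next
    return diffs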

However, what limits the use of this paradigm is how to properly set the edge weights; requiring such strong prior information is somewhat restrictive. Therefore, we consider a simpler, more practical way of gathering the network data: sampling.

4. Sparse Recovery

In this section, we address a fundamental issue encountered in the sparse recovery of gathered social network data. Social network data can be "big" in two dimensions: rows (i.e., the number of entities on the graph) and columns (i.e., the number of features of each entity). Since correlation can occur in both dimensions, it is unclear how one could find a suitable way to sparsely represent the data.

There are basically two straightforward paradigms for identifying a sparse representation of data on social graphs. One trivial approach simply ignores the network structure and adopts dictionary learning methods (Lee et al., 2007; Mairal et al., 2009) to find a basis under which the feature vectors of entities $\{v_i\}_{i=1}^n$ have sparse coefficients. This approach overlooks the significant role of social ties shown in previous works (Anagnostopoulos et al., 2008), which could possibly lead to a sparser representation. The other paradigm follows preliminary work by Coates et al. (Coates et al., 2007) as well as Xu et al. (Liwen, 2013), both of which, in different ways, consider sparse decomposition of signal values with respect to some specific functional basis on sensor networks. Although this suggests a direct way to model social correlation, it does not generalize naturally to high-dimensional social vector graphs with feature correlations.

We could in principle concatenate the features of all entities $\{v_i\}_{i=1}^n$ into one large vector so as to capture both feature and topological correlations, but then neither the computational burden nor the data size would be tractable.

To overcome these shortcomings, we propose a novel construction of sparse basis functions for high-dimensional social graph data by combining the above two paradigms in a reciprocal way.

4.1. Feature Basis

Identifying a sparse feature basis is a learning problem: given $\kappa$ statistically representative feature samples of network entities $x_{i_1}, x_{i_2}, \ldots, x_{i_\kappa} \in \mathbb{R}^d$, we seek a set of vectors $b_1, b_2, \ldots, b_{m_B} \in \mathbb{R}^d$ under which the samples can be sparsely represented by $L \ll d$ non-zero coefficients $\alpha_{i_1}, \alpha_{i_2}, \ldots, \alpha_{i_\kappa}$, and this property should generalize to the entire data universe. Mathematically, it can be formalized as a joint optimization problem:

$$\min_{B,\,\alpha}\; \frac{1}{\kappa} \sum_{j=1}^{\kappa} \left( \frac{1}{2}\,\|x_{i_j} - B\alpha_{i_j}\|^2 + \lambda\,\|\alpha_{i_j}\|_1 \right) \tag{3}$$

where the basis vector $b_i$ is the $i$-th column of the matrix $B$ and $\lambda$ is a regularization parameter controlling sparsity. This problem can be solved efficiently with second-order gradient descent (Lee et al., 2007), online learning methods (Mairal et al., 2009), etc.
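As a concrete illustration of solving problem (3), the sketch below uses scikit-learn's MiniBatchDictionaryLearning, whose solver follows the online method of Mairal et al. (2009); the sample sizes and parameter values are placeholders, not the paper's settings.

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Rows of `samples` play the role of the feature samples x_{i_j} in (3).
rng = np.random.default_rng(0)
samples = rng.standard_normal((300, 42))   # kappa = 300 samples, d = 42

# `alpha` corresponds to lambda in (3); `n_components` is the dictionary
# size m_B. lasso_lars keeps the l1-penalized coding of the objective.
dl = MiniBatchDictionaryLearning(n_components=64, alpha=1.0,
                                 transform_algorithm='lasso_lars')
codes = dl.fit(samples).transform(samples)  # sparse coefficients alpha_{i_j}
B = dl.components_.T                        # columns are the basis vectors b_i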

4.2. Diffusion Wavelet

Wavelet transforms are a staple of modern compression and signal processing methods due to their ability to represent piecewise smooth signals efficiently (signals which are smooth everywhere except for a few discontinuities).

In general, a wavelet transform produces a multi-scale function decomposition defined on a regularly sampled interval. Ronald R. Coifman and Mauro Maggioni (Ronald R. Coifman, 2006) introduced diffusion wavelets to generalize this construction. They start from a semi-group of diffusion operators $\{T^t\}$ associated with a diffusion process to induce a multi-resolution analysis, interpreting the powers of $T$ as dilation operators acting on functions and constructing precise downsampling operators to efficiently represent the multi-scale structure. This yields a construction of multi-scale scaling functions and wavelets in a very general setting.

Our goal for the diffusion wavelet transformation is to compute a collection $\{B_i\}_{i=1}^n$ of orthonormal wavelet basis vectors. A function $y$ on the graph can then be written as

$$y = \sum_{i=1}^{n} \beta_i B_i$$

where $\beta_i$ is the $i$-th wavelet coefficient (Coates et al., 2007).


The process for computing the basis needs a sparse QR decomposition: given an input $A$ (a sparse $n \times n$ matrix) and a precision parameter $\varepsilon$, the sparse QR decomposition returns an $n \times n$ orthogonal matrix $Q$ and an upper triangular matrix $R$ such that $A \approx_\varepsilon QR$.

To compute the orthonormal bases of scaling functions $\Phi_j$ and wavelets $\Psi_j$, the algorithm works as follows:

Algorithm 1 Diffusion Wavelet Construction (Ronald R. Coifman, 2006)

1: $j \leftarrow 0$
2: while $j < N$ do
3:   $[\Phi_{j+1}]_{\Phi_j},\, [T^{2^j}]_{\Phi_j}^{\Phi_{j+1}} \leftarrow \mathrm{SpQR}([T^{2^j}]_{\Phi_j}^{\Phi_j},\, \varepsilon)$
4:   $T_{j+1} := [T^{2^{j+1}}]_{\Phi_{j+1}}^{\Phi_{j+1}} \leftarrow [\Phi_{j+1}]_{\Phi_j}\, [T^{2^j}]_{\Phi_j}^{\Phi_j}\, [\Phi_{j+1}]_{\Phi_j}^{*}$
5:   $[\Psi_j]_{\Phi_j} \leftarrow \mathrm{SpQR}(I_{\langle\Phi_j\rangle} - [\Phi_{j+1}]_{\Phi_j} [\Phi_{j+1}]_{\Phi_j}^{*},\, \varepsilon)$
6:   $j \leftarrow j + 1$
7: end while

where $[B_1]_{B_2}$ represents the set of vectors $B_1$ represented on the basis $B_2$, and $[L]_{B_1}^{B_2}$ indicates the matrix representing the linear operator $L$ with respect to the basis $B_1$ in the domain and $B_2$ in the range.

The key point in this process is how to choose the initial diffusion operator $T$. This $T$ must be a matrix such that $T_{ij} > 0$ if and only if $(i,j) \in E$, and the value of $T_{ij}$ should represent the correlation of vertices $i$ and $j$. Hence, we choose $T$ based on the normalized Laplacian of the graph, i.e.,

$$L(i,j) = \begin{cases} 0 & \text{if } (i,j) \notin E \\ -\dfrac{1}{\sqrt{d_i d_j}} & \text{if } (i,j) \in E \end{cases} \tag{4}$$

4.3. Graph Tensor Basis

We propose to unify the feature basis and the diffusion wavelets in a hierarchical fashion. First, the feature vectors associated with the nodes, $\{x_i\}_{i=1}^n \subset \mathbb{R}^d$, are decomposed sparsely into coefficients $\{\alpha_i\}_{i=1}^n$ under the basis $B$:

$$x_i = B\alpha_i \tag{5}$$

Let $X, A \in \mathbb{R}^{d \times n}$ be the matrices whose columns are the vectors $x_i$ and $\alpha_i$, respectively. Each row of $A$, denoted $A_{i,:}$ for $1 \le i \le d$, is then a value function on the graph, and we proceed by decomposing it over the diffusion wavelets $W$:

$$A_{i,:}^{\top} = W u_i \tag{6}$$

Using matrix notation, the entire pipeline can be written as

$$X = BA = BUW^{\top} \tag{7}$$

where the $i$-th row of the coefficient matrix $U$ is $u_i^{\top}$. The focus of our sparse recovery algorithm then turns to minimizing the $\ell_1$ norms $\|u_1\|_1, \|u_2\|_1, \ldots, \|u_d\|_1$ subject to Equation 7:

$$\begin{array}{ll} \underset{U}{\text{minimize}} & \displaystyle\sum_{i=1}^{d} \|u_i\|_1 \\ \text{subject to} & Y = M(BUW^{\top}) \end{array} \tag{8}$$

The above optimization problem can be rewritten in the familiar form of compressed sensing if we concatenate the columns of $X$ and $U$ into long vectors $\vec{X}, \vec{U}$, which are connected through the Kronecker product $B \otimes W^{\top}$:

$$\begin{array}{ll} \underset{U}{\text{minimize}} & \displaystyle\sum_{i=1}^{d} \|u_i\|_1 \\ \text{subject to} & \vec{Y} = M(B \otimes W)\vec{U} \end{array} \tag{9}$$

In other words, the feature basis and the diffusion wavelets are combined through a tensor product to produce a new basis for hierarchical sparse decomposition.
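The equivalence behind problem (9) is the standard vectorization identity vec(B U W^T) = (W ⊗ B) vec(U) for column-major vec; the exact factor order in the Kronecker product depends on the chosen vectorization convention. A quick numerical check, with arbitrary sizes of our own choosing:

import numpy as np

rng = np.random.default_rng(0)
d, n, p, q = 5, 7, 3, 4
B = rng.standard_normal((d, p))   # feature basis
W = rng.standard_normal((n, q))   # diffusion wavelet basis
U = rng.standard_normal((p, q))   # hierarchical coefficients

lhs = (B @ U @ W.T).flatten(order='F')      # vec(X) with X = B U W^T
rhs = np.kron(W, B) @ U.flatten(order='F')  # (W kron B) vec(U)
assert np.allclose(lhs, rhs)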

5. Efficient Implementation

For networks with a large number of nodes, the optimization problem (9) could be daunting due to the high dimensionality of the tensor basis. To clear away the obstacles to the applicability of our recovery algorithm, we introduce in this section two approximations: Tensor Matching Pursuit and patch-based sparse recovery.

5.1. Tensor Matching Pursuit

Although compressed sensing is often synonymous with $\ell_1$-based optimization, many applications require efficient storage and fast speed. This is especially true for the tensor version of our joint basis optimization problem (9), which may consume up to $n^2 \times d^2$ space for $d$-dimensional vector graphs of $n$ nodes.

We show that these burdens are not fundamental obstacles to our sparse recovery paradigm by introducing a new greedy algorithm that tackles the optimization approximately. Our technique belongs to a large family of "greedy pursuit" methods used in compressed sensing and generalizes the classic Matching Pursuit algorithm.

Like other "greedy pursuit" algorithms, our Tensor Matching Pursuit (TMP) has two fundamental steps: element selection and coefficient update. In particular, the approximation is incremental: at each iteration, one column of the basis is selected and the coefficients associated with that column are updated so that the residual of the constraints decreases.

To derive Tensor Matching Pursuit, we need to put a weak constraint on the measurements. Let $M_1, M_2$ be two linear operators; the following problem (10) is a slightly restricted version of our optimization problem (9).

$$\begin{array}{ll} \underset{U}{\text{minimize}} & \displaystyle\sum_{i=1}^{d} \|u_i\|_1 \\ \text{subject to} & Y = M_1 B U W^{\top} M_2^{\top} \end{array} \tag{10}$$

This restriction requires measuring the same feature dimensions simultaneously and is reasonable in practice. The TMP method is elaborated in Algorithm 2.

Algorithm 2 Tensor Matching Pursuit

input $Y, B, W$
1: Set $R^{[0]} = Y$, $X^{[0]} = 0$
2: for round $t = 1 \to \infty$ until a stopping criterion is met do
3:   Calculate the projections $R_B = B^{\dagger}R$ and $R_W = W^{\dagger}R^{\top}$, where $B^{\dagger}$ and $W^{\dagger}$ are the Moore-Penrose pseudo-inverses.
4:   Compute the correlation matrices $C_B = R_B \left[\frac{w_1}{\|w_1\|_2}, \ldots, \frac{w_q}{\|w_q\|_2}\right]$ and $C_W = R_W \left[\frac{b_1}{\|b_1\|_2}, \ldots, \frac{b_p}{\|b_p\|_2}\right]$, where $b_i$ and $w_j$ are the column vectors of $B$ and $W$.
5:   Select the entry $e = (i,j)$ of either $C_B$ or $C_W$ that has the highest absolute correlation value $|c(e)|$.
6:   if $e$ is chosen from $C_B$ then
7:     $X_{ij} = X_{ij} + \eta_t c(e)$
8:   else
9:     $X_{ji} = X_{ji} + \eta_t c(e)$
10:  end if
11:  Update $R = Y - BXW^{\top}$.
12: end for
output $X, R$

5.2. Patch-Based Sparse Recovery

The notion of patch-based sparse recovery originates from compressed sensing of natural images with patch-based representations (Yang et al., 2008). It is naturally applicable here once we observe that social interactions are likely to be local. Technically, we divide the nodes of the graph into groups $\mathcal{G}_1, \mathcal{G}_2, \ldots, \mathcal{G}_q$. Let $W_{\mathcal{G}}$ be the wavelet basis restricted to $\mathcal{G}$ (with the rows of nodes outside $\mathcal{G}$ removed) and $M_{\mathcal{G}}$ be the corresponding measurements. Patch-based sparse recovery can then be formulated as:

$$\begin{array}{ll} \underset{U}{\text{minimize}} & \displaystyle\sum_{i=1}^{d} \|u_i\|_1 \\ \text{subject to} & \vec{Y}_{\mathcal{G}} = M_{\mathcal{G}}(B \otimes W_{\mathcal{G}})\vec{U} \end{array} \tag{11}$$

for each $\mathcal{G} \in \{\mathcal{G}_i\}_{i=1}^q$. The choice of the grouping patches is of primary concern for applications. We consider two basic candidates here (a sketch of the k-hop variant follows the list):

• mini-batch. Divide the nodes $\{v_i\}_{i=1}^n$ into $K$ mini-batches $\{v_{(n/K)\cdot j+1}, \ldots, v_{(n/K)\cdot(j+1)}\}_{j=0}^{K-1}$.

• k-hop. Patch groups consist of the k-hop neighborhoods of the vertices $\{v_i\}_{i=1}^n$ on the graph: $\mathcal{G}_i$ is the set of vertices within distance $k$ of $v_i$.
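A minimal sketch of the k-hop grouping; the helper is hypothetical and uses plain breadth-first search over an adjacency dict.

from collections import deque

def k_hop_patch(adj, root, k):
    # All vertices within distance k of `root` -- one patch G_i.
    # adj: dict mapping each vertex to an iterable of its neighbors.
    seen, frontier = {root}, deque([(root, 0)])
    while frontier:
        v, dist = frontier.popleft()
        if dist == k:
            continue
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                frontier.append((u, dist + 1))
    return seen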

6. Experiment

The proposed compressed sensing algorithm and its fast implementations are extensively tested on three datasets, two of which are real-world social network data.

6.1. Synthetic Data

Synthetic data is generated from classical network models that capture many characteristic aspects of practical social networks, for example, constant clustering coefficient, the small-world effect, etc. (Newman, 2009). It gives a simple demonstration of how our compressed sensing approach works on social networks. Furthermore, since real-world data is noisy, the gap between the algorithm's performance on synthetic and real social network data can, in a way, measure its robustness in the presence of noise.

In particular, we utilize preferential attachment and the small-world model (Newman, 2009) to synthesize graphs with topology akin to that of social networks. To generate a synthetic graph G = (V,E) of n vertices, our algorithm works as follows:

Algorithm 3 Graph Synthesis

1: V ← {v_1}
2: i ← 1
3: while i < N do
4:   i ← i + 1
5:   V = V ∪ {v_i}
6:   Choose v_j ∈ {v_1, v_2, ..., v_{i−1}} according to the Preferential Attachment rule
7:   E = E ∪ {(v_i, v_j)}
8:   Choose a long-range link (v_s, v_t) according to Kleinberg's model, where v_s, v_t ∈ V and (v_s, v_t) ∉ E
9:   E = E ∪ {(v_s, v_t)}
10: end while
11: Sample a random basis B.
12: for i = 1 → n do
13:   Randomly sample a k-sparse vector under basis B.
14: end for
15: Use the similarity of the feature vectors to define weights for the Markov chain built from G = (V,E).
16: Simulate the Markov chain with Gibbs sampling.

Here, the Preferential Attachment rule selects the vertex $v_j$ with probability proportional to the in-degree of $v_j$, i.e.,



$$\Pr[v_j \text{ is chosen}] = \frac{\text{in-degree}(v_j)}{\sum_{k=1}^{i-1} \text{in-degree}(v_k)} \tag{12}$$

Kleinberg's model (Kleinberg, 2000) chooses a long-range link $(v_s, v_t)$ with probability proportional to $d(v_s, v_t)^{-\alpha}$, where $\alpha$ is a constant. Kleinberg showed that for $\alpha = 2$, a grid network can be greedily routed in $O(\log^2 N)$ steps in expectation (Kleinberg, 2000). In our synthetic graph model, we also choose $\alpha = 2$.

Experimental results show that this synthetic graph model has a small diameter, a large clustering coefficient, and a power-law degree distribution similar to that of real social network graphs.

Each node of the synthetic graph is then assigned a randomly generated k-sparse feature vector under a certain basis. A Markov network corresponding to the synthetic graph is generated to incorporate correlations into neighboring nodes.

6.2. SNAP Datasets

The Stanford Large Network Dataset Collection (a.k.a. the SNAP library) (sna, 2009) provides open access to popular network data with anonymized features. The collection subsets used in our experiment involve social circles from Facebook and Twitter.

Figure 1. Synthetic social graphs of (a) 100 nodes and (b) 10,000 nodes, which follow preferential attachment and a power-law degree distribution.

Figure 2. Sample social circles from the Facebook (left, circle 348) and Twitter (right, circle 613313) datasets in the SNAP Dataset Collection.

The Facebook dataset contains 10 circles with 4,039 nodes and 88,234 edges in total. For each node, corresponding binary features are collected, including education, school, year, location, work, etc. Note that these features are binary 0/1 vectors. The Twitter dataset follows similar settings but contains a much larger number of nodes and edges, up to 81,306 and 1,768,149 respectively.

We choose circle 3980 (#nodes 59, #dimensions 42) from the Facebook dataset for the validation of the algorithms' performance. Around 300 samples are first drawn uniformly at random from these data to learn a dictionary (of size L) of basis feature vectors for the sparse representation of the high-dimensional data.

Facebook              Twitter

 0   birthday         19   #CES
 4   education        24   #Dell
10   education        28   #Facebook
34   first name       41   #NBA
44   languages       241   @DIY
46   last name       355   @Microsoft
...                  ...

Table 1. Sample features for the Facebook and Twitter datasets in the SNAP Dataset Collection.


[Figure 3 shows four panels, L = 0.3, 0.4, 0.5, 0.6; x-axis: percent of data observed; y-axis: reconstruction error; curves: Tensor, Feature, Wavelet.]

Figure 3. Relative recovery error on social circle 3980 of the Facebook dataset when m · #nodes entries of the data are observed and a dictionary of size L is prepared.

Upon the specific network topology, the wavelet diffusion process is simulated at 2 scales with the graph Laplacian operator, which is computed from the social graph with weights set to the cosine similarity (Newman, 2009) of adjacent nodes. To emulate a compressed sensing setting, m · #nodes randomly selected feature entries of nodes on the social graph are observed and then recovered with the OMP algorithm under the sparse tensor basis.

The experiments are performed in MATLAB on a 4-core Intel i5 2.4 GHz machine, using the software packages SPAMS (spa, 2012) and Diffusion Wavelets (mau, 2009).

By varying the number of items L in the dictionary as well as the number of measurements m · #nodes, the relative reconstruction error under the l2 norm is plotted in Figure 3. We observe that, in spite of the stochastic reconstruction error due to measurement uncertainty, a larger over-complete dictionary results in better outcomes. We also find that the relative recovery error is usually large, and conjecture that this is because binary and categorical features do not admit a natural sparse representation. Even with this drawback, the graph tensor basis is shown to be more stable and outperforms the baseline methods in some parameter settings.

6.3. Weibo Data

Most open-access datasets have restrictions on the use of social data due to privacy concerns. This brings inherent inconvenience to seeking an appropriate testbed with highly correlated and abundant social features for our algorithm.

[Figure 4 shows five panels, L = 0.3 (twice), 0.4, 0.5, 0.6; x-axis: percent of data observed; y-axis: reconstruction error; curves: Tensor, Feature, Wavelet.]

Figure 4. Relative recovery error on social circle 116808228 of the Twitter dataset when m · #nodes entries of the data are observed and a dictionary of size L is prepared.


We therefore resort to the Internet and use a collection of data crawled from Weibo.com, which contains not only detailed user profiles but also complete microblog texts.

The Weibo dataset contains 200 million users and 200 GB of microblogs. A connected subgraph of 965 nodes and 3 million microblogs is selected, and a 2000-dimensional feature vector is established for each node simply by counting the word distribution of its microblog posts. This representation is in a sense "raw" and hence inevitably leads to stronger correlations among features as well as neighboring nodes.

We evaluate our algorithm's performance following the same settings as for the Facebook and Twitter datasets. However, we find that for this dataset, traditional l1 minimization or greedy algorithms require prohibitive time and space to solve the tensor compressed sensing problem (9). In contrast, our proposed Tensor Matching Pursuit algorithm is much faster and yields comparable accuracy. As shown in Figure 6, our algorithm significantly outperforms the baselines for small numbers of measurements.

7. Conclusion and Future Work

In this paper, we focus on compressed sensing of correlated social network data. Based on the assumption that the network data has good sparsity, we propose a novel algorithm for both the data gathering process and the recovery process, based on hierarchical sparse modeling


Figure 5. The social subgraph taken from the Weibo data, with 965 users and 3 million microblogs.

[Figure 6 plots reconstruction error against percent of data observed; curves: Tensor, Feature, Wavelet.]

Figure 6. Relative recovery error on the Weibo dataset when m · #nodes entries of the data are observed and a dictionary of size L is prepared.

[Figure 7 plots, over 1000 iterations, the residual and the number of non-zero items.]

Figure 7. Tensor Matching Pursuit: change of the residual and of the number of non-zero items.

and tensor representation. Efficient implementations, namely Tensor Matching Pursuit and patch-based optimization, are also presented to allow for fast solving of the tensor compressed sensing problem. To show the robustness and effectiveness of our approach, we test the algorithms on several datasets. The results show that our model in a way captures the sparsity of social networks better and is therefore more desirable in practice.

Simple observation shows that social networks share a non-trivial sparse structure. The graph tensor sparse representation is our very preliminary attempt to identify that structure. We see three directions for future work. Technically, we may continue to explore more sophisticated numerical methods to deal with real big data from large social networks. Algorithmically, we can devise more diverse ways of gathering the data, including path measurements and message propagation measurements. Theoretically, we could prove lower bounds on the number of samples needed by our sparse recovery algorithm, as well as guarantees for the Tensor Matching Pursuit algorithm.

References

Matlab code for diffusion wavelets. http://www.math.duke.edu/~mauro/code.html, 2009.

Snap: Stanford network analysis platform. http://snap.stanford.edu, 2009.

Sparse modeling software (SPAMS). http://spams-devel.gforge.inria.fr/, 2012.

Anagnostopoulos, Aris, Kumar, Ravi, and Mahdian, Mohammad. Influence and correlation in social networks. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 7–15. ACM, 2008.

Baraniuk, Richard, Davenport, Mark, DeVore, Ronald, and Wakin, Michael. A simple proof of the restricted isometry property for random matrices. Constructive Approximation, 28(3):253–263, 2008.

Candes, Emmanuel J and Wakin, Michael B. An introduction to compressive sampling. Signal Processing Magazine, IEEE, 25(2):21–30, 2008.

Candes, Emmanuel J, Romberg, Justin, and Tao, Terence. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. Information Theory, IEEE Transactions on, 52(2):489–509, 2006.


Coates, Mark, Pointurier, Yvan, and Rabbat, Michael. Compressed network monitoring for IP and all-optical networks. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pp. 241–252. ACM, 2007.

Donoho, David Leigh. Compressed sensing. Information Theory, IEEE Transactions on, 52(4):1289–1306, 2006.

Easley, David and Kleinberg, Jon. Networks, crowds, and markets, volume 8. Cambridge Univ Press, 2010.

Friedman, Jerome H and Tukey, John W. A projection pursuit algorithm for exploratory data analysis. Computers, IEEE Transactions on, 100(9):881–890, 1974.

Kleinberg, Jon. The small-world phenomenon: an algorithmic perspective. In Proceedings of the thirty-second annual ACM symposium on Theory of computing, pp. 163–170. ACM, 2000.

Kwak, Haewoon, Lee, Changhyun, Park, Hosung, and Moon, Sue. What is Twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web, pp. 591–600. ACM, 2010.

Lee, Honglak, Battle, Alexis, Raina, Rajat, and Ng, Andrew Y. Efficient sparse coding algorithms. Advances in neural information processing systems, 19:801, 2007.

Liben-Nowell, David and Kleinberg, Jon. The link-prediction problem for social networks. Journal of the American society for information science and technology, 58(7):1019–1031, 2007.

Liwen, Xu. Efficient data gathering using compressedsparse functions. 2013.

Lu, Yue M and Do, Minh N. A theory for sampling signals from a union of subspaces. Signal Processing, IEEE Transactions on, 56(6):2334–2345, 2008.

Luo, Chong, Wu, Feng, Sun, Jun, and Chen, Chang Wen. Compressive data gathering for large-scale wireless sensor networks. In Proceedings of the 15th annual international conference on Mobile computing and networking, pp. 145–156. ACM, 2009.

Mahadevan, Sridhar and Maggioni, Mauro. Value function approximation with diffusion wavelets and Laplacian eigenfunctions. Advances in neural information processing systems, 18:843, 2006.

Mairal, Julien, Bach, Francis, Ponce, Jean, and Sapiro, Guillermo. Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning, pp. 689–696. ACM, 2009.

Mallat, Stephane G and Zhang, Zhifeng. Matching pursuits with time-frequency dictionaries. Signal Processing, IEEE Transactions on, 41(12):3397–3415, 1993.


Newman, Mark. Networks: an introduction. OUP Oxford, 2009.

Newman, Mark EJ. Models of the small world. Journal of Statistical Physics, 101(3-4):819–841, 2000.

Ronald R. Coifman, Mauro Maggioni. Diffusion wavelets. In Applied and Computational Harmonic Analysis, volume 21, pp. 53–94, 2006.

Russell, Matthew A. Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites. O'Reilly Media, 2011.

Yang, Jianchao, Wright, John, Huang, Thomas, and Ma, Yi. Image super-resolution as sparse representation of raw image patches. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pp. 1–8. IEEE, 2008.