
Adversarial Robustness of Probabilistic Network Embedding for Link Prediction

Xi Chen1 (✉), Bo Kang1, Jefrey Lijffijt1, and Tijl De Bie1

IDLab, Department of Electronics and Information Systems, Ghent University, Technologiepark-Zwijnaarde 122, 9052 Ghent, Belgium

{firstname.lastname}@ugent.be

Abstract. In today’s networked society, many real-world problems can be formalized as predicting links in networks, such as Facebook friendship suggestions, e-commerce recommendations, and the prediction of scientific collaborations in citation networks. Increasingly often, the link prediction problem is tackled by means of network embedding methods, owing to their state-of-the-art performance. However, these methods lack transparency when compared to simpler baselines, and as a result their robustness against adversarial attacks is a possible point of concern: could one or a few small adversarial modifications to the network have a large impact on the link prediction performance when using a network embedding model? Prior research has already investigated adversarial robustness for network embedding models, focused on classification at the node and graph level. Robustness with respect to the link prediction downstream task, on the other hand, has been explored much less. This paper contributes to filling this gap by studying the adversarial robustness of Conditional Network Embedding (CNE), a state-of-the-art probabilistic network embedding model, for link prediction. More specifically, given CNE and a network, we measure the sensitivity of the link predictions of the model to small adversarial perturbations of the network, namely changes of the link status of a node pair. Thus, our approach allows one to identify the links and non-links in the network that are most vulnerable to such perturbations, for further investigation by an analyst. We analyze the characteristics of the most and least sensitive perturbations, and empirically confirm that our approach not only succeeds in identifying the most vulnerable links and non-links, but also that it does so in a time-efficient manner thanks to an effective approximation.

Keywords: Adversarial Robustness · Network Embedding · Link Prediction.

1 Introduction

Networks are used to model entities and the relations among them, so they are capable of describing a wide range of real-world data, such as social networks, citation networks, and networks of neurons. The recently proposed Network Embedding (NE) methods can be used to learn representations of the non-iid network data such that networks are transformed into a tabular form. The tabular data can then be used to solve several network tasks, such as visualization, node classification, recommendation, and link prediction. We focus on link prediction, which aims to predict future or currently missing links [25], as it has been widely applied in our lives. Examples include Facebook friendship suggestions, Netflix recommendations, and predictions of protein-protein interactions.

arXiv:2107.01936v1 [cs.SI] 5 Jul 2021

Many traditional link prediction approaches have been proposed [31], but the task is tackled increasingly often by NE methods due to their state-of-the-art performance [30]. However, NE methods, e.g., Graph Neural Networks (GNNs) [14], lack transparency when compared to simpler baselines. Thus, similar to many other machine learning algorithms [13], they could be vulnerable to adversarial attacks. It has been shown that simple imperceptible changes of the node attributes or the network topology can result in wrongly predicted node labels, especially for GNNs [56,7]. Meanwhile, adversarial attacks are easy to find in our daily online lives, such as in recommender systems [50,28,47].

Robustness of NE methods for link prediction is important. Attacking link prediction methods can be used to hide sensitive links, while defending can help identify interactions that are hidden intentionally, e.g., important connections in crime networks. Moreover, as links in online social networks represent information sources and exposures, from a dynamic perspective, manipulations of the network topology can be used to affect the formation of public opinions on certain topics, e.g., via exposing a targeted group of individuals to certain information sources, which is risky. The problem we want to investigate is: Could one or a few small adversarial modifications to the network topology have a large impact on the link prediction performance when using a network embedding model?

Existing adversarial robustness studies for NE methods mainly consider classification at the node and graph level, investigating whether labels will be wrongly predicted due to adversarial perturbations. These include semi-supervised node classification [56,57,58,46,55,45,11,3,60,59,40] and graph classification [7,29,20]. Only a few works consider the link-level task [26,4,2,9], leaving the robustness of NE methods for link prediction insufficiently explored.

To fill the gap, we study the adversarial robustness of Conditional Network Embedding (CNE) [22] for the link prediction task. CNE is a state-of-the-art probabilistic NE model that preserves the first-order proximity and whose objective function is expressed analytically. Therefore, it provides mathematically principled explainability [23]. Moreover, compared to other NE models, such as those based on random walks [35,15], CNE is better suited to link prediction because the link probabilities follow directly from the model, so there is no need to train a separate classifier for links on top of the node embeddings. However, there has been no study on the adversarial robustness of CNE for link prediction.

In our work, we consider only the network topology as input, meaning that there are no node attributes. More specifically, given CNE and a network, we measure the sensitivity of the link predictions of the model to small adversarial perturbations of the network, i.e., changes of the link status of a node pair. The sensitivity is measured as the impact of the perturbation on the link predictions. Intuitively, we quantify the impact as the KL-divergence between the two link probability distributions learned by the model from the clean and the corrupted network through re-training. Since re-training can be expensive, we develop effective and efficient approximations based on gradient information, similar to the computation of the regularizer in Virtual Adversarial Training (VAT) [33]. Our main contributions are:

– We propose to study the adversarial robustness of a probabilistic network embedding model, CNE, for link prediction;

– Our approach allows us to identify the links and non-links in the network that are most vulnerable to adversarial perturbations for further investigation;

– With two case studies, we explain the robustness of CNE for link prediction through (a) illustrating how structural perturbations affect the link predictions; (b) analyzing the characteristics of the most and least sensitive perturbations, providing insights for adversarial learning for link prediction;

– We show empirically that our gradient-based approximation for measuring the sensitivity of CNE for link prediction to small structural perturbations is not only time-efficient but also significantly effective.

2 Related Work

Robustness in machine learning means that a method can function correctly with erroneous inputs [18]. The input data may contain random noise, or adversarial noise injected intentionally. The topic became a point of concern when the addition of noise to an image, imperceptible to human eyes, resulted in a totally irrelevant prediction label [13]. Robustness of models against noisy input has been investigated in many works [32,52,8], while adversarial robustness usually deals with the worst-case perturbations on the input data.

Network tasks at the node, link, and graph level are increasingly done by network embedding methods, which include shallow models and GNNs [27]. Shallow models either preserve the proximities between nodes (e.g., DeepWalk [35], LINE [39], and node2vec [15]) or factorize matrices containing graph information [41,36] to effectively represent the nodes as vectors. GNNs use deep structures to extract node features by iteratively aggregating their neighborhood information, e.g., Graph Convolutional Networks (GCNs) [24] and GraphSAGE [16].

Adversarial learning for networks includes three types of studies: attack, defense, and certifiable robustness [37,21,5]. Adversarial attacks aim to maximally degrade the model performance through perturbing the input data, which includes the modification of node attributes or changes of the network topology. Examples of attacking strategies for GNNs include the non-gradient based NETTACK [56], Mettack using meta learning [58], SL-S2V with reinforcement learning [7], and attacks by rewiring for graph classification [29]. Defense strategies are designed to protect the models from being attacked in many different ways, e.g., by detecting and recovering the perturbations [45], applying adversarial training [13] to resist the worst-case perturbation [11], or transferring the ability to discriminate adversarial edges from exploring clean graphs [40]. Certifiable robustness is similar in essence to adversarial defense, but it focuses on guaranteeing the reliability of the predictions under certain amounts of attacks. The first provable robustness for GNNs was proposed to certify whether a node label will be changed under a bounded attack on node attributes [59], and later a similar certificate for structural attacks was proposed [60]. There are also robustness certifications for graph classification [20,12] and community detection [19]. The most popular combination is GNNs for node or graph classification, while link-level tasks have been explored much less.

Early studies on robustness for link-level tasks usually target traditional link prediction approaches. That includes link prediction attacks that aim to solve specific problems in the social context, e.g., to hide relationships [10,43] or to disguise communities [42], and works that restrict the perturbation type to only adding or only deleting edges [54,53,48], which could result in less efficient attacks or defenses. The robustness of NE-based link prediction is much less investigated than that of classification, and is considered more often as a way to evaluate the robustness of the NE method, such as in [34,2,38]. To the best of our knowledge, there are only two works on adversarial attacks for link prediction based on NE: one targeting the GNN-based SEAL [51] with structural perturbations and one targeting GCN with an iterative gradient attack [4].

3 Preliminaries

In this section, we provide the preliminaries of our work, including the notation, the probabilistic network embedding model CNE that we use for link prediction, and the virtual adversarial training method to which our idea is similar.

3.1 Link Prediction with Probabilistic Network Embedding

Network embedding methods map nodes in a network onto a lower dimensional space as real vectors or distributions, and we work with the former type. Given a network $G = (V, E)$, where $V$ and $E$ are the node and edge set, respectively, a network embedding model finds a mapping $f : V \to \mathbb{R}^d$ for all nodes as $X = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$. Those embeddings $X$ can be used to visualize the network in the $d$-dimensional space, classify nodes based on the similarity between vector pairs, and predict link probabilities between any node pair.

To do link prediction, a network embedding model requires a function $g$ of vectors $x_i$ and $x_j$ to calculate the probability of nodes $i$ and $j$ being linked. This can be done by training a classifier with the links and non-links, or the function may follow naturally from the model. Conditional Network Embedding (CNE) is the probabilistic model on which our work is based, and from which the function $g$ directly follows [22]. Given an undirected network $G = (V, E)$ with adjacency matrix $A$, where $a_{ij} = 1$ if $(i, j) \in E$ and 0 otherwise, CNE finds an optimal embedding $X^*$ that maximizes the probability of the graph conditioned on that embedding, i.e., it maximizes the objective function:

$$P(G \mid X) = \prod_{(i,j) \in E} P(a_{ij} = 1 \mid X) \prod_{(k,l) \notin E} P(a_{kl} = 0 \mid X). \tag{1}$$


To guarantee that connected nodes are embedded closer together and unconnected nodes farther apart, the method uses two half-normal distributions for the distance $d_{ij}$ between nodes $i$ and $j$ conditioned on their connectivity. By optimizing the objective in Eq. (1), CNE finds the most informative embedding $X^*$ and the probability distribution $P(G \mid X^*)$ that defines the link predictor $g(x_i, x_j) = P(a_{ij} = 1 \mid X^*)$.
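As a rough illustration of how a link predictor can follow from two half-normal distance distributions, the sketch below applies Bayes' rule to a prior link probability. The uniform prior, the value of `sigma1`, and the function names are assumptions made for this example (only $\sigma_2 = 2$ appears in the experimental setup), so this should be read as a toy instance rather than the reference CNE implementation.

```python
import numpy as np

def half_normal_pdf(dist, sigma):
    """Density of a half-normal distribution evaluated at distances >= 0."""
    return np.sqrt(2.0 / np.pi) / sigma * np.exp(-dist**2 / (2.0 * sigma**2))

def cne_link_prob(X, prior, sigma1=1.0, sigma2=2.0):
    """Posterior P(a_ij = 1 | X) for every node pair via Bayes' rule.

    X      : (n, d) embedding matrix
    prior  : (n, n) prior link probabilities (assumed given)
    sigma1 : spread of distances for linked pairs (assumed < sigma2)
    sigma2 : spread of distances for unlinked pairs
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    linked = prior * half_normal_pdf(dist, sigma1)
    unlinked = (1.0 - prior) * half_normal_pdf(dist, sigma2)
    return linked / (linked + unlinked)

def log_likelihood(A, P, eps=1e-12):
    """Logarithm of Eq. (1), summed over unordered node pairs i < j."""
    iu = np.triu_indices_from(A, k=1)
    a, p = A[iu], np.clip(P[iu], eps, 1.0 - eps)
    return float(np.sum(a * np.log(p) + (1 - a) * np.log(1 - p)))

# Toy example: nodes 0 and 1 are close together, node 2 is far away.
X = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]])
A = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
prior = np.full((3, 3), 0.3)  # hypothetical uniform prior
P = cne_link_prob(X, prior)
print(P[0, 1], P[0, 2], log_likelihood(A, P))
```

In this toy setting the nearby pair (0, 1) receives a higher link probability than the distant pair (0, 2), which is the behavior the two distance distributions are meant to encode.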

Many network embedding methods purely map nodes into vectors of lower dimensions and focus on node classification, such as the random-walk based ones [35,15,39] and GCNs [24]. Those methods require an extra step to measure the similarities between the pairs of node embeddings for link prediction. Compared to them, CNE is a better option for link prediction. Moreover, CNE provides good explainability for link predictions as $g$ can be expressed analytically [23].

3.2 Virtual Adversarial Attack

Adversarial training achieved great performance for the supervised classification problem [13], and virtual adversarial training (VAT) is better suited to the semi-supervised setting [33]. By identifying the most sensitive ‘virtual’ direction for the classifier, VAT uses regularization to smooth the output distribution. The regularization term is based on the virtual adversarial loss of possible local perturbations on the input data point. Let $x \in \mathbb{R}^d$ and $y \in Q$ denote the input data vector of dimension $d$ and the output label in the space $Q$, respectively. The labeled data is defined as $D_l = \{x_l^{(n)}, y_l^{(n)} \mid n = 1, \ldots, N_l\}$, the unlabeled data as $D_{ul} = \{x_{ul}^{(m)} \mid m = 1, \ldots, N_{ul}\}$, and the output distribution as $p(y \mid x, \theta)$, parametrized by $\theta$. To quantify the influence of any local perturbation on $x_*$ (either $x_l$ or $x_{ul}$), VAT defines the Local Distribution Smoothness (LDS),

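$$\mathrm{LDS}(x_*, \theta) := D\left[p(y \mid x_*, \theta),\; p(y \mid x_* + r_{vadv}, \theta)\right] \tag{2}$$

$$r_{vadv} := \arg\max_{r;\, \|r\|_2 \le \epsilon} D\left[p(y \mid x_*, \theta),\; p(y \mid x_* + r, \theta)\right], \tag{3}$$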

where $D$ can be any non-negative function that measures the divergence between two distributions, and $p(y \mid x, \theta)$ is the current estimate of the true output distribution $q(y \mid x)$. The regularization term is the average LDS over all data points.
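To make Eqs. (2) and (3) concrete, the following sketch evaluates the LDS of a toy softmax classifier, using the KL-divergence as $D$. It replaces VAT's own optimization of $r_{vadv}$ with a plain random search over directions of norm $\epsilon$, which is a simplification chosen for brevity; the classifier, its weights `W`, and all function names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """KL-divergence between two discrete distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def predict(x, W):
    """Toy classifier p(y | x, theta) with theta = W (hypothetical model)."""
    return softmax(W @ x)

def lds(x, W, epsilon=0.1, n_candidates=500, seed=0):
    """Approximate Eqs. (2)-(3) by random search on the sphere ||r||_2 = epsilon."""
    rng = np.random.default_rng(seed)
    p = predict(x, W)
    best_r, best_div = np.zeros_like(x), 0.0
    for _ in range(n_candidates):
        r = rng.normal(size=x.shape)
        r *= epsilon / np.linalg.norm(r)      # rescale to norm epsilon
        div = kl(p, predict(x + r, W))
        if div > best_div:
            best_r, best_div = r, div
    return best_div, best_r

W = np.array([[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]])
x = np.array([0.3, -0.2])
lds_value, r_vadv = lds(x, W)
print(lds_value, r_vadv)
```

The random search only samples the boundary of the constraint set, which is where the maximizer of a smooth divergence is typically expected for small $\epsilon$, but it gives no optimality guarantee.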

Although VAT was designed for classification with tabular data, its idea is essentially similar to our work, i.e., we both quantify the influence of local virtual adversarial perturbations. For us, the perturbation is a change of the link status of a node pair. As we have not yet included training with a regularization term in this work, we focus on finding the $r_{vadv}$ in Eq. (3), that is, on identifying the most sensitive perturbations that will change the link probabilities the most.

4 Quantifying the Sensitivity to Small Perturbations

With the preliminaries, we now formally introduce the specific problem we study in this paper: investigating whether there are any small perturbations to the network that have a large impact on the link prediction performance. The small perturbations we look into are edge flips, which represent either the deletion of an existing edge or the addition of a non-edge. This means that we do not restrict the structural perturbations to merely addition or merely deletion of edges.

Intuitively, the impact of any small virtual adversarial perturbation can be measured by re-training the model. But re-training, namely re-embedding the network using CNE, can be computationally expensive. Therefore, we also investigate approximating the impact both practically, with incremental partial re-embedding, and theoretically, with gradient information.

4.1 Problem Statement and Re-Embedding (RE)

The study of adversarial robustness for link prediction involves identifying the worst-case perturbations on the network topology, namely the changes of the network topology that influence the link prediction results the most. For imperceptibility, we focus on the small structural perturbation of an individual edge flip in this work. Thus, our specific problem is defined as

Problem 1 (Impact of a structural perturbation). Given a network $G = (V, E)$ and a network embedding model, how can we measure the impact of each edge flip in the input network on the link prediction results of the model?

Intuitively, the impact can be measured by treating the edge flip as a virtual attack: we flip the edge and retrain the model on the virtually corrupted network, after which we know how serious the attack is. That means we train CNE on the clean graph $G = (V, E)$ to obtain the link probability distribution $P^* = P(G \mid X^*(A))$. After flipping one edge, we get the corrupted graph $G' = (V, E')$, retrain the model, and obtain a different link probability distribution $Q^* = Q(G' \mid X^*(A'))$. Then we measure the impact of the edge flip as the KL-divergence between $P^*$ and $Q^*$. In this way, we also know how the small perturbation changes the node embeddings, which helps explain the influence of the virtual attack.

If the virtual edge flip is on node pair $(i, j)$, then $a'_{ij} = 1 - a_{ij}$, where $a_{ij}$ is the corresponding entry in the adjacency matrix $A$ of the clean graph and $a'_{ij}$ that in the adjacency matrix $A'$ of the corrupted graph. Re-embedding $G'$ with CNE results in the probability distribution $Q^*(i, j)$. The impact of flipping $(i, j)$, which we consider as the sensitivity of the model to the perturbation on that node pair, denoted as $s(i, j)$, is:

$$s(i, j) = \mathrm{KL}\left[P^* \,\|\, Q^*(i, j)\right]. \tag{4}$$

Measured practically, this KL-divergence is the actual impact of each possible edge flip on the predictions. The optimal embeddings $X^*(A)$ and $X^*(A')$ not only explain the influenced link predictions but also exhibit the result of the flip.

Ranking all node pairs in the network by the sensitivity measure allows us to identify the most and least sensitive links and non-links for further investigation. However, re-embedding the entire network can be computationally expensive, especially for large networks. The sensitivity measure can be approximated both empirically and theoretically, and we show how this can be done in the rest of this section.
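The re-embedding procedure of this subsection can be sketched as a loop over all node pairs: flip, refit, and score with the KL-divergence of Eq. (4). To keep the example self-contained, the model is abstracted as a callable `fit_predict(A)` that returns a matrix of link probabilities; the `toy_fit_predict` used here is a made-up stand-in based on common-neighbor counts, not CNE.

```python
import itertools
import numpy as np

def bernoulli_kl(P, Q, eps=1e-12):
    """Sum of element-wise Bernoulli KL-divergences over node pairs i < j."""
    iu = np.triu_indices_from(P, k=1)
    p = np.clip(P[iu], eps, 1.0 - eps)
    q = np.clip(Q[iu], eps, 1.0 - eps)
    return float(np.sum(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))))

def toy_fit_predict(A):
    """Stand-in for 'embed the graph, return link probabilities' (NOT CNE):
    a logistic function of the common-neighbor count."""
    common = A @ A
    return 1.0 / (1.0 + np.exp(-(common - 1.0)))

def rank_edge_flips(A, fit_predict):
    """Sensitivity s(i, j) of Eq. (4) for every node pair, most sensitive first."""
    P = fit_predict(A)
    scores = {}
    for i, j in itertools.combinations(range(A.shape[0]), 2):
        A_flip = A.copy()
        A_flip[i, j] = A_flip[j, i] = 1 - A[i, j]   # delete a link or add a non-link
        scores[(i, j)] = bernoulli_kl(P, fit_predict(A_flip))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Triangle 0-1-2 with a pendant node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
ranking = rank_edge_flips(A, toy_fit_predict)
for pair, s in ranking:
    print(pair, round(s, 4))
```

The loop calls the model once per node pair, so its cost grows quadratically with the number of nodes, multiplied by the cost of one fit. This is the expense that motivates the approximations.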


4.2 Incremental Partial Re-Embedding (IPRE)

Empirically, one way to decrease the computational cost is to incrementally re-embed only the two nodes of the flipped edge. In this case, our assumption is that the embeddings of all nodes except the two incident to the flipped edge (i.e., nodes $i$ and $j$) stay unchanged, since the perturbation is small and local. We call this Incremental Partial Re-Embedding (IPRE), which allows only $x_i$ and $x_j$ to change if $(i, j)$ is flipped. This means that the impact of the small perturbation on the link probabilities is restricted to the one-hop neighborhood of the two nodes, changing only the link predictions between nodes $i$ and $j$ and the rest of the nodes. The definition of the impact in Eq. (4) still holds, and only the $i$th and $j$th columns and rows in the link probability matrix contribute non-zero values. Compared to RE, IPRE turns out to be a faster yet effective approximation, which we will show with experiments.
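A minimal sketch of the partial re-embedding idea is given below, under several assumptions: the link probability is written in the logistic form $P_{kl} = 1 / (1 + \exp(c + \gamma d_{kl}^2 / 2))$, the optimizer is plain gradient ascent, and the constants `c`, `gamma`, `lr`, and `n_steps` are placeholder values. Only the two rows of the embedding that belong to the flipped pair are updated.

```python
import numpy as np

def link_prob(X, c, gamma):
    """Assumed logistic link probability as a function of squared distance."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return 1.0 / (1.0 + np.exp(c + 0.5 * gamma * d2))

def ipre(X, A_flipped, nodes, c=0.2, gamma=0.75, lr=0.05, n_steps=200):
    """Gradient ascent on the log-likelihood, moving only the rows in `nodes`."""
    X = X.copy()
    for _ in range(n_steps):
        P = link_prob(X, c, gamma)
        for k in nodes:
            w = P[k] - A_flipped[k]          # (P_kl - a_kl) for all l
            w[k] = 0.0                       # exclude the self-pair
            # Log-likelihood gradient w.r.t. x_k: gamma * sum_l (P_kl - a_kl)(x_k - x_l)
            X[k] += lr * gamma * (w[:, None] * (X[k] - X)).sum(axis=0)
    return X

rng = np.random.default_rng(0)
n = 6
X = rng.normal(size=(n, 2))                  # placeholder embedding of the clean graph
A = np.zeros((n, n))
for u in range(n):                           # ring graph 0-1-2-3-4-5-0
    A[u, (u + 1) % n] = A[(u + 1) % n, u] = 1.0

i, j = 0, 3
A_flipped = A.copy()
A_flipped[i, j] = A_flipped[j, i] = 1.0 - A[i, j]

X_new = ipre(X, A_flipped, nodes=(i, j))
P_old = link_prob(X, 0.2, 0.75)
P_new = link_prob(X_new, 0.2, 0.75)
others = [k for k in range(n) if k not in (i, j)]
print(np.abs(P_new - P_old).round(3))
```

Because the other embeddings are frozen, the printed difference matrix is zero outside rows and columns `i` and `j`, matching the locality assumption stated above.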

4.3 Theoretical Approximation of the KL-Divergence

Incrementally re-embedding only the two nodes of the flipped edge is faster, but it still requires re-training the model. Although our input is non-iid, in contrast to the tabular data used in VAT [33], we can formulate our problem as in Eq. (5), whose solution is the most sensitive structural perturbation for link prediction.

$$\Delta A := \arg\max_{\Delta A;\, \|\Delta A\| = 2} \mathrm{KL}\left[P(G \mid X^*(A)),\; P(G \mid X^*(A + \Delta A))\right]. \tag{5}$$

CNE has its link probability distribution expressed analytically, so the impact of changing the link status of node pair $(i, j)$, represented by the KL-divergence in Eq. (4), can be approximated theoretically. Given the clean graph $G$, CNE learns the optimal link probability distribution $P^* = P(G \mid X^*(A))$, whose entries are $P^*_{kl} = P(a_{kl} = 1 \mid X^*)$. Let $Q^*(i, j)$ be the optimal link probability distribution of the corrupted graph $G'$ with only $(i, j)$ flipped from the clean graph. The impact of the flip $s(i, j)$ can be decomposed as

$$s(i, j) = \mathrm{KL}\left[P^* \,\|\, Q^*(i, j)\right] = \sum \left[ p \log \frac{p}{q} + (1 - p) \log \frac{1 - p}{1 - q} \right], \tag{6}$$

where $p$ and $q$ are entries of $P^*$ and $Q^*(i, j)$, respectively. We can approximate $s(i, j)$ at $G$, or equivalently, at $P^*$, as $G$ is close to $G'$ and thus $P^*$ is close to $Q^*(i, j)$.

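The first-order approximation of $s(i, j)$ is a constant because at $G$ its gradient $\frac{\partial \mathrm{KL}[P^* \| Q^*(i, j)]}{\partial a_{ij}} = 0$, so we turn to the second-order approximation in Eq. (7), which, evaluated at $G$, gives $s(i, j)$ in Eq. (8). That requires the gradient of each link probability w.r.t. the edge flip, i.e., $\frac{\partial p}{\partial a_{ij}} = \frac{\partial P^*_{kl}}{\partial a_{ij}}$. We now show how to compute it with CNE.

$$s(i, j) \approx \frac{\partial \mathrm{KL}\left[P^* \| Q^*(i, j)\right]}{\partial a_{ij}} \Delta A + \frac{1}{2} \frac{\partial^2 \mathrm{KL}\left[P^* \| Q^*(i, j)\right]}{\partial a_{ij}^2} \Delta A^2, \tag{7}$$

$$s(i, j) = \frac{1}{2} \sum \frac{1}{p (1 - p)} \left[ \frac{\partial p}{\partial a_{ij}} \right]^2. \tag{8}$$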
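The step from Eq. (7) to Eq. (8) rests on the second-order expansion of the Bernoulli KL-divergence, $\mathrm{KL}(p \,\|\, p + \delta) \approx \delta^2 / (2 p (1 - p))$, with $\delta$ playing the role of $\frac{\partial p}{\partial a_{ij}} \Delta A$ and $\Delta A^2 = 1$ for an edge flip. The short numerical check below compares the exact divergence with this quadratic term for a few hand-picked values of $p$ and $\delta$ (the values are arbitrary illustrations).

```python
import numpy as np

def bernoulli_kl(p, q):
    """Exact KL-divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def second_order_kl(p, delta):
    """Quadratic Taylor term of KL(p || p + delta) around delta = 0."""
    return delta**2 / (2.0 * p * (1.0 - p))

p = np.array([0.1, 0.5, 0.9])
delta = np.array([0.01, -0.02, 0.005])   # small changes in link probability
exact = bernoulli_kl(p, p + delta)
approx = second_order_kl(p, delta)
for row in zip(p, delta, exact, approx):
    print("p=%.2f  delta=%+.3f  exact=%.3e  approx=%.3e" % row)
```

For changes of this size the two columns agree to within a few percent; the agreement degrades as $\delta$ grows or as $p$ approaches 0 or 1.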


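The gradient. At the graph level, the gradient of a link probability $P^*_{kl}$ for node pair $(k, l)$ w.r.t. the input graph $A$ is $\frac{\partial P^*_{kl}}{\partial A} = \frac{\partial P^*_{kl}}{\partial X^*(A)} \frac{\partial X^*(A)}{\partial A}$, while at the node pair level, the gradient of $P^*_{kl}$ w.r.t. $a_{ij}$ is

$$\frac{\partial P^*_{kl}}{\partial a_{ij}} = \frac{\partial P^*_{kl}}{\partial x^*(A)} \frac{\partial x^*(A)}{\partial a_{ij}} \tag{9}$$

$$= x^{*T}(A)\, E_{kl} E_{kl}^T \left[ \frac{-H}{\gamma^2 P^*_{kl} (1 - P^*_{kl})} \right]^{-1} E_{ij} E_{ij}^T\, x^*(A), \tag{10}$$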

where, for clearer presentation, we flatten the matrix $X$ to a vector $x$ of size $nd \times 1$, $E_{kl}$ is a column block matrix consisting of $n$ blocks of size $d \times d$, where the $k$-th and $l$-th blocks are the positive and negative identity matrices $I$ and $-I$, respectively, and all other blocks are zero, and $H$ is the full Hessian below

$$H = \gamma \sum_{u \neq v} \left[ (P^*_{uv} - a_{uv})\, E_{uv} E_{uv}^T - \gamma P^*_{uv} (1 - P^*_{uv})\, E_{uv} E_{uv}^T\, x^*(A)\, x^{*T}(A)\, E_{uv} E_{uv}^T \right].$$

The gradient reflects the fact that the change of a link status in the network influences the embeddings $x^*$, and then the impact is transferred through $x^*$ to the link probabilities of the entire graph. In other words, if an important relation (in a relatively small network) is perturbed, it could cause large changes in many $P^*_{kl}$, deviating them from their predicted values with the clean graph.
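The block notation can be checked numerically. The sketch below builds the selector matrix $E_{kl}$, confirms that $E_{kl}^T x$ equals $x_k - x_l$, and assembles the full Hessian for a four-node toy graph. It assumes that the sum over $u \neq v$ counts each unordered pair once and uses random placeholder values for the embedding and the link probabilities, since the algebraic structure does not depend on them.

```python
import numpy as np

def selector(n, d, k, l):
    """E_kl: (n*d, d) block column with I at block k, -I at block l, zeros elsewhere."""
    E = np.zeros((n * d, d))
    E[k * d:(k + 1) * d] = np.eye(d)
    E[l * d:(l + 1) * d] = -np.eye(d)
    return E

def full_hessian(X, A, P, gamma):
    """Full (n*d, n*d) Hessian, summing each unordered pair once (assumption)."""
    n, d = X.shape
    x = X.reshape(-1)                      # flattened embedding [x_0, x_1, ...]
    H = np.zeros((n * d, n * d))
    for u in range(n):
        for v in range(u + 1, n):
            E = selector(n, d, u, v)
            EEt = E @ E.T
            y = EEt @ x                    # (x_u - x_v) at block u, its negative at block v
            H += (P[u, v] - A[u, v]) * EEt
            H -= gamma * P[u, v] * (1 - P[u, v]) * np.outer(y, y)
    return gamma * H

n, d, gamma = 4, 2, 0.75
rng = np.random.default_rng(1)
X = rng.normal(size=(n, d))                # placeholder embedding
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
P = rng.uniform(0.1, 0.9, size=(n, n))
P = (P + P.T) / 2                          # placeholder symmetric probabilities

diff = selector(n, d, 0, 2).T @ X.reshape(-1)
H = full_hessian(X, A, P, gamma)

# Diagonal block for node k, computed directly from per-node differences.
k = 0
Hk = gamma * sum(
    (P[k, l] - A[k, l]) * np.eye(d)
    - gamma * P[k, l] * (1 - P[k, l]) * np.outer(X[k] - X[l], X[k] - X[l])
    for l in range(n) if l != k
)
print(diff, X[0] - X[2])
print(np.allclose(H[:d, :d], Hk))
```

Under the stated convention, the top-left $d \times d$ block of the assembled matrix coincides with the per-node expression built from the differences $x_k - x_l$.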

The gradient in Eq. (10) is exact and measures the impact all over the network. However, computing the inverse of the full Hessian can be expensive when the network is large. Fortunately, $H$ can be well approximated by its diagonal blocks [23], each of size $d \times d$. So we can approximate the impact of an individual edge flip, $s(i, j)$, at a very low cost using

$$\frac{\partial P^*_{kl}}{\partial a_{ki}} = (x^*_k - x^*_l)^T \left[ \frac{-H_k}{\gamma^2 P^*_{kl} (1 - P^*_{kl})} \right]^{-1} (x^*_k - x^*_i), \tag{11}$$

where $H_k = \gamma \sum_{l : l \neq k} \left[ (P^*_{kl} - a_{kl}) I - \gamma P^*_{kl} (1 - P^*_{kl}) (x^*_k - x^*_l)(x^*_k - x^*_l)^T \right]$ is the $k$th diagonal block of $H$. Here $P^*_{kl}$ is assumed to be influenced only by $x_k$ and $x_l$, thus only the edge flips involving node $k$ or $l$ will result in a non-zero gradient for $P^*_{kl}$. It essentially corresponds to IPRE, where only the attacked nodes are allowed to move in the embedding space. In fact, as the network size grows, local perturbations are not likely to spread their influence broadly. We will show empirically that this theoretical approximation is both efficient and effective.
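Putting Eq. (8) and Eq. (11) together, a block-diagonal sensitivity estimate could be sketched as follows. Several choices here are assumptions made for illustration: the logistic form of the link probability, the constants `c` and `gamma`, the random placeholder embedding (which is not an optimum of the CNE objective), and the decision to add the contributions through $x_i$ and $x_j$ for the flipped pair itself.

```python
import itertools
import numpy as np

def link_prob(X, c=0.2, gamma=0.75):
    """Assumed logistic link probability as a function of squared distance."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return 1.0 / (1.0 + np.exp(c + 0.5 * gamma * d2))

def hessian_block(k, X, A, P, gamma):
    """H_k: the k-th d x d diagonal block of the Hessian."""
    d = X.shape[1]
    H = np.zeros((d, d))
    for l in range(X.shape[0]):
        if l != k:
            diff = X[k] - X[l]
            H += (P[k, l] - A[k, l]) * np.eye(d)
            H -= gamma * P[k, l] * (1.0 - P[k, l]) * np.outer(diff, diff)
    return gamma * H

def approx_sensitivity(i, j, X, A, P, gamma=0.75):
    """Eq. (8) with gradients from Eq. (11), over pairs that touch i or j."""
    grads = {}
    for k, other in ((i, j), (j, i)):
        Hk_inv = np.linalg.inv(hessian_block(k, X, A, P, gamma))
        for l in range(X.shape[0]):
            if l == k:
                continue
            p = P[k, l]
            # [-H_k / (gamma^2 p (1 - p))]^{-1} equals -gamma^2 p (1 - p) H_k^{-1}
            g = -gamma**2 * p * (1.0 - p) * (X[k] - X[l]) @ Hk_inv @ (X[k] - X[other])
            pair = (min(k, l), max(k, l))
            grads[pair] = grads.get(pair, 0.0) + g
    return 0.5 * sum(g**2 / (P[a, b] * (1.0 - P[a, b])) for (a, b), g in grads.items())

rng = np.random.default_rng(0)
n = 6
X = rng.normal(size=(n, 2))                 # placeholder embedding, not an optimum
A = np.zeros((n, n))
for u in range(n):                          # ring graph 0-1-2-3-4-5-0
    A[u, (u + 1) % n] = A[(u + 1) % n, u] = 1.0
P = link_prob(X)

scores = {pair: approx_sensitivity(*pair, X, A, P)
          for pair in itertools.combinations(range(n), 2)}
for pair, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(pair, s)
```

Each score needs only two $d \times d$ inversions and a pass over the nodes, with no re-training, which is consistent with the low cost claimed for the approximation.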

5 Experiments

To evaluate our work, we first focus on illustrating the robustness of CNE for link prediction with two case studies, using two networks of relatively small sizes. Then we evaluate the approximated sensitivity for node pairs on larger networks. The research questions we want to investigate are:


– How to understand the sensitivity of CNE to an edge flip for link prediction?

– What are the characteristics of the most and least sensitive perturbations for link prediction using CNE?

– What are the quality and the runtime performance of the approximations?

Data. The data we use includes six real-world networks of varying sizes. Karate is a social network of 34 members in a university karate club, which has 78 friendship connections [49]. The Polbooks network describes 441 Amazon co-purchasing relations among 105 books about US politics [1]. C.elegans is a neural network of the nematode C.elegans with 297 neurons linked by 2148 synapses [44]. USAir is a transportation network of 332 airports as nodes and 2126 airlines connecting them as links [17]. MP is the largest connected part of a Twitter friendship network for the Members of Parliament (MP) in the UK during April 2019, having 567 nodes and 49631 edges [6]. Polblogs is a network with 1222 political blogs as nodes and 16714 hyperlinks as undirected edges, which is the largest connected part of the US political blogs network from [1].

Setup. We do not use a train-test split, because we want to measure the sensitivity of all link probabilities of CNE to all small perturbations of the network. The CNE parameters are $\sigma_2 = 2$, $d = 2$ for the case studies and $d = 8$ for evaluating the approximation quality, a learning rate of 0.2, max_iter = 2k, and ftol = 1e-7.

5.1 Case Studies

The first two research questions will be answered with the case studies on Karate and Polbooks, which are relatively small and can thus be visualized clearly. Both networks also have ground-truth communities, which contributes to our analysis. With Karate, we show how the small perturbations influence link probabilities via node embeddings. On Polbooks, we analyze the characteristics of the most and least sensitive perturbations. Note that we use dimension 2 for both the visualization of CNE embeddings and the calculation of the sensitivity.

Karate. To show the process of attacking CNE link prediction on Karate, we illustrate and analyze how the most sensitive edge deletion and addition affect the model in predicting links. With the RE approach, we measure the model sensitivity to a single edge flip and list the top 5 sensitive perturbations in Table 1. The most sensitive perturbation, the deletion of link (1, 12), disconnects the network, and we do not consider this type of perturbation in our work because it is obvious and easy to detect. We see that the other top sensitive perturbations are all cross-community, and we pick node pairs (1, 32) and (6, 30) for further study.

Fig. 1 shows the CNE embeddings of the clean Karate and the perturbed graphs, where the communities are differentiated with green and red color. CNE embeddings might have overlapping nodes when $d = 2$, such as nodes 6 and 7, because they have the same neighbors, but this will not be a problem if $d$ is higher.

Table 1. The Top 5 Sensitive Perturbations

Rank  Node Pair  s(i, j)  A[i, j]  Community?
1     (1, 12)    12.30    1        within
2     (1, 32)    2.52     1        cross
3     (20, 34)   1.96     1        cross
4     (6, 30)    1.75     0        cross
5     (7, 30)    1.75     0        cross

Table 2. Runtime in seconds

Network    RE      IPRE    Approx
Polbooks   0.889   0.117   0.00012
C.elegans  2.819   0.568   0.00045
USAir      6.206   0.781   0.00043
MP         8.539   2.289   0.00116
Polblogs   45.456  27.648  0.00124

Fig. 1. Case study on Karate with the most sensitive perturbations: (a) clean graph, (b) after deleting (1, 32), (c) after adding (6, 30).

The deletion of edge (1, 32) is marked with a cross in Fig. 1 (a), after which the changed node embeddings are shown in Fig. 1 (b). Even accounting for the rotation of the embedding, the relative locations of the nodes change a lot, especially nodes 1 and 32 and those on the boundary between the communities, e.g., nodes 3 and 10. Node 1 is pushed away from the red nodes, and as the center of the green nodes, it plays an essential role in affecting many other link probabilities. Compared to other cross-community edges, (1, 32) is the most sensitive because both nodes have each other as their only cross-community link. So the deletion largely decreases the probability of their neighbors connecting to the other community. Moreover, node 1 has a high degree. Therefore, it makes sense that this is the most sensitive edge deletion.

The addition of edge (6, 30) is marked as a dashed arc in Fig. 1 (a), and the case is similar for (7, 30). Adding the edge changes the node locations as shown in Fig. 1 (c). The distant tail in green that ends with node 17 moves closer to the red community. Note that both nodes 6 and 30 had only within-community links before the perturbation. Even though their degrees are not very high, the added edge changes the probabilities of many cross-community links from almost zero to some degree of existence, pulling nodes toward the other community.

Polbooks. Polbooks has three types of political books, which are liberal (L), neutral (N), and conservative (C), marked with colors red, purple, and blue, respectively. Table 3 shows the most and least sensitive perturbations: the top 2 deletions and the top 5 additions in each case. We list more additions than deletions as real networks are usually sparse. The rank is based on the sensitivity measure, thus the non-sensitive perturbations are ranked at the bottom (i.e., down to rank 5460). We then mark those perturbations in the CNE embeddings, for edge deletions and additions separately.

Table 3. The Top Sensitive (S) and Non-Sensitive (Non-S) Perturbations

Perturbation           Rank  Node Pair  s(i, j)  Community
Edge Deletion - S      1     (46, 102)  16.91    N-L
Edge Deletion - S      15    (7, 58)    14.64    N-C
Edge Deletion - Non-S  5460  (72, 75)   0.033    L-L
Edge Deletion - Non-S  5459  (8, 12)    0.034    C-C
Edge Addition - S      2     (3, 98)    15.53    C-L
Edge Addition - S      3     (3, 87)    15.42    C-L
Edge Addition - S      4     (28, 33)   14.98    N-C
Edge Addition - S      5     (25, 98)   14.96    C-L
Edge Addition - S      6     (25, 91)   14.92    C-L
Edge Addition - Non-S  5458  (37, 39)   0.035    C-C
Edge Addition - Non-S  5454  (8, 47)    0.036    C-C
Edge Addition - Non-S  5451  (33, 35)   0.038    C-C
Edge Addition - Non-S  5449  (30, 71)   0.039    L-L
Edge Addition - Non-S  5438  (66, 75)   0.042    L-L

The edge deletions are marked in Fig. 2, and we see that the most sensitive ones are cross-community while the least sensitive ones are within-community. Similar to the Karate case, the nodes of pair (46, 102) have each other as their only cross-community link, so deleting it affects the node embeddings significantly. Edge (7, 58) is on the boundary between liberal and conservative nodes, and it involves a neutral book. As the predictions on the boundary are already uncertain, one edge deletion can shift many predictions, resulting in high sensitivity. The least sensitive edge deletions are not only within-community, but are also between high-degree nodes, i.e., $d_{72} = 22$, $d_{75} = 16$, $d_8 = d_{12} = 25$. These nodes have already been well connected to nodes of the same type, thus they have stable embeddings and the deletions have little influence on relevant predictions.

We mark the edge additions separately for the sensitive and non-sensitive perturbations in Fig. 3, to contrast their difference. Fig. 3 (a) shows that the top 5 sensitive edge additions are all cross-community, and all include at least one node located far from the opposing community, i.e., nodes 33, 91, 87, 98. Being distant means those nodes have only within-community connections, so adding a cross-community link would confuse the link predictor on the predictions for many relevant node pairs. Meanwhile, as the sensitive perturbations involve low-degree nodes, they are usually unnoticeable while weighted highly by those nodes. The non-sensitive edge additions are similar to the non-sensitive deletions in the sense that both have the pair of nodes embedded closely. As long as the two nodes are mapped closely in the embedding space, it makes little difference whether they are connected, and the node degree does not matter much.

Interestingly, our observations in the case studies agree only partially with a heuristic community detection attack strategy called DICE [42], which has been used as a baseline for attacking link prediction in [4]. Inspired by modularity, DICE randomly disconnects internally and connects externally [42], with the goal of hiding a group of nodes from being detected as a community. Our analysis agrees with connecting externally, while for link prediction the disconnection should also be external, meaning that disconnecting internally might not work for link prediction. If the internal disconnections are sampled from node pairs that are closely positioned, the attack will have little influence. Therefore, it might not be suitable to use DICE for link prediction attacks.

5.2 Quality and Runtime of Approximations

We use the sensitivity measured by re-embedding (RE) as the ground truth impact of the small perturbations. The quality of an approximation is determined by how close it is to the ground truth. As the sensitivity is a ranked measure, we



Fig. 2. Case study on Polbooks with the most and least sensitive edge deletion.

Fig. 3. Case study on Polbooks with the most and least sensitive edge addition: (a) sensitive, (b) non-sensitive.

use the normalized discounted cumulative gain (NDCG) to evaluate the quality of the empirical approximation IPRE and the theoretical approximation with the diagonal Hessian blocks, Approx. The closer the NDCG value is to 1, the better. We do not include the theoretical approximation with the exact Hessian because it can be more computationally expensive than RE for large networks. To show the significance, the p-value of each NDCG is computed with a randomization test of 1,000 samples. The runtime for computing the sensitivity of one edge flip is recorded on a server with an Intel Xeon Gold CPU at 3.00GHz and 1024GB RAM.
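The evaluation protocol above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code used for the reported results: the gain is taken to be the ground-truth sensitivity itself with a log2 position discount, the sensitivities are assumed to be non-negative with at least one positive value, and the randomization test is assumed to compare the observed NDCG against the NDCG of randomly shuffled approximate scores. The names `ndcg` and `randomization_p_value` are hypothetical.

```python
import numpy as np


def ndcg(true_scores, approx_scores):
    """NDCG of the ranking induced by approx_scores, with gains from true_scores.

    Assumption: linear gain and log2 discount over the full ranking.
    """
    true_scores = np.asarray(true_scores, dtype=float)
    approx_scores = np.asarray(approx_scores, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, true_scores.size + 2))
    # Rank perturbations by decreasing approximate sensitivity.
    order = np.argsort(-approx_scores, kind="stable")
    dcg = np.sum(true_scores[order] * discounts)
    # Ideal DCG: perturbations ranked by the ground truth itself.
    ideal = np.sum(np.sort(true_scores)[::-1] * discounts)
    return dcg / ideal


def randomization_p_value(true_scores, approx_scores, n_samples=1000, seed=0):
    """Fraction of random rankings whose NDCG is at least the observed NDCG."""
    rng = np.random.default_rng(seed)
    approx_scores = np.asarray(approx_scores, dtype=float)
    observed = ndcg(true_scores, approx_scores)
    hits = sum(
        ndcg(true_scores, rng.permutation(approx_scores)) >= observed
        for _ in range(n_samples)
    )
    return observed, hits / n_samples


# Toy example with made-up sensitivities for six edge flips.
re_scores = [5.0, 3.0, 2.0, 1.0, 0.5, 0.1]      # stand-in for RE
approx_scores = [4.0, 3.5, 1.0, 1.5, 0.4, 0.2]  # stand-in for an approximation
value, p_value = randomization_p_value(re_scores, approx_scores)
print(f"NDCG = {value:.4f}, p-value = {p_value}")
```

With this estimator, a p-value of exactly 0.0 means that none of the sampled random rankings reached the observed NDCG.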

Table 4 shows the quality of the approximations on five real-world networks. The first two columns show how well IPRE and Approx approximate RE, and the third column shows how well Approx approximates IPRE. The NDCG values in the table are all high and statistically significant. Compared to Approx, IPRE better approximates RE on all networks except the smallest one (Polbooks), and as the network size gets relatively large, its NDCG is always larger than 0.99, indicating that the larger the network, the more local the impact of a small perturbation. For Approx, the NDCG for approximating RE is high across datasets, and it is generally even higher for approximating IPRE. The reason is that both Approx and IPRE essentially make the same assumption that the influence of the perturbation will be spread only to the one-hop neighborhood.

The approximations are not only effective, but also time-efficient. We see in Table 2 that RE is the slowest, IPRE is faster, and Approx is significantly faster than the two empirical approaches, especially for larger networks. On the Polblogs network, Approx is 36k times faster than RE and 22k times faster than IPRE. This shows that our method also scales to large networks.

Table 4. Quality of the Approximations - NDCG

ground truth           RE                                    IPRE
approximation          IPRE               Approx             Approx
                       NDCG     p-value   NDCG     p-value   NDCG     p-value
Polbooks (n = 105)     0.9691   0.0       0.9700   0.0       0.9873   0.0
C.elegans (n = 297)    0.9977   0.0       0.9880   0.0       0.9905   0.0
USAir (n = 332)        0.9902   0.0       0.9697   0.0       0.9771   0.0
MP (n = 567)           0.9985   0.0       0.9961   0.0       0.9960   0.0
Polblogs (n = 1222)    0.9962   0.0       0.9897   0.0       0.9899   0.0

6 Conclusion

In this work we study the adversarial robustness of a probabilistic network embedding model, CNE, for the link prediction task by measuring the sensitivity of the link predictions of the model to small adversarial perturbations of the network. Our approach allows us to identify the most vulnerable links and non-links that, if perturbed, will have a large impact on the model's link prediction performance. These can be used for further investigation, such as defending against attacks by protecting them. With two case studies, we analyze the characteristics of the most and least sensitive perturbations for link prediction with CNE. Then we empirically confirm that our theoretical approximation of the sensitivity measure is both effective and efficient, meaning that the worst-case perturbations for link prediction using CNE can be identified successfully in a time-efficient manner with our method. For future work, we plan to explore the potential of our theoretical approximation to construct a regularizer for adversarially robust network embedding or to develop robustness certificates for link prediction.

Acknowledgement

The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) (ERC Grant Agreement no. 615517), and under the European Union's Horizon 2020 research and innovation programme (ERC Grant Agreement no. 963924), from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme, and from the FWO (project no. G091017N, G0F9816N, 3G042220).

References

1. Adamic, L.A., Glance, N.: The Political Blogosphere and the 2004 U.S. Election: Divided They Blog. In: Proc. of LinkKDD 2005. pp. 36–43 (2005)

2. Bojchevski, A., Günnemann, S.: Adversarial Attacks on Node Embeddings via Graph Poisoning. In: Proc. of the 36th ICML. pp. 695–704 (2019)

3. Bojchevski, A., Günnemann, S.: Certifiable Robustness to Graph Perturbations. In: Proc. of the 33rd NeurIPS. vol. 32 (2019)

4. Chen, J., Lin, X., Shi, Z., Liu, Y.: Link Prediction Adversarial Attack via Iterative Gradient Attack. IEEE Trans. Comput. Soc. Syst. 7(4), 1081–1094 (2020)

5. Chen, L., Li, J., Peng, J., Xie, T., Cao, Z., Xu, K., He, X., Zheng, Z.: A Survey of Adversarial Learning on Graphs. arXiv preprint arXiv:2003.05730 (2020)

6. Chen, X., Kang, B., Lijffijt, J., De Bie, T.: ALPINE: Active Link Prediction Using Network Embedding. Applied Sciences 11(11), 5043 (2021)

7. Dai, H., Li, H., Tian, T., Huang, X., Wang, L., Zhu, J., Song, L.: Adversarial attack on graph structured data. In: Proc. of the 35th ICML. pp. 1115–1124 (2018)

8. Dai, Q., Li, Q., Tang, J., Wang, D.: Adversarial Network Embedding. In: Proc. of the 32nd AAAI. vol. 32 (2018)

9. Dai, Q., Shen, X., Zhang, L., Li, Q., Wang, D.: Adversarial Training Methods for Network Embedding. In: Proc. of the 28th WWW. pp. 329–339 (2019)

10. Fard, A.M., Wang, K.: Neighborhood Randomization for Link Privacy in Social Network Analysis. World Wide Web 18(1), 9–32 (2015)

11. Feng, F., He, X., Tang, J., Chua, T.S.: Graph adversarial training: Dynamically regularizing based on graph structure. IEEE Trans. Knowl. Data Eng. 33(6), 2493–2504 (2021)

12. Gao, Z., Hu, R., Gong, Y.: Certified Robustness of Graph Classification against Topology Attack with Randomized Smoothing. In: Proc. of the GLOBECOM 2020. pp. 1–6 (2020)

13. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and Harnessing Adversarial Examples. In: Proc. of the 3rd ICLR (2015)

14. Gori, M., Monfardini, G., Scarselli, F.: A New Model for Learning in Graph Domains. In: Proc. of 2005 IEEE IJCNN. vol. 2, pp. 729–734 (2005)

15. Grover, A., Leskovec, J.: node2vec: Scalable Feature Learning for Networks. In: Proc. of the 22nd ACM SIGKDD. pp. 855–864 (2016)

16. Hamilton, W., Ying, Z., Leskovec, J.: Inductive Representation Learning on Large Graphs. In: Proc. of the 31st NeurIPS. vol. 30 (2017)

17. Handcock, M.S., Hunter, D.R., Butts, C.T., Goodreau, S.M., Morris, M.: statnet: An R package for the Statistical Modeling of Social Networks. Web page http://www.csde.washington.edu/statnet (2003)

18. IEEE: IEEE Standard Glossary of Software Engineering Terminology. IEEE Std 610.12-1990 pp. 1–84 (1990). https://doi.org/10.1109/IEEESTD.1990.101064

19. Jia, J., Wang, B., Cao, X., Gong, N.Z.: Certified Robustness of Community Detection against Adversarial Structural Perturbation via Randomized Smoothing. In: Proc. of the 29th WWW. pp. 2718–2724 (2020)

20. Jin, H., Shi, Z., Peruri, V.J.S.A., Zhang, X.: Certified Robustness of Graph Convolution Networks for Graph Classification under Topological Attacks. In: Proc. of the 34th NeurIPS. vol. 33, pp. 8463–8474 (2020)

21. Jin, W., Li, Y., Xu, H., Wang, Y., Tang, J.: Adversarial attacks and defenses on graphs: A review and empirical study. arXiv preprint arXiv:2003.00653 (2020)

22. Kang, B., Lijffijt, J., De Bie, T.: Conditional Network Embeddings. In: Proc. of the 7th ICLR (2019)

23. Kang, B., Lijffijt, J., De Bie, T.: ExplaiNE: An Approach for Explaining Network Embedding-based Link Predictions. arXiv preprint arXiv:1904.12694 (2019)

24. Kipf, T.N., Welling, M.: Semi-Supervised Classification with Graph Convolutional Networks. In: Proc. of the 5th ICLR (2017)

25. Liben-Nowell, D., Kleinberg, J.: The Link-Prediction Problem for Social Networks. J. Am. Soc. Inf. Sci. Technol. 58(7), 1019–1031 (2007)

26. Lin, W., Ji, S., Li, B.: Adversarial Attacks on Link Prediction Algorithms Based on Graph Neural Networks. In: Proc. of the 15th ACM AsiaCCS. pp. 370–380 (2020)

27. Liu, X., Tang, J.: Network Representation Learning: A Macro and Micro View

28. Liu, Z., Larson, M.: Adversarial Item Promotion: Vulnerabilities at the Core of Top-N Recommenders That Use Images to Address Cold Start. In: Proc. of the 30th WWW. pp. 3590–3602 (2021)

29. Ma, Y., Wang, S., Derr, T., Wu, L., Tang, J.: Attacking Graph Convolutional Networks via Rewiring. arXiv preprint arXiv:1906.03750 (2019)

30. Mara, A.C., Lijffijt, J., De Bie, T.: Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress? In: Proc. of the 7th IEEE DSAA. pp. 138–147 (2020)

31. Martínez, V., Berzal, F., Cubero, J.C.: A Survey of Link Prediction in Complex Networks. ACM Comput. Surv. 49(4), 1–33 (2016)

32. Mirzasoleiman, B., Cao, K., Leskovec, J.: Coresets for Robust Training of Deep Neural Networks against Noisy Labels. In: Proc. of the 34th NeurIPS. vol. 33, pp. 11465–11477 (2020)

33. Miyato, T., Maeda, S.i., Koyama, M., Ishii, S.: Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning. IEEE PAMI 41(8), 1979–1993 (2018)

34. Pan, S., Hu, R., Long, G., Jiang, J., Yao, L., Zhang, C.: Adversarially Regularized Graph Autoencoder for Graph Embedding. In: Proc. of the 27th IJCAI. pp. 2609–2615 (2018)

35. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: Online Learning of Social Representations. In: Proc. of the 20th ACM SIGKDD. pp. 701–710 (2014)

36. Qiu, J., Dong, Y., Ma, H., Li, J., Wang, K., Tang, J.: Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In: Proc. of the 11th ACM WSDM. pp. 459–467 (2018)

37. Sun, L., Dou, Y., Yang, C., Wang, J., Yu, P.S., He, L., Li, B.: Adversarial attack and defense on graph data: A survey. arXiv preprint arXiv:1812.10528 (2018)

38. Sun, M., Tang, J., Li, H., Li, B., Xiao, C., Chen, Y., Song, D.: Data poisoning attack against unsupervised node embedding methods. arXiv preprint arXiv:1810.12881 (2018)

39. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: LINE: Large-scale information network embedding. In: Proc. of the 24th WWW. pp. 1067–1077 (2015)

40. Tang, X., Li, Y., Sun, Y., Yao, H., Mitra, P., Wang, S.: Transferring Robustness for Graph Neural Network against Poisoning Attacks. In: Proc. of the 13th WSDM. pp. 600–608 (2020)


41. Wang, X., Cui, P., Wang, J., Pei, J., Zhu, W., Yang, S.: Community Preserving Network Embedding. In: Proc. of the 31st AAAI. vol. 31 (2017)

42. Waniek, M., Michalak, T.P., Wooldridge, M.J., Rahwan, T.: Hiding individuals and communities in a social network. Nature Human Behaviour 2(2), 139–147 (2018)

43. Waniek, M., Zhou, K., Vorobeychik, Y., Moro, E., Michalak, T.P., Rahwan, T.: How to Hide One's Relationships from Link Prediction Algorithms. Scientific Reports 9(1), 1–10 (2019)

44. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393(6684), 440–442 (1998)

45. Wu, H., Wang, C., Tyshetskiy, Y., Docherty, A., Lu, K., Zhu, L.: Adversarial Examples for Graph Data: Deep Insights into Attack and Defense. In: Proc. of the 28th IJCAI. pp. 4816–4823 (2019)

46. Xu, K., Chen, H., Liu, S., Chen, P.Y., Weng, T.W., Hong, M., Lin, X.: Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective. In: Proc. of the 28th IJCAI. pp. 3961–3967 (2019)

47. Yang, G., Gong, N.Z., Cai, Y.: Fake Co-visitation Injection Attacks to Recommender Systems. In: Proc. of the 24th NDSS (2017)

48. Yu, S., Zhao, M., Fu, C., Zheng, J., Huang, H., Shu, X., Xuan, Q., Chen, G.: Target Defense Against Link-Prediction-Based Attacks via Evolutionary Perturbations. IEEE Trans. Knowl. Data Eng. 33(2), 754–767 (2021)

49. Zachary, W.W.: An Information Flow Model for Conflict and Fission in Small Groups. J. Anthropol. Res. 33(4), 452–473 (1977)

50. Zhang, H., Li, Y., Ding, B., Gao, J.: Practical Data Poisoning Attack against Next-Item Recommendation. In: Proc. of the 29th WWW. pp. 2458–2464 (2020)

51. Zhang, M., Chen, Y.: Link Prediction Based on Graph Neural Networks. In: Proc. of the 32nd NeurIPS. vol. 31 (2018)

52. Zheng, C., Zong, B., Cheng, W., Song, D., Ni, J., Yu, W., Chen, H., Wang, W.: Robust Graph Representation Learning via Neural Sparsification. In: Proc. of the 37th ICML. pp. 11458–11468 (2020)

53. Zhou, K., Michalak, T.P., Vorobeychik, Y.: Adversarial Robustness of Similarity-based Link Prediction. In: Proc. of the 19th IEEE ICDM. pp. 926–935 (2019)

54. Zhou, K., Michalak, T.P., Waniek, M., Rahwan, T., Vorobeychik, Y.: Attacking Similarity-Based Link Prediction in Social Networks. In: Proc. of the 18th AAMAS. pp. 305–313 (2019)

55. Zhu, D., Zhang, Z., Cui, P., Zhu, W.: Robust graph convolutional networks against adversarial attacks. In: Proc. of the 25th ACM SIGKDD. pp. 1399–1407 (2019)

56. Zügner, D., Akbarnejad, A., Günnemann, S.: Adversarial attacks on neural networks for graph data. In: Proc. of the 24th ACM SIGKDD. pp. 2847–2856 (2018)

57. Zügner, D., Borchert, O., Akbarnejad, A., Günnemann, S.: Adversarial Attacks on Graph Neural Networks: Perturbations and their Patterns. ACM Trans. Knowl. Discov. Data 14(5), 1–31 (2020)

58. Zügner, D., Günnemann, S.: Adversarial Attacks on Graph Neural Networks via Meta Learning. In: Proc. of the 7th ICLR (2019)

59. Zügner, D., Günnemann, S.: Certifiable robustness and robust training for graph convolutional networks. In: Proc. of the 25th ACM SIGKDD. pp. 246–256 (2019)

60. Zügner, D., Günnemann, S.: Certifiable robustness of graph convolutional networks under structure perturbations. In: Proc. of the 26th ACM SIGKDD. pp. 1656–1665 (2020)
