

A Graph Kernel Approach for Detecting Core Patents and Patent Groups

Dohyun Kim, Korea Institute of Science and Technology Information and Myongji University

Bangrae Lee, Korea Institute of Science and Technology Information and University of Seoul

Hyuck Jai Lee, Korea Institute of Science and Technology Information

Sang Pil Lee and Yeongho Moon, Korea Institute of Science and Technology Information and Korea University of Science and Technology

Myong K. Jeong, Rutgers University

An approach to discovering core patents and clustering patents uses a patent citation network in which core patents are represented as an influential node and patent groups as a cluster of nodes.

Along with the influential nodes detection problem, the graph nodes clustering or community detection problem is one of the main topics in network analysis. Researchers in diverse fields, including computer science and physics, have studied the problem.1–3 A cluster or community has dense node-to-node connections within the community but sparse connections with nodes from other communities.4 For example, a community detection method in a citation network finds similar patent (or paper) subgroups. However, the methods used to discover influential nodes and detect communities have previously been studied separately, especially in a citation network.3,5–8 Simultaneously exploring influential nodes and communities allows for the easy discovery of significant nodes in each community and the recognition of the distribution of similar nodes around the significant nodes, which isn't true for existing community detection methods or centrality measures.

Moreover, community detection in a citation network has generally been conducted using document similarity measures such as bibliographic coupling9 and co-citation.10


The bibliographic coupling measure counts the number of common references used by two documents, while a co-citation measure is defined as the frequency with which two documents are cited together.11 Bibliographic coupling and co-citation are the most popular citation-based similarity measures of documents. Although many studies12,13 have shown that bibliographic coupling and co-citation can be used to cluster patents or papers successfully, these two similarity measures have the drawback that they inevitably overlook the case in which two complementary documents are connected by an edge. Hence, our proposed method focuses on simultaneously discovering influential nodes (that is, core patents) and detecting communities (that is, patent groups) in a patent citation network over all nodes. Additionally, the proposed method tries to alleviate the complementary edge issue.

For the purpose of the present study, we introduce a kernel k-means clustering algorithm with a graph kernel. A graph kernel in network analysis helps to compute implicit similarities between patents in a high-dimensional feature space.3 Then, the kernel k-means clustering algorithm performs patent clustering based on a similarity matrix obtained from the graph kernel. That is, kernel k-means clustering based on a graph kernel searches for clusters and their centers in the feature space, not in the original input space. The cluster center is obtained so as to minimize the sum of distances from other nodes within a cluster in feature space. This means the center is the most tightly connected to other patents by citations. For this reason, we define the center of each cluster as the cluster's core patent. The concept is similar to that of closeness centrality, which explores a node with the minimum sum of shortest-path lengths to other nodes from the target node. The difference is that kernel k-means clustering uses the distance in feature space, which is calculated using a graph kernel, instead of the length of the shortest path. A similar approach has been applied in the declarative modeling field for searching for representative designers.14

Counting the number of citations of each patent is one alternative approach to finding the cluster center of each cluster. However, this approach counts only directly connected edges and thus can't distinguish between similar and complementary documents. In general, similar documents have one or more directly or indirectly connected common references, while complementary documents have only direct citations. Therefore, the proposed graph kernel-based approach prevents complementary patents, which aren't similar, from being classified as similar by considering even indirectly connected edges. That is, the proposed graph kernels, including the exponential diffusion kernel and the von Neumann diffusion kernel, compute the summation of the rth power of the adjacency matrix A (r = 1, …, ∞), whose value is the weighted sum over all paths from one node to another.15 This approach leads to lower similarity scores for complementary edges and higher similarity scores for similar edges and then selects the patents most tightly connected to others as cluster centers. In this way, the proposed approach alleviates the complementary edge issue.

Note that when using kernel k-means clustering, the center of each cluster is detected, and each cluster is built around its center. This procedure is performed iteratively until convergence. We consider the finally obtained cluster centers as the core patents and the resulting clusters as the patent communities. From this perspective, building clusters and detecting influential nodes are carried out simultaneously during the clustering stage.

However, in kernel k-means clustering, the obtained center of each cluster isn't always one of the given patents. Therefore, the center of each cluster is regarded as a virtual core patent, and we might consider the patent closest to the center as a real core patent.

Proposed Method

The proposed method consists of

• computation of a graph kernel function over all nodes in a network,

• clustering of nodes using kernel k-means clustering,

• computation of the distance from each node to its cluster center, and

• identification of an influential node for each cluster based on the distance from each center.

In the following text, these steps are explained in detail.

Graph Kernel

Similarity-based clustering approaches, such as k-means clustering, generally use the similarities or distances between observations in the input space. The k-means clustering algorithm can be extended using a kernel function, which helps to compute implicit similarities between observations in a high-dimensional feature space. The k-means clustering algorithm using a kernel function is called kernel k-means clustering.

In kernel k-means clustering for network analysis, a similarity measure between graph nodes that integrates indirect paths should be calculated. Among many kernels, graph kernel functions aim to calculate the similarity between graph nodes. Therefore, kernel k-means clustering in network analysis uses graph kernels; however, the kernel function for kernel k-means clustering is required to meet the following conditions:16

• the kernel should be a positive semidefinite matrix; and

• the kernel should be a similarity measure, not a dissimilarity measure.


Therefore, for some kernels, such as the Laplacian kernel, which calculates dissimilarity between nodes, techniques that reverse the kernel should be considered to meet the second condition.17

A representative graph kernel is the exponential diffusion kernel, which uses the adjacency matrix of the graph.15,18 The exponential diffusion kernel is defined as follows:

$$K_{\mathrm{EXP}} = \sum_{k=0}^{\infty} \frac{\alpha^{k} A^{k}}{k!} = \exp(\alpha A),$$

where the elements of $A^{k}$, $a_{ij}^{(k)}$, represent the number of paths from node i to node j with k transitions; this kernel gives more weight to two nodes connected by shorter paths. Note that the exponential function gives the kernel positive values, indicating that the kernel is positive semidefinite.
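As a concrete illustration of this definition, the exponential diffusion kernel can be computed directly with a matrix exponential. This is a minimal sketch; the small adjacency matrix A and the value of α are hypothetical choices for illustration, not values from the article.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 4-node undirected citation graph (symmetric adjacency matrix).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

alpha = 0.1  # illustrative diffusion parameter

# Exponential diffusion kernel: K_EXP = exp(alpha * A) = sum_k (alpha^k A^k) / k!
K_exp = expm(alpha * A)

print(K_exp.round(4))  # entry (i, j) weights all paths from node i to node j
```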

By changing the parameter of the exponential diffusion kernel, we can obtain the von Neumann diffusion kernel.18,19 The von Neumann diffusion kernel has the parameter $\alpha^{k}$ instead of $\alpha^{k}/k!$:

$$K_{\mathrm{VN}} = \sum_{k=0}^{\infty} \alpha^{k} A^{k} = (I - \alpha A)^{-1}.$$

If $\alpha$ lies in the range $0 < \alpha < \|A\|_{2}^{-1}$, the kernel is well defined and positive definite.15
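A similar sketch for the von Neumann diffusion kernel, assuming α is kept below $\|A\|_{2}^{-1}$ so that the series converges (the adjacency matrix and the scaling factor 0.9 are again hypothetical):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Choose alpha safely inside (0, 1/||A||_2) so that sum_k alpha^k A^k converges.
alpha = 0.9 / np.linalg.norm(A, 2)

# Von Neumann diffusion kernel: K_VN = (I - alpha * A)^{-1}
K_vn = np.linalg.inv(np.eye(A.shape[0]) - alpha * A)

print(K_vn.round(4))
```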

For the present study, we can use another kernel, the random walk with restart kernel, introduced by Jia-Yu Pan and his colleagues.20 The random walk with restart kernel has an advantage: it can capture the graph's global structure and the multifaceted relationship between two nodes. The random walk with restart kernel uses a random walk from node i to an adjacent node j with probability

$$p_{ij} = P(S(t+1) = j \mid S(t) = i) = \frac{a_{ij}}{a_{i\cdot}},$$

where $a_{ij}$ is an element of the adjacency matrix A, $a_{i\cdot} = \sum_{j=1}^{n} a_{ij}$, and S(t) is the state of the random walker at time t. In addition, we assume that the random walker comes back to node i with probability $1 - \gamma$:

$$X(0) = e_i, \qquad X(t+1) = \gamma P^{T} X(t) + (1 - \gamma) e_i,$$

where $e_i$ is a starting vector whose ith element is 1 and all other elements are 0. The probability of finding the random walker on each node j when it starts at node i, given by the steady-state solution $X(t+1) = X(t) = X$, is as follows:18,21

$$X = (1 - \gamma)(I - \gamma P^{T})^{-1} e_i.$$

From this probability, we can consider $(I - \gamma P)^{-1}$ to be a graph kernel that indicates similarities between nodes. To show that the ith row contains the similarity to node i in a similar way to other kernel functions, $P^{T}$ was transposed to P.
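Under the same hypothetical adjacency matrix, the random walk with restart kernel follows from the row-normalized transition matrix P; the value of γ below is illustrative only.

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

gamma = 0.5  # illustrative parameter; 1 - gamma is the restart probability

# Row-normalized transition matrix: p_ij = a_ij / a_i.
P = A / A.sum(axis=1, keepdims=True)

# Random walk with restart kernel: K_RWR = (I - gamma * P)^{-1}
K_rwr = np.linalg.inv(np.eye(A.shape[0]) - gamma * P)

print(K_rwr.round(4))  # row i holds similarities of node i to every other node
```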

Once these similarities are computed using a graph kernel, we can exploit them to find clusters using the kernel k-means clustering method.

Kernel k-Means Clustering

Kernel k-means clustering22 uses an iterative algorithm that minimizes the sum of within-cluster inertia in the feature space:

$$D\left(\{C_c\}_{c=1}^{k}\right) = \sum_{c=1}^{k} \sum_{x_i \in C_c} \left\| \phi(x_i) - m_c \right\|^{2},$$

where $m_c$ is the center of cluster c in the feature space and can be calculated as

$$m_c = \frac{\sum_{x_i \in C_c} \phi(x_i)}{n_c},$$

where $x_i$ is the node vector corresponding to node i, $\phi(\cdot)$ is a mapping function that transforms the original input $x_i$ into a high-dimensional feature space as $\phi(x_i)$, and $n_c$ is the number of nodes belonging to $C_c$. We can rewrite D as follows:

$$D\left(\{C_c\}_{c=1}^{k}\right) = \sum_{c=1}^{k} \sum_{x_i \in C_c} \left[ \phi(x_i)^{T}\phi(x_i) - \frac{2\sum_{x_j \in C_c} \phi(x_i)^{T}\phi(x_j)}{n_c} + \frac{\sum_{x_j, x_l \in C_c} \phi(x_j)^{T}\phi(x_l)}{n_c^{2}} \right].$$

Note that the inner product $\phi(x_i)^{T}\phi(x_j)$ can be replaced by the kernel function $K_{ij}$, which is called the kernel trick. The value of the kernel function K can be calculated using various kernel functions, instead of mapping our data via $\phi(\cdot)$ and then computing the inner product. Therefore, the separate nonlinear mapping $\phi(\cdot)$ need never be explicitly computed. The kernel function performs the role of mapping data from the input space to a high-dimensional feature space. Then, measure D can be rewritten as follows:

$$D\left(\{C_c\}_{c=1}^{k}\right) = \sum_{c=1}^{k} \sum_{x_i \in C_c} \left[ K_{ii} - \frac{2\sum_{x_j \in C_c} K_{ij}}{n_c} + \frac{\sum_{x_j, x_l \in C_c} K_{jl}}{n_c^{2}} \right].$$

The distance in feature space between node $\phi(x_i)$ and the center of each cluster is compared. Each node is then assigned to its nearest cluster as follows:

$$b_i = \arg\min_{c} \left\| \phi(x_i) - m_c \right\|^{2} = \arg\min_{c} \left[ K_{ii} - \frac{2\sum_{x_j \in C_c} K_{ij}}{n_c} + \frac{\sum_{x_j, x_l \in C_c} K_{jl}}{n_c^{2}} \right],$$

where $b_i$ is the assigned cluster label for node i. The iteration for kernel k-means clustering is then repeated until no more node labels change. We summarize the procedure for kernel k-means clustering:


Input
K: kernel matrix
k: number of clusters

Output
$C_1, C_2, \ldots, C_k$: clusters of nodes

Procedure
1. Initialize each cluster: $C_1^{(0)}, C_2^{(0)}, \ldots, C_k^{(0)}$.
2. Assign each node to its nearest cluster:
$$b_i = \arg\min_{c} \left\| \phi(x_i) - m_c \right\|^{2} = \arg\min_{c} \left[ K_{ii} - \frac{2\sum_{x_j \in C_c^{(t)}} K_{ij}}{n_c} + \frac{\sum_{x_j, x_l \in C_c^{(t)}} K_{jl}}{n_c^{2}} \right].$$
3. Update the clusters: $C_1^{(t+1)}, C_2^{(t+1)}, \ldots, C_k^{(t+1)}$.
4. Repeat Steps 2 and 3 until no more nodes are assigned to a cluster different from the current one.
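As a concrete illustration of this procedure, here is a minimal kernel k-means sketch in Python written directly from the kernel-trick form of the distance. It assumes a precomputed kernel matrix K (for example, one of the graph kernels above); the function name, random initialization, and iteration cap are our own choices, not details fixed by the article.

```python
import numpy as np

def kernel_kmeans(K, k, max_iter=100, seed=0):
    """Cluster nodes given a precomputed kernel (similarity) matrix K."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=n)              # Step 1: random initial clusters

    for _ in range(max_iter):
        # Distance of every node i to every cluster center in feature space:
        # K_ii - (2/n_c) * sum_{j in C_c} K_ij + (1/n_c^2) * sum_{j,l in C_c} K_jl
        dist = np.zeros((n, k))
        for c in range(k):
            members = np.where(labels == c)[0]
            if members.size == 0:
                dist[:, c] = np.inf               # empty cluster is never chosen
                continue
            n_c = members.size
            within = K[np.ix_(members, members)].sum() / n_c ** 2
            cross = K[:, members].sum(axis=1) / n_c
            dist[:, c] = np.diag(K) - 2.0 * cross + within

        new_labels = dist.argmin(axis=1)          # Step 2: assign to nearest center
        if np.array_equal(new_labels, labels):
            break                                 # Step 4: stop when labels are stable
        labels = new_labels                       # Step 3: update the clusters
    return labels
```

Feeding in the exponential diffusion, von Neumann, or random walk with restart kernel from the earlier sketches would yield the corresponding variant of the proposed method.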

Patent groups are discovered as the result of kernel k-means clustering, where each patent's group is determined by comparing distances in the feature space from the patent to each group's center. The patent is assigned to the group whose center is closest to that patent.

Influential Measures for the Detection of a Core Patent

Once patent clusters are detected using kernel k-means clustering as described previously, the core patent in each cluster is explored again, because the obtained center in each cluster isn't always one of the given nodes. Usually, existing community detection methods in a citation network don't deal with the problem of searching for influential nodes; however, in the present study, the problem of detecting influential nodes, along with the node-clustering problem, is discussed. Using this parallel approach gives us the advantage of being able to recognize the distribution of similar nodes around influential nodes. The distances between the nodes and the center of each cluster obtained from the kernel k-means clustering are calculated for the influential nodes. The node closest to the center of each cluster is considered to be an influential node of that cluster. Note that there's one difference between the measure used to build the clusters in the previous "Kernel k-Means Clustering" subsection and the measure used to search for the core patent of each cluster in this section. The cluster building measure aims to find a cluster to which node i belongs based on the sum of within-cluster inertia, while the core patent searching measure seeks to find a real node that's closest to the virtual core patent node in each cluster. Our proposed measure for an influential node, kernel-mean-based centrality (KMBC), is determined according to the selected kernel. If the exponential diffusion kernel (EXP) is selected, an influential node $w_d$ for a specific cluster $C_d \in \{C_c\}_{c=1}^{k}$ is identified by the following measure:

$$w_d^{(\mathrm{EXP})} = \arg\min_{i} \left\| \phi(x_i) - m_d \right\|^{2} = \arg\min_{i} \left[ K_{ii}^{\mathrm{EXP}} - \frac{2\sum_{x_j \in C_d} K_{ij}^{\mathrm{EXP}}}{n_d} + \frac{\sum_{x_j, x_l \in C_d} K_{jl}^{\mathrm{EXP}}}{n_d^{2}} \right]$$
$$= \arg\min_{i} \left[ \exp(\alpha A)_{ii} - \frac{2\sum_{x_j \in C_d} \exp(\alpha A)_{ij}}{n_d} + \frac{\sum_{x_j, x_l \in C_d} \exp(\alpha A)_{jl}}{n_d^{2}} \right].$$

Using the von Neumann (VN) diffusion kernel, the measure is given as follows:

$$w_d^{(\mathrm{VN})} = \arg\min_{i} \left[ \left((I - \alpha A)^{-1}\right)_{ii} - \frac{2\sum_{x_j \in C_d} \left((I - \alpha A)^{-1}\right)_{ij}}{n_d} + \frac{\sum_{x_j, x_l \in C_d} \left((I - \alpha A)^{-1}\right)_{jl}}{n_d^{2}} \right].$$

The random walk with restart (RWR) kernel defines the measure for an influential node as follows:

$$w_d^{(\mathrm{RWR})} = \arg\min_{i} \left[ \left((I - \gamma P)^{-1}\right)_{ii} - \frac{2\sum_{x_j \in C_d} \left((I - \gamma P)^{-1}\right)_{ij}}{n_d} + \frac{\sum_{x_j, x_l \in C_d} \left((I - \gamma P)^{-1}\right)_{jl}}{n_d^{2}} \right].$$

From these measures, we find the node closest to the center of each cluster and consider that node to be a core patent.
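Continuing the sketch above, once the cluster labels are fixed, the KMBC core patent of each cluster is the member node with the smallest kernel-trick distance to its (virtual) cluster center. The helper below works with any of the three kernels; its name and structure are ours, not the article's.

```python
import numpy as np

def kmbc_core_nodes(K, labels):
    """Return, for each cluster label, the index of its core (KMBC) node."""
    cores = {}
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        n_c = members.size
        Kc = K[np.ix_(members, members)]
        within = Kc.sum() / n_c ** 2
        # Distance of each member to the virtual center of its own cluster.
        dist = np.diag(Kc) - 2.0 * Kc.sum(axis=1) / n_c + within
        cores[c] = members[dist.argmin()]      # closest real node = core patent
    return cores
```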

Experimental Results

The proposed methods based on kernel k-means clustering, in combination with the previous three graph kernels, are optimized in terms of modularity, which quantifies the quality of a division of a network into communities. Also, the core patents obtained by the proposed KMBC are compared to those of commonly used centralities. Note that the kernel matrix based on the adjacency matrix A of a directed network, such as a citation network, is not symmetric; hence, we define a symmetric, undirected version of the adjacency matrix for the directed network as

$$\frac{A + A^{T}}{2},$$

which leads to a symmetric kernel matrix.

Additionally, the Gaussian radial basis function (RBF) kernel, which is not a graph kernel, is used for comparison. The Gaussian RBF kernel is presented as follows:

$$K_{\mathrm{RBF}} = \exp\!\left(-\frac{\|u_1 - u_2\|^{2}}{2\tau^{2}}\right),$$

where $\tau$ is the width parameter that controls the amplitude of the RBF, and $u_1$ and $u_2$ are columns of A.


The dataset used for the computational experiments includes US patents in the area of information and security issued from 1976 to 2007. We used only the top 1 percent of frequently cited US patents from 2003 to 2007 for our study. The dataset has 40,982 citations from 19,677 patents. For computational efficiency, we selected all patents that cited the most-cited patent, US5349655, directly and indirectly among the extracted top 1 percent of US patents, and then only used the patents with two or more citations. The patents and the citations between them constitute the nodes and edges in the citation network, respectively.

Performance Measure

Modularity is used to evaluate the quality of a partitioning of a network into communities. This measure is frequently used in network analysis research.23,24 Modularity has been successfully used to capture reasonable community structure.25,26 A good partition, which has a high modularity value, is one in which dense internal connections exist between nodes within communities, but only sparse connections exist between different communities.25 Modularity for a directed network, such as a citation network, is defined as follows:23,24

$$Q = \frac{1}{m} \sum_{i,j} \left[ A_{ij} - \frac{k_i^{\mathrm{out}} k_j^{\mathrm{in}}}{m} \right] \delta(c_i, c_j),$$

where $c_i$ is the community to which node i belongs; $k_i^{\mathrm{out}}$ and $k_j^{\mathrm{in}}$ are the out-degree of node i and the in-degree of node j, respectively; m is the total number of edges in the network; and $\delta(c_i, c_j)$ is the Kronecker delta, which is defined as 1 if $c_i$ equals $c_j$ and 0 otherwise. In the modularity measure, $k_i^{\mathrm{out}} k_j^{\mathrm{in}} / m$ is the probability of an edge between two nodes i and j under randomization. Therefore, a high modularity value tells us when there are more edges within communities than we would expect by chance.24

Results

The proposed method, combined with the four kernel functions, is optimized in terms of modularity. In the experiment, the kernel parameters (α for the von Neumann and exponential diffusion kernels, γ for the random walk with restart kernel, and τ for the Gaussian RBF kernel) are varied over $10^{-3}, 10^{-2}, 10^{-1}, 1, 10^{1}$. In addition, the number of clusters is varied from two to 50.

For each kernel function, we selected the optimal number of clusters and kernel parameters by maximizing the modularity value. Table 1 shows the optimized number of clusters and kernel parameters for each kernel function and their modularity values. Depending on the kernel function, the optimal numbers of clusters differ. The proposed KMBC_EXP produces the largest modularity value and yields the best community structure among the compared methods, although it has the smallest number of clusters. The RBF kernel, which isn't a graph kernel, didn't achieve good clustering performance in comparison to the graph kernels. This result indicates that the graph kernel is effective for clustering in a citation network.
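A sketch of this selection loop, assuming the kernel_kmeans and directed_modularity helpers from the earlier sketches are in scope and using the exponential diffusion kernel as the example; the parameter grid mirrors the values stated above.

```python
import numpy as np
from scipy.linalg import expm

def select_by_modularity(A_directed,
                         alphas=(1e-3, 1e-2, 1e-1, 1.0, 10.0),
                         cluster_range=range(2, 51)):
    """Pick the kernel parameter and cluster count that maximize modularity."""
    A_sym = (A_directed + A_directed.T) / 2.0      # symmetrized adjacency matrix
    best = (-np.inf, None, None)                   # (modularity, alpha, k)
    for alpha in alphas:
        K = expm(alpha * A_sym)                    # exponential diffusion kernel
        for k in cluster_range:
            labels = kernel_kmeans(K, k)
            q = directed_modularity(A_directed, labels)
            if q > best[0]:
                best = (q, alpha, k)
    return best                                    # best modularity and its settings
```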

Table 2 shows the core patents detected from KMBC using the exponential kernel function. It's not easy to measure the accuracy of the proposed approach for core patents. Therefore, we compared the influential nodes selected by the proposed method with the top 15 core patents from popular centrality measures. Using this comparison, we can verify that the influential nodes selected by the proposed method are meaningful. The obtained patents are compared to those from commonly used centrality measures, such as the degree, adjusted closeness, and weighted-reachability measures.27

Table 1. Modularity value and optimal number of clusters and kernel parameters.

Kernel                        Kernel parameter   Optimal number of clusters   Modularity
Exponential                   10^-1              3                            0.258
von Neumann                   10^-2              27                           0.236
Random walk with restart      10^-1              16                           0.222
Radial basis function (RBF)   10^1               25                           0.174

Table 2. Core patents obtained from kernel-mean-based centrality with exponential diffusion kernel (KMBC_EXP) and from centrality measures.

KMBC_EXP     Degree       Closeness    Weighted reachability
US5862260    US5892900    US5745604    US5349655
US5892900    US5982891    US5349655    US5745604
US6314409    US5943422    US5892900    US5892900
             US5920861    US5982891    US5862260
             US5910987    US5949876    US5943422
             US5915019    US5910987    US5920861
             US5917912    US5915019    US5982891
             US6185683    US5862260    US5949876
             US5949876    US5917912    US5915019
             US6112181    US5943422    US5910987
             US5862260    US5920861    US5832119
             US6226618    US5832119    US5822436
             US6122403    US6112181    US5917912
             US5745604    US6185683    US6112181
             US6157721    US6157721    US6185683


The degree centrality measure is the number of nodes connected to the target node, and the adjusted closeness measure emphasizes the length of the shortest path from the target node to all other nodes. Additionally, weighted reachability is a centrality measure that reflects the notion that directly cited patents are more influential than indirectly cited patents.27

Note that the core patents from the commonly used centrality measures in a citation network are sorted according to their importance, while the patents from KMBC are sorted in the order of publication. This is because kernel k-means clustering can detect only the core patents of each cluster but can't quantify their importance. Although ordering the patents according to their importance could be important in a global network, it might not be helpful if someone intends to find a particular type of patent community and its representative patents. In that case, the neighboring patent information is more useful than the order of importance. The key contribution of this work is that even though the order of importance of patents can't be determined, the neighbors of a specific patent (those dealing with a similar topic) can be found, as well as the representative patents among those neighbors.

In Table 2, two of the core patents detected from KMBC_EXP, US5862260 and US5892900, also appear in the top 15 patents of the degree, adjusted closeness, and weighted reachability measures. Notably, the most cited patent, US5349655, is not included in the list of KMBC_EXP. The three listed patents in Table 2 are influential in each cluster, but they might not be influential in the global network, and vice versa; that is, the influential patents in a global network might not be influential in each cluster with similar topics. US6314409 is an example of an influential patent within a local network. US6314409 is not in the top 15 patents of degree centrality in the global network, but it has the largest number of in-degree citations in its corresponding cluster.

In addition, we investigated the patents in each cluster and observed that the patents of each cluster deal with a similar topic. Table 3 shows the main topic of each cluster (see the results for k = 3 in Table 3). Clusters 1 and 2 have patents about information hiding and systems for secure transaction management, respectively. Cluster 3 deals simultaneously with systems for access control and data management.

Figure 1 shows the citation network in a circular graph, grouped using the kernel k-means clustering algorithm with an exponential kernel and drawn with the NetMiner program (see www.netminer.com). The three colors represent the three clusters of patents, subdivided into fields in the outer ring. Patents with the same color and shape belong to the same group, and the patents drawn with larger shapes are the core patents of each group. Each line represents a citation (that is, a link) between patents. For the sake of brevity, only patents with 15 or more incoming edges are plotted. This citation network shows the edge concentration in each cluster. From Figure 1, we can see that the clustering process is well conducted: all clusters, except cluster 3, have dense node-to-node connections within the clusters and sparse connections with other clusters.

[Figure 1. A citation network from kernel k-means clustering using the exponential kernel.]

We can see that two core patents, US5862260 and US5892900, have more internal edges than other patents within their groups, while there is little edge intensity around patent US6314409, selected from cluster 3. This is because many patents that cite US6314409 had already been removed when we chose the patents for the plot; nevertheless, US6314409 is the core patent with the largest number of in-degree citations in cluster 3. Table 3 shows that the core patent of each cluster, obtained from our approach with the optimal number of clusters (k = 3), corresponds with the node with the highest number of internal edges in each cluster.

Table 3. The communities of patents obtained from KMBC_EXP. For each number of clusters k, the first patent listed is the core patent found by the proposed method and the second is the patent with the highest degree centrality in that cluster.

Cluster 1 (topic: information hiding, such as digital watermarking or steganography)
  k = 3 (optimal): US5862260 / US5862260;  k = 4: US5613002 / US5613002;  k = 5: US5349655 / US5349655

Cluster 2 (topic: systems for secure transaction management or rights protection)
  k = 3 (optimal): US5892900 / US5892900;  k = 4: US5892900 / US5745604;  k = 5: US5613002 / US5613002

Cluster 3 (topic: systems for access control, data management, or communication networks)
  k = 3 (optimal): US6314409 / US6314409;  k = 4: US6122403 / US5862260;  k = 5: US5822436 / US5822436

Cluster 4
  k = 4: US6226618 / US6226618;  k = 5: US5892900 / US5745604

Cluster 5
  k = 5: US6226618 / US6226618


From the results, we can verify that the proposed method has appropriately detected the core patents. The core patents from our approach and from degree centrality can differ when we use a different number of clusters or different graph kernels (see the results for k = 4 and k = 5 in Table 3). Note that the core patents from degree centrality were obtained after we first found the technology groups using our proposed graph kernel approach. The core patents obtained from simple degree centrality in the original space in Table 2 are quite different from the results in Table 3.

The proposed algorithm could be useful when we need to learn the leading patent within each patent group. Using the database of US patents, we investigated the ratio of patents belonging to each cluster among the top 15 patents of each centrality in Table 2. For all centralities, more than 70 percent of those patents belong to cluster 2, while none belongs to cluster 3. In this way, the core patents obtained from a global network can be concentrated in a specific cluster. Our approach of simultaneously finding the patent communities and their leading patents prevents this problem and is helpful when exploring the core patent in each specific area.

The objective of this study was to develop algorithms to simultaneously detect core patents and patent groups in a complex network. To this end, an algorithm for the simultaneous detection of core patents and patent groups based on kernel k-means clustering with a graph kernel was introduced. This method uses a graph kernel because it can compute implicit similarities between patents in a high-dimensional feature space. The proposed method helps core patents in each group to be discovered easily and the distribution of similar patents around the core patents to be recognized. The experimental results revealed that this approach can detect influential and meaningful patents and patent groups efficiently. The methods developed in this study enable a person to detect core patents and patent groups in a citation network. Valuable areas of future research include validating the findings using additional datasets and developing a method to detect changes in core patents and patent groups over time.

References

1. M.E.J. Newman, "Detecting Community Structure in Networks," The European Physical J. B, vol. 38, 2004, pp. 321-330.
2. Y. Yang et al., "Personalized Email Prioritization Based on Content and Social Network Analysis," IEEE Intelligent Systems, vol. 25, no. 4, 2010, pp. 12-18.
3. Z. Yang et al., "Social Community Analysis via a Factor Graph Model," IEEE Intelligent Systems, vol. 26, no. 3, 2011, pp. 58-65.
4. L. Yen et al., "Graph Nodes Clustering with the Sigmoid Commute-Time Kernel: A Comparative Study," Data and Knowledge Eng., vol. 68, no. 3, 2009, pp. 338-361.
5. P. Chen and S. Redner, "Community Structure of the Physical Review Citation Network," J. Informetrics, vol. 4, 2010, pp. 278-290.
6. D. Kim et al., "Automated Detection of Influential Patents Using Singular Values," IEEE Trans. Automation Science and Eng., vol. 9, no. 4, 2012, pp. 723-733.
7. R. Rousseau, "The Gozinto Theorem: Using Citations to Determine Influences on a Scientific Publication," Scientometrics, vol. 11, nos. 3-4, 1987, pp. 217-229.
8. C.A. Yeung et al., "Measuring Expertise in Online Communities," IEEE Intelligent Systems, vol. 26, no. 1, 2011, pp. 26-32.
9. M.M. Kessler, "Bibliographic Coupling between Scientific Papers," American Documentation, vol. 14, no. 1, 1963, pp. 10-25.
10. H. Small, "Co-Citation in the Scientific Literature: A New Measure of the Relationship between Two Documents," J. Am. Soc. for Information Science, vol. 24, no. 4, 1973, pp. 265-269.
11. L. Egghe and R. Rousseau, "Co-Citation, Bibliographic Coupling and a Characterization of Lattice Citation Networks," Scientometrics, vol. 55, no. 3, 2002, pp. 349-361.
12. K.W. Boyack and R. Klavans, "Co-Citation Analysis, Bibliographic Coupling, and Direct Citation: Which Citation Approach Represents the Research Front Most Accurately?" J. Am. Soc. for Information Science and Technology, vol. 61, no. 12, 2010, pp. 2389-2404.
13. R. Klavans and K.W. Boyack, "Quantitative Evaluation of Large Maps of Science," Scientometrics, vol. 68, no. 3, 2006, pp. 475-499.
14. N. Doulamis et al., "Collaborative Evaluation Using Multiple Clusters in a Declarative Design Environment," Artificial Intelligence Techniques for Computer Graphics, vol. 159, 2008, pp. 141-157.
15. J. Kandola, N. Cristianini, and J. Shawe-Taylor, "Learning Semantic Similarity," Proc. Advances in Neural Information Processing Systems, 2002, pp. 657-664.
16. C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
17. C. Gao et al., "Graph Ranking for Exploratory Gene Data Analysis," BMC Bioinformatics, vol. 10, supplement 11, 2009; www.biomedcentral.com/1471-2105/10/S11/S19.
18. F. Fouss et al., "An Experimental Investigation of Kernels on Graphs for Collaborative Recommendation and Semisupervised Classification," Neural Networks, vol. 31, July 2012, pp. 53-72.
19. R.I. Kondor and J. Lafferty, "Diffusion Kernels on Graphs and Other Discrete Structures," Proc. 19th Int'l Conf. Machine Learning, 2002, pp. 315-322.
20. J.-Y. Pan et al., "Automatic Multimedia Cross-Modal Correlation Discovery," Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004; http://dl.acm.org/citation.cfm?id=1014135.
21. F. Fouss et al., "Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation," IEEE Trans. Knowledge and Data Eng., vol. 19, no. 3, 2007, pp. 355-369.
22. B. Schölkopf, A. Smola, and K.-R. Müller, "Nonlinear Component Analysis as a Kernel Eigenvalue Problem," Neural Computation, vol. 10, no. 5, 1998, pp. 1299-1319.
23. E.A. Leicht and M.E.J. Newman, "Community Structure in Directed Networks," Physical Rev. Letters, vol. 100, no. 11, 2008; http://dx.doi.org/10.1103/PhysRevLett.100.118703.
24. V. Nicosia et al., "Extending the Definition of Modularity to Directed Graphs with Overlapping Communities," J. Statistical Mechanics: Theory and Experiment, 2009, P03024.
25. M.E.J. Newman and M. Girvan, "Finding and Evaluating Community Structure in Networks," Physical Review E, vol. 69, 2004, 026113; doi:10.1103/PhysRevE.69.026113.
26. M.E.J. Newman, "Finding Community Structure in Networks Using the Eigenvectors of Matrices," Physical Review E, vol. 74, 2006, 036104; http://dx.doi.org/10.1103/PhysRevE.74.036104.
27. O. Kwon et al., "A Method to Make the Genealogical Graph of Core Documents from the Directed Citation Network," Information, vol. 12, no. 4, 2009, pp. 875-888.

The Authors

Dohyun Kim is a senior researcher at the Korea Institute of Science and Technology Information. His research interests include statistical data mining, network data analysis, and bibliometric analysis. Kim has a PhD in industrial engineering from the Korea Advanced Institute of Science and Technology. Contact him at [email protected].

Bangrae Lee is a senior researcher in the Information Analysis Center of the Korea Institute of Science and Technology Information. His research interests include data mining, informetrics, scientometrics, and social network analysis. Lee has an MS in robotics from the Korea Advanced Institute of Science and Technology. Contact him at [email protected].

Hyuck Jai Lee is a principal researcher at the Korea Institute of Science and Technology Information. His research interests include information visualization, social network analysis, and research evaluation based on bibliometric analysis. Lee has a PhD in chemistry from Sogang University, Seoul, Korea. Contact him at [email protected].

Sang Pil Lee is a research fellow at the Information Analysis Center at the Korea Institute of Science and Technology Information. His research interests include emerging technology analysis, knowledge science, bibliometric analysis, scientometrics, and data mining. Lee has a PhD in applied biotechnology from Osaka University, Japan. Contact him at [email protected].

Yeongho Moon is a director at the Information Analysis Center at the Korea Institute of Science and Technology Information. He is also a vice president of the Korea Technology Innovation Associate. His research interests include emerging technology analysis, knowledge geometrics analysis, and data mining analysis and information service. Moon has a PhD in construction and environment engineering from the Korea Advanced Institute of Science and Technology. Contact him at [email protected].

Myong K. Jeong is an associate professor in the department of industrial and systems engineering and the Rutgers Center for Operations Research at Rutgers University. His research interests include statistical data mining, recommendation systems, machine health monitoring, and sensor data analysis. Jeong has a PhD in industrial and systems engineering from the Georgia Institute of Technology. He received the Freund International Scholarship and a National Science Foundation Career Award and is an associate editor of IEEE Transactions on Automation Science and Engineering and the International Journal of Quality, Statistics, and Reliability. He is a senior member of IEEE. Contact him at [email protected].
