A PERSONALIZED RESEARCHER RECOMMENDATION APPROACH IN ACADEMIC CONTEXTS: COMBINING SOCIAL NETWORKS AND SEMANTIC CONCEPTS ANALYSIS

Yunhong Xu, Management School, University of Science and Technology of China, Hefei, Anhui, PR China, xuyunhong@mail.ustc.edu.cn

Jinxing Hao, School of Economics and Management, Beihang University, Beijing, China; Department of Information Systems, City University of Hong Kong, Kowloon, Hong Kong, PR China, [email protected]

Raymond Y.K. Lau, Department of Information Systems, City University of Hong Kong, Kowloon, Hong Kong, PR China, [email protected]

Jian Ma, Department of Information Systems, City University of Hong Kong, Kowloon, Hong Kong, PR China, [email protected]

Wei Xu, School of Information, Renmin University of China, Beijing, China, [email protected]

Dingtao Zhao, Management School, University of Science and Technology of China, Hefei, Anhui, PR China, [email protected]

Abstract

The rapid proliferation of information technologies, especially Web 2.0 techniques, has fundamentally changed how things are done in many areas, including how researchers communicate and collaborate with each other. The sheer volume of researcher and topical research information on the Web has led to the problem of information overload. There is a pressing need for researcher recommender systems that provide users with personalized recommendations of researchers they could collaborate with for mutual research benefit. In an academic context, recommending suitable research partners can facilitate knowledge discovery and exchange and ultimately improve the research productivity of both sides. Existing expertise recommendation research usually investigates the expert finding problem along two independent dimensions, namely social relations and common expertise. The main contribution of this paper is a novel researcher recommendation approach that combines these two dimensions in a unified framework to improve the effectiveness of personalized researcher recommendation. We also explain, based on two case studies, how the proposed framework can be applied to real-world academic contexts.

Keywords: Recommender agents, social network analysis, semantic concept analysis, knowledge management, expertise recommendation


1 INTRODUCTION

The rapid proliferation of information technologies, especially Web 2.0 techniques, has changed the nature of information processing in many areas (Levy 2009; Razmerita et al. 2009), including the ways in which researchers communicate and collaborate to conduct research in academic contexts (Jahnke et al. 2009). Figure 1 shows the development of web techniques that facilitate researchers’ work at various levels. The techniques in the first strand give researchers access to massive amounts of information and knowledge from various sources (e.g., scholarly databases and search engines) but contribute less to the socialization aspect (Cachia et al. 2007). The techniques in the second strand enable researchers to share research-related objects, e.g., research papers, progress reports and research proposals. They focus on sharing and exchanging such objects (content centered), and few provide direct connections between researchers. The techniques in the third strand offer platforms on which researchers can reveal their profiles and communicate with other researchers they might be interested in. For example, various scholarly research community sites (e.g., LinkedIn¹, ResearcherID², Mendeley³) have been developed to help researchers all over the world communicate effectively, make new connections and form groups. The techniques in this strand are user centered: researchers can communicate with each other directly (Moeslein et al. 2009). They break the traditional limitation that collaboration is usually restricted to researchers one already knows or who are referred by colleagues. Through these community sites, researchers can browse the information of registered researchers all over the world, search for researchers they might be interested in and establish relationships with them.

[Figure 1 diagram: three strands of online services. 1st strand, online access and archive: Google, Google Scholar, ISI, IEEE… 2nd strand, exchange of objects (content centered): Wikipedia, BBS or forums in particular domains… 3rd strand, member profiling and communication (user centered): LinkedIn, Mendeley, Facebook…]

Figure 1. The development of online services provided for researchers

With the rapid growth of scholarly research communities, the large number of researchers available on the web makes it challenging for users to find the researchers they might be interested in (Schwartz et al. 1993). For example, it was reported that more than 60 million users had registered on LinkedIn by February 2010⁴. Searching among so many choices and making a final decision is a tough task. Recommender agents reduce users’ information overload by providing them with personalized and filtered information. Moreover, value-added services such as suggesting a list of potential contacts can benefit both researchers and research communities. Researchers are relieved of the tedious work of finding suitable peers among a huge number of researchers. Research communities that provide such services can improve members’ loyalty and satisfaction, which helps them retain researchers and attract new ones. Furthermore, because expertise knowledge is implicit in nature, it is difficult to represent researchers’ expertise, which makes the problem harder to solve.

¹ http://www.linkedin.com
² http://www.researchid.com
³ http://www.mendeley.com
⁴ http://techcrunch.com/2010/02/11/linkedin-now-60-million-strong/


Expertise recommendation, a subfield of knowledge management, aims to elicit the interests or preferences of users and recommend a list of experts they might be interested in. Researcher recommendation can be considered an instantiation of expertise recommendation in academic contexts. Expertise recommendation has attracted much interest from the academic community, and current research follows two streams. The first stream is rooted in information retrieval and recommends experts by analyzing the content of expertise (e.g., keywords, concepts or maps representing expertise) using techniques such as data mining and text mining. The second stream analyzes the social aspects of experts and makes recommendations accordingly. We believe that social relations and semantic similarity of expertise are two crucial factors that prepare the ground for the development of recommendation mechanisms. However, few researchers combine these two dimensions, with the exception of research on Referral Web (Kautz et al. 1997). In the researcher recommendation context, the key question is how a recommendation mechanism can introduce users to personalized and socially related researchers they might be interested in.

In this paper, we propose a two-layer network based approach that combines semantic and social network information: semantic analysis is used to measure the semantic similarity of concepts, and social network analysis deals with the social relationships.

2 RELATED RESEARCH

2.1 Expertise management and Expertise recommendation

Human resources are considered a valuable asset of an organization because they possess a range of expertise that can benefit the organization (Lepak et al. 1999). However, due to the implicit nature of expertise, representing, assessing and utilizing this kind of knowledge remains a real challenge for many organizations (Balog et al. 2009). Effective management of expertise can benefit both organizations and individual researchers by facilitating knowledge access, knowledge sharing and knowledge application.

Expertise management, which aims to manage people’s expertise effectively, has recently attracted substantial academic and industry attention as a subfield of knowledge management (Huang et al. 2006). Techniques developed in other fields, e.g., information retrieval, bibliometrics, artificial intelligence, digital libraries and natural language processing, have been borrowed and applied in expertise management (Becerra-Fernandez 2000). Expertise recommendation tries to understand users’ preferences and suggests candidate experts they might be interested in. Various systems have been built to support expertise recommendation. One well-known system is Referral Web (Kautz et al. 1997), which combines social networks and collaborative filtering to recommend expertise. Another example is Expertise Recommender, in which heuristics and social interactions are utilized to recommend sources of expertise in an organizational environment (McDonald et al. 2000). Reichling et al. (2007) designed an expert recommender system that is integrated into the specific software infrastructure of an organizational setting.

2.2 Approaches to expertise recommendation

Most current expertise recommendation approaches follow the expertise finding mechanism and have their roots in information retrieval, where the focus is on the content of expertise itself (which can be represented in various ways, e.g., simple terms, concepts, maps or ontologies) (Cameron et al. 2007; McDonald et al. 2000; Reichling et al. 2007). That is, experts whose capabilities match a user’s query are recommended to that user (Liu et al. 2005). However, McDonald and Ackerman (1998) claimed that expertise is fundamentally a collaborative activity, and Perugini et al. (2004) pointed out that recommender agents have an inherently social element and ultimately bring people together. Generally speaking, people are more likely to communicate with those with whom they have some relationship than with strangers (Ehrlich et al. 2007). Therefore, expertise recommendation should reflect the social contexts in which people are embedded in order to ease the path to conversation.


To address the social and collaborative aspects, various recommendation approaches based on social network analysis have been proposed (Christopher et al. 2003; Jun et al. 2007). However, current social network based approaches use only network connectivity information to explore the relationships among people and lack semantic information about people’s expertise. Experts recommended by social network analysis based approaches often do not meet users’ specific needs (McDonald 2003). Thus, while such approaches can implicitly provide latent information related to expertise, they contain no explicit representation of the expertise (Balog et al. 2009). Furthermore, their performance depends on capturing the social communication of experts, such as their e-mail communication and their collaboration. Taking the collaboration of researchers as an example, this kind of information can be stored in diverse resources, such as bibliographic databases (e.g., ISI, Scopus), Internet sources and technical reports. It is usually difficult to obtain the complete network information of researchers, and some link information is lost during acquisition. Making recommendations from incomplete link information is questionable. By augmenting collaboration information with semantic information about expertise, better performance can be expected than by using social relationship information alone.

2.3 Network analysis and social network analysis

In the past few years, network analysis has attracted considerable attention from researchers in different fields, such as physics, information science and biology (Fortunato et al. 2004; Leicht et al. 2006; Otte et al. 2002). Network analysis models a real system by viewing the relationships among the elements of a complex system in terms of nodes and links, where nodes represent the elements of the system and links represent one or more types of relationships or shared characteristics among elements. Social network analysis (SNA), a subfield of network analysis, focuses on the social relationships among various entities. Its applications can be found in many areas, such as research on organizational knowledge sharing (Tsai 2002), identifying subgroup structures and working relationships in organizations (Cross et al. 2001) and investigating the structures and interactions within criminal networks (Jennifer et al. 2005).

3 A TWO-LAYER NETWORK TO RESEARCHER RECOMMENDATION

3.1 The network model

In this paper, a two-layer network is proposed to combine social network and semantic information (as shown in Figure 2). The two-layer Concept-Researcher Network is an undirected graph $CRN = (C, R, E)$, where $C$ is the set of concepts (words used to represent researchers’ expertise, e.g., knowledge management, information management, network analysis) and $R$ is the set of researchers. $E$ contains the three kinds of relationships in the network: concept-to-concept, concept-to-researcher and researcher-to-researcher. In Figure 2, the nodes in the concept layer represent research interests in the form of words or phrases. The links in this layer denote the relationships between areas of research expertise, which can be obtained from a domain ontology. The nodes in the researcher layer represent researchers, and the links between them represent some kind of social relationship, such as e-mail communication, taking part in the same project or co-authoring a paper. A link between the concept layer and the researcher layer means that a researcher shows interest in the corresponding topic.


Figure 2. The two-layer model combines social and semantic information

3.2 Semantic similarity measure of concepts

Because researchers tend to use different words or terms to indicate the same expertise, this variation must be taken into account. In this research, we use semantic analysis to measure the similarity of concepts, where concepts are the topical terms used to represent researchers’ expertise. In the past decades, various approaches have been proposed to measure the semantic similarity of concepts (Budanitsky et al. 2006; Jiang et al. 1997; Li et al. 2003; Rodriguez et al. 2003). In this research, WordNet, a well-developed lexical network for English words, is used to compute the similarity of two concepts.

3.2.1 Semantic similarity measure of simple concepts

Resnik proposed a node-based approach to measure the semantic similarity of two words (Resnik 1992; Resnik 1995). Its basic idea is that the more information two concepts share, the more similar they are. The information shared by two concepts is indicated by the information content of the concepts that subsume them. Jiang and Conrath (Jiang et al. 1997) proposed an approach that combines the edge-based approach of edge counting with the node-based approach of information content calculation. The formula can be represented as follows:

$$
\begin{aligned}
SIM_{JC}(c_1, c_2) &= 2 \times IC(lso(c_1, c_2)) - (IC(c_1) + IC(c_2)) \\
&= -2 \log p(lso(c_1, c_2)) + \log p(c_1) + \log p(c_2)
\end{aligned}
$$

Here $lso(c_1, c_2)$ represents the lowest super-ordinate of the two words, and $IC(c)$ is the information content of a concept $c$, defined as $IC(c) = \log^{-1} p(c)$, i.e., $-\log p(c)$, where $p(c)$ is the probability of encountering an instance of concept $c$:

$$p(c) = \frac{\sum_{w \in W(c)} count(w)}{N}$$

$W(c)$ is the set of words in the corpus whose senses are subsumed by concept $c$, and $N$ is the total number of word tokens in the corpus that are also present in WordNet.
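As a minimal sketch of the information-content computation above, the following Python snippet evaluates $p(c)$, $IC(c)$ and $SIM_{JC}$ on a tiny hand-made taxonomy. The taxonomy, the word counts and all function names are hypothetical and chosen only for illustration; they are not taken from WordNet or from any real corpus, and the natural logarithm is an assumption.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "science": "entity",
    "management": "entity",
    "physics": "science",
    "biology": "science",
}

# Hypothetical corpus counts of words attached directly to each concept.
COUNT = {"entity": 0, "science": 10, "management": 30, "physics": 40, "biology": 20}


def ancestors(c):
    """Return the chain [c, parent(c), ..., root]."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain


def p(c):
    """p(c): share of corpus tokens whose sense is subsumed by concept c."""
    total = sum(COUNT.values())
    subsumed = sum(n for w, n in COUNT.items() if c in ancestors(w))
    return subsumed / total


def ic(c):
    """Information content IC(c) = -log p(c) (natural log assumed)."""
    return -math.log(p(c))


def lso(c1, c2):
    """Lowest super-ordinate: first ancestor of c1 that also subsumes c2."""
    others = set(ancestors(c2))
    return next(a for a in ancestors(c1) if a in others)


def sim_jc(c1, c2):
    """SIM_JC = 2 * IC(lso(c1, c2)) - (IC(c1) + IC(c2))."""
    return 2 * ic(lso(c1, c2)) - (ic(c1) + ic(c2))
```

With this definition the score is 0 for identical concepts and becomes more negative as concepts diverge, so in the toy taxonomy `sim_jc("physics", "biology")` is larger than `sim_jc("physics", "management")`.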

Li et al. (2003) proposed another approach that uses multiple information sources. The formula can be represented as follows:

$$SIM_{LI}(c_1, c_2) = e^{-\alpha l} \cdot \frac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}}$$

where $\alpha \in [0,1]$ and $\beta \in [0,1]$ are constant parameters that scale the contributions of the shortest path length and the depth, respectively. In this formula, $l = len(c_1, c_2)$ is the length of the shortest path between the two words, and $h = depth(c_1, c_2)$ is the depth of the subsumer of the two words $c_1, c_2$ in the hierarchical semantic net.
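The behaviour of this measure can be illustrated with a short sketch. The function name `sim_li` and the default values of `alpha` and `beta` are assumptions made for illustration only; the path length `l` and subsumer depth `h` are passed in directly rather than looked up in a lexical network.

```python
import math


def sim_li(l, h, alpha=0.2, beta=0.6):
    """SIM_LI = exp(-alpha * l) * (e^(beta*h) - e^(-beta*h)) / (e^(beta*h) + e^(-beta*h)).

    l: length of the shortest path between the two words.
    h: depth of their subsumer in the hierarchy.
    alpha, beta: illustrative values in [0, 1], not prescribed by the text.
    """
    path_term = math.exp(-alpha * l)
    up, down = math.exp(beta * h), math.exp(-beta * h)
    depth_term = (up - down) / (up + down)  # identical to tanh(beta * h)
    return path_term * depth_term
```

The score decreases as the shortest path grows and increases as the common subsumer lies deeper in the hierarchy, and it always stays within [0, 1) for non-negative `l` and `h`.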


Our experiments on the best benchmark data available to date show that combining the above two approaches achieves better performance than either approach alone. Thus, in this research, we use the following formula to synthesize the two approaches:

$$SIM_{JC\_LI}(c_1, c_2) = \lambda \, SIM_{JC}(c_1, c_2) + (1 - \lambda) \, SIM_{LI}(c_1, c_2)$$

where $\lambda$ is the adjustment parameter.

3.2.2 Semantic similarity measure of complex concepts

Complex concepts are composed of simple concepts, so the similarity of complex concepts is based on the similarity measure for simple concepts. We follow Li et al. (2006) to compute the similarity of complex concepts:

$$SIM(T_1, T_2) = \delta S_s + (1 - \delta) S_r = \delta \frac{s_1 \cdot s_2}{\|s_1\| \cdot \|s_2\|} + (1 - \delta) \frac{\|r_1 - r_2\|}{\|r_1 + r_2\|}$$

where $S_s$ represents the semantic similarity and $S_r$ denotes the word order similarity, and $\delta \in [0,1]$ determines the relative contribution of semantic and word order information to the overall similarity computation. Li et al. (2006) suggest setting $\delta$ to 0.8.

3.3 Transforming the recommendation problem as a similarity measure problem

As discussed in the previous section, the weights of links at the concept layer can be calculated using an ontology (WordNet). The weights of links at the researcher layer can be calculated based on how strongly each type of relationship relates to expertise. For example, two researchers who co-author a paper demonstrate more similarity than two who only communicate via e-mail. The weight of a link between the two layers is 1 by default, meaning that a particular researcher shows expertise in the corresponding topic. Thus, a weighted graph is generated from the weights of the three kinds of relationships; see Figure 3 for an example.

Figure 3. The weighted graph

An extended matrix $M$ can be used to capture the three kinds of relationships in the two-layer network. The elements of the $C \times C$ block are the weights of the links between pairs of concepts; if no such link exists, the element is 0. The elements of the $R \times R$ block are the weights of the links between researchers; if no link exists, the element is 0. The elements of the $R \times C$ block are either 1 or 0, where 1 indicates that the researcher shows interest in the concept and 0 otherwise. The $C \times R$ block is the transpose of the $R \times C$ block.

$$
M = \begin{bmatrix} M_1 = [C \times C] & M_2^T = [C \times R] \\ M_2 = [R \times C] & M_3 = [R \times R] \end{bmatrix}
$$

Because of the way the link weights for concepts and researchers are measured, the $C \times C$ and $R \times R$ blocks are symmetric.
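To make the block layout concrete, the sketch below assembles the extended matrix from its three blocks using plain nested lists. The function name `build_extended_matrix` and the example weights (two concepts, three researchers) are hypothetical.

```python
def build_extended_matrix(m1, m2, m3):
    """Assemble M = [[M1, M2^T], [M2, M3]] from nested lists.

    m1: C x C concept-concept weights.
    m2: R x C researcher-concept links (0 or 1).
    m3: R x R researcher-researcher weights.
    """
    m2_t = [list(col) for col in zip(*m2)]  # C x R transpose of M2
    top = [row_cc + row_cr for row_cc, row_cr in zip(m1, m2_t)]
    bottom = [row_rc + row_rr for row_rc, row_rr in zip(m2, m3)]
    return top + bottom


# Hypothetical example: 2 concepts and 3 researchers.
M1 = [[0.0, 0.5],
      [0.5, 0.0]]
M2 = [[1, 0],
      [1, 1],
      [0, 1]]
M3 = [[0.0, 0.4, 0.0],
      [0.4, 0.0, 0.2],
      [0.0, 0.2, 0.0]]

M = build_extended_matrix(M1, M2, M3)
```

Concepts occupy the first rows and columns and researchers the last ones, so the researcher-to-researcher weights sit in the lower right block of `M`.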


3.3.1 Approach based on route accessibility

The similarity of researchers in the proposed two-layer network can be measured by route accessibility (Chebotarev et al. 1998), that is, how easily one researcher can reach another along the links in the network. We first look at the researcher layer only. The reachability of any two researchers in the researcher layer can be obtained from the matrix $M_3$. Matrix $M_3$ represents all direct paths from one researcher to another together with their weights. Matrix $M_3 \times M_3$ represents the strength with which one researcher is linked to another through one intermediate researcher, and matrix $M_3 \times M_3 \times M_3$ the strength through two intermediate researchers. Therefore, the total strength with which a researcher is linked to another (including direct and indirect links) can be calculated as follows:

$$I + M_3 + M_3 \times M_3 + M_3 \times M_3 \times M_3 + \dots = (I - M_3)^{-1}$$

We then extend this to the whole network, where both the researcher layer and the concept layer are considered. Two researchers can be linked by paths of one link, two links, three links, etc. If only one-link paths are considered, the similarity of two researchers is obtained directly from the matrix $S_1 = M_3$, because any walk through the concept layer takes at least two links (researcher-concept-researcher). If paths of at most two links are considered, two kinds of two-link paths exist: researcher-concept-researcher and researcher-researcher-researcher. The similarity between researchers over two-link paths is then

$$P_2 = M_3 \times M_3 + M_2 \times M_2^T$$

and the overall similarity between researchers is $S_2 = S_1 + P_2$. If paths of at most three links are considered, four kinds of three-link paths exist: researcher-researcher-concept-researcher, researcher-concept-researcher-researcher, researcher-concept-concept-researcher and researcher-researcher-researcher-researcher. The similarity between two researchers over three-link paths is

$$P_3 = M_3 \times M_2 \times M_2^T + M_2 \times M_2^T \times M_3 + M_2 \times M_1 \times M_2^T + M_3^3$$

and the overall similarity between researchers is $S_3 = S_2 + P_3$.
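The path-based similarities up to three links can be sketched with plain nested lists as follows. The helper names and the toy network are hypothetical: researcher 0 is interested only in concept 0, researcher 1 only in concept 1, the two concepts have similarity 0.5, and there is no social link between the researchers.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(*ms):
    """Add any number of equally sized matrices element by element."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]


def path_similarities(m1, m2, m3):
    """Return (S1, S2, S3) for paths of at most one, two and three links."""
    m2_t = [list(col) for col in zip(*m2)]
    s1 = m3
    p2 = matadd(matmul(m3, m3), matmul(m2, m2_t))
    s2 = matadd(s1, p2)
    p3 = matadd(
        matmul(matmul(m3, m2), m2_t),   # researcher-researcher-concept-researcher
        matmul(matmul(m2, m2_t), m3),   # researcher-concept-researcher-researcher
        matmul(matmul(m2, m1), m2_t),   # researcher-concept-concept-researcher
        matmul(matmul(m3, m3), m3),     # researcher-researcher-researcher-researcher
    )
    s3 = matadd(s2, p3)
    return s1, s2, s3


# Hypothetical toy network: 2 concepts, 2 researchers, no social links.
M1 = [[0.0, 0.5], [0.5, 0.0]]
M2 = [[1, 0], [0, 1]]
M3 = [[0.0, 0.0], [0.0, 0.0]]
S1, S2, S3 = path_similarities(M1, M2, M3)
```

In this toy case the two researchers only become similar at the three-link level (`S3[0][1]` is 0.5 while `S1[0][1]` and `S2[0][1]` are 0), because the only route between them runs through the two related concepts.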

If we consider both layers as a single graph, the link strength for any pair of nodes can be measured by

$$SIM = I + M + M \times M + M \times M \times M + \dots = (I - M)^{-1}$$

This equation is valid if and only if $\lambda_1 < 1$, where $\lambda_1$ is the spectral radius of $M$.

$S_n$ is the lower right $R \times R$ block of the matrix $SIM$, as shown in Figure 4.

$$SIM = \begin{bmatrix} \dots & \dots \\ \dots & S_n \end{bmatrix}$$

Figure 4. The relationship between $S_n$ and $SIM$

Researchers can be similar through the links in the network, but the similarity decreases as the paths linking them become longer. When attenuation is considered, the similarity is measured by

$$SIM(\lambda) = I + \lambda M + \lambda^2 M \times M + \lambda^3 M \times M \times M + \dots = (I - \lambda M)^{-1}$$

where $\lambda$ is a positive constant called the attenuation coefficient.
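A minimal sketch of the attenuated similarity is given below. It approximates $(I - \lambda M)^{-1}$ by summing the power series term by term instead of inverting the matrix, and it uses the largest absolute row sum as a simple upper bound on the spectral radius, which is a stricter condition than the one required in theory. The function names, the example matrix and the value of `lam` are assumptions for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attenuated_similarity(m, lam, tol=1e-12, max_terms=10_000):
    """Approximate (I - lam*M)^-1 by summing I + lam*M + lam^2*M^2 + ..."""
    n = len(m)
    # The largest absolute row sum bounds the spectral radius from above, so
    # lam * bound < 1 is a sufficient (not necessary) convergence check.
    bound = max(sum(abs(x) for x in row) for row in m)
    if lam * bound >= 1:
        raise ValueError("lam is too large for this simple convergence check")
    lam_m = [[lam * x for x in row] for row in m]
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for _ in range(max_terms):
        term = matmul(term, lam_m)  # next power of lam*M
        total = [[t + d for t, d in zip(tr, dr)] for tr, dr in zip(total, term)]
        if max(abs(x) for row in term for x in row) < tol:
            break
    return total


def researcher_block(sim, n_concepts):
    """S_n: the lower right R x R block of SIM."""
    return [row[n_concepts:] for row in sim[n_concepts:]]


# Hypothetical network: 2 concepts (rows 0-1) and 3 researchers (rows 2-4).
M = [
    [0.0, 0.5, 1.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.4, 0.0],
    [1.0, 1.0, 0.4, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.2, 0.0],
]
SIM = attenuated_similarity(M, lam=0.2)
S_n = researcher_block(SIM, n_concepts=2)
```

A smaller `lam` discounts long paths more strongly; for large networks a sparse linear solver would likely be preferable to this dense series summation.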

3.3.2 Spreading activation algorithm

The similarity measurement problem can be considered a graph search problem, where the start point is a particular researcher and the end point is a set of researchers. Two researchers are deemed highly similar in terms of their interests if they have strong links either through concepts or through other researchers. In this context, researchers who are similar will be connected by a comparatively large number of short paths. Conversely, researchers with different interests will be connected by fewer and longer paths.


When the number of concepts and researchers is large, it is difficult to calculate the similarity of researchers using the above method, so heuristic algorithms such as the Hopfield net algorithm are employed to simulate the activation process. The basic idea of the Hopfield net algorithm is to start from a target researcher and walk through the two-layer network along the concept-concept, researcher-concept and researcher-researcher links. The communication time of researchers in the network, or the number of times a researcher has been walked through during a limited period of time, can then be used to measure the similarity of researchers to the target researcher.

The implemented Hopfield net algorithm is described as follows (Huang et al. 2004; Lippmann 1987):

• Initialization: The activation level of the target researcher (the user to whom we recommend other researchers) is set to 1. The other nodes (the other researchers and the concepts) remain inactive, and their activation level is set to 0.

• Activation spreading and activation level computation: In each generation, a fixed number of nodes with the highest activation levels are activated. The activation level of each node is computed as

$$\mu_j(t+1) = f_s\left[\sum_{i=1}^{n} W_{ij}\, \mu_i(t)\right], \quad 1 \le i \le n$$

where $W_{ij}$ is the weight of the arc linking nodes $i$ and $j$ in the corresponding graph $G$, $\mu_j(t+1)$ is the activation level of node $j$ at time $t+1$, and $f_s$ is the sigmoid transformation function

$$f_s(x) = \frac{1}{1 + \exp((\theta_1 - x)/\theta_2)}$$

The activation level of newly activated nodes is computed based on the summation of the activation levels of their neighbors.

• Stopping criteria: The algorithm repeats until there are no significant changes between the last two iterations, that is,

$$\sum_j \mu_j(t+1) - \sum_j \mu_j(t) < \delta \times t$$

where $\delta$ is a small positive constant. The allowable change is proportional to the number of iterations performed in order to speed up convergence. The researchers with the highest activation levels in the final iteration are recommended to the target researcher, since they are considered the most similar.
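The three steps above can be sketched in a few lines of Python. This is an illustrative reading of the algorithm under stated assumptions, not the implementation used in the paper: the parameter values, the choice to keep the target researcher clamped at activation 1, the use of the absolute change in the stopping test, and the toy weight matrix are all assumptions.

```python
import math


def sigmoid(x, theta1, theta2):
    """f_s(x) = 1 / (1 + exp((theta1 - x) / theta2))."""
    return 1.0 / (1.0 + math.exp((theta1 - x) / theta2))


def spreading_activation(weights, target, top_k, theta1=0.5, theta2=0.1,
                         delta=0.01, max_iter=50):
    """Return the activation level of every node after the spreading stops."""
    n = len(weights)
    mu = [0.0] * n
    mu[target] = 1.0  # initialization: only the target researcher is active
    for t in range(1, max_iter + 1):
        raw = [sigmoid(sum(weights[i][j] * mu[i] for i in range(n)), theta1, theta2)
               for j in range(n)]
        # Only the top_k nodes with the highest activation stay active.
        active = set(sorted(range(n), key=lambda j: raw[j], reverse=True)[:top_k])
        new_mu = [raw[j] if j in active else 0.0 for j in range(n)]
        new_mu[target] = 1.0  # assumption: the target stays clamped at 1
        converged = abs(sum(new_mu) - sum(mu)) < delta * t  # absolute change assumed
        mu = new_mu
        if converged:
            break
    return mu


def recommend(mu, researcher_ids, target, k=2):
    """Rank the other researchers by their final activation level."""
    others = [r for r in researcher_ids if r != target]
    return sorted(others, key=lambda r: mu[r], reverse=True)[:k]


# Hypothetical network: nodes 0-1 are concepts, nodes 2-4 are researchers.
# Researchers 2 and 3 share concept 0 and a social link; researcher 4 is
# only reachable through the weakly related concept 1.
W = [
    [0.0, 0.2, 1.0, 1.0, 0.0],
    [0.2, 0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [1.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
]
MU = spreading_activation(W, target=2, top_k=5)
RANKED = recommend(MU, researcher_ids=[2, 3, 4], target=2)
```

In this toy network researcher 3 ends with a higher activation level than researcher 4 and is therefore ranked first. Setting `top_k` below the number of nodes prunes weakly activated nodes in each generation, which is what keeps the search tractable on large networks.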

4 CASE STUDY

4.1 Case 1

To evaluate the performance of the proposed approach, we collected all papers published at a specific Information Systems conference from 2001 to 2009 (including the author names and keywords of each paper), which involve about 1211 papers and 2369 authors. The researcher recommendation process is shown in Figure 5. The co-authorship network is constructed such that two researchers are linked if they co-authored a conference paper during this period (see Figure 6). For a particular time point, the researchers who had published papers at the conference are considered. The semantic network of keywords is constructed using the semantic analysis presented in the previous section. The keywords used by researchers are unstructured (e.g., both the abbreviated and the full form of the same concept occur, and both singular and plural forms of words are used), so the keywords are pre-processed to make them more standard. A researcher is linked to a particular keyword if the word appears in the keywords section of one of her papers. We are now conducting an experiment to evaluate whether researchers are satisfied with the recommendation results. The preliminary results show that the recommendations of the proposed approach are acceptable; further research will be conducted to evaluate its performance.
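The data preparation described above might look roughly like the following sketch. The paper records, the abbreviation table and the crude plural-stripping rule are hypothetical stand-ins for the actual pre-processing, and counting the number of co-authored papers as a link weight is an assumption (the text only states that co-authors are linked).

```python
from collections import Counter
from itertools import combinations

# Hypothetical abbreviation table; a real one would be curated per domain.
ABBREVIATIONS = {"km": "knowledge management", "sna": "social network analysis"}


def normalize_keyword(raw):
    """Lowercase, collapse whitespace, expand abbreviations, strip a plural 's'."""
    kw = " ".join(raw.lower().split())
    kw = ABBREVIATIONS.get(kw, kw)
    # Crude singularization; keeps words such as "analysis" or "business" intact.
    if kw.endswith("s") and not kw.endswith(("ss", "is")):
        kw = kw[:-1]
    return kw


def build_links(papers):
    """Return (co-authorship counts, researcher -> set of normalized keywords)."""
    coauthor = Counter()
    interests = {}
    for paper in papers:
        authors = sorted(set(paper["authors"]))
        keywords = {normalize_keyword(k) for k in paper["keywords"]}
        for a, b in combinations(authors, 2):
            coauthor[(a, b)] += 1  # assumption: weight = number of joint papers
        for a in authors:
            interests.setdefault(a, set()).update(keywords)
    return coauthor, interests


# Hypothetical paper records.
PAPERS = [
    {"authors": ["Author A", "Author B"], "keywords": ["KM", "Recommender Systems"]},
    {"authors": ["Author B", "Author C"],
     "keywords": ["knowledge management", "social network analysis"]},
    {"authors": ["Author A", "Author B"], "keywords": ["SNA"]},
]
COAUTHOR, INTERESTS = build_links(PAPERS)
```

The co-authorship counts would fill the researcher-to-researcher block of the network, and each researcher's keyword set would define the links between the researcher layer and the concept layer.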


Figure 5. The researcher recommendation process

Figure 6. A sample of collaboration network of researchers of an IS conference

4.2 Case 2

Scholarmate5 is a researchers’ professional community web site established by us. Its vision is to foster a knowledge sharing cyberspace for researchers to collect and share kinds of resources, (e.g., publications, research progress reports). Different from other scholarly communities which require researchers to type research outputs manually, it can automatically collect a particular researcher’s outputs from various sources, like IEEE, ISI, Scopus, researchers only need to confirm the collected results by a simple click. The confirmation process makes the results more reliable and relieves the name ambiguity problem. It provides a cyberspace for researchers to share professional expertise, and construct links with other stakeholders, such as governmental and non-governmental organizations.

On the Scholarmate, researchers can share their professional works in terms of publications, projects, papers with other community members and their friends, and receive comments and suggestions. Researchers can add other researchers into their contact list as friend. Besides, researchers with similar interests can collaborate via self-organized special interest group (SIG) functions. Each SIG equips with discussion boards, online chatting, document repository, e-mail and group member interaction statistics.

The existing work of researchers is a reliable source of evidence of their expertise, so we can extract researchers’ expertise from their work. Researchers can also declare their expertise directly on the Scholarmate web site. In addition, their collaboration relationships can be extracted from their existing work. Scholarmate thus provides a potential platform for testing and evaluating the proposed approach.

5 http://www.scholarmate.com


5 CONCLUSIONS AND FUTURE RESEARCH

The main contribution of this paper is the design and development of a two-layer network analysis framework, which combines social relations and common research expertise, for recommending potential research collaborators in a principled manner. Using this framework, we demonstrate how the associations between candidate researchers and topic terms can be combined with the social relations among researchers in a natural and transparent manner. Based on the proposed network model, the problem of researcher recommendation can be transformed into a network search problem.

Previous research on expertise recommendation usually falls into two separate streams. One stream explores the social relationships of experts; its underlying principle is that people tend to connect with experts with whom they have close relationships. The other investigates the content of experts’ expertise. These two streams are rarely integrated. This study takes the first step toward combining social network analysis and semantic concept analysis in a two-layer network model for research collaborator recommendation. Social network analysis captures the communication among researchers, and semantic analysis enables us to further understand the content of researchers’ expertise. Representing these two kinds of knowledge in a two-layer network provides a natural way to combine them. Researchers and their expertise are represented as nodes; the links are the co-authorship between researchers, the associations of researchers with particular expertise, and the semantic relationships between the topical terms that indicate researchers’ expertise. The concept layer of the network provides the semantics, and the social network layer provides the social communication. Recommendations are made by computing the similarity of researchers in the network.
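
To make the idea of combining the two layers concrete, the following sketch scores candidate collaborators by mixing a social-layer score with a concept-layer score. It is an illustration under stated assumptions rather than the similarity measure used in this paper: the Jaccard overlap of co-author sets, the best-match averaging of term similarities, the weight `alpha`, and the hand-made `term_sim` table (standing in for a WordNet-based measure) are all hypothetical choices.

```python
def jaccard(a, b):
    """Overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def concept_similarity(terms_a, terms_b, term_sim):
    """Symmetric average of best-match term similarities between two term sets."""
    if not terms_a or not terms_b:
        return 0.0

    def sim(x, y):
        # Identical terms match fully; otherwise look up the (assumed) table.
        return 1.0 if x == y else term_sim.get(frozenset((x, y)), 0.0)

    def directed(src, dst):
        return sum(max(sim(x, y) for y in dst) for x in src) / len(src)

    return 0.5 * (directed(terms_a, terms_b) + directed(terms_b, terms_a))


def recommend(target, coauthors, keywords_of, term_sim, alpha=0.5, top_n=3):
    """Rank researchers who are not yet co-authors of `target`."""
    known = coauthors.get(target, set())
    scores = {}
    for other in keywords_of:
        if other == target or other in known:
            continue
        social = jaccard(known, coauthors.get(other, set()))
        semantic = concept_similarity(keywords_of[target], keywords_of[other], term_sim)
        scores[other] = alpha * social + (1 - alpha) * semantic
    # Highest score first; ties broken by name for a stable order.
    return sorted(scores, key=lambda r: (-scores[r], r))[:top_n]


# Toy two-layer network: social layer (coauthors) and concept layer (keywords).
coauthors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
keywords_of = {
    "A": {"expert finding"},
    "B": {"recommender system"},
    "C": {"expertise location"},
    "D": {"neural net"},
}
term_sim = {frozenset(("expert finding", "expertise location")): 0.8}
print(recommend("A", coauthors, keywords_of, term_sim))
```

In this toy network, C ranks above D for researcher A because C shares a co-author with A and works on a semantically related term, while D is connected to A in neither layer.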

Many communities (virtual or face-to-face) have emerged in academic contexts to help researchers share knowledge and communicate with each other. Some researchers have recognized that communities are a powerful platform for supporting knowledge management. We created and developed Scholarmate as a platform that allows researchers to collect, share, and discuss their work. It also enables researchers to communicate with each other through “friends” and special interest groups. The proposed approach can be incorporated into the current version of Scholarmate to recommend potential research collaborators to users based on their specific interests and research expertise.

Other kinds of information, such as domain-specific ontologies and social network sites, could further inform the proposed approach. A domain-specific ontology defines the set of basic concepts that make up the vocabulary of a domain and the relationships among these concepts. Beyond using WordNet to capture the semantic relationships among researchers’ expertise expressed as words or phrases, a domain-specific ontology enables us to capture the domain knowledge of a specific area. The rapid growth of social network sites enables us to discover other kinds of social relations among researchers, such as colleagues, classmates, and friends. Incorporating this additional social information may complement the proposed approach from other perspectives. Privacy concerns should also be handled with care.

References

Balog, K., Azzopardi, L., and de Rijke, M. "A language modeling framework for expert finding," Information Processing & Management, 45(1), 2009, 1-19.

Becerra-Fernandez, I. "The role of artificial intelligence technologies in the implementation of People-Finder knowledge management systems," Knowledge-Based Systems, 13(5), 2000, 315-320.

Budanitsky, A., and Hirst, G. "Evaluating WordNet-based measures of lexical semantic relatedness," Computational Linguistics, 32(1), 2006, 13-47.

Cachia, R., Compano, R., and Da Costa, O. "Grasping the potential of online social networks for foresight," Technological Forecasting and Social Change, 74(8), 2007, 1179-1203.


Cameron, D., Aleman-Meza, B., and Arpinar, I.B. "Collecting expertise of researchers for finding relevant experts in a peer-review setting," 1st International ExpertFinder Workshop, Berlin, Germany, 2007.

Chebotarev, P., and Shamis, E. "On Proximity Measures for Graph Vertices," Automation and Remote Control, 59(10), 1998, 1443-1459.

Christopher, S.C., Paul, P.M., Alex, C., and Byron, D. "Expertise identification using email communications," in: Proceedings of the twelfth international conference on Information and knowledge management, ACM, New Orleans, LA, USA, 2003.

Cross, R., Rice, R.E., and Parker, A. "Information seeking in social context: structural influences and receipt of information benefits," Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 31(4), 2001, 438-448.

Ehrlich, K., Lin, C.Y., and Griffiths-Fisher, V. "Searching for experts in the enterprise: combining text and social network analysis," in: Proceedings of the 2007 international ACM conference on Supporting group work, ACM, Sanibel Island, Florida, USA, 2007.

Fortunato, S., Latora, V., and Marchiori, M. "Method to find community structures based on information centrality," Physical Review E, 70(5), 2004.

Huang, Z., Chen, H., and Zeng, D. "Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering," ACM Transactions on Information Systems, 22(1), 2004, 116-142.

Huang, Z., Chen, H.C., Guo, F., Xu, J.J., Wu, S.S., and Chen, W.H. "Expertise visualization: An implementation and study based on cognitive fit theory," Decision Support Systems, 42(3), 2006, 1539-1557.

Jahnke, I., and Koch, M. "Web 2.0 goes academia: Does Web 2.0 make a difference?," International Journal of Web Based Communities, 5(4), 2009, 484-500.

Jennifer, X., and Hsinchun, C. "Criminal network analysis and visualization," Communications of the ACM, 48(6), 2005, 100-107.

Jiang, J.J., and Conrath, D.W. "Semantic similarity based on corpus statistics and lexical taxonomy," Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 1997, 19-33.

Jun, Z., Mark, S.A., and Lada, A. "Expertise networks in online communities: structure and algorithms," in: Proceedings of the 16th international conference on World Wide Web, ACM, Banff, Alberta, Canada, 2007.

Kautz, H., Selman, B., and Shah, M. "Referral web: Combining social networks and collaborative filtering," Communications of the ACM, 40(3), 1997, 63-65.

Leicht, E.A., Holme, P., and Newman, M.E.J. "Vertex similarity in networks," Physical Review E, 73(2), 2006.

Lepak, D.P., and Snell, S.A. "The human resource architecture: Toward a theory of human capital allocation and development," Academy of Management Review, 24(1), 1999, 31-48.

Levy, M. "WEB 2.0 implications on knowledge management," Journal of Knowledge Management, 13(1), 2009, 120-134.

Li, Y.H., Bandar, Z.A., and McLean, D. "An approach for measuring semantic similarity between words using multiple information sources," IEEE Transactions on Knowledge and Data Engineering, 15(4), 2003, 871-882.

Li, Y.H., McLean, D., Bandar, Z.A., O'Shea, J.D., and Crockett, K. "Sentence similarity based on semantic nets and corpus statistics," IEEE Transactions on Knowledge and Data Engineering, 18(8), 2006, 1138-1150.

Lippmann, R. "An introduction to computing with neural nets," ASSP Magazine, IEEE, 4(2), 1987, 4-22.

Liu, P., Curson, J., and Dew, P. "Use of RDF for expertise matching within academia," Knowledge and Information Systems, 8(1), 2005, 103-130.

McDonald, D.W. "Recommending collaboration with social networks: a comparative evaluation," in: Proceedings of the SIGCHI conference on Human factors in computing systems, ACM, Ft. Lauderdale, Florida, USA, 2003.


McDonald, D.W., and Ackerman, M.S. "Just talk to me: a field study of expertise location," in: Proceedings of the 1998 ACM conference on Computer supported cooperative work, ACM, Seattle, Washington, United States, 1998.

McDonald, D.W., and Ackerman, M.S. "Expertise recommender: a flexible recommendation system and architecture," in: Proceedings of the 2000 ACM conference on Computer supported cooperative work, ACM, Philadelphia, Pennsylvania, United States, 2000.

Moeslein, K., Bullinger, A., and Soeldner, J. "Open Collaborative Development: Trends, Tools, and Tactics," in: Human-Computer Interaction. New Trends, 2009, 874-881.

Otte, E., and Rousseau, R. "Social network analysis: a powerful strategy, also for the information sciences," Journal of Information Science, 28(6), 2002, 441-453.

Perugini, S., Goncalves, M.A., and Fox, E.A. "Recommender systems research: A connection-centric survey," Journal of Intelligent Information Systems, 23(2), 2004, 107-143.

Razmerita, L., Kirchner, K., and Sudzina, F. "Personal knowledge management: The role of Web 2.0 tools for managing knowledge at individual and organisational levels," Online Information Review, 33(6), 2009, 1021-1039.

Reichling, T., Veith, M., and Volker, W. "Expert Recommender: Designing for a Network Organization," Computer Supported Cooperative Work (CSCW), 16(4), 2007, 431-465.

Resnik, P. "WordNet and distributional analysis: A class-based approach to lexical discovery," Proceedings of the AAAI Symposium on Probabilistic Approaches to Natural Language, San Jose, CA, 1992.

Resnik, P. "Using information content to evaluate semantic similarity in a taxonomy," Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, 1995, 448-453.

Rodriguez, M.A., and Egenhofer, M.J. "Determining semantic similarity among entity classes from different ontologies," IEEE Transactions on Knowledge and Data Engineering, 15(2), 2003, 442-456.

Schwartz, M.F., and Wood, D.C.M. "Discovering Shared Interests Using Graph Analysis," Communications of the ACM, 36(8), 1993, 78-89.

Tsai, W. "Social Structure of "Coopetition" Within a Multiunit Organization: Coordination, Competition, and Intraorganizational Knowledge Sharing," Organization Science, 13(2), 2002, 179-190.
