Mendeley's Data and Perspectives on Data Challenges
DESCRIPTION
Presentation given at the RecSysChallenge workshop (http://2012.recsyschallenge.com/) at Recommender Systems 2012 (http://recsys.acm.org/2012/).
TRANSCRIPT
![Page 1: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/1.jpg)
Mendeley's Data and Perspectives on Data Challenges
Kris Jack, PhD
Chief Data Scientist
https://twitter.com/_krisjack
![Page 2: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/2.jpg)
Overview
➔ What's Mendeley?
➔ Why Run Challenges?
➔ Mendeley's Challenges
➔ Conclusions
![Page 3: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/3.jpg)
What's Mendeley?
![Page 4: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/4.jpg)
➔ Mendeley is a platform that connects researchers, research data and apps
➔ How are we building our community?
Mendeley Open API
![Page 5: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/5.jpg)
Mendeley provides tools to help users...
...organise their research
➔ Reference management
➔ Cite-as-you-write
➔ Full-text article search
➔ Digitalised annotations
![Page 6: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/6.jpg)
Mendeley provides tools to help users...
...organise their research
...collaborate with one another
➔ Professional research groups
➔ Social network
➔ Annotation sharing
![Page 7: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/7.jpg)
Mendeley provides tools to help users...
...organise their research
...collaborate with one another
...discover new research
➔ Personalised article recommendations
➔ Related research
➔ Research contact suggestions
![Page 8: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/8.jpg)
Our community from a data perspective
Social network (~2M users)
Research catalogue (~50M unique articles)
Research groups (~175K groups)
Personal libraries (~300M articles)
![Page 9: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/9.jpg)
Why Run Challenges?
![Page 13: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/13.jpg)
Why Run Challenges?
➔ An important part of our mission is to make science more open
“All the time we are very conscious of the huge challenges that human society has now – curing cancer, understanding the brain for Alzheimer's [...].
But a lot of the state of knowledge of the human race is sitting in the scientists' computers, and is currently not shared [...] We need to get it unlocked so we can tackle those huge problems.”
➔ We run challenges that aim to open up science
➔ Your skills in information sciences are valuable to us
![Page 14: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/14.jpg)
Mendeley's Challenges
![Page 15: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/15.jpg)
PLoS/Mendeley's Binary Battle
Challenge: Build an application with our data that makes science more open.
Results:
More details at http://dev.mendeley.com/api-binary-battle/
![Page 16: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/16.jpg)
ScienceRec Challenge 2012
Challenge: Build an off-line system for scientific recommendations with our API and the DataTEL data set
Results: Will discuss today
How to improve for the future?
More details at http://2012.recsyschallenge.com/tracks/sciencerec/
50K users, with at least 20 articles each
![Page 17: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/17.jpg)
Conclusions
![Page 18: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/18.jpg)
Conclusions
➔ Mendeley makes tools that help researchers to:
  ➔ organise their research
  ➔ collaborate with one another
  ➔ discover new research
➔ We are crowdsourcing a wealth of research data
➔ We're opening it up to the world
➔ And inviting you to participate
![Page 19: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/19.jpg)
We're Hiring!
➔ Data Scientist
  ➔ apply recommender technologies to Mendeley's data
  ➔ work on improving the quality of Mendeley's research catalogue
  ➔ starting in the first quarter of 2013
  ➔ 6-month secondment at KNOW Center, TU Graz, Austria as part of the EC FP7 TEAM project (http://team-project.tugraz.at/)
➔ http://www.mendeley.com/careers/
![Page 20: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/20.jpg)
www.mendeley.com
![Page 21: Mendeley's Data and Perspectives on Data Challenges](https://reader035.vdocuments.us/reader035/viewer/2022081401/5565f8a2d8b42a20158b5267/html5/thumbnails/21.jpg)
A Challenge for the Future?
Challenge: Investigate how well algorithms perform in real-world settings
Motivation: Industry repeatedly finds that aggressive A/B testing is required because offline improvements do not necessarily translate to online improvements
Problem: Academia tends not to have access to large online communities
Solution: Industry runs A/B tests with academic algorithms and reports the results
What about privacy?
➔ Use publicly available data
➔ Anonymise and aggregate the results reported
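To make the proposal above concrete, here is a minimal Python sketch of how an industry partner might report A/B test results for academic algorithms in anonymised, aggregated form. Nothing in it comes from Mendeley's systems: the event format `(user_id, variant, accepted)`, the function names `aggregate_ab_results` and `two_proportion_z`, and the `min_users` suppression threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def aggregate_ab_results(events, min_users=2):
    """Summarise raw A/B events per variant without exposing user ids.

    events: iterable of (user_id, variant, accepted) tuples (hypothetical
    format), where accepted is True if the user accepted a recommendation.
    Variants seen by fewer than min_users distinct users are suppressed,
    so that a reported figure cannot be traced back to one person.
    """
    users = defaultdict(set)
    shown = defaultdict(int)
    accepted = defaultdict(int)
    for user_id, variant, was_accepted in events:
        users[variant].add(user_id)
        shown[variant] += 1
        accepted[variant] += int(was_accepted)

    report = {}
    for variant in shown:
        if len(users[variant]) < min_users:
            continue  # too few users to report safely
        report[variant] = {
            "users": len(users[variant]),
            "shown": shown[variant],
            "accepted": accepted[variant],
            "rate": accepted[variant] / shown[variant],
        }
    return report


def two_proportion_z(a, b):
    """Pooled two-proportion z statistic for the acceptance rates of a and b."""
    pooled = (a["accepted"] + b["accepted"]) / (a["shown"] + b["shown"])
    se = sqrt(pooled * (1 - pooled) * (1 / a["shown"] + 1 / b["shown"]))
    return (b["rate"] - a["rate"]) / se if se > 0 else 0.0


if __name__ == "__main__":
    # Toy events: variant A is a baseline, B an academic algorithm under test.
    events = [
        ("u1", "A", True), ("u2", "A", False), ("u1", "A", False), ("u3", "A", False),
        ("u4", "B", True), ("u5", "B", True), ("u6", "B", False), ("u4", "B", False),
    ]
    report = aggregate_ab_results(events)
    print(report)
    print(two_proportion_z(report["A"], report["B"]))
```

Only the per-variant counts and rates leave the function, which is the sense of "aggregate" assumed here. A real deployment would need a much larger suppression threshold and a proper privacy review.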