Mendeley: crowdsourcing and recommending research on a large scale

Kris Jack, PhD, Data Mining Team Lead

Summary

➔ what is mendeley?

➔ crowdsourcing on a large scale

➔ recommendations on a large scale

➔ data for you

Mendeley is...

...a startup company

...going to change the way that we do research...

Mendeley provides tools to help users...

...organise their research

...collaborate with one another

...discover new research


Last.fm works like this:

1) Install “Audioscrobbler”

2) Listen to music

3) Last.fm builds your music profile and recommends music you might also like

and it’s the world’s largest open music database!

Last.fm vs. Mendeley

Last.fm: music libraries, artists, genres

Mendeley: research libraries, researchers, papers, disciplines

Screenshot taken from www.mendeley.com on 04/09/11

Mendeley is the world’s largest crowdsourced research catalogue!

Catalogue Crowdsourcing: System Requirements

assimilate research artefacts into catalogue in real time (PDFs + citation metadata)

recognise duplicate and non-duplicate artefacts in noisy input

[Diagram: articles → catalogue generator → catalogue]

Main types of input:

→ article PDFs

→ article metadata (e.g. reference)

Main sources of input:

→ Mendeley Desktop

→ Mendeley Web Importer

→ External catalogue imports (e.g. arXiv)

→ External catalogue lookups (e.g. CrossRef)

Catalogue generator:

→ Cluster documents together

→ Generate catalogue entries

Process:

→ Filehash check (SHA-1)

→ Identifier check (e.g. PubMed id)

→ Document fingerprint (full text)

→ Metadata similarity check

→ Update individual article page
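The process above runs from the cheapest, most certain check to the most expensive, fuzziest one. The sketch below illustrates that ordering only. The `Article` class, the token-overlap (Jaccard) stand-in for the full-text fingerprint, the use of the title as the only metadata field, and both thresholds are assumptions made for illustration, not details taken from the slides.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical record for one incoming artefact."""
    pdf_bytes: bytes = b""
    identifiers: dict = field(default_factory=dict)  # e.g. {"pmid": "123"}
    full_text: str = ""
    title: str = ""


def file_hash(article):
    """SHA-1 of the raw PDF bytes, or None when no file was supplied."""
    if not article.pdf_bytes:
        return None
    return hashlib.sha1(article.pdf_bytes).hexdigest()


def tokens(text):
    """Lower-cased alphanumeric tokens, used as a crude text signature."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Overlap of two token sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_article(a, b, text_threshold=0.9, title_threshold=0.8):
    """Return the name of the first check that matches, or None."""
    # 1) Identical files hash to the same SHA-1 digest.
    ha, hb = file_hash(a), file_hash(b)
    if ha is not None and ha == hb:
        return "filehash"
    # 2) A shared identifier type with the same value.
    shared = set(a.identifiers) & set(b.identifiers)
    if any(a.identifiers[k] == b.identifiers[k] for k in shared):
        return "identifier"
    # 3) Assumed stand-in for a full-text fingerprint: token overlap.
    if a.full_text and b.full_text:
        if jaccard(tokens(a.full_text), tokens(b.full_text)) >= text_threshold:
            return "fingerprint"
    # 4) Assumed metadata similarity: token overlap of the titles.
    if a.title and b.title:
        if jaccard(tokens(a.title), tokens(b.title)) >= title_threshold:
            return "metadata"
    return None
```

Returning the name of the matching check, rather than a bare boolean, makes it easy to see which stage is doing the work on noisy input.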

Catalogue with:

→ article metadata

→ aggregated statistics

→ support for recommendations, etc.
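As a rough illustration of how a cluster of user-supplied records could become one catalogue entry with metadata and an aggregated statistic, here is a minimal sketch. The field names (`title`, `year`, `user`), the majority-vote rule, and the `readers` count are all assumptions for the example; the slides do not say how entries are actually merged.

```python
from collections import Counter


def build_catalogue_entry(cluster):
    """Merge one cluster of user-supplied records into a single entry.

    `cluster` is a list of dicts with optional "title", "year" and
    "user" keys (hypothetical field names).
    """
    entry = {}
    for key in ("title", "year"):
        values = [rec[key] for rec in cluster if rec.get(key)]
        if values:
            # Majority vote: the most frequently supplied value wins.
            entry[key] = Counter(values).most_common(1)[0][0]
    # Aggregated statistic: number of distinct users holding the article.
    entry["readers"] = len({rec["user"] for rec in cluster if "user" in rec})
    return entry
```

Voting across many noisy copies is one simple way to let the crowd correct individual metadata errors.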

➔ what does this mean for you?


Article Recommendation: System Requirements

generate personal article recommendations for users (i.e. “here are some articles that may interest you”)

update recommendations every 24 hours

Output: Recommend 10 articles to each user

Input: User libraries

Recommendation through collaborative filtering

Article is in library or not (i.e. binary input)

Various similarity metrics (e.g. co-occurrence, log-likelihood, Tanimoto)
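To make the setup above concrete, here is a minimal item-based collaborative filtering sketch over binary input (an article is either in a library or not), with co-occurrence and Tanimoto as interchangeable similarity functions. The function names, the toy data layout, and the scoring rule (sum of similarities to the articles a user already holds) are assumptions for illustration, not the production recommender.

```python
from collections import defaultdict


def article_readers(libraries):
    """Invert {user: set(articles)} into {article: set(users)}."""
    readers = defaultdict(set)
    for user, articles in libraries.items():
        for article in articles:
            readers[article].add(user)
    return readers


def cooccurrence(readers, a, b):
    """Number of libraries that contain both articles."""
    return len(readers[a] & readers[b])


def tanimoto(readers, a, b):
    """Shared libraries divided by libraries containing either article."""
    union = len(readers[a] | readers[b])
    return len(readers[a] & readers[b]) / union if union else 0.0


def recommend(libraries, user, similarity=tanimoto, n=10):
    """Rank articles the user lacks by summed similarity to owned ones."""
    readers = article_readers(libraries)
    owned = libraries[user]
    scores = defaultdict(float)
    for candidate in readers:
        if candidate in owned:
            continue
        for article in owned:
            scores[candidate] += similarity(readers, article, candidate)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article for article, score in ranked if score > 0][:n]
```

Co-occurrence favours popular articles because it counts raw overlap, while Tanimoto normalises by how widely each article is held.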

16 months ago

Test: 10-fold cross validation, 50,000 user libraries

Results: <0.025 precision at 10
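Precision at 10 is the fraction of the 10 recommended articles that turn out to be relevant, so a score of 0.025 means 0.25 relevant articles per list of 10 on average. The sketch below shows the metric and one simple way to form 10 folds. Treating held-out library articles as the relevant set, and the round-robin fold assignment, are assumptions; the slides do not describe the exact protocol.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommendations found in the held-out set.

    Dividing by k (not by the list length) penalises short lists.
    """
    top_k = recommended[:k]
    hits = sum(1 for article in top_k if article in relevant)
    return hits / k


def k_fold_splits(items, folds=10):
    """Yield (train, test) pairs; item i is held out in fold i % folds."""
    items = list(items)
    for fold in range(folds):
        test = [x for i, x in enumerate(items) if i % folds == fold]
        train = [x for i, x in enumerate(items) if i % folds != fold]
        yield train, test
```

Averaging `precision_at_k` over every fold and every user gives a single figure comparable to the ones reported on these slides.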

10 months ago (i.e. + 6 months)

Results: ~0.1 precision at 10

Test: Release to a subset of users

Results: ~0.4 precision at 10

[Figure: Article Recommendation Acceptance Rates, plotted against number of months live]

Article Recommendation: System Requirements

generate personal article recommendations for users (i.e. “here are some articles that may interest you”)

update recommendations every 24 hours

1 million users!

How to scale up?

So, results comparable to non-distributed recommender

Completely distributed, so can easily run on EC2 within 24 hours...
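One reason co-occurrence-based recommendation distributes well is that pair counts from separate libraries can be computed independently and then summed in any order. The sketch below shows that map/reduce shape in plain Python on a single machine. It is an illustration of the idea under that assumption, not the framework or job layout actually run on EC2.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_library(articles):
    """Map step: count every unordered article pair within one library."""
    return Counter(combinations(sorted(set(articles)), 2))


def reduce_counts(left, right):
    """Reduce step: merge two partial pair counts by addition."""
    return left + right


def cooccurrence_counts(libraries):
    """Total pair counts over an iterable of libraries."""
    return reduce(reduce_counts, map(map_library, libraries), Counter())
```

Because `reduce_counts` is associative and commutative, the libraries can be partitioned across workers and the partial results merged at the end without changing the totals.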

[Figure: Article Recommendation Precision Across User Library Sizes (using cooccurrence); x-axis: number of articles in user library]

How will real users react?

➔ data for you


Public Data

user libraries, library readership, library stars

50,000 libraries

4,848,724 articles

3,652,285 unique articles

Obtain from: http://dev.mendeley.com/datachallenge

Mendeley's API

www.mendeley.com
