wikitrust: turning wikipedia quantity into quality b. thomas adler, luca de alfaro, and ian pye
![Page 1: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/1.jpg)
WikiTrust: Turning Wikipedia
Quantity into Quality
B. Thomas Adler, Luca de Alfaro, and Ian Pye
![Page 2: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/2.jpg)
•Wikipedia:
•3,000,000+ Articles
•1,000,000,000+ Revisions
Our Goal: Crowd-sourcing community consensus
![Page 3: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/3.jpg)
Vandalism
•Prevents Wikipedia being taken fully seriously
•Harder to use Wikipedia in schools
•Harder to make static selections
![Page 4: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/4.jpg)
Vandalism Detection
Given a new revision, classify it as Vandalism or Regular.
•Zero-delay: Use only the features available at the time the revision is created (no lookahead).
•Historical: Use the full set of WikiTrust features, including how the revision is treated by subsequent authors (lookahead).
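The two modes differ only in which features the classifier may see. A minimal sketch of that split, assuming features arrive as a dictionary; the key names are hypothetical labels for features listed later in the slides, not WikiTrust identifiers:

```python
from enum import Enum


class Mode(Enum):
    ZERO_DELAY = "zero-delay"   # no lookahead
    HISTORICAL = "historical"   # lookahead allowed


# Hypothetical feature keys, named after the slides' feature lists.
ZERO_DELAY_FEATURES = {
    "time_since_previous_edit",
    "previous_trust_histogram",
    "current_trust_histogram",
    "histogram_difference",
}
LOOKAHEAD_FEATURES = {
    "next_comment_length",
    "author_reputation",
    "min_revision_quality",
    "avg_revision_quality",
    "max_dissent",
}


def select_features(all_features: dict, mode: Mode) -> dict:
    """Keep only the features that the given mode is allowed to use."""
    allowed = set(ZERO_DELAY_FEATURES)
    if mode is Mode.HISTORICAL:
        allowed |= LOOKAHEAD_FEATURES
    return {k: v for k, v in all_features.items() if k in allowed}
```

A zero-delay classifier built this way can run the moment an edit is saved, while the historical one has to wait for later revisions to exist.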
![Page 5: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/5.jpg)
Revision Selection
Given an article, select the “best” revision to show to a user.
•Wikipedia 1.0 Project: Aims to extract a static snapshot of Wikipedia.
•Use in Schools, Developing Countries, OLPC Project.
![Page 6: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/6.jpg)
Core Concepts
•Wikipedia Article
•Many Revisions
•1 Author per Revision
•Author has Reputation, Revision has Trust.
•Binary Classifier: Either A or B.
![Page 7: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/7.jpg)
Zero Day Features
•Author is Anonymous (Turns out we don’t care)
•Time interval after the previous edit (Useful, but only as a predicate time > 12 seconds)
•Time of day of edit (Not used)
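The time-interval feature is used only as a yes/no test against a 12-second threshold. A minimal sketch of that predicate, assuming edit timestamps are Python `datetime` objects; the function name is hypothetical:

```python
from datetime import datetime, timedelta

# Threshold taken from the slide; the feature is used only as a boolean.
MIN_INTERVAL = timedelta(seconds=12)


def edited_after_interval(prev_edit: datetime, this_edit: datetime) -> bool:
    """Return True when more than 12 seconds passed since the previous edit."""
    return (this_edit - prev_edit) > MIN_INTERVAL
```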
![Page 8: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/8.jpg)
Zero Day Features
•Difference from previous revisions (Not really)
•Comment Length (Nope)
![Page 9: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/9.jpg)
Zero Day Features (we care about these)
•Previous Text Trust Histogram
•Current Text Trust Histogram
•Histogram Difference
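The three histogram features can be sketched as follows. This assumes each word carries an integer trust value from 0 to 9 (the bin count is an assumption, not stated on the slide), and the function names are hypothetical:

```python
from collections import Counter

NUM_BINS = 10  # assumption: word trust is an integer from 0 to 9


def trust_histogram(word_trusts):
    """Count how many words fall in each trust bin."""
    counts = Counter(word_trusts)
    return [counts.get(b, 0) for b in range(NUM_BINS)]


def histogram_difference(previous, current):
    """Per-bin change in word count from the previous to the current revision."""
    return [c - p for p, c in zip(previous, current)]


prev_hist = trust_histogram([9, 9, 8, 5, 5, 0])
curr_hist = trust_histogram([9, 9, 8, 1, 1, 1, 1])
diff = histogram_difference(prev_hist, curr_hist)
```

In this toy input the edit removes mid-trust words and adds several low-trust ones, so the difference is positive in bin 1 and negative in bins 0 and 5.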
![Page 10: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/10.jpg)
Text Trust
•New text starts with a trust value proportional to the author's reputation.
•Text can gain trust when revised.
•Cut-and-paste, deletions result in local trust loss.
•We remember deleted text and its trust.
![Page 11: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/11.jpg)
A Sequence of Differences
•For revisions $v_1, v_2, v_3, \ldots$ of a wiki article, word trust is computed from the difference between $v_i$ and $v_{i-1}$
•How did we arrive at the current version of an article?
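Walking the revision history pairwise can be sketched with a word-level diff. This uses Python's standard `difflib` as a stand-in; it is not WikiTrust's own differencing algorithm, and the function name is hypothetical:

```python
import difflib


def inserted_words(prev_text, curr_text):
    """Return the words of curr_text that the diff marks as new."""
    prev_words = prev_text.split()
    curr_words = curr_text.split()
    matcher = difflib.SequenceMatcher(a=prev_words, b=curr_words, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(curr_words[j1:j2])
    return added


# Compare each revision with its immediate predecessor.
revisions = ["the cat sat", "the black cat sat", "the black cat sat down"]
history = [inserted_words(p, c) for p, c in zip(revisions, revisions[1:])]
```

Each entry of `history` lists the words whose trust would start fresh in that revision; everything else is inherited from the previous one.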
![Page 12: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/12.jpg)
Text Trust: The Algorithm Illustrated
1) Trust of new text
![Page 13: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/13.jpg)
Text Trust: The Algorithm Illustrated
1) Trust of new text
2) New block borders have the same trust as new text
![Page 14: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/14.jpg)
Text Trust: The Algorithm Illustrated
1) Trust of new text
2) New block borders have the same trust as new text
3) The revision effect increases the trust of existing text
![Page 15: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/15.jpg)
Text Trust: The Algorithm Illustrated
1) Trust of new text
2) New block borders have the same trust as new text
3) The revision effect increases the trust of existing text
4) Note: this is not a new border
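The numbered steps can be sketched as a single pass over the words of the new revision. This is a simplified illustration, not WikiTrust's implementation: the `Word` type, the 0.6 and 0.2 coefficients, and the choice to cap border trust so it never rises are all assumptions:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    old_trust: Optional[float]    # None for newly inserted text
    at_new_border: bool = False   # True next to a cut, paste, or deletion


def update_trust(words: List[Word], author_rep: float,
                 new_text_scale: float = 0.6,
                 revision_gain: float = 0.2) -> List[float]:
    """Return the new trust of each word after one revision."""
    new_text_trust = new_text_scale * author_rep
    result = []
    for w in words:
        if w.old_trust is None:
            # Step 1: new text starts proportional to author reputation.
            trust = new_text_trust
        elif w.at_new_border:
            # Step 2: a new block border is treated like new text
            # (capped here so the border never gains trust; an assumption).
            trust = min(w.old_trust, new_text_trust)
        elif author_rep > w.old_trust:
            # Step 3: the revision effect pulls existing text toward
            # the reputation of the author who left it in place.
            trust = w.old_trust + revision_gain * (author_rep - w.old_trust)
        else:
            # Existing text keeps its trust if the author's reputation is lower.
            trust = w.old_trust
        result.append(trust)
    return result
```

Words next to an old, unchanged boundary (step 4) carry `at_new_border=False`, so they fall through to the revision-effect branch rather than losing trust.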
![Page 16: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/16.jpg)
Zero Day Features (we care about these)
•Previous Text Trust Histogram
•Current Text Trust Histogram
•Histogram Difference
![Page 17: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/17.jpg)
Historical Features
•Next revision comment length (length > 110 chars)
•Next revision comment has the word revert in it (too noisy)
![Page 18: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/18.jpg)
Historical Features
•Author Reputation (How do other users judge this user’s edits?)
![Page 19: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/19.jpg)
Historical Features
•Minimum Revision Quality
•Average Revision Quality
•Maximum Dissent
![Page 20: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/20.jpg)
Historical Features
•Total Weight of Judges (not at all)
![Page 21: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/21.jpg)
ROC AUC Scoring
Probability that a binary classifier scores a randomly chosen positive example above a randomly chosen negative one
•>0.90 = Excellent
•0.8 - 0.9 = Good
•< 0.8 = Poor
•0.5 = Expected result from flipping a coin
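ROC AUC can be computed directly from its ranking interpretation: the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. A minimal sketch, assuming labels are 1 for vandalism and 0 for regular; the quadratic loop is for clarity, not speed:

```python
def roc_auc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
example = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A classifier that assigns the same score to everything ties every pair and lands at 0.5, the coin-flip baseline.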
![Page 22: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/22.jpg)
Results (PAN 2010)
ROC of 0.937
![Page 23: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/23.jpg)
Results (PAN 2010)
ROC of 0.937 X
ROC of 0.914 ?
![Page 24: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/24.jpg)
Results (PAN 2010)
ROC of 0.937 X
ROC of 0.904 ?
![Page 25: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/25.jpg)
Other Directions
•Wikipedia 1.0
•Vandalism API
•Newsgroup Reputation
•IP Address Reputation
![Page 26: WikiTrust: Turning Wikipedia Quantity into Quality B. Thomas Adler, Luca de Alfaro, and Ian Pye](https://reader031.vdocuments.us/reader031/viewer/2022020319/56649f355503460f94c53aa3/html5/thumbnails/26.jpg)
Revision Quality
The fraction of change that is in the same direction as the future.
• Qual = 1: $v_j$ is a totally good edit
• Qual = -1: $v_j$ is reverted
• -1 ≤ Qual ≤ 1
Figure: revisions $v_i$ (the past), $v_j$, and $v_k$ (the future); “work done” is labeled $d(v_i, v_j)$ and “progress” is labeled $d(v_i, v_j) - d(v_j, v_k)$.
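A minimal sketch of the quality computation, as progress divided by work done. It assumes progress is measured as d(v_i, v_k) - d(v_j, v_k), which reproduces the two extreme cases listed on the slide (Qual = 1 when v_k equals v_j, Qual = -1 when v_k reverts to v_i); the garbled figure label may read differently. The distance function is a `difflib`-based stand-in, not WikiTrust's edit distance:

```python
import difflib


def distance(a: str, b: str) -> int:
    """Stand-in word-level distance: unmatched words in a plus those in b."""
    aw, bw = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=aw, b=bw, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return (len(aw) - matched) + (len(bw) - matched)


def revision_quality(v_i: str, v_j: str, v_k: str) -> float:
    """Quality of the edit v_i -> v_j as judged by a later revision v_k."""
    work = distance(v_i, v_j)
    if work == 0:
        return 0.0  # assumption: an edit that changes nothing is neutral
    progress = distance(v_i, v_k) - distance(v_j, v_k)
    # Clamp, since the stand-in distance need not satisfy the triangle inequality.
    return max(-1.0, min(1.0, progress / work))


kept = revision_quality("a b c", "a b c d", "a b c d")   # edit fully preserved
reverted = revision_quality("a b c", "a b c d", "a b c")  # edit undone
```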