MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015
![Page 1: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/1.jpg)
The CERTH-UNITN Participation @ Verifying Multimedia Use 2015
Christina Boididou¹, Symeon Papadopoulos¹, Duc-Tien Dang-Nguyen², Giulia Boato², and Yiannis Kompatsiaris¹
MediaEval 2015 Workshop, Sept 14-15, 2015, Wurzen, Germany
This task is supported by the REVEAL EC FP7 Project.
¹ Information Technologies Institute (ITI), CERTH, Greece
² University of Trento, Italy
![Page 2: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/2.jpg)
Overview
Approach
• Use of tweet-based, user-based and forensics features
• Supervised learning (SL) scheme
• Semi-supervised learning scheme, called agreement-based retraining technique (SSL-AR)
Aim
• Predict whether a tweet that shares multimedia content is fake or real
![Page 3: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/3.jpg)
Features
Features used in the experiments

| Feature set | Description |
|---|---|
| TB-base | Baseline tweet-based |
| TB-ext | Extended tweet-based |
| UB-base | Baseline user-based |
| UB-ext | Extended user-based |
| FOR | Forensics |

Types
• Tweet-based: information coming from the tweet and its metadata
• User-based: information and metadata about the user posting (or retweeting) the tweet
• Multimedia forensics: based on the image that accompanies the tweet
Sets
• Baseline (base) set: features shared by the task
• Extended (ext) set: newly extracted features
• Forensics (FOR) set: features distributed by the task plus some additional ones
![Page 4: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/4.jpg)
Additional Features

| Tweet-based | User-based | Forensics |
|---|---|---|
| Contains word "please" | Account age | AJPG-BAG combined |
| Has external link | Number of media content | NAJPG-BAG combined |
| Number of slang words | Shares location | |
| Number of nouns | Shares location that exists¹ | |
| Readability² | | |

For the links
• Web Of Trust (WOT) score
• In-degree centrality³
• Harmonic centrality³
• Alexa rankings

¹ Geonames dataset (http://download.geonames.org/export/)
² Flesch Reading Ease method, which computes the complexity of a piece of text as a score in the interval [0, 100]
³ Common Crawl WWW Ranking (http://wwwranking.webdatacommons.org/more.html)
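
As an illustration of the readability feature, the sketch below computes the standard Flesch Reading Ease score, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). The syllable counter is a rough vowel-group heuristic and the clamping to [0, 100] is an assumption made here to match the interval stated on the slide; the function names are hypothetical and this is not necessarily how the feature was extracted in the experiments.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease, clamped to [0, 100] (clamping is an assumption)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    score = (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )
    return min(100.0, max(0.0, score))


# Higher scores mean easier text.
easy = flesch_reading_ease("The cat sat.")
hard = flesch_reading_ease("Please share this unbelievable photograph immediately.")
print(easy, hard)
```

Short, simple tweets land near the top of the range, while tweets with long, polysyllabic words land near the bottom.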
![Page 5: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/5.jpg)
Additional Forensics Features
[Diagram: AJPG map → thresholding → binary map (‘Object’ mask); mask and BAG → ‘Object’ features and ‘Background’ features → AJPG-BAG combined]
• NAJPG-BAG was combined in the same way from NAJPG and BAG features.
![Page 6: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/6.jpg)
Agreement-based retraining method
• Make the initial model adaptable
• Predict more accurately the labels of the samples on which the classifiers disagree
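
A minimal sketch of how agreement-based retraining could work, under several assumptions: two classifiers (CL1, CL2) are trained on two different feature views, the test samples on which they agree are added to the training set with their predicted labels, and CL1 is retrained to label the disagreed samples. The choice of retraining CL1, the function names, and the toy data are all hypothetical; a nearest-centroid classifier stands in for the Random Forest used in the runs so that the example needs only the standard library.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], int]
Trainer = Callable[[List[Vector], List[int]], Predictor]


def train_nearest_centroid(X: List[Vector], y: List[int]) -> Predictor:
    """Stand-in classifier for the demo (the slides use Random Forest)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(x: Vector) -> int:
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def agreement_retraining(
    train: Trainer, X1_train, X2_train, y_train, X1_test, X2_test
) -> Tuple[List[int], List[int]]:
    """Return (final predictions, indices of disagreed test samples)."""
    cl1 = train(list(X1_train), list(y_train))  # e.g. tweet-based view
    cl2 = train(list(X2_train), list(y_train))  # e.g. user-based view
    p1 = [cl1(x) for x in X1_test]
    p2 = [cl2(x) for x in X2_test]
    agreed = [i for i in range(len(p1)) if p1[i] == p2[i]]
    disagreed = [i for i in range(len(p1)) if p1[i] != p2[i]]

    # Assumption: retrain CL1 on the training set plus the agreed test samples,
    # using the agreed predictions as labels.
    X_new = list(X1_train) + [X1_test[i] for i in agreed]
    y_new = list(y_train) + [p1[i] for i in agreed]
    cl1_retrained = train(X_new, y_new)

    final = list(p1)
    for i in disagreed:
        final[i] = cl1_retrained(X1_test[i])
    return final, disagreed


# Toy data: label 0 = real, 1 = fake; one feature per view.
X1_train = [[0.0], [0.2], [1.0], [1.2]]
X2_train = [[0.0], [0.1], [1.0], [0.9]]
y_train = [0, 0, 1, 1]
X1_test = [[0.4], [0.5], [1.6], [1.8], [0.7]]
X2_test = [[0.0], [0.1], [1.0], [0.9], [0.2]]

final, disagreed = agreement_retraining(
    train_nearest_centroid, X1_train, X2_train, y_train, X1_test, X2_test
)
print(final, disagreed)
```

In the toy data the last test sample is the only one on which the two views disagree; after retraining on the agreed samples, the adapted model changes its label from fake to real, which illustrates the "adaptable model" idea on the slide.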
![Page 7: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/7.jpg)
Bagging
Training set
• N = 9
• Equal number of samples from each class
• Final result averaged over the predictors
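
The bagging setup above can be sketched as follows. This is an illustration under assumptions, not the exact procedure used for the runs: each of the N = 9 subsets is drawn without replacement, every class contributes as many samples as the smallest class has, and the averaged 0/1 votes are thresholded at 0.5. The function names are hypothetical, and the models are arbitrary callables that return 0 (real) or 1 (fake).

```python
import random
from typing import Callable, List, Sequence, Tuple


def balanced_subsets(
    X: List[Sequence[float]], y: List[int], n_subsets: int = 9, seed: int = 0
) -> List[Tuple[list, list]]:
    """Draw n_subsets training subsets with an equal number of samples per class."""
    rng = random.Random(seed)
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    # Assumption: per-class size equals the size of the smallest class.
    per_class = min(len(items) for items in by_class.values())
    subsets = []
    for _ in range(n_subsets):
        Xs, ys = [], []
        for label, items in by_class.items():
            for x in rng.sample(items, per_class):
                Xs.append(x)
                ys.append(label)
        subsets.append((Xs, ys))
    return subsets


def bagged_predict(models: List[Callable], x) -> Tuple[float, int]:
    """Average the 0/1 outputs of all models; an average >= 0.5 is labelled fake (1)."""
    average = sum(model(x) for model in models) / len(models)
    return average, int(average >= 0.5)


# Imbalanced toy data: 6 real (0) and 3 fake (1) samples.
X = [[float(i)] for i in range(9)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
subsets = balanced_subsets(X, y)
print(len(subsets), [labels.count(0) for _, labels in subsets])
```

One model would then be trained per subset and `bagged_predict` would combine their outputs.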
![Page 8: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/8.jpg)
Submitted Runs

| Run | Learning | Features |
|---|---|---|
| RUN-1 | SL | TB-base |
| RUN-2 | SL | TB-base + FOR |
| RUN-3 | SSL-AR | (TB-base + FOR) + UB-base |
| RUN-4 | SL | TB-ext + UB-ext + FOR |
| RUN-5 | SSL-AR | (TB-ext + FOR) + UB-ext |

• RUN-1, RUN-2 and RUN-4: plain classification model
• RUN-3 and RUN-5: agreement-based retraining technique
• Random Forest classifier used for all models
• In the SSL-AR runs, CL1 and CL2 mark the two feature groups: (CL1 features) + CL2 features

SL: Supervised Learning
SSL-AR: Semi-Supervised Learning with Agreement-based Retraining
![Page 9: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/9.jpg)
Results

| Run | Features | Recall | Precision | F-score |
|---|---|---|---|---|
| RUN-1 | TB-base | 0.794 | 0.733 | 0.762 |
| RUN-2 | TB-base + FOR | 0.749 | 0.994 | 0.854 |
| RUN-3 | (TB-base + FOR) + UB-base | 0.922 | 0.736 | 0.819 |
| RUN-4 | TB-ext + UB-ext + FOR | 0.798 | 0.860 | 0.828 |
| RUN-5 | (TB-ext + FOR) + UB-ext | 0.969 | 0.861 | 0.911 |

A. RUN-5 achieved the best F-score (0.911)
B. The SSL-AR technique improves performance considerably
C. RUN-2 outperforms RUN-1: contribution of the FOR features
D. RUN-3 vs. RUN-5: contribution of the ext features
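
The reported F-scores are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The short check below recomputes them from the rounded table values; differences of up to 0.001 are expected because the inputs are already rounded to three decimals. The variable and function names are illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F-score), copied from the results table.
results = {
    "RUN-1": (0.794, 0.733, 0.762),
    "RUN-2": (0.749, 0.994, 0.854),
    "RUN-3": (0.922, 0.736, 0.819),
    "RUN-4": (0.798, 0.860, 0.828),
    "RUN-5": (0.969, 0.861, 0.911),
}

for run, (recall, precision, reported) in results.items():
    recomputed = f_score(precision, recall)
    print(f"{run}: recomputed {recomputed:.4f}, reported {reported:.3f}")
```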
![Page 10: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/10.jpg)
Examples
Fake example classified as real
Fake example classified as fake
![Page 11: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/11.jpg)
Conclusions / Future Work
Features
• ext features perform better than base ones
• FOR features improve performance
Agreement-based retraining technique
• improves accuracy
• adapts to the new data
• requires a number of test samples to be applied
Future Ideas
• Experiment with other sets of features
• Perform feature selection
• Adapt the method to be applied with fewer samples
![Page 12: MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015](https://reader034.vdocuments.us/reader034/viewer/2022042706/588145b61a28abf65a8b700d/html5/thumbnails/12.jpg)
Questions
Thank you for your attention!