Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet
![Page 1: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/1.jpg)
http://www.plantnet-project.org/
Alexis Joly, Hervé Goëau, Julien Champ, Samuel Dufour-Kowalski, Henning Müller, Pierre Bonnet
Acknowledgement: Nozha Boujemaa, Daniel Barthelemy, Jean-François Molino
![Page 2: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/2.jpg)
Context
• Global warming, food crisis and biodiversity erosion
• Accurate knowledge of living species distribution and evolution is essential
• Ultimate goal: sustainable and global biodiversity monitoring tools
  – Surveillance of global warming consequences, plant & animal diseases, human activities impact, invasive species propagation
• The taxonomic impediment
  – Fewer and fewer people can identify plants and animals
  – Fewer and fewer nature observers can produce biodiversity data
![Page 3: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/3.jpg)
Pl@ntNet project (launched 2010)
Bridging the taxonomic impediment thanks to an innovative
crowdsourcing workflow based on automated plant identification
![Page 4: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/4.jpg)
Pl@ntNet project (launched 2010)
The positive feedback loop does work!
![Page 5: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/5.jpg)
Pl@ntNet app today
2.5 M downloads
14 M sessions
10-50 K users / day
150 countries
5 languages
FR, EN, ES, IT, PT, DE, AR, ZH, SK
![Page 6: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/6.jpg)
Pl@ntNet data
Validated data = 3% of the queried plant images
- 30K collaboratively revised observations per year (TelaBotanica)
- Publicly available through international initiatives (GBIF, LifeCLEF)
- Validation is a slow and hard process
![Page 7: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/7.jpg)
Pl@ntNet data
Unlabeled data = 97% of the raw query stream
- More than 1 million observations per year (5.1M today)
- Not exploited today
- A high potential for biodiversity monitoring
![Page 8: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/8.jpg)
Pl@ntNet mobile search logs
![Page 9: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/9.jpg)
Species Distribution Modelling from UGC (user-generated content) image streams?
Can we predict (real-time and/or long-term) Species Distribution Models directly from Pl@ntNet mobile search logs?
Or from any other UGC image stream?
![Page 10: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/10.jpg)
Challenges
1. Improve recognition in open-world streams
![Page 11: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/11.jpg)
Recognizing plants in an open world
An open-set recognition problem
- With tens of thousands of known and unknown classes
- Highly imbalanced training data
We carried out an evaluation within LifeCLEF 2016
- Training set of 1000 known species (113K pictures)
- Test set = 8K manually annotated Pl@ntNet queries (half known, half distractors)
- Classification Mean Average Precision on a subset of 26 invasive species
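The slides do not spell out how the Mean Average Precision is computed, so the following is only a minimal sketch of the usual definition: for each target species, test queries are ranked by the classifier's score for that species, and distractors simply count as non-relevant items. The function names and the input layout are assumptions made for illustration, and the sketch assumes every relevant query appears somewhere in the ranking.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list.

    ranked_relevance: list of booleans, best-scored query first,
    True when the query really shows the target species.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_species_rankings):
    """Mean of the per-species average precisions.

    per_species_rankings: dict mapping a species name to its ranked
    relevance list (hypothetical layout, not the LifeCLEF file format).
    """
    aps = [average_precision(r) for r in per_species_rankings.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

A distractor ranked above a true observation lowers the precision at every later hit, which is why open-world noise hurts this metric directly.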
![Page 12: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/12.jpg)
Recognizing plants in an open world
1. Improve automatic recognition of plants in open-world streams
- Novelty affects all systems, whatever rejection method is used (even supervised)
- No rejection method can deal with strong novelty rates
→ We are still far from being able to monitor biodiversity in Twitter or Snapchat streams!
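To make the notion of a rejection method concrete, here is a minimal sketch of the simplest one: thresholding the top softmax probability and answering "unknown" below it. This is a generic baseline chosen for illustration, not the method evaluated in the slides; the function names and the default threshold of 0.5 are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(logits, labels, threshold=0.5):
    """Return (label, confidence), or (None, confidence) when rejected.

    The query is rejected as "unknown" when the best class probability
    falls below `threshold` (a hypothetical value, to be tuned).
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return labels[best], probs[best]
```

A threshold like this only helps when unknown classes produce flat score vectors; with strong novelty rates many distractors still receive confident scores, which matches the slide's observation.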
![Page 13: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/13.jpg)
Challenges
1. Improve recognition in open-world streams
2. Use geo-location and date
![Page 14: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/14.jpg)
Geo-location and date?
- Not so easy!
  - No real success within 5 years of the PlantCLEF challenge
- Why?
  - Plant distributions are not well known (this is actually our objective!)
  - Habitats are extremely heterogeneous from one species to another (some plants live everywhere while others live in very specific biotopes)
- What can we do?
  - Big occurrence data (like GBIF) might help but is biased, heterogeneous and incomplete (no absence data)
  - Environmental variables might help but are heterogeneous, incomplete, noisy, etc.
→ This will be one of the focuses of LifeCLEF 2017
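One naive way to use occurrence data is to treat local occurrence counts as a prior and multiply it with the visual scores. The sketch below does that with additive smoothing so that species never recorded nearby are not ruled out entirely. It is an illustration only, not something the slides propose; the function name, the `occurrence_counts` layout and the `smoothing` parameter are all assumptions, and the biases listed above apply directly to such a prior.

```python
def rerank_with_spatial_prior(visual_probs, occurrence_counts, smoothing=1.0):
    """Combine classifier probabilities with a local occurrence prior.

    visual_probs: dict species -> probability from the image classifier.
    occurrence_counts: dict species -> number of occurrence records near
        the query location (hypothetical input, e.g. from an occurrence
        database). Species missing from the dict count as zero.
    smoothing: additive smoothing, keeps unseen species possible.
    """
    species = list(visual_probs)
    total = sum(occurrence_counts.get(s, 0) for s in species)
    denom = total + smoothing * len(species)
    prior = {s: (occurrence_counts.get(s, 0) + smoothing) / denom
             for s in species}
    unnormalized = {s: visual_probs[s] * prior[s] for s in species}
    z = sum(unnormalized.values())
    return {s: v / z for s, v in unnormalized.items()}
```

Because there is no absence data, a zero count means "not reported", not "not present", which is the reason for the smoothing term.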
![Page 15: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/15.jpg)
Challenges
1. Improve recognition in open-world streams
2. Use geo-location and date
3. Use taxonomy
![Page 16: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/16.jpg)
Using taxonomy?
Taxonomy = a hierarchical classification built by botanists over hundreds of years
→ 600 families > 14K genera > 300K species
But taxonomy is highly heterogeneous and imbalanced
→ Classical hierarchical classification algorithms cannot be directly used
- Some genera contain up to 1000 very similar species
- But many genera and families include very distinct species
- The long-tail distribution occurs at each level and in each node
[Figure labels: Genus Orobanche; Genus Bupleurum; Family Bupleurum]
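A simple way to exploit the hierarchy without a dedicated hierarchical classifier is to sum species-level probabilities up to the genus or family level, so that a confident genus can be reported even when the species is uncertain. The sketch below is an illustration under that assumption; the function name, the mapping layout and the probabilities in the usage example are invented.

```python
from collections import defaultdict


def aggregate_to_rank(species_probs, parent_of):
    """Sum species probabilities into their parent taxon.

    species_probs: dict species -> probability.
    parent_of: dict species -> parent taxon name (genus or family).
    """
    totals = defaultdict(float)
    for species, p in species_probs.items():
        totals[parent_of[species]] += p
    return dict(totals)


# Illustrative usage with invented probabilities.
example_probs = {
    "Orobanche minor": 0.30,
    "Orobanche hederae": 0.25,
    "Bupleurum falcatum": 0.45,
}
example_genus = {
    "Orobanche minor": "Orobanche",
    "Orobanche hederae": "Orobanche",
    "Bupleurum falcatum": "Bupleurum",
}
genus_probs = aggregate_to_rank(example_probs, example_genus)
```

In this example the top species belongs to Bupleurum, yet the genus Orobanche collects more total probability, which shows how a genus of many similar species can dominate once scores are aggregated.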
![Page 17: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/17.jpg)
Challenges
1. Improve recognition in open-world streams
2. Use geo-location and date
3. Use taxonomy
4. Optimize and boost training data production
![Page 18: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/18.jpg)
Pro-active crowdsourcing
Classifier (CNN)
Annotators (heterogeneous skills)
Task selection & assignment?
![Page 19: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/19.jpg)
[Figure labels: Training (Beginner); Training (Intermediate)]
1. ConvNet predictions
2. Create quizzes by Monte Carlo sampling
3. Sort quizzes by difficulty (= success expectation across all workers)
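The three steps above can be sketched as follows, under strong assumptions, since the slides give no detail: a quiz is taken to be a set of candidate species drawn without replacement in proportion to the ConvNet probabilities, and difficulty is taken to be the mean success rate across workers. All function names and data layouts are hypothetical.

```python
import random


def sample_quiz(predictions, n_choices, rng):
    """Draw up to n_choices distinct candidate species for one quiz.

    predictions: dict species -> ConvNet probability (all positive).
    rng: a random.Random instance, passed in for reproducibility.
    """
    remaining = dict(predictions)
    choices = []
    while remaining and len(choices) < n_choices:
        names = list(remaining)
        weights = [remaining[n] for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        choices.append(pick)
        del remaining[pick]  # sample without replacement
    return choices


def expected_success(quiz_id, worker_success):
    """Mean estimated success rate on a quiz across all workers."""
    rates = [r[quiz_id] for r in worker_success.values() if quiz_id in r]
    return sum(rates) / len(rates) if rates else 0.0


def sort_by_difficulty(quiz_ids, worker_success):
    """Order quizzes from easiest (highest expected success) to hardest."""
    return sorted(quiz_ids,
                  key=lambda q: expected_success(q, worker_success),
                  reverse=True)
```

Sorting from easiest to hardest would let beginners start at the head of the list and more advanced workers further down; that assignment policy is also an assumption.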
![Page 20: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/20.jpg)
Experiments: Simpson's paradox
[Figure: axes "Identification success rate" and "Declared expertise"]
Workers are assigned tasks they have been trained on before!
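A small numeric sketch may help show how Simpson's paradox can arise when task assignment depends on the worker's level. The counts below are invented for illustration and are not results from the experiment: experts do better than beginners at every difficulty level, but because they mostly receive hard quizzes their overall success rate is lower.

```python
# Invented counts: (correct answers, quizzes attempted) per difficulty.
results = {
    "expert":   {"easy": (19, 20), "hard": (48, 80)},
    "beginner": {"easy": (72, 80), "hard": (8, 20)},
}


def rate(correct, total):
    """Success rate for one group of quizzes."""
    return correct / total


def overall_rate(per_level):
    """Success rate pooled over all difficulty levels."""
    correct = sum(c for c, _ in per_level.values())
    total = sum(t for _, t in per_level.values())
    return correct / total
```

With these counts, experts score 0.95 against 0.90 on easy quizzes and 0.60 against 0.40 on hard ones, yet 0.67 against 0.80 overall. Comparing success rates across declared expertise therefore requires controlling for which tasks each group received.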
![Page 21: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/21.jpg)
Challenges
1. Improve recognition in open-world streams
2. Use geo-location and date
3. Use taxonomy
4. Optimize and boost data validation processes
5. Control bias in Species Distribution Models
![Page 22: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/22.jpg)
Modeling bias factors?
Objective: estimate the relative abundance $A_{ij}$ of species $i$ in place $j$, supposing
$N_{ij} \sim \mathrm{Law}(A_{ij}, B_{ij})$
- $N_{ij}$: number of observations of $i$ in $j$
- $A_{ij}$: abundance of $i$ in $j$
- $B_{ij}$: bias, which might be complex because of the diversity of contributors, the opportunistic nature of the observations, and identification confusions
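The slides leave the law unspecified. As a minimal sketch, assume the simplest possible case: the number of observations of species i in place j is Poisson with mean equal to abundance times bias, and the bias depends only on the place (observer effort), not on the species. Under that assumption the bias cancels when counts are normalized within each place, so the share of each species among a place's observations estimates its relative abundance. The function name and data layout are hypothetical, and species-dependent bias (conspicuous or popular plants, confusions) would break this estimate.

```python
def relative_abundance(counts):
    """Estimate relative abundance per place from raw observation counts.

    counts: dict place -> dict species -> number of observations.
    Returns dict place -> dict species -> share of observations.
    Assumes the bias is a per-place multiplicative effort factor.
    """
    estimates = {}
    for place, per_species in counts.items():
        total = sum(per_species.values())
        estimates[place] = {
            s: (n / total if total else 0.0)
            for s, n in per_species.items()
        }
    return estimates
```

For instance, a place with counts 30 and 10 and a place with counts 3 and 1 give the same shares (0.75 and 0.25), even though the first place received ten times the observation effort.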
![Page 23: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/23.jpg)
Conclusion: biodiversity informatics needs multimedia (MM)

| Biodiversity dimension | Biodiversity conservation challenge | Who? | Multimedia research topics |
| --- | --- | --- | --- |
| Aesthetic | Enjoy and love it | Everybody | IR, Recommendation |
| Diverse | Identify and classify | Taxonomists | Multimodal & large-scale classification |
| Complex | Decipher & model | Biologists | Multimedia data analytics |
| Unknown | Discover & associate | Taxonomists | Multimedia data mining |
| Endangered | Define & implement policies | Decision makers | Visualization, Interactivity |
| Indispensable | Use sustainably | Everybody | Cross-media streams monitoring |
![Page 24: Crowdsourcing Biodiversity Monitoring: How Sharing your Photo Stream can Sustain our Planet](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a29bd91a28ab36508b7931/html5/thumbnails/24.jpg)
Thank you