

Value of a coordinate: geographic analysis of agricultural biodiversity

Andy Jarvis, Julian Ramirez, Nora Castañeda, Samy Gaiji, Luigi Guarino, Hector Tobón, and Daniel Amariles

Contents

• Why crop wild relatives?
• How a coordinate can help us complete the collections
• Cleaning coordinate data
• Needs from standards

Wild relatives of crops

• Include both progenitor species and closely related species of cultivated crops
• Faba beans – 0 wild relatives
• Potato – 172 wild relative species
• Increasingly useful in breeding, especially for biotic resistance

Florunner, with no root-knot nematode resistance

COAN, with population density of root-knot nematodes >90% less than in Florunner

Wild relative species

A. batizocoi - 12 germplasm accessions

A. cardenasii - 17 germplasm accessions

A. diogoi - 5 germplasm accessions

Gap Analysis: Strategies to fill the holes in our seed collections

The Gap Analysis road map

Taxonomy review → Data gathering → Georeferencing → Environmental data gathering → Gap Analysis process → Final recommendations

The Gap Analysis process

Proxy for:

• Range of traits

Proxy for:

• Diversity

• Possibly biotic traits

Proxy for:

• Abiotic traits

http://gisweb.ciat.cgiar.org/gapanalysis/

Crop            Genus        # species      G       H    Total   Avg. records/species
Barley          Hordeum             27   1419   10965    12384   459
Bean            Phaseolus           72   2435    2952     5387    75
Chickpea        Cicer               23    314      19      333    14
Cowpea          Vigna               64   2509    6306     8815   138
Faba bean       Vicia                9    511     949     1460   162
Finger millet   Eleusine             7      3      68       71    10
Maize           Zea                  4    228     143      371    93
Pearl millet    Pennisetum          54    963    3409     4372    81
Pigeon pea      Cajanus             26    197     601      798    31
Sorghum         Sorghum             31    320    4138     4458   144
Wheat           Aegilops            23   4016    2231     6247   272
Wheat           Triticum             3   1374       1     1375   458

Total number of herbarium specimens (H) and germplasm accessions (G) available for each major crop wild relative genepool through the GBIF portal

Environmental coverage

[Figure: environmental coverage maps. Labels: Herbarium; Germplasm; No germplasm; Deficient germplasm; Potential richness; Rare environments]

Which species, and where

Wild Vigna collecting priorities

• Spatial analysis on current conserved materials

• *Gaps* in current collections

• Definition and prioritisation of collecting areas

• Eight 100 × 100 km cells to complete collections of 23 wild Vigna priority species
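One way to arrive at a short list of collecting cells is a greedy cover: repeatedly pick the grid cell that adds the most priority species not yet covered. The sketch below is only an illustration under that assumption; it is not necessarily the method used for the Vigna analysis, and the cell names and species sets are hypothetical.

```python
def pick_cells(cell_species, targets):
    """Greedy cover: keep taking the cell that adds the most uncovered target species."""
    remaining = set(targets)
    chosen = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical by cell name)
        best = max(sorted(cell_species), key=lambda c: len(cell_species[c] & remaining))
        gained = cell_species[best] & remaining
        if not gained:
            break  # the remaining species occur in no candidate cell
        chosen.append(best)
        remaining -= gained
    return chosen, remaining

# Hypothetical data: species predicted to occur in each 100 x 100 km cell
cells = {
    "A": {"sp1", "sp2", "sp3"},
    "B": {"sp3", "sp4"},
    "C": {"sp4", "sp5"},
    "D": {"sp1"},
}
chosen, missed = pick_cells(cells, {"sp1", "sp2", "sp3", "sp4", "sp5"})
print(chosen, missed)
```

A greedy cover is not guaranteed to give the smallest possible set of cells, but it is simple and usually close.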

Richness in collecting zones at genepool level

Predicted change in species richness to 2050.

Exploration and ex-situ conservation of Capsicum flexuosum

• Uncommon species of wild chili, found in Paraguay and Argentina, historically used by local indigenous communities

• 18 known records of the plant prior to this work

• 2 germplasm accessions conserved at the USDA

• GIS used to target field collections

OBJECTIVE: Locate and collect germplasm of this species in Paraguay

• 6 new collections of C. flexuosum

• 160 seeds conserved ex situ

Behind all this

Data Quality

The GBIF database: status of the data

• The database holds 177,887,193 occurrences
• Plantae occurrences are 44,706,505 (25.13%)
• Of those, 33,340,000 (74.5%) have coordinates
• How many of them are correct and reliable?
• How many new georeferences could we get?

CURRENT STATUS OF THE Plantae RECORDS

The GBIF database: status of the data

• How to make the terrestrial data reliable enough?
– Verify coordinates at different levels
  • Are the records where they say they are?
  • Are the records inside land areas (for terrestrial plant species only)?
  • Are all the records within the environmental niche of the taxon?
– Correct wrong references
– Add coordinates to those that lack them
– Cross-check with curators and feed back to the database
• Using a random sample of 950,000 occurrences with coordinates
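The third check, whether a record falls within the environmental niche of its taxon, can be approximated by flagging outliers in an environmental variable sampled at each coordinate. The sketch below uses a plain interquartile-range rule; the rule, the threshold `k` and the temperature values are assumptions for illustration, not the procedure used on the GBIF sample.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the indices of values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical mean annual temperature (deg C) at the coordinates of one taxon's records
temps = [21.5, 22.0, 22.4, 23.1, 21.9, 22.8, 23.0, 4.2]
print(iqr_outliers(temps))  # indices of records to send back for checking
```

A flagged record is only a candidate for review: it may be a wrong coordinate, or a genuine population in an unusual environment.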

• Are the records where they say they are? Country-level verification

Records mostly located in country boundaries

Inaccuracies in coordinates

Records with null country: 58,051 (6.11% of total)
Records with incorrect country: 6,918 (0.72% of total)
Total excluded by country: 64,969 (6.83% of total)
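A minimal sketch of a country-level check, assuming each record carries a country code plus a latitude and longitude. Real verification would test the point against country polygons; here very coarse, approximate bounding boxes stand in for them, and the `BOXES` table, field names and sample records are all hypothetical.

```python
# Approximate bounding boxes (min_lat, max_lat, min_lon, max_lon); illustrative only
BOXES = {
    "PY": (-27.6, -19.3, -62.7, -54.2),
    "CO": (-4.3, 12.5, -79.0, -66.8),
}

def check_country(record):
    """Classify a record as 'ok', 'null country', 'incorrect country' or 'unknown country code'."""
    code = record.get("country")
    if not code:
        return "null country"
    box = BOXES.get(code)
    if box is None:
        return "unknown country code"
    min_lat, max_lat, min_lon, max_lon = box
    inside = min_lat <= record["lat"] <= max_lat and min_lon <= record["lon"] <= max_lon
    return "ok" if inside else "incorrect country"

records = [
    {"country": "PY", "lat": -25.3, "lon": -57.6},  # coordinates inside the stated country
    {"country": "CO", "lat": -25.3, "lon": -57.6},  # stated country does not match
    {"country": None, "lat": 4.6, "lon": -74.1},    # no country given
]
results = [check_country(r) for r in records]
print(results)
```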

• Are the terrestrial plant species on land? Coastal verification

Errors, and more errors

Records in the ocean: 9,866 (1.03% of total)
Records near land (range 5 km): 34,347 (3.61% of total)
Records outside of mask: 369 (0.04% of total)
Total excluded by mask: 44,582 (4.69% of total)
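The coastal check can be sketched as a lookup in a gridded land mask. In the toy example below each cell is assumed to be about 5 km across, so "near land" means a sea cell that touches a land cell. The mask, the cell size and the use of grid indices instead of real coordinates are assumptions for illustration.

```python
# Toy land/sea mask: "L" = land, "." = sea; each cell is assumed to be ~5 km wide
MASK = [
    "....",
    "..LL",
    ".LLL",
]

def classify(row, col, mask=MASK):
    """Return 'land', 'near land', 'ocean' or 'outside mask' for a grid cell."""
    rows, cols = len(mask), len(mask[0])
    if not (0 <= row < rows and 0 <= col < cols):
        return "outside mask"
    if mask[row][col] == "L":
        return "land"
    # A sea cell counts as near land if any of its 8 neighbours is land
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols and mask[r][c] == "L":
                return "near land"
    return "ocean"

print(classify(1, 2), classify(1, 1), classify(0, 0), classify(5, 5))
```

Records that fall "near land" are good candidates for a small coordinate correction rather than outright exclusion.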

Not so bad at all… stats

• 44,706,505 plant records
• 33,340,008 (74.57%) with coordinates
• Of those:
– 88.5% are geographically correct at two levels
– 6.8% have a null or incorrect country (incl. marine plant species)
– 4.7% are near the coast but not on land

Summary of errors or misrepresented data

TOTAL EVALUATED RECORDS: 950,000

Good records: 840,449 (88.47% of total)
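The tally can be reproduced from the counts reported for the two checks; a short sketch using only the figures from the slides:

```python
total = 950_000

# Country-level check: null country + incorrect country
excluded_country = 58_051 + 6_918

# Coastal check: in the ocean + near land (5 km) + outside of mask
excluded_mask = 9_866 + 34_347 + 369

good = total - excluded_country - excluded_mask
share = 100 * good / total
print(good, round(share, 2))
```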

RESULTING DATABASE

Next steps

• It now takes 27 minutes to verify 950,000 records; 177 million would take about 83 hours (3½ days)
• Identify terrestrial plant species and separate them from marine species
• Use a georeferencing algorithm to:
– Correct wrong references
– Add location data to records with NULL lat, lon
• Interpret 2nd- and 3rd-level administrative boundaries and use them too
• Implement environmental cross-checking (outliers)
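The run-time estimate in the first bullet follows from simple proportional scaling, sketched below. It assumes verification time grows linearly with the number of records, which may not hold on real hardware.

```python
minutes_per_batch = 27       # measured time for one sample
batch_size = 950_000         # records in that sample
all_records = 177_000_000    # approximate size of the full database

hours = all_records / batch_size * minutes_per_batch / 60
days = hours / 24
print(f"{hours:.1f} hours, {days:.1f} days")
```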

Geo-referencing: BioGeomancer

http://bg.berkeley.edu/

Conclusions

• A coordinate can tell us a lot: it can answer a number of interesting research questions and solve a lot of problems
• The agricultural world is sadly behind the mainstream biodiversity world
– Data not online, not available
– Databases not connected
• Quality of coordinate data is critical:
– We need the concept of precision included
– We need fields such as location descriptions, and 2nd- and 3rd-level administrative descriptions, for georeferencing
– We need effective two-way communication for verifying, correcting and assigning coordinates, from nodes to indexes and vice versa
• Economy of scale