bien.nceas.ucsb.edubien.nceas.ucsb.edu/bien/wp-content/uploads/2012/03/bien_rarity…  · web...

The Distribution of Commonness and Rarity at Continental Scales: Ecoinformatic Insights into the Botanical Rarity of the Americas

Brian J. Enquist, Brad Boyle, John Donoghue, Barbara Thiers, Peter Jorgensen, Brian McGill, Rick Condit, Lindsey Sloat, Naia Morueta-Holme, others? and the BIEN group.

Since the early writings of von Humboldt, Darwin, and Wallace (1, 2), the role of commonness and rarity in shaping the diversity of life has been a recurring theme in biodiversity science (3-5). Assessing patterns of commonness and rarity can elucidate the ecological and evolutionary mechanisms driving large-scale gradients in biodiversity (6, 7), has implications for defining conservation priorities (8), and is perhaps the best predictor of extinction risk under climate and land-use change (9, 10). A prominent hypothesis (1, 2, 11) underlying our understanding of diversity gradients is that the greater diversity of the tropics is due to an increase in rare taxa, which is thought to reflect the various ecological and evolutionary forces influencing standing stocks of species richness (12, 13). Nonetheless, efforts to quantify the total number of species, test competing models for the origin and maintenance of diversity gradients, and assess patterns of rarity at ever larger scales using ever more massive databases are impeded by our limited ability to accurately assess which species are indeed rare (3).

We use perhaps the most fundamental measure of abundance: the number of times a given taxon has been independently observed at a given location. Using the absolute number of observations as a measure of rarity raises several potential issues. Integrating large, disparate, and heterogeneous datasets involves overcoming numerous challenges of data exchange, interoperability, and scaling (14, 15). Indeed, measures of commonness and rarity can be biased and at worst grossly incorrect (16, 17) because of three problems associated with analyzing biodiversity data. First, data integration is limited by the joint taxonomic and informatics challenge of correcting and harmonizing species names in large databases (18). Synonyms and misspelled names can greatly bias estimates of the number of species or the characterization of rare taxa (19). Second, estimates of rarity in large databases can be corrupted by the presence of non-native taxa and recent taxonomic splits. Third, rarity estimates can be biased by incomplete sampling of local or regional floras and/or taxa. Together, these errors can bias estimates of diversity and of the proportion of taxa that are rare (20). These issues are increasingly problematic at larger spatial and temporal scales, where biodiversity data often stem from multiple data sources and sampling periods. As a result, our knowledge of large-scale patterns of rarity is often based on summaries of expert opinion (8) or on extrapolating climate correlations to large scales (21), and not on analysis of primary biodiversity data (species observations, occurrence, and distribution data).

Here, we use new informatics tools to quantify commonness-and-rarity of the plants of the Americas and to map geographic patterns of rarity. We assess the hypothesis that a larger number of rare species are found in the tropics. We integrate large and disparate botanical data sources that together provide a detailed sampling of the botanical diversity of North and South America. These data include: (i) herbarium collections; (ii) differing ecological plots and surveys; and (iii) trait measurements (see Supplemental Information). Together, these sources constitute the primary biodiversity data, in which each record is an observation of a given taxon. For a given taxon, we used the total number of unique observations across these data sources as a measure of commonness and rarity.
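
The counting step described above can be sketched as follows. This is a minimal illustration only: the record layout `(record_id, source, species)`, the placeholder species names, and the helper names are assumptions made for the example and are not taken from the BIEN database. The threshold of three observations is the one used later in the text.

```python
from collections import Counter

# Hypothetical pooled records: (record_id, source, species).
# All values are illustrative placeholders, not BIEN data.
records = [
    ("r1", "herbarium", "Species alpha"),
    ("r2", "herbarium", "Species alpha"),
    ("r2", "herbarium", "Species alpha"),  # duplicate of r2, counted once
    ("r3", "plot", "Species alpha"),
    ("r4", "trait", "Species alpha"),
    ("r5", "plot", "Species beta"),
    ("r6", "herbarium", "Species gamma"),
    ("r7", "plot", "Species gamma"),
]

RARE_MAX_OBSERVATIONS = 3  # "rare" = three or fewer observations


def observations_per_species(rows):
    """Count unique observations per species across all data sources."""
    unique = {record_id: species for record_id, _source, species in rows}
    return Counter(unique.values())


def rare_species(counts, threshold=RARE_MAX_OBSERVATIONS):
    """Return the species whose observation count is at or below the threshold."""
    return sorted(name for name, n in counts.items() if n <= threshold)


counts = observations_per_species(records)
rare = rare_species(counts)
print(counts)
print(rare)
```

Deduplicating on a record identifier before counting is one simple way to honor "unique observations"; how duplicates were actually detected is not stated in the text.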

To correct and harmonize the taxonomy within and between these data sources, we first used the Taxonomic Name Resolution Service, or TNRS (ref). Our work represents the first broad-scale use of the TNRS to harmonize continental-scale biodiversity data. Next, we removed observations with erroneous coordinates and known introduced and non-native taxa (see Sup. Doc.). Then, to map the distribution of rarity, we standardized for sampling intensity using several rarefaction approaches (Sup. Doc.). Lastly, to assess the accuracy of the species characterized as rare (species with three specimens or fewer), we compared our assignments of rarity with the assessments of taxonomic experts (see Sup. Doc.).

The total number of raw botanical observations sampled in the Americas is 22M. Running the names through the TNRS revealed that 35% of the original names were incorrect (owing to misspelling, synonymy, or both). This finding underscores the importance of name harmonization for preventing inflated diversity estimates. After scrubbing and standardizing the data, the core database of reliable observations consisted of 7,599,780 observations.
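
A toy example shows why unharmonized names inflate diversity estimates. The lookup table below is a hypothetical stand-in for a name-resolution step; it is not the TNRS API, and the name mappings are illustrative rather than authoritative taxonomy.

```python
# Hypothetical lookup standing in for a name-resolution service (NOT the TNRS API).
ACCEPTED_NAME = {
    "Quercus rubra": "Quercus rubra",
    "Quercus rubrra": "Quercus rubra",    # misspelling (illustrative)
    "Quercus borealis": "Quercus rubra",  # synonym (illustrative)
    "Acer rubrum": "Acer rubrum",
}


def harmonize(names, lookup):
    """Map raw names to accepted names; collect names that cannot be resolved."""
    resolved, unresolved = [], []
    for name in names:
        key = " ".join(name.split())  # collapse stray whitespace
        if key in lookup:
            resolved.append(lookup[key])
        else:
            unresolved.append(name)
    return resolved, unresolved


raw_names = [
    "Quercus rubra",
    "Quercus rubrra",
    "Quercus borealis",
    "Acer rubrum",
    "Acer  rubrum",  # same name with a doubled space
]

resolved, unresolved = harmonize(raw_names, ACCEPTED_NAME)
raw_richness = len(set(raw_names))     # distinct raw strings
clean_richness = len(set(resolved))    # distinct accepted names
print(raw_richness, clean_richness, unresolved)
```

In this toy case five distinct raw strings collapse to two accepted names, so a species count taken before harmonization would be inflated.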

The continental-scale distribution of the number of observations per species is highly skewed and lacks a central tendency. A large percentage of species within the BIEN database have a very small number of observations (Fig. 1A). Indeed, our results indicate that ~50,000 species are represented by three observations or fewer. Our results also quantify the shape of the distribution of commonness-and-rarity at continental scales. The distribution of species abundance is more right-skewed than a lognormal distribution (22) or Fisher's log series (ref), which becomes increasingly unimodal at large sample sizes (refs). The observed distribution instead appears to follow an inverse power function. The maximum likelihood estimate (ref) of the fitted exponent is -0.5 for all of the data and -0.7 for the plot data. An open question is whether the distribution of commonness and rarity shown in Fig. 1A is due to taxonomic and/or sampling biases.
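
As a rough illustration of fitting a power-function exponent by maximum likelihood, the sketch below assumes a discrete power law truncated at the largest observed count, p(k) ∝ k^(-a) for k = 1..k_max. The text does not specify the estimator beyond "maximum likelihood", so the truncated form, the function name `fit_exponent`, and the search bounds are assumptions. Note the sign convention: the code returns the positive value a, whereas the text reports the exponent as negative.

```python
import math


def fit_exponent(counts, lo=0.0, hi=5.0, iterations=100):
    """Maximum likelihood estimate of a in p(k) ∝ k**(-a), k = 1..max(counts).

    The log-likelihood is concave in a, so a ternary search over [lo, hi]
    converges to the maximum.
    """
    n = len(counts)
    k_max = max(counts)
    sum_log = sum(math.log(k) for k in counts)

    def log_likelihood(a):
        log_norm = math.log(sum(k ** -a for k in range(1, k_max + 1)))
        return -a * sum_log - n * log_norm

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1) < log_likelihood(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Synthetic check: build counts whose frequencies follow k**(-1.5) on 1..100.
true_exponent = 1.5
k_max = 100
weights = [k ** -true_exponent for k in range(1, k_max + 1)]
total = sum(weights)
synthetic = []
for k, w in enumerate(weights, start=1):
    synthetic.extend([k] * round(100_000 * w / total))

estimate = fit_exponent(synthetic)
print(round(estimate, 2))
```

Truncating the support at the largest observed count keeps the likelihood well defined for shallow exponents such as those reported here, for which an untruncated power law cannot be normalized.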

We assessed whether the prominence of rare species is due to biases in herbarium data by using two approaches. First, we compared the distribution of species commonness-and-rarity across all datasets with that in just the plot datasets (Fig. 1A vs. 1B). Ecological plots and surveys contain minimal sampling biases, as all individuals surveyed within a given area are identified to species. Because the distribution in Fig. 1B is similar to that in Fig. 1A, we conclude that our results are not solely driven by sampling biases associated with herbarium data. Next, to assess whether the number of observations in botanical datasets provides a reliable measure of rarity, we randomly sampled 300 species with three observations or fewer. Then, for each species selected, we consulted taxonomic experts at the Missouri Botanical Garden and the New York Botanical Garden to sort each species into several classifications (Fig. 2; see Sup. Doc.). Most species identified by BIEN as being ‘rare’ (72.7%) are indeed taxa that are recognized by experts as rare. Only 7.3% of these subsampled taxa appear to be incorrectly characterized as rare, as they are recognized by experts to be abundant or to have large geographic ranges. The large number of rare taxa does not appear to be due to recent taxonomic splits or to old names that no longer apply, as only ~7.5% of these taxa fell into those categories. In total, 10.3% of species were identified as non-native species, which may indeed be rare in their naturalized range. In sum, we estimate that between 72% and 90% of plant taxa identified by BIEN as being rare (the latter figure being the sum of Recognized as rare + Recent name + Unresolved + Old Name) would be recognized as rare by other metrics.

Figure 2 allows us to estimate the total number of Embryophyte species in the Americas. After correcting and standardizing the data, we estimate that there are between 120,098 and 145,750 Embryophyte species. The lower limit stems from subtracting 17.6% from the total (10.3% for the remaining naturalized non-native species plus 7.3% for the inflation of names by ‘old names’ not yet corrected by the TNRS; see Fig. 2). Given that the estimated total number of Embryophytes in the world is XXX,XXX, our result indicates that XX% of global Embryophyte diversity is in the Americas. Thus, our results indicate that approximately a third to half of all the species in the New World are represented by little to no botanical information. Intriguingly, approximately one half of all of the species recorded for the Americas, ~50,000, have been observed five times or fewer. This result supports past claims that rarity is perhaps the norm across the majority of taxa.
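
The lower bound follows directly from the percentages quoted above; the short check below reproduces the arithmetic. The variable names are ours, and the figures are those given in the text.

```python
# Figures quoted in the text; variable names are illustrative.
upper_estimate = 145_750       # species after correcting and standardizing data
non_native_fraction = 0.103    # remaining naturalized non-native species
old_name_fraction = 0.073      # names inflated by uncorrected "old names"

combined_fraction = non_native_fraction + old_name_fraction
lower_estimate = round(upper_estimate * (1 - combined_fraction))

print(f"{combined_fraction:.1%}")  # 17.6%
print(lower_estimate)              # 120098
```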

Lastly, controlling for variation in sampling intensity, we assessed the spatial distribution of rarity. Plotting the total number of rare species reveals several patterns. Rare species cluster in: (i) mountainous regions (a thin strip along the western flank of the Andes and the Sierra Madre of Mexico); (ii) subtropical areas (Mexico, Chile, Argentina, and southern Brazil); and (iii) relatively isolated habitats (the Mata Atlantica in Brazil, the Guiana Shield, southern California, and the Caribbean). Importantly, there is a relative dearth of rare species throughout the Amazon basin, confirming past claims that the Amazon consists of widespread and relatively abundant species (refs). On the one hand, these results are consistent with the importance of vicariance events and isolation in driving variation in diversity (ref). On the other hand, the large number of rare species in the subtropics suggests that the subtropics, especially their more arid parts, may hold a large fraction of endemic and small-ranged taxa.
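
One way to standardize for sampling intensity before mapping is to rarefy each grid cell to a common number of observations and then compute Menhinick's index, S/√N, on the subsample. The sketch below is an assumption-laden illustration: the rarefaction approaches actually used are described in the supplement, and the cell contents, sample size, and function names here are hypothetical.

```python
import math
import random


def menhinick(species_labels):
    """Menhinick's index: number of species S divided by sqrt of observations N."""
    n = len(species_labels)
    return len(set(species_labels)) / math.sqrt(n)


def rarefied_menhinick(species_labels, sample_size, draws=200, seed=0):
    """Mean Menhinick index over random subsamples of a fixed size.

    Returns None when the cell has fewer observations than sample_size.
    """
    if len(species_labels) < sample_size:
        return None
    rng = random.Random(seed)
    values = [
        menhinick(rng.sample(species_labels, sample_size))
        for _ in range(draws)
    ]
    return sum(values) / draws


# Hypothetical grid cells, each holding the species label of every observation.
cells = {
    "cell_1": ["a"] * 16,                 # heavily sampled, one species
    "cell_2": ["a", "b", "c", "d"] * 4,   # same effort, four species
    "cell_3": ["a", "b"],                 # too sparsely sampled to rarefy
}

rarefied = {name: rarefied_menhinick(obs, sample_size=9) for name, obs in cells.items()}
print(rarefied)
```

Cells that fall below the common sample size are returned as `None` rather than compared on unequal effort; whether such cells were dropped or handled differently in the analysis is not stated in the text.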

While past research has indicated that many species are likely rare (refs), our results quantify the distribution of commonness-and-rarity. At large continental scales, the distribution of rarity is likely highly right-skewed and differs from the often hypothesized lognormal or log-series distributions; instead, it appears to approximate a power function. This indicates that perhaps a large fraction of species would be considered ‘exceptionally’ rare and points to the fact that we have very little botanical information for a large fraction of species in the New World.

Conclusions – yet to come. . . .

Overall, here are the main points to emphasize

- There is no model or central tendency to the distribution
- most species are rare
- this does not appear to be due to sampling, taxonomic history etc.
- rare species are not uniformly in the tropics (Amazon); tropical species are not uniformly rare
- rarity tends to be spatially clumped in specific areas.
- These areas overlap with areas identified in the past as biodiversity hotspots BUT our analyses suggest that we revisit these areas.
- Tropical and subtropical mountainous areas appear to stand out as high concentrations of rare species.

Citations yet to be included

Pitman 1999

Gentry 1986, 1988

Supplemental Information

We standardized the botanical data so that our analyses were conducted on observations with reliable geographic coordinates, uncontaminated by non-native plantings. We took the original 22.5M? observations and scrubbed them by removing observations characterized by: (i) specimens lacking a specific epithet; (ii) no geographic coordinates; (iii) taxa known to be non-native; (iv) samples originating in close proximity to botanical gardens and plantations; and (v) problematic geographic determinations (see supplemental information). The core of the standardized BIEN database comprises 7,599,780 observations, which comprise specimen observations, XXX plot and survey observations, and 140,285 observations from traits. In total, 760 data providers are included in this effort.
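
The five exclusion criteria above can be sketched as a simple record filter. The `Observation` fields and the boolean flags are hypothetical; in particular, how non-native status, proximity to gardens and plantations, and problematic coordinates were actually determined is not described here, so the flags stand in for those upstream checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical record layout; field names are assumptions for illustration."""
    genus: str
    specific_epithet: Optional[str]
    latitude: Optional[float]
    longitude: Optional[float]
    is_non_native: bool = False
    near_garden_or_plantation: bool = False
    coordinates_flagged: bool = False


def exclusion_reason(obs):
    """Return the first criterion (i)-(v) that excludes the record, or None."""
    if not obs.specific_epithet:
        return "no specific epithet"
    if obs.latitude is None or obs.longitude is None:
        return "no geographic coordinates"
    if obs.is_non_native:
        return "non-native taxon"
    if obs.near_garden_or_plantation:
        return "near botanical garden or plantation"
    out_of_range = not (-90 <= obs.latitude <= 90 and -180 <= obs.longitude <= 180)
    if obs.coordinates_flagged or out_of_range:
        return "problematic geographic determination"
    return None


def scrub(observations):
    """Keep only the records that pass every criterion."""
    return [obs for obs in observations if exclusion_reason(obs) is None]


raw = [
    Observation("Genus", "alpha", 10.0, -70.0),
    Observation("Genus", None, 10.0, -70.0),
    Observation("Genus", "beta", None, None),
    Observation("Genus", "gamma", 5.0, -60.0, is_non_native=True),
    Observation("Genus", "delta", 5.0, -60.0, near_garden_or_plantation=True),
    Observation("Genus", "epsilon", 95.0, -60.0),
]

kept = scrub(raw)
print(len(raw), len(kept))
```

Returning a reason for each exclusion, rather than a bare boolean, makes it straightforward to tally how many records each criterion removes.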

- Science - Reports (up to ~2500 words including references, notes and captions or ~3 printed pages) present important new research results of broad significance. Reports should include an abstract, an introductory paragraph, up to four figures or tables, and about 30 references. Materials and Methods should usually be included in supplementary materials, which should also include information needed to support the paper's conclusions.

- Brevia should be no more than 800 words in length, including authors' names and affiliations, and occupy no more than 1 journal page. Brevia have one display item (figure or table), with no more than four panels. Brevia have short titles (~8 words) and 50-word on-line-only abstracts. A maximum of 6 references is suggested. Although we do not encourage supplementary material for this section, if necessary up to 500 words of text or one figure or table is allowable. Authors should avoid highly technical presentations and jargon specific to particular disciplines. Manuscripts will be peer-reviewed in the usual manner.

Figure 1. The commonness of rarity for plants of the Americas. The distribution of the number of observations recorded for species (A) in all botanical datasets within BIEN; and (B) within ecological plots and surveys. The distribution is highly skewed, showing that most species have been observed only a very small number of times. Over 50% of the 120,000 plant species in the New World have five observations or fewer. The insets in both panels show log10-transformed axes and indicate that both distributions are approximate power functions. Comparing the distributions in A and B allows us to ascertain whether the prominence of rare species is due to sampling biases in herbarium data. Ecological plots and surveys contain minimal sampling biases, as all taxa are surveyed within a given area. Because the distribution in B is similar to that in A (stats), we conclude that our results are not driven by sampling biases.

Figure 2. Does the number of observations in botanical datasets provide a reliable measure of rarity? Breakdown of the nature of rarity across the Americas. For a random sample of 300 species with three observations or fewer, we consulted taxonomic experts at the Missouri Botanical Garden and the New York Botanical Garden. Most species identified by BIEN as being ‘rare’ (72.7%) are indeed taxa that are recognized by experts as rare. Approximately 7.3% of these taxa appear to be incorrectly characterized as rare, as they are recognized by experts to be abundant or to have large ranges. Finally, approximately 7.5% of these taxa may be due to recent taxonomic splits or old names no longer applied, and 10.3% are non-native species, which may indeed be rare. In sum, we estimate that between 72% and 90% of plant taxa identified by BIEN as being rare (Recognized as rare + Recent name + Unresolved + Old Name) would be recognized as rare by other measures of rarity.

Figure 3. Where are rare species distributed geographically? Plotting the geographic coordinates of all observations for species with three observations or fewer reveals several patterns. Panels show (A) the absolute number of rare observations and (B) the total number of rare species, calculated using the rarefied Menhinick diversity estimate (23) (see Sup. Doc.). While most rare species are in the tropics, they are clustered in mountainous and subtropical regions. Notably, the Amazon basin does not contain a large proportion of rare species.

Figure S1

1. A. R. Wallace, Tropical Nature and Other Essays (Macmillan, New York, 1878).
2. A. von Humboldt, Ansichten der Natur mit wissenschaftlichen Erläuterungen (J. G. Cotta, Tübingen, Germany, 1808).
3. K. J. Gaston, Rarity (Chapman and Hall, London, 1994).
4. D. Rabinowitz, S. Cairns, T. Dillon, in Conservation biology: the science of scarcity and diversity, M. E. Soulé, Ed. (Sinauer Associates, Sunderland, Massachusetts, USA, 1986), pp. 182-204.
5. M. L. Rosenzweig, in The Ecology and Evolution of Communities (Harvard Univ. Press, 1975), pp. 121-140.
6. C. Rahbek, G. R. Graves, PNAS 98, 4534 (2001).
7. A. A. Fedorov, Journal of Ecology 54, 1 (1966).
8. N. Myers, R. A. Mittermeier, C. G. Mittermeier, G. A. B. da Fonseca, J. Kent, Nature 403, 853 (2000).
9. R. Ohlemüller et al., Biology Letters 4, 568 (2008).
10. J. R. Malcolm, C. Liu, R. P. Neilson, L. Hansen, L. E. E. Hannah, Conservation Biology 20, 538 (2006).
11. G. A. Black, T. Dobzhansky, C. Pavan, Botanical Gazette 111, 413 (1950).
12. G. G. Mittelbach et al., Ecology Letters 10, 315 (2007).
13. S. P. Hubbell, Science 203, 1299 (1979).
14. J. L. Edwards, Science 289, 2312 (2000).
15. C. Thomas, 324, 1632 (2009).
16. J. Alroy, Journal of Mammalogy 84, 431 (2003).
17. A. Bortolus, AMBIO: A Journal of the Human Environment 37, 114 (2008).
18. D. J. Patterson, J. Cooper, P. M. Kirk, R. L. Pyle, D. P. Remsen, Trends in Ecology & Evolution 25, 686 (2012).
19. B. Boyle et al., (Ms submitted).
20. N. J. Gotelli, R. K. Colwell, Ecology Letters 4, 379 (2001).
21. H. Kreft, W. Jetz, J. Mutke, G. Kier, W. Barthlott, Ecology Letters 11, 116 (2008).
22. F. W. Preston, Ecology 43(2), 185 (1962).
23. H. T. S. Clifford, W. , An Introduction to Numerical Classification (Academic Press, New York, 1975).