Metrology for Identity and Other Nominal Properties David Lee Duewer Chemical Sciences Division Materials Measurement Laboratory National Institute of Standards and Technology Standards for Pathogen Identification via Next-Generation Sequencing Workshop NIST, 20-Oct-2014

Upload: nist-spin

Post on 21-Jul-2015

Page 1: Metrology for Identity and Other Nominal Properties

Metrology for Identity and Other Nominal Properties

David Lee Duewer
Chemical Sciences Division
Materials Measurement Laboratory
National Institute of Standards and Technology

Standards for Pathogen Identification via Next-Generation Sequencing Workshop NIST, 20-Oct-2014

Page 2: Metrology for Identity and Other Nominal Properties

When I Say “We”…

Marc Salit
PhD 1985, Analytical chemist
5 y Perkin-Elmer – Instrument Design/Development
24 y NIST “Innovator”
Leader, Genome Scale Measurements Group
Co-Director, NIST/Stanford U. Joint Initiative on Measurements in Biology

Dave Duewer
PhD 1976, Analytical chemist
11 y Monsanto – process & biodiscovery
23 y NIST “Data Jock”

And we take ourselves very seriously…

Page 3: Metrology for Identity and Other Nominal Properties

Metrology (Measurement Science)

• Metrology is the stuff needed so data can support informed decision making.
  • in a good world, decisions are informed with data
    • which are the results of measurements!
• Calculus of Confidence
  • we posit that metrology is the ‘formal’ system that tells us how well we trust those data

Page 4: Metrology for Identity and Other Nominal Properties

Calculus of Confidence

• The tools of metrology:
  • Traceability
  • Uncertainty
  • Validation
• enable this calculus of confidence, by which decisions are informed by measurement results with established confidence.

Page 5: Metrology for Identity and Other Nominal Properties

Craft

• Metrology is more a craft than a technology
  • this doesn’t mean that 7-year apprenticeships are required!
  • it does mean that two different skilled metrologists might take very different approaches to the same problem
    • but they should both come to largely equivalent solutions!
    • matter of style
    • must be defensible

Page 6: Metrology for Identity and Other Nominal Properties

The “How Much” Worldview
as seen by chemists/biochemists

Page 7: Metrology for Identity and Other Nominal Properties

Tools of the Trade

Workshop on DNA Methods for Quality Control of Botanical Products USP, 23-Oct-2014

“VIM”: www.bipm.org/en/publications/guides/#vim
NIST SP 811: www.nist.gov/pml/pubs/sp811/
“GUM”: www.bipm.org/en/publications/guides/#gum

Page 8: Metrology for Identity and Other Nominal Properties

Metrological Traceability
enables comparisons to be made over time and place

[Figure: traceability chain from the SI unit (amount of substance) to the Result for a routine sample. Methods shown: purity analysis, primary methods, reference methods, routine methods. Materials shown: high purity primary RM, primary calibration CRM, secondary calibration RM, routine sample.]

Page 9: Metrology for Identity and Other Nominal Properties

Validation
ensures measurement processes are well-understood

• “checks the measurement model”
  • tests completeness
  • tests assumptions
  • helps establish an uncertainty budget
  • identifies relevant parameters to keep under control
  • tests scope

Page 10: Metrology for Identity and Other Nominal Properties

Metrological Uncertainty
enables meaningful comparison of results

• “how much” results are only useful when compared
  • different results in different places or measured at different times…
  • “comparability over space-and-time”
• Are these results the same?
  • is there significant bias?
• Is measurement precision fit-for-purpose?
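
The question "are these results the same?" only has an answer once each result carries an uncertainty. A minimal sketch of one common check, the normalized error used in interlaboratory comparisons, follows. It is not taken from the slides: the function names, the example values, and the |En| <= 1 acceptance limit are illustrative assumptions, and U1 and U2 are taken to be expanded uncertainties at the same coverage level.

```python
import math


def normalized_error(x1: float, U1: float, x2: float, U2: float) -> float:
    """Difference between two results divided by their combined expanded uncertainty.

    Assumes U1 and U2 are expanded uncertainties (same coverage factor)
    and that the two results are independent.
    """
    combined = math.sqrt(U1 ** 2 + U2 ** 2)
    if combined == 0:
        raise ValueError("at least one uncertainty must be non-zero")
    return (x1 - x2) / combined


def results_agree(x1: float, U1: float, x2: float, U2: float, limit: float = 1.0) -> bool:
    """True when the difference is covered by the combined uncertainty.

    The default limit of 1.0 is a conventional choice, not a rule from the talk.
    """
    return abs(normalized_error(x1, U1, x2, U2)) <= limit


# Hypothetical results from two places or times, in the same units.
print(normalized_error(10.0, 0.3, 10.2, 0.4))
print(results_agree(10.0, 0.3, 10.2, 0.4))
print(results_agree(10.0, 0.3, 11.0, 0.4))
```

Without the uncertainties, the same two numbers could not be judged "the same" or "different" at all, which is the point of the slide.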

Page 11: Metrology for Identity and Other Nominal Properties

Perhaps NIST’s Best Uncertainty Statement

Dr. C.H. Meyers, on his measurements of the heat capacity of ammonia (circa 1920):

“We think our reported value is good to 1 part in 10,000: we are willing to bet our own money at even odds that it is correct to 2 parts in 10,000. Furthermore, if by any chance our value is shown to be in error by more than 1 part in 1000, we are prepared to eat the apparatus and drink the ammonia.”

Quote from: Doiron T and Stoup J, Uncertainty and Dimensional Calibrations, J Res NIST 1997;102:647-676

http://dx.doi.org/10.6028/jres.102.044

Page 12: Metrology for Identity and Other Nominal Properties

The “What” Worldview

Page 13: Metrology for Identity and Other Nominal Properties

Several Different “What”s

• Identification
  • “Pure substance” Certified Reference Material (CRM)
  • Use/develop convincingly specific methods
    • Inclusion
    • Exclusion
  • Define and certify unambiguous “barcode”
  • CRMs are expensive
• Verification
  • Secondary reference materials (RMs) and controls
  • Check “barcode” against CRM
  • Can be commercial or home-brew
• Recognition
  • Component of a mixture
  • Check “barcode” against library

Page 14: Metrology for Identity and Other Nominal Properties

Barcode of Life

http://www.barcodeoflife.org/content/about/what-dna-barcoding

[Figure annotated with the labels: Identification, Validation, Recognition]

Page 15: Metrology for Identity and Other Nominal Properties

Metrological Traceability
enables comparisons to be made over time and place

[Figure: traceability chain from an Authority to the Result for routine samples. Authority: chemical structure {CAS, IUPAC}, biological nomenclature {ICZN, ICN}. Methods shown: identification methods, verification methods, recognition methods. Materials shown: “pure” primary RM, QC and secondary RMs, routine samples.]

Page 16: Metrology for Identity and Other Nominal Properties

Taxonomic Hierarchy

Ginkgo biloba L.

Kingdom Plantae – plantes, Planta, Vegetal, plants
Subkingdom Viridaeplantae – green plants
Infrakingdom Streptophyta – land plants
Division Tracheophyta – vascular plants, tracheophytes
Subdivision Spermatophytina – spermatophytes, seed plants, phanérogames
Infradivision Gymnospermae – gymnosperms, gymnospermes, gimnosperma
Class Ginkgoopsida – ginkgo
Order Ginkgoales
Family Ginkgoaceae
Genus Ginkgo L. – ginkgo
Species Ginkgo biloba L. – maidenhair tree, common ginkgo

en.wikipedia.org/wiki/Ginkgo_biloba

http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=183269

Page 17: Metrology for Identity and Other Nominal Properties

Validation
ensures measurement processes are well-understood

• “checks the measurement model”
  • tests if identification criteria fit-for-purpose
    • includes everything wanted
    • excludes everything else
    • (Ideally, this can be done in silico)
  • tests if measurements consistent with identification criteria
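
The "includes everything wanted, excludes everything else" test can be sketched in silico as below. This is only an illustration under stated assumptions: the criterion (presence of a signature motif), the motif itself, and the toy sequences are all made up for the example and are not real barcode data or the method used for any NIST material.

```python
from typing import Callable, Dict


def specificity_check(
    criterion: Callable[[str], bool],
    targets: Dict[str, str],
    non_targets: Dict[str, str],
) -> dict:
    """Apply an identification criterion to wanted and unwanted sequences.

    inclusivity: fraction of target sequences the criterion accepts.
    exclusivity: fraction of non-target sequences the criterion rejects.
    """
    if not targets or not non_targets:
        raise ValueError("need at least one target and one non-target sequence")
    missed = [name for name, seq in targets.items() if not criterion(seq)]
    false_hits = [name for name, seq in non_targets.items() if criterion(seq)]
    return {
        "inclusivity": 1 - len(missed) / len(targets),
        "exclusivity": 1 - len(false_hits) / len(non_targets),
        "missed": missed,
        "false_hits": false_hits,
    }


# Hypothetical criterion: the sequence contains an invented signature motif.
def has_signature(seq: str) -> bool:
    return "ACCGG" in seq.upper()


# Made-up toy sequences standing in for authenticated samples and close relatives.
authenticated = {"target_1": "AAACCGGTT", "target_2": "ttaccggaa"}
close_relatives = {"relative_1": "AAACCGATT", "relative_2": "GGACCGGCC"}

print(specificity_check(has_signature, authenticated, close_relatives))
```

In this toy run the criterion includes both targets but also accepts one close relative, so it would not yet be fit-for-purpose; the lists of missed and falsely accepted names show where the criterion needs work.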

Page 18: Metrology for Identity and Other Nominal Properties

Specificity Validation Design

Chloroplast DNA sequences from authenticated Ginkgo biloba samples are used to establish inclusivity

Chloroplast DNA sequences from close relatives are used to establish exclusivity

LaBudde, R.; Harnly, J.M. Probability of Identification (POI): A Statistical Model for the Validation of Qualitative Botanical Identification Methods. Journal of AOAC International, Vol. 95, pp. 273–285 (2012).

https://www-s.nist.gov/srmors/view_cert.cfm?srm=3246

Page 19: Metrology for Identity and Other Nominal Properties

Specificity Validation Results

[Figure panels: psbA-trnH Intergenic Spacer Phylogeny; trnL Intron Phylogeny]

https://www-s.nist.gov/srmors/view_cert.cfm?srm=3246

Page 20: Metrology for Identity and Other Nominal Properties

Metrological Confidence
enables meaningful interpretation of results

• “what” results are only useful when
  • The same “things” can be compared
    • “measurand” is the metrology-speak term
• Are these barcodes the same?
  • how confident are you in the result?
    • essential part of being able to compare!

Page 21: Metrology for Identity and Other Nominal Properties

Defining “Confidence”

“Where uncertainty is assessed qualitatively, it is characterised by providing a relative sense of the amount and quality of evidence (that is, information from theory, observations or models indicating whether a belief or proposition is true or valid) and the degree of agreement… This approach is used by WG III through a series of self-explanatory terms such as: high agreement, much evidence; high agreement, medium evidence; medium agreement, medium evidence; etc.”

Climate Change 2007: Synthesis Report
www.ipcc.ch/publications_and_data/ar4/syr/en/contents.html

Page 22: Metrology for Identity and Other Nominal Properties

“Confidence”: NIST’s Initial Definitions

DNA Sequence via Sanger sequencing

Page 23: Metrology for Identity and Other Nominal Properties

On Further Thought…

• Highest confidence
  • sufficient evidence
  • no ambiguities or contradictions
• Very confident
  • sufficient evidence
  • all ambiguities unambiguously resolved
• Confident
  • sufficient evidence
  • all ambiguities “understood”
    • but insufficient evidence to prove it
• Insufficient evidence to Certify

[Flowchart: Acquire Evidence, then the questions Sufficient?, Unambiguous?, Resolved?, Understood? in turn, with outcomes Highest, Very, Confident, and No Confidence; branches are labeled Yes, No, and Maybe.]
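
The confidence levels on this slide can be read as a short decision procedure. The sketch below is one possible reading, not the speaker's implementation: the function name and arguments are hypothetical, and the labels returned for the insufficient-evidence and fall-through cases are assumptions about how the branches connect.

```python
def confidence_level(
    sufficient: bool,
    ambiguities_present: bool,
    all_resolved: bool = False,
    all_understood: bool = False,
) -> str:
    """Map answers to the slide's four questions onto a confidence label."""
    if not sufficient:
        # Assumption: without sufficient evidence nothing can be certified;
        # in practice the next step is to acquire more evidence.
        return "Insufficient evidence to certify"
    if not ambiguities_present:
        return "Highest confidence"
    if all_resolved:
        return "Very confident"
    if all_understood:
        # Ambiguities are explained, but the explanation is not proven.
        return "Confident"
    # Assumption: unexplained ambiguities leave no basis for confidence.
    return "No confidence"


print(confidence_level(sufficient=True, ambiguities_present=False))
print(confidence_level(sufficient=True, ambiguities_present=True, all_resolved=True))
print(confidence_level(sufficient=True, ambiguities_present=True, all_understood=True))
print(confidence_level(sufficient=False, ambiguities_present=True))
```

Each question is asked only if the previous one did not settle the level, which mirrors the order of the questions in the flowchart.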

Page 24: Metrology for Identity and Other Nominal Properties

Who Defines “Sufficient”?

You!
and the rest of the experts within your community

Page 25: Metrology for Identity and Other Nominal Properties

Criteria for Identification of Seized Drugs

SWGDRUG Recommendations:
• If one technique from A, then one other (A, B, or C).
• If no techniques from A, then three others (two from B).

Category A                               Category B                  Category C
Infrared Spectroscopy                    Capillary Electrophoresis   Color Tests
Mass Spectrometry                        Gas Chromatography          Fluorescence Spectroscopy
Nuclear Magnetic Resonance Spectroscopy  Ion Mobility Spectrometry   Immunoassay
Raman Spectroscopy                       Liquid Chromatography       Melting Point
X-ray Diffractometry                     Microcrystalline Tests      Ultraviolet Spectroscopy
                                         Pharmaceutical Identifiers
                                         Thin Layer Chromatography

http://www.swgdrug.org/approved.htm
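
The two-line rule on this slide is an example of a community writing down what counts as "sufficient". A minimal sketch of the rule as stated here follows; the function and variable names are hypothetical, only the count-based wording from the slide is encoded, and any further conditions in the full SWGDRUG document are not represented.

```python
CATEGORY = {
    "Infrared Spectroscopy": "A",
    "Mass Spectrometry": "A",
    "Nuclear Magnetic Resonance Spectroscopy": "A",
    "Raman Spectroscopy": "A",
    "X-ray Diffractometry": "A",
    "Capillary Electrophoresis": "B",
    "Gas Chromatography": "B",
    "Ion Mobility Spectrometry": "B",
    "Liquid Chromatography": "B",
    "Microcrystalline Tests": "B",
    "Pharmaceutical Identifiers": "B",
    "Thin Layer Chromatography": "B",
    "Color Tests": "C",
    "Fluorescence Spectroscopy": "C",
    "Immunoassay": "C",
    "Melting Point": "C",
    "Ultraviolet Spectroscopy": "C",
}


def meets_minimum(techniques) -> bool:
    """Check a set of techniques against the rule as worded on the slide."""
    used = set(techniques)  # each technique counts once
    unknown = used - CATEGORY.keys()
    if unknown:
        raise ValueError(f"unrecognized technique(s): {sorted(unknown)}")
    n_a = sum(1 for t in used if CATEGORY[t] == "A")
    n_b = sum(1 for t in used if CATEGORY[t] == "B")
    if n_a >= 1:
        # One Category A technique plus at least one other from any category.
        return len(used) >= 2
    # No Category A technique: at least three, two of them from Category B.
    return len(used) >= 3 and n_b >= 2


print(meets_minimum(["Mass Spectrometry", "Color Tests"]))
print(meets_minimum(["Gas Chromatography", "Thin Layer Chromatography", "Color Tests"]))
print(meets_minimum(["Gas Chromatography", "Color Tests", "Melting Point"]))
```

Encoding the rule this way makes the community's definition of "sufficient" explicit and checkable, rather than a matter of individual judgment.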

Page 26: Metrology for Identity and Other Nominal Properties

Barcode of Life: Standards and Guidelines

www.barcodeoflife.org/content/resources/standards-and-guidelines

2.D.ii In November 2009, CBOL approved rbcL and matK as the barcode regions for vascular plants. They are defined relative to the Arabidopsis thaliana chloroplast NC_000932 sequence annotation as follows: the rbcL barcode region is at the 5' end of the rbcL gene between bp1-599 (27-579 excluding primer sequences); the matK barcode region is between bp205-1046 (227-1019 excluding primer sequences).

4.C In deciding whether a record will be repeatable and reliable for species identification, submitters should select as potential BARCODE records only those for which the contig was based on bi-directional coverage with non-N base calls at no less than 40% of the reported sequence. As described below (5D), CBOL can direct GenBank (or another INSDC member) to remove the BARCODE designation from records which have all required elements (1A-I) but have been shown to be unreliable for species identification due to low sequence quality and coverage.
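
The coverage criterion in 4.C, as quoted here, can be sketched as a simple check. This is one reading of the quoted text under loud assumptions: the forward and reverse reads are represented as two equal-length strings already aligned to the contig, with "N" marking any position that lacks a base call in that direction. That representation, the function names, and the toy reads are hypothetical and are not part of the CBOL guideline.

```python
def bidirectional_fraction(forward: str, reverse: str) -> float:
    """Fraction of contig positions with a non-N call in both directions.

    Assumes both strings are aligned to the same contig coordinates and use
    'N' for positions without a base call in that direction.
    """
    if not forward or len(forward) != len(reverse):
        raise ValueError("reads must be non-empty and of equal aligned length")
    both_called = sum(
        1
        for f, r in zip(forward.upper(), reverse.upper())
        if f != "N" and r != "N"
    )
    return both_called / len(forward)


def eligible_as_barcode(forward: str, reverse: str, minimum: float = 0.40) -> bool:
    """Apply the 'no less than 40%' threshold quoted on the slide."""
    return bidirectional_fraction(forward, reverse) >= minimum


# Made-up toy reads, ten positions each.
print(bidirectional_fraction("ACGTNNNNNN", "ACGTACNNNN"))
print(eligible_as_barcode("ACGTNNNNNN", "ACGTACNNNN"))
print(eligible_as_barcode("ACGNNNNNNN", "ACGTACGTAC"))
```

A written threshold like this is what lets two laboratories agree on whether a record qualifies, independent of who produced it.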

Page 27: Metrology for Identity and Other Nominal Properties

Recent Work in “What” Metrology

Chemical Identification and its Quality Assurance
Boris L. Milman
D.I. Mendeleyev Institute for Metrology, St. Petersburg, Russia

January 12, 2011 Springer, 281 pages, English

“Unlike analytical techniques for qualitative and quantitative determinations, well-presented in books and reviews, theoretical principles of identification and general experimental approaches to its implementation have not received comprehensive treatment in the literature.”

Page 28: Metrology for Identity and Other Nominal Properties

Thank you for your attention!