Data Formatting: Challenges and Potential Solutions


Page 1: Data Formatting: Challenges and Potential Solutions

Adam R. Ferguson, Ph.D.

Director of Data Science, Brain and Spinal Injury Center (BASIC)

Associate Professor, Department of Neurological Surgery Weill Institute for Neurosciences, UCSF

Principal Investigator, SFVAHCS

Data Formatting: Challenges and Potential Solutions

Central Nervous System Injury as a Use Case

Page 2: Data Formatting: Challenges and Potential Solutions

© 2004 Pearson Education, Inc.

Central Nervous System (CNS) Injury is Complex!

Page 3: Data Formatting: Challenges and Potential Solutions

Gazillions of tiny measures of biofunction

Figure Sources: Rosenzweig et al., 2010; 2019 Nat Neurosci; Ferguson et al., 2013 PLoS One; Nielson et al., 2014, J Neurotrauma; Nielson et al., 2015, Brain Res.; Friedli et al., 2015 Sci Transl Med; Rosenzweig et al., 2019 Nat Medicine

Page 6: Data Formatting: Challenges and Potential Solutions

CNS Injury as a ‘big data’ problem?

- volume

- velocity

- variety

Ferguson et al., 2014, Nat Neurosci; Huie et al., 2019, Curr Opin Neurol

Page 8: Data Formatting: Challenges and Potential Solutions

Ferguson et al., 2014, Nature Neuroscience; Huie et al., 2019, Current Opinion in Neurology; Hawkins et al., 2019, Journal of Neurotrauma

TOWARD ‘DATAFICATION’ OF BIOMEDICAL RESEARCH

Page 12: Data Formatting: Challenges and Potential Solutions

Raw Data Sources

Multispecies Data Repository

Page 13: Data Formatting: Challenges and Potential Solutions

Open Data Commons for SCI: https://odc-sci.org

Open Data Commons for TBI: https://odc-tbi.org

VA Gordon Mansfield SCI Consortium

Raw Data Sources

Multispecies Data Repository

Page 14: Data Formatting: Challenges and Potential Solutions

FAIR-SCI Ahead

SCI Preclinical Community Readiness and Next Steps

Washington DC, November 10, 2017

SCI Preclinical Community Readiness and Next Steps

San Diego, CA, November 4, 2018

STREET-FAIR: SCI Team Research, Enabling Expansion and Translation of FAIR data sharing

Page 15: Data Formatting: Challenges and Potential Solutions

Private Public

Fouad et al., 2021 Journal of Neurotrauma

Page 16: Data Formatting: Challenges and Potential Solutions

HTTP://ODC-SCI.ORG

Fouad et al., 2021 Journal of Neurotrauma

Page 17: Data Formatting: Challenges and Potential Solutions

HTTP://ODC-TBI.ORG

Chou et al., submitted. Preprint: https://doi.org/10.1101/2021.03.15.435178

The Open Data Commons for Traumatic Brain Injury currently has 17 labs and 40 datasets. JOIN NOW

Page 18: Data Formatting: Challenges and Potential Solutions

So you now have big data… THEN WHAT??

Page 19: Data Formatting: Challenges and Potential Solutions

Abel Torres-Espin, PhD

Now: Assistant Professor, UCSF

Page 20: Data Formatting: Challenges and Potential Solutions

Jessica Nielson, PhD

Now: Assistant Professor, University of Minnesota, Dept. of Psychiatry and Institute of Health Informatics (IHI)

Page 21: Data Formatting: Challenges and Potential Solutions

Long-tail data archive: Multicenter Animal Spinal Cord Injury Study (MASCIS), 1994-1996

Page 22: Data Formatting: Challenges and Potential Solutions

Multicenter Animal SCI Study (MASCIS 1994-96)

Nielson et al., 2015, Nature Communications

Locomotor Recovery

Page 23: Data Formatting: Challenges and Potential Solutions

Multicenter Animal SCI Study (MASCIS 1994-96)

Nielson et al., 2015, Nature Communications

Locomotor Recovery

Why are these groups different?

Page 24: Data Formatting: Challenges and Potential Solutions

Multicenter Animal SCI Study (MASCIS 1994-96)

Nielson et al., 2015, Nature Communications

Blood Pressure During Surgery

Locomotor Recovery

Why are these groups different?

Page 25: Data Formatting: Challenges and Potential Solutions

Multicenter Animal SCI Study (MASCIS 1994-96)

Nielson et al., 2015, Nature Communications

Blood Pressure During Surgery

Jessica Nielson, PhD

Locomotor Recovery

Page 26: Data Formatting: Challenges and Potential Solutions

Carlos Almeida, MS

Current PhD Student, University of Louisville

Page 27: Data Formatting: Challenges and Potential Solutions

Example Reuse Case

Almeida et al., 2021, Neuroinformatics

Page 28: Data Formatting: Challenges and Potential Solutions

Almeida et al., 2021, Neuroinformatics

Page 29: Data Formatting: Challenges and Potential Solutions

Takeaway Messages:

1) Long-tail and dark data comprise the majority of biomedical research data.

2) User-facing (community) repositories provide a way to make these data FAIR, open and reusable by both human and machine intelligence tools.

Page 30: Data Formatting: Challenges and Potential Solutions

Lab

Page 31: Data Formatting: Challenges and Potential Solutions

FERGUSON LAB: Carlos Almeida, BS, MA (cand.); Carla Arellano, DPT; Austin Chou, PhD; Daniel Fong, DPT; Jenny Haefeli, PhD; J Russell Huie, PhD; Anastasia V Keller, PhD; Sheena McCormack, DPT; Kazuhito Morioka, MD, PhD; Jonathan Namnath, BS; Dave Namnath, BS; Jessica L Nielson, PhD; Cleopa Omondi, BS, MS; Ellen Stuck, DPT student; Dolores Torres, BS, MS2; Christina Tsutsumi, DPT student; Jennifer Truong, DPT student; Lauren VanCitters, DPT; Nadine Joseph, BS (cand.)

Collaborators

UCSF Beattie/Bresnahan Lab: Michael S. Beattie, PhD; Jacqueline C. Bresnahan, PhD; Tomoo Inoue, MD, PhD; Amity Lin, BS; Sang Mi Lee, PhD; Jeffery Sacramento, B.A.; Ernesto Salegio, PhD

UCSF Manley Lab: Tomoo Inoue, MD, PhD; Geoffery T Manley, MD, PhD; Mary Vassar, RN, MS; John Yue, BA, MS2

UCSF/PT Rehab: Susanna Rosi, PhD

UCSF Biostatistics: Mark R Segal, PhD

UCSF Neurology/SFVA: Ray Swanson, MD; Steve Massa, MD/PhD; Raquel Gardner, MD

UCSF/Anesth & Periop Care: Jonathan Pan, MD/PhD; Mervyn Maze, MD; Hua Su, MD

UCSF/Radiology: Esther Yuh, MD/PhD; Pratik Mukherjee, MD/PhD; Jason Talbott, MD/PhD; Sharmila Majumdar, PhD

UCSF Orthopedics: Jeff Lotz, PhD; Chelsey Bahney, PhD; Ralph Marcucio, PhD

UCSF Psychiatry: Bruno Biagianti, MD; Aoife O’donovan, PhD; Rachel Loewy, PhD

UCSD: Mark H. Tuszynski, MD, PhD; Ephron Rosenzweig, PhD; John Brock, PhD

UCSD/NIF: Maryann Martone, PhD; Jeff Grethe, PhD

UCLA: Reggie Edgerton, PhD; Sharon Zdunowski; Leif Havton, MD, PhD

UC Davis: Rod Moseanko; Stephanie Hawbecker

Swiss Federal Institute of Technology: Gregoire Courtine, PhD

UC Irvine: Os Steward, PhD

University of Kentucky: John C. Gensel, PhD; Sasha Rabchevsky, PhD

U Miami/Miami Project: Vance Lemmon, PhD; John Bixby, PhD

University of Alberta: Karim Fouad, PhD

Stanford: Karen-Amanda Irvine, PhD; Alison Callahan, PhD; Steve McKenna, MD; Graham Creasey, MD

Texas A&M University: Jim W. Grau, PhD; Michelle A. Hook, PhD

University of Zurich: Armin Curt, MD

University of Minnesota: Jessica Nielson, PhD; Sophia Vinogradov, MD

Ohio State University: Phillip G. Popovich, PhD; Dana M. McTigue, PhD; Jan Schwab, MD, PhD; Michele Basso, EdD

University of Louisville: David Magnuson, PhD; Darlene Burke, MS; Scott Whittemore, PhD

NINDS TOP-NT

NIH/NINDS: Patrick Bellgowan, PhD

University of Florida: Kevin Wang, PhD

Johns Hopkins University: Jinyuan Zhou, PhD

Georgetown University: Mark Burns, PhD

UCLA: Neil Harris, PhD; Ina Wanner, PhD

Funding

US National Institutes of Health: R01NS088475, R01CA213441, R01AG056770, R01MH116156, U01NS086090, P30AR066262, UG3NS106899, U19AR076737

US Veterans Affairs: 1I01RX002245, I01RX002787

US Department of Defense: SC150198, SC150177

US Department of Energy

Craig H. Neilsen Foundation

Wings for Life Foundation

Page 32: Data Formatting: Challenges and Potential Solutions

Questions?