Improving Access to Open Data through NOAA's Big Data Project

Edward J. Kearns, Ph.D.
NOAA Chief Data Officer, Office of the Chief Information Officer

NIST Big Data Workshop, Gaithersburg, MD
Friday, June 2, 2017



Page 2: NOAA’s Mission(s)

Life and Property Aviation Maritime Space Operations Forests

Emergency Management

Commerce Ports Energy Hydropower

Reservoir Control Infrastructure Construction Agriculture Recreation

Ecosystems Health Environment

Page 3: NOAA and Open Data

• NOAA is responsible for services related to Weather, Oceans, Coasts, Marine Ecosystems, Fisheries, Space Weather, ….

• Almost all of NOAA’s data are “open data” – free to all, no limitations on use, in the public domain
  • International partners
  • Industry

• How do we share these data and enable their use in Big Data Applications, both for NOAA and others?


Page 4: NOAA NCEI Information Users

National Centers for Environmental Information | Center for Weather and Climate

[Figure: two charts of NCEI information users.]

By Targeted Sector: Science, Technology, & Engineering; Ecosystems (Agriculture/Aquaculture); Transportation and Infrastructure; Energy; Insurance, Finance, & Legal; Health & Emergency Management. Values shown: 12.4%, 10.9%, 7.6%, 6.9% (the sector each value belongs to is not recoverable from the transcript).

By Resolved Theme: Precipitation, Temperature, Wind, Snow, Storm, Pressure, Humidity, Forecasts.

Page 5: NOAA NCEI’s General User Community

Fraction (%) | Typical User | Data or Info Need | Preferred Format | Access Volume | Access Frequency
~70 | General business, media, public | Qualitative | Point-and-click, graphics, assessments | Low | High
~15 | Researchers, business consultants | Quantitative | Digital downloads | High | Low
~15 | Value-added providers | Quantitative | Digital downloads, machine to machine | Low | High

Page 6: NOAA/NCEI’s Environmental Data Archive

Increasing volumes from station, model, radar, & satellite data

[Figure: archive volume in petabytes. 2016 total: 28.6 petabytes; growth due to increase in satellite and model data.]

Page 7: NCEI Projections for NOAA Data Archives

Data Volumes Rapidly Increasing

Page 8: NOAA NCEI Access by Volume

Demand for Access Increasing

NOAA Budgets Are Not Increasing Like This!

Page 9: Why is NOAA so interested in Partnerships for Open Data?

● NOAA struggles to keep up with increasing public demand

○ Budgets for capacity and security aren’t keeping pace with rapidly increasing data access costs.

● NOAA wants to learn about collaborative solutions

○ Seek opportunities for industry to support data access

○ Promote use and democratize data access

○ Utilize new technologies

○ Enable new economic opportunities for partners.

Page 10: The Big Data Project (BDP)

Leverage the value of NOAA’s data to increase their utilization

Keys

• Bring users to the data
  Not “just” about access
• CRADAs (Cooperative Research and Development Agreements) - research activity (2015)
• NOAA’s open data - freely available
• NOAA’s subject matter expertise
• Industry’s infrastructure expertise
• Level playing field
  No privileged access
• Democratization of NOAA data
  New opportunities for business

Page 11: BDP Data Access Strategy

Leveraging Industrial Partners

[Diagram labels: Augment; Amplify; Add Capabilities; Enable Big Data Applications; Add Capacity; Extend reach]

Page 12: BDP Elements

● NOAA Basic Access - Limited capacity

○ Stewardship and service

○ Data Integrity and Authenticity

● Industrial Partners - Augment and Amplify

○ Capabilities, Technologies, Capacities

○ Scales to meet demand

○ Market pays for what it needs

Page 13: NIST Big Data Reference Architecture (NBDRA)

Page 14: NIST Big Data Reference Architecture (NBDRA)

[Figure: NBDRA diagram annotated with six “BDP” labels.]

Page 15: BDP Collaborator Services

• AWS
  • https://aws.amazon.com/noaa-big-data/nexrad/
• Google Cloud Platform
  • https://cloud.google.com/bigquery/public-data/
• IBM
  • https://noaa-crada.mybluemix.net/node/32
• Microsoft
• Open Commons Consortium
  • https://www.opensciencedatacloud.org/publicdata/?commons_type=Environmental

Page 16: Example BDP Success: Google

• https://www.youtube.com/watch?v=OVc9GZzQu-4&index=7&list=PLIivdWyY5sqI6Jd0SbqviEgoA853EvDsq
• 1.2 PB of climate and weather data accessed through BigQuery, from Jan-Apr 2017
• Without “trying”
• Joins, joins, joins
• 30-100x of NOAA deliveries in that time
• Images in Earth Engine
• GOES-16 (June 2017)
• National Water Model
• Climate models
• Climate data records
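As a rough consistency check on the Google figures: if the 1.2 PB accessed through BigQuery was 30 to 100 times what NOAA itself delivered in the same period, the implied NOAA delivery volume is on the order of 12 to 40 TB. The Python sketch below works through that arithmetic. It assumes decimal units (1 PB = 1000 TB), which the slide does not specify, and the function name is hypothetical.

```python
def implied_noaa_delivery_tb(partner_pb: float, multiplier: float) -> float:
    """Return the NOAA delivery volume (TB) implied by a partner volume and a ratio.

    partner_pb: volume served by the partner, in petabytes.
    multiplier: how many times larger the partner volume is than NOAA's own deliveries.
    """
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    tb_per_pb = 1000.0  # assumption: decimal units, not stated on the slide
    return partner_pb * tb_per_pb / multiplier


# The slide gives a range of 30x to 100x, so compute both ends.
low_tb = implied_noaa_delivery_tb(1.2, 100)
high_tb = implied_noaa_delivery_tb(1.2, 30)
print(f"Implied NOAA deliveries: about {low_tb:.0f} to {high_tb:.0f} TB")
```

The result is only as precise as the slide's rounded inputs, so it is best read as an order-of-magnitude estimate.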

Page 17: Example BDP Success: AWS

NEXRAD Weather Radar Data on AWS

• 80% of Orders Through AWS
• ~50% of Data Stayed on AWS
• Amazingly Quick Results
• AWS Wins: Revenue generated
• NOAA Wins: Reduced loads
• End User Wins: Easier access & production
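For readers who want to try the NEXRAD archive on AWS, the Python sketch below builds the storage prefix for one radar station and one day. It rests on two assumptions that are not stated in the slides: that the Level II archive lives in a public S3 bucket named `noaa-nexrad-level2`, and that keys are laid out as `YYYY/MM/DD/STATION/`. Both should be checked against the AWS page listed under BDP Collaborator Services. The function name is hypothetical.

```python
from datetime import date

# Assumption: bucket name and key layout are not given in the slides.
NEXRAD_BUCKET = "noaa-nexrad-level2"


def nexrad_day_prefix(station: str, day: date) -> str:
    """Build the assumed key prefix for one station's files on one day."""
    station = station.strip().upper()
    # Radar site identifiers are four-character codes such as KTLX.
    if len(station) != 4 or not station.isalnum():
        raise ValueError(f"unexpected station identifier: {station!r}")
    return f"{day:%Y/%m/%d}/{station}/"


prefix = nexrad_day_prefix("KTLX", date(2017, 6, 2))
print(f"s3://{NEXRAD_BUCKET}/{prefix}")
```

The prefix can then be passed to any S3 client's listing call, for example `list_objects_v2(Bucket=..., Prefix=...)` in boto3, to enumerate that day's files without going through a NOAA ordering system.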

Page 18: BDP Challenges

● How do we manage public data access costs?
○ Egress versus “marginal cost of dissemination”
○ Importance of 3rd parties in understanding the market values
○ Do all NOAA’s data have a commercially-viable market?

● How to transfer knowledge of and curate many complex datasets?
○ How to ensure data integrity and authenticity?
○ How to best facilitate Big Data applications?
○ Real-time, e.g. satellites, weather observations, coastal data
○ Retrospective, e.g. reanalyses and observations, fisheries
○ Forecasts, e.g. CFSv2 and NMME climate models

● What comes next, after the CRADA expires?
○ Spin off new agreements or partnerships as April 2018 nears?
○ Have we learned enough yet? If not, extend CRADAs?

Page 19: Acknowledgements

Many thanks to:
BDP Team: Andy Bailey, Shane Glass, Jeff de la Beaujardiere, Tony LaVoi, Jay Morris
NOAA: Brian Eiler, Zach Goldstein, Dave Michaud, Glenn Tallia, Derek Hansen, Amy Gaskins*, Alan Steremberg*, Maia Hansen*, Steve Ansari, Steve Del Greco*, Brian Nelson, Carlos Rivero*, Ken Casey, Rich Baldwin
NC State University / CICS-NC: Otis Brown, Scott Wilkins, Jonathon Brannock, Lou Vazquez, Scott Stevens, Paula Hennon, Andrew Buddenberg, Angel Li

NOAA’s Big Data Collaborators and their partners (not an all-inclusive list)
Amazon: Jed Sundwall, Arial Gold (now @DOT), Jeff Layton
Microsoft: Sam Khoury, Sid Krishna, Shannon Murphy
Google: Will Curran, Matt Hancher, Mike Hamberg, Eli Bixby, Tino Tereshko, Amy Unruh, Tanya Shastri, Ossama Alami, Valliappa “Lak” Lakshmanan (formerly @TCC)
Open Commons Consortium: Walt Wells, Maria Patterson, Zac Flamig
Unidata: Mohan Ramamurthy, Jeff Weber
IBM: James Stevenson, Stefani Jones, Mary Glackin, Peter Neilley, John Aviles
The Climate Corporation: Adam Pasch

Page 20: Questions?

[email protected]

#NOAABigData

http://www.noaa.gov/big-data-project

Page 21: More Datasets to Come…

• In Progress
  • Satellite data GOES-16
  • National Water Model
  • Climate models CFSv2/NMME
  • Fish catch/bycatch data
  • Marine Genomics
  • Surface Temp and Precip
  • Surface weather
  • Multi-Radar Multi-Sensor SST time series
  • Forecast database

• Others in discussion
  • Collections of realtime data
  • Collections of event data
  • Marine data around theme
  • Sandbox for experimentation
  • Weather model intermediate products