scaling sensors with data synthesis
Post on 20-Jan-2016
Scaling Sensors with
Data Synthesis
Catharine van Ingen
eScience Group
Microsoft Research
It was six* men of Indostan, to learning much inclined,
Who went to see the elephant though all of them were blind,
That each by observation, might satisfy his mind.
*data reporting error
Unprecedented Data Availability
• Created by the confluence of fast internet connectivity, commodity computing and advanced sensor technologies
• The ever more pressing challenge is how to make sense of it all
Navigating in Real-Time and Real-Space
Crop cycles 10^0 y
Competition, Gap Creation 10^1 y
Succession, Mortality 10^2 y
Species migration, Soil formation 10^3 y
Photosynthesis 10^-6 - 10^-3 y
Speciation, Extinction 10^6 y
Evolution 10^9 y
Stomata 10^-5 m
Leaf 10^-2 - 10^-1 m
Plant 10^-1 - 10^0 m
Canopy 10^0 - 10^3 m
Landscape 10^3 - 10^5 m
Chloroplast 10^-6 m
Continent 10^6 m
Globe 10^7 m
Sensors are the ante; Synthesis is the game
• Challenge: How do we use data to think about the future when the past is no longer a good predictor?
• Strategy: Scale up and down to bridge understanding and observational capabilities
• Approach: {mashup, derive, validate, analyze} repeat
• Hope: There are some technologies and methodologies that generalize to other disciplines with time and space drivers
Data-Driven Science Meets Public Policy and Economics
• GPP, or gross photosynthetic production, is a component of carbon fixation and is tied to water balance
• Implications for biofuels: GPP is higher in southern temperate forests than in the Midwest Corn Belt
Thanks to Dennis Baldocchi and Youngryel Ryu (UC Berkeley) 2010
About That Map
• Existing upscaling methods leverage sensor categorical aggregates
• Black(ish) box statistics applied to land cover informed by modeled or remote sensed meteorology
• Parameterization for biophysical model synthesis computation
• Simulation is not an option
• Radiative transfer meets turbulence meets systems biology
• Existing climate models “do not evince much skill” at capturing the biological processes
• Science disclaimer: Biofuel is more complex
• Efficient and renewable biofuel production includes factors such as harvest efficiency and transportation costs
Theory Meets Reality
• Big reduction: many inputs
• Not a matrix: some inputs have geospatial categorical dependencies
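$$ET = \frac{\Delta R_n + \rho_a c_p \, \delta q / r_a}{\lambda_v \left( \Delta + \gamma \left( 1 + r_s / r_a \right) \right)}$$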
ET = Water volume evapotranspired (m3 s-1 m-2)
Δ = Rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = Latent heat of vaporization (J/g)
Rn = Net radiation (W m-2)
cp = Specific heat capacity of air (J kg-1 K-1)
ρa = Dry air density (kg m-3)
δq = Vapor pressure deficit (Pa)
ra = Resistance of air (s m-1)
rs = Resistance of plant stomata (s m-1)
γ = Psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance across a catchment can be tricky
Penman-Monteith (1964)
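As a rough illustration of how the symbols defined above combine, here is a minimal Python sketch of the resistance form of the Penman-Monteith equation. The function name, argument names and sample values are assumptions made for this example and are not taken from the talk. The sketch omits the ground heat flux term, since the slide defines none, and it returns a mass flux (g m-2 s-1) because λv is given in J/g. Converting to a water volume would need an assumed water density.

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, r_a, r_s,
                       lambda_v, gamma=66.0):
    """Evapotranspiration as a mass flux (g m-2 s-1).

    delta    : change rate of saturation humidity with temperature (Pa K-1)
    r_n      : net radiation (W m-2)
    rho_a    : dry air density (kg m-3)
    c_p      : specific heat capacity of air (J kg-1 K-1)
    dq       : vapor pressure deficit (Pa)
    r_a, r_s : air and stomatal resistance (s m-1)
    lambda_v : latent heat of vaporization (J/g)
    gamma    : psychrometric constant (Pa K-1)
    """
    if r_a <= 0:
        raise ValueError("air resistance must be positive")
    # Energy term plus aerodynamic (vapor deficit) term.
    numerator = delta * r_n + rho_a * c_p * dq / r_a
    # Stomatal resistance enters only through the ratio r_s / r_a.
    denominator = lambda_v * (delta + gamma * (1.0 + r_s / r_a))
    return numerator / denominator


# Hypothetical mid-day values for a well-watered canopy (illustrative only).
et = penman_monteith_et(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                        dq=1000.0, r_a=50.0, r_s=70.0, lambda_v=2450.0)
print(f"ET = {et:.3f} g m-2 s-1")
```

Every input except γ varies in space and time, which is why estimating the resistances across a whole catchment is the hard part.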
Heterogeneous Data Sources
[Figure: data sources plotted by spatial scale in km (local, 0.1, 1, 10, 100, 1000, 10 000, global) against temporal scale (hour, day, week, month, year, decade, century). Sources shown: forest inventory plots, plot/site, eddy covariance sensor towers, tall tower sensor observatories, forest/soil inventories, countries, EU, land surface remote sensing, remote sensing of CO2]
Thanks to Markus Reichstein (Max Planck) 2010
Sourcing from Imagery, Sensors, Models, Field Data and Wisdom
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
FLUXNET curated field dataset: 2 KB (1 file)
FLUXNET curated sensor dataset: 30 GB (960 files)
NASA MODIS imagery archives: 5 TB (600K files)
10 US years
1 global year ~ 13 US years
http://www.fluxdata.org
Validation Classic
Local: direct pixel comparison with ground deployment
• Known good or known bad
Global: qualitative map views and large aggregates comparison
• Includes inter-annual variations
Global GPP: 118 ± 26 PgC/y (literature range 107-167)
Radiation model expected to underestimate in the tropics
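The local validation step, a direct comparison of a map pixel with the ground deployment beneath it, can be sketched as a bias and root-mean-square-error calculation. This is an illustrative sketch only: the function name and the sample numbers are hypothetical, and the talk does not say which statistics were actually used.

```python
import math


def validation_stats(pixel_values, tower_values):
    """Return (bias, rmse) of pixel estimates against tower observations."""
    if len(pixel_values) != len(tower_values) or len(pixel_values) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for p, t in zip(pixel_values, tower_values)]
    # Bias: mean signed error; positive means the map overestimates.
    bias = sum(errors) / len(errors)
    # RMSE: typical error magnitude, regardless of sign.
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Hypothetical paired values for one site (same units on both sides).
pixel = [2.0, 3.0, 4.0]
tower = [1.0, 3.0, 5.0]
bias, rmse = validation_stats(pixel, tower)
print(f"bias = {bias:.3f}, rmse = {rmse:.3f}")
```

A near-zero bias with a large RMSE, as in this toy case, indicates errors that cancel on average. That is one reason large aggregates can look right while individual pixels do not.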
Shows high summer water use in the rice growing region of the Sacramento Valley and (blue) rock outcrop
The great frontier of unknown unknowns
• Qualitative map observations require local knowledge – crowd source via citizen science?
• Geospatial feature determination errors can be significant
Validation Vanguard
Scaling: The Synthesis Trifecta
• Science
  • Incorporate discovered or known omissions such as elevation, fires, storms, fertilizer
  • Regional analysis flame tests
• Sensors
  • Refining existing sensors and variable derivations
  • Incorporating new emerging sensors such as web cams
• Substrate
  • Move compute to data
  • Supercomputer size, but not supercomputing friendly
  • Data discovery, reuse, harmonization
Sensors are ~20 km apart; one shows impact of calibration drift
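One simple way to look for calibration drift between two nearby sensors is to fit a straight line to the difference of their readings over time. The sketch below is an assumption about how such a check might be done and is not the method used in the talk. The function name and data are hypothetical.

```python
def drift_slope(times, sensor_a, sensor_b):
    """Least-squares slope of (sensor_a - sensor_b) against time.

    A slope near zero suggests the two sensors track each other;
    a persistent non-zero slope suggests one of them is drifting.
    """
    if not (len(times) == len(sensor_a) == len(sensor_b)) or len(times) < 2:
        raise ValueError("need at least two aligned samples")
    diffs = [a - b for a, b in zip(sensor_a, sensor_b)]
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(diffs) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("time values must not all be equal")
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, diffs))
    return cov / var_t


# Hypothetical yearly means: sensor A gains 0.5 units per year relative to B.
years = [0, 1, 2, 3]
a = [10.0, 11.5, 13.0, 14.5]
b = [10.0, 11.0, 12.0, 13.0]
slope = drift_slope(years, a, b)
print(f"relative drift = {slope:.2f} units per year")
```

The slope alone does not say which sensor drifted. That still needs a third reference or a recalibration record.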
Phenocam detecting leaf green up and green down
Sacramento Delta 10 year average evapotranspiration
Anecdote, Analysis, Action
I was walking Dry Creek and saw stranded fish…
..had local farmers turned on sprinklers?
Flow vs Temperature 2008 Detail