Post on 22-Mar-2018
Big Data: Challenges in Agriculture
Big Data Summit, November 2014
Moorea Brega: Agronomic Modeling Lead – The Climate Corporation
Outline
THE AGRICULTURAL CHALLENGE
THE ROLE OF DATA SCIENCE
DATA SCIENCE MEETS AGRICULTURE
The Agricultural Challenge
Forty decisions, Forty outcomes
Key Decisions
equipment selection, results analysis, seed selection, crop selection
fertility management, water management
planting logistics, planting practices
scouting, inputs, fertility
harvest logistics, harvest timing, grain marketing
Each season about 40 management decisions are made
Sequential Multi-Armed Bandit
Bayesian network
Many decisions, one outcome
Forty outcomes
A typical farmer manages 40 seasons, providing 40 outcomes.
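The bandit framing named above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the slides name the framing but not an algorithm, so epsilon-greedy is swapped in as a simple stand-in, and the three practices, their mean yields, and the noise level are all made up for illustration.

```python
import random

def run_epsilon_greedy(true_means, seasons=40, epsilon=0.2, noise_sd=15.0, seed=0):
    """Simulate one farmer choosing one practice per season (epsilon-greedy bandit)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # seasons in which each practice was tried
    estimates = [0.0] * n_arms   # running mean yield observed per practice
    history = []
    for _ in range(seasons):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]                      # try every practice once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)           # explore a random practice
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit best so far
        reward = rng.gauss(true_means[arm], noise_sd)  # one noisy outcome per season
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        history.append((arm, reward))
    return counts, estimates, history

# Hypothetical mean yields (bushels per acre) for three practices; illustrative only.
counts, estimates, history = run_epsilon_greedy([150.0, 160.0, 155.0])
print(counts, [round(e, 1) for e in estimates])
```

With only 40 pulls and season-to-season noise of this size, the per-practice estimates stay uncertain, which is the point of the "forty outcomes" framing.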
Data Science Meets Agriculture
The Next Revolution?
GREEN REVOLUTION
INTENSIFY: Apply breeding, fertilization to increase yields.
BIOTECH REVOLUTION
BIOTECH: Marker assisted selection, traits, chemistries, microbials.
GREEN DATA REVOLUTION
OPTIMIZE: Apply data science to optimize management.
Data Available: One Crop, One Season, One Country
YIELD MONITOR DATA: 14B OBSERVATIONS
REMOTE SENSING DATA: 260B OBSERVATIONS
WEATHER DATA: 20B OBSERVATIONS
Yield Modeling
y = f(g, e, p) + ε
where y is yield, g is genetics, e is the environment, p is farming practices, and ε is variability.
Yield is a function of genetics, the environment, and farming practices.
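To make the decomposition y = f(g, e, p) + ε concrete, here is a minimal sketch. The response function `f`, its coefficients, and the noise level are hypothetical placeholders, not an agronomic model from the talk. It shows that a single observed yield mixes signal and noise, while the average over many replicated observations approaches f(g, e, p).

```python
import random
import statistics

def f(g, e, p):
    """Hypothetical deterministic yield response (illustrative coefficients only).

    g: genetic potential score in [0, 1]
    e: seasonal rainfall in mm
    p: seeding rate in thousands of seeds per acre
    """
    return 100.0 + 20.0 * g + 0.1 * e - 0.05 * (p - 30.0) ** 2

def observe_yield(g, e, p, rng, noise_sd=10.0):
    """One observed yield: y = f(g, e, p) + epsilon, with Gaussian epsilon (assumed)."""
    return f(g, e, p) + rng.gauss(0.0, noise_sd)

rng = random.Random(42)
expected = f(0.5, 500.0, 32.0)
samples = [observe_yield(0.5, 500.0, 32.0, rng) for _ in range(2000)]
print(expected, statistics.mean(samples))
```

In practice a field yields one observation per season, not 2000 replicates, so the noise term cannot simply be averaged away.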
Yield Optimization
OPTIMIZED YIELD: Yield optimized for the environment by optimizing genetics and management using a predictive model.
YIELD: Yield optimized for the environment by optimizing genetics and management using traditional practices.
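In its simplest reading, optimizing genetics and management with a predictive model means holding the environment fixed and searching the candidate choices for the highest predicted yield. The sketch below does this by exhaustive search. `predict_yield` is a hypothetical stand-in for a trained model, and the hybrid names, seeding rates, and coefficients are invented for illustration.

```python
from itertools import product

def predict_yield(hybrid, seeding_rate, rainfall_mm):
    """Hypothetical stand-in for a trained predictive yield model."""
    base = {"hybrid_a": 150.0, "hybrid_b": 158.0}[hybrid]
    # Assumed for illustration: wetter seasons favor a denser planting rate.
    best_rate = 30.0 + 0.01 * (rainfall_mm - 500.0)
    return base - 0.3 * (seeding_rate - best_rate) ** 2

def optimize(rainfall_mm, hybrids, seeding_rates):
    """Return the (hybrid, seeding_rate) pair with the highest predicted yield."""
    return max(product(hybrids, seeding_rates),
               key=lambda c: predict_yield(c[0], c[1], rainfall_mm))

# Made-up candidate set and environment.
choice = optimize(700.0, ["hybrid_a", "hybrid_b"], [28, 30, 32, 34])
print(choice)
```

Exhaustive search is only workable for a handful of options. With about 40 interacting decisions per season the candidate space grows too quickly, which is one reason the talk turns to sequential formulations.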
Challenges in Applying Data Science
Challenges
Data Challenges
• Spatio-Temporal Data
• Heterogeneous Data
• Missing Data
• Noisy Data
Learning Challenges
• Latent Features
• Curse of Dimensionality
• Multi-task Learning
[Timeline labels: 2005, 1950, 1870]
Multi-sensor Data: observed, gridded data at high spatial resolution
Reanalysis Data: gridded data at coarse spatial resolution, produced by deterministic weather models
Gauge Data: observed data, sparse in space and time
Data Challenges: Spatial Misalignment
Data Challenges: Spatial and Temporal Misalignment
Low resolution, higher temporal frequency
High resolution, lower temporal frequency
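One common way to reconcile sources like these is to bring them to a shared time step and a shared grid before modeling. The sketch below is a minimal illustration with made-up numbers: it sums an hourly series to daily totals and replicates coarse grid cells onto a finer grid (nearest-neighbor upsampling). Function names and data are hypothetical; the talk does not say which alignment method was used.

```python
def daily_totals(hourly):
    """Sum an hourly series (length a multiple of 24) into daily totals."""
    if len(hourly) % 24 != 0:
        raise ValueError("expected whole days of hourly values")
    return [sum(hourly[i:i + 24]) for i in range(0, len(hourly), 24)]

def upsample_nearest(coarse, factor):
    """Repeat each coarse grid cell factor x factor times to match a finer grid."""
    fine = []
    for row in coarse:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend([list(fine_row) for _ in range(factor)])
    return fine

hourly_rain = [0.5] * 24 + [0.0] * 24     # two days of made-up hourly rainfall (mm)
coarse_grid = [[1.0, 2.0], [3.0, 4.0]]    # made-up 2 x 2 coarse grid
daily = daily_totals(hourly_rain)
fine_grid = upsample_nearest(coarse_grid, 2)
print(daily)
print(fine_grid)
```

Replicating coarse cells makes the grids line up but adds no information; the fine grid still carries only the coarse source's detail.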
Data Challenges: Heterogeneous Data
Data Challenges: Missing Data
Yield data can be missing due to:
● pest/disease
● low yield due to other causes (heat, drought, frost, ponding)
● equipment malfunction
● data post-processing by an outside entity
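A first step with missing yield records is to keep them explicit rather than treating them as zeros, and to report how much is missing alongside any summary. The sketch below uses `None` as the missing marker and made-up yield values; both are assumptions for illustration.

```python
def summarize_yield(records):
    """Return (mean of observed yields, fraction of records missing)."""
    if not records:
        raise ValueError("no records to summarize")
    observed = [r for r in records if r is not None]
    missing_fraction = 1.0 - len(observed) / len(records)
    mean_observed = sum(observed) / len(observed) if observed else None
    return mean_observed, missing_fraction

# Made-up yields (bushels per acre); None marks a missing record.
yields = [162.0, None, 158.0, None, 171.0, 149.0]
mean_observed, missing_fraction = summarize_yield(yields)
print(mean_observed, round(missing_fraction, 2))
```

Because several of the listed causes (pest, disease, drought, frost) are themselves tied to low yield, the records are not missing at random, and the mean of the observed values will tend to overstate the true average.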
Data Challenges: Noisy Data
Noise can come from many sources:
● Clouds and atmospheric disturbance
● Equipment malfunction/equipment calibration issues
● Measurement error
● Human error
● Mislabeled data or data with no labels
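Gross errors such as a stuck or miscalibrated sensor can often be screened with a robust outlier rule before modeling. The sketch below uses the modified z-score based on the median and the median absolute deviation (MAD), with the conventional 0.6745 scaling and 3.5 cutoff. The readings are made up, and this is one generic screening technique, not the method used in the talk.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)   # no spread to judge against
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

# Made-up yield monitor readings with two implausible values.
readings = [152.0, 149.0, 155.0, 151.0, 0.0, 153.0, 410.0, 150.0]
flags = flag_outliers(readings)
print([r for r, bad in zip(readings, flags) if bad])
```

A rule like this catches only extreme values; subtler noise such as cloud contamination or mislabeled fields needs source-specific handling.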
Inherent Complexity
Zea mays (corn)
Genetics, Environment, Practices
Soil Processes
Nutrient Processes
Crop Processes
YIELD
The Role of Data Science
The Role of Data Science: A Coherent View of the Field
weather sensors, remote sensors, ground sensors
Use multiple sources of data before and during the growing season to provide growers with insights and recommendations.
The Role of Data Science: Identifying Crop Stress
Insights: When is the crop under stress?
Recommendations: What actions can I take to correct this?
The Role of Data Science: Nutrient Applications
Grower practices, weather data
Insights: How much nitrogen is available to my crop?
Recommendation: How much fertilizer should I apply?
The Role of Data Science
Bayesian network
Optimize each decision for risk-adjusted return
Step-wise optimization of conditional expected utility
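A minimal sketch of step-wise optimization follows: decisions are fixed one at a time, and each is chosen to maximize a risk-adjusted utility of simulated returns conditional on the decisions already made. Everything specific here is an assumption for illustration: the utility (mean minus a multiple of the standard deviation), the `simulate_return` function and its numbers, and the decision names. The slide's Bayesian network is replaced by a simple hand-written simulator.

```python
import random
import statistics

def risk_adjusted(returns, risk_aversion=0.5):
    """Average return penalized by its standard deviation (assumed utility form)."""
    return statistics.mean(returns) - risk_aversion * statistics.pstdev(returns)

def simulate_return(plan, rng):
    """Hypothetical simulator of net return ($/acre) for a partial or full plan."""
    mean, sd = 0.0, 5.0
    if plan.get("seed") == "premium":
        mean, sd = mean + 40.0, sd + 30.0   # higher return, much higher risk
    elif plan.get("seed") == "standard":
        mean, sd = mean + 30.0, sd + 5.0
    if plan.get("nitrogen") == "split":
        mean, sd = mean + 15.0, sd - 2.0
    elif plan.get("nitrogen") == "single":
        mean, sd = mean + 12.0, sd + 3.0
    return rng.gauss(mean, sd)

def stepwise_optimize(decisions, n_sims=2000, seed=1):
    """Fix decisions in order, each maximizing utility given the earlier choices."""
    plan = {}
    for name, options in decisions:
        def utility(option):
            rng = random.Random(seed)            # same random draws for every option
            trial = dict(plan, **{name: option})
            return risk_adjusted([simulate_return(trial, rng) for _ in range(n_sims)])
        plan[name] = max(options, key=utility)
    return plan

plan = stepwise_optimize([("seed", ["standard", "premium"]),
                          ("nitrogen", ["single", "split"])])
print(plan)
```

In this toy setup the premium seed has the higher average return but loses once risk is penalized. A step-wise search is greedy, so it is not guaranteed to find the plan that would be best if all decisions were optimized jointly.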
Questions? moorea@climate.com