Helping the World’s Farmers Adapt to Climate Change

Strata Conference, Oct 2012

Siraj Khaliq, CTO, The Climate Corporation

Fritchton, IN – late summer, 2012

Louisville, IL

Wichita, KS

1956, 2012, 1988

Worst US Droughts in the Last Fifty Years

-16%: 2012 Estimated Corn Yield (USDA)

+6%: World food prices, month-on-month change in July 2012 (UN FAO)

Large capital outlays at start of season (April)

Seed, equipment, pesticide, and land

Revenue comes in at harvest

1-2 years of revenue shortfall could be catastrophic

Futures help with price volatility, not weather

Farm Economics

Farmer Rich Vernon talks to NPR's David Schaper (audio)

A real-life example

This is set to continue

To help all the world's people & businesses manage and adapt to climate change

Our Mission

Evaluating Markets

$4.2 Trillion

2012 Estimated Corn Yield (USDA)

Total Weather Insurance (TWI)

TWI Demo

Outcome

Weather Data

Policy

Modeled Outcomes

Weather Simulations

Structure

Structure

How does weather impact crop yield?

Structure

Varies based on many inputs:

Temperature

Precipitation

Soil type

Topography

Farming practices

Crop varietal

Structure

Agronomically deduced candidates

Model at large scale

Every farm in the US (20M)

Structure

Modeled Outcomes

Weather Simulations

Structure

What weather do we expect?

Weather Simulations

1M locations (2.5 mi x 2.5 mi grid)

10k scenarios/location, going 2 years out

2 measurements

60 TB of data per simulation set, every couple of weeks
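The roughly 60 TB figure can be sanity-checked from the other numbers on this slide. The sketch below is a back-of-envelope calculation only: the daily time step and the 4 bytes per stored value are assumptions, not details given in the talk.

```python
# Back-of-envelope size of one simulation set.
# From the slide: 1M locations, 10k scenarios/location, 2 years out, 2 measurements.
# ASSUMPTIONS (not in the slides): one value per day, 4 bytes per value.
LOCATIONS = 1_000_000
SCENARIOS_PER_LOCATION = 10_000
MEASUREMENTS = 2
DAYS = 2 * 365          # assumed daily time step over two years
BYTES_PER_VALUE = 4     # assumed 32-bit float


def simulation_set_bytes() -> int:
    """Total raw bytes for one simulation set under the assumptions above."""
    values = LOCATIONS * SCENARIOS_PER_LOCATION * MEASUREMENTS * DAYS
    return values * BYTES_PER_VALUE


if __name__ == "__main__":
    print(f"{simulation_set_bytes() / 1e12:.1f} TB")
```

Under those assumptions the raw total is 58.4 TB, which is in line with the figure quoted on the slide.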

Weather Simulations

2.5 x 2.5 Square Miles

Weather Simulations

Expensive computation

Parallelizing hard due to correlations

Would take 80+ years on one fast modern server-class machine

We need to generate these within days
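To put "80+ years on one machine" and "within days" side by side, here is a rough sizing sketch. The target duration and the `efficiency` parameter are illustrative assumptions; the talk gives neither, and it notes that correlations make perfect scaling unrealistic.

```python
import math

SINGLE_MACHINE_YEARS = 80   # from the slide ("80+ years")
DAYS_PER_YEAR = 365.25


def machines_needed(target_days: float, efficiency: float = 1.0) -> int:
    """Machines required to finish in target_days.

    efficiency is the ASSUMED fraction of ideal speedup actually achieved
    (1.0 = perfect scaling, which correlated simulations will not reach).
    """
    if target_days <= 0 or not 0 < efficiency <= 1:
        raise ValueError("target_days must be > 0 and efficiency in (0, 1]")
    total_machine_days = SINGLE_MACHINE_YEARS * DAYS_PER_YEAR
    return math.ceil(total_machine_days / (target_days * efficiency))
```

With perfect scaling, `machines_needed(3)` gives 9,740 machines for a three-day run. At 50% efficiency the count doubles to 19,480.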

Soil Moisture Modeling

What's the soil moisture at farm X?

Soil Moisture Modeling

soil type, weather, topography, crop
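As a rough illustration of how those inputs could combine, here is a textbook single-layer "bucket" water balance. This is not the model described in the talk, which is not specified. Every name and parameter is hypothetical: soil type is reduced to a holding capacity, weather to daily rainfall, crop to a daily water demand, and topography is ignored.

```python
def step_soil_moisture(moisture_mm, rain_mm, et_mm, capacity_mm):
    """Advance a single-layer bucket water balance by one day.

    Water above capacity is lost as runoff or drainage; moisture cannot go below 0.
    """
    updated = moisture_mm + rain_mm - et_mm
    return max(0.0, min(capacity_mm, updated))


def simulate(initial_mm, daily_rain_mm, daily_et_mm, capacity_mm):
    """Return the soil moisture after each day of the given weather series."""
    moisture = initial_mm
    series = []
    for rain, et in zip(daily_rain_mm, daily_et_mm):
        moisture = step_soil_moisture(moisture, rain, et, capacity_mm)
        series.append(moisture)
    return series
```

For example, `simulate(50.0, [0.0, 100.0, 0.0], [5.0, 5.0, 5.0], 120.0)` dries slightly, fills to capacity after the storm, then starts drying again.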

Evolution of Our Technology

Java frontend

PricingServer (Rserve)

400 stations

All data in MySQL

Pricing servers (Rserve)

Java-based webapp

Java frontend

PricingServer (Rserve)

2000 stations

Weather data now on disk

Versioning hard

Java-R bridge messy

Java frontend

PricingServer (java)

SimulationService

Weather data Service

Sim gen (Hadoop)

SimpleDB / S3 SimpleDB / S3

2009-2010

22,000 locations

Rserve replaced by Java

Simulations & S3/SimpleDB

Model gen in Hadoop

Moved fully to EC2

Rails frontend

PricingServer (java)

Marty (HBase) geo data store

Sim gen (cascalog)

2011 – today

1,000,000 locations

Own big geo-data store

Many more Hadoop jobs

Eliminated SimpleDB

Soil moisture dataset gen (cascalog)

Structures gen (cascalog)

Other hadoop jobs

MapReduce at TCC

Python (Hadoop streaming)

Some native Java

Most are higher-level frameworks
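For readers unfamiliar with Hadoop Streaming: the mapper and reducer are ordinary programs that read lines on stdin and write tab-separated key/value lines on stdout, with the framework sorting by key in between. The sketch below shows that shape in Python. The input layout (`station<TAB>date<TAB>value`) and the per-date averaging are assumptions chosen for illustration, not a job described in the talk.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit 'date<TAB>value' for each 'station<TAB>date<TAB>value' input line."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip malformed records
        _station, date, value = parts
        yield f"{date}\t{value}"


def reducer(lines):
    """Average values per date; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for date, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [float(v) for _, v in group]
        yield f"{date}\t{sum(values) / len(values):.2f}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # The first argument selects the role: "map" or "reduce".
    stage = mapper if sys.argv[1] == "map" else reducer
    for out in stage(sys.stdin):
        print(out)
```

Because both stages are plain generator functions, they can be exercised locally by piping `sorted(mapper(lines))` into `reducer` before anything is submitted to a cluster.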

Big Wins

Cascalog/Clojure

EC2 Spot Instances

“NoSQL”

Big Win #1 - Cascalog

    (defn weather-map-q
      "Creates a Cascalog query to extract individual measurement values of
      ObservationSet data and produces tuples of [date JSON-encoded map], in
      which each JSON-encoded map is keyed by station-id"
      [stations interval measurement sources start end nostra]
      (<- [?date ?json-aggregated-values]
          ;; from hfs-textline
          (stations ?station-id)
          (fetch-obs-for-station [interval measurement sources start end nostra] ?station-id :> ?obs)
          (extract-values-by-date ?obs :> ?date ?value)
          (aggregate-values ?value :> ?aggregated-values)
          (json/generate-string ?aggregated-values :> ?json-aggregated-values)))

Big Win #1 - Cascalog

Easily composable workflows

Can unit test Hadoop flows

Quick iteration

Big Win #2 – EC2 Spot Instances

Good fit to our compute approach

Can be very cheap

Good availability

MapReduce at TCC

Big Win #3 – NoSQL

Datasets must be:

Repeatably generated

Versioned

Indexed

Big Win #3 – NoSQL

Why not SQL?

Time-series data, not relational

Large size and ad hoc structure

Specific query patterns

10s of terabytes in size

NoSQL at TCC - Marty

Own big geo-data store

Built on HBase

Billions of records
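The talk does not describe Marty's schema. As a purely hypothetical sketch of how versioned, indexed time-series data might be laid out in a store that sorts rows by byte key (as HBase does), one option is a composite key of dataset, version, grid cell, and date. All names and field widths below are assumptions.

```python
from datetime import date


def scan_prefix(dataset: str, version: int, cell_id: int) -> bytes:
    """Prefix matching every date for one cell in one dataset version."""
    if version < 0 or cell_id < 0:
        raise ValueError("version and cell_id must be non-negative")
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{dataset}|v{version:04d}|{cell_id:08d}|".encode("ascii")


def row_key(dataset: str, version: int, cell_id: int, day: date) -> bytes:
    """Build a sortable key: dataset, version, grid cell, then ISO date."""
    return scan_prefix(dataset, version, cell_id) + day.isoformat().encode("ascii")
```

With this layout, all days for one location in one dataset version sit next to each other, so a time-series read becomes a single prefix scan, and a regenerated dataset can be written under a new version without touching the old one.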

Learning #1 – Embrace Hadoop

Defines problem clearly

Focus on problem more than architecture

Great tools and community support

Learning #2 – Be Careful

Fail-fast code

Test, test, test

Run at small scale first
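A minimal sketch of what "fail fast" and "run at small scale first" can look like in a batch job. The record layout and the plausibility bounds are assumptions made up for this example.

```python
from itertools import islice

# ASSUMED plausibility bounds for a temperature reading, in degrees Celsius.
MIN_TEMP_C = -90.0
MAX_TEMP_C = 60.0


def parse_observation(line):
    """Parse 'station<TAB>date<TAB>value', raising immediately on bad input."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    station, day, raw = parts
    value = float(raw)  # raises ValueError on non-numeric input
    if not MIN_TEMP_C <= value <= MAX_TEMP_C:
        raise ValueError(f"temperature out of range: {value}")
    return station, day, value


def dry_run(lines, limit=1000):
    """Parse only the first `limit` records before committing to a full run."""
    return [parse_observation(line) for line in islice(lines, limit)]
```

Raising on the first bad record surfaces data problems in the small dry run rather than hours into a full-scale job.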

Learning #3 – Architecture Matters

Eliminate single points of failure

Consider memory usage and I/O

Write simple flows with checkpointing

Monitoring is invaluable
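One way to read "simple flows with checkpointing": each step saves its output, and a rerun skips any step whose output already exists. The sketch below uses local JSON files for illustration only. The function name and storage layout are hypothetical, and a real Hadoop flow would checkpoint to a distributed store.

```python
import json
import os
import tempfile


def run_step(name, func, checkpoint_dir):
    """Run func() once; later calls reuse the result saved under checkpoint_dir."""
    path = os.path.join(checkpoint_dir, f"{name}.json")
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    result = func()
    # Write to a temp file, then rename, so a crash never leaves a partial checkpoint.
    fd, tmp = tempfile.mkstemp(dir=checkpoint_dir)
    with os.fdopen(fd, "w") as fh:
        json.dump(result, fh)
    os.replace(tmp, path)
    return result
```

If a later step fails, rerunning the flow repeats only the work that had not yet been checkpointed.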

TCC Today

150 employees

Half engineering

20 PhDs

Reputation for hard science problems

… by standing on the shoulders of giants

Open Source at TCC

github.com/TheClimateCorporation

Lemur (EMR / Clojure)

Repoman (coming soon)

Marty (coming)
