Hydroinformatics and Integrated Hydrology Research Group. Francisco Munoz-Arriola 1,2, Diego Jarquin 3, Hallie Hohbein 4, Parisa Sarzaeim 4, Joseph Carter 4, David Recic 4, Zoe Trautman 4, Anna Zhang 4, Byrav Ramamurthy 4. 2020 G2F Collaborator’s Meeting, Phenome Meeting. “Plugin”-based architecture of software to predict corn phenotypes. 1 Department of Biological Systems Engineering, 2 School of Natural Resources, 3 Department of Agronomy and Horticulture, 4 Department of Computer Sciences and Engineering

Post on 03-Jan-2021

TRANSCRIPT

Page 1

Hydroinformatics and Integrated Hydrology Research Group

Francisco Munoz-Arriola1,2, Diego Jarquin3, Hallie Hohbein4, Parisa Sarzaeim4, Joseph Carter4, David Recic4, Zoe Trautman4, Anna Zhang4, Byrav Ramamurthy4

2020 G2F Collaborator’s Meeting, Phenome Meeting

“Plugin”-based architecture of software to predict corn phenotypes

1 Department of Biological Systems Engineering
2 School of Natural Resources
3 Department of Agronomy and Horticulture
4 Department of Computer Sciences and Engineering

Page 2

Acknowledgements

This project was supported by the USDA National Institute of Food and Agriculture (Plant Health and Production and Plant Products: Plant Breeding for Agricultural Production, A1211), Accession No. 1015252.

Some ideas are associated with the USDA National Institute of Food and Agriculture, Agriculture and Food Research Initiative HATCH project NEB-21-166, Accession No. 1009760.

Genomes to Fields initiative

UNL’s Department of Computer Sciences and Engineering Senior Design

Page 3

Motivation

• Consistent increase of water use efficiency, farmers' revenues, and yields

• Drops in water use efficiency, farmers' revenues, and yields after the occurrence of floods and droughts

[Figure: Maize Production in Midwest. Yield (Bu/A) versus Time (Year), 1950 to 2015, for IL, IA, MN, and NE, with the flood in 1993 and the drought in 2012 marked. Ref: USDA NASS]

[Figure panels: Maize Yield Reduction (1992-1993); Maize Yield Reduction (2010-2012)]

Page 4

Outline

• Framework

• G2F

• Software Architecture

• Preprocessing

• Option Selection

• Processing

• Postprocessing

• Software Demo

• Complexities

• Conclusion

• Future Work

Page 5

Framework

Develop a framework to collect, store, manage, and use weather/climate data to predict plant phenotypes using a GxE model.

[Diagram labels: Environment (E): Weather/Climate; Genetics (G): Genotypes (Molecular Markers); Phenotypic Responses (Y): Phenotypes; GxE Model]

Page 6

G2F

Genetics

Environments

Phenotypes (traits)

Incorporation of Environmental Information to Improve Phenotypic Predictability in Maize G2F-GxE Hybrid Project

G2F Experiments Distribution

Page 7

Software Architecture

[Diagram stages: Preprocessing, Selection, Processing, Postprocessing]

[Diagram components: Database, Option Selection, GxE Model, GxE Output]
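The staged layout on this slide lends itself to a plugin-style pipeline. The following is only a minimal sketch of that idea, not the project's code: the function names, the shared `state` dictionary, and every placeholder value are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A plugin is any callable that takes the shared state and returns an updated copy.
Plugin = Callable[[Dict], Dict]


def preprocessing(state: Dict) -> Dict:
    # Placeholder (assumption): the real stage would consolidate the weather database.
    return {**state, "database": "consolidated"}


def option_selection(state: Dict) -> Dict:
    # Placeholder (assumption): the real stage would record the user's choices.
    return {**state, "options": {"variables": ["Temperature"]}}


def processing(state: Dict) -> Dict:
    # Placeholder (assumption): the real stage would run the GxE model.
    return {**state, "model": "GxE"}


def postprocessing(state: Dict) -> Dict:
    # Placeholder (assumption): the real stage would summarize the GxE output.
    return {**state, "output": "GxE output"}


def run_pipeline(plugins: List[Tuple[str, Plugin]], state: Dict) -> Dict:
    """Run each registered plugin in order and record the stage names visited."""
    for name, plugin in plugins:
        state = plugin(state)
        state = {**state, "visited": state.get("visited", []) + [name]}
    return state


# Stage order follows the outline: preprocessing, option selection,
# processing, postprocessing.
PIPELINE = [
    ("Preprocessing", preprocessing),
    ("Option Selection", option_selection),
    ("Processing", processing),
    ("Postprocessing", postprocessing),
]

result = run_pipeline(PIPELINE, {})
print(result["visited"])
```

Because each stage is just an entry in `PIPELINE`, a new module (for example a remote sensing data source) could be added by appending another named callable, which is the main appeal of a plugin design.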

Page 8

Database

Variables: Wind Gust, Temperature, Solar Radiation, Rainfall, Dew Point, Relative Humidity, Wind Speed, Wind Direction

[Figure: data availability by experiment for 2014, 2015, and 2016. Legend: Complete, Incomplete, Unavailable]

State  Experiment IDs
IL     H1, H2
MN     H1
NE     H1, H2, H3, H4
IA     H1, H2, H3, H4
NY     H1, H2, H3, H4
DE     H1
IN     H1
MO     H1, H2
NC     H1
TX     H1, H2
WI     H1, H2, H3
ON     H1, H2
AR     H1, H2
GA     H1, H2
KS     H1, H2, H3
OH     H1
SC     H1
MI     H1

[Pie charts of data availability per year. Legend: Complete, Incomplete, No Data]

2014: 52% (22), 29% (12), 19% (8)
2015: 53% (26), 16% (8), 31% (15)
2016: 50% (15), 33% (10), 17% (5)

Page 9

Pre-processing

The Analytics of Database Improvement

G2F Data + Multi-source Data → Consolidated Database

1. Integration of various data sources
2. Correction of the data
3. Synthesis of the data

Page 10

Data consolidation

[Figure: Temperature (C) versus Time (Day), days 126 to 282 (May to Oct), comparing G2F and Predicted series]

Variables: Temperature, Solar Radiation, Dew Point, Relative Humidity, Wind Speed, Wind Direction (Ref: NSRDB-NREL); Rainfall

Data Sources (Multi-source Data)

• Stations: HPRCC, NSRDB, NWS
• Remote Sensing: MODIS, GPM, TRMM
• Combination: POWER
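With several candidate sources for the same variable, one way to consolidate is to fill gaps in the G2F record from whichever source agrees best with the G2F values that do exist. The sketch below assumes that interpretation (the deck only names a "Min RMSE" filling step); the function names and the sample numbers are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def rmse(observed: Sequence[Optional[float]], candidate: Sequence[Optional[float]]) -> float:
    """Root mean square error over the days where both series have a value."""
    pairs = [(o, c) for o, c in zip(observed, candidate) if o is not None and c is not None]
    if not pairs:
        raise ValueError("no overlapping values to compare")
    return math.sqrt(sum((c - o) ** 2 for o, c in pairs) / len(pairs))


def fill_with_min_rmse(
    observed: Sequence[Optional[float]],
    sources: Dict[str, Sequence[Optional[float]]],
) -> Tuple[str, List[Optional[float]]]:
    """Pick the source with the lowest RMSE and use it to fill the gaps (None)."""
    best = min(sources, key=lambda name: rmse(observed, sources[name]))
    filled = [o if o is not None else s for o, s in zip(observed, sources[best])]
    return best, filled


# Hypothetical daily temperatures (C); None marks a gap in the G2F record.
g2f = [20.0, None, 24.0]
candidates = {
    "NSRDB": [20.5, 22.0, 24.5],
    "NWS": [18.0, 21.0, 27.0],
}
best_source, consolidated = fill_with_min_rmse(g2f, candidates)
print(best_source, consolidated)
```

Observed G2F values are never overwritten in this sketch; only the gaps take values from the selected source.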

Page 11

Data-driven analytics

[Figure: Temperature (C) versus Time (Day), days 126 to 282 (May to Oct), comparing G2F and Predicted series]

Performance Metrics

Performance Metric   Mean    Min     Max    SD
R2                   0.88    0.61    0.96   0.10
Bias                -0.52   -1.15    0.13   0.37
RMSE                 1.67    1.13    3.00   0.55
NSE                  0.87    0.80    0.98   0.05
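The four metrics can be computed from a pair of observed and predicted series as sketched below. The formulas are the common textbook definitions; the deck does not state which variants were used, so treat the sign convention for Bias (predicted minus observed) and the use of the squared Pearson correlation for R2 as assumptions. The sample values are made up.

```python
import math
from typing import Dict, Sequence


def performance_metrics(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return R2, Bias, RMSE, and NSE for two equally long series.

    Raises ZeroDivisionError if either series is constant.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sq_err = sum(e * e for e in errors)
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    return {
        "R2": cov * cov / (ss_obs * ss_pred),  # squared Pearson correlation
        "Bias": sum(errors) / n,               # mean of (predicted - observed)
        "RMSE": math.sqrt(sq_err / n),         # root mean square error
        "NSE": 1.0 - sq_err / ss_obs,          # Nash-Sutcliffe efficiency
    }


# Hypothetical temperatures (C): the prediction is always 1 degree too low.
metrics = performance_metrics([20.0, 22.0, 24.0, 26.0], [19.0, 21.0, 23.0, 25.0])
print(metrics)
```

In this example the prediction tracks the observations perfectly in shape (R2 of 1) but is offset, which Bias, RMSE, and NSE pick up; this is why the slide reports all four rather than R2 alone.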

Page 12

Data Processing: Pre-Processing

[Pipeline sidebar: Files Control, Input Files, APIs, Database, Option Selection, GxE Model, GxE Output]

Flowchart, beginning at Start (login):

• Files Control: Meta Files Control, Phenotypes Files Control, Weather Files Control
• Input Files: Meta Files (lat-lon.csv), Phenotypes Files (YP1P2.csv), Weather Files (“VariableYearStateExperiment”.csv)
• Are all required columns available? If not, correct the headers.
• Is “VariableYearStateExperiment”.csv complete, empty, or missing? Complete files go to the Database; the Empty and Missing branches lead to the gap-filling steps “Fill with Min RMSE” and “Fill with ANN”.
• APIs: NSRDB API, DayMet API, NWS API
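The two checks in the files-control flow (required columns, then file status) could look roughly like the sketch below. The required column names and the exact meaning of "complete", "empty", and "missing" are assumptions here: a file is treated as missing when it was not found, empty when it has no data rows, and complete otherwise.

```python
import csv
import io
from typing import List, Optional, Sequence

# Hypothetical required header names; the real list is not given in the slides.
REQUIRED_COLUMNS = ["Day", "Value"]


def missing_columns(header: Sequence[str], required: Sequence[str] = REQUIRED_COLUMNS) -> List[str]:
    """Return the required columns absent from the header (case-insensitive)."""
    present = {name.strip().lower() for name in header}
    return [column for column in required if column.lower() not in present]


def classify_weather_file(text: Optional[str]) -> str:
    """Classify a weather CSV given its content, or None if the file was not found."""
    if text is None:
        return "missing"
    rows = list(csv.reader(io.StringIO(text)))
    # Skip the header row and ignore rows that contain only blank cells.
    data_rows = [row for row in rows[1:] if any(cell.strip() for cell in row)]
    return "complete" if data_rows else "empty"


print(missing_columns(["Day"]))
print(classify_weather_file("Day,Value\n126,18.5\n"))
```

Passing the file content rather than a path keeps the sketch independent of where the files live (local disk or a web service bucket).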

Page 13

Data Processing: Pre-Processing

[Pipeline sidebar: Files Control, Input Files, APIs, Database, Option Selection, GxE Model, GxE Output]

[Figure legend: Empty, Complete, Missing]

Page 14

Pre-processing

• Separating the data for each experiment
• Correcting the experiment names and checking the sequence of days
• Providing charts for experiment analysis
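Checking the sequence of days amounts to verifying that an experiment's records cover consecutive days with no gaps or reordering. A minimal sketch, assuming days are stored as integer day numbers (as in the plots, which run from day 126 to day 282); the function names are hypothetical.

```python
from typing import List, Sequence


def is_consecutive(days: Sequence[int]) -> bool:
    """True when every day is exactly one more than the previous day."""
    return all(later - earlier == 1 for earlier, later in zip(days, days[1:]))


def missing_days(days: Sequence[int]) -> List[int]:
    """Return the day numbers absent between the first and last recorded day."""
    if not days:
        return []
    present = set(days)
    return [day for day in range(min(days), max(days) + 1) if day not in present]


# Hypothetical record with day 128 absent.
recorded = [126, 127, 129, 130]
print(is_consecutive(recorded), missing_days(recorded))
```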

Page 15

Pre-processing

• Separating data for each variable
• Providing PDFs for each variable
• Providing charts to analyze data availability for each variable

Page 16

Data Processing: Pre-Processing

[Pipeline sidebar: Files Control, Input Files, APIs, Database, Option Selection, GxE Model, GxE Output]

[Figure panels: Temperature (C), Dew Point (C). Legend: Complete, Missing, Empty. Labels: Database, ANN Filling]

Page 17

Performance Metrics

Page 18

Option Selection: Select Variables

[Pipeline sidebar: Preprocessing, Selection (Select Variables, Select Experiment, Select Time Span), GxE Model, GxE Output]

Flowchart steps: Database, Select Variable, Select Experiment, Select Time Span, Generate Wall.csv, GxE Model, Generate O.csv, Calculate the Correlation

Files: YP1P2.csv, Markers.csv
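The selection steps in this flow (variable, experiment, time span, then writing the weather file for the model) can be sketched as a filter over database records. The record layout and column names below are assumptions for illustration; only the file name Wall.csv and the experiment IDs come from the slides, and the sketch returns the CSV text instead of writing a file.

```python
import csv
import io
from typing import Dict, Iterable, List, Set

FIELDS = ["experiment", "variable", "day", "value"]  # hypothetical column layout


def select_records(
    records: Iterable[Dict],
    variables: Set[str],
    experiments: Set[str],
    start_day: int,
    end_day: int,
) -> List[Dict]:
    """Keep records matching the chosen variables, experiments, and day range."""
    return [
        record
        for record in records
        if record["variable"] in variables
        and record["experiment"] in experiments
        and start_day <= record["day"] <= end_day
    ]


def to_csv_text(rows: List[Dict]) -> str:
    """Serialize the selected rows as CSV text (the content for a file such as Wall.csv)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical database records.
database = [
    {"experiment": "2014IAH3", "variable": "Temperature", "day": 150, "value": 21.3},
    {"experiment": "2014IAH3", "variable": "Rainfall", "day": 150, "value": 4.0},
    {"experiment": "2015NEH3", "variable": "Temperature", "day": 300, "value": 9.8},
]
selected = select_records(database, {"Temperature"}, {"2014IAH3", "2015NEH3"}, 126, 282)
wall_csv = to_csv_text(selected)
print(wall_csv)
```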

Page 19

Option Selection: Select Variables

[Pipeline sidebar: Preprocessing, Selection (Select Variables, Select Experiment, Select Time Span), GxE Model, GxE Output]

Select Variable(s)

• Temperature (C)
• Dew Point (C)
• Relative Humidity (%)
• Solar Radiation (W/m2)
• Rainfall (mm)
• Wind Speed (m/s)
• Wind Direction (degrees)
• Pressure (mb)
• Precipitable water (mm)

Select Experiment(s)

• 2014IAH3
• 2015NEH3
• 2017MOH1
• ...

Select Time Span

Start day =
End day =

GxE Execution

                          Tested Genotypes: YES   Tested Genotypes: NO
Tested Environments: YES  CV2                     CV1
Tested Environments: NO   CV0                     CV00

• CV00: Predicting performance of unobserved lines in unobserved environments;
• CV0: Predicting performance in unobserved environments;
• CV1: Predicting performance of newly developed lines through relationships with others;
• CV2: Predicting performance of lines captured in other environments.
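The four cross-validation scenarios differ only in whether the genotype and the environment being predicted were already seen during training. A minimal sketch of that classification follows; the function names and the genotype labels `G1` to `G3` are hypothetical, and the mapping follows the scenario descriptions on this slide.

```python
from typing import Iterable, Tuple


def cv_scheme(genotype_tested: bool, environment_tested: bool) -> str:
    """Map the tested/untested status of genotype and environment to a CV label."""
    if environment_tested:
        return "CV2" if genotype_tested else "CV1"
    return "CV0" if genotype_tested else "CV00"


def classify_target(genotype: str, environment: str, training: Iterable[Tuple[str, str]]) -> str:
    """Classify a (genotype, environment) prediction target against the training pairs."""
    pairs = list(training)
    tested_genotypes = {g for g, _ in pairs}
    tested_environments = {e for _, e in pairs}
    return cv_scheme(genotype in tested_genotypes, environment in tested_environments)


# Hypothetical training set: genotype/environment pairs already observed.
training_pairs = [("G1", "2014IAH3"), ("G2", "2014IAH3"), ("G1", "2015NEH3")]

print(classify_target("G2", "2015NEH3", training_pairs))  # both seen, in other pairs
print(classify_target("G3", "2014IAH3", training_pairs))  # new line, seen environment
print(classify_target("G1", "2017MOH1", training_pairs))  # seen line, new environment
print(classify_target("G3", "2017MOH1", training_pairs))  # new line, new environment
```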

Page 20

Selection

Page 21

Post-Processing: GxE Predictability

[Pipeline sidebar: Preprocessing, Select Variables, Select Experiment, Select Time Span, GxE Model, GxE Output]

Page 22

Post-Processing: GxE Predictability

[Pipeline sidebar: Preprocessing, Select Variables, Select Experiment, Select Time Span, GxE Model, GxE Output]

Page 23

Software Demo

Here we can put the software video

Page 24

Complexities

• Providing AWS (Amazon Web Services) as the platform for the phenotypic predictability application;
• Authentication for different users;
• Transferring all the data (G2F, NSRDB, DayMet, and NWS) and scripts (R and Python) to the platform;
• Coupling R and Python scripts to develop integrated software for phenotype prediction in the G2F experiment.
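One common way to couple Python and R scripts is to have the Python side launch the R script through the `Rscript` command-line tool. The sketch below assumes that approach; the script name `gxe_model.R` and the argument order are hypothetical, and the example only builds the command so that it can be inspected without R installed.

```python
import subprocess
from typing import List, Sequence


def build_rscript_command(script_path: str, args: Sequence[str]) -> List[str]:
    """Build the command line that runs an R script with positional arguments."""
    return ["Rscript", script_path, *args]


def run_r_script(script_path: str, args: Sequence[str]) -> str:
    """Run the R script and return its standard output.

    Requires Rscript on the PATH; raises CalledProcessError if the script fails.
    """
    completed = subprocess.run(
        build_rscript_command(script_path, args),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Hypothetical call: weather, phenotype, and marker inputs, then the output file.
command = build_rscript_command(
    "gxe_model.R", ["Wall.csv", "YP1P2.csv", "Markers.csv", "O.csv"]
)
print(command)
```

Exchanging data through CSV files, as the file names in the slides suggest, keeps the two languages loosely coupled: neither side needs bindings to the other.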

Page 25

Conclusions

▪ It remains unclear whether integrating other data sources into the G2F database improved the predictability of phenotypes;

▪ Transferring and coupling the hydroclimate data analytics and GxE modeling scripts to the web service platform is feasible;

▪ Increasing the number of experiments may improve the accuracy of phenotype prediction.

Page 26

Future work

▪ Add a module for spatial and temporal climatic analytics of GxE predictability;

▪ Add a module for global sensitivity analysis of GxE accuracy to estimate sources and propagation of uncertainty in response to various climatic (environmental) factors;

▪ Add the remote sensing data plugin module to increase the number of climatic variables and phenotypes in the database.

Page 27

Some more future work

Page 28

Team members and tasks:

• Francisco Munoz-Arriola: Team leader
• Diego Jarquin: GxE model developer; Develops R scripts for phenotype predictions using GxE
• Hallie Hohbein: Project Manager; Takes care of project management tasks, documentation, and testing
• Parisa Sarzaeim: Hydroclimate data scientist; Develops Python scripts to manage the hydroclimate database
• Joseph Carter: Frontend/Backend Developer; Works on user authentication, frontend development, and testing
• David Recic: Backend Developer; Creates the database and works on user authentication
• Zoe Trautman: Frontend Developer; Develops the frontend and writes documentation
• Anna Zhang: Development Manager; In charge of AWS and helps with backend development
• Byrav Ramamurthy and Francisco Munoz-Arriola: Computer science advisers

Page 29

Thank You

This project was supported by the Agriculture and Food Research Initiative, grant numbers NEB-21-176 and NEB-21-166, from the USDA National Institute of Food and Agriculture (Plant Health and Production and Plant Products: Plant Breeding for Agricultural Production, A1211).

Accession Nos. 1015252 and 1009760