Composite Training Sets: Enhancing the Learning Power of Artificial Neural Networks for Water Level Forecasts

Z. Bowles, P. Tissot, P. Michaud, A. Sadovski, S. Duff, G. Jeffress

Texas A&M University – Corpus Christi

Division of Nearshore Research

DNR

http://lighthouse.tamucc.edu

Texas Coastal Ocean Observation Network (TCOON)

Started 1988

Over 50 stations

Source of study data

Primary sponsors: General Land Office, Texas Water Development Board, US Army Corps of Engineers, National Ocean Service

Morgan’s Point

Typical TCOON station

• Wind Anemometer
• Radio Antenna
• Satellite Transmitter
• Solar Panels
• Data Collector
• Water Level Sensor
• Water Quality Sensor
• Current Meter

Tides and water levels

Tide: The periodic rise and fall of a body of water resulting from gravitational interactions between Sun, Moon, and Earth.

Tide and Current Glossary, National Ocean Service, 2000

Water Levels: Astronomical + Meteorological forcing + Other effects

Harmonic analysis

Standard method for tide predictions

Represented by constituent cosine waves with known frequencies based on gravitational (periodic) forces

Elevation of water is modeled as

$$h(t) = H_0 + \sum_c H_c \, f_{y,c} \cos(a_c t + e_{y,c} - k_c)$$

• h(t) = elevation of water at time t
• H_0 = datum offset
• H_c = amplitude of constituent c
• a_c = frequency (speed) of constituent c
• f_{y,c}, e_{y,c} = node factors / equilibrium arguments
• k_c = phase offset for constituent c
• Maximum number of constituents = 37
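The harmonic model can be sketched in a few lines of Python. This is only an illustration: the four constituent speeds are the standard values for M2, S2, K1 and O1, but the amplitudes and phases are invented, the function and variable names are hypothetical, and node factors and equilibrium arguments default to 1 and 0 unless supplied.

```python
import math

# Illustrative constituents only. Speeds are the standard values in
# degrees per hour; amplitudes and phases are made up for this sketch.
CONSTITUENTS = [
    # name, speed a_c (deg/hour), amplitude H_c (m), phase k_c (deg)
    ("M2", 28.9841042, 0.10, 110.0),
    ("S2", 30.0000000, 0.03, 95.0),
    ("K1", 15.0410686, 0.15, 20.0),
    ("O1", 13.9430356, 0.14, 10.0),
]


def harmonic_level(t_hours, datum=0.0, constituents=CONSTITUENTS,
                   node_factors=None, equilibrium_args=None):
    """Evaluate h(t) = H0 + sum_c H_c * f_c * cos(a_c * t + e_c - k_c)."""
    node_factors = node_factors or {}          # f_c, assumed 1.0 if missing
    equilibrium_args = equilibrium_args or {}  # e_c in degrees, assumed 0.0
    h = datum
    for name, speed, amplitude, phase in constituents:
        f = node_factors.get(name, 1.0)
        e = equilibrium_args.get(name, 0.0)
        angle = math.radians(speed * t_hours + e - phase)
        h += amplitude * f * math.cos(angle)
    return h


# Predicted level for each hour of one day with the illustrative values.
day = [harmonic_level(t) for t in range(24)]
```

A real prediction would use up to 37 constituents fitted to the station record, as stated above.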

What we are trying to do...

We know what happened in the past...

…what will happen next?

Harmonic vs. actual (when it works)

[Figure: harmonic prediction vs. measured water level at a coastal station, summertime]

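Harmonic vs. actual (when it fails)

[Figure: harmonic prediction vs. measured water level in a shallow bay; annotated periods: frontal passages, tropical storm season, summer]

[Figure: harmonic prediction vs. measured water level in a deep bay; annotated periods: frontal passages, tropical storm season, summer]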

Standard Suite Used by U.S. National Ocean Service (NOS)

• Central Frequency (15 cm) >= 90%
• Positive Outlier Frequency (30 cm) <= 1%
• Negative Outlier Frequency (30 cm) <= 1%
• Maximum Duration of Positive Outliers (30 cm): user based
• Maximum Duration of Negative Outliers (30 cm): user based
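As a rough sketch of how these criteria, plus the RMSE reported on the following slides, could be computed from paired hourly series: the function name `nos_skill`, the sign convention (error = predicted minus observed), and the inclusive 15 cm boundary are assumptions of this example, not taken from the slides.

```python
import math


def nos_skill(predicted, observed, x=0.15, outlier=0.30, step_hours=1.0):
    """Return RMSE, CF, POF, NOF (percent) and MDPO, MDNO (hours).

    Levels are in meters. Errors are predicted minus observed (assumed).
    """
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    if n == 0:
        raise ValueError("need at least one predicted/observed pair")

    def longest_run(flags):
        # Longest streak of consecutive True values, converted to hours.
        best = run = 0
        for flag in flags:
            run = run + 1 if flag else 0
            best = max(best, run)
        return best * step_hours

    positive = [e > outlier for e in errors]
    negative = [e < -outlier for e in errors]
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "cf": 100.0 * sum(abs(e) <= x for e in errors) / n,
        "pof": 100.0 * sum(positive) / n,
        "nof": 100.0 * sum(negative) / n,
        "mdpo": longest_run(positive),
        "mdno": longest_run(negative),
    }
```

With these definitions, a model meets the first three criteria when `cf >= 90`, `pof <= 1` and `nof <= 1`.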

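Tide performance along the Texas coast (1997-2001)

[Figure: map with RMSE and CF of the tidal predictions at six stations]

• RMSE = 0.12, CF = 82.71
• RMSE = 0.16, CF = 70.09
• RMSE = 0.10, CF = 89.1
• RMSE = 0.12, CF = 81.7
• RMSE = 0.16, CF = 71.65
• RMSE = 0.15, CF = 74.37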

Importance of the problem

• Gulf Coast ports account for 52.3% of total US tonnage (1995)
• 1240 ship groundings from 1986 to 1991 in Galveston Bay
• Large number of barge groundings along the Texas Intracoastal Waterways
• Worldwide increases in vessel draft
• Galveston is the 2nd largest port in US

Artificial Neural Network (ANN) modeling

Started in the 1960s

Key innovation in the late 1980s: backpropagation learning algorithms

Number of applications grew rapidly in the 1990s, especially in finance

Growing number of publications present environmental applications

ANN schematic

Philippe Tissot - 2000

[Figure: ANN schematic. Input layer: water level history, wind squared history, tidal forecasts. Hidden layer: each neuron j = 1, 2, 3 forms the weighted sum X_j = Σ_i (a_{j,i} x_i) and applies its transfer function to (X_j + b_j). Output layer: water level forecast H(t+i).]

Why ANNs?

Modeled after human brain

Neurons compute outputs (forecasts) based on inputs, weights and biases

Able to model non-linear systems
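A minimal sketch of how neurons turn inputs, weights and biases into a forecast, assuming one hidden layer with a hyperbolic-tangent transfer function and a single linear output neuron. The function name `forward` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a one-hidden-layer network.

    hidden_weights holds one list of weights per hidden neuron.
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Weighted sum of the inputs, then the nonlinear transfer function.
        x = sum(a * xi for a, xi in zip(weights, inputs))
        hidden.append(math.tanh(x + bias))
    # Linear output neuron: weighted sum of hidden outputs plus a bias.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Two inputs, three hidden neurons, one output (all values invented).
forecast = forward(
    inputs=[0.2, -0.1],
    hidden_weights=[[0.5, 0.1], [-0.3, 0.8], [0.9, -0.4]],
    hidden_biases=[0.0, 0.1, -0.2],
    output_weights=[0.7, -0.2, 0.4],
    output_bias=0.05,
)
```

The nonlinearity in the hidden layer is what lets the network represent non-linear relationships; training adjusts the weights and biases.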

Hypothesis…

If the human brain learns best when faced with many situations and challenges, so should an Artificial Neural Network

Therefore, create many challenging training sets so the network learns from a wide range of patterns and situations

Composite Training Sets

Past models were trained on averaged yearly data sets

The models in this study were trained on 30-day sets covering specific weather events and patterns

The goal was to see how specialized sets affect the learning and performance of the ANN

Artificial Neural Network setup

• ANN models developed within the Matlab and Matlab NN Toolbox environment
• Found simple ANNs are optimum
• Use of ‘tansig’ and ‘purelin’ functions
• Use of Levenberg-Marquardt training algorithm
• ANN trained over fourteen 30-day sets of hourly data

Transform Functions

[Figure: two plots over x from -3 to 3. Tansig: y = (e^x – e^-x)/(e^x + e^-x), with y between -1 and 1. Purelin: y = x, with y between -3 and 3.]
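The two transfer functions are easy to reproduce outside Matlab. The sketch below writes them as plain Python functions that reuse the Matlab names for readability; it does not call the toolbox itself.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid: (e^x - e^-x) / (e^x + e^-x)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def purelin(x):
    """Linear transfer function: the output equals the input."""
    return x


# Tabulate both functions over the range shown on the slide.
for x in (-3, -2, -1, 0, 1, 2, 3):
    print(x, round(tansig(x), 4), purelin(x))
```

`tansig` is mathematically identical to `math.tanh`; it squashes any input into the interval (-1, 1), while `purelin` leaves the value unbounded, which suits an output that must span the full range of water levels.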

Research Location

Primary Station

Secondary Stations

Optimization (training) process

• Used all data sets in training to find best combination of previous water levels and wind data
• Ranked data set individual performance
• Successively added data sets from most successful to worst to investigate performance
• Changed forecast hours to assess trend
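The ranking and successive-addition steps can be sketched as follows. The callable `evaluate` is a hypothetical stand-in for "train an ANN on these sets and return its score (for example CF in percent)". The toy scores are invented only to exercise the function and do not reproduce the study's results.

```python
def rank_and_accumulate(set_names, evaluate):
    """Rank sets by individual score, then score growing best-first subsets.

    evaluate(list_of_names) must return a number where higher is better.
    """
    # Step 1: rank each data set by how well it performs on its own.
    ranked = sorted(set_names, key=lambda name: evaluate([name]), reverse=True)
    # Step 2: add sets one at a time, from most to least successful.
    history = [(k, evaluate(ranked[:k])) for k in range(1, len(ranked) + 1)]
    return ranked, history


# Toy stand-in for training and scoring; the numbers are invented.
TOY_SCORES = {"set_a": 80.0, "set_b": 90.0, "set_c": 70.0}


def toy_evaluate(names):
    return sum(TOY_SCORES[n] for n in names) / len(names)


ranked, history = rank_and_accumulate(list(TOY_SCORES), toy_evaluate)
```

In practice each call to `evaluate` would involve a full training run, so the number of evaluations (one per set, then one per subset size) is the main cost.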

ANN Model

Primary Station: Morgan’s Point
• 48 Hours of previous WL
• 36 Hours of previous winds

Secondary Station: Point Bolivar
• 24 Hours of previous WL
• 24 Hours of previous winds
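One way to assemble the input vector for a given hour from these four histories is sketched below. The function name, the argument order, and the choice to include the current hour in each window are assumptions. Any squaring of the wind values or addition of tidal forecasts would have to happen before or after this step.

```python
def build_inputs(t, primary_wl, primary_wind, secondary_wl, secondary_wind,
                 lags=(48, 36, 24, 24)):
    """Concatenate the most recent hourly values of each series up to index t.

    lags gives the number of hours taken from each series, in the same
    order as the series arguments.
    """
    series = (primary_wl, primary_wind, secondary_wl, secondary_wind)
    if t + 1 < max(lags):
        raise ValueError("not enough history before index t")
    inputs = []
    for values, n in zip(series, lags):
        # Window of n hourly values ending at, and including, hour t.
        inputs.extend(values[t - n + 1:t + 1])
    return inputs


# Dummy hourly series, just to show the resulting vector length.
hours = list(range(100))
vector = build_inputs(60, hours, hours, hours, hours)
print(len(vector))
```

With the lags listed above the network receives 48 + 36 + 24 + 24 = 132 inputs per forecast.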

Example data set

(Julian Days) 2003265 - 2003295
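The codes above combine a four-digit year with a day-of-year number. A short sketch of converting them to calendar dates with the Python standard library (the helper name `parse_year_day` is hypothetical):

```python
from datetime import datetime


def parse_year_day(code):
    """Convert a 'YYYYDDD' string (year + day of year) to a date."""
    return datetime.strptime(code, "%Y%j").date()


start = parse_year_day("2003265")
end = parse_year_day("2003295")
print(start, end, (end - start).days)
```

The two codes are 30 days apart, which matches the 30-day length of the training sets.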

Training with one set (X = 15 cm), Morgan’s Point

Data set ranking

Effects of increasing data sets (Morgan’s Point)

[Figure: reference line marks the NOS Standard]

Performance applied to 1998

[Figure: water level (m) vs. hours (1998)]

Close up…

[Figure: WL (m) vs. hours (1998)]

Model Comparison

Forecast trend, Morgan’s Point

[Figure: reference line marks the NOS Standard]

Conclusions

Performance differs widely depending on the training set used

Increasing the number of data sets increases performance

Future Direction

Analyze environmental factors of successful training sets

Research significance of subtle differences in ANN model training

Web-based predictions

The End!

Acknowledgements: General Land Office, Texas Water Development Board, US Army Corps of Engineers, National Ocean Service, NASA Grant # NCC5-517

Division of Nearshore Research (DNR) http://lighthouse.tamucc.edu
