TRANSCRIPT
Allen Huang, Ph.D.
CTO, Tempo Quest Inc.
GTC 2016
San Jose, CA
5 April, 2016
GPU Acceleration of Weather Forecasting and
Meteorological Satellite Data Assimilation,
Processing and Applications
http://www.tempoquest.com
• Why Weather Forecast is not accurate enough
  – Model is not perfect yet: evolving scientific understanding & algorithm development
  – Data is not always accurate: actual and accurate initial data are expensive to collect & process
  – High performance computer is expensive: can only afford limited resources to deploy & operate HPC
• Acceleration of Weather Forecasting S/W
  – Same forecasts faster, much faster
  – Better forecasts take much more computation
    • Location, timing, intensity, next hour, tomorrow, next week, …
    • Most of the legacy S/W can’t take advantage of the new H/W
• Acceleration of Satellite Data Processing
  – Hyperspectral Data Retrieval
  – Hyperspectral Data Compression
• Summary
Why are the Weather Forecast Models
not accurate enough?
Three critical factors:
1. Imperfect MODEL
2. Lack of/Erroneous INITIAL DATA/CONDITIONS
   – No data or sparse, infrequent coverage
   – Unknown attributes; not coupled
3. Lack of COMPUTING POWER
Three critical factors:
1. Imperfect MODEL
2. Lack of/Erroneous INITIAL DATA/CONDITIONS
3. Lack of COMPUTING POWER
   – Increasing need for ensemble runs
   – Increasing demand for higher resolution
   – Increasing frequency of assimilation
   – Increasing model complexity
   Resulting in high demand for computing resources
100,000 to 200,000 CPU cores required for:
• Global cloud-resolving: NIM @ 2 km resolution, 2x/day
• Regional models: North American (NA) domain, HRRR @ <1 km, hourly
• Ensembles: HRRR @ 3 km NA, 100 members, hourly
Reference: 250,000 CPUs cost ~$100M; use 7,000 kW & ~$8M/year energy bill
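As a rough check of the reference figures above, the following sketch derives the electricity price and per-CPU hardware cost they imply. It assumes the 7,000 kW load runs continuously for a 365-day year; that assumption and all function and variable names are illustrative and not from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: continuous operation, non-leap year


def implied_energy_price(power_kw, annual_bill_usd):
    """Electricity price in USD per kWh implied by a constant load."""
    return annual_bill_usd / (power_kw * HOURS_PER_YEAR)


def cost_per_cpu(total_cost_usd, cpu_count):
    """Average hardware cost per CPU in USD."""
    return total_cost_usd / cpu_count


# Figures quoted on the slide: 7,000 kW, ~$8M/year, 250,000 CPUs, ~$100M.
price = implied_energy_price(7_000, 8_000_000)
unit_cost = cost_per_cpu(100_000_000, 250_000)
print(f"{price:.3f} USD/kWh, {unit_cost:.0f} USD per CPU")
```

Under these assumptions the quoted bill corresponds to roughly 0.13 USD per kWh, which suggests the two reference numbers are mutually consistent.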
[Figure: forecast tracks, Operational (T574, ~27 km) vs. Experiment (T1500, ~13 km)]
Note: Last 24 h of the high-resolution experiment track is based on 6 h model output.
2x resolution ≈ 10x computing cost
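The slide does not derive the 10x figure. One common back-of-envelope argument, sketched here as an assumption rather than as the speaker's reasoning, is that refining the horizontal grid by a factor r multiplies the number of grid columns by r^2 and, through a CFL-type stability limit, the number of time steps by r. The `extra` parameter is a hypothetical multiplier for anything else that grows, such as added vertical levels.

```python
def cost_factor(refinement, extra=1.0):
    """Relative compute cost after refining the horizontal grid.

    refinement ** 2 : more grid columns in the two horizontal directions
    refinement      : more time steps (assumed CFL-type time-step limit)
    extra           : hypothetical multiplier for vertical levels, I/O, etc.
    """
    return refinement ** 3 * extra


print(cost_factor(2))        # 8.0
print(cost_factor(2, 1.25))  # 10.0
```

With these assumptions, doubling the resolution costs about 8x from the grid and time step alone, and a modest additional factor brings it to the ~10x quoted on the slide.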
1 Zflops = 10^21 flops
1 exaflops = 10^18 flops: 1 million trillion (1 billion billion) flops per second
• Acceleration of Satellite Data Processing
  – Hyperspectral Data Retrieval
  – Hyperspectral Data Compression
Processing times – CPU vs. GPU
Early Result (2009)
Experiments were run on an Intel i7 970 CPU at 3.20 GHz and on one of the two GPUs of an NVIDIA GTX 590.

Implementation                        Time [ms]
The original Fortran code on CPU      16928
CUDA C with I/O on GPU                83.6
CUDA C without I/O on GPU             48.3
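The speedups implied by these timings can be worked out directly. The sketch below uses only the three numbers from the table; the dictionary keys and the `speedup` helper are illustrative names.

```python
# Timings from the table, in milliseconds.
times_ms = {
    "fortran_cpu": 16928.0,
    "cuda_with_io": 83.6,
    "cuda_without_io": 48.3,
}


def speedup(baseline_ms, accelerated_ms):
    """Ratio of baseline run time to accelerated run time."""
    return baseline_ms / accelerated_ms


with_io = speedup(times_ms["fortran_cpu"], times_ms["cuda_with_io"])
without_io = speedup(times_ms["fortran_cpu"], times_ms["cuda_without_io"])
# Time attributable to I/O on the GPU path.
io_overhead_ms = times_ms["cuda_with_io"] - times_ms["cuda_without_io"]

print(f"{with_io:.0f}x with I/O, {without_io:.0f}x without I/O, "
      f"{io_overhead_ms:.1f} ms of I/O")
```

This gives about 202x with I/O and 350x without, and shows that I/O accounts for roughly 35 ms of the 83.6 ms GPU run.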
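The Fast Radiative Transfer Model with the regression-based transmittances
Without loss of generality for our GPU implementation, we consider the following radiative transfer model:

$$R_\nu = B_\nu(T_s)\,\tau_\nu(p_s) + \int_{p_s}^{0} B_\nu(T(p))\,\frac{d\tau_\nu(p)}{dp}\,dp$$

where B_ν is the Planck function, τ_ν(p) is the transmittance from pressure level p to the top of the atmosphere, and T_s and p_s are the surface temperature and pressure.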
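To make the structure of the radiative transfer model above concrete (a surface term plus an integral of Planck emission weighted by the change in transmittance), here is a minimal single-channel sketch. It is not the speaker's GPU code: the layer discretization, the mean-layer-temperature approximation, the function names, and the sample profile are all assumptions made for illustration, and the transmittances are taken as given inputs rather than computed by regression.

```python
import math

# Radiation constants for wavenumber in cm^-1.
C1 = 1.191042e-5  # mW / (m^2 sr cm^-4)
C2 = 1.438777     # K cm


def planck(wavenumber, temperature):
    """Planck radiance B_nu(T) in mW / (m^2 sr cm^-1)."""
    return C1 * wavenumber ** 3 / math.expm1(C2 * wavenumber / temperature)


def radiance(wavenumber, surface_temp, level_temps, level_trans):
    """Discretized top-of-atmosphere radiance for one channel.

    level_temps and level_trans are ordered from the top of the
    atmosphere (transmittance 1.0) down to the surface.
    """
    # Surface term: B(T_s) * tau(p_s).
    total = planck(wavenumber, surface_temp) * level_trans[-1]
    # Atmospheric term: sum over layers of B(T_layer) * delta(tau).
    for i in range(len(level_trans) - 1):
        layer_temp = 0.5 * (level_temps[i] + level_temps[i + 1])
        total += planck(wavenumber, layer_temp) * (level_trans[i] - level_trans[i + 1])
    return total


# Hypothetical 4-level profile, for illustration only.
temps = [220.0, 240.0, 260.0, 280.0]
trans = [1.0, 0.9, 0.7, 0.5]
print(radiance(900.0, 280.0, temps, trans))
```

Each channel and each profile is independent in this formulation, which is what makes the model a natural fit for one GPU thread per channel or per spectrum.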
GPU-based Multi-input RTM
A forward model that computes 40 radiance spectra concurrently was further developed to take advantage of the GPU’s massive parallelism.
To compute one day's worth of IASI spectra (1,296,000 spectra):
– the original RTM (with –O2 optimization) takes ~10 days on a 3.0 GHz CPU core;
– the single-input GPU-RTM takes ~10 minutes (1455x speedup);
– the multi-input GPU-RTM takes ~5 minutes (3024x speedup).
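The quoted run times follow from the speedups. The sketch below takes the "~10 days" CPU baseline at face value (an approximation from the slide) and derives the GPU wall-clock times and the CPU cost per spectrum; the helper name is illustrative.

```python
SECONDS_PER_DAY = 86_400
spectra_per_day = 1_296_000
cpu_seconds = 10 * SECONDS_PER_DAY  # "~10 days" on one CPU core, per the slide


def accelerated_minutes(baseline_seconds, speedup):
    """Wall-clock minutes after applying a speedup to a baseline run time."""
    return baseline_seconds / speedup / 60


single = accelerated_minutes(cpu_seconds, 1455)  # single-input GPU-RTM
multi = accelerated_minutes(cpu_seconds, 3024)   # multi-input GPU-RTM
cpu_seconds_per_spectrum = cpu_seconds / spectra_per_day

print(f"single-input: {single:.1f} min, multi-input: {multi:.1f} min")
print(f"CPU cost per spectrum: {cpu_seconds_per_spectrum:.2f} s")
```

The results, about 9.9 and 4.8 minutes, match the rounded figures on the slide, and batching 40 spectra roughly halves the single-input GPU time.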
GPU Acceleration of Satellite Hyperspectral Maximum Likelihood Retrieval
GPU Acceleration of Predictive Partitioned Vector Quantization for Ultraspectral Sounder Data Compression
• Acceleration of Weather Forecasting S/W
  – Same forecasts faster, much faster
  – Acceleration of Weather Research and Forecasting (WRF) Model
    • Radiation; PBL, Surface
    • Cumulus Parameterization, Cloud Microphysics and Dynamic Core
CONtinental United States (CONUS) benchmark data set for the 12 km resolution domain, October 24, 2001
• The CONUS 12 km domain has 433 x 308 horizontal grid points with 35 vertical levels.
• The test problem is a 12 km resolution, 48-hour forecast over the continental U.S., capturing the development of a strong baroclinic cyclone and a frontal boundary that extends from north to south across the entire U.S.
GPU speedup: speedup with I/O / speedup without I/O
MIC improvement factor in [ ]: w.r.t. the 1st version of the multi-threaded code, before any improvement

Module                                  GPU speedup    MIC       Reference
Radiation
  RRTMG LW                              123x / 127x              (GPU) JSTARS, 7, 3660-3667, 2014
  RRTMG SW                              202x / 207x              (GPU) JSTARS, PP, 1-11, 2015
  Goddard SW                            92x / 134x               (GPU) JSTARS, 5, 555-562, 2012
  Dudhia SW                             19x / 409x
Surface
  MYNN SL                               6x / 113x
  TEMF SL                               5x / 214x
  Thermal Diffusion LS                  10x / 311x     [2.1x]    (GPU) JSTARS, 8, 2249-2259, 2015
PBL
  YSU PBL                               34x / 193x     [2.4x]    (GPU) GMD, 8, 2977-2990, 2015
  TEMF PBL                                             [14.8x]   (MIC) SPIE: doi:10.1117/12.2055040
CUP
  Betts-Miller-Janjic (BMJ) convection  55x / 105x
Cloud Microphysics
  Kessler MP                            70x / 816x               J. Comp. & GeoSci., 52, 292-299, 2012
  Purdue-Lin MP                         156x / 692x    [4.2x]    (GPU) SPIE: doi:10.1117/12.901825
  WSM 3-class MP                        150x / 331x
  WSM 5-class MP                        202x / 350x              (GPU) JSTARS, 5, 1256-1265, 2012
  Eta MP                                37x / 272x               SPIE: doi:10.1117/12.976908
  WSM 6-class MP                        165x / 216x              (GPU) J. Comp. & GeoSci., 83, 17-26, 2015
  Goddard GCE MP                        348x / 361x    [4.7x]    (GPU) JSTARS, 8, 2260-2272, 2015
  Thompson MP                           76x / 153x     [2.3x]    (MIC) SPIE: doi:10.1117/12.2055038
  SBU 5-class MP                        213x / 896x              JSTARS, 5, 625-633, 2012
  WDM 5-class MP                        147x / 206x
  WDM 6-class MP                        150x / 206x              J. Atmo. Ocean. Tech., 30, 2896, 2013
Tempo Quest Inc. (TQI) S/W Product Pipeline
Weather/Environment Domain
• AceCAST Lite: 6 months out
  – Pre AceCAST (CPU/GPU “Hybrid” WRF)
• AceCAST: 12 months out (subject to funding)
  – CUDA GPU WRF
• Beyond AceCAST: 2-3 years out (subject to funding)
  – DataCAST (CUDA WRF Data Assimilation)
  – ChemCAST (CUDA WRF Chem)
  – HurCAST (CUDA Hurricane WRF)
  – HydroCAST (CUDA WRF Hydro)
  – FireCAST (CUDA WRF Fire)
Thank you for your attention
Questions are welcome