

Page 1: Development of Numerical Coupled Analysis Method by Air ...

Motivation and Objective

Our research concerns snow accretion on trains. When a train travels over snow-covered tracks, snow accretes on the train bogies. When the accreted snow drops off the bogies, it can damage ground facilities along the track, train equipment, and so on.

To establish countermeasures against this damage, we have developed a snow accretion simulator. By performing snow accretion analysis for various modified train shapes, we aim to find a train shape that reduces the amount of accreted snow.

We have validated the results of our snow accretion simulator by comparing them with experiments in a snowfall wind tunnel. In addition, because the simulator is implemented with distributed-memory parallel programming, it can handle very large calculation models such as a whole train bogie.

Air flow analysis by “Airflow simulator”

Trajectory calculation of flying snow by “Equation of motion for gravity and drag”

Snow accretion analysis by “Particle simulator”

Distribution of velocity of air flow

Snow trajectory

Snow accretion algorithm

If the space-filling ratio > β (≈ 0.65) and the collision angle < γ (≈ 60°), then the target flying snow particle becomes a snow accretion particle.
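The accretion rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the poster does not define how the space-filling ratio is computed, so it is taken here as a given number, and the collision angle is assumed to be measured between the reversed particle velocity and an outward-pointing surface normal (0° for a head-on impact, 90° for a grazing one). The function names are hypothetical, and the vectors are assumed to be nonzero.

```python
import math

BETA = 0.65       # space-filling ratio threshold (approximate value from the poster)
GAMMA_DEG = 60.0  # collision angle threshold in degrees (approximate value from the poster)


def collision_angle_deg(velocity, normal):
    """Angle in degrees between the reversed velocity and the surface normal.

    Assumption: the normal points out of the accretion surface, so a head-on
    impact gives 0 degrees and a grazing impact gives 90 degrees.
    Both vectors are assumed to be nonzero.
    """
    dot = -sum(v * n for v, n in zip(velocity, normal))
    norm_v = math.sqrt(sum(v * v for v in velocity))
    norm_n = math.sqrt(sum(n * n for n in normal))
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_v * norm_n)))
    return math.degrees(math.acos(cos_angle))


def becomes_accretion_particle(space_filling_ratio, velocity, normal,
                               beta=BETA, gamma_deg=GAMMA_DEG):
    """Apply both criteria: ratio above beta and collision angle below gamma."""
    return (space_filling_ratio > beta
            and collision_angle_deg(velocity, normal) < gamma_deg)
```

With these conventions, a particle falling straight onto a well-filled surface accretes, while a particle sliding along the surface does not, whatever the space-filling ratio.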

Our aim

Numerical Coupled Analysis Method by Air Flow Analysis and Snow Accretion Analysis

Wall particle

Snow accretion particle

Flying snow particle

Target flying snow particle

Kohei Murotani, Koji Nakade, Yasushi Kamata, Daisuke Takahashi (Railway Technical Research Institute)

Development of Numerical Coupled Analysis Method by Air Flow Analysis and Snow Accretion Analysis

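\frac{d\mathbf{U}_{\mathrm{snow}}}{dt} = \frac{3}{4} C_d \frac{\rho_{\mathrm{air}}}{\rho_{\mathrm{snow}}} \frac{1}{d_{\mathrm{snow}}} |\mathbf{U}_r| \mathbf{U}_r + \mathbf{g}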
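A minimal Python sketch of how this equation of motion for gravity and drag could be advanced in time. Several things here are assumptions rather than details from the poster: U_r is taken as the air velocity minus the snow velocity, the time integrator is a simple explicit Euler update, gravity acts along negative z, and the function names and any parameter values passed in are hypothetical.

```python
import math


def snow_acceleration(u_snow, u_air, c_d, rho_air, rho_snow, d_snow,
                      g=(0.0, 0.0, -9.81)):
    """Right-hand side dU_snow/dt = (3/4) C_d (rho_air/rho_snow) (1/d_snow) |U_r| U_r + g.

    Assumption: U_r is the velocity of the air relative to the snow particle.
    """
    u_r = [a - s for a, s in zip(u_air, u_snow)]
    speed = math.sqrt(sum(c * c for c in u_r))
    k = 0.75 * c_d * (rho_air / rho_snow) / d_snow
    return [k * speed * c + gi for c, gi in zip(u_r, g)]


def euler_step(x, u_snow, u_air, dt, **params):
    """Advance position and velocity by one time step.

    The velocity is updated explicitly; the position then uses the updated
    velocity (semi-implicit Euler). Chosen for brevity, not taken from the poster.
    """
    a = snow_acceleration(u_snow, u_air, **params)
    u_new = [u + dt * ai for u, ai in zip(u_snow, a)]
    x_new = [xi + dt * ui for xi, ui in zip(x, u_new)]
    return x_new, u_new
```

In still air the drag and gravity terms balance at a terminal fall speed of sqrt(g / k), with k = (3/4) C_d (rho_air / rho_snow) / d_snow, which gives a quick sanity check for any chosen parameter set.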

Second judging method: Collision angle
Figure labels: Velocity of target flying snow particle; Normal vector of snow accretion surface

Observation

First judging method: Space-filling ratio

Experiment and Analysis (figure panels; the inflow direction is marked in each panel)

Snow accretion analysis and air flow analysis: Transparent particles are flying snow particles. Opaque particles are snow accretion particles. Colors show the velocity magnitude of the air flow.

Figure labels: if max div v < α; If two new snow accretion layers are generated.

Analysis area: 5 m × 1 m × 1 m
  Snow accretion analysis: about 60 million particles; particle diameter 1 mm; 1,000,000 steps; 264 nodes of Cray XC50; total wall-clock time 12 hours
  Air flow analysis: 188 million grids (1710 × 440 × 250); minimum grid spacing 1 mm; about 1,000,000 steps; 264 nodes of Cray XC50; total wall-clock time 5 hours

Analysis area: 4 m × 1 m × 1 m
  Snow accretion analysis: about 10 million particles; particle diameter 1 mm; 1,000,000 steps; 264 nodes of Cray XC50; total wall-clock time 3 hours
  Air flow analysis: 257 million grids (1100 × 500 × 500); minimum grid spacing 1 mm; about 1,000,000 steps; 264 nodes of Cray XC50; total wall-clock time 7 hours

Experiment of snow accretion at the snowfall wind tunnel at the Snow and Ice Research Center, National Research Institute for Earth Science and Disaster Resilience

The experiment team reproduces the actually observed phenomenon on simple, small models in the snowfall wind tunnel. The analysis team develops the snow accretion analysis method using a Cartesian grid method and a particle method. The developed snow accretion simulator is validated by trial and error against the experiments on simple, small models.

Figure labels: Inflow; Modification of boundary shape; Snow accretion analysis; Cubic shape

Page 2

Particle simulator: domain decomposition, halo exchange, and dynamic load balancing (figures)

Embedding particles in buckets
Domain decomposition for buckets by ParMETIS library
Halo area is generated.
Halo exchange between PE 0, PE 1, PE 2, and PE 3
The number of particles in each domain is equally partitioned.
Plot: number of particles (axis 4,000 to 5,500) versus time (axis 0 to 1), with labels 0.1976 sec and 0.1984 sec

Airflow simulator

Overview

Navier-Stokes equation for incompressible fluid flow
Finite difference method for nonuniform grid
Fractional step method
2nd-order central difference
3rd-order Adams-Bashforth method
Poisson equation solver by Jacobi method
LES model by coherent structure Smagorinsky model
Orthogonal domain decomposition
Target problem size: 10 million - 100 billion grids
Compiler: Fortran 90
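As an illustration of the time integration named in the airflow simulator overview, the following Python sketch applies the 3rd-order Adams-Bashforth formula to a scalar ordinary differential equation. It is not the simulator's code (which the poster says is written in Fortran 90): the function name is hypothetical, and the two start-up steps use a midpoint (RK2) rule, which is an assumption made only to obtain the history values the formula needs.

```python
import math


def ab3_integrate(f, t0, y0, h, n_steps):
    """Integrate dy/dt = f(t, y) with the 3rd-order Adams-Bashforth method.

    y[n+1] = y[n] + h/12 * (23 f[n] - 16 f[n-1] + 5 f[n-2])
    The first two steps use the midpoint rule to build the required history
    (an assumption of this sketch, not something stated in the poster).
    """
    ts, ys, fs = [t0], [y0], [f(t0, y0)]

    # Start-up: up to two midpoint (RK2) steps.
    for _ in range(min(2, n_steps)):
        t, y = ts[-1], ys[-1]
        k2 = f(t + 0.5 * h, y + 0.5 * h * fs[-1])
        ts.append(t + h)
        ys.append(y + h * k2)
        fs.append(f(ts[-1], ys[-1]))

    # Main loop: explicit three-step Adams-Bashforth update.
    for _ in range(n_steps - 2):
        y_next = ys[-1] + h / 12.0 * (23.0 * fs[-1] - 16.0 * fs[-2] + 5.0 * fs[-3])
        ts.append(ts[-1] + h)
        ys.append(y_next)
        fs.append(f(ts[-1], y_next))

    return ts, ys
```

For example, `ab3_integrate(lambda t, y: -y, 0.0, 1.0, 0.01, 100)` integrates exponential decay up to t = 1, and the last value can be compared with `math.exp(-1.0)`.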

Particle simulator

Overview

Particle simulator by particle interaction
Moving Particle Simulation (MPS) method
Collision detection among particles using buckets
Dynamic load balancing by ParMETIS
Target problem size: 1 million - 10 billion particles
Compiler: C

Figure captions: Domain decomposition and halo exchange; Dynamic load balancing
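The use of buckets for detecting close particle pairs can be sketched as follows. This is a generic, single-process illustration in Python, not the particle simulator's implementation (which the poster says is written in C and decomposed with ParMETIS): the function names are hypothetical, and the bucket edge length is assumed to equal the interaction radius so that only a bucket and its immediate neighbors need to be searched.

```python
import math
from collections import defaultdict
from itertools import product


def build_buckets(positions, bucket_size):
    """Map each integer bucket index to the list of particle indices inside it."""
    buckets = defaultdict(list)
    for idx, p in enumerate(positions):
        key = tuple(int(math.floor(c / bucket_size)) for c in p)
        buckets[key].append(idx)
    return buckets


def find_close_pairs(positions, radius):
    """Return all index pairs (i, j), i < j, whose distance is at most radius.

    Assumption: bucket edge length equals the radius, so every partner of a
    particle lies in its own bucket or in one of the adjacent buckets.
    """
    buckets = build_buckets(positions, radius)
    pairs = set()
    for key, members in buckets.items():
        for offset in product((-1, 0, 1), repeat=len(key)):
            neighbor_key = tuple(k + o for k, o in zip(key, offset))
            # .get avoids creating empty buckets while iterating.
            for j in buckets.get(neighbor_key, ()):
                for i in members:
                    if i < j and math.dist(positions[i], positions[j]) <= radius:
                        pairs.add((i, j))
    return pairs
```

Restricting the search to adjacent buckets is what keeps the cost close to linear in the number of particles, instead of the quadratic cost of testing every pair.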

Strong scaling of 10 billion particles by K computer

Supercomputer used in this research

System      Site                                  Processor (peak FLOPS)                                 Nodes
K computer  RIKEN                                 2.0 GHz SPARC64 VIIIfx, 8-Core (128 GFLOPS)            88,128 (1 CPU / node)
FX100       Nagoya Univ.                          2.2 GHz SPARC64 XIfx, 32-Core (1,126 GFLOPS)           2,880 (1 CPU / node)
XC50        Railway Technical Research Institute  2.7 GHz Intel Xeon Gold 6150, 18-Core (1,555 GFLOPS)   264 (2 CPUs / node)

ACKNOWLEDGMENTS

This research used computational resources of the K computer provided by the RIKEN Advanced Institute for Computational Science and the FX100 provided by Nagoya University through the HPCI System Research project (Project ID: hp170067, hp180014). This work was supported by JSPS KAKENHI Grant Number 17K05152.

Communication method between airflow simulator and particle simulator

Each of the two figures shows calculation areas 0 to 3 for the airflow simulator and calculation areas 0 to 3 for the particle simulator.

Communication of velocity distribution from airflow simulator to particle simulator

Communication of boundary shape from particle simulator to airflow simulator

Strong scaling by K computer (plots of wall-clock time and parallel efficiency versus the number of nodes)

100 billion grids

The number of nodes  Wall-clock time (s)  Parallel efficiency
6,144                4.01                 1.00
12,288               2.28                 0.88
24,576               1.23                 0.81
36,864               0.80                 0.84
64,512               0.55                 0.70
82,944               0.49                 0.61

10 billion particles

The number of nodes  Wall-clock time (s)  Parallel efficiency
768                  76.66                1.00
1,536                38.69                0.99
3,072                19.38                0.99
6,144                9.79                 0.98
12,288               5.02                 0.95
24,576               2.48                 0.97
36,864               1.73                 0.92
49,152               1.37                 0.88
73,728               1.19                 0.67
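The parallel efficiency values in the strong-scaling plots are consistent with the usual definition relative to the smallest node count, efficiency = (T_0 × N_0) / (T × N). The short Python sketch below recomputes them from the wall-clock times. The pairing of node counts with times is read off the plot labels, and the function name is hypothetical; small differences from the plotted values are expected because the plotted times are rounded.

```python
def parallel_efficiency(nodes, times):
    """Strong-scaling efficiency relative to the first (smallest) node count."""
    base_nodes, base_time = nodes[0], times[0]
    return [(base_time * base_nodes) / (t * n) for n, t in zip(nodes, times)]


# Values as read from the strong-scaling plots (K computer).
grid_nodes = [6144, 12288, 24576, 36864, 64512, 82944]
grid_times = [4.01, 2.28, 1.23, 0.80, 0.55, 0.49]

particle_nodes = [768, 1536, 3072, 6144, 12288, 24576, 36864, 49152, 73728]
particle_times = [76.66, 38.69, 19.38, 9.79, 5.02, 2.48, 1.73, 1.37, 1.19]

grid_eff = parallel_efficiency(grid_nodes, grid_times)
particle_eff = parallel_efficiency(particle_nodes, particle_times)
```

An efficiency of 1.0 means that doubling the node count halves the wall-clock time; values below 1.0 reflect communication and load-imbalance overheads at larger node counts.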

Computational Performances of 100 billion grids by 6,144 nodes

Section                  Elapsed (s)  MFLOPS / PEAK (%)  Memory access throughput / PEAK (%)  SIMD (%)  L1 cache miss (%)  L2 cache miss (%)  TLB miss (%)
LES model                0.94         5.08               32.04                                71.55     4.32               3.22               0.0285
Viscosity and Advection  0.69         4.74               35.95                                80.19     4.94               3.24               0.0240
Poisson's equation       2.24         3.71               27.90                                15.97     2.89               1.79               0.0007
One Loop                 4.44         3.85               29.96                                29.75     3.41               2.33               0.0077

Computational Performances of 10 billion particles by 768 nodes

Section                  Elapsed (s)  MFLOPS / PEAK (%)  Memory access throughput / PEAK (%)  SIMD (%)  L1 cache miss (%)  L2 cache miss (%)  TLB miss (%)
One loop                 0.73         25.8               0.30                                 49.2      5.09               0.02               0.00

The header of each table also carries the sub-labels "Average" and "Maximum".
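In the airflow table, Poisson's equation accounts for the largest share of one loop (2.24 s of 4.44 s). As a rough illustration of the kind of solver involved, here is a Jacobi iteration in Python for a 1D Poisson problem. It is a deliberately reduced sketch: the airflow simulator works on a 3D nonuniform grid in Fortran 90, whereas this example assumes a uniform 1D grid with zero boundary values, and the function name and stopping rule are hypothetical.

```python
def jacobi_poisson_1d(rhs, h, tol=1e-12, max_iter=100000):
    """Solve u'' = rhs on a uniform 1D grid with u = 0 at both end points.

    Each Jacobi sweep replaces every interior value by
    u[i] = (u[i-1] + u[i+1] - h*h*rhs[i]) / 2, using only values from the
    previous sweep. Iteration stops when the largest change falls below tol.
    Returns the solution and the number of sweeps performed.
    """
    n = len(rhs)
    u = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new_u = u[:]
        for i in range(1, n - 1):
            new_u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * rhs[i])
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            return u, iteration
    return u, max_iter
```

Because each sweep reads only the previous iterate, all grid points can be updated independently, which suits domain-decomposed parallel runs; the price is slow convergence, with the number of sweeps growing quickly as the grid is refined.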

Calculation time and FLOPS / PEAK of one loop by 12 nodes (bar charts for K, FX100, and XC50)

System  Time (s), 42M grids  FLOPS / PEAK (%), 42M grids  Time (s), 16M particles  FLOPS / PEAK (%), 16M particles
K       0.89                 3.85                         6.91                     25.82
FX100   0.42                 0.92                         0.94                     21.66
XC50    0.21                 unmeasured                   0.38                     unmeasured

Panel titles: Calculation time of one loop of 42M grids by 12 nodes; FLOPS / PEAK of one loop of 42M grids by 12 nodes; Calculation time of one loop of 16M particles by 12 nodes; FLOPS / PEAK of one loop of 16M particles by 12 nodes