TRANSCRIPT
Deep Learning Structures for Time Series Data in Manufacturing:
RNN Focus
Suhyun Kim, Seungtae Park, and Seungchul Lee*
UNIST
Motivation
• Manufacturing data
– Contains mechanical information
– Is cheap to acquire
[Figure: Big Data = Data + Computational Power + Learning Algorithm; pipeline from Data Acquisition through a Deep Learning Model or a Feature-based Model to Data-driven Diagnostics]
• (Traditional) machine learning
– Data acquisition
– Feature extraction
– Classification
[Figure: time signal (amplitude vs. sample number) with the peak-to-peak feature marked; pipeline: Data Acquisition → Feature Extraction (time domain, frequency domain) → Classification]
Feature Extraction
• Requires highly specialized domain knowledge and expertise
– Feature engineering
– Feature selection
– Interpreting the features
Deep Learning
• Includes autonomous feature learning
– Automatic feature extraction
– Transforms the raw data for classification
[Figure: deep network in which feature-learning layers feed a classification layer that separates Class 1 from Class 2]
Sequential Data in Manufacturing
[Figure: machinery generates sequential data such as vibration and sound signals; how should deep learning handle it?]
Traditional Deep Learning/Machine Learning
• Slice the sequential data using a window (see the sketch below)
• Limitation: each window is only a snapshot (the time dependency between windows is ignored)
[Figure: the sequence x1, x2, x3, x4 is sliced into Windows 1-4; each window is fed independently through input, hidden state, and output layers to produce outputs o1-o4]
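Slicing a signal into fixed-size windows is simple to express in code. A minimal NumPy sketch follows; the sampling rate and the random signal are illustrative stand-ins, since the actual dataset is not disclosed.

```python
# Minimal sketch of the windowing step (NumPy only; all values illustrative).
import numpy as np

fs = 10_000                      # assumed sampling rate (Hz), illustrative
signal = np.random.randn(fs)     # stand-in for one second of vibration data

window_size = 100                # samples per window
n_windows = len(signal) // window_size

# Slice the 1-D signal into non-overlapping windows: shape (n_windows, window_size)
windows = signal[: n_windows * window_size].reshape(n_windows, window_size)
print(windows.shape)             # (100, 100)
```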
Sequential Data
• Time dependency: the order of the sequence matters
[Figure: the same windows x1-x4 presented in their original order and in a permuted order carry different information, yet a window-by-window model treats them identically]
Deep Learning: RNN
• Deep learning structure for sequential data (a minimal sketch follows below)
– Recurrent Neural Network (RNN)
– Information is passed from one step to the next
[Figure: Windows 1-4 are fed in order; each window updates a hidden state (State 1-4) through shared weights W, and each state emits an output o1-o4]
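The recurrence in the figure takes only a few lines. A minimal NumPy sketch, with illustrative sizes (window size 100 and 20 hidden nodes, as given later in the talk); the weights here are random stand-ins, not trained parameters.

```python
# Minimal NumPy sketch of the RNN recurrence: each window x_t updates the
# hidden state through shared weights W (all values illustrative).
import numpy as np

rng = np.random.default_rng(0)
window_size, n_hidden, n_windows = 100, 20, 4

W_x = rng.normal(scale=0.1, size=(n_hidden, window_size))  # input-to-state
W_s = rng.normal(scale=0.1, size=(n_hidden, n_hidden))     # state-to-state (shared W)
b = np.zeros(n_hidden)

state = np.zeros(n_hidden)                        # initial state
for t in range(n_windows):                        # Window 1 ... Window 4
    x_t = rng.normal(size=window_size)            # stand-in for one window of signal
    state = np.tanh(W_x @ x_t + W_s @ state + b)  # State t carries past information
print(state.shape)                                # (20,)
```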
Deep Learning: RNN for Classification
• For sequence classification, only the last step's output is used (see the sketch below)
– The final hidden state summarizes the whole sequence
– Its output o4 produces the diagnosis
[Figure: the same unrolled RNN; the intermediate outputs are dropped and only the last output feeds the diagnosis]
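A hedged sketch of such a many-to-one RNN classifier in modern TensorFlow/Keras (the talk used TensorFlow, but the original code is not disclosed; the layer sizes below follow the slides, everything else is illustrative).

```python
# Many-to-one RNN classifier: only the last hidden state feeds the diagnosis.
import tensorflow as tf

n_windows, window_size, n_classes = 100, 100, 2   # sizes from the slides

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_windows, window_size)),
    # return_sequences=False (the default) keeps only the last hidden state,
    # matching the figure where only the final output feeds the diagnosis
    tf.keras.layers.SimpleRNN(20),
    tf.keras.layers.Dense(n_classes, activation="softmax"),  # diagnosis
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```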
RNN Experiment
[Figure: example signals from Class 1 and Class 2]

Model     Accuracy (%)
PCA-SVM   92.93
RNN       92.45

(Detailed information cannot be disclosed.)
Issue with RNN Structure
• Vanishing/exploding gradient problem (illustrated below)
– Gradients shrink toward zero or blow up as they are propagated back through many time steps
– A new structure is needed
[Figure: unrolled RNN (input, hidden state, output layer) through which the gradient must travel]
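The problem can be seen numerically: backpropagation through T steps multiplies the gradient by the recurrent Jacobian at every step, so its norm scales roughly like the T-th power of the largest singular value of the recurrent weight matrix. An illustrative NumPy sketch (the tanh-derivative factor, which is at most 1, is ignored here):

```python
# Illustrative demo: repeated multiplication by the recurrent weights makes
# the backpropagated gradient either vanish or explode over many steps.
import numpy as np

rng = np.random.default_rng(0)
n_hidden, T = 20, 100

for scale in (0.05, 0.5):                   # small vs. large recurrent weights
    W_s = rng.normal(scale=scale, size=(n_hidden, n_hidden))
    grad = np.ones(n_hidden)
    for _ in range(T):
        grad = W_s.T @ grad                 # one backprop step through time
    print(scale, np.linalg.norm(grad))      # one norm collapses, one blows up
```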
Advanced RNN structures
• Long short-term memory (LSTM)
• Skip connected LSTM
• Bidirectional LSTM

Long Short-Term Memory (LSTM)
• Gate mechanism
– Augments the hidden state with gates
– Input, output, and forget gates (with parameters to be learned)
– Can remember important information from the distant past
[Figure: unrolled network with gated hidden states between the input and output layers]
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
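For reference, the standard LSTM gate equations in their modern form (the forget gate was added after the 1997 paper; the notation below is the common one, not taken from the slides):

```latex
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state: long-term memory)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```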
Skip Connected LSTM
• Adds skip (residual) connections to the recurrent structure (a sketch follows below)
– The "+" connections let information bypass part of the transformation, easing gradient flow through long sequences
– The outputs o1-o4 feed the diagnosis
[Figure: unrolled LSTM over Windows 1-4 with "+" skip connections at each step; the outputs o1-o4 produce the diagnosis]
Wang, Y., & Tian, F. (2016). Recurrent residual learning for sequence classification. EMNLP 2016.
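A hedged Keras sketch of one plausible reading of the figure, where each step's (projected) input is added to the LSTM output; Wang & Tian's exact residual formulation differs in detail, and the original code is not disclosed.

```python
# One plausible skip-connected LSTM: add each step's projected input to the
# LSTM output (illustrative wiring, not the authors' exact architecture).
import tensorflow as tf

n_windows, window_size, n_classes = 100, 100, 2

inputs = tf.keras.Input(shape=(n_windows, window_size))
# Project each window to the hidden width so shapes match for the addition
proj = tf.keras.layers.Dense(20)(inputs)
h = tf.keras.layers.LSTM(20, return_sequences=True)(proj)
h = tf.keras.layers.Add()([h, proj])          # the "+" skip connections in the figure
last = h[:, -1, :]                            # keep the last step for the diagnosis
outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(last)

model = tf.keras.Model(inputs, outputs)
model.summary()
```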
Bidirectional LSTM
• Uses both forward and backward context (a sketch follows below)
– A forward chain of states (s_{t-1}, s_t, s_{t+1}) processes the windows left to right; a backward chain processes them right to left
– Both chains contribute to each output (o_{t-1}, o_t, o_{t+1})
[Figure: forward and backward state chains over Windows 1-3 feeding a shared output layer]
Graves, A., Mohamed, A.-R., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645-6649.
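In Keras, the bidirectional variant is a one-line change via the Bidirectional wrapper; 10 units per direction below matches the "10 hidden nodes" given on a later slide (the concatenated state is then 20-dimensional). Everything else is illustrative.

```python
# Bidirectional LSTM classifier: forward and backward passes are concatenated.
import tensorflow as tf

n_windows, window_size, n_classes = 100, 100, 2

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_windows, window_size)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(10)),  # forward + backward
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.summary()
```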
Computation Environment
• Development environment
– Ubuntu 14.04
– Python 3
– TensorFlow
• Machine
– GPU: GeForce GTX TITAN X (Pascal)
– CPU: Intel i7-5930K, 6 cores, 3.5 GHz
Network Structure
• Number of hidden nodes
– RNN, LSTM, skip connected LSTM: 20
– Bidirectional LSTM: 10
• Window size
– Number of windows: 100
– Window size: 100 samples (each window spans 0.01 s)
• Cost function
– Cross entropy
– L2 regularization
$$ \mathrm{Cost} = -\frac{1}{N} \sum_{n=1}^{N} \big[\, y_n \log(\hat{y}_n) + (1 - y_n) \log(1 - \hat{y}_n) \,\big] + \lambda \lVert W_{fc} \rVert_2^2 $$
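A hedged Keras sketch of this cost: cross entropy as the loss, with L2 regularization attached to the final fully connected layer's weights (the slides do not say exactly which weights are regularized, and the λ below is illustrative).

```python
# Cross entropy + L2 regularization, expressed in Keras (illustrative lambda).
import tensorflow as tf

n_windows, window_size, n_classes = 100, 100, 2   # sizes from this slide

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_windows, window_size)),
    tf.keras.layers.LSTM(20),
    tf.keras.layers.Dense(
        n_classes, activation="softmax",
        # adds lambda * ||W_fc||^2 to the loss, the L2 term in the cost above
        kernel_regularizer=tf.keras.regularizers.l2(1e-3),
    ),
])
# the loss implements the -1/N sum[...] cross-entropy term of the cost
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```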
Performance Result
Model                 Accuracy (%)   Learning time (s)   Run time (s)
PCA-SVM               92.93          0.29                0.05
RNN                   92.45          128.85              0.07
LSTM                  93.45          169.54              0.11
Skip connected LSTM   95.08          172.27              0.12
Bidirectional LSTM    95.39          226.08              0.20

(Detailed information cannot be disclosed.)
Conclusion
• Deep learning models for sequential data
– RNN
– LSTM
– Skip connected LSTM
– Bidirectional LSTM
• Limitation
– Current deep learning models focus on accuracy
– Interpretable deep learning models are still needed