Wind Turbine Fault Prediction
Using Soft Label SVM
Rui Zhao Md Ridwan Al Iqbal Kristin P. Bennett Qiang Ji
Contact: zhaor@rpi.edu
Introduction: Motivation
Increasing growth of wind energy consumption
High cost of maintenance and repair
Significant loss when a turbine is forced out of service
[Photo: wind turbines at Diavik Diamond Mine, Canada [1]]
Introduction: Background
Forced outage: turbine shutdown due to
unexpected internal fault of the system
Three main categories of prognostics strategies [3]
Physical modeling based approach
Signal analysis based approach
Machine learning based approach
[Figure: different components of a wind turbine [2]]
Introduction: Challenges
The development process of a fault is often
unknown – no exact label information
The signature of a fault is often unknown –
no single indicative feature
Different turbines may have different
symptoms and causes of fault – significant
heterogeneity
Problem Statement: Goal
Notation:
x_t ∈ R^d is the d-dimensional feature vector at time t
x_n^l = {x_n, …, x_{n+l−1}} is a subsequence of the time series
y_n^l is the binary hidden label of the subsequence
-1: Normal
+1: Pre-fault
Goal: learn a mapping f such that y_n^l = sign(f(x_n^l))
Major challenge: Classification without exact labels
Problem Statement:
Assumptions
Exploit the uncertainty of label information
Empirical observation: the closer the time to the
forced outage, the more likely the turbine is in
pre-fault status.
Assumptions:
Non-decreasing probability of being in pre-fault
status as time approaches the forced outage
event.
P(y_n^l = 1 | x_n^l) ≥ P(y_m^l = 1 | x_m^l), ∀ n > m
For testing purposes, assume
y_n^l = −1 for n < n^−, and y_n^l = +1 for n > n^+
Methods: SVM
Support Vector Machines (SVM) [4] as base framework
Parameterization: y_n = sign(f(x_n)) ≡ sign(w^T φ(x_n) + b)
Training data: {(y_n, x_n)}_{n=1}^N
Model parameters: w, b
Primal problem:
min_{w,b} (1/2)‖w‖² + γ Σ_{n=1}^N ξ_n
subject to y_n (w^T φ(x_n) + b) ≥ 1 − ξ_n,
ξ_n ≥ 0, n = 1, …, N
Requires full supervision: every training example must be supplied with a label y_n
(SVM-P)
Methods: Our Approach
Soft Label Support Vector Machines (SLSVM)
Same parameterization and model parameters as SVM
y_n = sign(f(x_n)) ≡ sign(w^T φ(x_n) + b)
Training data: {(u_n^+, u_n^−, x_n)}_{n=1}^N
P(y_n = +1 | x_n) = u_n^+
P(y_n = −1 | x_n) = u_n^−
u_n^+ + u_n^− = 1, n = 1, …, N
Primal problem:
min_{w,b} (1/2)‖w‖² + γ Σ_{n=1}^N (u_n^+ ξ_n^+ + u_n^− ξ_n^−)
subject to w^T φ(x_n) + b ≥ 1 − ξ_n^+,
−(w^T φ(x_n) + b) ≥ 1 − ξ_n^−,
ξ_n^+ ≥ 0, ξ_n^− ≥ 0, n = 1, …, N
(SLSVM-P)
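The presentation solves SLSVM through its dual QP (next slides). As an illustrative alternative, a minimal subgradient-descent sketch of the SLSVM primal with a linear kernel; the function name, learning rate, and epoch count are assumptions, not the authors' implementation:

```python
import numpy as np

def slsvm_primal_sgd(X, u_pos, u_neg, gamma=1.0, lr=0.01, epochs=200):
    """Subgradient descent on the SLSVM primal (linear kernel, illustrative).
    Objective: 0.5*||w||^2 + gamma * sum_n (u_n^+ * xi_n^+ + u_n^- * xi_n^-),
    where xi^+ = max(0, 1 - f) and xi^- = max(0, 1 + f) with f = w.x + b."""
    N, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        f = X @ w + b
        act_pos = (1.0 - f) > 0  # xi^+ hinge active: f < 1
        act_neg = (1.0 + f) > 0  # xi^- hinge active: f > -1
        grad_w = w - gamma * ((u_pos * act_pos) @ X - (u_neg * act_neg) @ X)
        grad_b = -gamma * ((u_pos * act_pos).sum() - (u_neg * act_neg).sum())
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Each point contributes both hinge terms, weighted by its soft labels, rather than a single hinge for a hard label as in standard SVM.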
Methods: Comparison
[Figure: illustration of slack variables. Standard SVM assigns each point a single slack ξ relative to its given label (ξ = 0 outside the margin, ξ < 1 inside the margin, ξ > 1 when misclassified). SLSVM assigns each point two slacks ξ^+ and ξ^−, one per candidate label, weighted by the soft labels u^+ and u^−.]
Methods: SLSVM
Lagrangian Dual problem
max_{α^+, α^−} Σ_{n=1}^N (α_n^+ + α_n^−) − (1/2) Σ_{m=1}^N Σ_{n=1}^N (α_m^+ − α_m^−)(α_n^+ − α_n^−) k(x_m, x_n)
subject to 0 ≤ α_n^+ ≤ γ u_n^+, 0 ≤ α_n^− ≤ γ u_n^−, n = 1, …, N,
Σ_{n=1}^N (α_n^+ − α_n^−) = 0
where k(x_m, x_n) = φ(x_m)^T φ(x_n) is the kernel function
(SLSVM-D)
Methods: Solution
(SLSVM-D) can be solved by quadratic
programming [5]
From the KKT conditions, we can recover w and b
Given new data 𝐱, we can evaluate its score
f(x) = w^T φ(x) + b = Σ_{n=1}^N (α_n^+ − α_n^−) k(x, x_n) + b
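Given the dual variables from the QP solve, the scoring formula above translates directly to code (a minimal sketch; the kernel is passed in as a callable):

```python
import numpy as np

def score(x, X_train, alpha_pos, alpha_neg, b, kernel):
    """Evaluate f(x) = sum_n (alpha_n^+ - alpha_n^-) k(x, x_n) + b
    from the dual variables of (SLSVM-D)."""
    k = np.array([kernel(x, x_n) for x_n in X_train])
    return float((alpha_pos - alpha_neg) @ k + b)
```

With a linear kernel this reduces to w·x + b, where w = Σ_n (α_n^+ − α_n^−) x_n.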
Methods: Generalized
Formulation
Generalized primal problem
min_{w,b} (1/2)‖w‖_p^p + γ Σ_{c=1}^2 Σ_{n=1}^N u_n^c E(f(x_n), y^c)
p > 0: order of regularization
u_n^c = P(y_n = y^c | x_n), y^c = (−1)^c: soft label
E(f(x_n), y^c): loss function, e.g.
Hinge loss: E = max(0, 1 − y^c f(x_n))
Squared hinge loss: E = max(0, 1 − y^c f(x_n))²
Squared loss: E = (1 − y^c f(x_n))²
Optimization: ADMM [6] for primal problem and quadratic
programming for dual problem
(SLSVM-PG)
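The three candidate loss functions can be written directly in code (a minimal sketch, vectorized over samples):

```python
import numpy as np

def hinge(f, y):
    """E = max(0, 1 - y*f)"""
    return np.maximum(0.0, 1.0 - y * f)

def squared_hinge(f, y):
    """E = max(0, 1 - y*f)^2"""
    return np.maximum(0.0, 1.0 - y * f) ** 2

def squared(f, y):
    """E = (1 - y*f)^2"""
    return (1.0 - y * f) ** 2
```

The hinge variants penalize only margin violations, while the squared loss also penalizes confidently correct predictions with y·f > 1.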
Experiment – Data
Source: 55 channels of sensor data collected
from 125 wind turbines (GE 1.6 MW)
Granularity: data are sampled at sub-second level and averaged over 10-minute periods.
Other facts:
Time span: June 2013 to May 2014
Number of forced outages: 38
Total down time: 2350 hours
Fault related component: electrical subsystem
Experiment – Process
[Pipeline: Input sensor series → Locate forced outages → Data pre-processing → Feature extraction → Empirical probability assignment → SLSVM training → Classification → Output: AUC, feature rank]
Data pre-processing:
1. Data cleaning
2. Channel pruning
Experiment – Process
Feature extraction:
1. Customized normalization
2. Spatial feature: covariance
3. Temporal feature: autoregressive model coefficients
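A minimal sketch of the spatial/temporal feature extraction above, assuming channel covariance for the spatial part and per-channel AR coefficients fit by least squares for the temporal part (the AR order is an assumption; the slides do not state it):

```python
import numpy as np

def extract_features(window):
    """window: (l, d) multichannel subsequence.
    Spatial: upper-triangular entries of the channel covariance.
    Temporal: per-channel AR(p) coefficients via least squares."""
    l, d = window.shape
    cov = np.cov(window, rowvar=False)
    spatial = cov[np.triu_indices(d)]
    p = 2  # AR order (assumption)
    temporal = []
    for c in range(d):
        s = window[:, c]
        # design matrix of p lagged values; target is the current value
        A = np.column_stack([s[p - k - 1 : l - k - 1] for k in range(p)])
        y = s[p:]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        temporal.extend(coef)
    return np.concatenate([spatial, np.array(temporal)])
```

For d channels this yields d(d+1)/2 spatial features plus d·p temporal features per subsequence.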
Experiment – Process
Locate forced outages:
1. Logged 38 forced outage events
2. Truncate up to 12 days prior to each event
Experiment – Process
Empirical probability assignment:
1. Linear
2. Sigmoid
3. Exponential
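The slides name three families of probability assignment but not their exact forms; one plausible sketch mapping time-to-outage to P(y = +1 | x), honoring the non-decreasing assumption (functional forms and parameters are assumptions):

```python
import numpy as np

def soft_labels(t_to_outage, horizon, kind="exponential", tau=2.0):
    """Map time remaining before the forced outage (e.g. in days) to
    P(y = +1 | x): the closer to the event, the higher the pre-fault
    probability. `horizon` is the truncation window; `tau` is a decay
    scale for the exponential family (both assumptions)."""
    t = np.asarray(t_to_outage, dtype=float)
    if kind == "linear":
        u = 1.0 - t / horizon
    elif kind == "sigmoid":
        u = 1.0 / (1.0 + np.exp(t - horizon / 2.0))
    elif kind == "exponential":
        u = np.exp(-t / tau)
    else:
        raise ValueError(kind)
    return np.clip(u, 0.0, 1.0)
```

Each choice gives u^+ = P(y = +1 | x) and u^− = 1 − u^+ for SLSVM training.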
Experiment – Process
SLSVM training:
3 different loss functions E
2 different values of regularization order p
Linear kernel is used
Experiment – Process
Evaluation (AUC, feature rank): based on assumed
ground-truth labels
Experiment – Soft Labels
Results – Classification
Prediction quality
All soft label approaches outperform the hard label approach
(standard SVM), with the exponential soft label performing best
Extension for feature selection is considered
Simple clustering method performs the worst
Results – Classification
Prediction horizon
Varying the prediction horizon and subsequence length
Predicting 18 hours ahead achieves the highest average AUC of 0.91
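AUC, the evaluation metric reported above, can be computed directly from classifier scores via its rank formulation (a minimal sketch using the ±1 label convention of the slides):

```python
import numpy as np

def auc(scores, labels):
    """AUC as the probability that a randomly chosen pre-fault (+1)
    window outscores a randomly chosen normal (-1) window; ties count
    half."""
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))
```

An AUC of 0.91 means a pre-fault window outranks a normal window 91% of the time.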
Results – Feature selection
L1-norm regularized formulation for feature selection
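The L1-norm (p = 1) formulation drives the weights of uninformative features to exactly zero, which is what enables feature ranking. The key ingredient in proximal solvers for such objectives is the soft-thresholding operator (a sketch of the mechanism, not the authors' exact solver):

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t*||w||_1: shrinks each coefficient toward
    zero and zeroes those with magnitude below t, yielding sparsity."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)
```

Features whose weights are zeroed out are effectively deselected; the surviving nonzero weights give the feature ranking.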
Summary
Formalized a time-series classification problem for wind
turbine prognostics
Proposed a classification framework, SLSVM, that can
handle uncertainty in label information
Extended SLSVM with feature selection capability to
provide insight for prognostics
Demonstrated the effectiveness of SLSVM for fault
prediction on real turbine operation data
Thank you
References
[1] Global Wind Energy Council, Global Wind Report, 2016
[2] M. Schlechtingen, I. F. Santos and S. Achiche, Wind turbine condition monitoring based on SCADA data using normal behavior models, Applied Soft Computing, 2013
[3] Z. Hameed, Y. Hong, Y. Cho, S. Ahn, and C. Song, Condition monitoring and fault detection of wind turbines and related algorithms: A review, Renewable and Sustainable Energy Reviews, 2009
[4] B. E. Boser, I. M. Guyon and V. N. Vapnik, A training algorithm for optimal margin classifiers, ACM Annual Workshop on Computational Learning Theory, 1992
[5] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004
[6] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 2011
Contact: zhaor@rpi.edu
Q&A