neuro-fyzzy methods for modeling and identification
DESCRIPTION
Neuro-Fyzzy Methods for Modeling and Identification. Presented by: Ali Maleki. Presentation Agenda. Introduction Fuzzy Systems Artificial Neural Networks Neuro-Fuzzy Modeling Simulation Examples. Introduction. Control Systems Competion Environment requirements - PowerPoint PPT PresentationTRANSCRIPT
Neuro-Fyzzy Methods
for
Modeling and Identification
Presented by:Ali Maleki
Presentation Agenda
Introduction Fuzzy Systems Artificial Neural Networks Neuro-Fuzzy Modeling Simulation Examples
Introduction Control Systems
CompetionEnvironment requirementsEnergy and material costsDemand for robust, fault-tolerant systems
Extra needs for Effective process modeling techniques
Conventional modeling?
lack precise, formal knowledg about the systemStrongly nonlinear behavior,High degree of uncertainty,Time varying characteristics
Introduction Solution: Neuro-fuzzy modeling
A powerful tool which can facilitate the effective development of models by combining information from different source:
Empirical modelsHeuristicsData
Neuro-fuzzy modelsDescribe systems by means of fuzzy if-then rulesRepresented in a network structureApply algorithms from the area of Neural Networks
IntroductionNeuro-fuzzy modeling
- neural networks- fuzzy systems
Both are motivated by imitating human reasoning process
Relationships:In neural networks :
implicitly, coded in the network and its parameters
In fuzzy systems:explicitly, in the form of if–then rules
Neuro–fuzzy systems combine the semantic transparency of rule-based fuzzy systems with the learning capability of neural networks
Nonlinear system identification
NARX (nonlinear autoregressive with exogenous input) model
Regressor vector:
Dynamic order of the system: Represented by the number of lags nu and ny
Task of nonlinear system identification:Infer unknown function f from available data sequences
Nonlinear system identification
Multivariable systems:
Nonlinear state-space discription
Task of nonlinear system identification:Infer unknown functions g and h from available data sequences
Neural NetworksNeuro-fuzzy systemsSplinesInterpolated look-up tablesAccurate PredictorAccurate predictor +model that can be used to learn something about system +Analyse system properties
Fuzzy Models
Fuzzy Models
Definition: A mathematical model which in some way uses
fuzzy setsIn system identification:
rule-based fuzzy modelsExample: IF heating is high THEN temperature increase is fast
Linguistic terms
To make such a model operational: Linguistic terms must be defined more precisely using fuzzy setsFuzzy sets defined through their membership functions
Fuzzy Models
Types of fuzzy models: (depending on the structure of if-then rules)
Mamdani Model
IF D1 is low and D2 is high THEN D is medium
Takagi-Sugeno Model
IF D1 is low and D2 is high THEN D=k (zero-order)
IF D1 is low and D2 is high THEN D=0.7D1+0.2D2+0.1 (first-order)
Mamdani Model
Linguistic terms Number of rules
Linguistic fuzzy model is useful for representing qualitative knowledge
Example – Mamdani Model
Linguistic terms is defined by membership function
Gas burner Heating powerOxygen flow rate
Example – Mamdani Model
Membership functions can be defined by the model developer based on prior
knowledgeor (automatically) by using data (constructed - adjusted)
Gas burner Heating powerOxygen flow rate
Takagi-Sugeno Model
Mamdani model is typically used in knowledge-based (expert) systemsIn data driven identification, Takagi_Sageno model has becom popular
Consequent parameter vector Scalar offset Number of rules
Takagi-Sugeno Model
The output y is computed by taking the weighted average of the individual rules contribution:
Degree of fulfilment of the ith rule
In a special case:
Takagi-Sugeno Model (zero-order)Rules:
Input-output equation:
This model is a special case of the Mamdani system, in whichConsequent fuzzy sets degenerate to singletons (real numbers)
Takagi-Sugeno Model
usually,antecedent fuzzy sets are usually defined to describe distinct, partly overlapping regions in the input space
Then,The parameters ai are approximate local linear models of the considered nonlinear system
TS model:Piece-wise linear approximation of a nonlinear function
Example: Takagi-Sugeno Model
static characteristic of an actuator withdead zone and anon-symmetrical response for positive and negative inputs
Fuzzy Logic Operators
In fuzzy systems with multiple inputs, the antecedent proposition is usually represented as a combination of terms,by using logic operators
‘and’ (conjunction)‘or’ (disjunction) and‘not’ (complement)
In fuzzy set theory,several families of operators have been introduced for these logical connectives
Example: Fuzzy Logic Operators
conjunctive form of the antecedent
degree of fulfillment:
minimum conjunction operator
product conjunction operator
Dynamic Fuzzy Models
TS NARX model
Regressor vector
Dynamic Fuzzy Models
TS state-space model
Advantages of the state-space modeling approach:• structure of the model can easily be related to the
physical structure of the real system (model parameters are physically relevant)
• This is not necessarily the case with input-output models.
• Dimension of the regression problem in state-space modeling is often smaller than with input–output models
Artificial Neural Networks
ANNs:• Inspired by the functionality of biological neural
networks• Can generalizing from a limited amount of training
data• black-box models of nonlinear, multivariable static
and dynamic systems• can be trained by using input–output data
ANNs consist of:• Neurons• Interconnection among them• Weights assigned to these interconnections
Multi-Layer Neural Network
• One input Layer• One output layer• A number of hidden layers
Activation function: Linear neurons• Tangent hyperbolic
• Threshold function• Sigmoidal function
Multi-Layer Neural Network - Training
Training definition:• adaptation of weights in a multi-layer network
such that the error between the desired output and the network output is minimized
Training steps:• Feedforward computation• Weight adaptation
Gradient-descent optimizationError backpropagation
Multi-Layer Neural Network - Structure
A network with one hidden layer is sufficient for most approximation tasks
More layers:• Can give a better fit• But the training takes longer
number of neurons in the hidden layer:• Too few neurons give a poor fit• Too many neurons result in overtraining of
the net (poor generalization to unseen data)
Dynamic Neural Networks
static feedforward network combined with an externalfeedback connection
First-order NARX model
Dynamic Neural Networks
Recurrent Networks
Feedback:• Internally in the neurons (Elman network)• Internally to other neurons in the same layer• Internally to neurons in preceding layer (Hopfield Network)
Hopfield Network
Elman Network:
Error Backpropagation
Input:
Desired output:
Error:
Cost function:
Adjusting the weights: minimization of the cost function
Error Backpropagation
network’s output y is nonlinear in the weightsTherefore,The training of a MNN is thus a nonlinear optimization
problem
Methods:• Error backpropagation (first-order gradient)• Newton, Levenberg-Marquardt methods (second-
order gradient)• Genetic algorithms and many others techniques
Error Backpropagation
First-order gradient
Update rule:
Weight vector in iteration nLearning rate
Jacobian of the network
nonlinear optimization problem is thus solved by using the first term of its Taylor series expansion
Error Backpropagation
Second-order gradient:
Second-order gradient method make use of the second term
Hessian
Update rule:
Error Backpropagation
Difference between first-order and Second-order gradient methods:
Size and direction of gradient-decent step
Second order methods are usually more effective than first-order ones
Error Backpropagation
For output layer
Update law for output weights:
For hidden layer
Update law for hidden layer weights:
Backpropagation error
Radial Basis Function Network
RBF network is a twolayer network as figure below
Ususal choise for basis function is Gaussian function
Radial Basis Function Network
Adjustable weights are only present in the output layer
Free parameters of RBF nets are• Output weights• Parameters of the basis functions (centers and
radii)
Output is linear in the weights, and these weights can be estimated by least-squares methods
adaptation of the RBF parameters (center and radial) is a nonlinear optimization problem that can be solved by the gradient-descent techniques
Neuro-Fuzzy Modeling
Fuzzy system as a layered structure (network), similar to ANNs of the RBF type
gradientdescent training algorithms for parameter optimization
This approach is usually referred to as Neuro-Fuzzy Modeling
• Zero-order TS fuzzy model• First-order TS fuzzy model
Neuro-Fuzzy Modeling
Zero-order TS fuzzy model
Typical membership function
Input-output equation
Neuro-Fuzzy Modeling
First-order TS fuzzy model
Input-output equation
Neuro-Fuzzy Networks - Constructing
• Prior knowledge (can be of a rather approximate nature)
• Process data
Integration of knowledge and data:• Expert knowledge as a collection of if–then rules
(Initial model creation – fine tune using process data)• Fuzzy rules are constructed from scratch by using
numerical data(expert can confront the information stored in the rule base with his own knowledge)(can modify the rules)(supply additional rules to extend the validity of the model)
Comparision the second method with Truly black-box structures
possibility to interpret the obtained results
Neuro-Fuzzy Networks – Structure and parameters
System identification steps:• Structure identification• Parameter estimation
• choice of the model’s structure determines the flexibility of the model in the approximation of (unknown) systems
• model with a rich structure can approximate more complicated functions, but, will have worse generalization properties
• Good generalization means that a model fitted to one data set will also perform well on another data set from the same process.
Neuro-Fuzzy Networks – Structure and parameters
Structure selection process involves:
• Selection of input variables
• Number and type of membership functions, number of rules(These two structural parameters are mutually related)
Neuro-Fuzzy Networks – Structure and parameters
Selection of input variables• Physical inputs• Dynamic regressors (defined by the input and output
lags)
Typical sources of information:• Prior knowledge• Insight in the process behavior• Purpose of the modeling exercise
Automatic data-driven selection can then be used to compare different structures in terms of some specified performance criteria.
Neuro-Fuzzy Networks – Structure and parameters
Number and type of membership functions, number of rules
• Determine the level of detail (granularity) of the model
Typical criteria• Purpose of modeling• Amount of available information (knowledge and
data)
Automated methods can be used to add or remove membership functions and rules.
Neuro-Fuzzy Networks – Gradient-based learning
Zero-order ANFIS model
Consequent parameters
Jacobian
Update law
Centers and spreads of the Gaussian membership functions
Neuro-Fuzzy Networks – Hybrid Learning
Output-layer parameters in RBF networks can be estimated by linear least-squares (LS) techniques
LS methods are more effective than the gradient-based update rule
Hybrid methods:• One-shot least-squares estimation of the consequent
parameters• Iterative gradient-based optimization of the membership
functions
Choice of LS estimation method:• In terms of error minimization is not crucial• If consequent parameters are to be interpreted as local
models great care must be taken
PROBLEM: over-parameterizationnumerical problems - over-fitting - meaningless parameter
estimates
Example – Hybrid Learning
Approximation of second-order polynomial by a first-order ANFIS model
Membership function for t1<u<t2
Model:
Output of TS model
• Model has four free parameters, while three are sufficient to fit the polynomial
Neuro-Fuzzy Networks – Hybrid Learning
To avoid over-parameterization, the basic least-squares criterion can be combined with additional criteria for local fit, or with constraints on the parameter values
• Local Least-Squares Estimation• Constrained Estimation• Multi-Objective Optimization
Hybrid Learning - Global LS Estimation
Hybrid Learning - Local LS Estimation
While the global solution gives the minimal prediction error, it may bias the estimates of the consequents as parameters of local models.
If locally relevant model parameters are required, a weighted LS approach applied per rule should be used.
• The consequent parameters of the individual rules are estimated independently (result is not influenced by the interactions of the rules)
• Larger prediction error is obtained than with global least squares
Example – Global and Local LS Estimation
Approximation of second-order polynomial by a first-order ANFIS model
Model:
Local LS estimation Global LS estimation
Hybrid Learning - Constrained Estimation
Knowledge about the dynamic system such as its stability, minimal or maximal static gain, or its settling time can be translated into convex constraints on the consequent parameters
Hybrid Learning - Multi-Objective Optimization
Minimize the weighted sum of the global and local identification criteria
Initialization of Antecedent Membership Functions
For a successful application of gradient-descent learning to the membership function parameters, good initialization is important
Initialization methods:• Template-Based Membership Functions• Discrete Search Methods• Fuzzy Clustering
Initialization - Template-Based Membership Functions
domains of the antecedent variables are a priori partitioned by a number of membership functions (usually evenly spaced and shaped)
severe drawback of this approach is that the number of rules in the model grows exponentially
• Complexity of the system’s behavior is typically not uniform.• Some operating regions can be well approximated by a local
linear model, while other regions require a rather fine partitioning.
• In order to obtain an efficient representation with as few rules as possible, the membership functions must be placed such that they capture the non-uniform behavior of the system.
Initialization - Discrete Search Methods
Iterative tree-search algorithms can be applied to decompose the antecedent space into hyper-rectangles by axis-orthogonal splits.
Advantage: Its effectiveness for high-dimensional data and the transparency of the obtained partition.
Drawback: Tree building procedure is sub-optimal (greedy) and hence the number of rules obtained can be quite large
Initialization – Fuzzy Clutering
Based on the similarity, data vectors are clustered such that the data within a cluster are as similar as possible, and data from different clusters are as dissimilar as possible.
The number of clusters in the data can either be determined a priori or sought automatically by using cluster validity measures and merging techniques
Simulation Example
1. A simple fitting problem of a univariate static function
• It demonstrates the typical construction procedure of a neuro-fuzzy model. Numerical results show that an improvement in performance is achieved at the expense of obtaining if-then rules that are not completely relevant as local descriptions of the system.
2. Modeling of a nonlinear dynamic system
• Illustrates that the performance of a neuro-fuzzy model does not necessarily improve after training. This is due to overfitting which in the case of dynamic systems can easily occur when the data only sparsely cover the domains
Simulation Example - Static Function
ANFIS model with linear consequent functionNumber of rules: five rulesConstruction of initial model:
Gustafson-Kessel algorithm
Fit of the function with initial model – local models - membership functions
Simulation Example - Static Function
this initial model can easily be interpreted in terms of the local behavior
It is reasonably accurate (RMS= 0.0258)
ANFIS method, 100 learning epochsanfis function of the MATLAB Fuzzy Logic Toolbox
Fit of the function with fine-tuned model, local models, membership functions
Simulation Example - Static Function
RMS error is about 23 times better than the initial model
Initial model Fine-tuned modelRMS error = 0.0258 RMS error =
0.0011
Simulation Example - Static Function
after learning, the local models are much further from the true local description of the function
Initial model Fine-tuned model
Fine-tuned model are thus less accurate in describing the system locally
Simulation Example – pH Neutralization Process
Neutralization tank Effluent stream
Acid Buffer Base
Influent streams
Neutralization tank pH in the tank
Acid flowrate = cteBuffer flowrate = cteBase stream flowrate
Simulation Example - pH Neutralization Process
Identification and validation data sets:• Simulating the model by Hall and Seborg for random change of
the influent base stream flow rate• N = 499 samples with the sampling time of 15 s.
The process is approximated as a first–order discrete-time NARX model
Simulation Example - pH Neutralization Process
Membership functions
Befor training After training
Simulation Example - pH Neutralization Process
Rules: Initial Rules:
Fine Tuned Rules: after 1000 epochs of hybrid learning using the ANFIS function of the MATLAB Fuzzy Logic Toolbox
Simulation Example - pH Neutralization Process
Overtraining Problem: • Comparision of RMS ERROR befor and after training
• Prediction befor and after training
References
[1] Robert Babuska, “Neuro-Fuzzy Methods for Modeling and Identification”, Recent Advances in Intelligent Paradigms and Application, Springer-Verilag, 2002
[2] Robert Babuska, “Fuzzy Modeling and Identification Toolbox User’s Guide - For Use with MATLAB”, 1998.
[3] A. Trabelsi, F. Lafont, M. Kamoun and G. Enea, “Identification of Nonlinear Systems by Adaptive Fuzzy Takagi-Sugeno Model”, International Journal of Computational Cognition, Volume 2, Number 3, Pages 137–153, September 2004.
[4] Jan Jantzen, “Neurofuzzy Modelling”, Technical University of Denmark, Department of Automation,1998.
THANK YOU VERY MUCH
For your
Attention
Presented by:Ali Maleki
Titration Curve
Neutralization tank pH in the tank
Acid flowrate = cteBuffer flowrate = cteBase stream flowrate