Fuzzy Models for Pattern Recognition

Upload: joy-adelia-stafford

Post on 14-Dec-2015

TRANSCRIPT

  • Slide 1

Fuzzy Models for Pattern Recognition
1. Def.:
a) A field concerned with machine recognition of meaningful regularities in noisy or complex environments.
b) The search for structure in data.
2. Categories:
a) Numerical pattern recognition.
b) Syntactic pattern recognition. The pattern primitives are themselves considered to be labels of fuzzy sets (sharp, fair, gentle). The structural relations among the subpatterns may be fuzzy, so that the formal grammar is fuzzified by weighted production rules.

  • Slide 2

  • Slide 3
3. Elements of a numerical pattern recognition system:
1) Process description: data space → pattern space.
Data: drawn from any physical process or phenomenon.
Pattern space (structure): the manner in which this information can be organized so that relationships between the variables in the process can be identified.
2) Feature analysis: → feature space.
The feature space has a much lower dimension than the data space, which is essential for applying efficient pattern search techniques. Feature analysis searches for internal structure in data items, that is, for features or properties of the data which allow us to recognize and display their structure.

  • Slide 4
3) Cluster analysis: search for structure in data sets.
4) Classifier design: → classification space. Search for structure in data spaces. A classifier itself is a device, means, or algorithm by which the data space is partitioned into c decision regions.

  • Slide 5
4. Fuzzy Clustering
There is no universally optimal cluster criterion: distance, connectivity, intensity, ...
Hierarchical clustering
1) Generates a hierarchy of partitions by means of a successive merging or splitting of clusters.
2) Can be represented by a dendrogram, which might be used to estimate an appropriate number of clusters for other clustering methods.
3) On each level of merging or splitting a locally optimal strategy can be used, without taking into consideration policies used on preceding levels.
4) The methods are not iterative; they cannot change the assignment of objects to clusters made on preceding levels.
5) Advantage: conceptual and computational simplicity.
6) Corresponds to the determination of similarity trees.

  • Slide 6

  • Slide 7

  • Slide 8
Graph-theoretic clustering
1) Based on some kind of connectivity of the nodes of a graph representing the data set.
2) The clustering strategy is often to break edges in a minimum spanning tree to form subgraphs.
3) Fuzzy data set → fuzzy graph.
4) Let G = [V, R] be a symmetric fuzzy graph. Then the degree of a vertex v is defined as $d(v) = \sum_{u \neq v} \mu_R(u, v)$. The minimum degree of G is $\delta(G) = \min_{v \in V} \{d(v)\}$.

  • Slide 9

  • Slide 10

  • Slide 11
5) Let G be a symmetric fuzzy graph. G is said to be connected if a condition holds for each pair of vertices u and v in V [condition missing in transcript]. A threshold-dependent notion of connectedness is then defined for graphs that are connected [symbols missing in transcript].
6) Let G be a symmetric fuzzy graph. Clusters are then defined as maximal connected subgraphs of G (connected in the threshold-dependent sense of item 5).
Objective-function clustering
1) The most precise formulation of the clustering criterion.
2) Local extrema of the objective function are defined as optimal clusterings.
3) Bezdek's c-means algorithm.

  • Slide 12
Objective-function clustering (continued)
4) Butterfly example.
5) Similarity measure: the distance between two objects, $d: X \times X \to \mathbb{R}^+$, which satisfies
$d(x_k, x_l) = d_{kl} \ge 0$,
$d_{kl} = 0 \Leftrightarrow x_k = x_l$,
$d_{kl} = d_{lk}$
($x_k$, $x_l$ are points in the p-dimensional space).

  • Slide 13
2) Let $X = \{x_1, \dots, x_n\}$ be any finite set. $V_{cn}$ is the set of all real $c \times n$ matrices, and $2 \le c < n$ is an integer. The matrix $U = [u_{ik}] \in V_{cn}$ is called a crisp c-partition if it satisfies the following conditions:
$u_{ik} \in \{0, 1\}$, $1 \le i \le c$, $1 \le k \le n$;
$\sum_{i=1}^{c} u_{ik} = 1$, $1 \le k \le n$;
$0 < \sum_{k=1}^{n} u_{ik} < n$, $1 \le i \le c$.
The set of all matrices that satisfy these conditions is called $M_c$.
Clustering:
1) Each partition of the set X into crisp or fuzzy subsets $S_i$ ($i = 1, \dots, c$) can fully be described by an indicator function (crisp case) or a membership function (fuzzy case).

  • Slide 14

  • Slide 15
3) Let $X = \{x_1, \dots, x_n\}$ be any finite set. $V_{cn}$ is the set of all real $c \times n$ matrices, and $2 \le c < n$ is an integer. The matrix $U = [u_{ik}] \in V_{cn}$ is called a fuzzy c-partition if it satisfies the following conditions:
$u_{ik} \in [0, 1]$, $1 \le i \le c$, $1 \le k \le n$;
$\sum_{i=1}^{c} u_{ik} = 1$, $1 \le k \le n$;
$0 < \sum_{k=1}^{n} u_{ik} < n$, $1 \le i \le c$.
The set of all matrices that satisfy these conditions is called $M_{fc}$.
4) Cluster center: $v_i = (v_{i1}, \dots, v_{ip})$ represents the location of a cluster. Vector of all cluster centers: $v = (v_1, \dots, v_c)$.

  • Slide 16
5) Variance criterion: measures the dissimilarity between the points in a cluster and its cluster center by the Euclidean distance. Minimize the sum of the variances of all variables j in each cluster i (the sum of the squared Euclidean distances).
For a crisp c-partition ($U \in M_c$):
$\min z(U, v) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik} \lVert x_k - v_i \rVert^2$

  • Slide 17
For a fuzzy c-partition ($U \in M_{fc}$, $m > 1$):
$\min z(U, v) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \lVert x_k - v_i \rVert^2$

  • Slide 18
6) Fuzzy c-means algorithm
Step 1: Choose c and m. Initialize $U^0 \in M_{fc}$, set r = 0.
Step 2: Calculate the c fuzzy cluster centers $\{v_i^r\}$ by using $U^r$ from Eq. 1.
Step 3: Calculate the new membership matrix $U^{r+1}$ by using $\{v_i^r\}$.
Step 4: Calculate the change between $U^{r+1}$ and $U^r$. If it exceeds the chosen threshold, set r = r + 1 and go to Step 2; otherwise, stop.

  • Slide 19

  • Slide 20

  • Slide 21

  • Slide 22

  • Slide 23
Decision Making
1. Characterized by:
1) A set of decision alternatives (decision space; constraints);
2) A set of states of nature (state space);
3) A utility (objective) function, which orders the results according to their desirability.
2. Fuzzy decision model: Bellman and Zadeh [1970]
Consider a situation of decision making under certainty, in which the objective function as well as the constraints are fuzzy. The decision can be viewed as the intersection of the fuzzy constraints and the fuzzy objective function.

  • Slide 24
The relationship between constraints and objective functions in a fuzzy environment is therefore fully symmetric, that is, there is no longer a difference between the former and the latter. The interpretation of the intersection depends on the context.
Intersection (minimum): no positive compensation (trade-off) between the membership degrees of the fuzzy sets in question.
Union (max): leads to a full compensation for lower membership degrees.
Decision = Confluence of Goals and Constraints.

  • Slide 25
Neither the noncompensatory "and" (min, product, Yager conjunction) nor the fully compensatory "or" (max, algebraic sum, Yager disjunction) is appropriate to model the aggregation of fuzzy sets representing managerial decisions.
Def.: Let $\mu_{C_i}(x)$, $i = 1, \dots, m$, $x \in X$, be the membership functions of the constraints defining the decision space, and $\mu_{G_j}(x)$, $j = 1, \dots, n$, $x \in X$, the membership functions of the objective functions or goals. A decision is then defined by its membership function [formula missing in transcript], in which the aggregation operators are appropriate, possibly context-dependent aggregators.

  • Slide 26
Individual decision making

  • Slide 27
Multiperson decision making
Differences from individual decision making:
1) Each person places a different ordering on the alternatives.
2) Each person has access to different information.
n-person game theories: both. Team theories: the second. Group decision theories: the first.

  • Slide 28
Multiperson decision making
Individual preference ordering: [formula missing in transcript]
Social choice function: [formula missing in transcript], giving the degree of group preference of $x_i$ over $x_j$.
A procedure is then needed to arrive at the unique crisp ordering that constitutes the group choice.

  • Slide 29
3. Fuzzy Linear Programming
Classical model:
maximize $f(x) = c^T x$
such that $Ax \le b$, $x \ge 0$,
with $c, x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $A \in \mathbb{R}^{m \times n}$.
Modifications for fuzzy LP:
1) Do not maximize or minimize the objective function; the decision maker might want to reach some aspiration levels which might not even be definable crisply, e.g., "improve the present cost situation considerably".
2) The constraints might be vague: coefficients, relations.
3) The decision maker might accept small violations of constraints but might also attach different degrees of importance to violations of different constraints.
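
Item 3 above (tolerating small constraint violations) is often modeled with a linear membership function that falls from 1 to 0 as the violation grows. The following is a minimal sketch under that assumption, not the formulation from the slides: the names `soft_leq` and `decision_degree`, the tolerance parameter `tol`, and the sample numbers are all hypothetical. The degrees of the individual constraints are combined with the minimum, following the intersection view of a fuzzy decision described earlier.

```python
from typing import List, Sequence, Tuple

# One soft constraint: (coefficients a, right-hand side b, tolerance).
# All three names are illustrative, not taken from the slides.
SoftConstraint = Tuple[Sequence[float], float, float]


def soft_leq(lhs: float, rhs: float, tol: float) -> float:
    """Degree in [0, 1] to which lhs <= rhs holds, tolerating violations up to tol."""
    if tol <= 0:
        raise ValueError("tol must be positive")
    if lhs <= rhs:
        return 1.0  # constraint fully satisfied
    if lhs >= rhs + tol:
        return 0.0  # violation beyond the accepted tolerance
    return 1.0 - (lhs - rhs) / tol  # assumed linear decrease in between


def decision_degree(x: Sequence[float], constraints: List[SoftConstraint]) -> float:
    """Aggregate all soft constraints with the minimum (fuzzy intersection)."""
    degrees = []
    for coeffs, rhs, tol in constraints:
        lhs = sum(a * xi for a, xi in zip(coeffs, x))
        degrees.append(soft_leq(lhs, rhs, tol))
    return min(degrees)


constraints = [([1.0, 1.0], 10.0, 2.0), ([2.0, 1.0], 14.0, 4.0)]
print(decision_degree([4.0, 5.0], constraints))  # 1.0: both constraints hold crisply
print(decision_degree([5.0, 6.0], constraints))  # 0.5: both are violated by half the tolerance
```

Giving each constraint its own tolerance is one simple way to express that violations of different constraints matter to different degrees; other membership shapes or aggregators could be substituted.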

  • Slide 30
4. Symmetric fuzzy LP:
Find x such that
$c^T x \gtrsim z$ (aspiration level),
$Ax \lesssim b$,
$x \ge 0$.
The membership function of the fuzzy set "decision" of the above model is [formula missing in transcript], where $\mu_i(x)$ can be interpreted as the degree to which x satisfies the fuzzy inequality $B_i x \lesssim d_i$.
Crisp optimal solution: [formula missing in transcript]

  • Slide 31
Membership function, e.g.: [formula missing in transcript]
Optimal solution: [formula missing in transcript], that is, maximize $\lambda$

  • Slide 32
such that [constraints missing in transcript].
$(\lambda, x_0)$: the maximizing solution can be found by solving one crisp LP with only one more variable and one more constraint.

  • Slide 33
Multistage Decision Making
Task-oriented control belongs to this kind of decision-making problem.
Fuzzy decision making → fuzzy dynamic programming → a decision problem regarding a fuzzy finite-state automaton, in which:
1) The state-transition relation is crisp.
2) The next internal state is also utilized as output.

  • Slide 34
[Figure: two block diagrams of an automaton with one-time storage S. Crisp case: input $x^t$, state $z^t$, next state $z^{t+1}$. Fuzzy case: input $A^t$, state $C^t$, next state $C^{t+1}$.]

  • Slide 35
Multistage Decision Making
Fuzzy input states as constraints: $A^0, A^1, \dots$
Fuzzy internal state as goal: $C^N$
Principle of optimality: An optimal decision sequence has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

  • Slide 36
Multistage Decision Making

  • Slide 37
5. Fuzzy LP with crisp objective function
Constraints: define the decision space in a crisp or fuzzy way.
Objective function: induces an order of the decision alternatives.
Problem: the determination of an extremum of a crisp function over a fuzzy domain.
Approaches:
1) The determination of the fuzzy set "decision".
2) The determination of a crisp maximizing decision by aggregating the objective function, after appropriate transformations, with the constraints.

  • Slide 38
Fuzzy decision
1) The decision space is (partially) fuzzy.
2) Compute the corresponding optimal values of the objective function for all $\alpha$-level sets of the decision space.
3) Consider as the fuzzy set "decision" the optimal values of the objective function, with the degree of membership equal to the corresponding $\alpha$-level of the solution space.
Crisp maximizing decision.

  • Slide 39
6. Fuzzy Multi-Criteria Analysis
Many problems cannot be handled with a single criterion or a single objective function.
Multi-Objective Decision Making (MODM): concentrates on continuous decision spaces.
Multi-Attribute Decision Making (MADM): focuses on problems with discrete decision spaces.

  • Slide 40
MODM: also called the vector-maximum problem.
Def.: "maximize" $\{Z(x) \mid x \in X\}$, where $Z(x) = (z_1(x), \dots, z_k(x))$ is a vector-valued function of $x \in \mathbb{R}^n$ into $\mathbb{R}^k$ and X is the solution space.
Stages in vector-maximum optimization:
1) The determination of efficient solutions.
2) The determination of an optimal compromise solution.
Efficient solution: $x_a$ is an efficient solution if there is no $x_b \in X$ such that $z_i(x_b) \ge z_i(x_a)$ for $i = 1, \dots, k$ and $z_i(x_b) > z_i(x_a)$ for at least one $i \in \{1, \dots, k\}$.
Complete solution: the set of all efficient solutions.
Example: [missing in transcript]

  • Slide 41
MADM:
Def.: Let $X = \{x_i \mid i = 1, \dots, n\}$ be a set of decision alternatives and $G = \{g_j \mid j = 1, \dots, m\}$ a set of goals according to which the desirability of an action is judged. Determine the optimal alternative $x_0$ with the highest degree of desirability with respect to all relevant goals $g_j$.
Stages:
1) The aggregation of the judgments with respect to all goals and per decision alternative.
2) The rank ordering of the decision alternatives according to the aggregated judgments.

  • Slide 42
Fuzzy MADM: Yager model
Let $X = \{x_i \mid i = 1, \dots, n\}$ be a set of decision alternatives. The goals are represented by the fuzzy sets $G_j$, $j = 1, \dots, m$. The importance (weight) of goal j is expressed by $w_j$. The attainment of goal $G_j$ by alternative $x_i$ is expressed by the degree of membership $\mu_{G_j}(x_i)$. The decision is defined as the intersection of all fuzzy goals, that is, $D = G_1 \cap G_2 \cap \dots \cap G_m$. The optimal alternative is defined as the one achieving the highest degree of membership in D.
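
The Yager model on Slide 42 can be sketched for a small discrete example. The transcript does not show how the weights $w_j$ enter the intersection, so the sketch below assumes they act as exponents on the membership degrees before the minimum is taken, which is how this model is commonly presented; treat that as an assumption. The function names and the sample membership values are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def decision_memberships(
    goal_degrees: Dict[str, Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Membership of each alternative in D = G_1 ∩ ... ∩ G_m, using min as intersection.

    goal_degrees maps an alternative to its degrees of attainment of goals 1..m.
    weights (assumption): w_j is applied as an exponent to the j-th degree.
    """
    result: Dict[str, float] = {}
    for alternative, degrees in goal_degrees.items():
        w: List[float] = list(weights) if weights is not None else [1.0] * len(degrees)
        if len(w) != len(degrees):
            raise ValueError("one weight per goal is required")
        result[alternative] = min(d ** wj for d, wj in zip(degrees, w))
    return result


def best_alternative(
    goal_degrees: Dict[str, Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Alternative with the highest degree of membership in D."""
    d = decision_memberships(goal_degrees, weights)
    return max(d, key=lambda alt: d[alt])


# Hypothetical degrees of attainment of three goals by three alternatives.
degrees = {
    "x1": [0.7, 0.3, 0.9],
    "x2": [0.5, 0.6, 0.8],
    "x3": [0.4, 0.9, 0.5],
}
print(best_alternative(degrees))                   # x2 (unweighted minimum is 0.5)
print(best_alternative(degrees, [2.0, 0.5, 1.0]))  # x1 under these assumed weights
```

With exponent weights, a weight above 1 lowers a membership degree below 1 and so makes that goal more binding under the minimum, while a weight below 1 relaxes it.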

  • Slide 43
FUZZY IMAGE TRANSFORM CODING
1. Transform coding: a transformation, perhaps an energy-preserving transform such as the discrete cosine transform (DCT), converts an image to uncorrelated data. (Keep the transform coefficients with high energy and discard the coefficients with low energy, and thus compress the image data.)
2. High-definition television (HDTV) systems have reinvigorated the image-coding field. (TV images correlate more highly in the time domain than in the spatial domain. Such time correlation permits even higher compression than we can achieve with still-image coding.)

  • Slide 44
3. Adaptive cosine transform coding [Chen, 1977] produces high-quality compressed images at a rate of less than 1 bit/pixel.
1) Classifies subimages into four classes according to their AC energy level and encodes each class with different bit maps.
2) Assigns more bits to a subimage if the subimage contains much detail (large AC energy), and fewer bits if it contains less detail (small AC energy).
3) DC energy refers to the constant background intensity in an image and behaves as an average.
4) AC energy measures intensity deviations about the background DC average, so the AC energy behaves as a sample-variance statistic.

  • Slide 45
[Figure 10.1: Block diagram of adaptive cosine transform coding. Blocks: DCT, subimage classification, coding, decoding, inverse DCT.]

  • Slide 46

  • Slide 47
4. Selection of quantizing fuzzy-set values
1) Use percentage-scaled values of $T_i$ and $L_i$, scaled by the maximum possible AC power value.
2) Compute the maximum AC power $T_{max}$ from the DCT coefficients of a subimage filled with random numbers from 0 to 255.
3) Calculate the arithmetic average AC powers for each class.

  • Slide 48
ADAPTIVE FAM SYSTEMS FOR TRANSFORM CODING
1. Classify subimages into four fuzzy classes B: HI, MH, ML, LO. (Encode the HI subimages with more bits and the LO subimages with fewer bits.)
2. The four fuzzy sets BG, MD, SL, and VS quantize the total AC power T of a subimage.
3. L (low-frequency AC power) assumes only the two fuzzy-set values SM and LG.

  • Slide 49
4. Fuzzy transform image coding uses common-sense fuzzy rules for subimage classification.
1) Fuzzy associative memory (FAM) rules encode structured knowledge as fuzzy associations.
2) The fuzzy association $(A_i, B_i)$ represents the linguistic rule "IF X is $A_i$, THEN Y is $B_i$".
3) In fuzzy transform image coding, $A_i$ represents the AC energy distribution of a subimage, and $B_i$ denotes its class membership.
4) Product-space clustering estimates FAM rules from training data generated by the Chen system.

  • Slide 50
5) The resulting FAM system estimates the nonlinear subimage classification function $f: E \to m$, where E denotes the AC energy distribution of a subimage and m denotes the class membership of a subimage.
6) We added a FAM rule to the FAM system if a DCL-trained synaptic vector fell in the FAM cell. (DCL-based product-space clustering estimated the five FAM rules 1, 2, 6, 7, and 8. We added three common-sense FAM rules, 3, 4, and 5, to cover the whole input space.)
7) FAM rule 1 (BG, LG; HI) represents the association: IF the total AC power T is BG AND the low-frequency AC power L is LG, THEN encode the subimage with the class B corresponding to HI.

  • Slide 51
8) The Chen system sorts subimages according to their AC-energy content to produce the subimage-classification mapping. (This requires comparatively heavy computation.)
9) The FAM system does not sort subimages. Once we have trained the FAM system, it classifies subimages with almost no computation. (FAM only adds and multiplies comparatively few real numbers.)

  • Slide 52
5. Product-Space Clustering to Estimate FAM Rules
1) Product-space clustering with competitive learning adaptively quantizes pattern clusters in the input-output product space $\mathbb{R}^n$.
2) Stochastic competitive learning systems are neural adaptive vector quantization (AVQ) systems. p neurons compete for the activation induced by randomly sampled input-output patterns.
The corresponding synaptic fan-in vectors $m_j$ adaptively quantize the pattern space $\mathbb{R}^n$. The p synaptic vectors $m_j$ define the p columns of a synaptic connection matrix M.

  • Slide 53
3) Fuzzy rules $(T_i, L_i; B_i)$ define clusters or FAM cells in the input-output product space $\mathbb{R}^3$.
4) Define FAM-cell edges with the nonoverlapping intervals of the fuzzy-set values. (There are 32 possible FAM cells in total, and thus 32 possible FAM rules.)
5) Differential competitive learning (DCL) classified each of the 256 input-output data vectors generated from the Chen system into one of the 32 FAM cells.

  • Slide 54
6. Simulation: Lenna image, F-16 image
1) FAM also performed well for the F-16 image.
2) When we encode multiple images with fixed bit maps, we cannot optimize or tune the bit maps to a specific image.
3) FAM encoding performed slightly better (had a larger signal-to-noise ratio) than Chen encoding did, and maintained a slightly higher compression ratio (fewer bits/pixel).
4) FAM reduces side information and uses only 8 FAM rules to achieve 16-to-1 image compression.
5) If a system leaves numerical I/O footprints in the data, an AFAM system can leave similar footprints in similar contexts. Judicious fuzzy engineering can then refine the system and sharpen the footprints.

  • Slide 55

  • Slide 56

  • Slide 57

  • Slide 58

  • Slide 59

  • Slide 60

  • Slide 61
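
The adaptive vector quantization idea from Slide 52 (p synaptic vectors $m_j$ competing for sampled patterns) can be illustrated with plain winner-take-all competitive learning. This is a simplified stand-in, not the differential competitive learning (DCL) procedure the slides refer to, and it uses deterministic passes over the data rather than random sampling. The learning rate, epoch count, initial vectors, and sample points are all hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def competitive_learning(
    samples: Sequence[Sequence[float]],
    vectors: Sequence[Sequence[float]],
    rate: float = 0.1,
    epochs: int = 50,
) -> List[List[float]]:
    """Winner-take-all update: only the closest vector m_j moves toward each sample."""
    m = [list(v) for v in vectors]  # copy so the caller's initial vectors stay unchanged
    for _ in range(epochs):
        for x in samples:
            # The "winning neuron" is the synaptic vector nearest to the sample.
            j = min(range(len(m)), key=lambda idx: squared_distance(m[idx], x))
            # Move the winner a fraction `rate` of the way toward the sample.
            m[j] = [mj + rate * (xi - mj) for mj, xi in zip(m[j], x)]
    return m


# Two well-separated clusters of hypothetical input-output pairs.
samples = [(0.0, 0.1), (1.0, 0.9), (0.1, 0.0), (0.9, 1.0)]
trained = competitive_learning(samples, [(0.2, 0.2), (0.8, 0.8)])
print(trained)  # each vector ends up inside its own cluster
```

After training, each synaptic vector sits inside one cluster. In the FAM setting, the cell of the product space that contains a trained vector would then suggest a rule, along the lines described in item 6 on Slide 50.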