© 2003, Prentice-Hall Chapter 4 - 1
Chapter 4: Machines That Can Learn
Modern Data Warehousing, Mining, and Visualization: Core Concepts
by George M. Marakas
4-1: Fuzzy Logic and Linguistic Ambiguity
Our language is replete with vague and imprecise concepts, and allows for conveyance of meaning through semantic approximations.
These approximations are useful to humans, but do not readily lend themselves to the rule-based reasoning done on computers.
Fuzzy logic is one way computers can handle this ambiguity.
The Basics of Fuzzy Logic
In a “pure” logical comparison, the result is either false (0) or true (1) and can be stored in a binary fashion.
The results of a fuzzy logic operation range from 0 (absolutely false) to 1 (absolutely true), with stops in between.
These operations utilize functions that assign a degree of “membership” in a set.
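Degrees of membership can be combined with the standard min/max (Zadeh) operators; a minimal sketch (the 0.7/0.4 membership values are illustrative, not from the text):

```python
def fuzzy_and(a, b):
    # Membership in the intersection: the minimum of the two degrees
    return min(a, b)

def fuzzy_or(a, b):
    # Membership in the union: the maximum of the two degrees
    return max(a, b)

def fuzzy_not(a):
    # Membership in the complement
    return 1.0 - a

# Someone who is 0.7 "tall" and 0.4 "heavy" is 0.4 "tall AND heavy"
print(fuzzy_and(0.7, 0.4))  # 0.4
```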
A Simple Membership Function Example
The “Tallness” function takes a person’s height and converts it to a numerical scale from 0 to 1.
Here the statement “He is tall” is absolutely false for heights below 5 feet and absolutely true for heights above 7 feet.
[Figure: membership function plot; x-axis: Height in Feet (0 to 10); y-axis: Degree of Tallness (0.00 to 1.00)]
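The “Tallness” function can be sketched in Python. The slides fix only the endpoints (0 below 5 feet, 1 above 7 feet), so the linear ramp between them is an assumption; it does, however, reproduce the 0.5 value for a 6-foot person used later in the chapter:

```python
def tallness(height_ft):
    """Degree of membership in the fuzzy set 'Tall'."""
    if height_ft <= 5.0:
        return 0.0          # absolutely false below 5 feet
    if height_ft >= 7.0:
        return 1.0          # absolutely true above 7 feet
    # Assumed linear ramp between the two endpoints
    return (height_ft - 5.0) / 2.0

print(tallness(6.0))  # 0.5
```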
Fuzziness Versus Probability
There are some subtle differences:
Probability deals with the likelihood that something has a particular property.
Fuzzy logic deals with the degree to which the property is present. For example, a person 6 feet in height has a .5 degree of tallness.
Advantages and Limitations of Fuzzy Logic
Advantages: fuzzy logic allows for the modeling and inclusion of contradiction in a knowledge base. It also increases system autonomy (the rules in the knowledge base function independently of each other).
Disadvantages: in a highly complex system, use of fuzzy logic may become an obstacle to the verification of system reliability. Also, fuzzy reasoning mechanisms cannot learn from their mistakes.
4-2: Artificial Neural Networks
First proposed in the 1940s as an attempt to simulate the human brain’s cognitive learning processes.
They have the ability to model complex, yet poorly understood, problems.
ANNs are simple computer-based programs whose function is to model a problem space based on trial and error.
Learning From Experience
The process is:
1. A piece of data is presented to a neural net, and the ANN “guesses” an output.
2. The prediction is compared with the actual or correct value. If the guess was correct, no action is taken.
3. An incorrect guess causes the ANN to examine itself to determine which parameters to adjust.
4. Another piece of data is presented and the process is repeated.
Fundamentals of Neural Computing
The basic processing element in the human nervous system is the neuron. Networks of these interconnected cells receive information from sensors in the eye, ear, etc.
Information received by a neuron will either excite it (and it will pass a message along the network) or inhibit it (suppressing information flow).
A neuron’s sensitivity can change with the passing of time or the gaining of experience.
Putting a Brain in a Box
An ANN is composed of three basic layers:
1. The input layer receives the data.
2. The internal or hidden layer processes the data.
3. The output layer relays the final result of the net.
Inside the Neurode
The neurode usually has multiple inputs, each input with its own weight or importance.
A bias input can be used to amplify the output.
The state function consolidates the weights of the various inputs into a single value.
The transfer function processes this state value and produces the output.
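The pieces of a neurode can be sketched as follows; the sigmoid transfer function and the sample weights are illustrative assumptions (the text does not prescribe a particular transfer function):

```python
import math

def neurode(inputs, weights, bias):
    # State function: consolidate the weighted inputs (plus bias) into one value.
    state = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Transfer function: a sigmoid squashes the state into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-state))

# Two inputs, each with its own weight, plus a bias input
out = neurode([0.5, 0.8], weights=[0.4, -0.2], bias=0.1)
```

Here the state value is 0.5*0.4 + 0.8*(-0.2) + 0.1 = 0.14, which the sigmoid maps to roughly 0.53.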
Training the Artificial Neural Network
Sending the Net to School: Learning Paradigms
In unsupervised learning paradigms, the ANN receives input data but not any feedback about desired results. It develops clusters of the training records based on data similarities.
In a supervised learning paradigm, the ANN gets to compare its guess against feedback containing the desired results. The most common supervised method is back propagation, which measures the comparison using squared errors.
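The squared-error idea behind back propagation can be sketched in its simplest possible form: one weight on a linear unit, nudged downhill along the error gradient (the learning rate, input, and target are illustrative assumptions):

```python
def squared_error(target, output):
    # The feedback signal a supervised learner minimizes
    return (target - output) ** 2

def backprop_step(w, x, target, lr=0.1):
    # One gradient-descent step for a linear unit y = w * x
    output = w * x
    grad = -2.0 * (target - output) * x   # d(error)/dw
    return w - lr * grad

w = 0.0
for _ in range(50):
    w = backprop_step(w, x=1.0, target=0.8)
# w converges toward 0.8, driving the squared error toward zero
```

A real back-propagation network applies this same gradient logic to every weight in every layer, propagating the error backward from the output layer.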
Benefits Associated with Neural Computing
Avoidance of explicit programming
Reduced need for experts
ANNs are adaptable to changed inputs
No need for a refined knowledge base
ANNs are dynamic and improve with use
Able to process erroneous or incomplete data
Allow for generalization from specific information
Allow inclusion of common sense into the problem-solving domain
Limitations Associated with Neural Computing
ANNs cannot “explain” their inferences
The “black box” nature makes accountability and reliability issues difficult
The repetitive training process is time consuming
Highly skilled machine learning analysts and designers are still a scarce resource
ANN technology pushes the limits of current hardware
ANNs require that “faith” be placed in their output
4-3: Genetic Algorithms and Genetically Evolved Networks
If a problem has any solution at all, then somewhere among the possibilities an optimal solution exists.
The field of management science has been able to tackle increasingly complex problems and find optimal solutions.
This success leads us to tackle problems even more complicated, creating a need for more innovative solution methods.
One such method is the genetic algorithm.
Introduction to Genetic Algorithms
Like neural nets, genetic algorithms (GA) are based on biological theory.
Here, however, GAs find their roots in the evolutionary theories of natural selection and adaptation.
The power of a GA results from the mating of two population members to produce offspring that are sometimes better than the parents.
Basic Components of a Genetic Algorithm
The smallest units of information are dubbed genes, which combine into chromosomes.
After a GA is initialized, it uses a “fitness function” to evaluate each chromosome.
The GA then selects the most fit chromosomes and experiments by combining them.
Next, in the crossover phase, these “good” chromosomes exchange gene information.
Mutation then randomly alters occasional genes, and the resulting chromosomes join the pool.
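The components above can be sketched as a toy GA that evolves 10-bit chromosomes toward all ones; the fitness function, population size, mutation rate, and generation count are illustrative choices, not from the text:

```python
import random

random.seed(0)

def fitness(chrom):
    # Fitness function: evaluate a chromosome (here, count its 1 genes)
    return sum(chrom)

def crossover(a, b):
    # Two "good" chromosomes exchange gene information at a random cut point
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.05):
    # Mutation: occasionally flip a gene
    return [1 - g if random.random() < rate else g for g in chrom]

# Initialize a random population of 20 chromosomes, each with 10 genes
pop = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                       # keep the most fit chromosomes
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(10)]
    pop = parents + children                 # mutated offspring join the pool

best = max(pop, key=fitness)
```

Because the fittest parents are retained each generation, the best chromosome in the pool can only improve or hold steady.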
Basic Process Flow of a Genetic Algorithm
Benefits and Limitations Associated With GAs
Population size is a critical factor in the speed of finding a solution, but at least this speed is relatively easy to predict.
Crossover and mutation are powerful ideas, but they should be applied neither too frequently nor too sparingly.
One advantage is that a GA is always guaranteed to produce at least a “reasonable” solution.
GAs can also be applied to problems we have no idea how to solve directly.
Finally, their power comes from simple concepts, not from a complicated algorithmic procedure.
4-4: Applications of Machines That Learn
Nippon Steel: blast furnace control system that uses ANNs
Daiwa Securities and NEC: stock price chart pattern recognition
Mitsubishi Electric: neural net and optical scanning to recognize text
Nippon Oil: neural net used for diagnosis of pump vibration
Credit scoring on loan applications, both to individuals and corporations
The Future of Machine Learning
Already, artificial neural nets exceed human capability on isolated tasks.
Theoretically, a computer can process data a million times faster than a human.
Fortunately for us, humans remain far better at acquiring data; computers have nothing like the five senses.