CS 484 – Artificial Intelligence 1
Announcements
• Project 1 is due Tuesday, October 16
  • Send me the name of your Konane bot
• Midterm is Thursday, October 18
  • Bring one 8.5x11 piece of paper with anything you want written on it
• Book Review is due Thursday, October 25
• Andrew's Current Event
• Volunteer for Tuesday?
Effects of Programs that Learn
• Application areas
  • Learning from medical records which treatments are most effective for new diseases
  • Houses learning from experience to optimize energy costs based on the usage patterns of their occupants
  • Personal software assistants learning the evolving interests of users in order to highlight especially relevant stories from the online morning newspaper
Effective Applications of Learning
• Speech recognition
  • Learning-based systems outperform all other approaches that have been attempted to date
• Data mining
  • Learning algorithms are being used to discover valuable knowledge from large commercial databases
  • Detect fraudulent use of credit cards
• Play games
  • Play backgammon at levels approaching the performance of human world champions
Learning Programs
• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
• Examples
  • A checkers learning problem:
    • Task T: playing checkers
    • Performance measure P: percent of games won against opponents
    • Training experience E: playing practice games against itself
  • A handwriting recognition learning problem:
    • Task T: recognizing and classifying handwritten words within images
    • Performance measure P: percent of words correctly classified
    • Training experience E: a database of handwritten words with given classifications
Designing a Learning System
• Consider designing a program to learn to play checkers, with the goal of entering it in the world checkers tournament
• Requires the following steps
  • Choosing the Training Experience
  • Choosing the Target Function
  • Choosing the Representation of the Target Function
  • Choosing the Function Approximation Algorithm
Choosing the Training Experience (1)
• Will the training experience provide direct or indirect feedback?
  • Direct feedback: the system learns from examples of individual checkers board states and the correct move for each
  • Indirect feedback: move sequences and final outcomes of various games played
  • Credit assignment problem: the value of early states must be inferred from the outcome
• Degree to which the learner controls the sequence of training examples
  • Teacher selects informative boards and gives the correct move
  • Learner proposes board states that it finds particularly confusing; teacher provides the correct moves
  • Learner controls board states and (indirect) training classifications
Choosing the Training Experience (2)
• How well the training experience represents the distribution of examples over which the final system performance P will be measured
  • If training the checkers program consists only of games played against itself, it may never encounter crucial board states that are likely to be played by the human checkers champion
• Most machine learning theory rests on the assumption that the distribution of training examples is identical to the distribution of test examples
Partial Design of Checkers Learning Program
• A checkers learning problem:
  • Task T: playing checkers
  • Performance measure P: percent of games won in the world tournament
  • Training experience E: games played against itself
• Remaining choices
  • The exact type of knowledge to be learned
  • A representation for this target knowledge
  • A learning mechanism
Choosing the Target Function (1)
• Assume that you can determine legal moves
• Program needs to learn the best move from among the legal moves
  • Defines a large search space known a priori
  • Target function: ChooseMove : B → M
• ChooseMove is difficult to learn given indirect training
• Alternative target function
  • An evaluation function that assigns a numerical score to any given board state
  • V : B → ℝ (where ℝ is the set of real numbers)
• V(b) for an arbitrary board state b in B
  • if b is a final board state that is won, then V(b) = 100
  • if b is a final board state that is lost, then V(b) = -100
  • if b is a final board state that is drawn, then V(b) = 0
  • if b is not a final state, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game
Choosing the Target Function (2)
• V(b) gives a recursive definition for board state b
  • Not usable because it is not efficient to compute except in the first three trivial cases
  • Nonoperational definition
• Goal of learning is to discover an operational description of V
• Learning the target function is often called function approximation
  • The learned approximation is referred to as V̂
Choosing a Representation for the Target Function
• Choice of representation involves trade-offs
  • Pick a very expressive representation to allow close approximation to the ideal target function V
  • The more expressive the representation, the more training data is required to choose among alternative hypotheses
• Use a linear combination of the following board features:
  • x1: the number of black pieces on the board
  • x2: the number of red pieces on the board
  • x3: the number of black kings on the board
  • x4: the number of red kings on the board
  • x5: the number of black pieces threatened by red (i.e. which can be captured on red's next turn)
  • x6: the number of red pieces threatened by black
$\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$
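As a minimal sketch of this representation, the linear evaluation function can be written in a few lines of Python. The function name `v_hat`, the list layout (bias w0 first, then w1 to w6), and the example weights are assumptions made for illustration; they are not taken from the slides.

```python
from typing import Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation: w0 + w1*x1 + ... + w6*x6.

    `weights` holds (w0, w1, ..., w6); `features` holds (x1, ..., x6).
    """
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight than features (w0 is the bias)")
    bias, rest = weights[0], weights[1:]
    return bias + sum(w * x for w, x in zip(rest, features))


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
example_weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
# A board with 3 black pieces, 0 red pieces, 1 black king, nothing threatened.
example_features = [3, 0, 1, 0, 0, 0]
example_score = v_hat(example_weights, example_features)
```

With these made-up weights the score is 1·3 + 2·1 = 5. Learning consists of finding weights that make such scores track the true value of the board.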
Partial Design of Checkers Learning Program
• A checkers learning problem:
  • Task T: playing checkers
  • Performance measure P: percent of games won in the world tournament
  • Training experience E: games played against itself
  • Target function: V : Board → ℝ
  • Target function representation:
    $\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$
Choosing a Function Approximation Algorithm
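• To learn V̂ we require a set of training examples, each describing a board state b and the training value Vtrain(b)
• Each training example is an ordered pair $\langle b, V_{train}(b) \rangle$, for example:
  $\langle \langle x_1 = 3, x_2 = 0, x_3 = 1, x_4 = 0, x_5 = 0, x_6 = 0 \rangle, 100 \rangle$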
Estimating Training Values
• Need to assign specific scores to intermediate board states
• Approximate the value of an intermediate board state b using the learner's current approximation for the next board state following b
• Simple and successful approach
  • More accurate for states closer to end states
$V_{train}(b) \leftarrow \hat{V}(Successor(b))$
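The estimation rule can be sketched as follows. This is an illustrative sketch under stated assumptions: a game is given as a list of feature vectors (x1 to x6) for the boards the learner saw, in order; the last board is final and receives the true outcome (100, -100, or 0); and `estimate_training_values`, `v_hat`, and the demo data are hypothetical names and values, not part of the slides.

```python
from typing import List, Sequence, Tuple

Features = Sequence[float]


def v_hat(weights: Sequence[float], features: Features) -> float:
    """Current linear approximation: w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def estimate_training_values(
    trace: List[Features], final_value: float, weights: Sequence[float]
) -> List[Tuple[Features, float]]:
    """Pair every board in a game trace with a training value.

    The final board gets the known outcome; every earlier board b gets
    v_hat of the board that follows it, i.e. V_train(b) <- V_hat(Successor(b)).
    """
    examples = []
    for i, board in enumerate(trace):
        is_final = i == len(trace) - 1
        target = final_value if is_final else v_hat(weights, trace[i + 1])
        examples.append((board, target))
    return examples


# Hypothetical weights and a three-board game that black wins.
demo_weights = [0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0]
demo_trace = [[3, 3, 0, 0, 0, 0], [3, 2, 0, 0, 0, 0], [3, 0, 0, 0, 0, 0]]
demo_examples = estimate_training_values(demo_trace, 100.0, demo_weights)
```

Only the last training value comes from the real outcome. The earlier ones are bootstrapped from the current weights, which is why estimates tend to be more reliable near the end of the game.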
Adjusting the Weights
• Choose the weights wi to best fit the set of training examples
• Minimize the squared error E between the training values and the values predicted by the hypothesis
  $E = \sum_{\langle b, V_{train}(b) \rangle \in \text{training examples}} \left( V_{train}(b) - \hat{V}(b) \right)^2$
• Require an algorithm that
  • will incrementally refine the weights as new training examples become available
  • will be robust to errors in these estimated training values
• Least Mean Squares (LMS) is one such algorithm
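For concreteness, the squared error over a set of training examples might be computed as in the sketch below. It assumes each example is a pair of a feature vector (x1 to x6) and its training value; the names `squared_error`, `v_hat`, and the demo data are illustrative assumptions.

```python
from typing import Iterable, Sequence, Tuple


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Current linear approximation: w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def squared_error(
    weights: Sequence[float],
    examples: Iterable[Tuple[Sequence[float], float]],
) -> float:
    """Sum of (V_train(b) - V_hat(b))^2 over all training examples."""
    return sum(
        (v_train - v_hat(weights, features)) ** 2
        for features, v_train in examples
    )


# Two hypothetical final boards: one won by black, one lost by black.
demo_examples = [([3, 0, 1, 0, 0, 0], 100.0), ([0, 3, 0, 1, 0, 0], -100.0)]
zero_weights = [0.0] * 7
demo_error = squared_error(zero_weights, demo_examples)
```

With all weights at zero every prediction is 0, so each example contributes 100² to the error.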
LMS Weight Update Rule
• For each training example $\langle b, V_{train}(b) \rangle$
  • Use the current weights to calculate $\hat{V}(b)$
  • For each weight wi, update it as
    $w_i \leftarrow w_i + \eta \left( V_{train}(b) - \hat{V}(b) \right) x_i$
  • where η is a small constant (e.g. 0.1)
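A single LMS step could look like the sketch below. One assumption goes beyond the slide: the bias weight w0 is updated as if it had a constant feature x0 = 1. The names `lms_update` and `v_hat` and the demo values are hypothetical.

```python
from typing import List, Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Current linear approximation: w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(
    weights: Sequence[float],
    features: Sequence[float],
    v_train: float,
    eta: float = 0.1,
) -> List[float]:
    """Return new weights after one step: w_i <- w_i + eta * error * x_i."""
    error = v_train - v_hat(weights, features)
    # Assumption: x0 = 1 so that the bias w0 follows the same rule.
    xs = [1.0, *features]
    return [w + eta * error * x for w, x in zip(weights, xs)]


# One update on a hypothetical winning board, starting from zero weights.
start = [0.0] * 7
board = [3, 0, 1, 0, 0, 0]
updated = lms_update(start, board, v_train=100.0, eta=0.1)
```

Weights whose feature is zero on this board are left unchanged, and the size of each change is proportional to both the error and the feature value. Here eta = 0.1 overshoots the target slightly, which illustrates why the constant is kept small.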
Final Design
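[Figure: final design of the checkers learning program]
• Modules: Experiment Generator, Performance System, Critic, Generalizer
• Data passed between modules: new problem (initial game board), solution trace (game history), training examples $\{\langle b_1, V_{train}(b_1) \rangle, \langle b_2, V_{train}(b_2) \rangle, \ldots\}$, hypothesis V̂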
Summary of Design Choices
• Determine Type of Training Experience: games against itself, table of correct moves, games against experts, …
• Determine Target Function: Board → value, Board → move, …
• Determine Representation of Learned Function: linear function of six features, polynomial, artificial neural network, …
• Determine Learning Algorithm: gradient descent, linear programming, …
• Complete Design
Training Classification Problems
• Many learning problems involve classifying inputs into a discrete set of possible categories.
• Learning is only possible if there is a relationship between the data and the classifications.
• Training involves providing the system with data which has been manually classified.
• Learning systems use the training data to learn to classify unseen data.
Rote Learning
• A very simple learning method.
• Simply involves memorizing the classifications of the training data.
• Can only classify previously seen data; unseen data cannot be classified by a rote learner.
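A rote learner is essentially a lookup table. The sketch below is one possible illustration; the class name `RoteLearner` and the choice to return `None` for unseen inputs are assumptions, not something the slides specify.

```python
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple


class RoteLearner:
    """Memorizes training classifications and nothing else."""

    def __init__(self) -> None:
        self._memory: Dict[Tuple[Hashable, ...], str] = {}

    def train(self, examples: Iterable[Tuple[Sequence[Hashable], str]]) -> None:
        # Store each instance verbatim together with its label.
        for instance, label in examples:
            self._memory[tuple(instance)] = label

    def classify(self, instance: Sequence[Hashable]) -> Optional[str]:
        # Unseen instances cannot be classified: return None.
        return self._memory.get(tuple(instance))


learner = RoteLearner()
learner.train([
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
])
seen = learner.classify(("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"))
unseen = learner.classify(("Sunny", "Warm", "High", "Weak", "Cool", "Same"))
```

Here `seen` is the memorized label and `unseen` is `None`, which is the lack of generalization the slide describes.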
Concept Learning
• Concept learning involves determining a mapping from a set of input variables to a Boolean value.
• Such methods are known as inductive learning methods.
• If a function can be found which maps training data to correct classifications, then it will also work well for unseen data – hopefully!
• This process is known as generalization.
Example Learning Task
• Learn the "days on which my friend Aldo enjoys his favorite water sport"
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
Hypotheses
• A hypothesis is a vector of constraints, one for each attribute. Each constraint may
  • indicate by a "?" that any value is acceptable for this attribute
  • specify a single required value for the attribute
  • indicate by a "Ø" that no value is acceptable
• If some instance x satisfies all the constraints of hypothesis h, then h classifies x as a positive example (h(x) = 1)
• Example hypothesis for EnjoySport:
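To make the constraint semantics concrete, here is a small sketch of how a hypothesis classifies an instance. The hypothesis used, <Sunny, ?, ?, Strong, ?, ?>, is picked for illustration and is not necessarily the one shown on the slide; the function name `satisfies` and the tuple encoding are likewise assumptions.

```python
from typing import Sequence

ANY = "?"    # any value is acceptable
NONE = "Ø"   # no value is acceptable


def satisfies(instance: Sequence[str], hypothesis: Sequence[str]) -> bool:
    """True if the instance meets every attribute constraint, i.e. h(x) = 1."""
    return all(
        constraint == ANY or (constraint != NONE and constraint == value)
        for constraint, value in zip(hypothesis, instance)
    )


# Attribute order: Sky, AirTemp, Humidity, Wind, Water, Forecast.
h = ("Sunny", ANY, ANY, "Strong", ANY, ANY)
day1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

day1_positive = satisfies(day1, h)
day3_positive = satisfies(day3, h)
```

Under this hypothesis day 1 is classified as positive and day 3 as negative, because day 3 violates the Sky constraint.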
EnjoySport concept learning task
• Given
  • Instances X: possible days, each described by the attributes
    • Sky (with possible values Sunny, Cloudy, and Rainy)
    • AirTemp (with values Warm and Cold)
    • Humidity (with values Normal and High)
    • Wind (with values Strong and Weak)
    • Water (with values Warm and Cool), and
    • Forecast (with values Same and Change)
  • Hypotheses H: each hypothesis is described by a conjunction of constraints on the attributes. The constraints may be "?", "Ø", or a specific value
  • Target concept c: EnjoySport : X → {0,1}
  • Training examples D: positive or negative examples of the target function
• Determine
  • A hypothesis h in H such that h(x) = c(x) for all x in X
Inductive Learning Hypothesis
• Determine a hypothesis h identical to the target concept c over the entire set of instances X
  • The only information available about c is its values over the training examples
• Inductive learning at best guarantees that the output hypothesis fits the target concept over the training data
• Fundamental assumption of inductive learning
  • Inductive Learning Hypothesis: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples
Concept Learning As Search
• Search through a large space of hypotheses implicitly defined by the hypothesis representation
• Find the hypothesis that best fits the training examples
• How big is the hypothesis space?
  • In EnjoySport there are six attributes: Sky has 3 values, and the rest have 2
  • How many distinct instances?
  • How many hypotheses?
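The counts follow directly from the attribute sizes, and a short calculation makes the reasoning explicit. The variable names are illustrative. The "semantically distinct" count relies on the observation that every hypothesis containing a "Ø" classifies all instances as negative, so all of them collapse into a single hypothesis.

```python
from math import prod

# Number of possible values per attribute, as given for EnjoySport.
value_counts = {
    "Sky": 3, "AirTemp": 2, "Humidity": 2,
    "Wind": 2, "Water": 2, "Forecast": 2,
}

# Distinct instances: one value per attribute.
num_instances = prod(value_counts.values())

# Syntactically distinct hypotheses: each attribute may also be "?" or "Ø".
num_syntactic = prod(n + 2 for n in value_counts.values())

# Semantically distinct hypotheses: each attribute is a value or "?",
# plus one hypothesis standing for everything that contains a "Ø".
num_semantic = 1 + prod(n + 1 for n in value_counts.values())
```

This gives 3·2·2·2·2·2 = 96 instances, 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses, and 1 + 4·3·3·3·3·3 = 973 semantically distinct hypotheses.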
General to Specific Ordering
• This hypothesis is the most general hypothesis. It represents the idea that every day is a positive example:
  hg = <?, ?, ?, ?, ?, ?>
• The following hypothesis is the most specific hypothesis: it says that no day is a positive example:
  hs = <Ø, Ø, Ø, Ø, Ø, Ø>
• We can define a partial order over the set of hypotheses:
  h1 >g h2
  • This states that h1 is more general than h2
  • Let h1 = <Sunny, ?, ?, ?, ?, ?>
  • Let h2 = <Sunny, ?, ?, Strong, ?, ?>
• Given hypotheses hj and hk, hj is more_general_than_or_equal_to hk if and only if any instance that satisfies hk also satisfies hj
• One learning method is to determine the most specific hypothesis that matches all the training data.
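The definition of more_general_than_or_equal_to can be checked by brute force, since EnjoySport has only 96 instances. The sketch below applies the definition literally by enumerating every instance; the names `DOMAINS`, `satisfies`, and `more_general_or_equal` are assumptions, and a practical implementation would likely compare constraints attribute by attribute instead.

```python
from itertools import product
from typing import Sequence

# Attribute domains in the order Sky, AirTemp, Humidity, Wind, Water, Forecast.
DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),
    ("Warm", "Cold"),
    ("Normal", "High"),
    ("Strong", "Weak"),
    ("Warm", "Cool"),
    ("Same", "Change"),
]


def satisfies(instance: Sequence[str], hypothesis: Sequence[str]) -> bool:
    """True if the instance meets every constraint ("?" accepts anything)."""
    return all(c == "?" or (c != "Ø" and c == v)
               for c, v in zip(hypothesis, instance))


def more_general_or_equal(hj: Sequence[str], hk: Sequence[str]) -> bool:
    """hj >=g hk iff every instance that satisfies hk also satisfies hj."""
    return all(satisfies(x, hj)
               for x in product(*DOMAINS) if satisfies(x, hk))


h1 = ("Sunny", "?", "?", "?", "?", "?")
h2 = ("Sunny", "?", "?", "Strong", "?", "?")
h3 = ("Sunny", "?", "?", "?", "Cool", "?")
hs = ("Ø",) * 6
```

With these hypotheses, h1 is more general than h2, while h2 and h3 are not comparable in either direction, which is why the ordering is only partial. Every hypothesis is more general than or equal to hs, because no instance satisfies hs.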
Partial Ordering
[Figure: instances X and hypotheses H, with the hypotheses ordered from specific to general]
• x1 = <Sunny, Warm, High, Strong, Cool, Same>
• x2 = <Sunny, Warm, High, Light, Warm, Same>
• h1 = <Sunny, ?, ?, Strong, ?, ?>
• h2 = <Sunny, ?, ?, ?, ?, ?>
• h3 = <Sunny, ?, ?, ?, Cool, ?>
Find-S: Finding a Maximally Specific Hypothesis
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   • For each attribute constraint ai in h
     • If the constraint ai is satisfied by x, then do nothing
     • Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
• Begin: h ← <Ø, Ø, Ø, Ø, Ø, Ø>
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
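The Find-S steps can be traced on the four training examples with a short sketch. The function name `find_s` and the tuple encoding of examples are assumptions. For this conjunctive representation, the "next more general constraint" is taken to be the observed value when the constraint is "Ø", and "?" when the constraint is a different specific value.

```python
from typing import List, Sequence, Tuple


def find_s(
    examples: Sequence[Tuple[Sequence[str], bool]], n_attributes: int
) -> Tuple[str, ...]:
    """Return the maximally specific hypothesis consistent with the positives."""
    h: List[str] = ["Ø"] * n_attributes      # step 1: most specific hypothesis
    for instance, is_positive in examples:
        if not is_positive:
            continue                          # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] == "Ø":
                h[i] = value                  # first positive: adopt its value
            elif h[i] not in ("?", value):
                h[i] = "?"                    # conflicting value: generalize
            # otherwise the constraint is already satisfied: do nothing
    return tuple(h)                           # step 3: output h


# The four EnjoySport training examples from the table.
training = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
result = find_s(training, n_attributes=6)
```

On this data the hypothesis goes from <Ø, Ø, Ø, Ø, Ø, Ø> to <Sunny, Warm, Normal, Strong, Warm, Same> after example 1, to <Sunny, Warm, ?, Strong, Warm, Same> after example 2, is unchanged by the negative example 3, and ends as <Sunny, Warm, ?, Strong, ?, ?> after example 4.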