artificial intelligence [intelligent agents paradigm]

Professor Janis Grundspenkis

Riga Technical University

Faculty of Computer Science and Information Technology

Institute of Applied Computer Systems

Department of Systems Theory and Design

E-mail: [email protected]

LEARNING AGENTS

LEARNING FROM OBSERVATIONS

Learning is essential for intelligent agents that must deal with unknown environments: it compensates for the designer’s incomplete knowledge of the agent’s environment.

The idea behind learning is that percepts should be used not only for acting, but also for improving the agent's ability to act in the future.

Learning takes place as a result of the interaction between the agent and the world, and from observation by the agent of its own decision-making process.

LEARNING AGENT (1)

Four conceptual components:

• LEARNING ELEMENT is responsible for making improvements and for the efficiency of the performance element.

• PERFORMANCE ELEMENT is responsible for selecting external actions.

• CRITIC is designed to tell the learning element how well the agent is doing.

• PROBLEM GENERATOR is responsible for suggesting actions that will lead to new and informative experience.

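LEARNING AGENT (2)

[Figure: general model of a learning agent. The Agent interacts with the Environment through Sensors and Effectors. The Critic uses the Performance standard and sends feedback to the Learning element. The Learning element makes changes to the Performance element, receives knowledge from it, and passes learning goals to the Problem generator.]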

THE DESIGN OF THE LEARNING ELEMENT

Four major issues:

• Which components of the performance element are to be improved.

• What representation is used for those components.

• What feedback is available.

• What prior knowledge is available.

COMPONENTS OF THE PERFORMANCE ELEMENT (1)

1. A direct mapping from conditions on the current state to actions.

2. A means to infer relevant properties of the world from the percept sequence.

3. Information about the way the world evolves.


4. Information about the results of possible actions the agent can take.

5. Utility information indicating the desirability of world states.

6. Action-value information indicating the desirability of particular actions in particular states.

7. Goals that describe classes of states whose achievement maximizes the agent’s utility.


Each of the components can be learned, given the appropriate feedback.

For example, if the agent does an action and then perceives the resulting state of the environment, this information can be used to learn a description of the results of actions (the fourth component).

If the critic can use the performance standard to deduce utility values from the percepts, then the agent can learn a useful representation of its utility function (the fifth component).


Each of the seven components of the performance element can be described mathematically as a function.

Learning any particular component of the performance element can be seen as learning an accurate representation of a function.

The difficulty of learning depends on the chosen representation. Functions can be represented by logical sentences, belief networks, neural networks, etc.


Learning takes many forms, depending on the nature of the performance element, the available feedback, and the available knowledge.

For example, supervised learning covers any situation in which both the inputs and the outputs of a component can be perceived.

If the agent receives some evaluation of its actions but is not told the correct action when learning the condition-action component, this is called reinforcement learning.

INDUCTIVE LEARNING

Learning a function (constructing a description of a function) from a set of input/output examples is called inductive learning.

An example is a pair (x, f(x)), where x is the input and f(x) is the output of the function applied to x.

The task of induction is this: given a collection of examples of f, return a function h that approximates f.

The function h is called a hypothesis.

EXAMPLES AND DIFFERENT HYPOTHESES

[Figure: panels a) to e), each plotting f(x) against x, showing the examples and different hypotheses.]

The true function f is unknown, so there are many choices for h.
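To make this concrete, the following minimal sketch builds two different hypotheses from the same examples. The example points and the helper names `piecewise_linear` and `lagrange_polynomial` are illustrative assumptions, not part of the lecture. Both hypotheses agree with every example, yet they disagree between the examples.

```python
from fractions import Fraction

# Hypothetical examples (x, f(x)); the true f is unknown to the learner.
EXAMPLES = [(0, 0), (1, 1), (2, 0), (3, 1)]

def piecewise_linear(examples):
    """Hypothesis 1: connect consecutive examples with straight lines."""
    pts = sorted(examples)
    def h(x):
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                return y0 + Fraction(y1 - y0, x1 - x0) * (x - x0)
        raise ValueError("x lies outside the range of the examples")
    return h

def lagrange_polynomial(examples):
    """Hypothesis 2: the polynomial of degree < n through all n examples."""
    def h(x):
        # x is expected to be an int or a Fraction (exact arithmetic).
        total = Fraction(0)
        for i, (xi, yi) in enumerate(examples):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(examples):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return h

h1 = piecewise_linear(EXAMPLES)
h2 = lagrange_polynomial(EXAMPLES)

# Both hypotheses are consistent with the examples ...
print(all(h1(x) == y and h2(x) == y for x, y in EXAMPLES))
# ... but they give different answers for an unseen input.
print(h1(Fraction(1, 2)), h2(Fraction(1, 2)))
```

Nothing in the examples alone says which of the two is closer to the true f; that choice has to come from a preference among hypotheses.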

REFLEX LEARNING AGENT

EXAMPLES are (percept, action) pairs.

procedure REFLEX-LEARNING-ELEMENT (percept, action)
    INPUTS: percept, feedback percept
            action, feedback action
    EXAMPLES ← EXAMPLES ∪ { (percept, action) }

function REFLEX-PERFORMANCE-ELEMENT (percept) returns an action
    IF (percept, action) is in EXAMPLES THEN return action
    ELSE call the learning algorithm to obtain the hypothesis
        which the agent uses to choose the action:
        h ← INDUCE (EXAMPLES)
        return h (percept)
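The reflex learning agent can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the lecture leaves INDUCE unspecified, so `induce` here is a hypothetical stand-in that always predicts the most common stored action, and the traffic-light percepts are invented for illustration.

```python
from collections import Counter

EXAMPLES = []  # stored (percept, action) pairs

def reflex_learning_element(percept, action):
    """Record the feedback pair (percept, action) as a new example."""
    if (percept, action) not in EXAMPLES:
        EXAMPLES.append((percept, action))

def induce(examples):
    """Hypothetical stand-in for INDUCE: predict the most common action.
    Assumes at least one example has been stored."""
    most_common_action = Counter(a for _, a in examples).most_common(1)[0][0]
    return lambda percept: most_common_action

def reflex_performance_element(percept):
    """Return the stored action for a known percept, else ask hypothesis h."""
    for seen_percept, action in EXAMPLES:
        if seen_percept == percept:
            return action
    h = induce(EXAMPLES)
    return h(percept)

# Illustrative feedback (invented percepts and actions).
reflex_learning_element("red light", "stop")
reflex_learning_element("green light", "go")
reflex_learning_element("yellow light", "stop")

print(reflex_performance_element("green light"))     # known percept
print(reflex_performance_element("flashing light"))  # unseen percept, uses h
```

A real INDUCE would generalize from the structure of the percepts; the decision tree learner described later is one such algorithm.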

LEARNING DECISION TREES

A decision tree takes as input an object or situation described by a set of properties, and outputs a YES/NO decision.

Decision trees are an effective method for learning deterministic Boolean functions.

Each internal node in the tree corresponds to a test of the value of one of the properties (attributes).

Each leaf node in the tree specifies the Boolean value to be returned if that leaf is reached.
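One possible way to represent such a tree in code is as nested tuples and dictionaries. The sketch below is an illustration under stated assumptions: the small tree `TREE` and the attribute names are a hypothetical fragment chosen for brevity, and `classify` is an illustrative helper name.

```python
# A tree is either a leaf (True for YES, False for NO) or a pair
# (attribute, branches), where branches maps each value of the
# attribute to a subtree.
TREE = ("Patrons", {
    "None": False,
    "Some": True,
    "Full": ("Hungry", {"No": False, "Yes": True}),
})

def classify(tree, example):
    """Follow the attribute tests from the root until a leaf is reached."""
    while not isinstance(tree, bool):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

print(classify(TREE, {"Patrons": "Some", "Hungry": "No"}))
print(classify(TREE, {"Patrons": "Full", "Hungry": "No"}))
```

Each internal node costs one dictionary lookup, so classifying an example takes time proportional to the length of the path it follows.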

EXAMPLE OF THE DECISION TREE LEARNING (1)

The problem: whether to wait for a table in a restaurant.

The aim is to learn a definition for the goal predicate WILL WAIT, where the definition is expressed as a decision tree.

The examples are described by the following attributes:

1. Alternate: whether there is a suitable alternative restaurant nearby.

2. Bar: whether the restaurant has a comfortable bar area to wait in.

EXAMPLE OF THE DECISION TREE LEARNING (2)

3. Friday/Saturday: true on Fridays and Saturdays.

4. Hungry: whether we are hungry.

5. Patrons: how many people are in the restaurant (values are NONE, SOME, and FULL).

6. Price: the restaurant’s price range ($, $$, $$$).

7. Raining: whether it is raining outside.

8. Reservation: whether we made a reservation.

9. Type: the kind of restaurant (French, Italian, Thai, or Burger).

10. Wait Estimate: the wait estimate by the host (0-10 minutes, 10-30 minutes, 30-60 minutes, >60 minutes).

A DECISION TREE FOR DECIDING WHETHER TO WAIT FOR A TABLE

[Figure: decision tree with the root test Patrons? (None, Some, Full) and further tests WaitEstimate? (>60, 30-60, 10-30, 0-10), Alternate?, Hungry?, Reservation?, Fri/Sat?, Bar? and Raining?; every leaf is Yes or No.]

REPRESENTATION OF THE DECISION TREE

The tree can be expressed as a conjunction of individual implications corresponding to the paths through the tree ending in YES nodes.

Example: The path for a restaurant full of patrons, with an estimated wait of 10-30 minutes, when the agent is not hungry, is expressed by the logical sentence

$\forall r \; Patrons(r, \mathit{Full}) \land WaitEstimate(r, \text{10-30}) \land Hungry(r, \mathit{No}) \Rightarrow WillWait(r)$
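The translation from paths to implications can be sketched as follows. The tree used here is a simplified, hypothetical fragment (it is not the full tree for the restaurant problem), and the function names `yes_paths` and `as_implication` are illustrative assumptions.

```python
# Leaves are True (YES) or False (NO); internal nodes are
# (attribute, {value: subtree}) pairs. Simplified hypothetical fragment.
TREE = ("Patrons", {
    "None": False,
    "Some": True,
    "Full": ("WaitEstimate", {
        "0-10": True,
        "10-30": ("Hungry", {"No": True, "Yes": False}),
        "30-60": False,
        ">60": False,
    }),
})

def yes_paths(tree, path=()):
    """Yield every root-to-leaf path, as (attribute, value) tests, ending in YES."""
    if isinstance(tree, bool):
        if tree:
            yield path
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from yes_paths(subtree, path + ((attribute, value),))

def as_implication(path):
    """Write one path as an implication whose conclusion is WillWait(r)."""
    body = " AND ".join(f"{attr}(r, {value})" for attr, value in path)
    return f"FORALL r: {body} => WillWait(r)"

RULES = [as_implication(p) for p in yes_paths(TREE)]
for rule in RULES:
    print(rule)
```

The whole tree then corresponds to the conjunction of the printed implications, one per YES leaf.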

EXPRESSIVENESS OF DECISION TREES (1)

Decision trees cannot represent an arbitrary set of implication sentences, because decision trees are implicitly limited to talking about a single object.

The decision tree language is essentially propositional, with each attribute test being a proposition.

Decision trees are fully expressive within the class of propositional languages; that is, any Boolean function can be written as a decision tree.

EXPRESSIVENESS OF DECISION TREES (2)

Decision trees are good for some kinds of functions, and bad for others.

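If we have n Boolean attributes, then there are $2^{2^n}$ different Boolean functions (the truth table has $2^n$ rows, and each row can be answered YES or NO).

For example, with just six Boolean attributes, there are $2^{64}$, or about $2 \times 10^{19}$, different functions to choose from. How can we find a consistent hypothesis in such a large space?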
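The count can be checked with a few lines of Python. The function name `count_boolean_functions` is an illustrative assumption.

```python
def count_boolean_functions(n):
    """Number of distinct Boolean functions of n Boolean attributes."""
    rows = 2 ** n      # number of rows in the truth table
    return 2 ** rows   # each row independently maps to YES or NO

print(count_boolean_functions(6))  # 18446744073709551616, about 2 * 10**19
```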

INDUCING DECISION TREES FROM EXAMPLES (1)

An example is described by the values of the attributes and the value of the goal predicate.

The value of the goal predicate is called the classification of the example.

If the goal predicate is true for some example, it is a positive example, otherwise it is a negative example.

INDUCING DECISION TREES FROM EXAMPLES (2)

EXAMPLE

Columns Alt to Est are the attributes; WillWait is the goal.

Example  Alt  Bar  Fri  Hun  Pat   Price  Rain  Res  Type     Est    WillWait
X1       Yes  No   No   Yes  Some  $$$    No    Yes  French   0-10   Yes
X2       Yes  No   No   Yes  Full  $      No    No   Thai     30-60  No
X3       No   Yes  No   No   Some  $      No    No   Burger   0-10   Yes
X4       Yes  No   Yes  Yes  Full  $      No    No   Thai     10-30  Yes
X5       Yes  No   Yes  No   Full  $$$    No    Yes  French   >60    No
X6       No   Yes  No   Yes  Some  $$     Yes   Yes  Italian  0-10   Yes
X7       No   Yes  No   No   None  $      Yes   No   Burger   0-10   No
X8       No   No   No   Yes  Some  $$     Yes   Yes  Thai     0-10   Yes
X9       No   Yes  Yes  No   Full  $      Yes   No   Burger   >60    No
X10      Yes  Yes  Yes  Yes  Full  $$$    No    Yes  Italian  10-30  No
X11      No   No   No   No   None  $      No    No   Thai     0-10   No
X12      Yes  Yes  Yes  Yes  Full  $      No    No   Burger   30-60  Yes

CONSTRUCTION OF THE DECISION TREE (1)

Simple solution: construct a decision tree that has one path to a leaf for each example, where the path tests each attribute in turn and follows the value for the example, and the leaf has the classification of the example.

This is a simple way to find a decision tree that agrees with the training set of examples.

When given an example with the same description again, the decision tree will come up with the right classification.


The problem with a trivial tree is that it just memorizes the observations.

It does not extract any pattern from the examples, so we cannot expect it to extrapolate to examples it has not seen.

Extracting a pattern means being able to describe a large number of cases in a concise way.

We should try to find a concise decision tree.


This is an example of a general principle of inductive learning called Ockham’s razor:

The most likely hypothesis is the simplest one that is consistent with all observations (examples).

A simple hypothesis that is consistent with the observations is more likely to be correct than a complex one.

FINDING THE SMALLEST DECISION TREE (1)

The basic idea:

Test the most important attribute first.

The most important attribute is the one that makes the most difference to the classification of an example.

This way, we hope to get the correct classification with a small number of tests, meaning that all paths in the tree will be short and the tree as a whole will be small.


Splitting the examples on Type?

All examples:
    +: X1, X3, X4, X6, X8, X12
    -: X2, X5, X7, X9, X10, X11

French:   +: X1        -: X5
Italian:  +: X6        -: X10
Thai:     +: X4, X8    -: X2, X11
Burger:   +: X3, X12   -: X7, X9


Splitting the examples on Patrons?

All examples:
    +: X1, X3, X4, X6, X8, X12
    -: X2, X5, X7, X9, X10, X11

None:  +: (none)           -: X7, X11
Some:  +: X1, X3, X6, X8   -: (none)
Full:  +: X4, X12          -: X2, X5, X9, X10


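Splitting on Patrons?, then on Hungry? for the Full branch

All examples:
    +: X1, X3, X4, X6, X8, X12
    -: X2, X5, X7, X9, X10, X11

Patrons?
    None:  +: (none)           -: X7, X11             answer: No
    Some:  +: X1, X3, X6, X8   -: (none)              answer: Yes
    Full:  +: X4, X12          -: X2, X5, X9, X10     test: Hungry?
        Y:  +: X4, X12   -: X2, X10
        N:  +: (none)    -: X5, X9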


Consider all possible attributes and find the most important one.

After the first attribute test splits up the examples, each outcome is a new decision tree learning problem in itself, with fewer examples and one fewer attribute.
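The lecture does not fix a numeric measure of how much difference an attribute makes. One common choice, used here purely as an assumption, is information gain: the entropy of the classification before the split minus the weighted entropy after it. The sketch below applies it to the positive and negative counts of the Type and Patrons splits from the twelve examples.

```python
from math import log2

def entropy(pos, neg):
    """Entropy in bits of a set with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * log2(p)
    return result

def information_gain(subsets):
    """Entropy before the split minus the weighted entropy after it.

    subsets holds one (positives, negatives) pair per attribute value.
    """
    pos = sum(p for p, _ in subsets)
    neg = sum(n for _, n in subsets)
    total = pos + neg
    remainder = sum((p + n) / total * entropy(p, n) for p, n in subsets)
    return entropy(pos, neg) - remainder

# (positives, negatives) per value, read off the two splits of the 12 examples.
TYPE_SPLIT = [(1, 1), (1, 1), (2, 2), (2, 2)]  # French, Italian, Thai, Burger
PATRONS_SPLIT = [(0, 2), (4, 0), (2, 4)]       # None, Some, Full

print(round(information_gain(TYPE_SPLIT), 3))
print(round(information_gain(PATRONS_SPLIT), 3))
```

Under this measure Type gains essentially nothing, because every subset it produces is still half positive and half negative, while Patrons gains about 0.54 bits, which agrees with the choice of Patrons as the first test.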


FOUR CASES

1. If there are some positive and some negative examples, then choose the best attribute to split them.

2. If all the remaining examples are positive (or all negative), answer YES or NO respectively.

3. If there are no examples left (no such example has been observed), return a default value calculated from the majority classification at the node’s parent.


4. If there are no attributes left, but both positive and negative examples remain, it means that these examples have exactly the same description, but different classifications.

This happens when some of the data are incorrect (there is noise in the data).

It also happens when the attributes do not give enough information to fully describe the situation, or when the domain is truly nondeterministic.

One simple way out of the problem is to use a majority vote if no more attributes can be used.
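The four cases can be combined into a recursive learner. The following is a minimal sketch under stated assumptions, not the lecture's own code: the best attribute is chosen by information gain (an assumed measure), ties are broken by attribute order, and the names `learn_tree`, `gain`, `majority`, and `classify` are illustrative. The data are the twelve restaurant examples.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Alt", "Bar", "Fri", "Hun", "Pat", "Price", "Rain", "Res", "Type", "Est"]
ROWS = """
Yes No  No  Yes Some $$$ No  Yes French  0-10  Yes
Yes No  No  Yes Full $   No  No  Thai    30-60 No
No  Yes No  No  Some $   No  No  Burger  0-10  Yes
Yes No  Yes Yes Full $   No  No  Thai    10-30 Yes
Yes No  Yes No  Full $$$ No  Yes French  >60   No
No  Yes No  Yes Some $$  Yes Yes Italian 0-10  Yes
No  Yes No  No  None $   Yes No  Burger  0-10  No
No  No  No  Yes Some $$  Yes Yes Thai    0-10  Yes
No  Yes Yes No  Full $   Yes No  Burger  >60   No
Yes Yes Yes Yes Full $$$ No  Yes Italian 10-30 No
No  No  No  No  None $   No  No  Thai    0-10  No
Yes Yes Yes Yes Full $   No  No  Burger  30-60 Yes
"""
# Each example is (attribute dict, WillWait as a bool).
EXAMPLES = [(dict(zip(ATTRIBUTES, f[:-1])), f[-1] == "Yes")
            for f in (line.split() for line in ROWS.strip().splitlines())]
# All values each attribute takes anywhere in the data.
DOMAIN = {a: sorted({ex[a] for ex, _ in EXAMPLES}) for a in ATTRIBUTES}

def entropy(examples):
    counts = Counter(label for _, label in examples)
    return -sum(c / len(examples) * log2(c / len(examples)) for c in counts.values())

def split(examples, attribute):
    groups = {}
    for ex in examples:
        groups.setdefault(ex[0][attribute], []).append(ex)
    return groups

def gain(examples, attribute):
    remainder = sum(len(g) / len(examples) * entropy(g)
                    for g in split(examples, attribute).values())
    return entropy(examples) - remainder

def majority(examples):
    return Counter(label for _, label in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, default):
    if not examples:                      # case 3: nothing observed, use default
        return default
    labels = {label for _, label in examples}
    if len(labels) == 1:                  # case 2: all positive or all negative
        return labels.pop()
    if not attributes:                    # case 4: noise, fall back to majority vote
        return majority(examples)
    best = max(attributes, key=lambda a: gain(examples, a))  # case 1
    groups = split(examples, best)
    rest = [a for a in attributes if a != best]
    return (best, {v: learn_tree(groups.get(v, []), rest, majority(examples))
                   for v in DOMAIN[best]})

def classify(tree, example):
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

TREE = learn_tree(EXAMPLES, ATTRIBUTES, default=False)
print(TREE[0])  # attribute tested at the root
print(all(classify(TREE, ex) == label for ex, label in EXAMPLES))
```

Several attributes tie on gain below the root, so the lower levels of the induced tree depend on the tie-breaking rule and need not match any particular published tree; the tree is nevertheless consistent with all twelve examples.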

FINDING THE SMALLEST DECISION TREE (8)

All examples:
    +: X1, X3, X4, X6, X8, X12
    -: X2, X5, X7, X9, X10, X11

Patrons?
    None (+: none; -: X7, X11): No
    Some (+: X1, X3, X6, X8; -: none): Yes
    Full (+: X4, X12; -: X2, X5, X9, X10): Hungry?
        No (+: none; -: X5, X9): No
        Yes (+: X4, X12; -: X2, X10): Type?
            French (+: none; -: none): Yes
            Italian (+: none; -: X10): No
            Thai (+: X4; -: X2): Fri/Sat?
                No (+: none; -: X2): No
                Yes (+: X4; -: none): Yes
            Burger (+: X12; -: none): Yes


CONCLUSIONS

The learning algorithm looks at the examples, not at the correct function, and in fact its hypothesis (shown as the last decision tree) not only agrees with all the examples, but is considerably simpler than the original tree.

The learning algorithm has no reason to include tests for RAINING and RESERVATION, because it can classify all the examples without them.

If we were to gather more examples, we might induce a tree more similar to the original.

ASSESSING THE PERFORMANCE OF THE LEARNING ALGORITHM (1)

A learning algorithm is good if it produces hypotheses that do a good job of predicting the classification of unseen examples.

A prediction is good if it turns out to be true, so we can assess the quality of a hypothesis by checking its predictions against the correct classification once we know it.

This check is done on a separate set of examples called the test set.


THE METHODOLOGY

1. Collect a large set of examples.

2. Divide it into two distinct sets: the training set and the test set.

3. Use the learning algorithm with the training set as examples to generate a hypothesis.

4. Measure the percentage of examples in the test set that are correctly classified by the hypothesis.

5. Repeat steps 1 to 4 for different sizes of training sets and different randomly selected training sets of each size.


The key idea of the methodology is to keep the training and test data separate.

The result of the application of the methodology is a set of data that can be processed to give the average prediction quality as a function of the size of the training set.
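The methodology can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the synthetic `target` function stands in for the unknown true function, `nearest_neighbour_learner` is a deliberately simple stand-in for the learning algorithm, and the example pool is collected once and then re-split at random rather than being collected anew for each repetition.

```python
import itertools
import random

def target(x):
    """Hypothetical 'true' function, used only to label synthetic examples."""
    return x[0] and (x[1] or x[2])

def nearest_neighbour_learner(training):
    """Stand-in learning algorithm: copy the label of the closest training example."""
    def h(x):
        closest = min(training,
                      key=lambda ex: sum(a != b for a, b in zip(ex[0], x)))
        return closest[1]
    return h

def learning_curve(examples, sizes, trials, rng):
    """Average fraction correct on the test set for each training set size."""
    curve = []
    for size in sizes:
        scores = []
        for _ in range(trials):
            shuffled = rng.sample(examples, len(examples))
            # Keep training and test data separate.
            training, test = shuffled[:size], shuffled[size:]
            h = nearest_neighbour_learner(training)
            correct = sum(h(x) == y for x, y in test)
            scores.append(correct / len(test))
        curve.append((size, sum(scores) / trials))
    return curve

# Step 1: all 64 descriptions over six Boolean attributes, labelled by target.
EXAMPLES = [(x, target(x)) for x in itertools.product([False, True], repeat=6)]

rng = random.Random(0)  # fixed seed so the run is repeatable
CURVE = learning_curve(EXAMPLES, sizes=[4, 8, 16, 32, 48], trials=20, rng=rng)
for size, accuracy in CURVE:
    print(size, round(accuracy, 2))
```

Each printed pair is one point of the learning curve: a training set size and the average prediction quality measured on examples that were not used for training.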


A learning curve

As the training set grows, the prediction quality increases.

[Figure: learning curve plotting % correct on test set (vertical axis, up to 1) against training set size (horizontal axis, 0 to 100).]
