AI Model Test Paper with Answers

UNIT-I

Q1. What is Artificial Intelligence? Explain the history and importance of AI.
Ans:- Artificial Intelligence is the study of how to make computers do things which, at the moment, people do better.

History of AI:- Although the computer provided the technology necessary for AI, it was not until the early 1950s that the link between human intelligence and machines was really observed. Norbert Wiener was one of the first Americans to make observations on the principle of feedback theory. The most familiar example of feedback theory is the thermostat: it controls the temperature of an environment by measuring the actual temperature of the house, comparing it to the desired temperature, and responding by turning the heat up or down. The importance of Wiener's research into feedback loops was his theory that all intelligent behaviour is the result of feedback mechanisms, mechanisms that could possibly be simulated by machines. This idea influenced much of the early development of AI.

Importance of AI:-

The subject of artificial intelligence originated with game-playing and theorem-proving programs and was gradually enriched with theories from a number of parent disciplines.

Learning Systems: Among the subject areas covered under artificial intelligence, learning systems need special mention. The concept of learning is illustrated here with reference to the natural problem of a child learning pronunciation from its mother.

Knowledge Representation and Reasoning: In a reasoning problem, one has to reach a pre-defined goal state from one or more given initial states. The smaller the number of transitions needed to reach the goal state, the higher the efficiency of the reasoning system.

Planning: Another significant area of artificial intelligence is planning. The problems of reasoning and planning share many common issues, but have a basic difference that originates from their definitions.

Knowledge Acquisition: Acquisition (elicitation) of knowledge is as hard for machines as it is for human beings. It includes generation of new pieces of knowledge from a given knowledge base, setting up dynamic data structures for existing knowledge, learning knowledge from the environment, and refinement of knowledge.

Logic Programming: For more than a century, mathematicians and logicians have designed various tools to represent logical statements by symbolic operators. One outgrowth of such attempts is propositional logic, which deals with a set of binary statements (propositions) connected by Boolean operators.

Soft Computing: Soft computing, according to Prof. Zadeh, is "an emerging approach to computing, which parallels the remarkable ability of the human mind to reason and learn in an environment of uncertainty and imprecision".

Q2. What are the application fields of AI?
Ans:- Almost every branch of science and engineering currently shares the tools and techniques available in the domain of artificial intelligence. However, for the convenience of the reader, we mention here a few typical applications where AI plays a significant and decisive role in engineering automation.

Expert Systems: As an example, we illustrate the reasoning process involved in an expert system


for a weather forecasting problem, with special emphasis on its architecture. An expert system consists of a knowledge base, a database and an inference engine for interpreting the database using the knowledge supplied in the knowledge base. The inference engine attempts to match the antecedent clauses (IF parts) of the rules with the data stored in the database. When all the antecedent clauses of a rule are available in the database, the rule is fired, resulting in new inferences. The resulting inferences are added to the database for activating subsequent firing of other rules. In order to keep limited data in the database, a few rules that contain an explicit consequent (THEN) clause to delete specific data from the database are employed in the knowledge base. On firing of such rules, the unwanted data clauses suggested by the rule are deleted from the database. Here PR1 fires because both of its antecedent clauses are present in the database. On firing of PR1, the consequent clause "it-will-rain" is added to the database for subsequent firing of PR2.

Image Understanding and Computer Vision: A digital image can be regarded as a two-dimensional array of pixels containing grey levels corresponding to the intensity of the reflected illumination received by a video camera. For interpretation of a scene, its image should be passed through three basic processes: low, medium and high level vision. The purpose of low level vision is to pre-process the image by filtering out noise. The medium level vision system deals with enhancement of details and segmentation (i.e., partitioning the image into objects of interest). The high level vision system includes three steps: recognition of the objects from the segmented image, labelling of the image and interpretation of the scene. Most AI tools and techniques are required in high level vision systems. Recognition of objects from their images can be carried out through a process of pattern classification, which at present is realized by supervised learning algorithms. The interpretation process, on the other hand, requires knowledge-based computation.

Speech and Natural Language Understanding: Understanding of speech and of natural languages are two classical problems. In speech analysis, the main problem is to separate the syllables of a spoken word and determine features like amplitude, and fundamental and harmonic frequencies, of each syllable. The words can then be identified from the extracted features by pattern classification techniques. Recently, artificial neural networks have been employed to classify words from their features. The problem of understanding natural languages like English, on the other hand, includes syntactic and semantic interpretation of the words in a sentence, and of the sentences in a paragraph. The syntactic steps are required to analyze the sentences by their grammar and are similar to the steps of compilation. The semantic analysis, which is performed following the syntactic analysis, determines the meaning of the sentences from the association of the words, and that of a paragraph from the closeness of the sentences. A robot capable of understanding speech in a natural language would be of immense importance, for it could execute any task verbally communicated to it. The phonetic typewriter, which prints the words pronounced by a person, is another recent invention where speech understanding is employed in a commercial application.

Scheduling: In a scheduling problem, one has to plan the time schedule of a set of events to improve the time efficiency of the solution. For instance, in a class-routine scheduling problem, the teachers are allocated to different classrooms at different time slots, and we want most classrooms to be occupied most of the time. Flowshop scheduling problems are NP-complete, and determination of the optimal schedule (minimizing the make-span) thus requires an exponential order of time with respect to both machine size and job size. Finding a sub-optimal solution is therefore preferred for such scheduling problems. Recently, artificial neural nets


and genetic algorithms have been employed to solve this problem. Heuristic search, to be discussed shortly, has also been used for handling this problem.

Intelligent Control: In process control, the controller is designed from the known models of the process and the required control objective. When the dynamics of the plant are not completely known, the existing techniques for controller design no longer remain valid. Rule-based control is appropriate in such situations. In a rule-based control system, the controller is realized by a set of production rules intuitively set by an expert control engineer. The antecedent (premise) part of the rules in a rule-based system is searched against the dynamic response of the plant parameters. The rule whose antecedent part matches the plant response is selected and fired. When more than one rule is firable, the controller resolves the conflict by a set of strategies. On the other hand, there exist situations where the antecedent part of no rule exactly matches the plant responses. Such situations are handled with fuzzy logic, which is capable of matching the antecedent parts of rules partially/approximately with the dynamic plant responses. Fuzzy control has been successfully used in many industrial plants; one typical application is power control in a nuclear reactor. Besides design of the controller, the other issue in process control is to design a plant (process) estimator, which attempts to follow the response of the actual plant when both the plant and the estimator are jointly excited by a common input signal. Fuzzy and artificial neural network-based learning techniques have recently been identified as new tools for plant estimation.
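The rule-firing cycle described under Expert Systems above can be sketched in a few lines of Python. The rule names PR1/PR2 and the facts (the-sky-is-cloudy, it-is-humid, and so on) are hypothetical stand-ins for the figure referred to in the text, not part of the original answer:

# Hypothetical weather-forecasting rules: set of antecedent clauses -> consequent clause.
rules = {
    "PR1": ({"the-sky-is-cloudy", "it-is-humid"}, "it-will-rain"),
    "PR2": ({"it-will-rain"}, "carry-an-umbrella"),
}

database = {"the-sky-is-cloudy", "it-is-humid"}   # working memory of facts

fired = set()
changed = True
while changed:                                    # keep firing until nothing new can be inferred
    changed = False
    for name, (antecedents, consequent) in rules.items():
        # A rule fires when all of its IF clauses are present in the database.
        if name not in fired and antecedents <= database:
            database.add(consequent)              # the THEN clause becomes a new inference
            fired.add(name)
            changed = True

print(database)   # includes 'it-will-rain' (added by PR1) and 'carry-an-umbrella' (added by PR2)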

Q3. Explain state space representation in AI with the help of the Water Jug problem.
Ans:- "You are given two jugs, a 4-gallon one and a 3-gallon one. Neither has any measuring markers on it. There is a tap that can be used to fill the jugs with water. How can you get exactly 2 gallons of water into the 4-gallon jug?"

We can look at a state as a pair of numbers (x, y), where the first represents the number of gallons of water currently in Jug-A (the 4-gallon jug) and the second represents the number of gallons in Jug-B (the 3-gallon jug).

State space search (production rules):-

1) (x,y) -> (4,y) if x < 4
2) (x,y) -> (x,3) if y < 3
3) (x,y) -> (x-d,y) if x > 0
4) (x,y) -> (x,y-d) if y > 0
5) (x,y) -> (0,y) if x > 0
6) (x,y) -> (x,0) if y > 0
7) (x,y) -> (4, y-(4-x)) if x+y >= 4, y > 0
8) (x,y) -> (x-(3-y), 3) if x+y >= 3, x > 0
9) (x,y) -> (x+y, 0) if x+y < 4, y > 0


10) (x,y) -> (0,x+y) if x+y < 3, x > 0
11) (0,2) -> (2,0)
12) (2,y) -> (0,y)

Solution:-
1. Current state = (0,0)
2. Loop until reaching the goal state (2,0):
(0,0) -> (0,3) -> (3,0) -> (3,3) -> (4,2) -> (0,2) -> (2,0)
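A short breadth-first search over this state space, written in Python, reproduces the trace above. The successor function below is only a sketch of the fill, empty and pour rules on the (x, y) pairs, not part of the original answer:

from collections import deque

CAP_A, CAP_B = 4, 3          # capacities of the two jugs
GOAL = (2, 0)

def successors(state):
    x, y = state
    return {
        (CAP_A, y),                                   # fill the 4-gallon jug
        (x, CAP_B),                                   # fill the 3-gallon jug
        (0, y),                                       # empty the 4-gallon jug
        (x, 0),                                       # empty the 3-gallon jug
        (min(CAP_A, x + y), max(0, x + y - CAP_A)),   # pour the 3-gallon jug into the 4-gallon jug
        (max(0, x + y - CAP_B), min(CAP_B, x + y)),   # pour the 4-gallon jug into the 3-gallon jug
    }

def solve(start=(0, 0)):
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == GOAL:
            return path
        for nxt in successors(path[-1]) - seen:
            seen.add(nxt)
            frontier.append(path + [nxt])

print(solve())   # e.g. [(0, 0), (0, 3), (3, 0), (3, 3), (4, 2), (0, 2), (2, 0)]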

Q4. Discuss the constraint satisfaction procedure to solve the following cryptarithmetic problem:

    S E N D
  + M O R E
  ---------
  M O N E Y

Ans:- We have to replace each letter by a distinct digit so that the resulting sum is correct.
Two-step process:
1. Constraints are discovered and propagated as far as possible.
2. If there is still not a solution, then search begins, adding new constraints.


Initial state:
• No two letters have the same value.
• The sum of the digits must be as shown (SEND + MORE = MONEY).

Two kinds of rules:
1. Rules that define valid constraint propagation.
2. Rules that suggest guesses when necessary.

The constraint-propagation search proceeds as follows (the two branches correspond to guesses on the carry C1):
Initial propagation gives: M = 1, S = 8 or 9, O = 0, N = E + 1, C2 = 1, N + R > 8, E ≠ 9.
Guess E = 2. Propagation then gives: N = 3, R = 8 or 9, and 2 + D = Y or 2 + D = 10 + Y.
Branch C1 = 0: 2 + D = Y, and N + R = 10 + E forces R = 9 and S = 8.
Branch C1 = 1: 2 + D = 10 + Y, so D = 8 + Y, giving D = 8 or 9; D = 8 leads to Y = 0 and D = 9 leads to Y = 1.
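For comparison, a brute-force check in Python makes the two constraints concrete; the constraint-satisfaction procedure above reaches the answer with far less search. This sketch is not part of the original answer:

from itertools import permutations

letters = "SENDMORY"                         # the 8 distinct letters of the puzzle
for digits in permutations(range(10), len(letters)):
    a = dict(zip(letters, digits))           # candidate assignment: letter -> digit
    if a["S"] == 0 or a["M"] == 0:           # no leading zeros
        continue
    send  = a["S"]*1000 + a["E"]*100 + a["N"]*10 + a["D"]
    more  = a["M"]*1000 + a["O"]*100 + a["R"]*10 + a["E"]
    money = a["M"]*10000 + a["O"]*1000 + a["N"]*100 + a["E"]*10 + a["Y"]
    if send + more == money:                 # the sum must be as shown
        print(send, "+", more, "=", money)   # 9567 + 1085 = 10652
        break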

Q5. Distinguish between the A* and AO* algorithms.
Ans:- A* algorithm:-

A* uses a best-first search and finds the least-cost path from a given initial node to one goal node (out of one or more possible goals).

It uses a distance-plus-cost heuristic function (usually denoted f(x)) to determine the order in which the search visits nodes in the tree. The distance-plus-cost heuristic is a sum of two functions:

o the path-cost function, which is the cost from the starting node to the current node (usually denoted g(x))

o an admissible "heuristic estimate" of the distance to the goal (usually denoted h(x)).

o The h(x) part of the f(x) function must be an admissible heuristic; that is, it must not overestimate the distance to the goal.

o Thus, for an application like routing, h(x) might represent the straight-line distance to the goal, since that is physically the smallest possible distance between any two points or nodes.

o If the heuristic h satisfies the additional condition h(x) <= d(x,y) + h(y) for every edge (x, y) of the graph (where d denotes the length of that edge), then h is called monotone, or consistent.

o In such a case, A* can be implemented more efficiently—roughly speaking, no node needs to be processed more than once (see closed set below)—and A* is equivalent to running Dijkstra's algorithm with the reduced cost d'(x,y): = d(x,y) − h(x) + h(y).
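A compact A* sketch in Python, assuming a graph given as an adjacency dict and a heuristic function h (both hypothetical names used only for illustration); the frontier is ordered by f(x) = g(x) + h(x) exactly as described above:

import heapq

def a_star(graph, h, start, goal):
    # graph: {node: [(neighbour, edge_cost), ...]}; h: admissible heuristic estimate to the goal.
    frontier = [(h(start), 0, start, [start])]        # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path                            # least-cost path found
        for nbr, cost in graph.get(node, []):
            g2 = g + cost                             # path cost so far
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h(nbr), g2, nbr, path + [nbr]))

# Tiny made-up example; for routing, h could be the straight-line distance to the goal.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)], "D": []}
print(a_star(graph, lambda n: 0, "A", "D"))           # (3, ['A', 'B', 'C', 'D'])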


AO* algorithm:-

1. Initialize the graph to the start node.
2. Traverse the graph following the current best path, accumulating nodes that have not yet been expanded or solved.
3. Pick any of these nodes and expand it. If it has no successors, assign this node the value FUTILITY; otherwise calculate f' for each of its successors.
4. If f' is 0, mark the node as SOLVED.
5. Change the value of f' for the newly created node to reflect its successors by back propagation.
6. Wherever possible use the most promising routes, and if a node is marked as SOLVED then mark the parent node as SOLVED.
7. If the starting node is SOLVED or its value is greater than FUTILITY, stop; else repeat from step 2.

Q6. Discuss game playing. Explain alpha-beta pruning.
Ans. Game playing

Games are well-defined problems that are generally interpreted as requiring intelligence to play well. Game playing introduces uncertainty, since the opponent's moves cannot be determined in advance.

• Computer programs which play 2-player games
  – game-playing as search
  – with the complication of an opponent
• General principles of game-playing and search
  – evaluation functions
  – minimax principle
  – alpha-beta pruning
  – heuristic techniques
• Status of game-playing systems
  – in chess, checkers, backgammon, Othello, etc., computers routinely defeat leading world players
• Applications?
  – think of "nature" as an opponent
  – economics, war-gaming, medical drug treatment

Alpha-beta pruning

• Idea:
  – Do depth-first search to generate a partial game tree,
  – Apply the static evaluation function to the leaves,
  – Compute bounds on the internal nodes.
• Alpha, Beta bounds:
  – An alpha value for a max node means that max's real value is at least alpha.
  – A beta value for a min node means that min can guarantee a value below beta.
• Computation:
  – Alpha of a max node is the maximum value of its children seen so far.
  – Beta of a min node is the minimum value of its children seen so far.
• Pruning


  – Prune below a Min node whose beta value is lower than or equal to the alpha value of its ancestors.
  – Prune below a Max node having an alpha value greater than or equal to the beta value of any of its Min-node ancestors.
• Worst case:
  – Branches are ordered so that no pruning takes place. In this case alpha-beta gives no improvement over exhaustive search.
• Best case:
  – Each player's best move is the left-most alternative (i.e., evaluated first).
  – In practice, performance is closer to the best case than to the worst case.
  – In practice one often gets O(b^(d/2)) rather than O(b^d).
  – This is the same as having a branching factor of sqrt(b), since (sqrt(b))^d = b^(d/2) (i.e., we have effectively gone from b to the square root of b).
  – In chess this goes from b ≈ 35 to b ≈ 6, permitting much deeper search in the same amount of time.

Alpha-Beta General Principle
• Consider a node n where it is Player's choice to move to that node. If Player has a better choice m at either the parent node of n or at any choice point further up, then n will never be reached in actual play.
• Maintain two parameters in the depth-first search: alpha, the best (highest) value found so far for MAX along any path, and beta, the best (lowest) value found along any path for MIN. Prune a subtree once it is known to be worse than the current alpha or beta.

Effectiveness of Alpha-Beta
• The amount of pruning depends on the order in which siblings are explored.
• In the optimal case, where the best options are explored first, time complexity reduces from O(b^d) to O(b^(d/2)), a dramatic improvement. But this entails knowledge of the best move in advance!
• With successors randomly ordered, the asymptotic bound is O((b/log b)^d), which is not much help and is only accurate for b > 1000. A more realistic expectation is something like O(b^(3d/4)).
• A fairly simple ordering heuristic can produce results closer to the optimal case than to the random case (e.g. check captures and threats first).
• Theoretical analysis makes unrealistic assumptions, such as utility values being distributed randomly across leaves, and therefore experimental results are necessary.
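A minimal alpha-beta sketch in Python over a game tree given as nested lists (leaves hold static evaluation values); the tree representation and its values are an assumption made only for illustration:

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(node, list):             # a leaf carries its static evaluation value
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)          # best value MAX can guarantee so far
            if alpha >= beta:                  # cut-off: MIN will never allow this branch
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)            # best value MIN can guarantee so far
            if alpha >= beta:                  # cut-off: MAX will never allow this branch
                break
        return value

tree = [[3, 12], [8, 2], [14, 1]]              # hypothetical 2-ply game tree
print(alphabeta(tree))                         # 3: MAX picks the branch whose minimum is largest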

Q7. Explain the Min-Max method of generating the game tree.
Ans. The Minimax search procedure is a depth-first, depth-limited procedure. The idea is to start at the current position and use the plausible-move generator to generate the set of possible successor positions. We can then apply the static evaluation function to those positions and simply choose the best one.

1. Generate the whole game tree down to the leaves.
2. Apply the utility (payoff) function to the leaves.
3. Back up values from the leaves toward the root:
   • a Max node computes the max of its child values
   • a Min node computes the min of its child values
4. When the values reach the root: choose the max value and the corresponding move.
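A minimal Python sketch of these four steps, using a hypothetical game tree given as nested lists whose leaves already hold utility values (so step 1 is assumed done):

def minimax(node, maximizing=True):
    if not isinstance(node, list):          # step 2: a leaf already carries its utility value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)   # step 3: back up values

# Step 4: at the root, choose the move with the maximum backed-up value.
tree = [[3, 8], [12, 6], [7, 9]]            # hypothetical tree; MIN moves at depth 1
backed_up = [minimax(branch, maximizing=False) for branch in tree]
print(backed_up, "-> choose move", backed_up.index(max(backed_up)))   # [3, 6, 7] -> choose move 2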


1. Minimax Procedure

Figure 2 shows a hypothetical game tree with scores assigned at the leaves (terminal nodes). As we are looking ahead, we need the evaluation function only at the leaves of the tree, and the program will make a move based on these values. We start with First Player at the root and examine the whole situation from her perspective. The move the program chooses is a branch coming from the root, and the program picks a move that maximizes the score in order to minimize mistakes. She also assumes her opponent is as good as she is. At the next level down, her opponent selects a move to minimize her score, and so on.

Figure 2: A game tree with values assigned at the leaves.

By working up from the bottom of the tree, the program can assign backed-up values to all the nodes. For example, at the right side of Figure 2, the parent of the leaves with scores 3 and 8 is at level 3, corresponding to Second Player, who moves to minimize the score; so he chooses 3, the minimum of (3, 8), and we assign score 3 to the node. Its parent, at level 2, corresponds to First Player, who will choose from (3, 12) to maximize the score; so she chooses 12, and the program assigns score 12 to the node. The parent of this node again corresponds to Second Player, who then chooses 6 from (6, 12) to minimize the score, and so on. The resulting tree is shown in Figure 3.

Figure 3: Minimax evaluation of a game tree.

Since we alternately take minima and maxima, this process is called a minimax procedure. In a minimax tree, one can view, in its entire form, the score values at each level of the tree at any instant of the game. A player can find out from the tree which moves are the best. In our example, the current situation has a score of 7, so First Player should choose the leftmost branch, which leads to the child with the maximum score.

Q8. What are Blind Search techniques? Differentiate them.
Ans. Blind/Uninformed Search:

Blind search techniques have no information about the number of steps from the current state to the goal and do not use any problem-domain-specific information.
• e.g., searching for a route on a map without using any information about direction
  – yes, it seems dumb, but the power is in the generality
  – examples: breadth-first, depth-first, etc.
  – we will look at several of these classic algorithms

1. Depth-first search: Expand one of the nodes at the deepest level.

Pseudocode for Depth-First Search
  Initialize: Let Q = {S}
  While Q is not empty
    pull Q1, the first element in Q
    if Q1 is a goal
      report(success) and quit
    else
      child_nodes = expand(Q1)
      eliminate child_nodes which represent loops
      put remaining child_nodes at the front of Q
    end
  Continue

• Comments
  – a specific example of the general search-tree method
  – open nodes are stored in a queue Q of nodes
  – key feature: new unexpanded nodes are put at the front of the queue
  – convention is that nodes are ordered "left to right"

2. Breadth-first search: Expand all the nodes of one level first.

Pseudocode for Breadth-First Search
  Initialize: Let Q = {S}
  While Q is not empty
    pull Q1, the first element in Q
    if Q1 is a goal
      report(success) and quit
    else
      child_nodes = expand(Q1)
      eliminate child_nodes which represent loops
      put remaining child_nodes at the back of Q
    end
  Continue

• Comments
  – another specific example of the general search-tree method
  – open nodes are stored in a queue Q of nodes
  – differs from depth-first only in that new unexpanded nodes are put at the back of the queue
  – convention again is that nodes are ordered "left to right"
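A short Python version of the two procedures above; the only difference is whether child nodes go at the front or the back of Q. The graph used for the demonstration is a made-up example:

def search(graph, start, goal, depth_first=True):
    Q = [[start]]                                     # queue of open paths
    while Q:
        path = Q.pop(0)                               # pull Q1, the first element in Q
        if path[-1] == goal:
            return path                               # report(success)
        children = [n for n in graph.get(path[-1], []) if n not in path]   # eliminate loops
        new_paths = [path + [c] for c in children]
        Q = new_paths + Q if depth_first else Q + new_paths   # front for DFS, back for BFS

graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["G"]}
print(search(graph, "S", "G", depth_first=True))      # ['S', 'A', 'C', 'G']
print(search(graph, "S", "G", depth_first=False))     # ['S', 'B', 'G']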

Q9. Differentiate between
a. Heuristic and Brute-force search
b. LISP and PROLOG

Ans. (a) Heuristic search:


Heuristic means involving or serving as an aid to learning, discovery, or problem-solving by experimental, especially trial-and-error, methods. A heuristic technique improves the efficiency of a search process, possibly by sacrificing claims of completeness or optimality. A heuristic is a method that might not always find the best solution but can find a good solution in reasonable time. By sacrificing completeness it increases efficiency. Heuristics are useful in solving tough problems which
  o could not be solved any other way, or
  o whose exact solutions would take an infinite or very long time to compute.
Heuristics are a response to combinatorial explosion; optimal solutions are rarely needed.

Heuristic Search methods

Generate and Test Algorithm
1. Generate a possible solution, which can either be a point in the problem space or a path from the initial state.
2. Test to see if this possible solution is a real solution by comparing the state reached with the set of goal states.
3. If it is a real solution, return. Otherwise repeat from step 1.

This method is basically a depth-first search, as complete solutions must be created before testing. It is often called the British Museum method, as it is like looking for an exhibit at random. A heuristic is needed to sharpen up the search. Consider the problem of four 6-sided cubes, where each side of each cube is painted in one of four colours. The four cubes are placed next to one another and the problem lies in arranging them so that the four available colours are displayed whichever way the four cubes are viewed. The problem can only be solved if there are at least four sides coloured in each colour, and the number of options tested can be reduced using heuristics, e.g. if the most popular colour is hidden by the adjacent cube.

Hill climbing
Here the generate-and-test method is augmented by a heuristic function which measures the closeness of the current state to the goal state.

1. Evaluate the initial state. If it is a goal state, quit; otherwise make the initial state the current state.
2. Select a new operator for this state and generate a new state.
3. Evaluate the new state:
   o if it is closer to the goal state than the current state, make it the current state;
   o if it is no better, ignore it.
4. If the current state is a goal state or no new operators are available, quit.

The Travelling Salesman Problem

“A salesman has a list of cities, each of which he must visit exactly once. There are direct roads between each pair of cities on the list. Find the route the salesman should follow for the shortest possible round trip that both starts and finishes at any one of the cities.”


Nearest neighbour heuristic:
1. Select a starting city.
2. Among the cities not yet visited, select the one closest to the current city and go there.
3. Repeat step 2 until all cities have been visited.
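A small Python sketch of the nearest-neighbour heuristic, assuming the cities are given as 2-D coordinates (the coordinates below are made up for illustration):

import math

def nearest_neighbour_tour(cities, start):
    # cities: {name: (x, y)}; returns a round trip starting and finishing at `start`.
    dist = lambda a, b: math.dist(cities[a], cities[b])
    tour, unvisited = [start], set(cities) - {start}
    while unvisited:
        # Step 2: among the cities not yet visited, pick the one closest to the current city.
        nxt = min(unvisited, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour + [start]                              # close the round trip

cities = {"A": (0, 0), "B": (1, 5), "C": (2, 1), "D": (5, 2)}
print(nearest_neighbour_tour(cities, "A"))             # ['A', 'C', 'D', 'B', 'A']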

Brute force search
The most general search algorithms are brute-force searches, since they do not require any domain-specific knowledge. All that is required for a brute-force search is a state description, a set of legal operators, an initial state, and a description of the goal state. Brute-force search is therefore also called uninformed search or blind search. Brute-force search should proceed in a systematic way, by exploring nodes in some predetermined order or simply by selecting nodes at random. Search programs either return only a solution value when a goal is found, or record and return the solution path.

Brute-force search, also known as exhaustive search, is the simplest and crudest of all possible heuristics: it means checking every single point in the function space.

Nearly all thought about heuristics pertains to how to find a solution, an optimum, or a pretty good combination without searching every point in the design space. Thus brute-force search is the null heuristic.

It's what you do when you don't know of any heuristic that could simplify the problem. That said, though, brute force always has the last word. However you whittle down your search space, you still must examine one possibility at a time, even in your much-reduced search space.

Using computers, brute-force search is becoming more and more feasible for more kinds of problems. Brute-force search always has the advantage that it requires no imagination or cleverness.

There is none of the metaheuristic problem of finding a good heuristic. If you want results fast, and the problem is small enough that brute-force search can find the solution, brute-force search is the way to go.

No matter how fast computers get, though, the vast majority of interesting problems will never submit to brute force.

The reason is that in many problems, the number of combinations grows so quickly that even if all the universe's matter were converted into the fastest computers, it would still take more years to find the solution than there are sub-atomic particles in the universe.

For example, brute force cannot figure out optimal play in chess. There are 20 possible opening moves in chess, and approximately that number of possible responses. 20 x 20 x 20 x 20 x ... soon multiplies out to a vaster number than anything that any computer could ever deal with.

But, sometimes you can find a clever way to use brute force on part of a problem. And often that's a huge advance.

(b) LISP :


Lisp (or LISP) is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language still in widespread use today. The name LISP derives from "LISt Processing". Linked lists are one of the Lisp languages' major data structures, and Lisp source code is itself made up of lists. As a result, Lisp programs can manipulate source code as a data structure, giving rise to the macro systems that allow programmers to create new syntax or even new domain-specific languages embedded in Lisp.

Versions of LISP

• Lisp is an old language with many variants

• Lisp is alive and well today

• Most modern versions are based on Common Lisp

• LispWorks is based on Common Lisp

• Scheme is one of the major variants

• The essentials haven’t changed much

Recursion

• Recursion is essential in Lisp

• A recursive definition is a definition in which
  – certain things are specified as belonging to the category being defined, and
  – a rule or rules are given for building new things in the category from other things already known to be in the category.

Informal Syntax

• An atom is either an integer or an identifier.

• A list is a left parenthesis, followed by zero or more S-expressions, followed by a right parenthesis.

• An S-expression is an atom or a list.

• Example: (A (B 3) (C) ( ( ) ) )

Formal Syntax (approximate)

• <S-expression> ::= <atom> | <list>

• <atom> ::= <number> | <identifier>

• <list> ::= ( <S-expressions> )

• <S-expressions > ::= <empty> | <S-expressions > <S-expression>

• <number> ::= <digit> | <number> <digit>

• <identifier> ::= string of printable characters, not including parentheses

T and NIL

• NIL is the name of the empty list


• As a test, NIL means “false”

• T is usually used to mean “true,” but…

• …anything that isn’t NIL is “true”

• NIL is both an atom and a list

– it’s defined this way, so just accept it

Function calls and data

• A function call is written as a list
  – the first element is the name of the function
  – remaining elements are the arguments
• Example: (F A B)
  – calls function F with arguments A and B
• Data is written as atoms or lists
• Example: (F A B) is a list of three elements
  – Do you see a problem here?

Basic Functions

• CAR returns the head of a list

• CDR returns the tail of a list

• CONS inserts a new head into a list

• EQ compares two atoms for equality

• ATOM tests if its argument is an atom

Other useful Functions

• (NULL S) tests if S is the empty list

• (LISTP S) tests if S is a list

• LIST makes a list of its (evaluated) arguments

  – (LIST 'A '(B C) 'D) returns (A (B C) D)
  – (LIST (CDR '(A B)) 'C) returns ((B) C)

• APPEND concatenates two lists

– (APPEND '(A B) '((X) Y) ) returns (A B (X) Y)

L                (CAR L)      (CDR L)      (CONS (CAR L) (CDR L))
(A B C)          A            (B C)        (A B C)
( (X Y) Z)       (X Y)        (Z)          ( (X Y) Z)
(X)              X            ( )          (X)
( ( ) ( ) )      ( )          ( ( ) )      ( ( ) ( ) )
( )              undefined    undefined    undefined

ATOM

• ATOM takes any S-expression as an argument

• ATOM returns "true" if the argument you gave it is an atom

• As with any predicate, ATOM returns either NIL or something that isn't NIL

COND

• COND implements the if...then...elseif...then...elseif...then... control structure

• The arguments to a function are evaluated before the function is called– This isn't what you want for COND

• COND is a special form, not a function

Defining Functions

• (DEFUN function_name parameter_list function_body )

• Example: Test if the argument is the empty list

• (DEFUN NULL (X) (COND (X NIL) (T T) ) )

Rules for Recursion

• Handle the base (“simplest”) cases first

• Recur only with a “simpler” case– “Simpler” = more like the base case

• Don’t alter global variables (you can’t anyway with the Lisp functions)

• Don’t look down into the recursion

Guidelines for Lisp Functions

• Unless the function is trivial, start with COND.

• Handle the base case first.

• Avoid having more than one base case.

• The base case is usually testing for NULL.

• Do something with the CAR and recur with the CDR.

PROLOG:
Prolog is a general-purpose logic programming language associated with artificial intelligence and computational linguistics. Prolog has its roots in formal logic, and unlike many


other programming languages, Prolog is declarative: the program logic is expressed in terms of relations, represented as facts and rules. A computation is initiated by running a query over these relations. Prolog was one of the first logic programming languages, and remains among the most popular such languages today, with many free and commercial implementations available. While initially aimed at natural language processing, the language has since stretched far into other areas like theorem proving, expert systems and games. Prolog is the major example of a fourth-generation programming language supporting the declarative programming paradigm. The programs in this tutorial are written in 'standard' University of Edinburgh Prolog, as specified in the classic Prolog textbook by Clocksin and Mellish (1981, 1992).

What Is Prolog?

Prolog is a logic-based language

With a few simple rules, information can be analyzed.

Syntax
.pl files contain lists of clauses. Clauses can be either facts or rules:

male(bob).
male(harry).
child(bob,harry).
son(X,Y) :-
    male(X),
    child(X,Y).

Rules
Rules combine facts to increase knowledge of the system:

son(X,Y) :-
    male(X),
    child(X,Y).

X is a son of Y if X is male and X is a child of Y


UNIT-II

Q10. Explain Bayes' theorem with an example.
Ans. Thomas Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes' theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

P(A|B) = P(B|A) * P(A) / P(B)

Each term in Bayes' theorem has a conventional name:
• P(A) is the prior probability or marginal probability of A. It is "prior" in the sense that it does not take into account any information about B.
• P(A|B) is the conditional probability of A given B. It is also called the posterior probability because it is derived from, or depends upon, the specified value of B.
• P(B|A) is the conditional probability of B given A. It is also called the likelihood.
• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.

Most students understand that the probability of an event occurring can be influenced by another event that has already occurred. However, many students cannot understand that the probability of an event occurring can actually be dependent on an event that occurred later. Having information about the outcome of an event can be used to revise probabilities of the occurrence of a previous event. This lesson plan will help teachers to correct this common student misconception. For students with more probability experience, this lesson also introduces Bayes' Theorem. Bayes' Theorem provides a formula to find one conditional probability when other conditional probabilities are known. More specifically, it can be used to find P(A|B) if P(B|A) is known. Bayes' Theorem is usually expressed as:

P(A|B) = P(A,B) / P(B) = [P(B|A) * P(A)] / [P(B|A) * P(A) + P(B|~A) * P(~A)]


A simple example of Bayes' theorem
Suppose there is a school with 60% boys and 40% girls as its students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance, and all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem. The event A is that the student observed is a girl, and the event B is that the student observed is wearing trousers. To compute P(A|B), we first need to know:


P(A), or the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.

P(B|A), or the probability of the student wearing trousers given that the student is a girl. Since they are as likely to wear skirts as trousers, this is 0.5.

P(B), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since half of the girls and all of the boys are wearing trousers, this is 0.5×0.4 + 1.0×0.6 = 0.8.

Given all this information, the probability of the observer having spotted a girl, given that the observed student is wearing trousers, can be computed by substituting these values into the formula:

P(A|B) = P(B|A) * P(A) / P(B) = (0.5 * 0.4) / 0.8 = 0.25

Another, essentially equivalent way of obtaining the same result is as follows. Assume, for concreteness, that there are 100 students, 60 boys and 40 girls. Among these, 60 boys and 20 girls wear trousers. Altogether there are 80 trouser-wearers, of which 20 are girls. Therefore the chance that a random trouser-wearer is a girl equals 20/80 = 0.25. Put in terms of Bayes' theorem, the probability of a student being a girl is 40/100, and the probability that any given girl will wear trousers is 1/2. The product of these two is 20/100; but we know the student is wearing trousers, so we restrict attention to the 80/100 trouser-wearers and calculate a probability of (20/100)/(80/100), or 20/80. It is often helpful when calculating conditional probabilities to create a simple table containing the number of occurrences of each outcome, or the relative frequencies of each outcome, for each of the independent variables. The table below illustrates the use of this method for the above girl-or-boy example.

           Girls   Boys   Total
Trousers     20      60      80
Skirts       20       0      20
Total        40      60     100
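A few lines of Python reproducing the calculation, using exactly the numbers from the example above:

p_girl = 0.4                          # P(A): prior probability that the student is a girl
p_trousers_given_girl = 0.5           # P(B|A): girls wear trousers half the time
p_trousers = 0.5 * 0.4 + 1.0 * 0.6    # P(B): total probability of trousers = 0.8

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_girl_given_trousers = p_trousers_given_girl * p_girl / p_trousers
print(p_girl_given_trousers)          # 0.25 (up to floating-point rounding)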

Bayes' theorem derived via conditional probabilities
To derive Bayes' theorem, start from the definition of conditional probability. The probability of the event A given the event B is

P(A|B) = P(A,B) / P(B)

Equivalently, the probability of the event B given the event A is

P(B|A) = P(A,B) / P(A)

Rearranging and combining these two equations gives P(A|B) P(B) = P(A,B) = P(B|A) P(A), and dividing by P(B) yields Bayes' theorem.

Q 11. What are the issues involved in representation of knowledge?


Ans. Below are listed issues that should be raised when using a knowledge representation technique:

Important Attributes -- Are there any attributes that occur in many different types of problem? There are two: instance and isa. Each is important because each supports property inheritance.

Single-Valued Attributes -- Possible approaches:
- Introduce an explicit notation for temporal intervals. If two different values are given for the same time, signal a contradiction automatically.
- Assume the only temporal interval is NOW, so if a new value comes in, replace the old value.
- Provide no explicit support. If an attribute has one value, then it is known not to have all other values.

Relationships -- What about the relationships between the attributes of an object, such as inverses, existence, and techniques for reasoning about values and single-valued attributes? We can consider an example of an inverse in band(John Zorn, Naked City). This can be treated as "John Zorn plays in the band Naked City" or "John Zorn's band is Naked City". Another representation is:
band = Naked City
band-members = John Zorn, Bill Frissell, Fred Frith, Joey Barron, ...

Granularity -- At what level should the knowledge be represented, and what are the primitives? Clearly the separate levels of understanding require different levels of primitives, and these need many rules to link together apparently similar primitives. Obviously there is a potential storage problem, and the underlying question must be what level of comprehension is needed.

Finding the right structures as needed: selecting an initial structure, and revising the choice when necessary.

The range of knowledge representation issues includes, but is not limited to:
1. measure of the KR approach's adequacy for the represented knowledge
2. measure of the knowledge's role with respect to the goal that is trying to be achieved
3. measure of the overall quality of knowledge within the knowledge representation
4. measure of knowledge uncertainty for knowledge utilization by the autonomous system
5. measure of the consistency of knowledge that is provided by autonomous software agents or by service providers
6. measure of the ontologies' role in autonomous systems

Q 12. Represent the following sentences in WFF / Predicate logic:

a. All gardeners like the sun.
   ∀x: gardener(x) -> likes(x, Sun)

b. Everyone is younger than his father.
   ∀x ∀y: father(y, x) -> younger(x, y)

c. John likes all kinds of food.
   ∀x: food(x) -> likes(John, x)

d. Everyone is loyal to someone.
   ∀x ∃y: loyalto(x, y)

e. Apple is food.
   food(Apple)

Q 13. Discuss Frames in detail.
Ans. A frame is a data structure for representing a stereotyped situation, like being in a certain kind of living room, or going to a child's birthday party. Attached to each frame are several kinds of information. Some of this information is about how to use the frame. Some is about what one can expect to happen next. Some is about what to do if these expectations are not confirmed.

A frame is a data structure introduced by Marvin Minsky in the 1970s that can be used for knowledge representation. Minsky frames are intended to help an artificial intelligence system recognize specific instances of patterns. Frames usually contain properties called attributes or slots. Slots may contain default values (subject to override by detecting a different value for an attribute), refer to other frames (component relationships) or contain methods for recognizing pattern instances. Frames are thus a machine-usable formalization of concepts or schemata. In contrast, the object-oriented paradigm partitions an information domain into abstraction hierarchies (classes and subclasses) rather than into component hierarchies, and is used to implement any kind of information processing. Frame Technology is loosely based on Minsky frames, its purpose being software synthesis rather than pattern analysis.

Like many other knowledge representation systems and languages, frames are an attempt to resemble the way human beings store knowledge. It seems that we store our knowledge in rather large chunks, and that different chunks are highly interconnected. In frame-based knowledge representation, knowledge describing a particular concept is organized as a frame. The frame usually contains a name and a set of slots. The slots describe the frame with attribute-value pairs <slotname value>, or alternatively a triple containing framename, slotname and value in some order. In many frame systems the slots are complex structures that have facets describing the properties of the slot. The value of a slot may be a primitive such as a text string or an integer, or it may be another frame. Most systems allow multiple values for slots and some systems support procedural attachments. These attachments can be used to compute the slot value, or they can be triggers used to perform consistency checking or updates of other slots. The triggers can be triggered by updates on slots.
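A minimal Python sketch of a frame with named slots, default values and property inheritance from a parent frame; the frame names and slots (vehicle, car, wheels, and so on) are invented for illustration:

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name, self.parent, self.slots = name, parent, slots

    def get(self, slot):
        # Property inheritance: if a slot is missing, ask the parent frame for its default.
        if slot in self.slots:
            return self.slots[slot]
        return self.parent.get(slot) if self.parent else None

vehicle = Frame("vehicle", wheels=4, moves=True)           # generic frame with default slot values
car = Frame("car", parent=vehicle, fuel="petrol")          # a more specific frame
my_car = Frame("my-car", parent=car, colour="red")         # an instance-like frame

print(my_car.get("colour"), my_car.get("fuel"), my_car.get("wheels"))   # red petrol 4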

Q 14. What are Rule-based Deduction systems? Also discuss certainty factors in detail.
Ans. The way in which a piece of knowledge is expressed by a human expert carries important information,


for example: if the person has fever and feels tummy-pain then she may have an infection. In logic it can be expressed as follows:

∀x. (has_fever(x) & tummy_pain(x) -> has_an_infection(x))

If we convert this formula to clausal form we lose the content, as we may then obtain equivalent formulas like:
(i) has_fever(x) & ~has_an_infection(x) -> ~tummy_pain(x)
(ii) ~has_an_infection(x) & tummy_pain(x) -> ~has_fever(x)
Notice that although (i) and (ii) are logically equivalent to the original sentence, they have lost the main information contained in its formulation.

Forward Production System:- The main idea behind forward/backward production systems is to take advantage of the implicational form in which production rules are stated by the expert, and to use that information to help in achieving the goal. In these systems the formulas have two forms: rules and facts. Rules are the productions stated in implication form; rules express specific knowledge about the problem. Facts are assertions not expressed as implications. The task of the system is to prove a goal formula from these facts and rules. In a forward production system the rules are expressed as F-rules. F-rules operate on the global database of facts until the termination condition is achieved. This sort of proving system is a direct system rather than a refutation system.

Facts
Facts are expressed in AND/OR form. An expression in AND/OR form consists of sub-expressions of literals connected by & and V symbols. An expression in AND/OR form is not in clausal form.

Steps to transform facts into AND/OR form for the forward system:
1. Eliminate (temporarily) implication symbols.
2. Reverse quantification of variables in the first disjunct by moving the negation symbol.
3. Skolemize existential variables.
4. Move all universal quantifiers to the front and drop them.
5. Rename variables so the same variable does not occur in different main conjuncts.

- Main conjuncts are small AND/OR trees, not necessarily sums of literal clauses as in Prolog.

Steps to transform the rules into quantifier-free form:
1. Eliminate (temporarily) implication symbols.
2. Reverse quantification of variables in the first disjunct by moving the negation symbol.
3. Skolemize existential variables.
4. Move all universal quantifiers to the front and drop them.


5. Restore implication.
All variables appearing in the final expressions are assumed to be universally quantified.
E.g. original formula: ∀x. (∃y. ∀z. (p(x, y, z)) -> ∀u. q(x, u))
Converted formula: p(x, y, f(x, y)) -> q(x, u).

Backward Production System:-
We restrict B-rules to expressions of the form W ==> L, where W is an expression in AND/OR form and L is a literal, and the scope of quantification of any variables in the implication is the entire implication. Recall that W ==> (L1 & L2) is equivalent to the two rules W ==> L1 and W ==> L2. An important property of logic is the duality between assertions and goals in theorem-proving systems. This duality allows the goal expression to be treated as if it were an assertion.

Conversion of the goal expression into AND/OR form:
1. Eliminate implication symbols.
2. Move negation symbols in.
3. Skolemize existential variables.
4. Drop existential quantifiers. Variables remaining in the AND/OR form are considered to be existentially quantified.

Goal clauses are conjunctions of literals, and the disjunction of these clauses is the clause form of the goal well-formed formula.

Q 15. What is Reasoning under Uncertainty?


Ans. Axioms of Probability Theory
Probability theory provides us with the formal mechanisms and rules for manipulating propositions represented probabilistically. The following are the three axioms of probability theory:
1. 0 <= P(A=a) <= 1 for all a in the sample space of A
2. P(True) = 1, P(False) = 0
3. P(A v B) = P(A) + P(B) - P(A ^ B)

From these axioms we can show that the following properties also hold:
- P(~A) = 1 - P(A)
- P(A) = P(A ^ B) + P(A ^ ~B)
- Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A


Conditional Probabilities
Conditional probabilities are key for reasoning because they formalize the process of accumulating evidence and updating probabilities based on new evidence. For example, if we know there is a 4% chance of a person having a cavity, we can represent this as the prior (aka unconditional) probability P(Cavity) = 0.04. Say that person now has the symptom of a toothache; we would like to know the posterior probability of a cavity given this new evidence, that is, compute P(Cavity | Toothache).

If P(A|B) = 1, this is equivalent to the sentence in propositional logic B => A. Similarly, if P(A|B) = 0.9, then this is like saying B => A with 90% certainty. In other words, we have made implication fuzzy because it is not absolutely certain.

Given several measurements and other "evidence" E1, ..., Ek, we will formulate queries as P(Q | E1, E2, ..., Ek), meaning "what is the degree of belief that Q is true given that we know E1, ..., Ek and nothing else."

Conditional probability is defined as: P(A|B) = P(A ^ B)/P(B) = P(A,B)/P(B). One way of looking at this definition is as a joint probability P(A,B) normalized by P(B).

Axioms of Probability TheoryProbability Theory provides us with the formal mechanisms and rules for manipulating propositions represented probabilistically. The following are the three axioms of probability theory:

0 <= P(A=a) <= 1 for all a in sample space of A P(True)=1, P(False)=0 P(A v B) = P(A) + P(B) - P(A ^ B)

Page 24: AI Model Test Paper With Answers

From these axioms we can show the following properties also hold: P(~A) = 1 - P(A) P(A) = P(A ^ B) + P(A ^ ~B) Sum{P(A=a)} = 1, where the sum is over all possible values a in the sample space of A

Conditional Probabilities Conditional probabilities are key for reasoning because they formalize the process of

accumulating evidence and updating probabilities based on new evidence. For example, if we know there is a 4% chance of a person having a cavity, we can represent this as the prior (aka unconditional) probability P(Cavity)=0.04. Say that person now has a symptom of a toothache, we'd like to know what is the posterior probability of a Cavity given this new evidence. That is, compute P(Cavity | Toothache).

If P(A|B) = 1, this is equivalent to the sentence in Propositional Logic B => A. Similarly, if P(A|B) =0.9, then this is like saying B => A with 90% certainty. In other words, we've made implication fuzzy because it's not absolutely certain.

Given several measurements and other "evidence", E1, ..., Ek, we will formulate queries as P(Q | E1, E2, ..., Ek) meaning "what is the degree of belief that Q is true given that we know E1, ..., Ek and nothing else."

Conditional probability is defined as: P(A|B) = P(A ^ B)/P(B) = P(A,B)/P(B) One way of looking at this definition is as a normalized (using P(B)) joint probability (P(A,B)).

Using Bayes's Rule
Bayes's Rule is the basis for probabilistic reasoning because, given a prior model of the world in the form of P(A) and a new piece of evidence B, Bayes's Rule says how the new piece of evidence decreases my ignorance about the world by defining P(A|B).

Combining Multiple Evidence using Bayes's Rule
Generalizing Bayes's Rule for two pieces of evidence, B and C, we get:
P(A|B,C) = P(A) P(B,C|A) / P(B,C) = P(A) * [P(B|A)/P(B)] * [P(C|A,B)/P(C|B)]

A is (unconditionally) independent of B if P(A|B) = P(A). In this case, P(A,B) = P(A)P(B).

A is conditionally independent of B given C if P(A|B,C) = P(A|C) and, symmetrically, P(B|A,C) = P(B|C). What this means is that if we know P(A|C), we also know P(A|B,C), so we don't need to store this case. Furthermore, it also means that P(A,B|C) = P(A|C)P(B|C).


Bayesian Networks
Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability Nets, are a space-efficient data structure for encoding all of the information in the full joint probability distribution for the set of random variables defining a domain. That is, from the Bayesian Net one can compute any value in the full joint probability distribution of the set of random variables.
- A Bayesian Net represents all of the direct causal relationships between variables. Intuitively, to construct a Bayesian net for a given set of variables, draw arcs from cause variables to immediate effects.
- It is space efficient because it exploits the fact that in many real-world problem domains the dependencies between variables are generally local, so there are a lot of conditionally independent variables.
- It captures both qualitative and quantitative relationships between variables.
- It can be used to reason
  o forward (top-down) from causes to effects -- predictive reasoning (aka causal reasoning)
  o backward (bottom-up) from effects to causes -- diagnostic reasoning

Formally, a Bayesian Net is a directed, acyclic graph (DAG), where there is a node for each random variable, and a directed arc from A to B whenever A is a direct causal influence on B. Thus the arcs represent direct causal relationships and the nodes represent states of affairs. The occurrence of A provides support for B, and vice versa. The backward influence is called "diagnostic" or "evidential" support for A due to the occurrence of B.

Each node A in a net is conditionally independent of any subset of nodes that are not descendants of A, given the parents of A.

Building a Bayesian NetIntuitively, "to construct a Bayesian Net for a given set of variables, we draw arcs from cause variables to immediate effects. In almost all cases, doing so results in a Bayesian network [whose conditional independence implications are accurate]." (Heckerman, 1996)More formally, the following algorithm constructs a Bayesian Net:

1. Identify a set of random variables that describe the given problem domain
2. Choose an ordering for them: X1, ..., Xn
3. for i = 1 to n do
   a. Add a new node for Xi to the net
   b. Set Parents(Xi) to be the minimal set of already added nodes such that we have conditional independence of Xi and all other members of {X1, ..., Xi-1} given Parents(Xi)
   c. Add a directed arc from each node in Parents(Xi) to Xi
   d. If Xi has at least one parent, then define a conditional probability table at Xi: P(Xi = x | possible assignments to Parents(Xi)). Otherwise, define a prior probability at Xi: P(Xi)
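A minimal Python sketch of the resulting data structure, assuming a toy two-node net (Cavity -> Toothache) with invented probabilities; the dictionary-of-CPTs layout is one possible encoding, not a prescribed format.

# Each node stores its parents and a conditional probability table (CPT).
# All probabilities below are invented for illustration.
bayes_net = {
    "Cavity":    {"parents": [],         "cpt": {(): 0.04}},        # prior P(Cavity)
    "Toothache": {"parents": ["Cavity"], "cpt": {(True,): 0.8,      # P(Toothache | Cavity)
                                                 (False,): 0.01}},  # P(Toothache | ~Cavity)
}

def prob(node, value, parent_values):
    """P(node=value | parents=parent_values) read from the stored CPT."""
    p_true = bayes_net[node]["cpt"][tuple(parent_values)]
    return p_true if value else 1.0 - p_true

# Joint P(Cavity=True, Toothache=True) via the chain rule over the net:
joint = prob("Cavity", True, ()) * prob("Toothache", True, (True,))
print(joint)  # 0.032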

Q 16. Explain Temporal Reasoning.
Ans. Temporal reasoning is only one of the components of CEP (complex event processing). CEP is about processing a large amount of events and identifying the meaningful events out of the event cloud; it uses techniques such as detection of complex patterns. Temporal reasoning aims at describing the common-sense background knowledge on which our human perspective on physical reality is based. Methodologically, qualitative constraint calculi restrict the vocabulary of rich mathematical theories dealing with temporal or spatial entities such that specific aspects of these theories can be treated within decidable fragments with simple qualitative (non-metric) languages. Contrary to mathematical or physical theories about space and time, qualitative constraint calculi allow for rather inexpensive reasoning about entities located in space and time. For this reason, the limited expressiveness of qualitative representation formalisms is a benefit if such reasoning tasks need to be integrated in applications.
Temporal reasoning requires:

A CEP enabled engine (time and events)
Ability to express temporal relationships
A reference clock
Support of the temporal dimension


Temporal reasoning is widely used in AI, especially for natural language processing. Existing methods for temporal reasoning are extremely expensive in time and space, because complete graphs are used. One approach to temporal reasoning for expert systems in technical applications reduces the amount of time and space by using sequence graphs. A sequence graph consists of one or more sequence chains and other intervals that are connected only loosely with these chains. Sequence chains are based on the observation that in technical applications many events occur sequentially. The uninterrupted execution of technical processes for a long time is characteristic of technical applications, and it makes no sense to relate the first intervals in the application with the last ones. In sequence graphs, only those relations are stored that are needed for further propagation. In contrast to other algorithms which use incomplete graphs, no information is lost and the reduction of complexity is significant. Additionally, the representation is more transparent, because the "flow" of time is modelled.
Reasoning about space and time is a major field of interest in many areas of theoretical and applied AI, especially in the theory and application of temporal and spatial models in planning, high-level navigation of autonomous mobile robots, natural language understanding, temporal databases, and concurrent and distributed programming. The special track on spatio-temporal reasoning focuses on research and development aspects in the area of reasoning about models of space and time.

Q 17. Explain fuzzy reasoning.
Ans. Fuzzy logic is a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than exact. Fuzzy logic corresponds to "degrees of truth", while probabilistic logic corresponds to "probability, likelihood"; as these differ, fuzzy logic and probabilistic logic yield different models of the same real-world situations.
A fuzzy concept is a concept whose content, value, or boundaries of application can vary according to context or conditions, instead of being fixed once and for all. Usually this means the concept is vague, lacking a fixed, precise meaning, without however being meaningless altogether. It does have a meaning, or rather multiple meanings (it has different semantic associations), but these can become clearer only through further elaboration and specification, including a closer definition of the context in which they are used. Fuzzy concepts "lack clarity and are difficult to test or operationalize". In logic, fuzzy concepts are often regarded as concepts which in their application are neither completely true nor completely false, or which are partly true and partly false.
Complementing general questions of how to represent knowledge is the need to understand how knowledge can be used. In general, realistic problems have enormous associated spaces of possible solutions which must be explored (searched) to find an actual solution that meets the requirements of the problem. These spaces are much too large to be searched in their entirety, and ways must be found to focus or short-circuit the search for solutions if systems are to have any practical utility.
Linguistic variables
While variables in mathematics usually take numerical values, in fuzzy logic applications non-numeric linguistic variables are often used to facilitate the expression of rules and facts.


A linguistic variable such as age may have a value such as young or its antonym old. However, the great utility of linguistic variables is that they can be modified via linguistic hedges applied to primary terms. The linguistic hedges can be associated with certain functions.
Example
Fuzzy set theory defines fuzzy operators on fuzzy sets. The problem in applying this is that the appropriate fuzzy operator may not be known. For this reason, fuzzy logic usually uses IF-THEN rules, or constructs that are equivalent, such as fuzzy associative matrices.
Rules are usually expressed in the form:
IF variable IS property THEN action
For example, a simple temperature regulator that uses a fan might look like this:
IF temperature IS very cold THEN stop fan
IF temperature IS cold THEN turn down fan
IF temperature IS normal THEN maintain level
IF temperature IS hot THEN speed up fan
There is no "ELSE" – all of the rules are evaluated, because the temperature might be "cold" and "normal" at the same time to different degrees.
The AND, OR, and NOT operators of boolean logic exist in fuzzy logic, usually defined as the minimum, maximum, and complement; when they are defined this way, they are called the Zadeh operators. So for the fuzzy variables x and y:
NOT x = (1 - truth(x))
x AND y = minimum(truth(x), truth(y))
x OR y = maximum(truth(x), truth(y))
There are also other operators, more linguistic in nature, called hedges that can be applied. These are generally adverbs such as "very" or "somewhat", which modify the meaning of a set using a mathematical formula.
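A minimal Python sketch of the Zadeh operators and the fan rules above; the membership degrees assigned to the temperature terms are invented purely for illustration.

# Zadeh operators on fuzzy truth values in [0, 1].
def f_not(x):    return 1.0 - x
def f_and(x, y): return min(x, y)
def f_or(x, y):  return max(x, y)

# Assumed membership degrees for one temperature reading (illustrative only).
temperature = {"very cold": 0.0, "cold": 0.3, "normal": 0.6, "hot": 0.1}

# Every rule fires to the degree its antecedent is true; there is no ELSE.
rule_activations = {
    "stop fan":       temperature["very cold"],
    "turn down fan":  temperature["cold"],
    "maintain level": temperature["normal"],
    "speed up fan":   temperature["hot"],
}
print(rule_activations)

# The operators combine degrees of truth, e.g. "cold OR hot":
print(f_or(temperature["cold"], temperature["hot"]))  # 0.3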

Q 18. Explain the heuristic methods for Reasoning under uncertainty.
Ans. 1: Bayesian methods
The Bayesian methods have a number of advantages that indicate their suitability in uncertainty management. Most significant is their sound theoretical foundation in probability theory; thus, they are currently the most mature of all of the uncertainty reasoning methods. While Bayesian methods are more developed than the other uncertainty methods, they are not without faults.

1. They require a significant amount of probability data to construct a knowledge base. Furthermore, human experts are normally uncertain and uncomfortable about the probabilities they are providing.
2. What are the relevant prior and conditional probabilities based on? If they are statistically based, the sample sizes must be sufficient so the probabilities obtained are accurate. If human experts have provided the values, are the values consistent and comprehensive?
3. Often the type of relationship between the hypothesis and evidence is important in determining how the uncertainty will be managed. Reducing these associations to simple numbers removes relevant information that might be needed for successful reasoning about the uncertainties. For example, Bayesian-based medical diagnostic systems have failed to gain acceptance because physicians distrust systems that cannot provide explanations describing how a conclusion was reached (a feature difficult to provide in a Bayesian-based system).


4. The reduction of the associations to numbers also eliminates the use of this knowledge within other tasks. For example, the associations that would enable the system to explain its reasoning to a user are lost, as is the ability to browse through the hierarchy of evidences to hypotheses.
2: Certainty factors
The certainty factor is another method of dealing with uncertainty. This method was originally developed for the MYCIN system. One of the difficulties with the Bayesian method is that there are too many probabilities required, most of which could be unknown; the problem gets very bad when there are many pieces of evidence.
Besides the problem of amassing all the conditional probabilities for the Bayesian method, another major problem that appeared with medical experts was the relationship of belief and disbelief. At first sight, this may appear trivial since obviously disbelief is simply the opposite of belief. In fact, the theory of probability states that
P(H) + P(H') = 1 and so P(H) = 1 - P(H')
For the case of a posterior hypothesis that relies on evidence E:
(1) P(H | E) = 1 - P(H' | E)
However, when the MYCIN knowledge engineers began interviewing medical experts, they found that physicians were extremely reluctant to state their knowledge in the form of equation (1). For example, consider a MYCIN rule such as the following:
IF 1) The stain of the organism is gram positive, and
   2) The morphology of the organism is coccus, and
   3) The growth conformation of the organism is chains
THEN There is suggestive evidence (0.7) that the identity of the organism is streptococcus
This can be written in terms of posterior probability:
(2) P(H | E1 ∩ E2 ∩ E3) = 0.7
where the Ei correspond to the three patterns of the antecedent.
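The passage above does not give MYCIN's rule for combining evidence, but the commonly cited formula for combining two positive certainty factors is CF = CF1 + CF2(1 - CF1). A minimal Python sketch under that assumption:

def combine_positive_cfs(cf1, cf2):
    # Commonly cited MYCIN-style combination for two certainty factors in (0, 1].
    return cf1 + cf2 * (1.0 - cf1)

# Two rules each lend suggestive evidence (0.7 and 0.4) to the same hypothesis.
print(combine_positive_cfs(0.7, 0.4))  # 0.82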

3: Dempster-Shafer Theory
Here we discuss another method for handling uncertainty, called Dempster-Shafer theory. It evolved during the 1960s and 1970s through the efforts of Arthur Dempster and one of his students, Glenn Shafer. This theory was designed as a mathematical theory of evidence. The development of the theory has been motivated by the observation that probability theory is not able to distinguish between uncertainty and ignorance owing to incomplete information.


UNIT-III
Q 19. Explain the need of planning and also discuss the representation of planning.
Ans. Need of planning: Intelligent agents must be able to set goals and achieve them. They need a way to visualize the future (they must have a representation of the state of the world and be able to make predictions about how their actions will change it) and be able to make choices that maximize the utility (or "value") of the available choices. In classical planning problems, the agent can assume that it is the only thing acting on the world and it can be certain what the consequences of its actions may be. However, if this is not true, it must periodically check whether the world matches its predictions and change its plan as this becomes necessary, requiring the agent to reason under uncertainty.
Multi-agent planning uses the cooperation and competition of many agents to achieve a given goal. Emergent behavior such as this is used by evolutionary algorithms and swarm intelligence.
The representation of planning: An analysis of strategies, recognizable abstract patterns of planned behavior, highlights the difference between the assumptions that people make about their own planning processes and the representational commitments made in current automated planning systems.

Problem Solving – Planning (Newell and Simon 1956)
• Given the actions available in a task domain.
• Given a problem specified as:
  – an initial state of the world,
  – a set of goals to be achieved.
• Find a solution to the problem, i.e., a way to transform the initial state into a new state of the world where the goal statement is true.
– Action Model, State, Goals

Classical Deterministic Planning
• Action Model:
  – How to represent actions
  – Deterministic, correct, rich representation
• State: single initial state, fully known
• Goals: complete satisfaction

The Blocks World Definition – Actions

• Blocks are picked up and put down by the arm


• Blocks can be picked up only if they are clear, i.e., without any block on top
• The arm can pick up a block only if the arm is empty, i.e., if it is not holding another block; the arm can pick up only one block at a time
• The arm can put down blocks on blocks or on the table

Planning by "Plain" State Search
• Search from an initial state of the world to a goal state
• Enumerate all states of the world
• Connect states with legal actions
• Search for paths between initial and goal states

Planning – Generation
• Many plan generation algorithms:
  – Forward from state, backward from goals
  – Serial, parallel search
  – Logical satisfiability
  – Heuristic search

Planning – Actions and States
• Model of an action – a description of legal actions in the domain
  – "move queen", "open door if unlocked", "unstack if top is clear", ...
• Model of the state
  – Numerical identification (s1, s2, ...) – no information
  – "Symbolic" description: objects, predicates

STRIPS Action Representation
• Actions – operators – rules – with:
  – Precondition expression – must be satisfied before the operator is applied.
  – Set of effects – describe how the application of the operator changes the state.
• Precondition expression: propositional, typed first order predicate logic, negation, conjunction, disjunction, existential and universal quantification, and functions.
• Effects: add-list and delete-list.
• Conditional effects – dependent on a condition on the state when the action takes place.
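A minimal Python sketch of a STRIPS-style operator for the blocks-world pick-up action described above, written as a plain dictionary with a precondition list, an add-list, and a delete-list; the predicate names are illustrative, not a fixed standard.

# A ground STRIPS-style operator: preconditions, add-list, delete-list.
pickup_a = {
    "name": "pickup(A)",
    "preconditions": ["clear(A)", "ontable(A)", "armempty"],
    "add":           ["holding(A)"],
    "delete":        ["clear(A)", "ontable(A)", "armempty"],
}

def apply_operator(state, op):
    """Apply an operator to a state (a set of facts) if its preconditions hold."""
    if not all(p in state for p in op["preconditions"]):
        return None  # operator not applicable in this state
    return (state - set(op["delete"])) | set(op["add"])

state = {"clear(A)", "ontable(A)", "armempty"}
print(apply_operator(state, pickup_a))  # {'holding(A)'}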

Many Planning "Domains"
• Web management agents
• Robot planning
• Manufacturing planning
• Image processing management
• Logistics transportation
• Crisis management
• Bank risk management
• Blocks world
• Puzzles
• Artificial domains


Q 20. What is learning? Explain various types of learning in detail.
Ans. One of the most often heard criticisms of AI is that machines cannot be called intelligent until they are able to learn to do new things and adapt to new situations, rather than simply doing as they are told to do.

Some critics of AI have been saying that computers cannot learn!
Definition of learning: changes in the system that are adaptive in the sense that they enable the system to do the same task, or tasks drawn from the same population, more efficiently and more effectively the next time.
Learning covers a wide range of phenomena:
Skill refinement: practice makes skills improve; the more you play tennis, the better you get.
Knowledge acquisition: knowledge is generally acquired through experience.

Various types of learning:
1. Rote Learning
2. Learning in Problem Solving
3. Winston's Learning Program
4. Explanation-Based Learning

Rote Learning
When a computer stores a piece of data, it is performing a form of learning. In the case of data caching, we store computed values so that we do not have to recompute them later. When computation is more expensive than recall, this strategy can save a significant amount of time. Caching has been used in AI programs to produce some surprising performance improvements. Such caching is known as rote learning. Rote learning does not involve any sophisticated problem-solving capabilities. It shows the need for some capabilities required of complex learning systems, such as:
Organized storage of information
Generalization
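A minimal Python sketch of rote learning as caching: computed values are stored the first time and simply recalled afterwards, which pays off whenever computation is more expensive than recall. The Fibonacci function is just a stand-in for any expensive computation.

cache = {}  # organized storage of previously computed results

def fib(n):
    """An expensive computation, stood in for by naive Fibonacci."""
    if n in cache:             # recall: no recomputation needed
        return cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    cache[n] = result          # rote learning: store the computed value
    return result

print(fib(30))  # fast, because intermediate results are cached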

Learning in Problem Solving
Can a program get better without the aid of a teacher? It can, by generalizing from its own experiences.

Winston's Learning Program
An early structural concept learning program. This program operates in a simple blocks world domain. Its goal was to construct representations of the definitions of concepts in the blocks domain. For example, it learned the concepts House, Tent and Arch. A near miss is an object that is not an instance of the concept in question but that is very similar to such instances.
Basic approach of Winston's program:


1. Begin with a structural description of one known instance of the concept. Call that description the concept definition.

2. Examine descriptions of other known instances of the concepts. Generalize the definition to include them.

3. Examine the descriptions of near misses of the concept. Restrict the definition to exclude these.

Explanation-Based Learning

Learning from Examples: Induction

Classification is the process of assigning, to a particular input, the name of a class to which it belongs.

The classes from which the classification procedure can choose can be described in a variety of ways.

Their definition will depend on the use to which they are put. Classification is an important component of many problem solving tasks.

Before classification can be done, the classes it will use must be defined. Two common approaches are:
1. Isolate a set of features that are relevant to the task domain, and define each class by a weighted sum of values of these features. For example, if the task is weather prediction, the parameters can be measurements such as rainfall, location of cold fronts, etc.
2. Isolate a set of features that are relevant to the task domain, and define each class as a structure composed of these features. For example, when classifying animals, the features can be such things as color, length of neck, etc.

The idea of producing a classification program that can evolve its own class definitions is called concept learning or induction.


Q 21. Explain partial order planning algorithm.
Ans. Partial-order planning is an approach to automated planning. The basic idea is to leave the decision about the order of the actions as open as possible. Given a problem description, a partial-order plan is a set of all needed actions together with ordering conditions on the actions, imposed only where needed. The approach is inspired by the least-commitment strategy. In many cases there are many possible plans for a problem which differ only in the order of the actions. Many traditional automated planners search for plans in the full search space containing all possible orders. Besides the smaller search space of partial-order planning, leaving the choice about the order of the actions open for later can have further advantages.
Partial-order plan
A partial-order plan consists of four components:

A set of actions.
A partial order for the actions. It specifies conditions about the order of some actions.
A set of causal links. It describes which actions meet which preconditions of other actions.
A set of open preconditions, i.e., those preconditions which are not fulfilled by any action in the partial-order plan.
If you want to keep the possible orders of the actions as open as possible, you want to have the set of order conditions as small as possible. A plan is a solution if the set of open preconditions is empty.
Partial-order planner
A partial-order planner is an algorithm or program which constructs a plan and searches for a solution. The input is the problem description, consisting of descriptions of the initial state, the goal and the possible actions. The problem can be interpreted as a search problem where the set of possible partial-order plans is the search space. The initial state would be the plan with the open preconditions equal to the goal conditions. The final state would be any plan with no open preconditions, i.e., a solution.
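A minimal Python sketch of the four components of a partial-order plan as plain data; the Start/Finish dummy actions and the particular field names are illustrative conventions, not a fixed API.

# A partial-order plan represented by its four components.
plan = {
    "actions": {"Start", "RightSock", "RightShoe", "Finish"},
    # ordering constraints: (A, B) means A must come before B
    "orderings": {("Start", "RightSock"), ("RightSock", "RightShoe"),
                  ("RightShoe", "Finish")},
    # causal links: (producer, condition, consumer)
    "causal_links": {("RightSock", "RightSockOn", "RightShoe"),
                     ("RightShoe", "RightShoeOn", "Finish")},
    # preconditions not yet supplied by any action
    "open_preconditions": set(),
}

def is_solution(plan):
    # A plan is a solution when no open preconditions remain.
    return not plan["open_preconditions"]

print(is_solution(plan))  # True for this toy plan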

Q 22. Explain Neural Network. Also explain the strengths and weaknesses of NN.
Ans: An artificial neural network (ANN), usually called "neural network" (NN), is a mathematical model or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data. In an artificial neural network simple artificial nodes, called variously "neurons", "neurodes", "processing elements" (PEs) or "units", are connected together to form a network of nodes mimicking the biological neural networks — hence the term "artificial neural network".
Neural networks are being used:
in investment analysis:

to attempt to predict the movement of stocks, currencies, etc., from previous data. There, they are replacing earlier, simpler linear models.

in signature analysis:
as a mechanism for comparing signatures made (e.g. in a bank) with those stored. This is one of the first large-scale applications of neural networks in the USA, and is also one of the first to use a neural network chip.

in process control:
there are clearly applications to be made here: most processes cannot be determined as computable algorithms. Newcastle University Chemical Engineering Department is working with industrial partners (such as Zeneca and BP) in this area.

in monitoring:
networks have been used to monitor

the state of aircraft engines. By monitoring vibration levels and sound, early warning of engine problems can be given.

British Rail have also been testing a similar application monitoring diesel engines.
in marketing:

networks have been used to improve marketing mailshots. One technique is to run a test mailshot, and look at the pattern of returns from this. The idea is to find a predictive mapping from the data known about the clients to how they have responded. This mapping is then used to direct further mailshots.

Strengths and Weaknesses of Neural Networks
The greatest strength of neural networks is their ability to accurately predict outcomes of complex problems. In accuracy tests against other approaches, neural networks generally score very highly.
There are, however, some downfalls to neural networks.

1) First, they have been criticized as being useful for prediction, but not always for understanding a model. It is true that early implementations of neural networks were criticized as "black box" prediction engines; however, with the new tools on the market today, this criticism is debatable.
2) Secondly, neural networks are susceptible to over-training. If a network with a large capacity for learning is trained using too few data examples to support that capacity, it may simply memorize the training examples and generalize poorly to new data.

Q 23. Discuss Genetic algorithm.
Ans. A genetic algorithm (or GA for short) is a programming technique that mimics biological evolution as a problem-solving strategy. Given a specific problem to solve, the input to the GA is a set of potential solutions to that problem, encoded in some fashion, and a metric called a fitness function that allows each candidate to be quantitatively evaluated. These candidates may be solutions already known to work, with the aim of the GA being to improve them, but more often they are generated at random.

The GA then evaluates each candidate according to the fitness function. In a pool of randomly generated candidates, of course, most will not work at all, and these will be deleted. However, purely by chance, a few may hold promise - they may show activity, even if only weak and imperfect activity, toward solving the problem.

These promising candidates are kept and allowed to reproduce. Multiple copies are made of them, but the copies are not perfect; random changes are introduced during the copying process. These digital offspring then go on to the next generation, forming a new pool of candidate solutions, and are subjected to a second round of fitness evaluation. Those candidate solutions which were worsened, or made no better, by the changes to their code are again deleted; but again, purely by chance, the random variations introduced into the population may have improved some individuals, making them into better, more complete or more efficient solutions to the problem at hand. Again these winning individuals are selected and copied over into the next generation with random changes, and the process repeats. The expectation is that the average fitness of the population will increase each round, and so by repeating this process for hundreds or thousands of rounds, very good solutions to the problem can be discovered.

As astonishing and counterintuitive as it may seem to some, genetic algorithms have proven to be an enormously powerful and successful problem-solving strategy, dramatically demonstrating the power of evolutionary principles. Genetic algorithms have been used in a wide variety of fields to evolve solutions to problems as difficult as or more difficult than those faced by human designers. Moreover, the solutions they come up with are often more efficient, more elegant, or more complex than anything comparable a human engineer would produce. In some cases, genetic algorithms have come up with solutions that baffle the programmers who wrote the algorithms in the first place!
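A minimal Python sketch of the generate-evaluate-select-vary loop just described, using bit-string candidates and an invented fitness function (the count of 1 bits) purely for illustration.

import random

def fitness(candidate):
    # Illustrative fitness: number of 1 bits in the string.
    return sum(candidate)

def evolve(pop_size=20, length=16, generations=50, mutation_rate=0.05):
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half of the pool, discard the rest.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Refill the population with imperfect (mutated) copies of the survivors.
        offspring = []
        for parent in survivors:
            child = [1 - bit if random.random() < mutation_rate else bit for bit in parent]
            offspring.append(child)
        population = survivors + offspring
    return max(population, key=fitness)

print(fitness(evolve()))  # close to 16 after a few generations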

Methods of representation

Before a genetic algorithm can be put to work on any problem, a method is needed to encode potential solutions to that problem in a form that a computer can process. One common approach is to encode solutions as binary strings: sequences of 1's and 0's, where the digit at each position represents the value of some aspect of the solution. Another, similar approach is to encode solutions as arrays of integers or decimal numbers, with each position again representing some particular aspect of the solution. This approach allows for greater precision and complexity than the comparatively restricted method of using binary numbers only and often "is intuitively closer to the problem space" (Fleming and Purshouse 2002, p. 1228).

This technique was used, for example, in the work of Steffen Schulze-Kremer, who wrote a genetic algorithm to predict the three-dimensional structure of a protein based on the sequence of amino acids that go into it (Mitchell 1996, p. 62). Schulze-Kremer's GA used real-valued numbers to represent the so-called "torsion angles" between the peptide bonds that connect amino acids. (A protein is made up of a sequence of basic building blocks called amino acids, which are joined together like the links in a chain. Once all the amino acids are linked, the protein folds up into a complex three-dimensional shape based on which amino acids attract each other and which ones repel each other. The shape of a protein determines its function.) Genetic algorithms for training neural networks often use this method of encoding also.

A third approach is to represent individuals in a GA as strings of letters, where each letter again stands for a specific aspect of the solution. One example of this technique is Hiroaki Kitano's "grammatical encoding" approach, where a GA was put to the task of evolving a simple set of rules called a context-free grammar that was in turn used to generate neural networks for a variety of problems (Mitchell 1996, p. 74).

The virtue of all three of these methods is that they make it easy to define operators that cause the random changes in the selected candidates: flip a 0 to a 1 or vice versa, add or subtract from the value of a number by a randomly chosen amount, or change one letter to another. (See the section on Methods of change for more detail about the genetic operators.) Another strategy, developed principally by John Koza of Stanford University and called genetic programming, represents programs as branching data structures called trees (Koza et al. 2003, p. 35). In this approach, random changes can be brought about by changing the operator or altering the value at a given node in the tree, or replacing one subtree with another.


Figure 1: Three simple program trees of the kind normally used in genetic programming. The mathematical expression that each one represents is given underneath.

It is important to note that evolutionary algorithms do not need to represent candidate solutions as data strings of fixed length. Some do represent them in this way, but others do not; for example, Kitano's grammatical encoding discussed above can be efficiently scaled to create large and complex neural networks, and Koza's genetic programming trees can grow arbitrarily large as necessary to solve whatever problem they are applied to.

Methods of selection

There are many different techniques which a genetic algorithm can use to select the individuals to be copied over into the next generation, but listed below are some of the most common methods. Some of these methods are mutually exclusive, but others can be and often are used in combination.

Elitist selection: The most fit members of each generation are guaranteed to be selected. (Most GAs do not use pure elitism, but instead use a modified form where the single best, or a few of the best, individuals from each generation are copied into the next generation just in case nothing better turns up.)

Fitness-proportionate selection: More fit individuals are more likely, but not certain, to be selected.

Roulette-wheel selection: A form of fitness-proportionate selection in which the chance of an individual's being selected is proportional to the amount by which its fitness is greater or less than its competitors' fitness. (Conceptually, this can be represented as a game of roulette - each individual gets a slice of the wheel, but more fit ones get larger slices than less fit ones. The wheel is then spun, and whichever individual "owns" the section on which it lands each time is chosen.)

Scaling selection: As the average fitness of the population increases, the strength of the selective pressure also increases and the fitness function becomes more discriminating. This method can be helpful in making the best selection later on when all individuals have relatively high fitness and only small differences in fitness distinguish one from another.

Tournament selection: Subgroups of individuals are chosen from the larger population, and members of each subgroup compete against each other. Only one individual from each subgroup is chosen to reproduce.

Rank selection: Each individual in the population is assigned a numerical rank based on fitness, and selection is based on this ranking rather than absolute differences in fitness. The advantage of this method is that it can prevent very fit individuals from gaining dominance early at the expense of less fit ones, which would reduce the population's genetic diversity and might hinder attempts to find an acceptable solution.

Generational selection: The offspring of the individuals selected from each generation become the entire next generation. No individuals are retained between generations.

Steady-state selection: The offspring of the individuals selected from each generation go back into the pre-existing gene pool, replacing some of the less fit members of the previous generation. Some individuals are retained between generations.

Hierarchical selection: Individuals go through multiple rounds of selection each generation. Lower-level evaluations are faster and less discriminating, while those that survive to higher levels are evaluated more rigorously. The advantage of this method is that it reduces overall computation time by using faster, less selective evaluation to weed out the majority of individuals that show little or no promise, and only subjecting those who survive this initial test to more rigorous and more computationally expensive fitness evaluation.
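A minimal Python sketch of roulette-wheel (fitness-proportionate) selection, where each individual's slice of the wheel is proportional to its fitness; the names and fitness values are invented for illustration.

import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = random.uniform(0, total)       # where the spun wheel stops
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= spin:
            return individual
    return population[-1]                 # guard against floating-point rounding

pop = ["A", "B", "C"]
fits = [1.0, 3.0, 6.0]                    # C owns 60% of the wheel
print(roulette_select(pop, fits))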

Methods of change

Once selection has chosen fit individuals, they must be randomly altered in hopes of improving their fitness for the next generation. There are two basic strategies to accomplish this. The first and simplest is called mutation. Just as mutation in living things changes one gene to another, so mutation in a genetic algorithm causes small alterations at single points in an individual's code.
The second method is called crossover, and entails choosing two individuals to swap segments of their code, producing artificial "offspring" that are combinations of their parents. This process is intended to simulate the analogous process of recombination that occurs to chromosomes during sexual reproduction. Common forms of crossover include single-point crossover, in which a point of exchange is set at a random location in the two individuals' genomes, and one individual contributes all its code from before that point and the other contributes all its code from after that point to produce an offspring, and uniform crossover, in which the value at any given location in the offspring's genome is either the value of one parent's genome at that location or the value of the other parent's genome at that location, chosen with 50/50 probability.

Figure 2: Crossover and mutation. The above diagrams illustrate the effect of each of these genetic operators on individuals in a population of 8-bit strings. The upper diagram shows two individuals undergoing single-point crossover; the point of exchange is set between the fifth and sixth positions in the genome, producing a new individual that is a hybrid of its progenitors. The second diagram shows an individual undergoing mutation at position 4, changing the 0 at that position in its genome to a 1.
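A minimal Python sketch of single-point crossover and point mutation on bit strings, following the description above; the 8-bit parents are illustrative.

import random

def single_point_crossover(parent_a, parent_b):
    # One parent contributes everything before the cut point, the other everything after.
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(genome, rate=0.1):
    # Flip each bit independently with a small probability.
    return [1 - bit if random.random() < rate else bit for bit in genome]

a = [1, 1, 1, 1, 0, 0, 0, 0]
b = [0, 0, 0, 0, 1, 1, 1, 1]
child = mutate(single_point_crossover(a, b))
print(child)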


Other problem-solving techniques

With the rise of artificial life computing and the development of heuristic methods, other computerized problem-solving techniques have emerged that are in some ways similar to genetic algorithms. This section explains some of these techniques, in what ways they resemble GAs and in what ways they differ.

Neural networks

A neural network, or neural net for short, is a problem-solving method based on a computer model of how neurons are connected in the brain. A neural network consists of layers of processing units called nodes joined by directional links: one input layer, one output layer, and zero or more hidden layers in between. An initial pattern of input is presented to the input layer of the neural network, and nodes that are stimulated then transmit a signal to the nodes of the next layer to which they are connected. If the sum of all the inputs entering one of these virtual neurons is higher than that neuron's so-called activation threshold, that neuron itself activates, and passes on its own signal to neurons in the next layer. The pattern of activation therefore spreads forward until it reaches the output layer and is there returned as a solution to the presented input. Just as in the nervous system of biological organisms, neural networks learn and fine-tune their performance over time via repeated rounds of adjusting their thresholds until the actual output matches the desired output for any given input. This process can be supervised by a human experimenter or may run automatically using a learning algorithm (Mitchell 1996, p. 52). Genetic algorithms have been used both to build and to train neural networks.


Figure 3: A simple feedforward neural network, with one input layer consisting of four neurons, one hidden layer consisting of three neurons, and one output layer consisting of four neurons. The number on each neuron represents its activation threshold: it will only fire if it receives at least that many inputs. The diagram shows the neural network being presented with an input string and shows how activation spreads forward through the network to produce an output.
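A minimal Python sketch of the threshold-style feedforward pass described for Figure 3: each unit fires (outputs 1) only if the number of active inputs it receives reaches its threshold. The layer sizes, wiring, and thresholds below are invented, not the ones in the figure itself.

def layer_forward(inputs, connections, thresholds):
    """connections[j] lists the indices of inputs feeding unit j;
    a unit fires if the count of active inputs reaches its threshold."""
    return [1 if sum(inputs[i] for i in connections[j]) >= thresholds[j] else 0
            for j in range(len(thresholds))]

# Toy network: 4 inputs -> 3 hidden units -> 2 output units (invented wiring).
x = [1, 0, 1, 1]
hidden = layer_forward(x, connections=[[0, 1], [1, 2], [2, 3]], thresholds=[1, 2, 2])
output = layer_forward(hidden, connections=[[0, 1], [1, 2]], thresholds=[1, 2])
print(hidden, output)  # [1, 0, 1] [1, 0] for this input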

Hill-climbing

Similar to genetic algorithms, though more systematic and less random, a hill-climbing algorithm begins with one initial solution to the problem at hand, usually chosen at random. The string is then mutated, and if the mutation results in higher fitness for the new solution than for the previous one, the new solution is kept; otherwise, the current solution is retained. The algorithm is then repeated until no mutation can be found that causes an increase in the current solution's fitness, and this solution is returned as the result (Koza et al. 2003, p. 59). (To understand where the name of this technique comes from, imagine that the space of all possible solutions to a given problem is represented as a three-dimensional contour landscape. A given set of coordinates on that landscape represents one particular solution. Those solutions that are better are higher in altitude, forming hills and peaks; those that are worse are lower in altitude, forming valleys. A "hill-climber" is then an algorithm that starts out at a given point on the landscape and moves inexorably uphill.) Hill-climbing is what is known as a greedy algorithm, meaning it always makes the best choice available at each step in the hope that the overall best result can be achieved this way. By contrast, methods such as genetic algorithms and simulated annealing, discussed below, are not greedy; these methods sometimes make suboptimal choices in the hopes that they will lead to better solutions later on.
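A minimal Python sketch of the hill-climbing loop just described, applied to bit strings with the same illustrative count-the-ones fitness used earlier.

import random

def hill_climb(length=16, fitness=sum):
    current = [random.randint(0, 1) for _ in range(length)]
    improved = True
    while improved:
        improved = False
        # Try every single-bit mutation; keep the first one that improves fitness.
        for i in range(length):
            neighbour = current[:]
            neighbour[i] = 1 - neighbour[i]
            if fitness(neighbour) > fitness(current):
                current = neighbour
                improved = True
    return current  # no single mutation improves fitness any further

print(hill_climb())  # reaches the all-ones string for this fitness function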

Simulated annealing

Another optimization technique similar to evolutionary algorithms is known as simulated annealing. The idea borrows its name from the industrial process of annealing in which a material is heated to above a critical point to soften it, then gradually cooled in order to erase defects in its crystalline structure, producing a more stable and regular lattice arrangement of atoms (Haupt and Haupt 1998, p. 16). In simulated annealing, as in genetic algorithms, there is a fitness function that defines a fitness landscape; however, rather than a population of candidates as in GAs, there is only one candidate solution. Simulated annealing also adds the concept of "temperature", a global numerical quantity which gradually decreases over time. At each step of the algorithm, the solution mutates (which is equivalent to moving to an adjacent point of the fitness landscape). The fitness of the new solution is then compared to the fitness of the previous solution; if it is higher, the new solution is kept. Otherwise, the algorithm makes a decision whether to keep or discard it based on temperature. If the temperature is high, as it is initially, even changes that cause significant decreases in fitness may be kept and used as the basis for the next round of the algorithm, but as temperature decreases, the algorithm becomes more and more inclined to only accept fitness-increasing changes. Finally, the temperature reaches zero and the system "freezes"; whatever configuration it is in at that point becomes the solution. Simulated annealing is often used for engineering design applications such as determining the physical layout of components on a computer chip.
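A minimal Python sketch of the annealing loop described above. The exp(delta/temperature) acceptance probability for worsening moves is a common choice rather than something stated in the text, and the bit-string fitness is again the illustrative count of ones.

import math
import random

def simulated_annealing(length=16, fitness=sum, start_temp=5.0, cooling=0.95):
    current = [random.randint(0, 1) for _ in range(length)]
    temperature = start_temp
    while temperature > 0.01:
        # Mutate: move to an adjacent point on the fitness landscape.
        candidate = current[:]
        i = random.randrange(length)
        candidate[i] = 1 - candidate[i]
        delta = fitness(candidate) - fitness(current)
        # Always keep improvements; keep worsening moves with a probability
        # that shrinks as the temperature drops.
        if delta > 0 or random.random() < math.exp(delta / temperature):
            current = candidate
        temperature *= cooling  # gradual cooling
    return current

print(fitness(simulated_annealing()))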


Q 24. Write explanatory note on Explanation based learning.

Learning complex concepts using Induction procedures typically requires a substantial number of training instances.

But people seem to be able to learn quite a bit from single examples. We don't need to see dozens of positive and negative examples of fork (chess) positions in order to learn to avoid this trap in the future and perhaps use it to our advantage. What makes such single-example learning possible? The answer is knowledge.

Much of the recent work in machine learning has moved away from the empirical, data intensive approach described in the last section toward this more analytical knowledge intensive approach.

A number of independent studies led to the characterization of this approach as explanation-based learning (EBL).

An EBL system attempts to learn from a single example x by explaining why x is an example of the target concept.

The explanation is then generalized, and the system's performance is improved through the availability of this knowledge.

We can think of EBL programs as accepting the following as input:
A training example
A goal concept: a high-level description of what the program is supposed to learn
An operationality criterion: a description of which concepts are usable
A domain theory: a set of rules that describe relationships between objects and actions in a domain
From this, EBL computes a generalization of the training example that is sufficient to describe the goal concept, and also satisfies the operationality criterion.

Explanation-based generalization (EBG) is an algorithm for EBL and has two steps: (1) explain, (2) generalize

During the explanation step, the domain theory is used to prune away all the unimportant aspects of the training example with respect to the goal concept. What is left is an explanation of why the training example is an instance of the goal concept. This explanation is expressed in terms that satisfy the operationality criterion.

The next step is to generalize the explanation as far as possible while still describing the goal concept.


An Explanation-based Learning (EBL) system accepts an example (i.e. a training example) and explains what it learns from the example. The EBL system takes only the relevant aspects of the training example. This explanation is translated into a particular form that a problem-solving program can understand. The explanation is generalized so that it can be used to solve other problems.
PRODIGY is a system that integrates problem solving, planning, and learning methods in a single architecture. It was originally conceived by Jaime Carbonell and Steven Minton as an AI system to test and develop ideas on the role that machine learning plays in planning and problem solving. PRODIGY uses EBL to acquire control rules. The EBL module uses the results from the problem-solving trace (i.e. the steps in solving problems) that were generated by the central problem solver (a search engine that searches over a problem space). It constructs explanations using an axiomatized theory that describes both the domain and the architecture of the problem solver. The results are then translated as control rules and added to the knowledge base. The control knowledge that contains control rules is used to guide the search process effectively.
When an agent can utilize a worked example of a problem as a problem-solving method, the agent is said to have the capability of explanation-based learning (EBL). This is a type of analytic learning. The advantage of explanation-based learning is that, as a deductive mechanism, it requires only a single training example (inductive learning methods often require many training examples). However, to utilize just a single example most EBL algorithms require all of the following:

The training example
A Goal Concept
An Operationality Criterion
A Domain Theory

From the training example, the EBL algorithm computes a generalization of the example that is consistent with the goal concept and that meets the operationality criterion (a description of the appropriate form of the final concept). One criticism of EBL is that the required domain theory needs to be complete and consistent. Additionally, the utility of learned information is an issue when learning proceeds indiscriminately. Other forms of learning that are based on EBL are knowledge compilation, caching and macro-ops.


Q 25. Write explanatory note on learning by analogy

Analogy is a powerful inference tool.

Our language and reasoning are laden with analogies.

Last month, the stock market was a roller coaster.

Bill is like a fire engine.

Problems in electromagnetism are just like problems in fluid flow.

Underlying each of these examples is a complicated mapping between what appear to be dissimilar concepts.

For example, to understand the first sentence above, it is necessary to do two things:

Pick out one key property of a roller coaster, namely that it travels up and down rapidly

Realize that physical travel is itself an analogy for numerical fluctuations.

This is no easy trick.

The space of possible analogies is very large.

An AI program that is unable to grasp analogy will be difficult to talk to and consequently difficult to teach.

Thus analogical reasoning is an important factor in learning by advice taking.

Humans often solve problems by making analogies to things they already understand how to do.

Analogical learning generally involves developing a set of mappings between features of two instances. Paul Thagard and Keith Holyoak have developed a computational theory of analogical reasoning that is consistent with the outline above, provided that abstraction rules are provided to the model.

Analogy is a cognitive process of transferring information or meaning from a particular subject (the analogue or source) to another particular subject (the target), and a linguistic expression corresponding to such a process. In a narrower sense, analogy is an inference or an argument from one particular to another particular, as opposed to deduction, induction, and abduction, where at least one of the premises or the conclusion is general. The word analogy can also refer to the relation between the source and the target themselves, which is often, though not necessarily, a similarity, as in the biological notion of analogy.


Niels Bohr's model of the atom made an analogy between the atom and the solar system. Analogy plays a significant role in problem solving, decision making, perception, memory, creativity, emotion, explanation and communication. It lies behind basic tasks such as the identification of places, objects and people, for example, in face perception and facial recognition systems. It has been argued that analogy is "the core of cognition". Specific analogical language comprises exemplification, comparisons, metaphors, similes, allegories, and parables, but not metonymy.

The ANALOGY module uses reasoning strategies and justifications saved from previous problem-solving traces to build strategies for a new problem. These strategies are evoked when the problem solver comes to a place where the new problem's justifications are similar to a previously-solved problem. ANALOGY then directs the problem-solving strategy in the direction that led to the previous solution.

ANALOGY

Requires more inferencing

Process of learning new concepts or solutions through the use of similar known concepts or solutions.


UNIT-IV
Q 26. What is expert system? Discuss the architecture of Expert system.
Ans. Definition(s) of Expert/Knowledge-Based Systems
The primary intent of expert system technology is to realize the integration of human expertise into computer processes. This integration not only helps to preserve the human expertise but also allows humans to be freed from performing the more routine activities that might be associated with interactions with a computer-based system.
Given the number of textbooks, journal articles, and conference publications about expert/knowledge-based systems and their application, it is not surprising that there exist a number of different definitions for an expert/knowledge-based system. In this article we use the following definition (2, 3):
An expert/knowledge-based system is a computer program that is designed to mimic the decision-making ability of a decision-maker(s) (i.e., expert(s)) in a particular narrow domain of expertise.
In order to fully understand and appreciate the meaning and nature of this definition, we highlight and detail the four major component pieces.

An expert/knowledge-based system is a computer program. A computer program is a piece of software, written by a "programmer" as a solution to some particular problem or client need. Because expert/knowledge-based systems are software products they inherit all of the problems associated with any piece of computer software. Some of these issues will be addressed in the discussion on the development of these systems.

An expert/knowledge-based system is designed to mimic the decision-making ability. The specific task of an expert/knowledge-based system is to be an alternative source of decision-making ability for organizations to use, instead of relying on the expertise of just one—or a handful—of people qualified to make a particular decision. An expert/knowledge-based system attempts to capture the reasoning of a particular person for a specific problem. Usually expert/knowledge-based systems are designed and developed to capture the scarce, but critical decision-making that occurs in many organizations. Expert/knowledge-based systems are often feared to be "replacements" for decision-makers; however, in many organizations these systems are used to "free up" the decision-maker to address more complex and important issues facing the organization.

An expert/knowledge-based system uses a decision-maker(s) (i.e., expert(s)). Webster's dictionary defines an expert as "one with the special skill or mastery of a particular subject". The focal point in the development of an expert/knowledge-based system is to acquire and represent the knowledge and experience of a person(s) who have been identified as possessing the special skill or mastery.
An expert/knowledge-based system is created to solve problems in a particular narrow domain of expertise. The above definition restricts the term expert to a particular subject. Some of the most successful development efforts of expert/knowledge-based systems have been in domains that are well scoped and have clear boundaries. Specific problem characteristics that lead to successful expert/knowledge-based systems are discussed as part of the development process.
Now that we have defined what an expert/knowledge-based system is, we will briefly discuss the history of these systems. In this discussion, we include their historical place within the Artificial Intelligence area and highlight some of the early, significant expert system development.

Major Application Areas
There are two different ways developers look at application areas for expert/knowledge-based systems. First, they look at the functional nature of the problem. Secondly, they look at the application domain. We review both of these ways to get a better understanding for the application of expert/knowledge-based systems to "real-world" problems. In 1993, John Durkin (12) published a catalog of expert system applications that briefly reviews a number of applications of expert/knowledge-based system technology and categorizes each of the nearly 2,500 systems.
Both MYCIN and XCON point out two different functions that are viewed as highly favorable for expert/knowledge-based system development. MYCIN mainly deals with the diagnosis of a disease given a set of symptoms and patient information. XCON, on the other hand, is a synthesis-based (design) configuration expert system. It takes as its input the needs of the customer and builds a feasible arrangement of components to meet the need. Both of these systems solve different generic "types" of problems.
An expert system may have many differing functions. It may monitor, detect faults, isolate faults, control, give advice, document, assist, etc. The range of applications for expert system technology ranges from highly embedded turnkey expert systems for controlling certain functions in a car or in a home, to systems that provide financial, medical, or navigation advice, to systems that control spacecraft.

Structure of Expert Systems
In the early days the phrase "expert system" was used to denote a system whose knowledge base and reasoning mechanisms were based on those of a human expert. In this article a more general position is held: a system will be called an "expert system" based on its form alone and independent of its source of knowledge or reasoning capabilities.
The purpose of this section is to provide an intuitive overview of the architectural ideas associated with expert systems. In discussing the architecture of expert systems we will first introduce the concept of an expert system kernel and then embed that kernel in a fuller and more traditional expert system architecture.

An Expert System Architecture
If we embed the kernel of an expert system in an operational context that contains processes for interacting with and interfacing with a user, a process for knowledge and data acquisition, and a process to support the generation of explanations for rule firings and advice to the user, then we arrive at what is customarily viewed as the architecture for an expert system.


Figure 2 displays the architecture commonly associated with expert systems. In our terminology it is comprised of a kernel augmented by processes for data and knowledge capture, user interfaces and interactions, and a process for generating and presenting to a user explanations of its behaviors.
The "Knowledge and Data Acquisition" process is used by the expert system to acquire new facts and rules associated with its specific domain. It is through this process that capabilities can be added to or subtracted from the expert system. Associated with this process is the concept of knowledge engineering. This is the process whereby knowledge from an expert or group of experts, or other sources such as books, procedure manuals, training guides, etc., is gathered, formatted, verified and validated, and input into the knowledge base of the expert system (see the discussion on expert/knowledge development for a more detailed explanation of knowledge engineering activities).


Q 27. Explain the AI application to Robotics.
Ans. Application of AI to Robotics
One of the main applications of AI is in the area of robot control. By using evolving control architectures, the robot can 'learn' the best way to do a task. Designers can use neural networks and genetic algorithms to enable the robot to cope with complicated tasks, such as navigation in a complicated environment (Mars, for instance). Another area is image, sound and pattern recognition - three traits that any anthropomorphic robot would need. Again, neural networks could be used to analyze data from the optical or audio device the robot uses.

Example
A robot is assigned to hover over an assembly line and examine the gears that pass underneath it for faults. If a fault is discovered, the robot is to push the gear off the line into the rejection bin. Before the robot is put into practice, it is trained using a neural network to recognize the salient features of the gear - its radius, the shape of its perimeter, and its size. When the machine is then put on the assembly line, its optical equipment converts what it sees into input for the neural network, which then analyzes whether the gear is OK or not. If so, it passes; otherwise an arm is activated to push the gear off. There are limitations to such visual systems, described in the Problems with Machine Vision essay.

Another area where robotics and AI/ALife are closely connected is that of 'simple function, complex behaviour' robots. Such robots perform small tasks using very simple rules, yet collectively behave in a complicated fashion, much like Craig Reynolds' Boids. For example, five or six robots could be programmed to clean up a room, moving small objects to the nearest corner of the room, whilst avoiding obstacles and each other. The military and NASA are researching such robots as possible spy (or, for NASA, exploration) robots that could easily pass through enemy defences and, through sheer numbers, gather a large amount of data without risk to human life.

Examples of robots:
Cog. MIT has always been at the forefront of AI technology, and it is building its own robots under a project heading "The Cog Shop", the main one being called Cog. Cog is an attempt at creating a robot that simulates the sensory and motor dynamics of the human body (the only exception being its lack of legs). The motors within Cog simulate the degrees of freedom of the human body. The robot also has an advanced vision system that again simulates that of a human: each 'eye' consists of two cameras, one for the wide-angle view, the other for a smaller, more precise view (mimicking the fovea). Cog is powered by 8 (expandable to 239!) 16 MHz Motorola 68332 microprocessors that are networked together, running the language L.

Kismet. Another robot from "The Cog Shop" is a social robot called Kismet. Kismet is a completely autonomous robot that attempts to simulate a baby learning from its parents. This rather cute-looking robot uses facial expressions to show the untrained user how it feels. So far it has over 10 different facial expressions, including fear, disgust, anger, surprise and other common emotions. Its features include a mouth, eyebrows, ears and eyelids, giving Kismet a Gremlin-like appearance. Kismet's software takes much of its theory from "...psychology, ethology, and developmental psychology..." [Cog 2]. It is split into five systems: the perception system, the motivation system, the attention system, the behaviour system, and the motor system. The perception system extracts data from the outside world (through the cameras in Kismet's eyes), the motivation system maintains the emotions Kismet 'feels', the attention system regulates the extent of these emotions, the behaviour system implements the emotions, and the motor system controls the hardware required to express the emotion. Visit Kismet's homepage to learn more; it is a fascinating project.
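To make the five-system decomposition concrete, here is a deliberately simplified control loop in which perception, motivation, attention, behaviour and motor stages are chained together. The function names, emotion values and thresholds are all invented for illustration; Kismet's real architecture is far more elaborate.

```python
# Illustrative only: a toy pipeline mirroring the five systems described above.

def perception(raw_frame):
    """Extract a crude stimulus description from sensor input."""
    return {"face_present": raw_frame.get("face", False),
            "noise_level": raw_frame.get("noise", 0.0)}

def motivation(stimulus, mood):
    """Update internal emotional state from the stimulus."""
    if stimulus["face_present"]:
        mood["interest"] = min(1.0, mood["interest"] + 0.3)
    if stimulus["noise_level"] > 0.8:
        mood["fear"] = min(1.0, mood["fear"] + 0.7)
    return mood

def attention(mood):
    """Pick the dominant emotion and how strongly to express it."""
    emotion = max(mood, key=mood.get)
    return emotion, mood[emotion]

def behaviour(emotion, intensity):
    """Map the chosen emotion onto a facial expression."""
    if intensity < 0.3:
        return "neutral"
    return {"interest": "raise eyebrows", "fear": "pull back ears"}[emotion]

def motor(expression):
    """Stand-in for commanding the actuators."""
    print("actuators ->", expression)

mood = {"interest": 0.1, "fear": 0.1}
for frame in [{"face": True}, {"face": True, "noise": 0.9}]:
    stimulus = perception(frame)          # perception system
    mood = motivation(stimulus, mood)     # motivation system
    emotion, intensity = attention(mood)  # attention system
    motor(behaviour(emotion, intensity))  # behaviour + motor systems
```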

Conclusion

Robotics is in many respects Mechanical AI. It is also a lot more complicated, since the data the robot receives is real-time, real-world data, far messier than the input that purely software-based AI programs have to deal with. On top of the more complicated programming this requires, algorithms are needed to respond through motors and other actuators. The field of robotics is where much of AI is ultimately aimed; a great deal of research is intended one day to become part of a robot.

Q 28. Discuss the current trends in Intelligent systems.
Ans.

Mobile Robot Competition and Exhibition — This is the twelfth year AAAI has sponsored the Mobile Robot Competition, which brings together teams from leading robotics research labs to compete and demonstrate state-of-the-art research in robotics and AI. Each year the bar is raised on the competition challenges, and each year the robots demonstrate increasing capabilities. This year, the competition includes three events: Robot Host, Robot Rescue, and Robot Challenge.

Innovative Applications of AI — Awards and Emerging Applications — This conference, co-located with IJCAI-03 and sponsored by the Association for the Advancement of Artificial Intelligence, honors deployed applications that use AI techniques in innovative ways delivering quantifiable value. In recent years, the conference has been expanded to recognize experimental emerging applications that in preliminary tests are demonstrating promising results. Together, the 20 applications presented point the way toward new trends in intelligent applications in a wide range of areas — from automated fraud detection in the NASDAQ stock market, to automated search of broadcast news for items of interest, to new teaching and commerce systems, and intelligent distributed computing.

Trading Agents Competition — The goal of this competition is to spur research on intelligent agents for e-commerce. This year there will be two competitions — one in which travel agents put together travel packages, and the second — a dynamic supply chain trading scenario — in which PC manufacturers compete against one another for customer orders and supplies over 250 simulated days.

AI and the Web — special track — Four invited speakers will discuss emerging trends in using AI to improve the intelligence of the Web infrastructure. Senior researchers from Google, University of Southern California, Hong Kong Baptist University and University of Trento, Italy, will discuss topics such as Web search engines, technologies to build and deploy intelligent agents on the Web, the research agenda to build true Web intelligence, and eCommerce travel systems.

Invited Speakers — Eleven speakers from leading research centers and business ventures in the U.S. and Europe have been invited to the technical conference to discuss emerging AI research and experimental systems on topics such as self-reconfiguring robots and Paul Allen's Vulcan, Inc. Project Halo, which aims at the creation of a digital Aristotle capable of answering, and providing cogent explanations to, arbitrary questions in a variety of domains.

Technical Program — At the heart of the conference is the technical program, which this year includes 189 technical paper presentations by leading researchers on a broad array of AI topics, for example: computer vision, robotics, intelligent agents, intelligent Internet, logic, learning, reasoning, representation and much more.

Workshops and Tutorials — There will be 30 workshops on research and applications (by invitation only) on such topics as Agent-Oriented Systems, AI Applications, Knowledge Representation and Reasoning, Machine Learning and Data Mining, and AI and the Web. In addition, 19 tutorials will cover concentrated technical topics of current or emerging interest, for instance: state of the art in ant robotics, intelligent Web service, and automated security protocol verification.

Intelligent Systems Demonstrations — This track highlights innovative contributions to the science of AI with an emphasis on the benefits of developing and using implemented systems in AI research. This year's demonstrations include web technologies, intelligent agents, reasoning engines, and collaborative and conversational systems, in domains ranging from space exploration to travel agencies to writing technical papers.

Poster Session — This new program is designed to promote new research ideas and widen participation at the conference. The 87 poster presentations represent a broad cross-section of AI research areas including: automated reasoning, case-based reasoning, constraints, knowledge representation, data mining and information retrieval, machine learning, multiagents, natural language, neural networks, planning, search, vision and robotics.

Exhibit Program — Some of the leading robotics vendors, publishers, research labs, and others will exhibit their offerings.


Q 29. What are the principles of natural language processing?
Ans.
INTRODUCTION
Natural Language Processing (NLP) is the computerized approach to analyzing text that is based on both a set of theories and a set of technologies. Because it is a very active area of research and development, there is no single agreed-upon definition that would satisfy everyone, but there are some aspects that would be part of any knowledgeable person's definition. One such definition is: Natural Language Processing is a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications.

Several elements of this definition can be further detailed. Firstly, the imprecise notion of 'range of computational techniques' is necessary because there are multiple methods or techniques from which to choose to accomplish a particular type of language analysis.

'Naturally occurring texts' can be of any language, mode, genre, etc. The texts can be oral or written. The only requirement is that they be in a language used by humans to communicate with one another. Also, the text being analyzed should not be specifically constructed for the purpose of the analysis; rather, the text should be gathered from actual usage.

The notion of 'levels of linguistic analysis' (explained further below) refers to the fact that there are multiple types of language processing known to be at work when humans produce or comprehend language. It is thought that humans normally utilize all of these levels, since each level conveys different types of meaning. But various NLP systems utilize different levels, or combinations of levels, of linguistic analysis, and this is seen in the differences amongst various NLP applications. This also leads to much confusion on the part of non-specialists as to what NLP really is, because a system that uses any subset of these levels of analysis can be said to be an NLP-based system. The difference between them, therefore, may actually be whether the system uses 'weak' NLP or 'strong' NLP.

'Human-like language processing' reveals that NLP is considered a discipline within Artificial Intelligence (AI). And while the full lineage of NLP does depend on a number of other disciplines, since NLP strives for human-like performance, it is appropriate to consider it an AI discipline.

'For a range of tasks or applications' points out that NLP is not usually considered a goal in and of itself, except perhaps for AI researchers. For others, NLP is the means for accomplishing a particular task. Therefore, there are Information Retrieval (IR) systems that utilize NLP, as well as Machine Translation (MT), Question-Answering systems, etc.

Goal
The goal of NLP as stated above is "to accomplish human-like language processing". The choice of the word 'processing' is very deliberate and should not be replaced with 'understanding'. Although the field of NLP was originally referred to as Natural Language Understanding (NLU) in the early days of AI, it is well agreed today that while the goal of NLP is true NLU, that goal has not yet been accomplished. A full NLU system would be able to:
1. Paraphrase an input text
2. Translate the text into another language
3. Answer questions about the contents of the text
4. Draw inferences from the text
While NLP has made serious inroads into accomplishing goals 1 to 3, the fact that NLP systems cannot, of themselves, draw inferences from text means that NLU still remains the goal of NLP.

There are also more practical goals for NLP, many related to the particular application for which it is being utilized. For example, an NLP-based IR system has the goal of providing more precise, complete information in response to a user's real information need. The goal of the NLP system here is to represent the true meaning and intent of the user's query, which can be expressed as naturally in everyday language as if the user were speaking to a reference librarian. Also, the contents of the documents that are being searched will be represented at all their levels of meaning so that a true match between need and response can be found, no matter how either is expressed in its surface form.

Origins
As with most modern disciplines, the lineage of NLP is mixed, and still today the field has strong emphases from different groups whose backgrounds are more influenced by one or another of the parent disciplines. Key among the contributors to the discipline and practice of NLP are: Linguistics, which focuses on formal, structural models of language and the discovery of language universals (in fact the field of NLP was originally referred to as Computational Linguistics); Computer Science, which is concerned with developing internal representations of data and efficient processing of these structures; and Cognitive Psychology, which looks at language usage as a window into human cognitive processes and has the goal of modeling the use of language in a psychologically plausible way.

Divisions
While the entire field is referred to as Natural Language Processing, there are in fact two distinct focuses: language processing and language generation. The first of these refers to the analysis of language for the purpose of producing a meaningful representation, while the latter refers to the production of language from a representation. The task of Natural Language Processing is equivalent to the role of the reader/listener, while the task of Natural Language Generation is that of the writer/speaker. While much of the theory and technology are shared by these two divisions, Natural Language Generation also requires a planning capability. That is, the generation system requires a plan or model of the goal of the interaction in order to decide what the system should generate at each point in an interaction. We will focus on the task of natural language analysis, as this is most relevant to Library and Information Science.
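To make the processing/generation distinction concrete, the toy sketch below maps a simple sentence into a crude structured representation and then regenerates a sentence from such a representation. The representation format and the helper names are invented for this illustration and bear no relation to any particular NLP toolkit.

```python
# Toy illustration of the two directions: analysis (text -> representation)
# and generation (representation -> text). Real systems are vastly richer.

def analyze(sentence):
    """Natural language analysis: produce a meaningful representation."""
    words = sentence.rstrip(".").split()
    # Assumes a simple subject-verb-object sentence such as "The dog chased the cat."
    return {"agent": words[1], "action": words[2], "patient": words[4]}

def generate(representation):
    """Natural language generation: produce language from a representation."""
    return "The {agent} {action} the {patient}.".format(**representation)

rep = analyze("The dog chased the cat.")
print(rep)            # {'agent': 'dog', 'action': 'chased', 'patient': 'cat'}
print(generate(rep))  # The dog chased the cat.
```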

Another distinction is traditionally made between language understanding and speech understanding. Speech understanding starts with, and speech generation ends with, oral language, and both therefore rely on the additional fields of acoustics and phonology. Speech understanding focuses on how the 'sounds' of language, picked up by the system in the form of acoustic waves, are transcribed into recognizable morphemes and words. Once in this form, the same levels of processing that are applied to written text are utilized.

LEVELS OF NATURAL LANGUAGE PROCESSING
The most explanatory method for presenting what actually happens within a Natural Language Processing system is by means of the 'levels of language' approach. This is also referred to as the synchronic model of language and is distinguished from the earlier sequential model, which
hypothesizes that the levels of human language processing follow one another in a strictly sequential manner. Psycholinguistic research suggests that language processing is much more dynamic, as the levels can interact in a variety of orders. Introspection reveals that we frequently use information gained from what is typically thought of as a higher level of processing to assist in a lower level of analysis. For example, the pragmatic knowledge that the document you are reading is about biology will be used when a particular word that has several possible senses (or meanings) is encountered, and the word will be interpreted as having the biology sense.

Of necessity, the following description of levels is presented sequentially. The key point is that meaning is conveyed by each and every level of language, and since humans have been shown to use all levels of language to gain understanding, the more capable an NLP system is, the more levels of language it will utilize.

Phonology
This level deals with the interpretation of speech sounds within and across words. There are, in fact, three types of rules used in phonological analysis: 1) phonetic rules - for sounds within words; 2) phonemic rules - for variations of pronunciation when words are spoken together; and 3) prosodic rules - for fluctuation in stress and intonation across a sentence. In an NLP system that accepts spoken input, the sound waves are analyzed and encoded into a digitized signal for interpretation by various rules or by comparison to the particular language model being utilized.

Morphology
This level deals with the componential nature of words, which are composed of morphemes - the smallest units of meaning. For example, the word preregistration can be morphologically analyzed into three separate morphemes: the prefix pre, the root registra, and the suffix tion. Since the meaning of each morpheme remains the same across words, humans can break down an unknown word into its constituent morphemes in order to understand its meaning. Similarly, an NLP system can recognize the meaning conveyed by each morpheme in order to gain and represent meaning. For example, adding the suffix -ed to a verb conveys that the action of the verb took place in the past. This is a key piece of meaning, and in fact is frequently only evidenced in a text by the use of the -ed morpheme.

Lexical
At this level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding, the first of these being the assignment of a single part-of-speech tag to each word. In this processing, words that can function as more than one part-of-speech are assigned the most probable part-of-speech tag based on the context in which they occur. Additionally at the lexical level, those words that have only one possible sense or meaning can be replaced by a semantic representation of that meaning. The nature of the representation varies according to the semantic theory utilized in the NLP system. The following representation of the meaning of the word launch is in the form of logical predicates:
launch (a large boat used for carrying people on rivers, lakes, harbors, etc.)
((CLASS BOAT) (PROPERTIES (LARGE)
  (PURPOSE (PREDICATION (CLASS CARRY) (OBJECT PEOPLE)))))

As can be observed, a single lexical unit is decomposed into its more basic properties. Given that there is a set of semantic primitives used across all words, these simplified lexical representations make it possible to unify meaning across words and to produce complex interpretations, much the same as humans do.

The lexical level may require a lexicon, and the particular approach taken by an NLP system will determine whether a lexicon will be utilized, as well as the nature and extent of information that is encoded in the lexicon. Lexicons may be quite simple, with only the words and their part(s)-of-speech, or may be increasingly complex and contain information on the semantic class of the word, what arguments it takes and the semantic limitations on these arguments, definitions of the sense(s) in the semantic representation utilized in the particular system, and even the semantic field in which each sense of a polysemous word is used.

Syntactic
This level focuses on analyzing the words in a sentence so as to uncover the grammatical structure of the sentence. This requires both a grammar and a parser. The output of this level of processing is a (possibly delinearized) representation of the sentence that reveals the structural dependency relationships between the words. There are various grammars that can be utilized, which will, in turn, impact the choice of a parser. Not all NLP applications require a full parse of sentences; therefore the remaining challenges in parsing, such as prepositional phrase attachment and conjunction scoping, no longer stymie those applications for which phrasal and clausal dependencies are sufficient. Syntax conveys meaning in most languages because order and dependency contribute to meaning. For example, the two sentences 'The dog chased the cat.' and 'The cat chased the dog.' differ only in terms of syntax, yet convey quite different meanings.

Semantic
This is the level at which most people think meaning is determined; however, as we can see in the above description of the levels, it is all the levels that contribute to meaning. Semantic processing determines the possible meanings of a sentence by focusing on the interactions among word-level meanings in the sentence. This level of processing can include the semantic disambiguation of words with multiple senses, in an analogous way to how syntactic disambiguation of words that can function as multiple parts-of-speech is accomplished at the syntactic level. Semantic disambiguation permits one and only one sense of a polysemous word to be selected and included in the semantic representation of the sentence. For example, amongst other meanings, 'file' as a noun can mean either a folder for storing papers, or a tool to shape one's fingernails, or a line of individuals in a queue. If information from the rest of the sentence is required for the disambiguation, the semantic level, not the lexical level, does the disambiguation. A wide range of methods can be implemented to accomplish the disambiguation: some require information on the frequency with which each sense occurs in a particular corpus of interest or in general usage, some require consideration of the local context, and others utilize pragmatic knowledge of the domain of the document.

Discourse
While syntax and semantics work with sentence-length units, the discourse level of NLP works with units of text longer than a sentence. That is, it does not interpret multi-sentence texts as just concatenated sentences, each of which can be interpreted singly. Rather, discourse focuses on the properties of the text as a whole that convey meaning by making connections between component sentences.
Several types of discourse processing can occur at this level, two of the most common being anaphora resolution and discourse/text structure recognition. Anaphora resolution is the replacing of words such as pronouns, which are semantically vacant, with the appropriate entity to which they refer (30). Discourse/text structure recognition determines the functions of sentences in the text, which, in turn, adds to the meaningful representation of the text. For example, newspaper articles can be deconstructed into discourse components such as Lead, Main, etc.

Pragmatic
This level is concerned with the purposeful use of language in situations and utilizes context over and above the contents of the text for understanding. The goal is to explain how extra meaning is read into texts without actually being encoded in them. This requires much world knowledge, including the understanding of intentions, plans, and goals. Some NLP applications may utilize knowledge bases and inferencing modules. For example, the following two sentences require resolution of the anaphoric term 'they', but this resolution requires pragmatic or world knowledge.
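As a toy illustration of the morphological and lexical levels described above, the sketch below strips a few common affixes from a word and looks the root up in a tiny hand-made lexicon to obtain a crude semantic representation. The affix lists, lexicon entries and representation format are invented for the example and are far simpler than anything a real NLP system would use.

```python
# Toy morphological analysis + lexical lookup (illustrative only).

PREFIXES = ["pre", "un", "re"]
SUFFIXES = ["tion", "ed", "ing", "s"]

# A tiny, invented lexicon mapping roots to crude semantic representations.
LEXICON = {
    "registra": {"CLASS": "ACT", "PREDICATION": "ENROL"},
    "launch":   {"CLASS": "BOAT", "PROPERTIES": ["LARGE"],
                 "PURPOSE": {"PREDICATION": "CARRY", "OBJECT": "PEOPLE"}},
}

def morphemes(word):
    """Split a word into (prefix, root, suffix) by stripping known affixes."""
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    root = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, root, suffix

def lexical(word):
    """Return the semantic representation of the word's root, if the lexicon has it."""
    _, root, _ = morphemes(word)
    return LEXICON.get(root, {"CLASS": "UNKNOWN", "ROOT": root})

print(morphemes("preregistration"))   # ('pre', 'registra', 'tion'), as in the text above
print(lexical("launch"))              # the BOAT representation sketched earlier
```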