
  • 15-780: Graduate AI. Lecture 1: Intro & Logic

    Geoff Gordon (this lecture), Tuomas Sandholm

    TAs: Byron Boots, Sam Ganzfried

  • Admin

  • http://www.cs.cmu.edu/~ggordon/780/

    http://www.cs.cmu.edu/~sandholm/cs15-780S09/

  • Website highlights

    Book: Russell and Norvig, Artificial Intelligence: A Modern Approach, 2nd ed.
    Grading: 4–5 HWs, “mid”term, project
    Project: proposal, 2 interim reports, final report, poster
    Office hours

  • Website highlights

    Authoritative source for readings, HWs
    Please check the website regularly for readings (for Lec. 1–3, Russell & Norvig Chapters 7–9)

  • Background

    No prerequisites
    But we suggest familiarity with at least some of the following:

    Linear algebra
    Calculus
    Algorithms & data structures
    Complexity theory

  • Waitlist, Audits

    If you need us to approve something, send us email

  • Course email list

    15780students@… (domain cs.cmu.edu)
    To subscribe/unsubscribe:

    email 15780students-request@… with the word “help” in subject or body

  • Matlab

    Should all have access to Matlab via school computers

    Those with access to CS license servers, please use them if possible
    Limited number of Andrew licenses

    Tutorial TBA soon
    HWs: please use C, C++, Java, or Matlab

  • Intro

  • What is AI?

    Easy part: A
    Hard part: I

    Anything we don’t know how to make a computer do yet
    Corollary: once we do it, it isn’t AI anymore :-)

  • Definition by examples

    Card games
    Poker
    Bridge

    Board games
    Deep Blue
    TD-Gammon
    Samuel’s checkers player

  • Web search

  • Web search, cont’d

  • Recommender systems

  • Computer algebra systems

    (image from http://www.math.wpi.edu/IQP/BVCalcHist/calctoc.html)

  • Grand Challenge road race

  • Getting from A to B

    ITA software (http://beta.itasoftware.com)

  • Robocup

  • Kidney exchange

    In US, ≥ 50,000/yr get lethal kidney disease

    Cure = transplant, but donor must be compatible (blood type, tissue type, etc.)

    Wait list for cadaver kidneys: 2–5 years

    Live donors: have 2 kidneys, can survive w/ 1

    Illegal to buy/sell, but altruists/friends/family donate

  • Kidney Exchange

    (Diagram: two patient–donor pairs, Pair 1 and Pair 2, each with a Patient and a Donor, arranged for exchange.)

  • Optimization: cycle cover

    Cycle length constraint ⇒ extremely hard (NP-complete) combinatorial optimization problem

    National market predicted to have 10,000 patients at any one time

  • Optimization performance

    (Chart: our algorithm, i.e. Sandholm et al.’s, vs. CPLEX.)

  • More examples

    Motor skills: riding a bicycle, learning to walk, playing pool, …
    Vision

  • More examples

    Valerie and Tank, the Roboceptionists
    Social skills: attending a party, giving directions, …

  • More examples

    Natural language
    Speech recognition

  • Common threads

    Finding the needle in the haystack
    Search
    Optimization
    Summation / integration

    Set the problem up well (so that we can apply a standard algorithm)

  • Common threads

    Managing uncertainty:
    chance outcomes (e.g., dice)
    sensor uncertainty (“hidden state”)
    opponents

    The more different types of uncertainty, the harder the problem (and the slower the solution)

  • Classic AI

    No uncertainty, pure search:
    Mathematica
    deterministic planning
    Sudoku

    This is the topic of Part I of the course

    http://www.cs.ualberta.ca/~aixplore/search/IDA/Applet/SearchApplet.html

  • Outcome uncertainty

    In backgammon, don’t know ahead of time what the dice will show
    When driving down a corridor, wheel slippage causes unexpected deviations
    Open a door, find out what’s behind it
    MDPs (later)

  • Sensor uncertainty

    For a given set of immediate measurements, multiple world states may be possible
    Image of a handwritten digit → 0, 1, …, 9

    Image of a room → person locations, identities
    Laser rangefinder scan of a corridor → map, robot location

  • Sensor uncertainty example

  • Opponents cause uncertainty

    In chess, must guess what the opponent will do; cannot directly control him/her
    Alternating moves (game trees)

    not really uncertain (Part I)
    Simultaneous or hidden moves: game theory (later)

  • Other agents cause uncertainty

    In many AI problems, there are other agents who aren’t (necessarily) opponents

    Ignore them & pretend they’re part of Nature
    Assume they’re opponents (pessimistic)
    Learn to cope with what they do
    Try to convince them to cooperate (paradoxically, this is the hardest case)

    More later

  • Logic

  • Why logic?

    Search: for problems like Sudoku, can write compact description of rules
    Reasoning: figure out consequences of the knowledge we’ve given our agent
    … and, logical inference is a special case of probabilistic inference

  • Propositional logic

    Constants: T or F
    Variables: x, y (values T or F)
    Connectives: ∧, ∨, ¬

    Can get by w/ just NAND (a small sketch follows this slide)
    Sometimes also add others: ⊕, ⇒, ⇔, …

    George Boole, 1815–1864
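    As a quick check that NAND alone suffices, here is a minimal Python sketch (an illustration, not course code) deriving ¬, ∧, ∨, ⇒ from NAND:

        def nand(a, b):
            # the one primitive connective
            return not (a and b)

        def NOT(a):        return nand(a, a)            # ¬a ≡ a NAND a
        def AND(a, b):     return NOT(nand(a, b))       # a ∧ b ≡ ¬(a NAND b)
        def OR(a, b):      return nand(NOT(a), NOT(b))  # a ∨ b ≡ (¬a) NAND (¬b)
        def IMPLIES(a, b): return OR(NOT(a), b)         # a ⇒ b ≡ ¬a ∨ b

        # sanity check against the ordinary truth tables
        for a in (True, False):
            for b in (True, False):
                assert AND(a, b) == (a and b) and OR(a, b) == (a or b)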

  • Propositional logic

    Build up expressions like ¬x ⇒ y

    Precedence: ¬, ∧, ∨, ⇒

    Terminology: a variable or constant, with or w/o negation = literal
    Whole thing = formula or sentence

  • Expressive variable names

    Rather than variable names like x, y, may use names like “rains” or “happy(John)”
    For now, “happy(John)” is just a string with no internal structure

    there is no “John”
    happy(John) ⇒ ¬happy(Jack) means the same as x ⇒ ¬y

  • But what does it mean?

    A formula defines a mapping: (assignment to variables) ↦ {T, F}

    Assignment to variables = model
    For example, formula ¬x yields the mapping:

    x | ¬x
    --+---
    T | F
    F | T

  • Questions about models and sentences

    How many models make a sentence true?
    A sentence is satisfiable if it is True in some model
    If not satisfiable, it is a contradiction (False in every model)
    A sentence is valid if it is True in every model (called a tautology)

    How is the variable X set in {some, all} true models?
    This is the most frequent question an agent would ask: given my assumptions, can I conclude X? Can I rule X out?
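    All three properties can be checked by brute force over all 2^n models; a minimal Python sketch (illustrative only, with sentences as Python functions):

        from itertools import product

        def classify(formula, num_vars):
            """Brute force: evaluate the sentence in all 2^n models."""
            values = [formula(*model)
                      for model in product([True, False], repeat=num_vars)]
            if all(values):
                return 'valid (tautology)'
            if any(values):
                return 'satisfiable'
            return 'contradiction'

        print(classify(lambda x: x or not x, 1))        # valid (tautology)
        print(classify(lambda x, y: x and not y, 2))    # satisfiable
        print(classify(lambda x: x and not x, 1))       # contradiction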

  • More truth tables

    x | y | x ∧ y
    --+---+------
    T | T |   T
    T | F |   F
    F | T |   F
    F | F |   F

    x | y | x ∨ y
    --+---+------
    T | T |   T
    T | F |   T
    F | T |   T
    F | F |   F

  • Truth table for implication

    (a ⇒ b) is logically equivalent to (¬a ∨ b)

    If a is True, b must be True too

    If a is False, no requirement on b

    E.g., “if I go to the movie I will have popcorn”: if no movie, may or may not have popcorn

    a | b | a ⇒ b
    --+---+------
    T | T |   T
    T | F |   F
    F | T |   T
    F | F |   T

  • Complex formulas

    To evaluate a bigger formula, e.g. (x ∨ y) ∧ (x ∨ ¬y) when x = F, y = F:

    Build a parse tree
    Fill in variables at leaves using the model
    Work upwards using truth tables for connectives
    (a small evaluator sketch follows the examples below)

  • Example

    (x ∨ y) ∧ (x ∨ ¬y) when x = F, y = F

  • Another example

    (x ∨ y) ⇒ z  when x = F, y = T, z = F
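    A minimal Python sketch of this bottom-up parse-tree evaluation, assuming formulas are nested tuples (a representation chosen here for illustration):

        def evaluate(f, model):
            """Evaluate a formula bottom-up, as on a parse tree.
            f is a variable name (str) or a tuple
            ('not'|'and'|'or'|'implies', subformulas...)."""
            if isinstance(f, str):
                return model[f]                      # leaf: look up in the model
            op, *args = f
            vals = [evaluate(a, model) for a in args]
            if op == 'not':     return not vals[0]
            if op == 'and':     return all(vals)
            if op == 'or':      return any(vals)
            if op == 'implies': return (not vals[0]) or vals[1]
            raise ValueError(op)

        # (x ∨ y) ∧ (x ∨ ¬y) with x = F, y = F  →  False
        f1 = ('and', ('or', 'x', 'y'), ('or', 'x', ('not', 'y')))
        print(evaluate(f1, {'x': False, 'y': False}))             # False

        # (x ∨ y) ⇒ z with x = F, y = T, z = F  →  False
        f2 = ('implies', ('or', 'x', 'y'), 'z')
        print(evaluate(f2, {'x': False, 'y': True, 'z': False}))  # False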

  • Example

  • 3-coloring

  • Sudoku

    http://www.cs.qub.ac.uk/~I.Spence/SuDoku/SuDoku.html

  • “Minesweeper” CSP

    V = { v1, v2, v3, v4, v5, v6, v7, v8 },  D = { B (bomb), S (space) }

    C = { (v1,v2) : { (B,S), (S,B) },
          (v1,v2,v3) : { (B,S,S), (S,B,S), (S,S,B) },
          ... }

    (Figure: a Minesweeper board fragment; revealed cells show counts 0, 1, 2 and the unknown cells are labeled v1–v8.)
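    A brute-force sketch of this CSP in Python, using only the two constraints spelled out above (the slide elides the full constraint set with “...”):

        from itertools import product

        V = ['v1', 'v2', 'v3', 'v4', 'v5', 'v6', 'v7', 'v8']
        D = ['B', 'S']

        # only the two constraints listed on the slide; the real problem has more
        constraints = [
            (('v1', 'v2'),       {('B', 'S'), ('S', 'B')}),
            (('v1', 'v2', 'v3'), {('B', 'S', 'S'), ('S', 'B', 'S'), ('S', 'S', 'B')}),
        ]

        def consistent(assignment):
            """True iff every constraint's scope takes an allowed combination."""
            return all(tuple(assignment[v] for v in scope) in allowed
                       for scope, allowed in constraints)

        solutions = [dict(zip(V, vals)) for vals in product(D, repeat=len(V))
                     if consistent(dict(zip(V, vals)))]
        print(len(solutions))   # assignments satisfying just these two constraints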

  • The Waltz algorithm

    One of the earliest examples of a computation posed as a CSP.

    The Waltz algorithm is for interpreting line drawings of solid polyhedra.

    Adjacent intersections impose constraints on each other. Use CSP to find a unique set of labelings. Important step to “understanding” the image.

    Look at all intersections. What kind of intersection could this be? A concave intersection of three faces? Or an external convex intersection?

  • Waltz Alg. on simple scenes

    Assume all objects:

    • Have no shadows or cracks
    • Three-faced vertices
    • “General position”: no junctions change with small movements of the eye

    Then each line on the image is one of the following:

    • Boundary line (edge of an object), …

  • Propositional planning

    init: have(cake)
    goal: have(cake), eaten(cake)

    eat(cake):
      pre: have(cake)
      eff: -have(cake), eaten(cake)

    bake(cake):
      pre: -have(cake)
      eff: have(cake)
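    A minimal breadth-first planner for this cake domain (a sketch for illustration, not the course's code; a state is the set of true propositions, and actions carry positive/negative preconditions plus add/delete effects):

        from collections import deque

        # action = (name, positive preconds, negative preconds, adds, deletes)
        actions = [
            ('eat(cake)',  {'have(cake)'}, set(),          {'eaten(cake)'}, {'have(cake)'}),
            ('bake(cake)', set(),          {'have(cake)'}, {'have(cake)'},  set()),
        ]

        init = frozenset({'have(cake)'})
        goal = {'have(cake)', 'eaten(cake)'}

        def plan(init, goal):
            """Breadth-first search over states; returns a shortest action sequence."""
            frontier, seen = deque([(init, [])]), {init}
            while frontier:
                state, path = frontier.popleft()
                if goal <= state:
                    return path
                for name, pos, neg, add, dele in actions:
                    if pos <= state and not (neg & state):
                        nxt = frozenset((state - dele) | add)
                        if nxt not in seen:
                            seen.add(nxt)
                            frontier.append((nxt, path + [name]))

        print(plan(init, goal))   # ['eat(cake)', 'bake(cake)']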

  • Other important logic problems

    Scheduling (e.g., of factory production)
    Facility location
    Circuit layout
    Multi-robot planning

    Important issue: handling uncertainty

  • Working with formulas

  • Truth tables get big fast

    x | y | z | (x ∨ y) ⇒ z
    --+---+---+------------
    T | T | T |      T
    T | T | F |      F
    T | F | T |      T
    T | F | F |      F
    F | T | T |      T
    F | T | F |      F
    F | F | T |      T
    F | F | F |      T

  • Truth tables get big fast

    x | y | z | a | (x ∨ y ∨ a) ⇒ z
    --+---+---+---+----------------
    T | T | T | T |        T
    T | T | F | T |        F
    T | F | T | T |        T
    T | F | F | T |        F
    F | T | T | T |        T
    F | T | F | T |        F
    F | F | T | T |        T
    F | F | F | T |        F
    T | T | T | F |        T
    T | T | F | F |        F
    T | F | T | F |        T
    T | F | F | F |        F
    F | T | T | F |        T
    F | T | F | F |        F
    F | F | T | F |        T
    F | F | F | F |        T
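    A small Python sketch that generates such tables; each added variable doubles the row count (2^n rows for n variables), which is exactly the blowup these slides illustrate:

        from itertools import product

        def truth_table(formula, varnames):
            """Print all 2^n rows; the row count doubles per added variable."""
            print(' '.join(varnames), '| result')
            for model in product([True, False], repeat=len(varnames)):
                row = ' '.join('T' if v else 'F' for v in model)
                print(row, '|', 'T' if formula(*model) else 'F')

        # (x ∨ y) ⇒ z : 8 rows
        truth_table(lambda x, y, z: (not (x or y)) or z, ['x', 'y', 'z'])
        # (x ∨ y ∨ a) ⇒ z : one more variable doubles it to 16 rows
        truth_table(lambda x, y, z, a: (not (x or y or a)) or z, ['x', 'y', 'z', 'a'])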

  • Definitions

    Two sentences are equivalent, A ≡ B, if they have same truth value in every model

    (rains ⇒ pours) ≡ (¬rains ∨ pours)

    ≡ is reflexive, symmetric, transitive
    Simplifying = transforming a formula into a shorter, equivalent formula

  • Transformation rules

    (Russell & Norvig, Figure 7.11, standard logical equivalences; α, β, γ are arbitrary formulas)

    (α ∧ β) ≡ (β ∧ α)    commutativity of ∧
    (α ∨ β) ≡ (β ∨ α)    commutativity of ∨
    ((α ∧ β) ∧ γ) ≡ (α ∧ (β ∧ γ))    associativity of ∧
    ((α ∨ β) ∨ γ) ≡ (α ∨ (β ∨ γ))    associativity of ∨
    ¬(¬α) ≡ α    double-negation elimination
    (α ⇒ β) ≡ (¬β ⇒ ¬α)    contraposition
    (α ⇒ β) ≡ (¬α ∨ β)    implication elimination
    (α ⇔ β) ≡ ((α ⇒ β) ∧ (β ⇒ α))    biconditional elimination
    ¬(α ∧ β) ≡ (¬α ∨ ¬β)    de Morgan
    ¬(α ∨ β) ≡ (¬α ∧ ¬β)    de Morgan
    (α ∧ (β ∨ γ)) ≡ ((α ∧ β) ∨ (α ∧ γ))    distributivity of ∧ over ∨
    (α ∨ (β ∧ γ)) ≡ ((α ∨ β) ∧ (α ∨ γ))    distributivity of ∨ over ∧



  • More rules

    (the same standard equivalences from Figure 7.11 apply; α, β are arbitrary formulas)

  • Still more rules…

    … can be derived from truth tables
    For example:

    (a ∨ ¬a) ≡ True

    (True ∨ a) ≡ True

    (False ∧ a) ≡ False

  • Example

    (a ∨ ¬b) ∧ (a ∨ ¬c) ∧ (¬(b ∨ c) ∨ ¬a)
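    One possible worked simplification using the rules above (a sketch, not shown on the slide):

    ¬(b ∨ c) ≡ (¬b ∧ ¬c)    de Morgan
    so (¬(b ∨ c) ∨ ¬a) ≡ ((¬b ∨ ¬a) ∧ (¬c ∨ ¬a))    distributivity of ∨ over ∧
    (a ∨ ¬b) ∧ (¬a ∨ ¬b) ≡ (¬b ∨ (a ∧ ¬a)) ≡ ¬b    distributivity, then (a ∧ ¬a) ≡ False
    likewise (a ∨ ¬c) ∧ (¬a ∨ ¬c) ≡ ¬c
    whole formula ≡ ¬b ∧ ¬c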

  • Normal Forms

  • Normal forms

    A normal form is a standard way of writing a formula
    E.g., conjunctive normal form (CNF):

    conjunction of disjunctions of literals
    (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y) ∧ (z)

    Each disjunction is called a clause
    Any formula can be transformed into CNF w/o changing meaning

  • CNF cont’d

    Often used for storing a knowledge database
    called a knowledge base or KB

    Can add new clauses as we find them out
    Each clause in the KB is separately true (if the KB is)

    happy(John) ∧ (¬happy(Bill) ∨ happy(Sue)) ∧ man(Socrates) ∧ (¬man(Socrates) ∨ mortal(Socrates))

  • Another normal form: DNF

    DNF = disjunctive normal form = disjunction of conjunctions of literals
    Doesn’t compose the way CNF does: can’t just add new conjuncts w/o changing the meaning of the KB
    Example:

    (rains ∨ ¬pours) ∧ fishing
    ≡ (rains ∧ fishing) ∨ (¬pours ∧ fishing)

  • Transforming to CNF or DNF

    Naive algorithm:
    replace all connectives with ∧, ∨, ¬
    move negations inward using De Morgan’s laws and double-negation
    repeatedly distribute ∧ over ∨ for DNF (∨ over ∧ for CNF)
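    A compact Python sketch of this naive procedure (binary connectives only, nested-tuple formulas as in the evaluator earlier; an illustration, not the course's code):

        def to_nnf(f, negate=False):
            """Steps 1-2: eliminate ⇒, push ¬ inward (De Morgan, double negation)."""
            if isinstance(f, str):
                return ('not', f) if negate else f
            op, *args = f
            if op == 'not':
                return to_nnf(args[0], not negate)
            if op == 'implies':                    # a ⇒ b ≡ ¬a ∨ b
                return to_nnf(('or', ('not', args[0]), args[1]), negate)
            if negate:                             # De Morgan swaps ∧ and ∨
                op = 'and' if op == 'or' else 'or'
            return (op, to_nnf(args[0], negate), to_nnf(args[1], negate))

        def to_cnf(f):
            """Step 3: repeatedly distribute ∨ over ∧."""
            f = to_nnf(f)
            if isinstance(f, str) or f[0] == 'not':
                return f                           # literal
            op, a, b = f
            a, b = to_cnf(a), to_cnf(b)
            if op == 'and':
                return ('and', a, b)
            if not isinstance(a, str) and a[0] == 'and':   # (p ∧ q) ∨ b
                return ('and', to_cnf(('or', a[1], b)), to_cnf(('or', a[2], b)))
            if not isinstance(b, str) and b[0] == 'and':   # a ∨ (p ∧ q)
                return ('and', to_cnf(('or', a, b[1])), to_cnf(('or', a, b[2])))
            return ('or', a, b)

        # (a ∧ b) ∨ c  →  (a ∨ c) ∧ (b ∨ c): one distribution nearly doubles the size
        print(to_cnf(('or', ('and', 'a', 'b'), 'c')))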

  • Example

    Put the following formula in CNF

    (a ∨ b ∨ ¬c) ∧ ¬(d ∨ (e ∧ f)) ∧ (c ∨ d ∨ e)
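    One possible answer, added here as a check (De Morgan on the middle conjunct, then De Morgan again inside):

    ¬(d ∨ (e ∧ f)) ≡ (¬d ∧ ¬(e ∧ f)) ≡ (¬d ∧ (¬e ∨ ¬f))
    so the CNF is (a ∨ b ∨ ¬c) ∧ (¬d) ∧ (¬e ∨ ¬f) ∧ (c ∨ d ∨ e)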

  • Example

    Now try DNF

    (a ∨ b ∨ ¬c) ∧ ¬(d ∨ (e ∧ f)) ∧ (c ∨ d ∨ e)

  • Discussion

    Problem with the naive algorithm: it’s exponential! (Space, time, size of result.)
    Each use of distributivity can almost double the size of a subformula

  • A smarter transformation

    Can we avoid exponential blowup in CNF?
    Yes, if we’re willing to introduce new variables

    G. Tseitin. On the complexity of derivation in propositional calculus. Studies in Constructive Mathematics and Mathematical Logic, 1968.

  • Tseitin example

    Put the following formula in CNF:

    (a ∧ b) ∨ ((c ∨ d) ∧ e)

    Parse tree:

            ∨
          /   \
        ∧       ∧
       / \     / \
      a   b   ∨   e
             / \
            c   d

  • Tseitin transformation

    Introduce temporary variables:

    x = (a ∧ b)

    y = (c ∨ d)

    z = (y ∧ e)

  • Tseitin transformation

    To ensure x = (a ∧ b), want

    x ⇒ (a ∧ b)

    (a ∧ b) ⇒ x

  • Tseitin transformation

    x ⇒ (a ∧ b)

    (¬x ∨ (a ∧ b))

    (¬x ∨ a) ∧ (¬x ∨ b)

  • Tseitin transformation

    (a ∧ b) ⇒ x

    (¬(a ∧ b) ∨ x)

    (¬a ∨ ¬b ∨ x)

  • Tseitin transformation

    To ensure y = (c ∨ d), want

    y ⇒ (c ∨ d)

    (c ∨ d) ⇒ y

  • Tseitin transformation

    y ⇒ (c ∨ d)

    (¬y ∨ c ∨ d)

    (c ∨ d) ⇒ y
    ((¬c ∧ ¬d) ∨ y)

    (¬c ∨ y) ∧ (¬d ∨ y)

  • Tseitin transformation

    Finally, z = (y ∧ e)

    z ⇒ (y ∧ e) ≡ (¬z ∨ y) ∧ (¬z ∨ e)

    (y ∧ e) ⇒ z ≡ (¬y ∨ ¬e ∨ z)

  • Tseitin end result

    (a ∧ b) ∨ ((c ∨ d) ∧ e) becomes the equisatisfiable CNF:

    (¬x ∨ a) ∧ (¬x ∨ b) ∧ (¬a ∨ ¬b ∨ x) ∧
    (¬y ∨ c ∨ d) ∧ (¬c ∨ y) ∧ (¬d ∨ y) ∧
    (¬z ∨ y) ∧ (¬z ∨ e) ∧ (¬y ∨ ¬e ∨ z) ∧
    (x ∨ z)
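    A minimal Python sketch of the whole Tseitin transformation (a sketch under the nested-tuple representation used earlier, not the course's code). It names every internal node with a fresh variable, including the root ∨, so the clause list differs slightly from the hand derivation above, but the idea is the same: one variable ⇔ subformula constraint per node, and output size linear in the formula size.

        def neg(lit):
            """Negate a literal; literals are strings, '-' marks negation."""
            return lit[1:] if lit.startswith('-') else '-' + lit

        def tseitin(f, clauses, counter):
            """Return a literal naming subformula f, appending its defining clauses."""
            if isinstance(f, str):
                return f
            op, *args = f
            if op == 'not':
                return neg(tseitin(args[0], clauses, counter))
            lits = [tseitin(a, clauses, counter) for a in args]
            counter[0] += 1
            v = 't%d' % counter[0]                      # fresh variable for this node
            if op == 'and':   # v ⇔ (l1 ∧ l2)
                for l in lits:
                    clauses.append([neg(v), l])         # v ⇒ each li
                clauses.append([neg(l) for l in lits] + [v])   # all li ⇒ v
            else:             # op == 'or': v ⇔ (l1 ∨ l2)
                clauses.append([neg(v)] + lits)         # v ⇒ (l1 ∨ l2)
                for l in lits:
                    clauses.append([neg(l), v])         # each li ⇒ v
            return v

        clauses = []
        root = tseitin(('or', ('and', 'a', 'b'), ('and', ('or', 'c', 'd'), 'e')),
                       clauses, [0])
        clauses.append([root])   # assert the formula itself
        print(clauses)           # linear in formula size: no exponential blowup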