
Post on 22-Dec-2015


Languages and Grammars

MSU CSE 260

Outline

• Introduction: Example

• Phrase-Structure Grammars: Terminology, Definition, Derivation, Language of a Grammar, Examples

– Exercise 10.1 (1)

• Types of Phrase-Structure Grammars

• Derivation Trees: Example, Parsing

– Exercise 10.1 (2, 3)

• Backus-Naur Form

Introduction

• In the English language, the grammar determines whether a combination of words is a valid sentence.

• Are the following valid sentences?
– The large rabbit hops quickly. Yes
– The frog writes neatly. Yes
– Swims quickly mathematician. No

• Grammars are concerned with the syntax (form) of a sentence, and NOT its semantics (meaning).

English Grammar

• Sentence: noun phrase followed by verb phrase;
• Noun phrase: article adjective noun, or article noun;
• Verb phrase: verb adverb, or verb;
• Article: a, or the;
• Adjective: large, or hungry;
• Noun: rabbit, or mathematician, or frog;
• Verb: eats, or hops, or writes, or swims;
• Adverb: quickly, or wildly, or neatly.

Example

• Sentence
• Noun phrase verb phrase
• Article adjective noun verb phrase
• Article adjective noun verb adverb
• the adjective noun verb adverb
• the large noun verb adverb
• the large rabbit verb adverb
• the large rabbit hops adverb
• the large rabbit hops quickly
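The derivation above can be replayed mechanically. The following is a minimal Python sketch, assuming the toy English grammar is stored as a dictionary; the names `RULES` and `leftmost_derivation` are illustrative and not part of the slides. It always rewrites the leftmost remaining category, so the order of steps differs slightly from the slide, but it reaches the same sentence.

```python
# Productions of the toy English grammar; each key is a grammatical category.
RULES = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [["article", "adjective", "noun"], ["article", "noun"]],
    "verb phrase": [["verb", "adverb"], ["verb"]],
    "article": [["a"], ["the"]],
    "adjective": [["large"], ["hungry"]],
    "noun": [["rabbit"], ["mathematician"], ["frog"]],
    "verb": [["eats"], ["hops"], ["writes"], ["swims"]],
    "adverb": [["quickly"], ["wildly"], ["neatly"]],
}

def leftmost_derivation(choices):
    """Rewrite the leftmost category at each step; choices[i] is the index
    of the production picked at step i. Returns every intermediate form."""
    form = ["sentence"]
    steps = [" ".join(form)]
    for pick in choices:
        # Position of the leftmost symbol that still has productions.
        i = next(k for k, sym in enumerate(form) if sym in RULES)
        form = form[:i] + RULES[form[i]][pick] + form[i + 1:]
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation([0, 0, 1, 0, 0, 0, 1, 0])
for line in steps:
    print(line)
```

Changing the list of choices yields other sentences of the same grammar, for example `[0, 1, 1, 2, 1, 2]` picks the shorter noun phrase and verb phrase.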

Grammars and Computation

• Grammars are used as a model of computation.

• Grammars are used to:
– generate the words of a language, and
– determine whether a word is in a language.

Phrase-Structure Grammars Terminology

• Definitions. A vocabulary (or alphabet) V is a finite, nonempty set of elements called symbols.

• A word (or sentence) over V is a string of finite length of elements of V.

• The empty string (or null string), denoted by λ, is the string containing no symbols.

• The set of all words over V is denoted by V*.

• A language over V is a subset of V*.
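To make V* concrete, here is a small Python sketch that lists the words over a vocabulary up to a chosen length (V* itself is infinite, so only a finite part can be listed). The function name `words_up_to` is an assumption made for illustration.

```python
from itertools import product

def words_up_to(vocabulary, max_len):
    """Yield every word over `vocabulary` of length 0..max_len, shortest first.
    The word of length 0 is the empty string."""
    for n in range(max_len + 1):
        for letters in product(sorted(vocabulary), repeat=n):
            yield "".join(letters)

words = list(words_up_to({"0", "1"}, 2))
print(words)

# A language over V is any subset of V*, e.g. the listed words ending in "0".
ends_in_zero = {w for w in words if w.endswith("0")}
```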

Phrase-Structure Grammars

• A language can be specified by:
– listing all the words in the language, or
– giving a set of criteria satisfied by its words, or
– using a grammar.

• A grammar provides:
– a set of symbols, and
– a set of rules, called productions, for producing words by replacing strings by other strings: w0 → w1.

Phrase-Structure Grammar Definition

A phrase-structure grammar G = (V, T, S, P) consists of:
– a vocabulary V,
– a subset T of V consisting of terminal elements,
– a start symbol S from V, and
– a set P of productions.

The set N = V - T consists of nonterminal symbols.

Every production in P must contain at least one nonterminal on its left side.

Phrase-Structure Grammar Example

• G = (V, T, S, P), where
– V = {a, b, A, B, S},
– T = {a, b},
– S is the start symbol, and
– P = { S → ABa, A → BB, B → ab, AB → b }.

Phrase-Structure Grammars Derivation

• Definition. Let G = (V, T, S, P) be a phrase-structure grammar. Let w0 = l z0 r and w1 = l z1 r be strings over V.
– If z0 → z1 is a production of G, we say that w1 is directly derivable from w0 (denoted: w0 ⇒ w1).
– If w0, w1, …, wn are strings over V such that w0 ⇒ w1, w1 ⇒ w2, …, wn-1 ⇒ wn, we say that wn is derivable from w0 (denoted: w0 ⇒* wn). Note. The * is written on top of the ⇒.
– The sequence of all steps used to obtain wn from w0 is called a derivation.

Example

• In the previous example grammar, the production B → ab makes the string Aaba directly derivable from the string ABa:
– ABa ⇒ Aaba

• Also, Aaba ⇒ BBaba ⇒ Bababa ⇒ abababa
– using: A → BB, B → ab, and B → ab.

• So ABa ⇒* abababa; that is, abababa is derivable from ABa.
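Direct derivability can be checked by trying every production at every position of a string. The sketch below is one possible way to do this in Python; the function name `directly_derivable` and the encoding of productions as (left, right) pairs are assumptions, and only the productions for A, B, and AB from the example grammar are listed.

```python
def directly_derivable(w0, productions):
    """Return the set of all strings w1 that are obtained from w0 by
    applying one production at one position."""
    results = set()
    for left, right in productions:
        start = w0.find(left)
        while start != -1:
            # w0 = l + left + r  becomes  w1 = l + right + r
            results.add(w0[:start] + right + w0[start + len(left):])
            start = w0.find(left, start + 1)
    return results

# Productions A -> BB, B -> ab, AB -> b as (left side, right side) pairs.
P = [("A", "BB"), ("B", "ab"), ("AB", "b")]

one_step = directly_derivable("ABa", P)

# Check each step of the chain from the example.
chain = ["ABa", "Aaba", "BBaba", "Bababa", "abababa"]
chain_ok = all(b in directly_derivable(a, P) for a, b in zip(chain, chain[1:]))
print(sorted(one_step), chain_ok)
```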

Language of a Grammar

• Definition. Let G = (V, T, S, P) be a phrase-structure grammar. The language generated by G (or the language of G), denoted by L(G), is the set of all strings of terminals that are derivable from the start symbol S.

L(G) = {w ∈ T* | S ⇒* w}.

Example

• Let G = (V, T, S, P) be the grammar where:
– V = {S, 0, 1},
– T = {0, 1},
– P = { S → 11S, S → 0 }.

• What is L(G)?
– At any stage of the derivation we can either:
  • add two 1s at the end of the string, or
  • terminate the derivation by adding a 0 at the end of the string.
– L(G) = {0, 110, 11110, 1111110, …} = the set of all strings that begin with an even number of 1s and end with a single 0.
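The first few words of L(G) can be found by a breadth-first search over the strings derivable from S. The following Python sketch assumes single-character symbols and productions written as (left, right) pairs; `generate` is a hypothetical helper name. Discarding strings longer than the bound is safe for this grammar because neither production shortens a string.

```python
from collections import deque

def generate(productions, start, terminals, max_len):
    """Breadth-first search from `start`; returns the terminal strings of
    length <= max_len that are derivable, shortest first."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if all(ch in terminals for ch in form):
            found.add(form)          # only terminals left: a word of L(G)
            continue
        for left, right in productions:
            i = form.find(left)
            while i != -1:
                nxt = form[:i] + right + form[i + len(left):]
                # Length pruning assumes no production shrinks a string.
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = form.find(left, i + 1)
    return sorted(found, key=len)

P = [("S", "11S"), ("S", "0")]
words = generate(P, "S", {"0", "1"}, 7)
print(words)
```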

Exercise 10.1 (1)

Types of Grammars

• A type 0 (phrase-structure) grammar has no restrictions on its productions.

• A type 1 (or context-sensitive) grammar has productions only of the forms:
– w1 → w2 with length of w2 ≥ length of w1, or
– w1 → λ.

• A type 2 (or context-free) grammar has productions only of the form A → w2, where A is a single nonterminal symbol.

Types of Grammars – cont.

• A type 3 (or regular) grammar has productions only of the form:
– A → aB, or A → a, where
  • A and B are nonterminal symbols, and
  • a is a terminal symbol, or
– S → λ.

• Note.
– Every type 3 grammar is a type 2 grammar
– Every type 2 grammar is a type 1 grammar
– Every type 1 grammar is a type 0 grammar

Types of Grammars - Summary

Type   Restrictions on productions w1 → w2
0      No restrictions
1      l(w1) ≤ l(w2), or w2 = λ
2      w1 = A, where A ∈ N
3      w1 = A and (w2 = aB or w2 = a), where A ∈ N, B ∈ N, a ∈ T; or w1 = S and w2 = λ
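The restrictions in the summary can be turned into a simple check. This is a hedged Python sketch, assuming single-character symbols, productions as (w1, w2) pairs, and the empty string "" standing in for the empty word; `grammar_type` is a hypothetical name. It reports the most restrictive type whose conditions every production meets. The regular grammar used in the second call is an illustrative one, not taken from the slides.

```python
def grammar_type(productions, nonterminals, terminals, start="S"):
    """Return 3, 2, 1 or 0: the most restrictive type whose conditions
    hold for every production (w1, w2). "" stands for the empty word."""
    def is_type3(w1, w2):
        if w1 == start and w2 == "":
            return True
        if w1 not in nonterminals:
            return False
        return (len(w2) == 1 and w2 in terminals) or (
            len(w2) == 2 and w2[0] in terminals and w2[1] in nonterminals)

    def is_type2(w1, w2):
        return w1 in nonterminals        # a single nonterminal on the left

    def is_type1(w1, w2):
        return len(w1) <= len(w2) or w2 == ""

    for level, test in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(test(w1, w2) for w1, w2 in productions):
            return level
    return 0

T = {"0", "1"}
t_cf = grammar_type([("S", "11S"), ("S", "0")], {"S"}, T)
t_reg = grammar_type([("S", "1A"), ("A", "1S"), ("S", "0")], {"S", "A"}, T)
t_any = grammar_type([("A", "BB"), ("B", "ab"), ("AB", "b")], {"S", "A", "B"}, {"a", "b"})
print(t_cf, t_reg, t_any)
```

The first grammar is context-free but not regular because of S → 11S; the third falls back to type 0 because AB → b shortens the string.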

Derivation Trees

• For type 2 (context-free) grammars:

A derivation (or parse) tree is an ordered rooted tree that represents a derivation in the language generated by a context-free grammar, where:

– the root represents the starting symbol;

– the internal vertices represent nonterminal symbols;

– the leaves represent the terminal symbols;

– for a production A → w, the vertex representing A will have children vertices that represent each symbol in w.

Example

• Derivation tree for: the hungry rabbit eats quickly

sentence
  noun phrase
    article
      the
    adjective
      hungry
    noun
      rabbit
  verb phrase
    verb
      eats
    adverb
      quickly
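A derivation tree maps naturally onto a nested data structure. The sketch below is one possible Python encoding, assuming each vertex is a (label, children) pair; the representation and the helper name `leaves` are illustrative choices. Reading the leaves from left to right recovers the derived sentence.

```python
# A vertex is (label, children); a leaf has an empty list of children.
tree = ("sentence", [
    ("noun phrase", [
        ("article", [("the", [])]),
        ("adjective", [("hungry", [])]),
        ("noun", [("rabbit", [])]),
    ]),
    ("verb phrase", [
        ("verb", [("eats", [])]),
        ("adverb", [("quickly", [])]),
    ]),
])

def leaves(node):
    """Collect the leaf labels in left-to-right order."""
    label, children = node
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaves(child)]

sentence = " ".join(leaves(tree))
print(sentence)
```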

Exercise 10.1 (2, 3)

Parsing

• To determine whether a string is in the language generated by a grammar, use:
– Top-down parsing:
  • Begin with S and attempt to derive the word by successively applying productions, or
– Bottom-up parsing:
  • Work backward: begin by inspecting the word and apply productions backward.

Example

• Let G = (V, T, S, P) be the grammar where:
– V = {a, b, c, A, B, C, S}, T = {a, b, c},
– Productions: S → AB, A → Ca, B → Ba, B → Cb, B → b, C → cb, C → b.

• Determine whether cbab is in L(G).

Top-down parsing:
– S ⇒ AB
– S ⇒ AB ⇒ CaB
– S ⇒ AB ⇒ CaB ⇒ cbaB
– S ⇒ AB ⇒ CaB ⇒ cbaB ⇒ cbab

Bottom-up parsing:
– Cab ⇒ cbab
– Ab ⇒ Cab ⇒ cbab
– AB ⇒ Ab ⇒ Cab ⇒ cbab
– S ⇒ AB ⇒ Ab ⇒ Cab ⇒ cbab
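Top-down parsing for this grammar can be sketched as a backtracking search that starts from S and always rewrites the leftmost nonterminal. The Python code below is an exhaustive search under stated assumptions, not an efficient parser: symbols are single characters, uppercase letters are nonterminals, and no production shortens a string (true for this grammar), which justifies the length-based pruning. The name `top_down_parse` is hypothetical.

```python
def top_down_parse(target, productions, start="S"):
    """Search for a leftmost derivation of `target` from `start`.
    Returns the list of intermediate strings, or None if there is none."""
    def search(form, path):
        if form == target:
            return path
        # Index of the leftmost nonterminal (uppercase letter), if any.
        i = next((k for k, s in enumerate(form) if s.isupper()), None)
        # Prune: too long, no nonterminal left, or terminal prefix mismatch.
        if i is None or len(form) > len(target) or form[:i] != target[:i]:
            return None
        for left, right in productions:
            if left == form[i]:
                nxt = form[:i] + right + form[i + 1:]
                result = search(nxt, path + [nxt])
                if result:
                    return result
        return None
    return search(start, [start])

P = [("S", "AB"), ("A", "Ca"), ("B", "Ba"), ("B", "Cb"),
     ("B", "b"), ("C", "cb"), ("C", "b")]

derivation = top_down_parse("cbab", P)
print(derivation)
print(top_down_parse("cab", P))
```

A result of None means the search found no derivation, so under the assumptions above the word is not in L(G).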

Backus-Naur Form

• Used with type 2 (context-free) grammars, for example in the specification of programming languages:
– Use ::= instead of →
– Enclose nonterminal symbols within < >
– Group productions with the same left side using the symbol |

• Example.
– <signed integer> ::= <sign><integer>
– <sign> ::= + | -
– <integer> ::= <digit> | <digit><integer>
– <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
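Each BNF rule in the example can be mirrored by one small function. The following Python sketch is a hand-written recognizer that follows the rules directly; the function names are assumptions chosen to match the nonterminals. Note that the grammar requires a sign, so an unsigned string such as 42 is rejected.

```python
DIGITS = set("0123456789")   # <digit> ::= 0 | 1 | ... | 9

def is_integer(s):
    """<integer> ::= <digit> | <digit><integer>"""
    if len(s) == 1:
        return s in DIGITS
    return len(s) > 1 and s[0] in DIGITS and is_integer(s[1:])

def is_signed_integer(s):
    """<signed integer> ::= <sign><integer>, with <sign> ::= + | -"""
    return len(s) >= 2 and s[0] in "+-" and is_integer(s[1:])

for candidate in ["-109", "+7", "42", "-", "+-3"]:
    print(candidate, is_signed_integer(candidate))
```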
