Description of programming languages using regular expressions and context free grammars


Upload: isabella-wilkinson

Post on 11-Jan-2016



Description of programming languages

Using regular expressions and context free grammars


Introduction

• Programming languages must be described in an exact language
  – So there is no discussion about whether a language element is legal or not
• I will introduce 2 description languages
  – Regular expressions
    • Used to describe the "small" parts of a programming language
      – Identifiers, numbers, etc.
  – Context free grammars
    • Used to describe the "bigger" parts of a programming language
      – Expressions, statements, classes, etc.


Regular expressions defined

• We need an alphabet called Σ
  – Example alphabets: ASCII, Unicode
• Regular expressions describe sets of strings
  – Ø (the empty set) is a regular expression
  – { ε } is a regular expression
    • ε means the empty string
  – All sets {a} where a is in the alphabet Σ are regular expressions
  – From two regular expressions R and S we can generate more regular expressions
    • R | S    the union R ∪ S
    • RS       concatenations of a string from R and a string from S
    • R*       zero or more concatenated strings from R; if R is {a} then R* is {ε, a, aa, aaa, … }
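The three operators can be tried out directly in Java. The following is a minimal sketch, not part of the original slides: the class name RegexOperators and the helper inLanguage are made up for illustration, while Pattern.matches is the standard java.util.regex call that tests whether a whole string belongs to the set described by an expression.

```java
import java.util.regex.Pattern;

public class RegexOperators {
    // True if the whole string s is in the set described by regex.
    // (inLanguage is an illustrative helper name, not a library method.)
    public static boolean inLanguage(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage("a|b", "b"));   // union of {a} and {b}: true
        System.out.println(inLanguage("ab", "ab"));   // concatenation: true
        System.out.println(inLanguage("a*", ""));     // star contains the empty string: true
        System.out.println(inLanguage("a*", "aaa"));  // true
        System.out.println(inLanguage("a*", "ab"));   // b is not in {a}*: false
    }
}
```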


Regular expressions examples

• Set of non-negative integers (leading zeros allowed)
  – (0|1|2|3|4|5|6|7|8|9) (0|1|2|3|4|5|6|7|8|9)*
• Set of words in English
  – (a|b|…|z)(a|b|…|z)*
  – Not exactly English …
    • bbz is in the set, but is not an English word


Regular expressions, short hand notation

• R+ means R R*
  – 1 or more occurrences
• R? means ε | R
  – 0 or 1 occurrence
• [a-z] means a|b|c|…|z
• [a-zA-Z] means [a-z] | [A-Z]
• Examples
  – Integer: -?[0-9]+
  – Identifier: [a-zA-Z][a-zA-Z0-9]*
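The two example expressions can be checked against a few strings. This is a small sketch using java.util.regex; the class name ShorthandExamples and the methods isInteger and isIdentifier are illustrative names, not from the slides.

```java
import java.util.regex.Pattern;

public class ShorthandExamples {
    // The two expressions from the slide, compiled once.
    private static final Pattern INTEGER = Pattern.compile("-?[0-9]+");
    private static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // matches() requires the whole string to fit the expression.
    public static boolean isInteger(String s) {
        return INTEGER.matcher(s).matches();
    }

    public static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));     // true: optional minus, then digits
        System.out.println(isInteger("-"));       // false: + demands at least one digit
        System.out.println(isIdentifier("x1"));   // true
        System.out.println(isIdentifier("1x"));   // false: must start with a letter
    }
}
```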


Regular expressions in Java

• Java API which uses regular expressions
  – Class String
    • String[] split(String regex)
    • "Java is my favorite language".split(" ")
      – produces an array {"Java", "is", "my", "favorite", "language"}
      – " " is a very simple regular expression
  – Package java.util.regex
    • Class Pattern
    • Class Matcher
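A short sketch of both APIs in use. String.split, Pattern.compile, Matcher.find and Matcher.group are standard library calls; the class name JavaRegexDemo, the helper methods and the sample input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexDemo {
    // Splits a sentence on single spaces, as on the slide.
    public static String[] words(String sentence) {
        return sentence.split(" ");
    }

    // Collects every substring that matches the integer expression -?[0-9]+.
    public static List<String> findIntegers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("-?[0-9]+").matcher(text);
        while (m.find()) {          // find() moves to the next match, if any
            found.add(m.group());   // group() is the text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("Java is my favorite language")));
        // prints [Java, is, my, favorite, language]
        System.out.println(findIntegers("x = 12 - y + -7"));
        // prints [12, -7]
    }
}
```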


What regular expressions can’t do

• Regular expressions can describe simple languages.
• Regular expressions have no "memory"
  – Cannot describe parenthesis structures nested to arbitrary depth
    • (((a + b) + c) + d)
    • if (…) { if (…) … else …} else …
• We need something stronger!
  – Context free grammars
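To see what kind of "memory" is missing, consider checking that parentheses are balanced. A program needs a counter that can grow without bound, which a regular expression as defined on these slides does not have. A minimal sketch (the class and method names are illustrative):

```java
public class BalancedParens {
    // Returns true if every '(' is closed by a later ')' and no ')' comes too early.
    // The counter depth is the memory that a regular expression lacks.
    public static boolean isBalanced(String s) {
        int depth = 0;
        for (char c : s.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false;   // a ')' with no matching '('
                }
            }
        }
        return depth == 0;          // everything opened was closed
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("(((a + b) + c) + d)"));  // true
        System.out.println(isBalanced("((a + b)"));             // false
    }
}
```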


Context free grammars defined

• A context free grammar consists of 4 parts
  – V is an alphabet
  – Σ is a set of terminals, Σ ⊂ V
    • The elements of the set V − Σ are called non-terminals
  – R is a set of production rules, R ⊆ (V − Σ) × V*
  – S is the start symbol, S ∈ V − Σ


Context free grammars examples

• Example: a, b
  – Alphabet {a, b, A}
  – Terminals { a, b }
    • Non-terminals { A }
  – Productions
    • {A → Aa, A → Ab, A → a, A → b}
  – Some derivations
    • A → Aa → Aaa → Abaa → abaa
    • A → Ab → ab
    • A → Ab → bb
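A derivation is just repeated replacement of a non-terminal by the right-hand side of one of its productions. The sketch below replays the first derivation above. The names Derivation, step and derive are made up for illustration, and the code writes the arrow as "->" to stay in plain ASCII.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Derivation {
    // Right-hand sides of the four productions A -> Aa | Ab | a | b.
    private static final List<String> RIGHT_HAND_SIDES = Arrays.asList("Aa", "Ab", "a", "b");

    // Replaces the leftmost non-terminal A in the form with the given right-hand side.
    public static String step(String form, String rhs) {
        if (!RIGHT_HAND_SIDES.contains(rhs)) {
            throw new IllegalArgumentException("no production A -> " + rhs);
        }
        int i = form.indexOf('A');
        if (i < 0) {
            throw new IllegalArgumentException("no non-terminal left in " + form);
        }
        return form.substring(0, i) + rhs + form.substring(i + 1);
    }

    // Starts from the start symbol A and applies the right-hand sides in order.
    public static List<String> derive(String... rhss) {
        List<String> forms = new ArrayList<>();
        String form = "A";
        forms.add(form);
        for (String rhs : rhss) {
            form = step(form, rhs);
            forms.add(form);
        }
        return forms;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" -> ", derive("Aa", "Aa", "Ab", "a")));
        // prints A -> Aa -> Aaa -> Abaa -> abaa
    }
}
```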


Example: Boolean expressions

• We only state the productions explicitly
  – Terminals and non-terminals can be inferred by looking at the productions
  – Convention
    • Capital letters: Non-terminals
    • Non-capital letters: Terminals
• Boolean expressions
  – E → true
  – E → false
  – E → E && E
  – E → E || E
  – E → (E)
  – E → !E
  – Derivations
    • E → E && E → E && (E) → E && (E || E) →* true && (false || true)
    • Sometimes pictured as a (parse) tree.
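A grammar like this can be turned into a small parser. The sketch below is a recursive descent parser that also evaluates the expression. It rests on assumptions that are not in the slides: the grammar is rewritten into the layered form shown in the comments, because the productions E → E && E and E → E || E are left-recursive and ambiguous; the precedence (! binds tighter than &&, which binds tighter than ||) is borrowed from Java; and spaces are simply dropped before parsing. All class and method names are illustrative.

```java
public class BoolParser {
    private final String src;
    private int pos = 0;

    private BoolParser(String text) {
        this.src = text.replace(" ", "");   // simplification: spaces carry no meaning
    }

    // Parses and evaluates the text; throws IllegalArgumentException on a syntax error.
    public static boolean evaluate(String text) {
        BoolParser p = new BoolParser(text);
        boolean value = p.or();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return value;
    }

    // Consumes the token if the remaining input starts with it.
    private boolean accept(String token) {
        if (src.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    // Or -> And ('||' And)*
    private boolean or() {
        boolean value = and();
        while (accept("||")) {
            boolean right = and();   // always parse the right operand
            value = value || right;
        }
        return value;
    }

    // And -> Not ('&&' Not)*
    private boolean and() {
        boolean value = not();
        while (accept("&&")) {
            boolean right = not();
            value = value && right;
        }
        return value;
    }

    // Not -> '!' Not | Primary
    private boolean not() {
        if (accept("!")) {
            return !not();
        }
        return primary();
    }

    // Primary -> 'true' | 'false' | '(' Or ')'
    private boolean primary() {
        if (accept("true")) return true;
        if (accept("false")) return false;
        if (accept("(")) {
            boolean value = or();
            if (!accept(")")) {
                throw new IllegalArgumentException("expected ) at position " + pos);
            }
            return value;
        }
        throw new IllegalArgumentException("expected an expression at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("true && (false || true)"));  // true
    }
}
```

Each method corresponds to one non-terminal of the rewritten grammar, so the chain of method calls mirrors the parse tree.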


What context free grammars can’t do

• Context free grammars cannot be used to check that a variable is declared before it is used
  – And certainly not to check the variable's type
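Such checks are typically done in a separate pass that remembers which names have been declared so far. The following is a rough sketch of that idea only. The input format ("declare NAME" and "use NAME" strings) is invented for this example, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeclaredBeforeUse {
    // Each event is "declare NAME" or "use NAME" (a made-up format for this sketch).
    // Returns the names that are used before they have been declared.
    public static List<String> undeclaredUses(String... events) {
        Set<String> declared = new HashSet<>();   // a very small symbol table
        List<String> errors = new ArrayList<>();
        for (String event : events) {
            String[] parts = event.split(" ");
            String kind = parts[0];
            String name = parts[1];
            if (kind.equals("declare")) {
                declared.add(name);
            } else if (!declared.contains(name)) {
                errors.add(name);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(undeclaredUses("declare x", "use x", "use y", "declare y"));
        // prints [y]
    }
}
```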


The phases of a compiler

• Lexical analysis (scanning)
  – Using regular expressions
• Syntax analysis (parsing)
  – Using context free grammars
• Semantic analysis
  – Using a symbol table
• Code generation
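As an illustration of the first phase, here is a minimal scanner sketch built on java.util.regex. It is not a complete lexer: the token kinds (INT, ID, OP), the set of operators and all names are assumptions chosen for this example. Named groups in the pattern tell which kind of token matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TinyScanner {
    // One alternative per token kind; WS is whitespace, which is skipped.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<INT>[0-9]+)|(?<ID>[a-zA-Z][a-zA-Z0-9]*)|(?<OP>[-+*/=()])|(?<WS>\\s+)");

    // Splits the input into tokens written as KIND:text.
    public static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // The next token must start exactly at pos, otherwise a character is illegal.
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException("illegal character at position " + pos);
            }
            if (m.group("INT") != null) {
                tokens.add("INT:" + m.group());
            } else if (m.group("ID") != null) {
                tokens.add("ID:" + m.group());
            } else if (m.group("OP") != null) {
                tokens.add("OP:" + m.group());
            }
            pos = m.end();   // continue right after the token just read
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("x1 = 42 + y"));
        // prints [ID:x1, OP:=, INT:42, OP:+, ID:y]
    }
}
```

The token list produced here is what a parser based on a context free grammar would take as its input.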


References

• Wikipedia
  – Regular expression, http://en.wikipedia.org/wiki/Regular_expression
  – Context-free grammar, http://en.wikipedia.org/wiki/Context-free_grammar
• Friedl: Mastering Regular Expressions, 2nd edition, O'Reilly 2002
  – An entire book (460 pages) devoted to regular expressions
• J2SE 5.0 API specification
  – Package java.util.regex
• Scott A. Hommel: Regular Expressions, The Java Tutorial
  – http://java.sun.com/docs/books/tutorial/extra/regex/index.html
• Lewis & Papadimitriou: Elements of the Theory of Computation, Pearson 1997
  – Introduction to regular expressions and context free grammars (and a lot more)
• Aho, Sethi & Ullman: Compilers: Principles, Techniques and Tools, Addison Wesley 1986
  – A famous book on compilers
  – Referred to as "The Dragon Book"