Invitation to Computer Science, 5th Edition, Chapter 11: Compilers and Language Translation

Upload: aubrie-ferguson

Post on 26-Dec-2015

Invitation to Computer Science 5th Edition

Chapter 11

Compilers and Language Translation

Objectives

In this chapter, you will learn about:

• The compilation process
  – Phase I: Lexical analysis
  – Phase II: Parsing
  – Phase III: Semantics and code generation
  – Phase IV: Code optimization

Invitation to Computer Science, 5th Edition 2

Introduction

• Compiler
  – Translates a high-level language into machine language prior to execution
• Assembly language and machine language are related one to one
• One to many
  – Relationship between a high-level language and machine language
• Compiler goals
  – Correctness; efficient and concise code

The Compilation Process

• Four phases of compilation
  – Phase I: Lexical analysis
  – Phase II: Parsing
  – Phase III: Semantic analysis and code generation
  – Phase IV: Code optimization

Figure 11.1 General Structure of a Compiler

Figure 11.2 Overall Execution Sequence of a High-level Language Program

Phase I: Lexical Analysis

• Lexical analyzer
  – Groups input characters into units called tokens
• Scanner
  – Discards nonessential characters, such as blanks and tabs
  – Groups remaining characters into high-level syntactical units such as symbols, numbers, and operators
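The scanner's two jobs described above, discarding blanks and tabs and grouping the remaining characters into tokens, can be sketched in a few lines of Python. This is a minimal illustration, not the book's scanner: the token class names (NUMBER, SYMBOL, OPERATOR, PUNCT) and the function name scan are assumptions made for this example, and Figure 11.3 may classify tokens differently.

```python
import re

# Hypothetical token classes chosen for this sketch.
TOKEN_SPEC = [
    ("NUMBER",   r"\d+(?:\.\d+)?"),
    ("SYMBOL",   r"[A-Za-z_]\w*"),
    ("OPERATOR", r"==|[=+\-*/]"),
    ("PUNCT",    r"[();]"),
    ("SKIP",     r"[ \t\n]+"),      # nonessential characters, discarded below
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Return a list of (token_class, text) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":           # drop blanks and tabs
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("x = x + 12"))
```

Listing the classes in a table keeps the scanner itself generic: adding a token class means adding one row rather than changing the loop.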

Figure 11.3 Typical Token Classifications

Phase II: Parsing

• During the parsing phase:
  – Compiler determines whether the tokens recognized by the scanner during phase I fit together in a grammatically meaningful way
• Parsing
  – Process of diagramming a high-level language statement
  – Done by a program called a parser

Grammars, Languages, and BNF

• Backus-Naur Form
  – Named after its designers, John Backus and Peter Naur
  – Syntax of a language is specified as a set of rules, also called productions
  – Entire collection of rules is called a grammar
  – Left-hand side of a BNF rule is the name of a single grammatical category
  – Operator ::= means “is defined as”; the definition that follows it is also called the right-hand side

Grammars, Languages, and BNF (continued)

• Backus-Naur Form
  – Nonterminal: intermediate grammatical category used to help explain and organize the language
  – Goal symbol: final nonterminal
  – Language: collection of all statements that can be successfully parsed
  – Metasymbols: <, >, and ::=
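To make the vocabulary above concrete, here is a small sketch that stores a grammar as data and enumerates its language. The three rules are hypothetical and chosen only for illustration (they are not the grammar of Figure 11.4), and the helper names terminals and language are likewise assumptions. The enumeration only terminates because this toy grammar has no recursive rules.

```python
# A hypothetical grammar, shown first in BNF:
#   <assignment> ::= <variable> = <expression>
#   <expression> ::= <variable> | <variable> + <variable>
#   <variable>   ::= x | y | z
# Each key is a left-hand side; each value lists its alternative right-hand sides.
GRAMMAR = {
    "<assignment>": [["<variable>", "=", "<expression>"]],
    "<expression>": [["<variable>"], ["<variable>", "+", "<variable>"]],
    "<variable>":   [["x"], ["y"], ["z"]],
}
GOAL = "<assignment>"   # the goal symbol


def terminals(grammar):
    """Symbols that appear on a right-hand side but have no rule of their own."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - set(grammar)


def language(grammar, symbol):
    """Every token sequence derivable from symbol (non-recursive grammars only)."""
    if symbol not in grammar:                  # a terminal stands for itself
        return [[symbol]]
    sentences = []
    for alternative in grammar[symbol]:
        partial = [[]]
        for part in alternative:
            partial = [p + s for p in partial for s in language(grammar, part)]
        sentences.extend(partial)
    return sentences


statements = language(GRAMMAR, GOAL)
print(len(statements))                         # 3 targets * 12 expressions = 36
print(["x", "=", "y", "+", "z"] in statements)
```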

Parsing Concepts and Techniques

• Parser
  – Receives as input the BNF description of a high-level language and a sequence of tokens recognized by the scanner
• Look-ahead parsing algorithms
  – “Look down the road” a few tokens to see what would happen if a certain choice is made
• Ambiguous
  – Grammar that allows the construction of two or more distinct parse trees for the same statement
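One common way to use look-ahead is a recursive-descent parser that peeks at the next token before committing to a rule. The sketch below assumes a small hypothetical grammar, <assignment> ::= <variable> = <expression> and <expression> ::= <variable> | <variable> + <expression>, and takes a plain list of token strings as input. The function names and the nested-tuple tree shape are choices made for this example, not the book's algorithm.

```python
def parse_assignment(tokens):
    """Parse a list of token strings; return a nested-tuple parse tree."""
    pos = 0

    def peek():
        # One-token look-ahead: inspect the next token without consuming it.
        return tokens[pos] if pos < len(tokens) else None

    def expect(predicate, what):
        nonlocal pos
        token = peek()
        if token is None or not predicate(token):
            raise SyntaxError(f"expected {what} at token {pos}, found {token!r}")
        pos += 1
        return token

    def variable():
        return ("variable", expect(str.isidentifier, "a variable"))

    def expression():
        left = variable()
        if peek() == "+":                       # look ahead to pick the alternative
            expect(lambda t: t == "+", "'+'")
            return ("expression", left, "+", expression())
        return ("expression", left)

    target = variable()
    expect(lambda t: t == "=", "'='")
    tree = ("assignment", target, "=", expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} at token {pos}")
    return tree


print(parse_assignment(["x", "=", "x", "+", "y"]))
```

Because the second alternative of <expression> recurses on the right, each statement has exactly one tree under this grammar, so the parser never has to choose between competing parses.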

Figure 11.4 First Attempt at a Grammar for a Simplified Assignment Statement

Figure 11.5 Parse Tree Produced by the Parser

Figure 11.6 Second Attempt at a Grammar for Assignment Statements

Figure 11.7 Two Parse Trees for the Statement x = x + y + z
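The figures themselves are not reproduced in this transcript, but the ambiguity can be demonstrated directly. The sketch below assumes the rule <expression> ::= <expression> + <expression> | <variable>, a typical ambiguous rule that may differ in detail from the grammar in Figure 11.6, and enumerates every parse tree for the right-hand side x + y + z. The function name parse_trees is hypothetical.

```python
def parse_trees(tokens):
    """Return every parse tree for tokens under the assumed rule
    <expression> ::= <expression> + <expression> | <variable>."""
    if len(tokens) == 1 and tokens[0].isidentifier():
        return [tokens[0]]                      # a lone variable
    trees = []
    for i, token in enumerate(tokens):
        if token == "+":                        # try each "+" as the root
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, "+", right))
    return trees


for tree in parse_trees(["x", "+", "y", "+", "z"]):
    print(tree)
# ('x', '+', ('y', '+', 'z'))
# (('x', '+', 'y'), '+', 'z')
```

Two distinct trees for one statement is exactly what makes the grammar ambiguous: the first groups the addition as x + (y + z), the second as (x + y) + z.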

Figure 11.8 Third Attempt at a Grammar for Assignment Statements

Figure 11.9 Grammar for a Simplified Version of an if-else Statement

Figure 11.10 Parse Tree for the Statement if (x == y) x = z; else x = y;

Phase III: Semantics and Code Generation

• Semantic record
  – Data structure that stores information about a nonterminal
• First part of code generation
  – Involves a pass over the parse tree to determine whether all branches of the tree are semantically valid
• Code generation
  – Compiler must determine how transformation of grammatical objects can be accomplished in machine language
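The two steps above, a semantic check over the tree followed by translation, can be sketched together. Everything specific here is an assumption made for illustration: the fields of SemanticRecord, the tuple shape of the tree, and the LOAD / ADD / STORE instructions of an imagined single-register machine. The output is not claimed to match Figure 11.11.

```python
from dataclasses import dataclass


@dataclass
class SemanticRecord:
    """What this sketch remembers about one nonterminal (assumed fields)."""
    name: str       # variable or temporary cell that holds the value
    datatype: str   # for example "integer"


def generate(tree, symbol_table):
    """Translate ('=', target, expr) into a list of instruction strings."""
    code = []
    temp_count = 0

    def visit(node):
        nonlocal temp_count
        if isinstance(node, str):                       # a variable
            if node not in symbol_table:
                raise NameError(f"undeclared variable {node!r}")
            return SemanticRecord(node, symbol_table[node])
        op, left, right = node
        if op != "+":
            raise ValueError(f"unsupported operator {op!r}")
        lrec, rrec = visit(left), visit(right)
        if lrec.datatype != rrec.datatype:              # semantic validity check
            raise TypeError(f"cannot add {lrec.datatype} and {rrec.datatype}")
        temp_count += 1
        temp = f"TEMP{temp_count}"
        code.extend([f"LOAD {lrec.name}", f"ADD {rrec.name}", f"STORE {temp}"])
        return SemanticRecord(temp, lrec.datatype)

    _, target, expr = tree
    result = visit(expr)
    if target not in symbol_table:
        raise NameError(f"undeclared variable {target!r}")
    if symbol_table[target] != result.datatype:
        raise TypeError(f"cannot assign {result.datatype} to {symbol_table[target]}")
    code.extend([f"LOAD {result.name}", f"STORE {target}"])
    return code


# x = (x + y) + z, with every variable declared as an integer
table = {"x": "integer", "y": "integer", "z": "integer"}
for line in generate(("=", "x", ("+", ("+", "x", "y"), "z")), table):
    print(line)
```

Generating code rule by rule like this is simple but naive: each addition stores its result in a temporary and the next step immediately reloads it, which is the kind of waste an optimization phase can remove.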

Phase III: Semantics and Code Generation (continued)

• Optimization
  – Compiler polishes and fine-tunes the translation so that it runs a little faster or occupies a little less memory

Figure 11.11 Code Generation for the Assignment Statement x = x + y + z

Phase IV: Code Optimization

• Efficiency
  – Ability to write highly optimized programs that contained no wasted microseconds or unnecessary memory cells
• Goal in compiler design today
  – Provide a wide array of compiler tools to simplify the programmer’s task and increase productivity
• Integrated development environment
  – Compiler is embedded within a collection of supporting software development routines

Phase IV: Code Optimization (continued)

• Two types of optimization
  – Local optimization and global optimization
• Possible local optimizations
  – Constant evaluation
  – Strength reduction
  – Eliminating unnecessary operations
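The three local optimizations listed above can each be shown on a toy representation. This is a sketch under stated assumptions: expressions are nested tuples such as ("+", "x", 2), instructions are strings for an imagined single-register machine with LOAD, ADD, and STORE, the code is straight-line with no jumps, and TEMP cells are not used outside the block. The function names simplify and peephole are hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def simplify(node):
    """Constant evaluation and strength reduction on a tuple expression tree."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = simplify(left), simplify(right)
    if op in OPS and isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)             # constant evaluation: 3 + 4 -> 7
    if op == "*" and right == 2 and isinstance(left, str):
        return ("+", left, left)                # strength reduction: x * 2 -> x + x
    return (op, left, right)


def peephole(code):
    """Eliminate unnecessary operations in straight-line code."""
    kept = []
    for instruction in code:
        op, cell = instruction.split()
        if op == "LOAD" and kept and kept[-1] == f"STORE {cell}":
            continue                            # value is already in the register
        kept.append(instruction)
    # Cells read by any remaining non-STORE instruction.
    read = {i.split()[1] for i in kept if not i.startswith("STORE")}
    # Drop stores to TEMP cells that nothing in the block reads.
    return [i for i in kept
            if not (i.startswith("STORE TEMP") and i.split()[1] not in read)]


print(simplify(("+", ("*", "x", 2), ("+", 3, 4))))

naive = ["LOAD x", "ADD y", "STORE TEMP1",
         "LOAD TEMP1", "ADD z", "STORE TEMP2",
         "LOAD TEMP2", "STORE x"]
print(peephole(naive))
```

On the naive eight-instruction sequence, the peephole pass leaves four instructions: load x, add y, add z, store x. Each rewrite inspects only a small window of code, which is what distinguishes local optimization from global optimization.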

Figure 11.12 Optimized Code for the Assignment Statement x = x + y + z

Summary

• Compiler
  – Piece of system software that translates high-level languages into machine language
• Goals of a compiler
  – Correctness and the production of efficient and concise code
• Source program
  – High-level language program

Summary (continued)

• Object program
  – The machine language translation of the source program
• Phases of the compilation process
  – Phase I: Lexical analysis
  – Phase II: Parsing
  – Phase III: Semantic analysis and code generation
  – Phase IV: Code optimization