Post on 21-Dec-2015
Mini-Pascal
Compiling the Mini-Pascal (MPC) language
Subset of the Pascal programming language
Somewhat similar to the Java and “C” programming languages
There are many differences, however
The differences make it much easier to compile ☺
We will discuss details of the actual language when they become important
Scanning
1st stage of compiling a program
Code is written in a programming language
High-level languages are supposed to resemble English…
…but they don’t.
They contain many features specifically designed for the computer
How often do you use a semicolon?
Scanning
Raw text is hard for a computer to understand
Much easier for it to work with objects
Scanning converts text into tokens
A token is an object encoding a single text idea
This is a very common problem, not just in compilers
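To make "an object encoding a single text idea" concrete, here is a minimal sketch of what a token object might look like. The class name Token, its two fields, and the kind names are assumptions for illustration; the slides do not prescribe a representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token object: one category plus the text it came from."""
    kind: str  # assumed category names: "Operator", "Int", "String", "Id"
    text: str  # the characters that make up the token


# The raw text "x := 13;" could be handed to later stages as four objects.
tokens = [
    Token("Id", "x"),
    Token("Operator", ":="),
    Token("Int", "13"),
    Token("Operator", ";"),
]
```

Later stages can then compare token.kind and token.text instead of re-reading characters.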
Scanning
Only consider token being processed This stage of compilation only generates tokens Looks for obvious lexical errors --- text that cannot
be legal Does not track past tokens Does not worry if text has any real meaning
Understanding meaning occurs later in the process
Lexical analysis for tokens in English
Legal: ? “”, Snap p. 35 crackle < pop ppo quack!
Illegal: ¡ I am excited !
Illegal: gemütlichkeit façade
Types of Tokens in Mini-Pascal
Operator
All the meaningful symbols in Mini-Pascal:
  Numerical: + - * ^
  Comparative: < > <= >= <> ==
  Separator: ( ) [ ] . ; ,
  Assignment: :=
Spaces are meaningful
  “:=” is one token
  “: =” is two tokens
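One way to get the ":=" versus ": =" behavior is to try two-character operators before one-character ones. The sketch below assumes that approach; the function name scan_operators is hypothetical, and treating a lone ":" and a lone "=" as one-character tokens is an assumption made only so that ": =" yields two tokens as the slide states.

```python
# Two-character operators from the slide; these are tried first.
TWO_CHAR = ("<=", ">=", "<>", "==", ":=")
# One-character operators from the slide, plus ":" and "=" (assumed).
ONE_CHAR = "+-*^<>()[].;,:="


def scan_operators(text):
    """Split text made only of operators and spaces into operator tokens."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # a space ends the current token and is then dropped
        elif text[i:i + 2] in TWO_CHAR:
            tokens.append(text[i:i + 2])
            i += 2
        elif text[i] in ONE_CHAR:
            tokens.append(text[i])
            i += 1
        else:
            raise ValueError(f"not an operator: {text[i]!r}")
    return tokens
```

With this sketch, scan_operators(":=") gives one token and scan_operators(": =") gives two, because the space prevents the two-character match.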
Types of Tokens in Mini-Pascal
Int
Includes all numbers defined by Mini-Pascal
Mini-Pascal does not include real numbers
An Int token is an uninterrupted series of digits
  “1354934573212” is one token
  “13 45” is two tokens: “13” and “45”
  “13.65” is three tokens: “13”, “.”, and “65”
  “2,585” is three tokens: “2”, “,”, and “585”
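The examples above can be reproduced with a short loop that keeps consuming digits until something else appears. This is a sketch under assumptions: the function name scan_ints is hypothetical, tokens are shown as (kind, text) pairs, and only the "." and "," separators needed for these examples are handled.

```python
DIGITS = "0123456789"


def scan_ints(text):
    """Tokenize text containing only digits, spaces, '.' and ','."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in DIGITS:
            start = i
            # An Int token runs until the first non-digit character.
            while i < len(text) and text[i] in DIGITS:
                i += 1
            tokens.append(("Int", text[start:i]))
        elif ch in ".,":
            tokens.append(("Operator", ch))
            i += 1
        elif ch.isspace():
            i += 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return tokens
```

Because there is no real-number token, the "." in "13.65" simply ends the first Int and becomes a separator token of its own.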
Types of Tokens in Mini-Pascal
String
The literal strings in Mini-Pascal
Java strings begin and end with a double quote (“”)
Pascal strings begin and end with a single quote (‘’)
Can include any set of characters, letters, and numbers, but cannot go across multiple lines
‘Hi Mom. #1’ is one token whose value is Hi Mom. #1
Note: The quotes are not included in the token
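A string scanner following these rules might look like the sketch below. The function name scan_string and its (value, next index) return shape are assumptions. Standard Pascal's doubled-quote escape for an embedded quote is not handled, since the slides do not mention it.

```python
def scan_string(text, start):
    """Scan a single-quoted literal whose opening quote is at text[start].

    Returns (value, next_index); the value excludes both quotes.
    """
    if start >= len(text) or text[start] != "'":
        raise ValueError("expected an opening single quote")
    i = start + 1
    while i < len(text):
        if text[i] == "'":
            return text[start + 1:i], i + 1
        if text[i] == "\n":
            # Strings cannot go across multiple lines.
            raise ValueError("string literal runs past the end of the line")
        i += 1
    raise ValueError("unterminated string literal")
```

Reaching a newline before the closing quote is the kind of obvious lexical error the scanner is expected to report.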
Types of Tokens in Mini-Pascal
Identifier/Id
Includes keywords (reserved) in Mini-Pascal:
  and array begin case const div do downto else end for function if mod nil not of or procedure program record repeat then to type until var while
Also potential variable and method names
  Begin with a letter, followed by any combination of letters and numbers
  DO NOT worry (yet) if it is an actual name
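The identifier rule can be sketched as: read one letter, then keep reading letters and digits, then check the result against the keyword list. The function name scan_identifier and its return shape are hypothetical. Matching keywords exactly in lowercase and restricting letters to ASCII are both assumptions, since the slides do not say how case or non-English letters are treated.

```python
KEYWORDS = frozenset(
    "and array begin case const div do downto else end for function if "
    "mod nil not of or procedure program record repeat then to type "
    "until var while".split()
)


def is_letter(ch):
    return ch.isascii() and ch.isalpha()


def scan_identifier(text, start):
    """Return (word, is_keyword, next_index) for the Id starting at start."""
    if start >= len(text) or not is_letter(text[start]):
        raise ValueError("an identifier must begin with a letter")
    i = start
    while i < len(text) and text[i].isascii() and text[i].isalnum():
        i += 1
    word = text[start:i]
    # Only the spelling is checked here; whether the name is actually
    # declared is left to a later stage of compilation.
    return word, word in KEYWORDS, i
```

Keywords and ordinary names share one scanning path; the keyword list is consulted only after the whole word has been read.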
Other Work While Scanning
Comments
Pascal also includes comments
  Begin with either “{” or “(*”
  Then include any legal characters, including letters, numbers, spaces, and newlines
  End with either “}” or “*)”
  { This is a legal comment *)
  (* and so is this }
There is no comment token: it is not used in compilation
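Since comments produce no token, the scanner only needs to find where a comment ends and resume from there. The sketch below assumes a hypothetical helper skip_comment that returns the index just past the comment. Either closing delimiter ends a comment regardless of which opener started it, matching the two examples above.

```python
def skip_comment(text, start):
    """Return the index just past the comment that begins at text[start]."""
    if text.startswith("(*", start):
        i = start + 2
    elif text.startswith("{", start):
        i = start + 1
    else:
        raise ValueError("no comment starts here")
    while i < len(text):
        if text[i] == "}":
            return i + 1
        if text.startswith("*)", i):
            return i + 2
        i += 1  # any other character, including a newline, is skipped
    raise ValueError("unterminated comment")
```

The caller discards everything between start and the returned index and continues scanning from there.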