SYMBOL TABLE
A symbol table is a data structure kept by a
translator that allows it to keep track of each
declared name and its binding.
Assume for now that each name is unique within
its local scope.
The data structure can be any implementation of
a dictionary, where the name is the key.
HANDLING NONLOCAL REFERENCES
1. Each time a scope is entered, push a new dictionary onto
the stack.
2. Each time a scope is exited, pop a dictionary off the top of
the stack.
3. For each name declared, generate an appropriate binding
and enter the name-binding pair into the dictionary on the
top of the stack.
4. Given a name reference, search the dictionary on top of
the stack:
a) If found, return the binding.
b) Otherwise, repeat the process on the next dictionary down in
the stack.
c) If the name is not found in any dictionary, report an error.
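The four steps above can be sketched as a stack of dictionaries. This is a minimal illustration, not a full translator: the class name `SymbolTable` and its method names are assumptions made for this example, and each binding is just the declaring line number, as in the dictionary-stack example for `sort`.

```python
class SymbolTable:
    """A stack of dictionaries, one per open scope (innermost scope last)."""

    def __init__(self):
        self.scopes = []

    def enter_scope(self):
        # Step 1: push a new dictionary when a scope is entered.
        self.scopes.append({})

    def exit_scope(self):
        # Step 2: pop the top dictionary when a scope is exited.
        self.scopes.pop()

    def declare(self, name, binding):
        # Step 3: enter the name-binding pair into the top dictionary.
        self.scopes[-1][name] = binding

    def lookup(self, name):
        # Step 4: search from the top of the stack downwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared name: {name}")


table = SymbolTable()
table.enter_scope()                  # global scope
table.declare("sort", 1)
table.enter_scope()                  # body of sort
for name, line in [("a", 1), ("size", 1), ("i", 2), ("j", 2)]:
    table.declare(name, line)
table.enter_scope()                  # block containing the declaration of t
table.declare("t", 6)
t_binding = table.lookup("t")        # found in the top dictionary
size_binding = table.lookup("size")  # found one dictionary down
table.exit_scope()                   # leaving the block removes t
```

After the final `exit_scope()`, a lookup of `t` raises `NameError`, which corresponds to step 4c.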
EXAMPLE DICTIONARY STACK
At lines 4 and 11:
<j, 2> <i, 2> <size,1> <a, 1>
<sort, 1>
At line 7:
<t, 6>
<j, 2> <i, 2> <size,1> <a, 1>
<sort, 1>
1 void sort (float a[ ], int size) {
2 int i, j;
3 for (i = 0; i < size; i++)
4 for (j = i + 1; j < size; j++)
5 if (a[j] < a[i]) {
6 float t;
7 t = a[i];
8 a[i] = a[j];
9 a[j] = t;
10 }
11 }
STATIC SCOPING
For static scoping, the referencing environment for
a name is its defining scope and all nested
subscopes.
The referencing environment defines the set of
statements which can validly reference a name.
DYNAMIC SCOPING
In dynamic scoping, a name is bound to its most
recent declaration based on the program’s call
history.
Used by early Lisp, APL, Snobol, and Perl.
Symbol table for each scope built at compile time,
but managed at run time.
Scope pushed/popped on stack when
entered/exited.
1 int h, i;
2 void B(int w) {
3 int j, k;
4 i = 2*w;
5 w = w+1;
6 ...
7 }
8 void A (int x, int y) {
9 float i, j;
10 B(h);
11 i = 3;
12 ...
13 }
14 void main() {
15 int a, b;
16 h = 5; a = 3; b = 2;
17 A(a, b);
18 B(h);
19 ...
20 }
RUNTIME SYMBOL TABLE
Call history
main (17) → A (10) → B
Function Dictionary
B <w, 2> <j, 3> <k, 3>
A <x, 8> <y, 8> <i, 9> <j, 9>
main <a, 15> <b, 15>
<h, 1> <i, 1> <B, 2> <A, 8> <main, 14>
Reference to i (4) resolves to <i, 9> in A.
RUNTIME SYMBOL TABLE (AGAIN)
Same example: call history
main (18) → B
Function Dictionary
B <w, 2> <j, 3> <k, 3>
main <a, 15> <b, 15>
<h, 1> <i, 1> <B, 2> <A, 8> <main, 14>
Reference to i (4) resolves to <i, 1> in global scope.
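The two call histories can be replayed with a small lookup routine. This is only a sketch: the tables `GLOBALS` and `LOCALS` and the function `resolve` are names invented for the illustration, and each binding is the declaring line number taken from the dictionaries shown above.

```python
GLOBALS = {"h": 1, "i": 1, "B": 2, "A": 8, "main": 14}
LOCALS = {
    "main": {"a": 15, "b": 15},
    "A": {"x": 8, "y": 8, "i": 9, "j": 9},
    "B": {"w": 2, "j": 3, "k": 3},
}


def resolve(name, call_history):
    """Search the most recent activation first, then older ones, then globals."""
    for function in reversed(call_history):
        if name in LOCALS[function]:
            return function, LOCALS[function][name]
    if name in GLOBALS:
        return "global", GLOBALS[name]
    raise NameError(name)


via_a = resolve("i", ["main", "A", "B"])   # B called from A
direct = resolve("i", ["main", "B"])       # B called directly from main
```

The same reference to `i` inside `B` resolves to two different declarations, depending only on the call history.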
DISADVANTAGES OF DYNAMIC SCOPING
Compromises the ability to statically type check
references to non-local variables.
A function's local variables are visible to every function
it calls, so programs are less reliable.
Access to a non-local variable follows a chain of
dynamic links, so it is more time consuming.
Therefore most modern languages prefer static
scoping!
VISIBILITY
A name is visible if its referencing environment
includes the reference and the name is not
redeclared in an inner scope.
A name redeclared in an inner scope effectively
hides the outer declaration.
Some languages provide a mechanism for
referencing a hidden name; e.g.: this.x in
Java (this->x in C++).
EXAMPLE JAVA PROGRAM
1 public class Student {
2 private String name;
3 public Student (String name, ...) {
4 this.name = name;
5 ...
6 }
7 }
OVERLOADING
Overloading uses the number or type of
parameters to distinguish among identical
function names or operators.
Examples:
+, -, *, / can be float or int
+ can be float or int addition or string
concatenation in Java
System.out.print(x) in Java
LIFETIME
The lifetime of a variable is the time interval
during which the variable has been allocated a
block of memory.
The earliest languages used static allocation:
memory was assigned at compile time.
A function was allocated space only for its arguments and
return value, so recursive functions were not supported!
Algol introduced the notion that memory should
be allocated/deallocated at scope entry/exit.
Modern languages are based on this idea but may
break the scope-equals-lifetime rule.
TYPES
A type is a collection of values and operations on
those values.
Example: Integer type has values ..., -2, -1, 0, 1, 2, ...
and operations +, -, *, /, <, ...
The Boolean type has values true and false and
operations ∧, ∨, ¬ (and, or, not).
BASICS
Computer types have a finite number of values due to
fixed size allocation; problematic for numeric types.
Exceptions:
Smalltalk uses unbounded fractions.
Haskell type Integer represents unbounded integers.
More problematic are fixed-size floating point numbers:
0.2 is not exact in binary.
So arithmetic is only approximate, e.g. 0.1 + 0.2 is not exactly 0.3.
Floating point is inconsistent with real numbers in mathematics.
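A quick check of these claims, assuming IEEE 754 double precision (which is what CPython's `float` uses on common platforms). The variable names are chosen for the illustration; `fractions.Fraction` converts a float to the exact rational value that is actually stored.

```python
from fractions import Fraction

# The double nearest to 0.2 is a slightly different rational number.
stored = Fraction(0.2)
exact = Fraction(1, 5)
error = stored - exact            # small but non-zero

# Rounding errors surface in ordinary arithmetic.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Python's int, like Haskell's Integer, is unbounded.
big = 2 ** 100
```

Here `error` is tiny but not zero, and `sum_is_exact` is `False`, while `big` is computed without overflow.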
BASICS
In the early languages, Fortran, Algol, Cobol, all of
the types were built in.
If a program needed a type such as color, it could use integers; but what
does it mean to multiply two colors?
Purpose of types in programming languages is to
provide ways of effectively modeling a problem
solution.
TYPE ERRORS
Machine data carries no type information.
Basically, just a sequence of bits.
Example: 0100 0000 0101 1000 0000 0000 0000 0000
• The floating point number 3.375
• The 32-bit integer 1,079,508,992
• Two 16-bit integers 16472 and 0
• Four ASCII characters: @ X NUL NUL
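The four readings of this bit pattern can be reproduced with Python's standard `struct` module. The sketch assumes big-endian byte order (the order in which the bits are written above) and IEEE 754 single precision for the floating point reading; the variable names are illustrative.

```python
import struct

bits = "0100 0000 0101 1000 0000 0000 0000 0000"
raw = int(bits.replace(" ", ""), 2).to_bytes(4, byteorder="big")

as_float = struct.unpack(">f", raw)[0]   # one IEEE 754 single-precision float
as_int32 = struct.unpack(">i", raw)[0]   # one 32-bit signed integer
as_int16s = struct.unpack(">hh", raw)    # two 16-bit signed integers
as_chars = raw.decode("ascii")           # four ASCII characters
```

The same four bytes yield four unrelated values; nothing in the bits says which reading is intended.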
TYPE ERRORS
A type error is any error that arises because an
operation is attempted on a data type for which it is
undefined.
Type errors are common in assembly language
programming.
High level languages reduce the number of type
errors.
A type system provides a basis for detecting type
errors.
STATIC AND DYNAMIC TYPING
A type system imposes constraints such as the
values used in an addition must be numeric.
Such constraints cannot be expressed syntactically in EBNF.
Some languages perform type checking at compile
time (e.g., C).
Other languages (e.g., Perl) perform type checking at
run time.
Still others (e.g., Java) do both.
STATIC AND DYNAMIC TYPING
A language is statically typed if the types of all
variables are fixed when they are declared at
compile time.
A language is dynamically typed if the type of a
variable can vary at run time depending on the
value assigned.
Can you give examples of each?
Static: C/C++, Java, ML, Haskell
Dynamic: Perl, Python, JavaScript, Prolog, Scheme
BASIC TYPES
Terminology in use with current 32-bit
computers:
Nibble: 4 bits
Byte: 8 bits
Half-word: 16 bits
Word: 32 bits
Double word: 64 bits
Quad word: 128 bits
FINITE SIZE IN TYPES
Unlike mathematics:
a + (b + c) ≠ (a + b) + c (why?)
In most languages, the numeric types are finite in
size.
So a + b may overflow the finite range.
Also in C-like languages, the equality and
relational operators produce an int, not a Boolean
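Both effects can be observed directly. The sketch below assumes IEEE 754 doubles for the floating point part and uses `ctypes.c_int32`, which keeps only the low 32 bits of its argument, to imitate a fixed-size integer; the values of `a`, `b` and `c` are arbitrary choices for the demonstration.

```python
import ctypes

# Floating point: a small term is absorbed when it meets a huge one first.
a, b, c = 1e20, -1e20, 1.0
left = (a + b) + c       # 0.0 + 1.0
right = a + (b + c)      # b + c rounds back to -1e20

# Fixed-size integers: one past the largest 32-bit value wraps around.
INT32_MAX = 2**31 - 1
wrapped = ctypes.c_int32(INT32_MAX + 1).value
```

`left` and `right` differ even though the two expressions are mathematically equal, and `wrapped` is negative.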
OVERLOADING
An operator or function is overloaded when its
meaning varies depending on the types of its
operands or arguments or result.
Java: a + b (ignoring size)
integer add
floating point add
string concatenation
Mixed mode: one operand an int, the other floating
point
TYPE CONVERSION
Type conversion is an implicit or explicit change of the type of a value to a different one.
Implicit conversion: type coercion
Explicit conversion: type casting
A type conversion is a narrowing conversion if the result type permits fewer bits, thus potentially losing information.
Otherwise it is termed a widening conversion.
Should languages ban implicit narrowing conversions?
Why?
Lose information
Unexpected results, e.g. in PL/I:
declare (a) char (3);
…
a = '123';
a = a + 1;
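A small illustration of widening versus narrowing, written in Python rather than PL/I. It assumes IEEE 754 formats; the round trip through `struct.pack("f", ...)` stands in for storing a double in a 32-bit float, and the variable names are made up for the example.

```python
import struct

# Widening (implicit coercion): the int operand is converted to float.
mixed = 1 + 2.5

# Narrowing (explicit cast): the fractional part is discarded.
truncated = int(3.99)

# Narrowing to fewer bits: a double squeezed into 32-bit float storage.
value = 0.1
narrowed = struct.unpack("f", struct.pack("f", value))[0]
lost_information = (narrowed != value)
```

The widening step loses nothing, while both narrowing steps change the value.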
NONBASIC TYPES
Enumeration:
enum day {Monday, Tuesday, Wednesday, Thursday,
Friday, Saturday, Sunday};
enum day myDay = Wednesday;
In C/C++ the above values of this type are
0, ..., 6.
More powerful in Java:
for (day d : day.values())
System.out.println(d);
POINTERS
C, C++, Ada, Pascal
Java???
Value is a memory address
Indirect referencing
Operator in C: *
EQUIVALENCE OF ARRAYS AND POINTERS IN C
/* array (index) version */
float sum(float a[ ], int n) {
int i;
float s = 0.0;
for (i = 0; i<n; i++)
s += a[i];
return s;
}
/* equivalent pointer version */
float sum(float *a, int n) {
int i;
float s = 0.0;
for (i = 0; i<n; i++)
s += *a++;
return s;
}
STRCPY EXAMPLE
void strcpy(char *p, char *q) {
while (*p++ = *q++) ;
}
POINTER OPERATIONS
If T is a type and ref T is a pointer:
& : T → ref T
* : ref T → T
For an arbitrary variable x:
*(&x) = x
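The identity `*(&x) = x` can be tried out from Python through the standard `ctypes` module, which exposes C-style pointers. In this sketch `ctypes.pointer(x)` plays the role of `&x` and `.contents` plays the role of `*`; the variable names are illustrative.

```python
import ctypes

x = ctypes.c_int(42)
p = ctypes.pointer(x)                  # & : T -> ref T
through_pointer = p.contents.value     # * : ref T -> T

p.contents.value = 7                   # assigning through the pointer changes x
```

After the last line, `x.value` reads 7, because `p` refers to the storage of `x` rather than to a copy.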
PITFALLS OF POINTERS
Bane of reliable software development
Error-prone
Buffer overflow, memory leaks
Particularly troublesome in C
Other languages such as Java, Haskell, ML and
Prolog completely remove pointers from the
vocabulary of the language
However, internally these languages still make
heavy use of pointers
Still can do dynamic allocation/deallocation of
memory
ARRAYS AND LISTS
Array: Indexed sequences of values of the same type.
Example:
int a[10];
float x[3][5]; /* odd syntax vs. math */
char s[40];
/* indices: 0 ... n-1 */
Default one-dimensional, but possible to have array
of arrays: x[3][5] is an array of 3 five-element arrays
(odd syntax!)
INDEXING
Only operation for many languages
Type signature
[ ] : T[ ] × int → T
Example
float x[3] [5];
type of x: float[ ][ ]
type of x[1]: float[ ]
type of x[1][2]: float
STRUCTURES
Analogous to a tuple in mathematics
Collection of elements of different types
Used first in Cobol, PL/I
Absent from Fortran, Algol 60
Common to Pascal-like, C-like languages
Omitted from Java as redundant
STRUCTURE (CONT’D)
struct employeeType {
int id;
char name[26];
int age;
float salary;
char dept;
};
struct employeeType employee;
...
employee.age = 45;
Number of bytes employeeType requires? (assuming 32-bit int)
Answer: 39 bytes
Memory allocation may have special requirements:
Reverse order of allocation
A 32-bit int or float must be allocated at an address that is a multiple of 4
How many bytes are needed?
Answer: 44 bytes
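Both answers can be reproduced by laying the fields out by hand. The helper `layout` below is a hypothetical function written for this example; it assumes 4-byte `int` and `float` with 4-byte alignment, 1-byte `char`, fields allocated in declaration order, and a total size rounded up to the largest alignment.

```python
def layout(fields, aligned):
    """Return (offsets, total_size) for fields given as (name, size, alignment)."""
    offset = 0
    offsets = {}
    max_align = 1
    for name, size, align in fields:
        if aligned:
            offset = -(-offset // align) * align      # round up to a multiple of align
            max_align = max(max_align, align)
        offsets[name] = offset
        offset += size
    if aligned:
        offset = -(-offset // max_align) * max_align  # padding at the end
    return offsets, offset


EMPLOYEE = [("id", 4, 4), ("name", 26, 1), ("age", 4, 4),
            ("salary", 4, 4), ("dept", 1, 1)]

packed_offsets, packed_size = layout(EMPLOYEE, aligned=False)
aligned_offsets, aligned_size = layout(EMPLOYEE, aligned=True)
```

Without alignment the fields total 39 bytes. With alignment, `age` moves from offset 30 to 32 and three padding bytes follow `dept`, giving 44.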
UNIONS
C: union
Pascal: case-variant record
Logically: multiple views of same storage
Useful in some systems applications
FUNCTIONS AS TYPES
“First-class citizens”:
Can be assigned a value
Can be passed as an argument to a function
Pascal example:
function newton(a, b: real; function f: real): real;
Know that f returns a real value, but the arguments to f are unspecified.
Java interface class:
public interface RootSolvable {
double valueAt(double x);
}
public double Newton(double a, double b, RootSolvable f);
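For comparison, a function-valued parameter in a language where functions are first-class. The slides give only the signature of `newton`, so this sketch substitutes plain bisection on the interval [a, b] for the actual method; the name `find_root`, the `tolerance` parameter and the sample function are assumptions of the example.

```python
from typing import Callable


def find_root(a: float, b: float, f: Callable[[float], float],
              tolerance: float = 1e-12) -> float:
    """Bisection on [a, b]; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tolerance:
        mid = (a + b) / 2
        fmid = f(mid)
        if fa * fmid <= 0:
            b = mid                # a root lies in [a, mid]
        else:
            a, fa = mid, fmid      # a root lies in [mid, b]
    return (a + b) / 2


root = find_root(0.0, 2.0, lambda x: x * x - 2.0)
```

Unlike the Pascal declaration, the type hint `Callable[[float], float]` also states the argument type of `f`.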
SUBTYPES
A subtype is a type that has certain constraints placed on its values or operations.
In Ada subtypes can be directly specified:
subtype one_to_ten is Integer range 1 .. 10;
type Day is (Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday);
subtype Weekend is Day range Saturday .. Sunday;
type Salary is delta 0.01 digits 9
range 0.00 .. 9_999_999.99;
subtype Author_Salary is Salary digits 5
range 0.0 .. 999.99;
SUBTYPES (CONT’D)
Java implements subtypes using class hierarchy:
Integer i = new Integer(3);
...
Number v = i;
...
Integer x = (Integer) v; // type cast required
//Integer is a subclass of Number,
// and therefore a subtype
In general, t: T, s: S, t = s is possible only if:
T = S, or
S is a subtype of T
POLYMORPHISM AND GENERICS
Greek: “poly” means many, “morph” means
form/shape
A function or operation is polymorphic if it can be
applied to any one of several related types and
achieve the same result.
An advantage of polymorphism is that it enables
code reuse.
POLYMORPHISM
Example: overloaded built-in operators and
functions
+ - * / == != ...
Java: + also used for string concatenation
Ada 83
Ada, C++: define + - ... for new types
Java overloaded methods: number or type of
parameters
Example: class PrintStream
print, println defined for:
boolean, char, int, long, float, double, char[ ]
String, Object
SEMANTICS
MOTIVATION
To provide an authoritative definition of the
meaning of all language constructs for:
Programmers
Compiler writers
Standards developers
A programming language is complete only
when its syntax, type system, and semantics
are well-defined.
MOTIVATION (CONT’D)
Semantics is a precise definition of the meaning of a
syntactically and type-wise correct program.
Ideas of “meaning”:
Operational Semantics:
The meaning attached by compiling using compiler C and
executing using machine M. Ex: Fortran on IBM 709.
Axiomatic Semantics:
Program Verification
Denotational Semantics:
Statements as state transforming functions
High level of mathematical precision
EXPRESSION SEMANTICS
Expression Semantics:
Operators + associativity + precedence
Evaluation orders
Pure vs. Impure
AN EXAMPLE
Infix: (a + b) - (c * d)
Infix uses associativity and precedence to disambiguate.
Polish Prefix: - + a b * c d
Polish Postfix: a b + c d * -
Same symbol can’t be used for operations of different
arities
Cambridge Polish: (- (+ a b) (* c d))
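Because every operator here has a fixed arity of two, both Polish forms can be evaluated with a stack and no parentheses. The evaluators below are a sketch; the function names and the sample values bound to a, b, c and d are assumptions chosen for the example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_postfix(tokens, env):
    """Scan left to right; an operator consumes the two values on top."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(env[token])
    return stack.pop()


def eval_prefix(tokens, env):
    """Scan right to left; the first value popped is the left operand."""
    stack = []
    for token in reversed(tokens):
        if token in OPS:
            left = stack.pop()
            right = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(env[token])
    return stack.pop()


env = {"a": 7, "b": 3, "c": 2, "d": 4}
postfix_value = eval_postfix("a b + c d * -".split(), env)
prefix_value = eval_prefix("- + a b * c d".split(), env)
```

Both calls compute (7 + 3) - (2 * 4) = 2.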
SHORT CIRCUIT EVALUATION
Traditionally, a op b is evaluated as
eval (a) and eval (b) and then op
Optimization:
a and b evaluated as:
if a then b else false
a or b evaluated as:
if a then true else b
Also known as lazy evaluation: b is not evaluated when a
already decides the result, so b can be undefined
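Python's `and` is short-circuiting, so the difference from the traditional evaluate-both-operands rule is easy to see. In this sketch the function names are made up; `eager_and` imitates the traditional rule, because arguments to an ordinary function are evaluated before the call.

```python
def safe_ratio_exceeds(numerator, denominator, limit):
    # The division runs only when the first operand is true.
    return denominator != 0 and numerator / denominator > limit


def eager_and(a, b):
    # An ordinary function receives both operands already evaluated.
    return a and b


guarded = safe_ratio_exceeds(10, 0, 1)   # no division is attempted

try:
    eager_and(False, 10 / 0 > 1)         # the division happens before the call
    eager_raised = False
except ZeroDivisionError:
    eager_raised = True
```

The short-circuit version returns `False` for a zero denominator, while the eager version fails with a division by zero.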
SIDE EFFECT
A change to any non-local variable or I/O.
What is the value of:
i = 2; b = 2; c = 5;
a = b * i++ + c * i;
Depends on which one is evaluated first:
b * i++ or c * i
Side Effects make a language impure and should
be avoided if possible!
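The two possible answers can be worked out by making the evaluation order explicit. Python has no `++`, so this sketch models the post-increment with a small hypothetical `Counter` class; the two functions fix one operand order each.

```python
class Counter:
    """Holds i and mimics C's post-increment as an explicit side effect."""

    def __init__(self, value):
        self.value = value

    def post_increment(self):
        old = self.value
        self.value += 1
        return old


def left_operand_first(b, c):
    i = Counter(2)
    first = b * i.post_increment()    # uses 2, then i becomes 3
    second = c * i.value              # sees i == 3
    return first + second


def right_operand_first(b, c):
    i = Counter(2)
    second = c * i.value              # sees i == 2
    first = b * i.post_increment()    # uses 2, then i becomes 3
    return first + second
```

With b = 2 and c = 5 the first order gives 2*2 + 5*3 = 19 and the second gives 2*2 + 5*2 = 14.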
PROGRAM STATE (IMPERATIVE & OO)
The state of a program is the bindings of all
active variables and their current values.
Maps:
environment: the pairing of active variable names with specific
memory locations
memory: the pairing of active memory locations with their
current values
state = memory × environment
PROGRAM STATE (CONT’D)
The current statement (portion of an abstract syntax
tree) to be executed in a program is interpreted
relative to the current state.
The individual steps that occur during a program
run can be viewed as a series of state
transformations.
For the purposes of this lecture, use only a map from
a variable to its value; like a debugger watch
window, tied to a particular statement.
THE FACTORIAL PROGRAM
// compute the factorial of n
void main ( ) {
  int n, i, f;
  n = 3;
  i = 1;
  f = 1;
  while (i < n) {
    i = i + 1;
    f = f * i;
  }
}
Successive program states as the statements execute (one row per step):
n i f
undef undef undef
3 undef undef
3 1 undef
3 1 1
3 1 1
3 2 1
3 2 2
3 2 2
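The sequence of states can also be generated mechanically. This sketch keeps the simplified state used here (a map from variable to value) and records a snapshot after the declaration and after every assignment; it does not record the loop tests, since they leave the state unchanged. The names `trace_factorial`, `step` and `UNDEF` are assumptions of the example.

```python
UNDEF = "undef"


def trace_factorial(n_value):
    state = {"n": UNDEF, "i": UNDEF, "f": UNDEF}
    trace = [dict(state)]                  # state after the declaration

    def step(name, value):
        state[name] = value                # one assignment = one state change
        trace.append(dict(state))

    step("n", n_value)
    step("i", 1)
    step("f", 1)
    while state["i"] < state["n"]:
        step("i", state["i"] + 1)
        step("f", state["f"] * state["i"])
    return trace


trace = trace_factorial(3)
```

For n = 3 the run continues until i = 3, and the final snapshot has f = 6.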
ASSIGNMENT SEMANTICS
Fundamental to imperative and object-oriented
programming
Issues
Multiple assignment
Assignment statement vs. expression
Copy vs. reference semantics
Semantics of assignment:
Evaluate the source expression → a value
Replace the value of the target variable → a new state
COPY VS. REFERENCE SEMANTICS
Copy: a = b;
a, b have same value.
Changes to either have no effect on other.
Used in imperative languages.
Reference: a = b;
a, b point to the same object.
A change in object state affects both
Used by many object-oriented languages.
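Python assignment follows reference semantics for objects, so both behaviours can be shown side by side; copy semantics is imitated here with `copy.copy` from the standard library. The variable names are illustrative.

```python
import copy

original = [1, 2, 3]

alias = original                  # reference semantics: one object, two names
alias.append(4)                   # visible through both names

snapshot = copy.copy(original)    # copy semantics: a separate object
snapshot.append(5)                # does not affect original
```

After these statements `original` holds four elements and `snapshot` holds five.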
COPY VS. REFERENCE SEMANTICS
public void add (Object word, Object number) {
Vector set = (Vector) dict.get(word);
if (set == null) { // not in Concordance
set = new Vector( );
dict.put(word, set);
}
if (allowDupl || !set.contains(number))
set.addElement(number);
}
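// Under copy semantics, set would be a copy of the Vector stored in dict, so number would never be added to the dictionary.
// Under reference semantics, set refers to the same object that dict holds, so the addition is visible through dict.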
CONTROL FLOW SEMANTICS
To be complete, an imperative language needs:
Statement sequencing
Conditional statement
Looping statement
SEQUENCE
s1 s2
Semantics: in the absence of a branch (return,
break, continue, etc.):
First execute s1
Then execute s2
Output state of s1 is the input state of s2
CONDITIONAL
IfStatement → if ( Expression ) Statement [ else Statement ]
Example: if (a > b)
z = a;
else
z = b;
If the test expression is true
then the output state of the conditional is the output
state of the then branch
Else:
the output state of the conditional is the output state of
the else branch
LOOPS
WhileStatement → while ( Expression ) Statement
The expression is evaluated.
If it is true, first the statement is executed, and
then the loop is executed again.
Otherwise the loop terminates.
Foreach loop:
An iterator is any finite set of values over which a
loop can be repeated.
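The rules for sequence, conditional and while loop can be read as functions from an input state to an output state. The sketch below represents a state as a dictionary from variable names to values and each statement as a Python function on states; all the helper names (`assign`, `sequence`, `conditional`, `while_loop`) are assumptions of this illustration, not part of any particular formal semantics.

```python
def assign(target, expr):
    # Evaluate the source expression, then produce a new state.
    return lambda state: {**state, target: expr(state)}


def sequence(*statements):
    def run(state):
        for statement in statements:
            state = statement(state)   # output state of one is input of the next
        return state
    return run


def conditional(test, then_branch, else_branch=lambda state: state):
    return lambda state: then_branch(state) if test(state) else else_branch(state)


def while_loop(test, body):
    def run(state):
        while test(state):             # re-evaluate the test before each pass
            state = body(state)
        return state
    return run


# if (a > b) z = a; else z = b;
maximum = conditional(lambda s: s["a"] > s["b"],
                      assign("z", lambda s: s["a"]),
                      assign("z", lambda s: s["b"]))

# i = 1; f = 1; while (i < n) { i = i + 1; f = f * i; }
factorial = sequence(
    assign("i", lambda s: 1),
    assign("f", lambda s: 1),
    while_loop(lambda s: s["i"] < s["n"],
               sequence(assign("i", lambda s: s["i"] + 1),
                        assign("f", lambda s: s["f"] * s["i"]))),
)
```

For example, `factorial({"n": 3})` returns the state in which i is 3 and f is 6, and `maximum({"a": 3, "b": 8})` adds z = 8 to the state.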