cs1621b
-
7/31/2019 cs1621b
1/175
Course Notes for
CS1621 Structure of Programming Languages
Part B
By John C. Ramirez
Department of Computer Science
University of Pittsburgh
-
These notes are intended for use by students in CS1621 at the University of Pittsburgh and no one else
These notes are provided free of charge and may not be sold in any shape or form
Material from these notes is obtained from various sources, including, but not limited to, the textbooks:
  Concepts of Programming Languages, Seventh Edition, by Robert W. Sebesta (Addison Wesley)
  Programming Languages, Design and Implementation, Fourth Edition, by Terrence W. Pratt and Marvin V. Zelkowitz (Prentice Hall)
  Compilers: Principles, Techniques, and Tools, by Aho, Sethi and Ullman (Addison Wesley)
-
Expressions
Expressions are vital to programs
  Allow programmer to specify the calculations that computer is to perform
It is important that programmer understand how a language evaluates expressions
Things to consider:
  Precedence and associativity
  Order of operand evaluation
  Side-effects of evaluation
  Overloadings and coercions
-
Expressions
Precedence and Associativity
We always learn these rules for any new language
  Vital to using expressions correctly
Most languages have similar precedence for the standard operators: * / then + -
But programmer needs to understand precedence and associativity for all operators, especially those that may be unusual
-
Expressions
Ex: boolean and relational operators
and or not < > = != ==
In Pascal, the boolean operators have higher precedence than the relational operators (opposite of C++)
if x < y then writeln('Less');
if x < y and y < z then writeln('Middle');
  Above is an error in Pascal, since the first sub-expression evaluated would be y and y
if (x < y) and (y < z) then writeln('Middle');
  Now it is ok
In C++
if (x < y && y < z) cout
-
Expressions
Ex: unary ++ and -- in C++
Precedence and associativity are wacky!
#include
using namespace std;
int main()
{
unsigned int i1 = 0, i2, i3, i4, i5, j, k, m1, m2, m3, m4, m5;
j = i1++; k = ++i1;
cout
-
Expressions
Output?
  See plusplus.cpp; try it on different platforms
  http://www.cppreference.com/operator_precedence.html
  See problem in Assignment 3
  Compare to plusplus.java and plusplus.pl
-
Expressions
In some cases, expression is ambiguous and compiler will not let you do it, or will warn you about it
  Ex: A ** B ** C in Ada
    Must have parentheses
  Ex: Mixing bitwise operators in C++
    Warning to use parentheses
Sometimes you could probably figure it out, but you're better off not trying
  Ex: If more than one coercion can occur in C++
    May have defined constructor and conversion fn
-
Expressions
Sometimes you don't think you should care about precedence and associativity, but you should
  In math, addition and multiplication are associative and commutative
  On computer, overflow can cause this to not always be the case:
    floats A = 1e+30, B = 1.0/1e+30, C = 1e+30
    A * B * C ~= 1e+30
    A * C * B = infinity
    see Overflow.cpp
    F1.add(F2); F2.add(F1)
      If F1 and F2 are from different classes, the operations may be different or perhaps not even legal
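The overflow case above can be sketched in C++ as follows. This is an illustration only, not the contents of Overflow.cpp; the function names are made up, and it assumes IEEE single-precision float arithmetic with each intermediate product stored in a float.

```cpp
#include <cassert>
#include <cmath>

// Computes (a * b) * c, storing the intermediate product in a float.
float productAbc(float a, float b, float c) {
    float ab = a * b;   // about 1.0 for the values on the slide
    return ab * c;
}

// Computes (a * c) * b: same factors, different grouping.
float productAcb(float a, float b, float c) {
    float ac = a * c;   // 1e60 for the slide's values: too large for a float
    return ac * b;      // infinity times a finite non-zero value stays infinity
}
```

With A = 1e30f, B = 1e-30f and C = 1e30f, std::isinf reports false for productAbc and true for productAcb, even though the two groupings are equal in ordinary math.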
-
Expressions
Side-effects can also cause evaluation order problems
  Expressions can involve function calls, which can change variable values
    Y = f(X) + X;
    Y = X + f(X);
  Without side-effects, the results are the same, but if f(X) changes the value of X, the results could be different
    Most languages allow reference parameters with functions
    These can cause logic errors if used improperly
    See side.cpp
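A small C++ sketch of the problem (not the contents of side.cpp; bump and the two wrapper functions are hypothetical). Because C++ does not fix the order in which the operands of + are evaluated, the two possible orders are written out as separate statements so that each result is predictable.

```cpp
#include <cassert>

// Hypothetical function with a side-effect: adds 10 to its argument in place
// and returns the new value.
int bump(int& x) {
    x += 10;
    return x;
}

// Models Y = f(X) + X when the call is evaluated before the plain operand.
int callEvaluatedFirst(int x) {
    int left = bump(x);   // x is modified here
    int right = x;        // sees the modified x
    return left + right;
}

// Models Y = f(X) + X when the plain operand is read before the call.
int variableEvaluatedFirst(int x) {
    int right = x;        // sees the original x
    int left = bump(x);
    return left + right;
}
```

Starting from X = 1, the first order gives 22 and the second gives 12, so the same source expression could legally produce either value.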
-
Expressions
How to handle this?
  Leave it up to the programmer, as in Pascal and C++
    Limits compiler optimizations, some of which may include reordering of operations
    Compiler cannot reorder if it could possibly change result
  Do not allow (most) side-effects to occur, as in Ada
    Ada functions cannot change parameters
    Now optimizations can reorder expressions without changing result (at least due to this)
  Best advice is to program in such a way as to either avoid all side-effects, or to only allow them in cases where they will not affect expression evaluation
-
Expressions
Operator Overloading
Used in many newer high-level languages
Can be good and bad
Good:
  Aids in readability and simplifies code if used correctly
    Ex: New class Complex, variables A, B and C
      A + B + C is more clear than (A.add(B)).add(C)
    Ex: String variables can be compared
      if (A < B)
      is clearer than
      if (A.compareTo(B) < 0)
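A minimal C++ sketch of the Complex example. The struct below is an assumption made for illustration, not a class from the course files; it only supports addition.

```cpp
#include <cassert>

// Minimal complex-number type (illustrative only).
struct Complex {
    double re;
    double im;

    // Named method, as in (A.add(B)).add(C).
    Complex add(const Complex& other) const {
        return Complex{re + other.re, im + other.im};
    }
};

// Overloaded + so that A + B + C reads like ordinary arithmetic.
Complex operator+(const Complex& lhs, const Complex& rhs) {
    return lhs.add(rhs);
}
```

Both A + B + C and (A.add(B)).add(C) produce the same value here; the overloaded form just hides the function calls.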
-
Expressions
Bad:
  Can harm readability if used incorrectly
    Ex: + defined to do multiplication
    But methods could be improperly named as well
  Function calls are not obvious, especially if other versions of the function exist
    In C++ we could have a member function + and also a friend function +; which is used?
  Can allow some logic errors to go undetected
    Ex: C++ uses / for float and integer division
    If user expects a value between 0 and 1, it's not going to happen if integer division is used
-
Expressions
Some languages like C++ and Ada allow programmer-defined operator overloading
Others like Java do not
Both positions have support
-
Expressions
Coercion and conversion
In many expressions we use more than one datatype
  Mixed expressions
This seems a reasonable thing to allow
However, often the operators and functions used are defined for only a single type
  In this case, to allow mixed expressions to be used, some types must be converted to other types
The differences in languages are whether these conversions should be IMPLICIT or EXPLICIT
-
Expressions
Explicit conversion
In this case the language allows little or no mixed expressions in the code
To allow mixing of data types, the programmer must convert through an operation or function call
  Ex: Ada does not even allow mixing of floats and integers
Good:
  Everything is clear; no uncertainty or ambiguity
  Programmer can more easily verify correctness of programs
  Easier to avoid logic errors
-
Expressions
Bad:
  Makes language very wordy
  Can be annoying, especially when the types are similar (ex. addition of integers and floats)
Implicit conversion (coercion)
In this case mixed expressions are allowed, and the language coerces types where needed to allow types to match
Usually a language has some rules by which the coercions are performed
Good:
  Less wordy; makes programs shorter and sometimes easier to write
-
Expressions
Bad:
  Programs are harder to verify for correctness
    It is not always clear which coercion is being done, especially when programmer-defined coercions are allowed
  Can lead to logic errors in programs
Ex: In C++ expressions are always coerced if they can be
  Standard rules of promotion for predefined types can be easily remembered
  However, programmer can also define functions that will be used for coercion
    Constructors for classes and conversion functions are both implicitly called if necessary
    Now the rules are less clear and can lead to ambiguity and logic errors
-
Expressions
Consider A = B + C where A, B and C are all of different types
  Any/all of the following could exist:
    + operator with two type B arguments
    + operator with two type C arguments
    Constructor for type B with argument type C
    Constructor for type C with argument type B
    Coercion function from C to B
    Coercion function from B to C
    Constructor for type A with argument type B
    Constructor for type A with argument type C
  How does programmer know which will be used?
    Should NOT assume any particular coercion will occur in this case
    Here explicit coercion should be used to remove ambiguity
  See coercion.cpp and rational.h
-
Expressions
Boolean expressions
Expressions that evaluate to TRUE or FALSE
Formed using relational operators and boolean operators
  Relational operators: operators which compare values
    Operands can be most primitive types and complex types as well in some cases
  Boolean operators: operators used to combine boolean results
    Operands must be boolean values
    Exception is C/C++
-
Expressions
Short-Circuit Evaluation
Important note (that we may not have emphasized earlier):
  Operator precedence and associativity are for OPERATORS, not OPERANDS
    The operators simply indicate how the operands are combined/utilized, NOT the order in which they are accessed/determined
  For example: A + B + C + D
    We know we first add A and B, then add C, then add D
    But the VALUES for A, B, C and D could be obtained in ANY ORDER
      Done to optimize execution (ex. in parallel)
-
Expressions
This is significant in (at least) 2 situations:
1) Operand evaluation produces a side-effect that changes result of subsequent operand evaluation
  As we discussed previously, operand could be a function call with a reference parameter
  Operand could be used/modified more than once, as with ++ example
2) An operand may not even be valid if a previous operand evaluates in a certain way
  Ex: if ((X != 0) && (Y/X < 1)) cout
-
Expressions
Idea of short-circuit evaluation (SSE) is simple:
  Evaluate boolean expressions only until a final answer can be determined
    For example with &&, we know that
      FALSE && ANYTHING == FALSE
    so we would not get the division by zero error
SSE is nice because it makes our code simpler
  If we know compiler uses SSE, we can put into a single expression what otherwise would require two
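As a sketch of that last point in C++ (the function names are invented for illustration), the first version relies on short-circuit && to guard the division, while the second shows the two separate tests that would be needed without it.

```cpp
#include <cassert>

// Returns true when x is non-zero and y / x is less than 1.
// With short-circuit &&, y / x is never evaluated when x == 0.
bool ratioLessThanOne(int x, int y) {
    return (x != 0) && (y / x < 1);
}

// The same test written without relying on short-circuit evaluation:
// two nested conditions are needed to guard the division.
bool ratioLessThanOneNested(int x, int y) {
    if (x != 0) {
        if (y / x < 1) {
            return true;
        }
    }
    return false;
}
```

Calling either function with x equal to 0 simply returns false; no division is attempted.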
-
Expressions
Ex: if ((X != 0) && (Y/X < 1)) cout
-
Expressions
Solution is to offer programmer the choice
  Ada uses arbitrary evaluation of operands normally
    But special operators "and then" and "or else" provide short-circuit evaluation if desired
  C++ and Java use SSE for && and || but arbitrary evaluation for bitwise & and |
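The difference between && and & in C++ can be observed by counting operand evaluations. In this sketch, probe is a hypothetical helper that records each call; the order of the two calls under & is not specified, but both are always made.

```cpp
#include <cassert>

// Hypothetical helper: records that it was called, then returns value.
bool probe(bool value, int& calls) {
    ++calls;
    return value;
}

// Counts operand evaluations for FALSE && TRUE.
int callsWithLogicalAnd() {
    int calls = 0;
    bool result = probe(false, calls) && probe(true, calls);
    (void)result;   // only the call count matters here
    return calls;
}

// Counts operand evaluations for FALSE & TRUE.
int callsWithBitwiseAnd() {
    int calls = 0;
    bool result = probe(false, calls) & probe(true, calls);
    (void)result;
    return calls;
}
```

The logical version stops after one call because the left operand is false; the bitwise version makes both calls.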
-
Expressions
Assignment
Central to Imperative Languages
Gives a value to a variable
Typical syntax:
Semantics:
1) Compute lvalue of variable
2) Compute rvalue of expression
3) Store computed rvalue in lvalue location
-
Expressions
Variations
Some languages allow multiple targets
C++ allows conditional targets
  Wacky ?: operator
C, C++ and Java have many assignment variations for convenience
  Ex: ++, +=, *=
C, C++ and Java return the rvalue as operation result
  Allows assignment to be mixed within other expressions
  As with many features from C, C++, this is both good and bad
-
Expressions
Allows shorter code in cases such as:
  A = B = C
  while ((ch = getchar()) != EOF)
Since it is changing the value of a variable, order of evaluation is critical
  Typically associates right to left, and it is a good idea to parenthesize (as above)
Famous C/C++ bug that we mentioned before:
  if (x = y) is wacky!
    Will ALWAYS be true if y is non-zero
    Will ALWAYS be false if y is zero
    Newer compilers warn you about it
    Not possible in Java since if requires a boolean
Concern also must be given for overloading the assignment operator (legal in C++ and Ada)
  It is possible to cause it to behave differently from what is normally expected
  Care has to be taken so that it works in all cases
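A C++ sketch of the assign-and-test idiom. It reads from a std::istream with get() rather than from standard input with getchar(), purely so the example can be run on a string; countChars is a made-up name.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Counts the characters in a stream using the assign-and-test idiom.
// The inner parentheses matter: without them, != would bind tighter than =,
// and ch would receive the result of the comparison instead of the character.
int countChars(std::istream& in) {
    int ch;
    int count = 0;
    while ((ch = in.get()) != EOF) {
        ++count;
    }
    return count;
}
```

For example, a std::istringstream holding "hello" yields a count of 5.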
-
Expressions
Ex: Overloading = for a linked list variable
  LList A, B;
  // Fill B with various nodes
  A = B;
If we want to use this assignment as with other assignments, we need to return the assigned result as the result of the assignment
  In C++ this is typically a reference return value, so that we can cascade the operator effectively
    A = (B = C);        (A = B) = C;
    On the left, when the assignment B = C is finished, we need the rvalue of the result
    On the right, when the assignment A = B is finished, we need the lvalue of the result
    Reference allows both (even though right seems silly to do)
Also, how about A = A;
  If we destroy old LL before assigning new one, this could destroy the value
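One way these concerns might be handled in C++ is sketched below. The LList class here is a stripped-down stand-in written for illustration, not the course's own class: it returns a reference from operator= and checks for self-assignment before destroying the old nodes.

```cpp
#include <cassert>
#include <cstddef>

// Minimal singly linked list of ints (a sketch only).
class LList {
public:
    LList() : head(nullptr) {}
    LList(const LList& other) : head(nullptr) { copyFrom(other); }
    ~LList() { clear(); }

    // Returns a reference so that A = (B = C) and (A = B) = C both work.
    LList& operator=(const LList& other) {
        if (this != &other) {   // A = A must not destroy the list being copied
            clear();
            copyFrom(other);
        }
        return *this;
    }

    void pushFront(int value) { head = new Node{value, head}; }

    std::size_t size() const {
        std::size_t n = 0;
        for (const Node* p = head; p != nullptr; p = p->next) ++n;
        return n;
    }

private:
    struct Node { int value; Node* next; };
    Node* head;

    // Deletes every node and leaves the list empty.
    void clear() {
        while (head != nullptr) {
            Node* next = head->next;
            delete head;
            head = next;
        }
    }

    // Appends copies of other's nodes in order; assumes this list is empty.
    void copyFrom(const LList& other) {
        Node** tail = &head;
        for (const Node* p = other.head; p != nullptr; p = p->next) {
            *tail = new Node{p->value, nullptr};
            tail = &(*tail)->next;
        }
    }
};
```

Without the this != &other test, A = A would call clear() on the very list it is about to copy from, leaving A empty.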
-
Expressions
One issue that you may not normally consider: How is the rvalue evaluated?
  For statically typed languages, there is usually no ambiguity; expression result type must match the type of the variable
  But for dynamically typed languages, it is no longer clear
    Ex: in Prolog
      A = 5 + 3
    Since A is not necessarily an integer, 5 + 3 could be taken as a string just as reasonably as it could be taken as an arithmetic expression
    See assig.pl
-
Control Statements
Primary types of control in imperative languages
Selection
Choose between 1 or more different actions
Iteration
Repeat an action 0 or more times
-
Control Statements
Selection
One-way selection
if statement exists in virtually every imperative language
Idea here is that we either execute a statement or do not
In modern languages this is achieved using an if without the optional else
Two-way selection
Now we incorporate the else with the if
-
Control Statements
Typical syntax:
if
else
Interesting issues:
1) Form of condition?
2) What kinds of statements are allowed?
3) Is nesting allowed and how is it interpreted?
-
Control Statements
1) Form of condition
  Most languages require a boolean expression (true or false only)
  C/C++ are exceptions: int values are allowed
2) Kinds of statements
  Original FORTRAN and BASIC allowed only a single statement
    This is not conducive to good programming techniques
    Only way to have multiple statements is by using an unconditional branch, i.e. GO TO
-
Control Statements
  ALGOL 60 introduced the compound statement
    Now an arbitrary number of statements can be used
    All newer imperative languages (and updates of older languages) either use compound statements or allow multiple statements within the if
3) Nesting
  It logically follows that a statement within an if clause or else clause could be another if statement
    Remember orthogonality
  What issues occur in this case?
-
Control Statements
Only problem of interest is one we have already discussed
  If the number of if clauses and else clauses are not equal, how are they associated?
There are two main approaches to handling this:
  1) Use a rule (static semantics) to determine how this is handled
    This is the approach taken in Pascal, C, C++ and Java
    System handles the rule consistently, so there is no ambiguity, but, like rules of precedence and associativity, the programmer could forget it or make a mistake that is not caught
      Can lead to logic errors
    We have already seen this example
-
Control Statements
  2) Use syntax to determine how it is handled
    This is the approach taken in Ada, BASIC, Modula-2, ALGOL 68
    Every if statement must be syntactically terminated (ex: end if)
    Now an inner if clause without an else clause must still have an end if, and syntactically the outer else can only be associated with the outer if
    Perl has a slightly different approach: the statement for an if MUST be a compound statement. Result is the same, since the inner if will now be within a compound statement
-
Control Statements
Multiple Selection
Idea is to choose from many possible options
Clearly one way of doing this is through nested if statements
  Often preferable, especially if the means of selection is a series of separate boolean expressions
    // Break tie for A and B in some sport
    if (A beat B twice) then
      A wins tie
    else if (B beat A twice) then
      B wins tie
    else if (A scored more points than B) then
      A wins tie
    else if (B scored more points than A) then
      B wins tie
-
Control Statements
However, in some situations, the options are based on different result values of a single expression:
  Ex: Menu in which user chooses an option from 1 to 5; each option causes a different action
In these instances, nested ifs could be used
  In fact these are all we really need
  But the nesting gets complicated, often making the statements harder to follow and making them more prone to logic errors
So many languages supply a case statement
  Specifically designed for multiple alternative selection based on different results of a single expression
-
Control Statements
There are some interesting issues to consider here
  Many are the same as for two-way selection
  Text discusses them at length
  A few that we will look at
What happens after the code for the matched selection is executed?
  One option is to break out of the structure, continuing with the next statement after it
    This makes each option mutually exclusive
    This approach is taken by Algol W, Pascal, Ada
    Probably the most intuitive idea: the choices are mutually exclusive by default
-
Control Statements
  C, C++ and Java do not automatically break out after the selection has been executed
    This is good and bad (as usual)
    Adds flexibility
      If the execution for one selection is a superset of another, it makes sense to allow the flow to continue within the selection statement
    Causes potential logic problems
      Programmer must manually add breaks
      If one is missed no syntax error occurs
What happens if no match is found?
  Two logical alternatives:
    1. Do nothing
    2. Error
-
Control Statements
  C, C++, Java adopt the do-nothing approach
    Seems logical that if nothing matches nothing should be done
  ANSI Standard Pascal and Ada adopt the error approach
    More reliable, since now an accidental out of range value will be detected as an error rather than just a do nothing
  C, C++, Java, Ada, Turbo Pascal, BASIC also provide a default choice
    Good idea to always use so you can detect an out of range value without causing a runtime or logic error
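A C++ switch that shows both behaviors discussed for C-style case statements: deliberate fall-through where one option is a superset of another, and a default branch to catch out of range values. The menu and its actions are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical menu: option 1 does everything option 2 does plus one extra
// step, so case 1 deliberately falls through into case 2.
std::string runOption(int option) {
    std::string log;
    switch (option) {
        case 1:
            log += "backup;";
            // no break here: falls through to case 2 on purpose
        case 2:
            log += "save;";
            break;
        case 3:
            log += "quit;";
            break;
        default:
            // Catches out of range values instead of silently doing nothing.
            log += "invalid;";
            break;
    }
    return log;
}
```

Option 1 produces "backup;save;", option 2 produces "save;", and any value outside 1 to 3 produces "invalid;". Removing the break after case 2 would still compile, which is exactly the kind of logic error described above.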
-
Control Statements
Iteration
Three primary types of iterative loops: conditional loops, counting loops and arbitrary loops
1) Conditional (logically controlled) loops
  Number of iterations is determined by a boolean condition, and cannot (usually) be precalculated
  ex: while (infile && valid == 1)
    Note that we cannot predict when this condition will become false
-
Control Statements
Many languages have two versions of the conditional loop
  Pretest: condition is tested prior to entering the loop body
    May execute loop body 0 times
  Posttest: condition is tested immediately after executing loop body
    Will always execute loop body at least 1 time
    Ada does not have this version
Two versions are provided for convenience; we can always simulate one loop with the other (plus some conditionals)
  See loops.cpp
  Clearly the difference is where each is more appropriate
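A sketch of simulating a posttest loop with a pretest loop in C++ (this is not loops.cpp; the halving task and the function names are made up). The extra flag plays the role of the "plus some conditionals" mentioned above.

```cpp
#include <cassert>

// Posttest loop: the body runs once before the condition is first checked.
// Halves n until it is no longer positive and counts the body executions.
int halvingsPosttest(int n) {
    int count = 0;
    do {
        n /= 2;
        ++count;
    } while (n > 0);
    return count;
}

// The same loop simulated with a pretest loop plus a flag that forces the
// first pass through the body.
int halvingsSimulated(int n) {
    int count = 0;
    bool firstPass = true;
    while (firstPass || n > 0) {
        firstPass = false;
        n /= 2;
        ++count;
    }
    return count;
}
```

Both versions execute the body once even when n starts at 0, which a plain while (n > 0) loop would not do.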
-
Control Statements
Conditional loops are the most general kind of loops, and are really all that is needed in an imperative programming language
  However, many looping applications deal with arrays and sequences of values
  For convenience and efficiency it is prudent to provide a looping structure geared toward these applications
2) Counting Loops (counter-controlled loops)
  Number of iterations determined by a control variable, an initial value, a terminal value, and an increment
-
Control Statements
We can (usually) precalculate the number of iterations based on the initial value, terminal value and increment
Ex: for (int i = 3; i
-
Control Statements
Machine can use a register for the iteration count and not have to worry about obtaining operands for the comparisons at each iteration of the loop, something that must be done with a conditional loop
To allow precalculation and iteration counts to work, some restrictions must be made on the loop
  Loop control variable cannot be altered by the programmer within the loop body
  Terminal value must be calculated only one time, when loop is first entered
  It will also speed things up if the loop control variable is an integer (or integral type) so no float operations are necessary
This is the approach taken in Pascal and Ada
  See for.p
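The precalculation can be sketched in C++ as below. This is an illustration under simplifying assumptions (an upward-counting loop with a positive increment that runs while the control variable is <= the terminal value); the function names are invented, and real compilers may do this differently.

```cpp
#include <cassert>

// Number of iterations of a counting loop that starts at initial, adds a
// positive step each time, and runs while the control variable <= terminal.
int iterationCount(int initial, int terminal, int step) {
    if (step <= 0 || initial > terminal) {
        return 0;   // this sketch only handles upward-counting loops
    }
    return (terminal - initial) / step + 1;
}

// Runs the loop on the precalculated count alone; the control variable is
// never compared against the terminal value inside the loop.
int sumByCount(int initial, int terminal, int step) {
    int sum = 0;
    int value = initial;
    for (int remaining = iterationCount(initial, terminal, step);
         remaining > 0; --remaining) {
        sum += value;
        value += step;
    }
    return sum;
}
```

For a loop from 3 to 10 in steps of 2, the count is 4 (values 3, 5, 7, 9) and the sum is 24. The count is only valid because the control variable and terminal value cannot change inside the body, which is why the restrictions above are needed.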
-
Control Structures
Pascal and Ada also do not allow an increment other than 1 or -1, and do not carry the value of the control variable past the end of the loop
  In Pascal, the value is officially undefined, but in any Pascal implementation it will typically be one of two things: 1) The terminal value of the loop or 2) The terminal value + 1 or - 1. 1) typically indicates that iteration counts are being used
  In Ada, the loop control variable is implicitly declared in the loop header, and becomes really undefined at the end of the loop; accessing it afterward will cause an undeclared variable error
    This is now generally accepted as a good idea, since it reduces side-effect problems of using loop control variables that were declared and assigned elsewhere. C++ and Java both allow (but do not require) this as well
-
Control Structures
Attitude in Pascal and Ada is that if you want more complex iteration (ex. increment other than 1 or -1, option of changing number of iterations during the loop's execution) you should use a while loop
C, C++ and Java have a different approach
  For loop is not really a for loop in the traditional sense
  It is a very general loop that can be used for any looping application
  It more appropriately is a while loop with the addition of an initialization-statement and a post-body statement
-
Control Statements
for (init-expr; pretest-expr; post-body-expr)
Now really anything goes and the pretest-expr and post-body-expr are evaluated for each iteration of the loop
Can certainly be used for a counting loop, as most of you have used it
Can also be used as an arbitrary loop to do more or less whatever programmer wants it to do
  Added flexibility, with added danger
    The usual for C, C++
see for.cpp
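To illustrate the "while loop in disguise" view, here is a C++ for loop that is not a counting loop, next to the equivalent while loop. This is a sketch with made-up function names, not the contents of for.cpp.

```cpp
#include <cassert>

// Counts decimal digits with a for loop that is not a counting loop: the
// control variable is divided, not incremented, after each pass.
int digitCountFor(unsigned int value) {
    int digits = 1;
    for (unsigned int rest = value / 10; rest != 0; rest /= 10) {
        ++digits;
    }
    return digits;
}

// The same loop as a while, with the init-expr hoisted before the loop and
// the post-body-expr moved to the end of the body.
int digitCountWhile(unsigned int value) {
    int digits = 1;
    unsigned int rest = value / 10;
    while (rest != 0) {
        ++digits;
        rest /= 10;
    }
    return digits;
}
```

Both functions return 5 for 12345 and 1 for 0; the number of iterations is not something a compiler could read off the loop header.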
-
"foreach" loop
Newer languages also have included a "foreach" loop to iterate through data
Key difference between "for" and "foreach"
  "for" iterates through indexes (typically), which can be used to access an array / collection if desired
    Loop control variable is typically an integer
  "foreach" iterates through the values in the collection directly
    No indexing is used, at least not directly
    Loop control variable is the data type we are accessing in the collection
-
"foreach" loop
foreach loop has its advantages and disadvantages
Advantages:
  Since no counter is used, we eliminate the possibility of index out of bounds problems
  We can iterate over a collection without having to know the implementation details of the collection
    Allows for data hiding and improves error prevention
    We will likely discuss this more when we discuss object-oriented programming
-
"foreach" loop
Disadvantage
  When accessing an array, we may want or need the index value
    Ex: What if we want to change the data in the array or reorganize it
    Ex: Sorting would be difficult using "foreach"
See forEach.java and foreach.pl
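The course examples for this topic are in Java and Perl; as a rough C++ counterpart, the range-based for (available since C++11) plays the role of "foreach". The sketch below uses invented function names and contrasts a value-only traversal with a task that needs positions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "foreach" style: visits each value directly, with no index to get wrong.
int sumValues(const std::vector<int>& data) {
    int sum = 0;
    for (int value : data) {
        sum += value;
    }
    return sum;
}

// Index style: reversing in place pairs up positions i and size - 1 - i,
// so the loop needs the indexes, not just the values.
void reverseInPlace(std::vector<int>& data) {
    for (std::size_t i = 0; i < data.size() / 2; ++i) {
        std::size_t j = data.size() - 1 - i;
        int temp = data[i];
        data[i] = data[j];
        data[j] = temp;
    }
}
```

Summing needs only the values, so the "foreach" form is enough; reorganizing the array is where the index-based loop earns its keep.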
-
Control Statements
3) Arbitrary Loops
Now the loop is basically an infinite loop, with the programmer expected to break out of it explicitly at some point
Ada allows this with the
  loop
  end loop;
exit statement will break out of the loop, and can be put into an if statement
Thus we can break out of the loop from more than one place
-
Control Statements
Although C, C++ and Java do not explicitly have this construct, you can certainly build it by making a while or for loop an infinite loop and using the break statement to break out
while (1) // C while (true) // Java
{ {
} }
Again this feature adds flexibility, but makes code less readable and harder to debug
-
Control Statements
Unconditional Branching
Transfer execution from one section of code to another section of code
Commonly known as the goto
Used extensively in early languages which lacked block control structures
  Ex. early FORTRAN and BASIC programs relied heavily on the goto
It was necessary then, but most modern languages contain block control structures
-
Control Statements
Even then computer scientists were aware of how problematic they could be
  Spaghetti code that results is very difficult to read
  Modification of one code segment can significantly impact many parts of the program; programmer must be aware of all places that can go to that code segment
  Debugging is very difficult; it is hard to find and fix logic errors since all possible execution paths are difficult to trace
Now languages have blocks and extensive control structures
  It has been shown that goto adds no functionality (i.e. nothing can be done with it that cannot be done without it)
  However, many languages still have goto
-
Control Statements
Unrestricted goto allows code segments that normally have only one entry and exit point to have many
  Ex: What happens if you jump into the middle of a procedure (what about parameters?) or a while loop (condition is skipped)
Most newer languages that have the goto have restrictions on it
  Ex: Cannot jump into an inactive statement or block in Pascal
  If restricted and used infrequently, can actually be useful in some languages
    Ex: Pascal does not have a break statement. If an exceptional situation would cause an exit from a loop, using a goto may be more readable than adding extra convoluted logic
-
Control Statements
Some (newer) languages do not have goto at all
  Ex: Java
    Allows breaks from loops
    Has exception handlers
-
Subprograms
Subprograms
Semi-independent blocks of code with the following basic characteristics:
  Only one entry point (the beginning of the subprogram), and execute when called:
    Parameter information is passed to subprogram
    Caller execution is temporarily suspended, and subprogram executes
    When subprogram terminates, caller execution resumes at point directly following the subprogram call
-
Subprograms
What types of subprograms can we have?
Most languages have two different types, procedures and functions
  Procedures can be thought of as new named statements that can supplement the predefined statements in the language
    Ex: Statements to search or sort an array
    Once defined, these can be used anywhere they are needed in a program
-
Subprograms
In order to have an effect on the overall program, a procedure needs to act on something other than just the variables local to the procedure. This can be done through:
  Outputting data to the display or to a file
  Altering a (relatively) global variable that will be accessed/used later by a different part of the program
  Altering formal parameters such that the actual parameters in the caller are modified
    This will be discussed in more detail soon
-
Subprograms
Functions can be thought of as code segments that calculate and return a single result
  Modeled after math functions
  Used within expressions, where result value is substituted for the call
  The effect of functions on the overall program is the value returned by them. Thus, from an ideal (and mathematical) point of view, functions should have NO OTHER effect on the overall program
-
Subprograms
Should NOT modify global variables
Should NOT alter actual parameters
Naturally, both of the above are allowed in many languages
  In these cases it is up to the programmer to decide how he/she wants to use functions
  Again the tradeoff for the increased flexibility is more potential for logic errors and more difficulty in debugging
C/C++/Java
  Only have functions, no procedures
  void functions can mimic the behavior of procedures
-
Subprograms
Local variables
How/when are they allocated?
Stack-dynamic:
Default in most modern imperative languages
Required for recursive calls, since memory must be associated with each call, not each subprogram
  Ex: Binary Search
    mid = (left + right)/2;
  Many different values for mid must be able to coexist, one for each call on the run-time stack
  Could not do it if memory was statically allocated
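A recursive binary search in C++ makes the point concrete. This is a generic textbook version written for illustration; the parameter names are assumptions.

```cpp
#include <cassert>
#include <vector>

// Recursive binary search over a sorted vector; returns the index of key in
// data[left..right], or -1 if it is absent. Each call gets its own mid on
// the run-time stack.
int binarySearch(const std::vector<int>& data, int key, int left, int right) {
    if (left > right) {
        return -1;
    }
    int mid = (left + right) / 2;
    if (data[mid] == key) {
        return mid;
    }
    if (key < data[mid]) {
        return binarySearch(data, key, left, mid - 1);
    }
    return binarySearch(data, key, mid + 1, right);
}
```

Searching {1, 3, 5, 7, 9, 11} for 7 makes three nested calls with mid equal to 2, 4 and 3; each value of mid must stay alive until its own call returns.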
-
Subprograms
  Overhead is time for allocation and deallocation each time a subprogram is called
    May not seem like a lot of time is needed, but it can add up if many calls are made in a program
  Access must be indirect since actual memory location of variable will not be known until a subprogram call is made
    Location in run-time stack depends upon calls made prior to current one, which can differ from run to run
    Also adds some time overhead
Static:
  Used in languages that do not support recursion (ex. older FORTRAN)
-
Subprograms
  Also optional in other languages, such as C and C++
    Allow variables to retain values from call to call
      Remember the lifetime is the duration of the program
    Ex: In CS1501 LZW algorithm writing codewords to a file, the bit buffer is static
      The leftover bits are kept in the buffer for the next call
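A much smaller C++ illustration than the LZW bit buffer, using hypothetical function names: a static local counter keeps its value from call to call, while an ordinary stack-dynamic local starts over each time.

```cpp
#include <cassert>

// Returns how many times it has been called. The counter is a static local,
// so it is allocated once and keeps its value between calls.
int callNumber() {
    static int calls = 0;
    ++calls;
    return calls;
}

// Same code with a stack-dynamic local: a fresh counter is created on every
// call, so the result is always 1.
int callNumberStackLocal() {
    int calls = 0;
    ++calls;
    return calls;
}
```

Two consecutive calls to callNumber return consecutive values, whereas callNumberStackLocal returns 1 every time.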
-
Subprograms
Parameters
Parameters are vital to subprograms
Allow information to be:
Passed IN to the subprogram
Passed OUT from the subprogram
Passed IN and OUT to and from the subprogram
When writing subprograms, programmer decides which is required for a given subprogram
-
Subprograms
Then programmer utilizes syntax/rules in language being used to achieve the desired option
  Sometimes the syntax/rules of the language do not fit exactly with the 3 use options given
  In these cases programmer must be careful to use the parameters as he/she intends
Some definitions:
  Formal Parameter:
    Parameter specified in the subprogram header
    Only exists during duration of subprogram exec
    Sometimes called "parameter"
-
Subprograms
  Actual Parameter:
    Parameter specified in call of the subprogram
    May exist outside of the scope of the procedure
    Sometimes called just "argument"
Rules for Formal and Actual parameters differ, as we will discuss
-
Subprograms
Parameter Passing Options
Pass-by-Value
Pass-by-Reference
Pass-by-Result
Pass-by-Value-Result
Pass-by-Name
You should be familiar with Pass-by-Value and Pass-by-Reference
Others may be new to you
We'll discuss each
-
Subprograms
Pass-by-Value
Formal parameter is a copy of the actual parameter
  i.e. get r-value of actual parameter and copy it into the formal parameter
Default in many imperative languages
  Only kind used in C and Java
Used for IN parameter passing
Actual can typically be a variable, constant or expression
-
Subprograms
Benefit is that actual parameters cannot be altered through manipulation of the formals
  Also useful in some recursive calls, since a new copy is made with each call
Problem is that copying a parameter can be quite expensive, both in terms of time and memory
  Ex: Consider an object with an array of 1000 floats
    Object is copied with each call to the function
    If, for example, recursive calls are made, a lot of memory can be consumed very quickly
-
Subprograms
Implementation:
  Using a run-time stack, this is straightforward
    When subprogram is called, copy of actual parameter is placed into a local variable, which is stored on the run-time stack (in the activation record for the subprogram)
    During subprogram execution, formal parameter is used like any other local variable for the subprogram
    Only difference is that it is initialized via the actual parameter
-
Subprograms
Pass-by-Reference
Formal parameter is a reference to (oraddress of) the actual parameter variable
get l-value of actual param and copy it intothe formal param, then access the actualparam indirectly through the formal param
Used in Pascal (var parameters), in C (usingexplicit pointers) and C++ and PHP (&)
Most appropriate for IN and OUT parameterpassing, but can be used for all
Actual param usually restricted to a variable
Benefit is that we can change or not change the actual parameter using the formal; it is up to the programmer
Also good that memory is saved, since only an address is copied
Problem is that we can miss logic errors if we accidentally alter an actual parameter through the formal parameter
Also some applications (ex: some recursion) don't work as well
We may not want a change at one call to affect another call
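As a small sketch of the two styles mentioned above, here is the same swap written with a C++ reference parameter and with explicit pointers in the C manner. The names swap_ref and swap_ptr are made up for this illustration.

```cpp
#include <cassert>

// C++ reference parameters: 'a' and 'b' are aliases for the actual parameters.
void swap_ref(int &a, int &b) {
    int t = a;
    a = b;
    b = t;
}

// C style: the addresses are passed explicitly and dereferenced by hand.
void swap_ptr(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}
```

With int x = 1, y = 2, the call swap_ref(x, y) changes the caller's variables, and swap_ptr(&x, &y) changes them back. In both cases only addresses are copied.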
Constant Reference Parameters
Developers of C++ realized that value parameters are not practical for large data objects (too much time and memory, esp. for recursive algorithms)
Reference parameters have danger of accidental side effects (when used for IN parameters)
Solution is to pass parameters by reference, but not allow them to be altered: constant reference
Now compiler gives error if parameter is changed within subprogram
Copy made if passed by reference to another sub
Good concept, but not perfect
Programmer can get around it by casting to a pointer and altering indirectly
See params.cpp
Ada IN parameters have a similar idea
Cannot be assigned/altered within the function
Cannot be passed by out or in out to another sub
More on Ada params shortly
Implementation:
Using run-time stack, address of actual is stored in activation record
Actual is accessed indirectly in sub through its address
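A short C++ sketch of constant reference parameters, and of one way the protection can be bypassed. The functions sum and sneaky are hypothetical and are not taken from params.cpp; the bypass uses const_cast, which is one possible form of the cast the notes mention.

```cpp
#include <cassert>
#include <vector>

// IN parameter passed by constant reference: no copy is made,
// and the compiler rejects any attempt to modify v.
double sum(const std::vector<double> &v) {
    double total = 0.0;
    for (double d : v) total += d;
    // v[0] = 1.0;   // would be a compile-time error: v is const here
    return total;
}

// Getting around the protection by casting away const.
// This is well defined only because the caller's object is not itself const.
void sneaky(const int &n) {
    const_cast<int &>(n) = 99;
}
```

After int k = 1; sneaky(k); the caller's k holds 99 even though the header promised not to change it, which is the imperfection noted above.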
Pass-by-Result
Reference parameters are not an exact fit for OUT parameters
Ex: A procedure designed to read data from a file into an object
Here we don't care about what used to be in the object; we just want to be sure that at the end the appropriate value is assigned
With reference parameters we COULD access the old value and use it if we wanted to (or by mistake)
Pass-by-Result prevents this
In Pass-by-Result, actual parameter is not actually passed to the subprogram; it only waits to have a value passed back to it
Formal parameter is a local variable
During life of subprogram its value does not affect actual parameter at all
At end of subprogram its value is passed back to the actual parameter
So what is actually needed of the actual parameter is its address (l-value)
When the address is obtained can affect the result for some contrived examples
// Note: This is NOT real code
int A[8];
for (int i = 0; i < 8; i++) A[i] = i;
global int j = 2;
foo(A[j]);
output(A[]);
sub foo(int param)
{
    int temp = 25;
    j = 5;
    param = temp;
}
------------------------------------------------
Output: 0 1 25 3 4 5 6 7 // if address obtained at call
Output: 0 1 2 3 4 25 6 7 // if address obtained at return
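C++ has no pass-by-result, so the following is only a hand simulation of the example above, with the body of foo written inline. The two function names are invented for this sketch; each one shows one choice of when the actual parameter's address is obtained.

```cpp
#include <array>
#include <cassert>

int j = 2;   // global index, as in the notes

// Variant 1: the actual's address is computed when foo is CALLED.
std::array<int, 8> address_at_call() {
    std::array<int, 8> A;
    for (int i = 0; i < 8; i++) A[i] = i;
    j = 2;
    int *actual = &A[j];   // l-value of A[j] fixed now, while j == 2
    int param;             // the formal: a local with no initial value
    int temp = 25;
    j = 5;
    param = temp;
    *actual = param;       // copy-back at return lands in A[2]
    return A;
}

// Variant 2: the address is computed when foo RETURNS.
std::array<int, 8> address_at_return() {
    std::array<int, 8> A;
    for (int i = 0; i < 8; i++) A[i] = i;
    j = 2;
    int param;
    int temp = 25;
    j = 5;
    param = temp;
    A[j] = param;          // j is 5 by now, so the result lands in A[5]
    return A;
}
```

The first variant reproduces the "obtained at call" output line and the second the "obtained at return" line.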
If used, address is typically obtained at call
Ada 83 out parameters for simple types are ALMOST this, but the formal parameter value cannot be accessed within the sub (so it is not really a local variable)
Ada 95 changed out parameters to allow them to be accessed, fitting the Pass-By-Result model more closely
Implementation:
At sub call, actual param address is calculated and stored in run-time stack, as is the formal param (as a local)
Final result of formal is copied back to actual address at end of sub
Pass-by-Value-Result
Now the actual parameter's value is passed to the formal parameter when the subprogram is called, being stored and used as a local variable
At the end of the subprogram the value is passed back to the actual parameter
As the name indicates, this is a combination of Pass-by-Value and Pass-by-Result
Used for IN and OUT parameters
If aliasing is NOT allowed/used, and if no exceptions occur in the subprogram, the effect of value-result and reference is the same
Precondition: Actual parameter has value obtained previous to call
During subprogram: Only formal parameter is accessed, updated as desired
Postcondition: Actual parameter has last value assigned within subprogram
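To see why the "no aliasing" condition matters, value-result can be simulated by hand in C++ and compared with true reference passing. Both function names are hypothetical, and the left-to-right order of the copy-out step is an assumption of this sketch.

```cpp
#include <cassert>

// By reference: when called as bump_ref(x, x), both formals alias x.
void bump_ref(int &a, int &b) {
    a = a + 1;
    b = b + 10;
}

// Value-result simulated by hand: copy in, work on locals, copy out.
void bump_value_result(int &actual_a, int &actual_b) {
    int a = actual_a;   // copy in
    int b = actual_b;
    a = a + 1;          // only the local formals are touched
    b = b + 10;
    actual_a = a;       // copy out, left to right (an assumption)
    actual_b = b;
}
```

With distinct actuals (x = 0, y = 0) both versions leave x = 1 and y = 10. With the same variable passed twice, bump_ref(x, x) leaves x = 11 while bump_value_result(x, x) leaves x = 10, so the two mechanisms are no longer equivalent.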
Idea is that language creators did not want to require the params to be passed in any specific way
They just wanted to require the in-out effect
If the result could differ based on whether params are value-result or reference, then the program is erroneous
Up to programmer to NOT use aliases
Ada 95 clarified, requiring all structured in-out parameters to be reference
See params.adb
Implementation:
Value + Result
Pass-by-Name
Definitely wackiest way of param passing
Used for IN and OUT parameters, and only in Algol
Idea is that actual parameter is textually substituted for the formal in all places that it is accessed in the subprogram
Kind of like a macro substitution
It is only evaluated at the point of use in the subprogram
Evaluated EACH TIME it is used in subprogram
Thus the parameter value or address could change based on where/when in the subprogram it is evaluated
However, the referencing environment used is that of the CALLER, not of the subprogram
So only changes within the subprogram that have a global effect will change its evaluation
This also makes implementation more difficult
For simple variables this is equivalent to pass-by-reference
Variable address evaluates the same way regardless of where in the subprogram it is located
global int i = 0, var = 11, n = 5;
global int A[2] = {4, 8};
foo(var, 2*n, A[i]); // all pass by name
void foo(int x, int y, int z)
{
    x = x + 1; output(var);
    output(y); n = n + 1; output(y);
    output(z); z = z + 1; output(z);
    i = i + 1; z = z + 1; output(z);
}
Implementation:
It is not trivial to allow the "macro" to be evaluated and reevaluated in the environment of the caller
Parameterless subprograms called thunks are used
Thunk evaluates parameter in current state of caller's referencing environment
Returns the resulting address or value
Clearly this is a lot of overhead
Overhead and confusing results are why this is not used in newer languages
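The thunk idea can be imitated in C++ with parameterless lambdas, one per actual parameter. This is a sketch of the pass-by-name example from these notes, not how Algol was actually implemented; output() is replaced by a hypothetical recorder that appends to a vector, and run_foo is an invented driver.

```cpp
#include <cassert>
#include <vector>

int i = 0, var = 11, n = 5;
int A[2] = {4, 8};
std::vector<int> printed;            // stands in for output()

void output(int v) { printed.push_back(v); }

// Each formal is a thunk. Calling it re-evaluates the actual parameter
// in the caller's environment (here, the globals) at the point of use.
template <typename X, typename Y, typename Z>
void foo(X x, Y y, Z z) {
    x() = x() + 1;  output(var);
    output(y());    n = n + 1;       output(y());
    output(z());    z() = z() + 1;   output(z());
    i = i + 1;      z() = z() + 1;   output(z());
}

void run_foo() {
    foo([]() -> int & { return var; },    // thunk for var (yields an address)
        []() { return 2 * n; },           // thunk for 2*n (yields a value)
        []() -> int & { return A[i]; });  // thunk for A[i], depends on i
}
```

Tracing by hand: var becomes 12; y yields 10 and then 12 once n is 6; z refers to A[0] (4, then 5) until i changes, after which it refers to A[1], which becomes 9. So run_foo() records 12 10 12 4 5 9.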
Subprograms as Parameters
We allow variables as parameters so that we can access their values (or addresses) from within a subprogram
Why not allow subprograms so that we can execute them from within a subprogram?
Some languages do allow this (ex. Pascal, C++, PHP)
However, there are some issues to consider
Can the parameter subprogram arguments differ in form from each other?
If so, how to type check and even check the number of arguments when the subprogram is actually called?
Easiest solution is to require the arguments to all have the same form
Header of parameter subprogram must be given within the header of the subprogram it is being passed to
Scope is also an issue: what is the referencing environment of the subprogram that is being passed as a parameter? Three reasonable possibilities exist:
1) The referencing environment in which the parameter subprogram is CALLED: shallow binding
2) The referencing environment in which the parameter subprogram is DEFINED: deep binding
3) The referencing environment in which the parameter subprogram is PASSED as an argument: ad hoc binding
Note that shallow binding fits well with dynamic scoping and deep binding fits well with static scoping
Pascal and C++ both use deep binding
Shallow binding is used by SNOBOL, which also uses dynamic scoping
Ad hoc binding has never been used
See fnparams.cpp
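A minimal C++ sketch of a subprogram passed as a parameter. The names are hypothetical and not taken from fnparams.cpp. Note how the header of the parameter subprogram appears inside the header of the receiving function, which is what lets the compiler type check the call.

```cpp
#include <cassert>

int twice(int v)  { return 2 * v; }
int square(int v) { return v * v; }

// 'int (*f)(int)' is the header of the parameter subprogram: every function
// passed in must take one int and return an int.
int apply_to_each_and_sum(int (*f)(int), const int *data, int count) {
    int total = 0;
    for (int k = 0; k < count; k++) total += f(data[k]);
    return total;
}
```

For the data 1, 2, 3, passing twice gives 12 and passing square gives 14. Both functions refer only to their own parameters, so the binding question above does not arise in this small case.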
Overloading (ad hoc polymorphism)
Using the same subprogram name with different parameter lists
When a subprogram is called, the compiler selects the correct version based on the parameter lists
In Ada, return type for a function is also used, since coercion is not done in Ada and function return values cannot be ignored
Enables programmer to use the same name for similar functions that take different argument types
Use: Make it easier for the programmer to use consistent names for subprograms
Without overloading: Programmer must make up different but similar names for subprograms that do similar things but for different types
Ex: abs(int) fabs(float) labs(long)
Ex: ISort(int * A) FSort(float * A)
With overloading: Programmer uses the same name and the compiler decides which to use
Ex: abs(int) abs(float) abs(long)
Ex: Sort(int * A) Sort(float * A)
But programmer must be careful:
Ada and C++ both allow overloading and default parameters
Leaving out some parameters in the call could make a call ambiguous
i.e. it matches more than one function header
Call can also be ambiguous if implicit casting of arguments is done
Operator Overloading is the same idea, but with symbols rather than identifiers
We discussed these issues previously
See Slide 12 of cs1621b.ppt
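A brief C++ sketch of overloading and of the default-parameter ambiguity described above. The functions kind and area are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Three versions with the same name; the compiler picks one from the argument type.
std::string kind(int)    { return "int"; }
std::string kind(double) { return "double"; }
std::string kind(long)   { return "long"; }

// Overloading combined with a default parameter.
int area(int side)                  { return side * side; }
int area(int width, int height = 1) { return width * height; }
// area(3);   // error if uncommented: ambiguous, both headers match one argument
```

Calls such as kind(1), kind(1.5) and kind(1L) each select a different version, and area(3, 4) is fine, but a one-argument call to area cannot be resolved.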
Generics
Motivation:
Programmers often apply data structures and algorithms to more than one data type
Ex. Sorting, Searching algos
Ex. BST, PQ, Stack, Queue data structures
Even with overloading, the programmer must still write different (identical except for type) versions of the code
Generics simply transfer the job of making the different versions from the programmer to the compiler, which automates the overloading process
Note that DIFFERENT VERSIONS of the code MUST STILL BE generated
So the reason we have generics is to save the programmer some time (and perhaps some confusion)
Ada vs. C++:
In Ada, template instantiations must be explicit
Programmer specifies template arguments using the new statement
Ex: package int_io is new integer_io(integer);
The generic package is integer_io
The instantiated package is int_io
The type argument is integer
As is usual in Ada, if declaration is explicit, there will be no surprises
In C++, template instantiations can be explicit or implicit
Implicit: generated automatically by the compiler when a call is seen with the appropriate arguments
Duplicate instantiations are merged into a single code segment
Coercion cannot be done, since the types won't match the template correctly
Saves programmer some typing
Explicit: programmer declares each version
Coercion can be done using regular C++ promotion and conversion rules
Programmer is aware of each version
See template.cpp and tordlist.h
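The implicit/explicit distinction can be seen with a one-line function template. The name larger is hypothetical and does not come from template.cpp; the explicit case here names the type argument at the call, which is one of several ways C++ lets the programmer choose the version.

```cpp
#include <cassert>

// One template; the compiler generates a separate version for each type used.
template <typename T>
T larger(T a, T b) { return (a < b) ? b : a; }

// larger(3, 7)             implicit: T deduced as int
// larger(2.5, 1.5)         implicit: T deduced as double
// larger(1, 2.5)           error: T cannot be both int and double, no coercion
// larger<double>(1, 2.5)   explicit: T is double, so 1 is converted to 1.0
```

The third call is rejected because deduction must match the template exactly, while the fourth is accepted because the usual conversion rules apply once the type argument is stated.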
However, retrieving objects back from the collection required explicit casting to the actual type if we wanted full access to them
ArrayList A = new ArrayList();
A.add(new String("Wacky"));
String S = (String) A.remove(0);
Also any typing mistakes (mixing types in the collection unintentionally) could only be caught at run-time (via casting exceptions)
Overall not bad, but some people thought type parameters should be allowed
JDK 1.5 added syntax very similar to that for C++ templates
However, it is very different from C++ templates (and Ada generics as well)
It is not really adding any new generic abilities to the language
It is not creating new code for each version of the class or method
It is designed to make collections of objects more type-safe
See more details in the handout
Implementing Subprograms
What is involved when a subprogram is called, during its execution, and when it terminates?
This will differ depending on whether recursion is allowed in a language or not
Most modern languages allow recursion, but original FORTRAN (up to FORTRAN 77) did not allow it
FORTRAN 77 (and before)
All variables within a subprogram were static, and recursive calls were not allowed
Activation records were still used, but they also could be static
Since all data was static, the size was known at compile time
Run-time stack not needed, since at most one call per sub could be performed at a time
What do we need to know when a subprogram is called?
Return Value: If sub is a function
Local Variables: Static
Parameters: Like local variables that are initialized
Return Address: Where to go back to when subprogram ends
So the activation record looks similar to that used in FORTRAN
With additional link location to access global variables
Now multiple instances of an activation record can occur at the same time, so they must be created dynamically (at run-time), unlike in FORTRAN
Let's look at some of the contents of an activation record
Temporaries
Local Variables
Parameters
Dynamic Link to previous call
Static Link to Non-Locals
Return Address
Temps and local variables are allocated within the subprog. call. In Pascal, C and C++, the local variables must be of fixed size. In Ada, they can be variable size (ex. arrays)
Parameters, links to non-locals and the return address are placed into the AR by the caller of the subprogram, so they are lower in the record
See rtstack.cpp
Accessing non-local variables within a subprogram
Local variables are located within the activation record (AR)
Can be accessed by knowing the base address of the AR plus a local_offset for each variable
Ex: Base address of AR = 162
int x, y[5]; // address of x is 162 + (other AR stuff)
float z; // address of z is 162 + (other AR stuff) + 4 + 20
Non-locals are located elsewhere
For languages like C and C++:
Subprograms cannot be nested
Besides locals there are global variables
For languages like Ada and Pascal:
Subprograms can be nested to arbitrary depth
A sub can be declared within a sub, which is within a sub, which is within a sub
Using static scope, variables declared in a textual parent sub are accessible from an inner sub
"Relative global" variables
But the variable locations could be in different places on the run-time stack
How to find them?
Two techniques used to locate AR
1) Static links
A link is kept in an AR to that AR's textual parent (from the declaration)
To access a single nonlocal many links may be crossed
2) Display
A single array is kept to indicate all of the currently accessible nested subs
Any nonlocal can be accessed with two indirect accesses
However, textual parent does NOT have to be previous call on run-time stack
So dynamic link in AR is not enough (but would work for dynamic scoping)
sub foo
{
    sub innerA
    { }
    sub innerB
    { innerA; }
    innerB;
}
main
{ foo; }
Run-time stack (top of stack first):
innerA
innerB
foo
Static links connect an AR to the AR of the sub's textual parent, no matter where previously on the RT stack it is
How is this used to access nonlocal variables?
Can be determined and maintained based on the nesting depths of the subprograms that are called
The difference in the nesting depths between the sub using a nonlocal variable and the sub in which the nonlocal is declared is equal to the number of static links that must be crossed to find the correct AR for the variable
This difference can be stored for each variable when the program is compiled, so that at run-time finding the variable is simple
sub parent {
    var X, Y
    sub child1 {
        var X, Z
        sub grand1 {
            var Z
        }
    }
    sub child2 {
        var Y
        call child1
    }
}
main {
    call parent
}
If variable Y is accessed within grand1
chain offset is 2, since Y is declared two levels outside grand1
so search for Y only has to be done once at compile-time
at run-time we know to follow two static links, whatever the call sequence is
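The lookup of Y from grand1 can be sketched in C++ with a deliberately simplified activation record. The AR struct, the use of a map for locals (a real AR would use fixed offsets), the variable values, and the call sequence main, parent, child2, child1, grand1 are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified activation record: only a static link and named locals.
struct AR {
    AR *static_link;                     // AR of the textual parent
    std::map<std::string, int> locals;   // stand-in for offsets into the AR
};

// Follow 'chain_offset' static links, then read the local in that AR.
int fetch(AR *current, int chain_offset, const std::string &name) {
    AR *ar = current;
    for (int k = 0; k < chain_offset; k++) ar = ar->static_link;
    return ar->locals.at(name);
}

int fetch_Y_from_grand1() {
    // Assumed call sequence: main -> parent -> child2 -> child1 -> grand1
    AR parent{nullptr, {{"X", 1}, {"Y", 2}}};
    AR child2{&parent, {{"Y", 20}}};
    AR child1{&parent, {{"X", 10}, {"Z", 30}}};
    AR grand1{&child1, {{"Z", 300}}};
    (void)child2;                        // on the stack, but not on grand1's static chain
    return fetch(&grand1, 2, "Y");       // chain offset 2 reaches parent's Y
}
```

Even though child2 (which has its own Y) is on the stack between parent and child1, it is never visited: the static chain from grand1 goes to child1 and then to parent, so the value fetched is parent's Y.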
What actually happens when a sub is called?
AR for textual parent of sub must be located on the run-time stack, so that the static link can be linked to it
A clear (but inefficient) way to do this is to follow dynamic links down the RTS until the AR for the parent sub is found
A better way can take advantage of the fact that the calling sub and the called sub must be "relatives" in the declaration tree
Calling sub could be parent of called sub (but not grandparent)
Calling sub could be called sub (direct recursion)
Calling sub could be a sibling of called sub
Calling sub could be a descendent of called sub (indirect recursion)
Calling sub could be a "niece" of called sub
So instead of following dynamic links, at compile-time we can pre-calculate the number of static links (from caller) to follow to find the appropriate textual parent AR
Always equal to: nesting_depth(calling sub) - nesting_depth(called sub) + 1
Calling sub could be parent of called sub
X - (X+1) + 1 = 0 static links (use caller's AR)
Calling sub could be called sub (direct recursion)
X - X + 1 = 1 static link (same textual parent)
Calling sub could be a sibling of called sub
X - X + 1 = 1 static link (same textual parent)
Calling sub could be a descendent of called sub (indirect recursion)
Calling sub could be a "niece" of called sub
Follow diff. in nesting depth + 1 static links
procedure Bigsub is
    procedure A(Flag: Boolean) is
        procedure B is
            ...
            A(false);
        end; -- B
    begin -- A
        if flag
            then B;
            else C;
        end if;
    end; -- A
    procedure C is
        procedure D is
            -- here
        end; -- D
        ...
        D;
    end; -- C
begin -- Bigsub
    A(true);
end; -- Bigsub
Run-time stack at "here" (top of stack first):
D       dynamic link to C
        static link to C
        return addr. to C
C       dynamic link to A
        static link to Bigsub
        return addr. to A
A       param flag ( = false)
        dynamic link to B
        static link to Bigsub
        return addr. to B
B       dynamic link to A
        static link to A
        return addr. to A
A       param flag ( = true)
        dynamic link to Bigsub
        static link to Bigsub
        return addr. to Bigsub
Bigsub  dynamic link to caller
        static link
        return addr.
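The links in the stack above can be reproduced by applying the rule nesting_depth(calling sub) - nesting_depth(called sub) + 1 at each call. The sketch below is a simulation in C++, not real compiler output; the AR struct, the make_call helper, and the depth numbering (Bigsub = 0, A and C = 1, B and D = 2) are assumptions.

```cpp
#include <cassert>
#include <string>

struct AR {
    std::string name;
    int depth;           // nesting depth of the subprogram
    AR *dynamic_link;    // AR of the caller
    AR *static_link;     // AR of the textual parent
};

// Build the AR for a call: start at the caller's AR and follow
// depth(caller) - depth(called) + 1 static links to find the parent's AR.
AR make_call(AR *caller, const std::string &name, int depth) {
    int hops = caller->depth - depth + 1;
    AR *parent = caller;
    for (int k = 0; k < hops; k++) parent = parent->static_link;
    return AR{name, depth, caller, parent};
}
```

Starting from AR bigsub{"Bigsub", 0, nullptr, nullptr}, the sequence of calls A(true), B, A(false), C, D gives static links to Bigsub, A, Bigsub, Bigsub and C respectively, matching the stack shown above.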
Evaluation of static links
Maintaining is not too time-consuming
Chain offsets can be calculated at compile time
Local variables can be accessed directly
Non-locals must follow 1 or more static links
Works well if nesting depths do not get too deep
For deep sub nesting, cost of non-local access can be high
But usually 2 or 3 levels is max used
Display
Uses a single array to store links to ARs at all relevant nesting depths
To access a nonlocal at a given nesting depth, we just follow the display entry for that depth, then the local_offset
Never more than one link to follow
Array is updated as subs are called and as they terminate
Generally faster than static links if many nesting levels are used
We will skip the details here; read the text
Nested declaration blocks
Idea could be similar to nested subs
Blocks could be treated as parameterless subs
Static links could be used to determine textual parent
But it is actually much easier to handle, since block entry and exit is always the same
Parent block goes to child block
When child block terminates, we revert to parent block
Simply push new block declarations onto run-time stack, and pop them when block terminates
But we only have one activation record, so no links are required
"Non-locals" can be accessed just like locals
Data Abstraction
Procedural (process) abstraction:
Action can be performed without requiring detailed knowledge of how it is performed
Data abstraction:
New type can be used without requiring detailed knowledge of how it is implemented
We don't need to know the details of how it is stored in memory
We don't need to know the details of how it is manipulated via operations
More formally, an ADT must satisfy two conditions:
1) The declarations of the type and operations (interface) are contained in a single syntactic unit: ENCAPSULATION
The interface does not depend on how the objects are represented or how the operations are implemented
2) The representation of the objects is hidden from users of the ADT: DATA HIDING
Objects can only be manipulated via the provided interface
Ex: Stack
Data: something that can store and access multiple data values in the manner dictated by the operations
Operations:
Push: add new value to top of stack
Pop: remove top value from stack
Top: view top value (or a copy) without removing
Empty: is stack empty
User of stack only needs to know the parameters and effect of each operation to use a stack correctly
Implementation could be an array, a linked-list, or maybe something different
Does not affect use
Implementer can hide these details from the user through private declarations
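The stack ADT above might look like this as a C++ class. IntStack is a hypothetical name, and the choice of std::vector for the hidden representation is just one of the possibilities mentioned.

```cpp
#include <cassert>
#include <vector>

class IntStack {
public:
    void push(int v)   { items.push_back(v); }     // add new value to top
    void pop()         { items.pop_back(); }       // precondition: not empty
    int  top() const   { return items.back(); }    // precondition: not empty
    bool empty() const { return items.empty(); }
private:
    std::vector<int> items;   // hidden: could be swapped for a linked list
};
```

A user writes s.push(1); s.push(2); and then sees 2 from s.top(), without ever touching items. Replacing the vector with another structure would not change any code that uses the class.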
Newer languages added true data abstraction
Ada via packages
C++, Java, C#, Ada95 via classes / objects
Encapsulation units that contain all details of the new type
Access modifiers that prevent access to internal details of the ADT from outside the encapsulation unit
See text for more details
Object-Oriented Programming (OOP)
Characteristics of OOP
1) Data abstraction: encapsulation + information-hiding
The operations for manipulating data are considered to be part of the data type (encapsulated)
The implementation details of the data type (both the structure of the data and the implementation of the operations) are separate from their specifications and (possibly) hidden from the user
As we discussed with ADTs
2) Inheritance
The characteristics of an ADT (data + operations) can be passed on to a subtype
Subtype can also add new data and operations
Allows programmer to build new (derived) types from old (parent) ones
Common data/operations do not have to be rewritten (or copied)
Operations that are slightly different in derived type can be rewritten (overridden) for that type
New data/operations tailor the derived type to the problem at hand
Parent type is unchanged and may (sometimes) be used together with derived type
3) Polymorphism
Variables of a parent class can also be assigned objects of a subclass (or subclass of a subclass)
Operations used with a variable are based upon the class of the object currently stored (could be a parent type object or a derived type object)
Operations may have been overridden in the derived class
Dynamic binding allows parent and derived objects to be used together in a logical way
Ex: Shape class
We could declare:
Shape * shapelist[100];
shapelist[0] = new Rectangle(0, 0, 10, 20);
shapelist[1] = new Square(50, 100, 30, 30);
shapelist[2] = new Circle(100, 50, 25);
for (int i = 0; i < 3; i++)
    shapelist[i]->Draw();
Polymorphism allows these different objects to be accessed consistently within the same array
Think about how you could do the code above in C or Pascal
It would not be easy!
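A compilable C++ sketch of the Shape idea. It makes several assumptions that are not in the notes: the constructors take no coordinates, Draw returns the shape's name instead of drawing, Square derives from Rectangle, and draw_all is an invented driver.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string Draw() const { return "Shape"; }
};

class Rectangle : public Shape {
public:
    std::string Draw() const override { return "Rectangle"; }
};

class Square : public Rectangle {        // a subclass of a subclass
public:
    std::string Draw() const override { return "Square"; }
};

class Circle : public Shape {
public:
    std::string Draw() const override { return "Circle"; }
};

// One container of parent-class pointers holds objects of all the derived types.
std::vector<std::string> draw_all() {
    std::vector<std::unique_ptr<Shape>> shapelist;
    shapelist.push_back(std::unique_ptr<Shape>(new Rectangle));
    shapelist.push_back(std::unique_ptr<Shape>(new Square));
    shapelist.push_back(std::unique_ptr<Shape>(new Circle));
    std::vector<std::string> drawn;
    for (const auto &s : shapelist) drawn.push_back(s->Draw());  // dynamic binding
    return drawn;
}
```

Each call s->Draw() runs the version belonging to the object actually stored, so the loop body never needs to test which kind of shape it has.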
One option: Make one giant struct or record to contain all of the data, including a union or variant
Base class would use only the core data items
Derived classes would use additional data items as provided in the union or variant
To do the operations, we would need a switch or case to test which type the variable is, so that it can be written out appropriately
Now what if we want to add another new derived class, Pentagon?
With OOP, it is simple to add any new data and override the necessary operations
Without OOP we would have to change the overall structure of the data and operations; old types would change, possibly causing problems
OO Languages
1) Smalltalk was the first and purest OOL
All data (even numeric literals) are objects, and are all descendents of class Object
Objects are all allocated from the heap, and implicitly deallocated (garbage collection)
Variables are references, with implicit dereferencing
Execution of a program (logically) involves objects sending messages to each other, executing methods, and responding back
So the data is driving the execution, not the control statements
Equivalent to previous code:
| letters |
letters := 0.
(Prompter prompt: 'Enter your name' default: '')
    do: [ :c | c isLetter
        ifTrue: [ letters := letters + 1 ].
    ].
letters printNl.
Now we cascade the messages to allow fewer statements (also the do: loop iterates through the characters in a string, so we don't need the loop counter)
((Prompter prompt: 'Enter your name' default: '')
    select: [ :c | c isLetter ]) size printNl.
Now the select: loop generates a string based on the condition in the block
More on Smalltalk (classes and objects)
Data in an object can be an instance variable or a class variable
Instance variables are associated with objects
Separate data for each object
Accessible only through the methods defined for that object; always private to the class
Class variables are associated with classes
Shared data for all objects of the same class
Accessible from all objects, but still private to the class
Methods have a similar grouping, but are public
Instance methods associated with objects
Class methods associated with entire class
More on Smalltalk (inheritance)
Object: base class of all others
Only single inheritance allowed
All inheritance is implementation inheritance
Data and methods of parent class are always accessible to the derived class
i.e. Cannot hide implementation details from derived class
Advantage: Derived class can likely implement its methods more efficiently with access to parent data
Disadvantage: Change in parent class implementation will likely require change in derived class implementation
Ex. Traversable stack
More on Smalltalk (polymorphism)
All messages are dynamically bound to methods
At run-time, when a message is received, the object's class is searched for a method, then, if necessary, its superclass, its super-superclass and so on up to Object
Variables have no types since they are only used to refer to objects, not to determine the messages an object can receive
Clearly some liabilities with this approach
Slows language down due to run-time overhead
Programmer type errors cannot be caught until execution time
Let's look at some examples:
person.cls as an example of a new class
See personTest.st
student.cls as an example of a subclass
studentTest.st as an example showing polymorphic access
twodarry.cls as another subclass example
See twodTest.st
For more information, see the GNU Smalltalk User's Guide: http://www.gnu.org/software/smalltalk/gst-manual/gst.html
2) C++ is an imperative/OO mix
Had to be backward compatible with C
Wanted to add object-oriented features
Result is that programmer can use as few or as many OO features as he/she wants to
C++ Classes and Objects
Can be static, stack-dynamic or heap-dynamic
Member data and member functions can be private, protected or public
Allows programmer to decide
Like Smalltalk, has notion of class variables
Declared as static in C++
Destructor needed if object uses dynamic memory
C++ Inheritance
Do not need a superclass (no Object base class for all other classes)
Multiple inheritance is allowed
Complex and difficult to use
Implementation inheritance or interface inheritance are allowed
With interface inheritance, all data and functions are still inherited, but only public ones are directly accessible to the derived class
Advantage: Modifications to parent class do not affect derived class, as long as they do not change the interface
Disadvantage: Operations may be slower, since they cannot access the data directly
3) Java falls in between Smalltalk and C++
Like Smalltalk:
Object is base class to other classes
Single inheritance only
Objects are (almost) all dynamic, with garbage collection
References used to access
Method names are (by default) dynamically bound
Like C++:
Access can be private, public or protected
Static binding can optionally be used to improve run-time speed
Overall syntax for member data and function access
Variables are typed
Other Java OOP features:
Interfaces allow for a simplified form of multiple inheritance
An interface is in a sense a base class with no data and only abstract (pure virtual) methods
A class that implements an interface simply implements the methods specified therein
Advantages: Objects that implement an interface can be used wherever the interface is specified. This allows for a type of generic behavior
Ex: Comparable interface, Runnable interface
Disadvantage: Can become complicated when interfaces and inheritance are both used
Reflection that allows us to manipulate the classes themselves
See poly.java
OOL Implementation
Data:
Typically a record/struct type of storage is used
Class Instance Record (CIR)
Data members are accessed by name, in the same way as records
Subclass adds extra data to CIR of parent class
Private access enforced by limiting visibility of the data
Subprograms:
Static binding:
Subprograms that will be called are determined by the variable type
Variable types are known at compile time and code can be determined then
Dynamic Binding:
Subprograms that will be called are determined by the object's type, not the variable's type
Objects stored in a variable are determined at run time
Appropriate links must be stored with the object
But they are the same for all objects of that class
Virtual Method Table (VMT) used to store links to all pertinent subprograms
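In C++ the difference between the two bindings shows up as non-virtual versus virtual member functions. The struct and method names below are hypothetical; the comment about a VMT describes the typical implementation, which the language itself does not mandate.

```cpp
#include <cassert>
#include <string>

struct Base {
    // Non-virtual: bound statically, from the type of the variable.
    std::string staticName() const { return "Base"; }
    // Virtual: bound dynamically, typically through the object's VMT.
    virtual std::string dynamicName() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    std::string staticName() const { return "Derived"; }            // hides Base's version
    std::string dynamicName() const override { return "Derived"; }  // overrides it
};
```

With Derived d; Base *p = &d; the call p->staticName() runs the Base version because the variable's type decides, while p->dynamicName() runs the Derived version because the stored object decides.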
Parallelism
Parallelism is incorporated into programs for 2 primary reasons:
1) Program is running in a multiprocessing or distributed environment
Many computers now have multiple CPUs
Many jobs are distributed over multiple computers in a network
A programming language should be able to take advantage of this parallelism
Many algorithms can be improved if designed for parallel execution
This is PHYSICAL PARALLELISM
Parallelism
2) Program is running in a simulated parallel environment, allowing for asynchronous activity
Ex: Two windows are displayed to the user. One shows the current time (incremented by seconds) and one allows the user to draw images on the screen
We don't want the act of the user drawing to stop the clock
We don't want the clock running to prevent the user from drawing
Even with a single processor, we want both of these activities to execute in parallel
This is LOGICAL PARALLELISM
Parallelism
If the tasks have some dependencies, there can be a problem
Most common dependency is shared data
To handle this we must synchronize the tasks
Cooperation Synchronization
One task is dependent upon an output/outcome of another
Ex: Task B must process data produced by Task A
Contractor B cannot put up drywall until contractor A has finished the wiring
Task to count ballots cannot proceed until task that collects ballots provides it with some
We must have a mechanism that allows Task B to pause until the data is available
B could loop and keep checking for data
B could wait for some signal from A
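The "wait for some signal from A" option can be sketched in Java with a CountDownLatch from java.util.concurrent, used here as one possible signalling mechanism. The ballot data and the CooperationDemo name are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;

class CooperationDemo {
    // Task A collects ballots, Task B counts them. B pauses until A signals.
    static int countAfterCollection() {
        CountDownLatch dataReady = new CountDownLatch(1);
        int[][] ballots = new int[1][];   // filled in by Task A
        int[] total = new int[1];         // filled in by Task B

        Thread taskA = new Thread(() -> {
            ballots[0] = new int[] {1, 0, 1, 1};   // made-up ballot data
            dataReady.countDown();                  // signal: data is available
        });
        Thread taskB = new Thread(() -> {
            try {
                dataReady.await();                  // pause until A has signalled
                for (int vote : ballots[0]) {
                    total[0] += vote;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        taskB.start();   // started first on purpose: B still waits for A
        taskA.start();
        try {
            taskA.join();
            taskB.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println("Ballots counted: " + countAfterCollection());
    }
}
```

The alternative of looping and re-checking would also work, but it burns processor time while B has nothing to do.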
Parallelism
Competition Synchronization
Both tasks are competing for the same shared resource
If one or both tasks modify the data, it could cause data inconsistencies
Ex: Task A and Task B are MAC machine (ATM) accesses of the same bank account
Task A checks the balance: $200
Task B checks the balance: $200
Task A withdraws $200
Task A updates balance to $0
Task B withdraws $200
Task B updates balance to -$200
We must have some mechanism that ensures MUTUAL EXCLUSION for CRITICAL DATA
We could have a LOCK on the data, or a similar mechanism allowing only one task to access it at a time
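A minimal sketch of the bank account example with a lock, assuming Java's synchronized methods as the locking mechanism. Account and CompetitionDemo are hypothetical names. Because the balance check and the update form one critical section, only one of the two $200 withdrawals can succeed.

```java
class CompetitionDemo {
    // Two tasks each try to withdraw $200 from a $200 account.
    // Returns {number of successful withdrawals, final balance}.
    static int[] run() {
        Account account = new Account(200);
        boolean[] ok = new boolean[2];

        Thread taskA = new Thread(() -> ok[0] = account.withdraw(200));
        Thread taskB = new Thread(() -> ok[1] = account.withdraw(200));
        taskA.start();
        taskB.start();
        try {
            taskA.join();
            taskB.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        int successes = (ok[0] ? 1 : 0) + (ok[1] ? 1 : 0);
        return new int[] {successes, account.balance()};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " withdrawal(s), balance $" + result[1]);
    }
}

class Account {
    private int balance;

    Account(int openingBalance) {
        balance = openingBalance;
    }

    // synchronized: checking the balance and updating it happen as one
    // critical section, so no other task can run between the two steps.
    synchronized boolean withdraw(int amount) {
        if (balance < amount) {
            return false;
        }
        balance -= amount;
        return true;
    }

    synchronized int balance() {
        return balance;
    }
}
```

Without the synchronized keyword both tasks could read $200 before either writes, which is exactly the interleaving listed in the example.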
Parallelism
Synchronization Mechanisms
Semaphores
Devised by Dijkstra
Basically guards that are placed around code
P must succeed to gain access to code
Decrements a counter when it succeeds
V executes when critical section ends
Based on initial value of counter, we can control how many tasks are allowed to access the critical section at once
If used properly, can guarantee either cooperation or competition synchronization
However, it is easy to NOT use them properly
Can cause problems
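A sketch of the counter idea using java.util.concurrent.Semaphore, where acquire plays the role of P and release plays the role of V. The helper name maxInside and the task counts are assumptions made for illustration. The method reports the largest number of tasks that were ever inside the guarded section together, which should never exceed the initial counter value.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreDemo {
    // Runs `tasks` threads through a section guarded by a semaphore whose
    // counter starts at `permits`; returns the peak number inside at once.
    static int maxInside(int permits, int tasks) {
        Semaphore guard = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            threads[i] = new Thread(() -> {
                guard.acquireUninterruptibly();      // P: decrements the counter or waits
                try {
                    int now = inside.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.yield();                  // give other tasks a chance to run
                    inside.decrementAndGet();
                } finally {
                    guard.release();                 // V: executed when the section ends
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("Peak tasks inside: " + maxInside(2, 8));
    }
}
```

Placing release in a finally block is one way to avoid the "easy to NOT use them properly" trap of forgetting V on some path out of the critical section.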
Parallelism
Monitors
Devised by Brinch Hansen and Hoare
Critical data section is part of a data object that allows only one task entry at a time
Better than semaphores for competition synchronization, because mechanism is built into the monitor
Harder for programmer to mess up
No better for cooperation synchronization
Still must be done manually
Used in Concurrent Pascal, Modula-2 and (somewhat) in Java
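A sketch of a monitor-style object in Java, assuming synchronized methods for the built-in mutual exclusion and wait/notifyAll for the cooperation part that still has to be written by hand. Slot and MonitorDemo are hypothetical names.

```java
class MonitorDemo {
    // A producer sends 1..n through a one-value slot; a consumer adds them up.
    static int sumThroughSlot(int n) {
        Slot slot = new Slot();
        int[] sum = new int[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    slot.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    sum[0] += slot.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(sumThroughSlot(10));
    }
}

// Monitor-like object: only one task can be inside put or take at a time.
class Slot {
    private Integer value;   // null means the slot is empty

    synchronized void put(int v) throws InterruptedException {
        while (value != null) {
            wait();          // cooperation, coded manually: wait until empty
        }
        value = v;
        notifyAll();
    }

    synchronized int take() throws InterruptedException {
        while (value == null) {
            wait();          // cooperation, coded manually: wait until full
        }
        int v = value;
        value = null;
        notifyAll();
        return v;
    }
}
```

The competition side costs one keyword per method, while the waiting conditions still need explicit loops, which matches the "no better for cooperation synchronization" remark.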
Parallelism
Message Passing
Proposed by Hansen and Hoare
More general than either of the two previous techniques
Tasks are synchronized via messages sent to each other
Message is similar in look/execution to a subprogram call, but with restrictions:
Caller (or passer) of the message is blocked at the call until the receiver is ready to receive it
Receiver (or executer) of the message is blocked at the message code until the message is called
Caller and Receiver meet at a rendezvous
Parallelism
Idea is that we know exactly where in the code both tasks will be when a rendezvous occurs
So even though tasks execute asynchronously, we synchronize them with respect to each other at a rendezvous
Ex: Ada
Still much of the work is up to the programmer
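The notes give Ada as the example language. As a rough Java approximation only, a java.util.concurrent.SynchronousQueue has the same blocking behavior at the hand-off: put blocks until another task is at take, and take blocks until another task is at put. It does not model an Ada accept body, and RendezvousDemo and the message text are made up for illustration.

```java
import java.util.concurrent.SynchronousQueue;

class RendezvousDemo {
    // The caller passes one message to the receiver; both block until they meet.
    static String exchange() {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        String[] received = new String[1];

        Thread receiver = new Thread(() -> {
            try {
                received[0] = channel.take();   // blocked until the message is passed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        try {
            channel.put("ballots ready");       // blocked until the receiver is ready
            receiver.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(exchange());
    }
}
```

At the moment put returns, the caller knows the receiver is at its take call, which is the "we know exactly where in the code both tasks will be" property.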
Parallelism
Parallel processing concerns
Data consistency
We have already discussed this
Mutual exclusion is needed to prevent multiple tasks from accessing critical data at the same time
However, efforts to ensure data consistency can cause other problems, such as DEADLOCK and STARVATION
Parallelism
Deadlock
When a (shared) resource has restricted access, it can cause a task to stop execution
Wait in a semaphore queue
Wait in a monitor queue
Wait in an accept queue
If a circular resource dependency exists, we can get deadlock
Ex:
Task A has acquired binary semaphore S1
Task B has acquired binary semaphore S2
Task A is waiting for binary semaphore S2
Task B is waiting for binary semaphore S1
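The S1/S2 example can be reproduced in Java without actually hanging the program. This sketch assumes a timed tryAcquire in place of an unbounded wait, so each task gives up after 50 ms and reports that it could not get its second semaphore. DeadlockDemo, holdThenTry and the latch names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class DeadlockDemo {
    // Returns how many of the two tasks could not get their second semaphore.
    static int blockedTasks() {
        Semaphore s1 = new Semaphore(1);   // binary semaphore S1
        Semaphore s2 = new Semaphore(1);   // binary semaphore S2
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTriedSecond = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // Task A holds S1 and wants S2; Task B holds S2 and wants S1.
        Thread taskA = new Thread(() ->
            gotSecond[0] = holdThenTry(s1, s2, bothHoldFirst, bothTriedSecond));
        Thread taskB = new Thread(() ->
            gotSecond[1] = holdThenTry(s2, s1, bothHoldFirst, bothTriedSecond));
        taskA.start();
        taskB.start();
        try {
            taskA.join();
            taskB.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return (gotSecond[0] ? 0 : 1) + (gotSecond[1] ? 0 : 1);
    }

    private static boolean holdThenTry(Semaphore first, Semaphore second,
                                       CountDownLatch bothHoldFirst,
                                       CountDownLatch bothTriedSecond) {
        first.acquireUninterruptibly();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();          // both tasks now hold one semaphore
            // A plain acquire() here would wait forever: that is the deadlock.
            boolean got = second.tryAcquire(50, TimeUnit.MILLISECONDS);
            if (got) {
                second.release();
            }
            bothTriedSecond.countDown();
            bothTriedSecond.await();        // keep holding until both have tried
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.release();
        }
    }

    public static void main(String[] args) {
        System.out.println("Tasks that could not proceed: " + blockedTasks());
    }
}
```

Acquiring the semaphores in the same order in both tasks would remove the circular dependency altogether.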
Parallelism
Starvation
To combat deadlock, most languages allow a task to release a resource prematurely in some circumstances
Ex: If one of the Tasks in the previous example releases the semaphore, the other can proceed
Under these circumstances there is the possibility that a task may never acquire all of the resources that it needs at the time it needs them: starvation
We must be careful to avoid all of these problems when programming in parallel
Prolog
Rules are predicates that consist of a head and a body
In order for the head to "succeed" in its evaluation, all of the goals in the body must be satisfied
These goals could be facts, or could be other rules
Ex from ex1.pl:
sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).
The :- can be thought of as "if"
Execution of a program is in fact a sequence of questions, or assertions
Database is searched in an effort to satisfy all of the assertions
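To make the database search concrete for readers coming from imperative languages, here is a rough Java imitation of how the sibling rule is satisfied when both arguments are already known. It is only an analogy: the parent facts and names are made up (they are not the contents of ex1.pl), and real Prolog does this search through unification and backtracking rather than explicit loops.

```java
import java.util.Arrays;
import java.util.List;

class SiblingDemo {
    // Made-up facts, each read as parent(P, C).
    static final List<String[]> PARENT = Arrays.asList(
        new String[] {"pat", "ann"},
        new String[] {"pat", "bob"},
        new String[] {"lee", "cid"});

    // Imitates: sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).
    static boolean sibling(String x, String y) {
        if (x.equals(y)) {
            return false;                        // goal 1 fails: X \== Y
        }
        for (String[] fact : PARENT) {           // goal 2: search for parent(P,X)
            if (!fact[1].equals(x)) {
                continue;
            }
            String p = fact[0];                  // P is now bound to a value
            for (String[] other : PARENT) {      // goal 3: search for parent(P,Y)
                if (other[0].equals(p) && other[1].equals(y)) {
                    return true;                 // every goal in the body succeeded
                }
            }
            // Goal 3 failed for this P: go back and try the next parent(P,X) fact.
        }
        return false;                            // no binding of P satisfies the body
    }

    public static void main(String[] args) {
        System.out.println(sibling("ann", "bob"));
        System.out.println(sibling("ann", "cid"));
    }
}
```

The comment after the inner loop marks the spot that corresponds to backtracking: the binding for P is discarded and the search for the previous goal resumes.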
Prolog
If assertions can be satisfied, answer is yes
Otherwise, answer is no
If a given assertion succeeds, execution proceeds to the next one
If a given assertion fails, execution backtracks and attempts to re-satisfy the previous assertion
So what about variable assignments?
These are in fact just side effects that occur in an effort to satisfy the query
In fact variables are not assigned in the traditional (imperative language) sense
Prolog
Variables in Prolog are dynamically typed and have two states:
Uninstantiated: Variable is not associated with a value
Instantiated: Variable is associated with a value
Once a variable is instantiated, it keeps that value, and all occurrences of that variable within the same scope have that value
Cannot be re-assigned in the sense of imperative languages
However, if execution backtracks past the point at which it was instantiated, it can again become uninstantiated
Let's look again at ex1.pl
Prolog
Recursion and database search
Recursion is a fundamental part of programming in Prolog
Execution is simply satisfaction of goals, and there are no loops as in imperative languages
Thus, to build complex "programs" we must utilize recursive programming
Each attempt to satisfy a goal initiates a search of the database
Prolog Lists
As in Lisp, the list is an important data structure in Prolog
A list consists of a head and a tail
Tail could be the empty list