cs1621b

7/31/2019 cs1621b


Course Notes for

CS1621 Structure of Programming Languages

Part B

By John C. Ramirez

Department of Computer Science
University of Pittsburgh


These notes are intended for use by students in CS1621 at the University of Pittsburgh and no one else

These notes are provided free of charge and may not be sold in any shape or form

Material from these notes is obtained from various sources, including, but not limited to, the textbooks:

Concepts of Programming Languages, Seventh Edition, by Robert W. Sebesta (Addison Wesley)

Programming Languages, Design and Implementation, Fourth Edition, by Terrence W. Pratt and Marvin V. Zelkowitz (Prentice Hall)

Compilers: Principles, Techniques, and Tools, by Aho, Sethi and Ullman (Addison Wesley)


Expressions

Expressions are vital to programs

Allow the programmer to specify the calculations that the computer is to perform

It is important that the programmer understand how a language evaluates expressions

Things to consider:

Precedence and associativity

Order of operand evaluation

Side-effects of evaluation

Overloadings and coercions


Expressions

Precedence and Associativity

We always learn these rules for any new language

Vital to using expressions correctly

Most languages have similar precedence for the standard operators: * and / first, then + and -

But the programmer needs to understand precedence and associativity for all operators, especially those that may be unusual


Expressions

Ex: boolean and relational operators

and or not < > = != ==

In Pascal, the boolean operators have higher precedence than the relational operators (opposite of C++)

if x < y then writeln('Less');

if x < y and y < z then writeln('Middle');

Above is an error in Pascal, since the first sub-expression evaluated would be y and y

if (x < y) and (y < z) then writeln('Middle');

Now it is ok

In C++

if (x < y && y < z) cout
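A minimal C++ sketch of the precedence point above. The function names are illustrative assumptions, not taken from the course files. The first function shows that relational operators bind tighter than &&; the second spells out how a chained comparison would be grouped by left-to-right associativity.

```cpp
// Illustrative helpers; the names are assumptions, not from the notes.

// Parsed as (x < y) && (y < z): < binds tighter than && in C++.
bool in_order(int x, int y, int z) {
    return x < y && y < z;
}

// How C++ groups the chained form x < y < z: left to right, so the
// bool result of (x < y) is converted to 0 or 1 and then compared with z.
bool chained(int x, int y, int z) {
    return (x < y) < z;
}
```

With x = 3, y = 2, z = 1 the first function reports false, while the chained form reports true, because (3 < 2) is 0 and 0 < 1.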


Expressions

Ex: unary ++ and -- in C++

Precedence and associativity are wacky!

#include <iostream>

using namespace std;

int main()

{

unsigned int i1 = 0, i2, i3, i4, i5, j, k, m1, m2, m3, m4, m5;

j = i1++; k = ++i1;

cout


Expressions

Output? See plusplus.cpp and try it on different platforms

http://www.cppreference.com/operator_precedence.html

See problem in Assignment 3

Compare to plusplus.java and plusplus.pl
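The sketch below is not plusplus.cpp; it only reproduces the two well-defined statements visible in the fragment above (j = i1++; k = ++i1;), wrapped in a hypothetical function so the values can be inspected. Expressions that modify the same variable more than once without sequencing, such as i1++ + ++i1, are a likely reason for platform differences, since C++ leaves their behavior undefined.

```cpp
// Hypothetical wrapper around the two statements from the slide.
struct Result {
    unsigned int i1, j, k;
};

Result increments() {
    unsigned int i1 = 0, j, k;
    j = i1++;   // postfix: j receives the old value 0, then i1 becomes 1
    k = ++i1;   // prefix: i1 becomes 2 first, then k receives 2
    return {i1, j, k};
}
```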


Expressions

In some cases, an expression is ambiguous and the compiler will not let you write it, or will warn you about it

Ex: A ** B ** C in Ada

Must have parentheses

Ex: Mixing bitwise operators in C++

Warning to use parentheses

Sometimes you could probably figure it out, but you're better off not trying

Ex: If more than one coercion can occur in C++

May have defined both a constructor and a conversion function


Expressions

Sometimes you don't think you should care about precedence and associativity, but you should

In math, addition and multiplication are associative and commutative

On a computer, overflow can cause this to not always be the case:

floats A = 1e+30, B = 1.0/1e+30, C = 1e+30

A * B * C ~= 1e+30

A * C * B = infinity

see Overflow.cpp

F1.add(F2); F2.add(F1)

-- If F1 and F2 are from different classes, the operations may be different or perhaps not even legal
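A small sketch of the overflow example, assuming IEEE 754 single-precision floats (largest finite value about 3.4e38). This is not Overflow.cpp; the function names are made up for illustration, and each intermediate product is stored in a float so the rounding happens at single precision.

```cpp
#include <cmath>

// A * B * C, evaluated left to right as (A * B) * C.
float product_abc() {
    float A = 1e+30f, B = 1.0f / 1e+30f, C = 1e+30f;
    float ab = A * B;   // about 1.0
    return ab * C;      // about 1e+30
}

// A * C * B, evaluated left to right as (A * C) * B.
float product_acb() {
    float A = 1e+30f, B = 1.0f / 1e+30f, C = 1e+30f;
    float ac = A * C;   // 1e+60 does not fit in a float: becomes infinity
    return ac * B;      // infinity times a small positive value stays infinity
}
```

The same three operands give a finite result in one grouping and infinity in the other, so multiplication is not associative once overflow is possible.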


Expressions

Side-effects can also cause evaluation order problems

Expressions can involve function calls, which can change variable values

Y = f(X) + X;

Y = X + f(X);

Without side-effects, the results are the same, but if f(X) changes the value of X, the results could be different

Most languages allow reference parameters with functions

These can cause logic errors if used improperly

See side.cpp
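A sketch of the side-effect problem, not taken from side.cpp. The function f here is a hypothetical example that doubles its reference parameter. Because C++ does not fix the order in which the operands of + are evaluated, each order is forced below with separate statements so the two outcomes can be compared.

```cpp
// Hypothetical function with a side effect: doubles its reference
// parameter and returns the new value.
int f(int& x) {
    x = x * 2;
    return x;
}

// Evaluate f(X) first, then read X.
int call_first(int x) {
    int fx = f(x);      // x has now been doubled
    return fx + x;
}

// Read X first, then evaluate f(X).
int read_first(int x) {
    int old_x = x;      // value before the side effect
    return old_x + f(x);
}
```

Starting from X = 3, the first order yields 12 and the second yields 9.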


Expressions

How to handle this?

Leave it up to the programmer, as in Pascal and C++

Limits compiler optimizations, some of which may include reordering of operations

Compiler cannot reorder if it could possibly change the result

Do not allow (most) side-effects to occur, as in Ada

Ada functions cannot change parameters

Now optimizations can reorder expressions without changing the result (at least due to this)

Best advice is to program in such a way as to either avoid all side-effects, or to only allow them in cases where they will not affect expression evaluation


Expressions

Operator Overloading

Used in many newer high-level languages

Can be good and bad

Good:

Aids in readability and simplifies code if used correctly

Ex: New class Complex with variables A, B and C

A + B + C is clearer than (A.add(B)).add(C)

Ex: String variables can be compared

if (A < B)

is clearer than

if (A.compareTo(B) < 0)
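A minimal sketch of the Complex example. The class below is hypothetical and exists only to contrast the named-method style with an overloaded + operator.

```cpp
// Minimal hypothetical Complex class, for illustration only.
struct Complex {
    double re, im;

    // Named-method style: (A.add(B)).add(C)
    Complex add(const Complex& other) const {
        return Complex{re + other.re, im + other.im};
    }
};

// Overloaded operator: lets callers write A + B + C.
Complex operator+(const Complex& lhs, const Complex& rhs) {
    return lhs.add(rhs);
}
```

Both forms compute the same sum; the operator form simply reads like the mathematics it models.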


Expressions

Bad:

Can harm readability if used incorrectly

Ex: + defined to do multiplication

But methods could be improperly named as well

Function calls are not obvious, especially if other versions of the function exist

In C++ we could have a member function + and also a friend function +; which is used?

Can allow some logic errors to go undetected

Ex: C++ uses / for both float and integer division

If the user expects a value between 0 and 1, it's not going to happen if integer division is used
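The integer-division pitfall can be sketched as follows; the function names are illustrative assumptions.

```cpp
// With two int operands, / performs integer division and the fraction is
// discarded before the result is converted to double.
double ratio_int(int part, int whole) {
    return part / whole;
}

// Converting one operand first selects floating-point division.
double ratio_float(int part, int whole) {
    return static_cast<double>(part) / whole;
}
```

For part = 1 and whole = 2, the first version yields 0 and the second yields 0.5.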


Expressions

Some languages like C++ and Ada allow programmer-defined operator overloading

Others like Java do not

Both positions have support


Expressions

Coercion and conversion

In many expressions we use more than one datatype

Mixed expressions

This seems a reasonable thing to allow

However, often the operators and functions used are defined for only a single type

In this case, to allow mixed expressions to be used, some types must be converted to other types

The differences in languages are whether these conversions should be IMPLICIT or EXPLICIT


Expressions

Explicit conversion

In this case the language allows few or no mixed expressions in the code

To allow mixing of data types, the programmer must convert through an operation or function call

Ex: Ada does not even allow mixing of floats and integers

Good:

Everything is clear: no uncertainty or ambiguity

Programmer can more easily verify correctness of programs

Easier to avoid logic errors


Expressions

Bad:

Makes language very wordy

Can be annoying, especially when the types are similar (ex. addition of integers and floats)

Implicit conversion (coercion)

In this case mixed expressions are allowed, and the language coerces types where needed to allow types to match

Usually a language has some rules by which the coercions are performed

Good:

Less wordy: makes programs shorter and sometimes easier to write


Expressions

Bad:

Programs are harder to verify for correctness

It is not always clear which coercion is being done, especially when programmer-defined coercions are allowed

Can lead to logic errors in programs

Ex: In C++ expressions are always coerced if they can be

Standard rules of promotion for predefined types can be easily remembered

However, programmer can also define functions that will be used for coercion

Constructors for classes and conversion functions are both implicitly called if necessary

Now the rules are less clear and can lead to ambiguity and logic errors


Expressions

Consider A = B + C where A, B and C are all of different types

Any/all of the following could exist:

+ operator with two type B arguments

+ operator with two type C arguments

Constructor for type B with argument type C

Constructor for type C with argument type B

Coercion function from C to B

Coercion function from B to C

Constructor for type A with argument type B

Constructor for type A with argument type C

How does programmer know which will be used?

Should NOT assume any particular coercion will occur in this case

Here explicit coercion should be used to remove ambiguity

See coercion.cpp and rational.h
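A sketch of explicit coercion in C++, using a hypothetical Rational type (this is not the class from coercion.cpp or rational.h). Both the converting constructor and the conversion function are marked explicit, so neither is applied silently and the caller has to say which conversion is intended. If both were implicit, an expression such as r + 1 could be resolved through either conversion, and a C++ compiler would typically reject it as ambiguous.

```cpp
// Hypothetical type; every name here is an assumption for illustration.
struct Rational {
    int num, den;

    // int -> Rational, only when requested explicitly
    explicit Rational(int n, int d = 1) : num(n), den(d) {}

    // Rational -> double, only when requested explicitly
    explicit operator double() const {
        return static_cast<double>(num) / den;
    }
};

Rational operator+(const Rational& a, const Rational& b) {
    return Rational(a.num * b.den + b.num * a.den, a.den * b.den);
}

// The caller states which conversion is wanted.
Rational add_as_rational(const Rational& r, int i) {
    return r + Rational(i);
}

double add_as_double(const Rational& r, int i) {
    return static_cast<double>(r) + i;
}
```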


Expressions

Boolean expressions

Expressions that evaluate to TRUE or FALSE

Formed using relational operators and boolean operators

Relational operators: operators which compare values

Operands can be most primitive types and complex types as well in some cases

Boolean operators: operators used to combine boolean results

Operands must be boolean values

Exception is C/C++


Expressions

Short-Circuit Evaluation

Important note (that we may not have emphasized earlier):

Operator precedence and associativity are for OPERATORS, not OPERANDS

The operators simply indicate how the operands are combined/utilized, NOT the order in which they are accessed/determined

For example: A + B + C + D

We know we first add A and B, then add C, then add D

But the VALUES for A, B, C and D could be obtained in ANY ORDER

Done to optimize execution (ex. in parallel)


Expressions

This is significant in (at least) 2 situations:

1) Operand evaluation produces a side-effect that changes the result of a subsequent operand evaluation

As we discussed previously, an operand could be a function call with a reference parameter

An operand could be used/modified more than once, as with the ++ example

2) An operand may not even be valid if a previous operand evaluates in a certain way

Ex: if ((X != 0) && (Y/X < 1)) cout


Expressions

Idea of short-circuit evaluation (SSE) is simple:

Evaluate boolean expressions only until a final answer can be determined

For example with &&, we know that

FALSE && ANYTHING == FALSE

so we would not get the division by zero error

SSE is nice because it makes our code simpler

If we know the compiler uses SSE, we can put into a single expression what otherwise would require two
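A C++ sketch of the point above, using the X and Y test from the earlier slide. The function names are illustrative assumptions. The first version relies on && short-circuiting; the second shows the two separate tests that would be needed without it.

```cpp
// Relies on short-circuit evaluation: y / x is never evaluated when x == 0.
bool ratio_below_one(int x, int y) {
    return (x != 0) && (y / x < 1);
}

// Equivalent logic written without relying on short-circuit evaluation.
bool ratio_below_one_nested(int x, int y) {
    if (x != 0) {
        if (y / x < 1) {
            return true;
        }
    }
    return false;
}
```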


Expressions

Ex: if ((X != 0) && (Y/X < 1)) cout


Expressions

Solution is to offer programmer the choice

Ada uses arbitrary evaluation of operands normally

But the special operators "and then" and "or else" provide short-circuit evaluation if desired

C++ and Java use SSE for && and || but arbitrary evaluation for bitwise & and |
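The difference between && and & in C++ can be observed by counting how often the right operand is evaluated. The helper touch and both function names are hypothetical, introduced only for this sketch.

```cpp
// Hypothetical helper: records that it was evaluated, then returns v.
bool touch(int& calls, bool v) {
    ++calls;
    return v;
}

// How many times the right operand is evaluated with &&.
int right_operand_calls_logical(bool left) {
    int calls = 0;
    bool result = left && touch(calls, true);   // skipped when left is false
    (void)result;
    return calls;
}

// How many times the right operand is evaluated with bitwise &.
int right_operand_calls_bitwise(bool left) {
    int calls = 0;
    bool result = left & touch(calls, true);    // always evaluated
    (void)result;
    return calls;
}
```

With left set to false, the && version never calls touch, while the & version calls it once.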


Expressions

Assignment

Central to Imperative Languages

Gives a value to a variable

Typical syntax:

Semantics:

1) Compute lvalue of variable

2) Compute rvalue of expression

3) Store computed rvalue in lvalue location


Expressions

Variations

Some languages allow multiple targets

C++ and Java allow conditional targets

Wacky ?: operator

C, C++ and Java have many assignment variations for convenience

Ex: ++, +=, *=

C, C++ and Java return the rvalue as the operation result

Allows assignment to be mixed within other expressions

As with many features from C and C++, this is both good and bad


Expressions

Allows shorter code in cases such as:

A = B = C

while ((ch = getchar()) != EOF)

Since it is changing the value of a variable, order of evaluation is critical

Typically associates right to left, and it is a good idea to parenthesize (as above)

Famous C/C++ bug that we mentioned before:

if (x = y) is wacky!

Will ALWAYS be true if y is non-zero

Will ALWAYS be false if y is zero

Newer compilers warn you about it

Not possible in Java since if requires a boolean

Concern also must be given for overloading the assignment operator (legal in C++ and Ada)

It is possible to cause it to behave differently from what is normally expected

Care has to be taken so that it works in all cases


Expressions

Ex: Overloading = for a linked list variable

LList A, B;

// Fill B with various nodes

A = B;

If we want to use this assignment as with other assignments, we need to return the assigned result as the result of the assignment

In C++ this is typically a reference return value, so that we can cascade the operator effectively

A = (B = C);     (A = B) = C;

On the left, when the assignment B = C is finished, we need the rvalue of the result

On the right, when the assignment A = B is finished, we need the lvalue of the result

Reference allows both (even though the right seems silly to do)

Also, how about A = A;

If we destroy the old linked list before assigning the new one, this could destroy the value
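A sketch of an overloaded assignment operator that handles both concerns above: it returns a reference so assignments can be cascaded, and it checks for self-assignment before destroying the old list. The LList class from the slide is not shown in the notes, so every member below is an assumption.

```cpp
#include <cstddef>

// Minimal hypothetical singly linked list of ints.
class LList {
    struct Node {
        int value;
        Node* next;
    };
    Node* head = nullptr;

    void clear() {
        while (head != nullptr) {
            Node* old = head;
            head = head->next;
            delete old;
        }
    }

    // Appends copies of other's nodes; expects this list to be empty.
    void copy_from(const LList& other) {
        Node** tail = &head;
        for (Node* n = other.head; n != nullptr; n = n->next) {
            *tail = new Node{n->value, nullptr};
            tail = &(*tail)->next;
        }
    }

public:
    LList() = default;
    LList(const LList& other) { copy_from(other); }
    ~LList() { clear(); }

    void push_front(int v) { head = new Node{v, head}; }

    std::size_t size() const {
        std::size_t count = 0;
        for (Node* n = head; n != nullptr; n = n->next) ++count;
        return count;
    }

    // Returns a reference so that A = (B = C) and (A = B) = C both work.
    LList& operator=(const LList& other) {
        if (this != &other) {   // A = A: skip, or clear() would destroy the source
            clear();
            copy_from(other);
        }
        return *this;
    }
};
```

Without the self-assignment check, A = A would free the nodes and then try to copy from the list that was just emptied.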


Expressions

One issue that you may not normally consider: How is the rvalue evaluated?

For statically typed languages, there is usually no ambiguity: the expression result type must match the type of the variable

But for dynamically typed languages, it is no longer clear

Ex: in Prolog

A = 5 + 3

Since A is not necessarily an integer, 5 + 3 could be taken as a string just as reasonably as it could be taken as an arithmetic expression

See assig.pl


Control Statements

Primary types of control in imperative languages

Selection

Choose between 1 or more different actions

Iteration

Repeat an action 0 or more times


Control Statements

Selection

One-way selection

if statement exists in virtually every imperative language

Idea here is that we either execute a statement or do not

In modern languages this is achieved using an if without the optional else

Two-way selection

Now we incorporate the else with the if


Control Statements

Typical syntax:

if

else

Interesting issues:

1) Form of condition?

2) What kinds of statements are allowed?

3) Is nesting allowed and how is it interpreted?


Control Statements

1) Form of condition

Most languages require a boolean expression (true or false only)

C/C++ are exceptions: int values are allowed

2) Kinds of statements

Original FORTRAN and BASIC allowed only a single statement

This is not conducive to good programming techniques

Only way to have multiple statements is by using an unconditional branch, i.e. GO TO


Control Statements

ALGOL 60 introduced the compound statement

Now an arbitrary number of statements can be used

All newer imperative languages (and updates of older languages) either use compound statements or allow multiple statements within the if

3) Nesting

It logically follows that a statement within an if clause or else clause could be another if statement

Remember orthogonality

What issues occur in this case?


Control Statements

Only problem of interest is one we have already discussed

If the number of if clauses and else clauses are not equal, how are they associated?

There are two main approaches to handling this:

1) Use a rule (static semantics) to determine how this is handled

This is the approach taken in Pascal, C, C++ and Java

System handles the rule consistently, so there is no ambiguity, but, like rules of precedence and associativity, the programmer could forget it or make a mistake that is not caught

Can lead to logic errors

We have already seen this example
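A C++ sketch of the rule-based approach, with a deliberately misleading layout. The function name is an assumption. The indentation suggests the else belongs to the outer if, but the language rule attaches it to the nearest unmatched if, which is the inner one. Many compilers emit a warning for this layout.

```cpp
// The else below pairs with "if (inner)", despite the indentation.
int classify(bool outer, bool inner) {
    int result = 0;
    if (outer)
        if (inner)
            result = 1;
    else
        result = 2;      // runs when outer is true and inner is false
    return result;
}
```

When outer is false nothing runs and the result stays 0, which is the logic error the misleading layout hides.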


Control Statements

2) Use syntax to determine how it is handled

This is the approach taken in Ada, BASIC, Modula-2, ALGOL 68

Every if statement must be syntactically terminated (ex: end if)

Now an inner if clause without an else clause must still have an end if, and syntactically the outer else can only be associated with the outer if

Perl has a slightly different approach: the statement for an if MUST be a compound statement. Result is the same, since the inner if will now be within a compound statement


Control Statements

Multiple Selection

Idea is to choose from many possible options

Clearly one way of doing this is through nested if statements

Often preferable, especially if the means of selection is a series of separate boolean expressions

// Break tie for A and B in some sport

if (A beat B twice) then

A wins tie

else if (B beat A twice) then

B wins tie

else if (A scored more points than B) then

A wins tie

else if (B scored more points than A) then

B wins tie


Control Statements

However, in some situations, the options are based on different result values of a single expression:

Ex: Menu in which user chooses an option from 1 to 5; each option causes a different action

In these instances, nested ifs could be used

In fact these are all we really need

But the nesting gets complicated, often making the statements harder to follow and making them more prone to logic errors

So many languages supply a case statement

Specifically designed for multiple alternative selection based on different results of a single expression


Control Statements

There are some interesting issues to consider here

Many are the same as for two-way selection

Text discusses them at length

A few that we will look at

What happens after the code for the matched selection is executed?

One option is to break out of the structure, continuing with the next statement after it

This makes each option mutually exclusive

This approach is taken by Algol W, Pascal, Ada

Probably the most intuitive idea: the choices are mutually exclusive by default


Control Statements

C, C++ and Java do not automatically break out after the selection has been executed

This is good and bad (as usual)

Adds flexibility

If the execution for one selection is a superset of another, it makes sense to allow the flow to continue within the selection statement

Causes potential logic problems

Programmer must manually add breaks

If one is missed, no syntax error occurs

What happens if no match is found?

Two logical alternatives:

1. Do nothing

2. Error


Control Statements

C, C++, Java adopt the do-nothing approach

Seems logical that if nothing matches nothing should be done

ANSI Standard Pascal and Ada adopt the error approach

More reliable, since now an accidental out of range value will be detected as an error rather than just a do nothing

C, C++, Java, Ada, Turbo Pascal, BASIC also provide a default choice

Good idea to always use so you can detect an out of range value without causing a runtime or logic error
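A C++ sketch of the behaviors discussed for multiple selection: fall-through when a break is omitted, and a default choice that catches out-of-range values. The menu handler and its option meanings are hypothetical.

```cpp
#include <string>

// Hypothetical menu handler: options 1 to 3 are valid.
std::string handle_option(int option) {
    std::string log;
    switch (option) {
        case 1:
            log += "setup ";
            // no break here: option 1 is a superset of option 2,
            // so execution continues into the next case
        case 2:
            log += "run ";
            break;
        case 3:
            log += "report ";
            break;
        default:
            log += "invalid ";   // detects an out-of-range value explicitly
            break;
    }
    return log;
}
```

If the break after case 2 were removed by mistake, options 1 and 2 would also perform the report step, and the compiler would not treat it as a syntax error.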


Control Statements

Iteration

Three primary types of iterative loops: conditional loops, counting loops and arbitrary loops

1) Conditional (logically controlled) loops

Number of iterations is determined by a boolean condition, and (usually) cannot be precalculated

ex: while (infile && valid == 1)

Note that we cannot predict when this condition will become false


Control Statements

Many languages have two versions of the conditional loop

Pretest: condition is tested prior to entering the loop body

May execute loop body 0 times

Posttest: condition is tested immediately after executing loop body

Will always execute loop body at least 1 time

Ada does not have this version

Two versions are provided for convenience; we can always simulate one loop with the other (plus some conditionals)

See loops.cpp

Clearly the difference is where each is more appropriate
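A sketch (not loops.cpp) that counts how many times each kind of loop body runs, and shows one way to simulate a posttest loop with a pretest loop plus a flag. The function names are assumptions.

```cpp
// Each function repeats "--n" while n > 0 and returns the number of
// times the loop body ran.

int pretest_iterations(int n) {
    int count = 0;
    while (n > 0) {        // condition tested before the body
        --n;
        ++count;
    }
    return count;
}

int posttest_iterations(int n) {
    int count = 0;
    do {                   // body runs once before the first test
        --n;
        ++count;
    } while (n > 0);
    return count;
}

// Posttest behavior simulated with a pretest loop and a flag.
int simulated_posttest_iterations(int n) {
    int count = 0;
    bool first = true;
    while (first || n > 0) {
        first = false;
        --n;
        ++count;
    }
    return count;
}
```

Starting from n = 0, the pretest loop runs 0 times while the posttest loop and its simulation each run once.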


Control Statements

Conditional loops are the most general kind of loops, and are really all that is needed in an imperative programming language

However, many looping applications deal with arrays and sequences of values

For convenience and efficiency it is prudent to provide a looping structure geared toward these applications

2) Counting Loops (counter-controlled loops)

Number of iterations determined by a control variable, an initial value, a terminal value, and an increment


Control Statements

We can (usually) precalculate the number of iterations based on the initial value, terminal value and increment

Ex: for (int i = 3; i


Control Statements

Machine can use a register for the iteration count and not have to worry about obtaining operands for the comparisons at each iteration of the loop, something that must be done with a conditional loop

To allow precalculation and iteration counts to work, some restrictions must be made on the loop

Loop control variable cannot be altered by the programmer within the loop body

Terminal value must be calculated only one time, when loop is first entered

It will also speed things up if the loop control variable is an integer (or integral type) so no float operations are necessary

This is the approach taken in Pascal and Ada

See for.p
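A sketch of the precalculated iteration count for a counting loop with inclusive bounds, in the Pascal style. The formula and function names are an illustration under the stated assumption of a positive increment; they are not taken from for.p.

```cpp
// Number of iterations of a counting loop that starts at initial, stops
// after terminal (inclusive), and steps by increment. Assumes increment > 0.
int iteration_count(int initial, int terminal, int increment) {
    if (initial > terminal) return 0;
    return (terminal - initial) / increment + 1;
}

// Runs the loop from the precomputed count instead of comparing the
// control variable with the terminal value on every iteration.
int sum_with_count(int initial, int terminal, int increment) {
    int remaining = iteration_count(initial, terminal, increment);  // computed once, on entry
    int sum = 0;
    int i = initial;
    while (remaining > 0) {
        sum += i;
        i += increment;
        --remaining;
    }
    return sum;
}
```

For initial 3, terminal 10 and increment 2, the count is 4 (i takes the values 3, 5, 7, 9). The count is only valid if the body leaves the control variable alone and the terminal value is fixed on entry, which is why those restrictions exist.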


Control Structures

Pascal and Ada also do not allow an increment other than 1 or -1, and do not carry the value of the control variable past the end of the loop

In Pascal, the value is officially undefined, but in any Pascal implementation it will typically be one of two things: 1) The terminal value of the loop or 2) The terminal value + 1 or - 1. 1) typically indicates that iteration counts are being used

In Ada, the loop control variable is implicitly declared in the loop header, and becomes really undefined at the end of the loop: accessing it afterward will cause an undeclared variable error

This is now generally accepted as a good idea, since it reduces side-effect problems of using loop control variables that were declared and assigned elsewhere. C++ and Java both allow (but do not require) this as well


Control Structures

Attitude in Pascal and Ada is that if you want more complex iteration (ex. increment other than 1 or -1, option of changing number of iterations during the loop's execution) you should use a while loop

C, C++ and Java have a different approach

For loop is not really a for loop in the traditional sense

It is a very general loop that can be used for any looping application

It more appropriately is a while loop with the addition of an initialization-statement and a post-body statement


Control Statements

for (init-expr; pretest-expr; post-body-expr)

Now really anything goes and the pretest-expr and post-body-expr are evaluated for each iteration of the loop

Can certainly be used for a counting loop, as most of you have used it

Can also be used as an arbitrary loop to do more or less whatever the programmer wants it to do

Added flexibility, with added danger

The usual for C, C++

see for.cpp


"foreach" loop

Newer languages also have included a "foreach" loop to iterate through data

Key difference between "for" and "foreach"

"for" iterates through indexes (typically), which can be used to access an array / collection if desired

Loop control variable is typically an integer

"foreach" iterates through the values in the collection directly

No indexing is used, at least not directly

Loop control variable is the data type we are accessing in the collection


"foreach" loop

foreach loop has its advantages and disadvantages

Advantages:

Since no counter is used, we eliminate the possibility of index out of bounds problems

We can iterate over a collection without having to know the implementation details of the collection

Allows for data hiding and improves error prevention

We will likely discuss this more when we discuss object-oriented programming


"foreach" loop

Disadvantage

When accessing an array, we may want or need the index value

Ex: What if we want to change the data in the array or reorganize it

Ex: Sorting would be difficult using "foreach"

See forEach.java and foreach.pl
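The course examples for this topic are in Java and Perl. The sketch below uses the C++ range-based for loop as a stand-in for "foreach", which is an assumption made here to keep one example language. The first function needs only the values; the second reorganizes the array and therefore needs positions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// "foreach" style: visits the values directly, no index involved.
int sum_values(const std::vector<int>& data) {
    int sum = 0;
    for (int value : data) {
        sum += value;
    }
    return sum;
}

// Index-based "for": swapping elements requires their positions.
void reverse_in_place(std::vector<int>& data) {
    if (data.empty()) return;
    for (std::size_t i = 0, j = data.size() - 1; i < j; ++i, --j) {
        std::swap(data[i], data[j]);
    }
}
```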


Control Statements

3) Arbitrary Loops

Now the loop is basically an infinite loop, with the programmer expected to break out of it explicitly at some point

Ada allows this with the

loop

end loop;

exit statement will break out of the loop, and can be put into an if statement

Thus we can break out of the loop from more than one place


Control Statements

Although C, C++ and Java do not explicitly have this construct, you can certainly build it by making a while or for loop an infinite loop and using the break statement to break out

while (1) // C          while (true) // Java

{                       {

}                       }

Again this feature adds flexibility, but makes code less readable and harder to debug


Control Statements

Unconditional Branching

Transfer execution from one section of code to another section of code

Commonly known as the goto

Used extensively in early languages which lacked block control structures

Ex. early FORTRAN and BASIC programs relied heavily on the goto

It was necessary then, but most modern languages contain block control structures


Control Statements

Even then computer scientists were aware of how problematic gotos could be

Spaghetti code that results is very difficult to read

Modification of one code segment can significantly impact many parts of the program: programmer must be aware of all places that can go to that code segment

Debugging is very difficult: it is hard to find and fix logic errors since all possible execution paths are difficult to trace

Now languages have blocks and extensive control structures

It has been shown that goto adds no functionality (i.e. nothing can be done with it that cannot be done without it)

However, many languages still have goto


Control Statements

Unrestricted goto allows code segments that normally have only one entry and exit point to have many

Ex: What happens if you jump into the middle of a procedure (what about parameters?) or a while loop (condition is skipped)

Most newer languages that have the goto have restrictions on it

Ex: Cannot jump into an inactive statement or block in Pascal

If restricted and used infrequently, can actually be useful in some languages

Ex: Pascal does not have a break statement. If an exceptional situation would cause an exit from a loop, using a goto may be more readable than adding extra convoluted logic


Control Statements

Some (newer) languages do not have goto at all

Ex: Java

Allows breaks from loops

Has exception handlers


Subprograms

Subprograms

Semi-independent blocks of code with the following basic characteristics:

Only one entry point: the beginning of the subprogram

Execute when called:

Parameter information is passed to subprogram

Caller execution is temporarily suspended, and subprogram executes

When subprogram terminates, caller execution resumes at point directly following the subprogram call


Subprograms

What types of subprograms can we have?

Most languages have two different types, procedures and functions

Procedures can be thought of as new named statements that can supplement the predefined statements in the language

Ex: Statements to search or sort an array

Once defined, these can be used anywhere they are needed in a program


Subprograms

In order to have an effect on the overall program, a procedure needs to act on something other than just the variables local to the procedure. This can be done through:

Outputting data to the display or to a file

Altering a (relatively) global variable that will be accessed/used later by a different part of the program

Altering formal parameters such that the actual parameters in the caller are modified

This will be discussed in more detail soon


Subprograms

Functions can be thought of as code segments that calculate and return a single result

Modeled after math functions

Used within expressions, where result value is substituted for the call

The effect of functions on the overall program is the value returned by them. Thus, from an ideal (and mathematical) point of view, functions should have NO OTHER effect on the overall program


Subprograms

Should NOT modify global variables

Should NOT alter actual parameters

Naturally, both of the above are allowed in many languages

In these cases it is up to the programmer to decide how he/she wants to use functions

Again the tradeoff for the increased flexibility is more potential for logic errors and more difficulty in debugging

C/C++/Java

Only have functions, no procedures

void functions can mimic the behavior of procedures


Subprograms

Local variables

How/when are they allocated?

Stack-dynamic:

Default in most modern imperative languages

Required for recursive calls, since memory must be associated with each call, not each subprogram

Ex: Binary Search

mid = (left + right)/2;

Many different values for mid must be able to coexist, one for each call on the run-time stack

Could not do it if memory was statically allocated
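A sketch of the recursive binary search mentioned above. The function name and signature are assumptions. Each active call has its own left, right and mid in its own activation record on the run-time stack.

```cpp
#include <vector>

// Recursive binary search over a sorted vector.
// Returns the index of key, or -1 if key is not present.
int find_index(const std::vector<int>& data, int key, int left, int right) {
    if (left > right) return -1;
    int mid = (left + right) / 2;      // one mid per active call
    if (data[mid] == key) return mid;
    if (key < data[mid]) return find_index(data, key, left, mid - 1);
    return find_index(data, key, mid + 1, right);
}
```

A typical call passes 0 and data.size() - 1 as the initial bounds.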


Subprograms

Overhead is time for allocation and deallocation each time a subprogram is called

May not seem like a lot of time is needed, but it can add up if many calls are made in a program

Access must be indirect since actual memory location of variable will not be known until a subprogram call is made

Location in run-time stack depends upon calls made prior to current one, which can differ from run to run

Also adds some time overhead

Static:

Used in languages that do not support recursion (ex. older FORTRAN)


Subprograms

Also optional in other languages, such as C and C++

Allow variables to retain values from call to call

Remember the lifetime is the duration of the program

Ex: In the CS1501 LZW algorithm, when writing codewords to a file, the bit buffer is static

The leftover bits are kept in the buffer for the next call
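A simpler stand-in for the bit-buffer example: a hypothetical ticket counter whose static local keeps its value between calls, contrasted with a stack-dynamic local that is re-created on every call.

```cpp
// The static local is allocated once and lives for the whole program,
// so its value carries over from one call to the next.
int next_ticket() {
    static int issued = 0;   // initialized only once
    ++issued;
    return issued;
}

// Contrast: a stack-dynamic local starts over on every call.
int next_ticket_nonstatic() {
    int issued = 0;
    ++issued;
    return issued;
}
```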


Subprograms

Parameters

Parameters are vital to subprograms

Allow information to be:

Passed IN to the subprogram

Passed OUT from the subprogram

Passed IN and OUT to and from the subprogram

When writing subprograms, programmer decides which is required for a given subprogram


Subprograms

Then programmer utilizes syntax/rules in language being used to achieve the desired option

Sometimes the syntax/rules of the language do not fit exactly with the 3 use options given

In these cases programmer must be careful to use the parameters as he/she intends

Some definitions:

Formal Parameter:

Parameter specified in the subprogram header

Only exists during duration of subprogram execution

Sometimes called "parameter"


Subprograms

Actual Parameter:

Parameter specified in call of the subprogram

May exist outside of the scope of the procedure

Sometimes called just "argument"

Rules for Formal and Actual parameters differ, as we will discuss


Subprograms

Parameter Passing Options

Pass-by-Value

Pass-by-Reference

Pass-by-Result

Pass-by-Value-Result

Pass-by-Name

You should be familiar with Pass-by-Value and Pass-by-Reference

Others may be new to you

We'll discuss each


Subprograms

Pass-by-Value

Formal parameter is a copy of the actual parameter

i.e. get r-value of actual parameter and copy it into the formal parameter

Default in many imperative languages

Only kind used in C and Java

Used for IN parameter passing

Actual can typically be a variable, constant or expression


Subprograms

Benefit is that actual parameters cannot be altered through manipulation of the formals

Also useful in some recursive calls, since a new copy is made with each call

Problem is that copying a parameter can be quite expensive, both in terms of time and memory

Ex: Consider an object with an array of 1000 floats

Object is copied with each call to the function

If, for example, recursive calls are made, a lot of memory can be consumed very quickly


Implementation:

Using a run-time stack, this is straightforward

When the subprogram is called, a copy of the actual parameter is placed into a local variable, which is stored on the run-time stack (in the activation record for the subprogram)

During subprogram execution, the formal parameter is used like any other local variable for the subprogram

Only difference is that it is initialized via the actual parameter
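A minimal C++ sketch of the copy semantics described for Pass-by-Value. The type Big and the function bump_first are hypothetical names chosen for illustration; they are not taken from the course files.

```cpp
#include <array>

// Hypothetical type: stands in for the "object with an array of 1000 floats".
struct Big {
    std::array<float, 1000> data{};
};

// Pass-by-value: b is a fresh copy of the caller's object, held in this
// call's activation record and used like any other local variable.
float bump_first(Big b) {
    b.data[0] += 1.0f;   // changes only the local copy
    return b.data[0];
}
```

A call such as bump_first(x) leaves x unchanged, but every call copies all 1000 floats, which is the time and memory cost noted for large objects.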


Pass-by-Reference

Formal parameter is a reference to (or address of) the actual parameter variable

i.e. get the l-value of the actual parameter and copy it into the formal parameter, then access the actual parameter indirectly through the formal parameter

Used in Pascal (var parameters), in C (using explicit pointers) and in C++ and PHP (&)

Most appropriate for IN and OUT parameter passing, but can be used for all

Actual parameter is usually restricted to a variable


Benefit is that we can change or not change the actual parameter using the formal; it is up to the programmer

Also good that memory is saved, since only an address is copied

Problem is that we can miss logic errors if we accidentally alter an actual parameter through the formal parameter

Also some applications (ex: some recursion) don't work as well

We may not want a change at one call to affect another call
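A short sketch of the two C-family spellings of Pass-by-Reference mentioned in these notes. The function names swap_ref and swap_ptr are illustrative only.

```cpp
// C++ reference parameters: a and b are aliases for the caller's variables.
void swap_ref(int& a, int& b) {
    int t = a;
    a = b;
    b = t;
}

// C style: the caller passes addresses explicitly and the sub dereferences them.
void swap_ptr(int* a, int* b) {
    int t = *a;
    *a = *b;
    *b = t;
}
```

In both versions only addresses are copied, and any assignment through the formal changes the caller's variable, whether that was intended or not.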


Constant Reference Parameters

Developers of C++ realized that value parameters are not practical for large data objects (too much time and memory, esp. for recursive algorithms)

Reference parameters have the danger of accidental side effects (when used for IN parameters)

Solution is to pass parameters by reference, but not allow them to be altered: constant reference

Now the compiler gives an error if the parameter is changed within the subprogram

Copy made if passed by reference to another sub


Good concept, but not perfect

Programmer can get around it by casting to a pointer and altering indirectly

See params.cpp

Ada IN parameters have a similar idea

Cannot be assigned/altered within the function

Cannot be passed by out or in out to another sub

More on Ada params shortly

Implementation:

Using the run-time stack, the address of the actual is stored in the activation record

Actual is accessed indirectly in the sub through its address
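A minimal sketch of a constant reference parameter in C++. The function total is a made-up example, not the content of params.cpp.

```cpp
#include <numeric>
#include <vector>

// Constant reference: no copy of the vector is made, and the compiler rejects
// any attempt to modify v inside the function.
double total(const std::vector<double>& v) {
    // v.push_back(1.0);   // would not compile: v is const here
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

The caller gets the cost of a reference (one address) with the safety of an IN parameter, as long as nobody casts the constness away.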


Pass-by-Result

Reference parameters are not an exact fit for OUT parameters

Ex: A procedure designed to read data from a file into an object

Here we don't care about what used to be in the object; we just want to be sure that at the end the appropriate value is assigned

With reference parameters we COULD access the old value and use it if we wanted to (or by mistake)

Pass-by-Result prevents this


In Pass-by-Result, the actual parameter is not actually passed to the subprogram; it only waits to have a value passed back to it

Formal parameter is a local variable

During the life of the subprogram its value does not affect the actual parameter at all

At the end of the subprogram its value is passed back to the actual parameter

So what is actually needed of the actual parameter is its address (l-value)

When the address is obtained can affect the result for some contrived examples


// Note: This is NOT real code

int A[8];

for (int i = 0; i < 8; i++) A[i] = i;

global int j = 2;

foo(A[j]);

output(A[]);

sub foo(int param)

{

int temp = 25;

j = 5;

param = temp;

}

------------------------------------------------

Output: 0 1 25 3 4 5 6 7 // if address obtained at call

Output: 0 1 2 3 4 25 6 7 // if address obtained at return
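C++ has no Pass-by-Result, so the following is only a simulation of the pseudocode above. The helper names foo_body and run are assumptions made for this sketch; the flag selects when the address of A[j] is computed.

```cpp
#include <array>

int j = 2;   // global index, as in the pseudocode above

// Body of foo: under pass-by-result the formal is just a local variable whose
// final value is handed back. Here that final value is returned to the caller.
int foo_body() {
    int temp = 25;
    j = 5;
    int param = temp;
    return param;
}

// Simulates the call foo(A[j]) with the two possible address-binding times.
std::array<int, 8> run(bool address_at_call) {
    std::array<int, 8> A{};
    for (int i = 0; i < 8; i++) A[i] = i;
    j = 2;
    if (address_at_call) {
        int* target = &A[j];      // l-value fixed before the body runs (j == 2)
        *target = foo_body();
    } else {
        int result = foo_body();
        A[j] = result;            // l-value computed at return (j == 5)
    }
    return A;
}
```

run(true) stores 25 in A[2] and run(false) stores it in A[5], matching the two output lines of the slide.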


If used, address is typically obtained at call

Ada 83 out parameters for simple types are ALMOST this, but the formal parameter value cannot be accessed within the sub (so it is not really a local variable)

Ada 95 changed out parameters to allow them to be accessed, fitting the Pass-by-Result model more closely

Implementation:

At sub call, the actual parameter's address is calculated and stored in the run-time stack, as is the formal parameter (as a local)

Final result of the formal is copied back to the actual's address at end of sub


Pass-by-Value-Result

Now the actual parameter's value is passed to the formal parameter when the subprogram is called, being stored and used as a local variable

At the end of the subprogram the value is passed back to the actual parameter

As the name indicates, this is a combination of Pass-by-Value and Pass-by-Result

Used for IN and OUT parameters


If aliasing is NOT allowed/used, and if no exceptions occur in the subprogram, the effect of value-result and reference is the same

Precondition: Actual parameter has the value obtained previous to the call

During subprogram: Only the formal parameter is accessed, updated as desired

Postcondition: Actual parameter has the last value assigned within the subprogram
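To see why the "no aliasing" condition matters, here is a hedged C++ sketch that simulates value-result by hand and compares it with true reference parameters. All function names are invented for the example, and the left-to-right copy-out order is an assumption.

```cpp
// Reference semantics: if both actuals are the same variable, a and b alias it.
void update_ref(int& a, int& b) {
    a = a + 1;
    b = b * 10;
}

// Simulated value-result: copy in, work only on locals, copy out at the end
// (left to right here, which is an assumption; a language could pick another order).
void update_value_result(int& actual_a, int& actual_b) {
    int a = actual_a;
    int b = actual_b;
    a = a + 1;
    b = b * 10;
    actual_a = a;
    actual_b = b;
}

int aliased_by_reference() {
    int x = 1;
    update_ref(x, x);            // x: 1 -> 2 -> 20
    return x;
}

int aliased_by_value_result() {
    int x = 1;
    update_value_result(x, x);   // locals a = 2, b = 10; x ends as 10
    return x;
}
```

With distinct actuals the two versions give identical results; with the same variable passed twice they give 20 and 10.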


Idea is that the language creators did not want to require the params to be passed in any specific way

They just wanted to require the in-out effect

If the result could differ based on whether params are value-result or reference, then the program is erroneous

Up to the programmer to NOT use aliases

Ada 95 clarified, requiring all structured in-out parameters to be reference

See params.adb

Implementation:

Value + Result


Pass-by-Name

Definitely the wackiest way of param passing

Used for IN and OUT parameters, and only in Algol

Idea is that the actual parameter is textually substituted for the formal in all places that it is accessed in the subprogram

Kind of like a macro substitution

It is only evaluated at the point of use in the subprogram

Evaluated EACH TIME it is used in the subprogram


Thus the parameter value or address could change based on where/when in the subprogram it is evaluated

However, the referencing environment used is that of the CALLER, not of the subprogram

So only changes within the subprogram that have a global effect will change its evaluation

This also makes implementation more difficult

For simple variables this is equivalent to pass-by-reference

Variable address evaluates the same way regardless of where in the subprogram it is located


global int i = 0, var = 11, n = 5;

global int A[2] = {4, 8};

foo(var, 2*n, A[i]); // all pass by name

void foo(int x, int y, int z)

{

x = x + 1; output(var);

output(y); n = n + 1; output(y);

output(z); z = z + 1; output(z);

i = i + 1; z = z + 1; output(z);

}


Implementation:

It is not trivial to allow a macro to be evaluated and reevaluated in the environment of the caller

Parameterless subprograms called thunks are used

Thunk evaluates the parameter in the current state of the caller's referencing environment

Returns the resulting address or value

Clearly this is a lot of overhead

Overhead and confusing results are why this is not used in newer languages
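The thunk idea can be imitated in C++ with parameterless lambdas. This is only a sketch of the foo(var, 2*n, A[i]) example under the assumption that each thunk returns either an address (a reference) or a value; the demo function and the use of std::function are choices made for this illustration, and the outputs are collected in a vector instead of being printed.

```cpp
#include <functional>
#include <vector>

int i = 0, var = 11, n = 5;
int A[2] = {4, 8};

// Each formal is a thunk that is re-evaluated at every use.
std::vector<int> foo(std::function<int&()> x,
                     std::function<int()> y,
                     std::function<int&()> z) {
    std::vector<int> out;
    x() = x() + 1;  out.push_back(var);
    out.push_back(y());  n = n + 1;  out.push_back(y());
    out.push_back(z());  z() = z() + 1;  out.push_back(z());
    i = i + 1;  z() = z() + 1;  out.push_back(z());
    return out;
}

std::vector<int> demo() {
    i = 0; var = 11; n = 5; A[0] = 4; A[1] = 8;
    return foo([]() -> int& { return var; },    // thunk for var
               []() { return 2 * n; },          // thunk for 2*n
               []() -> int& { return A[i]; });  // thunk for A[i]
}
```

Under these assumptions the recorded outputs are 12, 10, 12, 4, 5, 9: y changes when n changes, and z refers to A[1] once i has been incremented.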


Subprograms as Parameters

We allow variables as parameters so that we can access their values (or addresses) from within a subprogram

Why not allow subprograms so that we can execute them from within a subprogram?

Some languages do allow this (ex. Pascal, C++, PHP)

However, there are some issues to consider


Can the parameter subprogram arguments differ in form from each other?

If so, how to type check and even check the number of arguments when the subprogram is actually called?

Easiest solution is to require the arguments to all have the same form

Header of the parameter subprogram must be given within the header of the subprogram it is being passed to

Scope is also an issue: what is the referencing environment of the subprogram that is being passed as a parameter? Three reasonable possibilities exist:


1) The referencing environment in which the parameter subprogram is CALLED: shallow binding

2) The referencing environment in which the parameter subprogram is DEFINED: deep binding

3) The referencing environment in which the parameter subprogram is PASSED as an argument: ad hoc binding

Note that shallow binding fits well with dynamic scoping and deep binding fits well with static scoping


Pascal and C++ both use deep binding

Shallow binding is used by SNOBOL, which also uses dynamic scoping

Ad hoc binding has never been used

See fnparams.cpp
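A small C++ sketch of both points: the header of the parameter subprogram appears in the header of the receiving subprogram, and the passed subprogram sees the environment where it was defined. This is not the content of fnparams.cpp; all names are hypothetical, and the lambda is used here only as a convenient way to show the deep-binding behaviour.

```cpp
#include <functional>

// The header of the parameter subprogram (int -> int) is spelled out in the
// header of apply_twice, so the compiler can type check every call through f.
int apply_twice(int (*f)(int), int v) {
    return f(f(v));
}

int add_three(int v) { return v + 3; }

// Deep-binding flavour with a lambda: get_x refers to the x that was visible
// where the lambda was DEFINED, not to the x local to call_it.
int call_it(const std::function<int()>& f) {
    int x = 100;   // unrelated local at the point of the CALL
    (void)x;
    return f();
}

int deep_binding_demo() {
    int x = 1;
    auto get_x = [&x]() { return x; };
    x = 2;
    return call_it(get_x);   // sees the defining environment's x
}
```

Under shallow binding the call inside call_it would have found the local x (100) instead.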


Overloading (ad hoc polymorphism)

Using the same subprogram name with different parameter lists

When a subprogram is called, the compiler selects the correct version based on the parameter lists

In Ada, the return type of a function is also used, since coercion is not done in Ada and function return values cannot be ignored

Enables programmer to use the same name for similar functions that take different argument types


Use: Make it easier for the programmer to use consistent names for subprograms

Without overloading: Programmer must make up different but similar names for subprograms that do similar things but for different types

Ex: abs(int) fabs(float) labs(long)

Ex: ISort(int * A) FSort(float * A)

With overloading: Programmer uses the same name and the compiler decides which to use

Ex: abs(int) abs(float) abs(long)

Ex: Sort(int * A) Sort(float * A)


But programmer must be careful:

Ada and C++ both allow overloading and default parameters

Leaving out some parameters in the call could make a call ambiguous

i.e. it matches more than one function header

Call can also be ambiguous if implicit casting of arguments is done

Operator Overloading is the same idea, but with symbols rather than identifiers

We discussed these issues previously

See Slide 12 of cs1621b.ppt
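A brief C++ sketch of overload selection and of the two kinds of ambiguity mentioned above. The function name kind is invented for the example; the ambiguous calls are shown only in comments because they do not compile.

```cpp
#include <string>

// Three overloads of one name; the compiler picks one from the argument type.
std::string kind(int)    { return "int"; }
std::string kind(long)   { return "long"; }
std::string kind(double) { return "double"; }

// kind('a')  -> "int"     (char is promoted to int)
// kind(2.5f) -> "double"  (float is promoted to double)
// kind(3u)   -> does not compile: unsigned int converts equally well to
//               int, long and double, so the call is ambiguous

// Default parameters can also create ambiguity:
//   void f(int a);
//   void f(int a, int b = 0);
//   f(1);   // ambiguous: matches both headers
```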


Generics


Motivation:

Programmers often apply data structures and algorithms to more than one data type

Ex. Sorting, Searching algos

Ex. BST, PQ, Stack, Queue data structures

Even with overloading, the programmer must still write different (identical except for type) versions of the code

Generics simply transfer the job of making the different versions from the programmer to the compiler, automating the overloading process

Note that DIFFERENT VERSIONS of the code MUST STILL BE generated


So the reason we have generics is to save the programmer some time (and perhaps some confusion)

Ada vs. C++:

In Ada, template instantiations must be explicit

Programmer specifies template arguments using the new statement

Ex: package int_io is new integer_io(integer);

The generic package is integer_io

The instantiated package is int_io

The type argument is integer

As is usual in Ada, if declaration is explicit, there will be no surprises


In C++, template instantiations can be explicit or implicit

Implicit: generated automatically by the compiler when a call is seen with the appropriate arguments

Duplicate instantiations are merged into a single code segment

Coercion cannot be done, since the types won't match the template correctly

Saves programmer some typing

Explicit: programmer declares each version

Coercion can be done using regular C++ promotion and conversion rules

Programmer is aware of each version

See template.cpp and tordlist.h
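A hedged C++ sketch of the implicit/explicit distinction. The template max_of is a made-up example and is not taken from template.cpp or tordlist.h.

```cpp
// One template; the compiler generates a separate version for each T used.
template <typename T>
T max_of(T a, T b) {
    return (a < b) ? b : a;
}

// Explicit instantiation: the programmer asks for the double version by name.
template double max_of<double>(double, double);

// Implicit instantiation: T is deduced from the arguments.
//   max_of(3, 7)             -> T = int
//   max_of(3, 7.5)           -> does not compile: T cannot be both int and double
// Naming the version explicitly lets the normal conversions apply:
//   max_of<double>(3, 7.5)   -> 3 is converted to 3.0
```

The failing call in the comments is the "coercion cannot be done" case: deduction needs the argument types to match the template exactly.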


However, retrieving objects back from the collection required explicit casting to the actual type if we wanted full access to them

ArrayList A = new ArrayList();

A.add(new String("Wacky"));

String S = (String) A.remove(0);

Also any typing mistakes (mixing types in the collection unintentionally) could only be caught at run-time (via casting exceptions)

Overall not bad, but some people thought type parameters should be allowed


JDK 1.5 added syntax very similar to that for C++ templates

However, it is very different from C++ templates (and Ada generics as well)

It is not really adding any new generic abilities to the language

It is not creating new code for each version of the class or method

It is designed to make collections of objects more type-safe

See more details in the handout

Implementing Subprograms


What is involved when a subprogram is called, during its execution, and when it terminates?

This will differ depending on whether recursion is allowed in a language or not

Most modern languages allow recursion, but original FORTRAN (up to FORTRAN 77) did not allow it



FORTRAN 77 (and before)

All variables within a subprogram were static, and recursive calls were not allowed

Activation records were still used, but they also could be static

Since all data was static, the size was known at compile time

Run-time stack not needed, since at most one call per sub could be performed at a time

What do we need to know when a subprogram is called?



Return Value: if sub is a function

Local Variables: static

Parameters: like local variables that are initialized

Return Address: where to go back to when subprogram ends


So the activation record looks similar to that used in FORTRAN

With an additional link location to access global variables

Now multiple instances of an activation record can occur at the same time, so they must be created dynamically (at run-time), unlike in FORTRAN

Let's look at some of the contents of an activation record



Temporaries

Local Variables

Parameters

Dynamic Link to previous call

Static Link to Non-Locals

Return Address

Temps and local variables are allocated within the subprogram call. In Pascal, C and C++, the local variables must be of fixed size. In Ada, they can be variable size (ex. arrays)

Parameters, links to non-locals and the return address are placed into the AR by the caller of the subprogram, so they are lower in the record



See rtstack.cpp

Accessing non-local variables within a subprogram

Local variables are located within the activation record (AR)

Can be accessed by knowing the base address of the AR plus a local_offset for each variable

Ex: Base address of AR = 162

int x, y[5]; // address of x is 162 + (other AR stuff)

float z; // address of z is 162 + (other AR stuff) + 4 + 20
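The offset arithmetic can be checked with a struct that holds only these three locals. This is a rough analogy, assuming 4-byte ints and a compiler that lays the members out without extra padding; the struct name Locals is hypothetical, and a real activation record has additional fields in front.

```cpp
#include <cstddef>

// Hypothetical layout of just the locals from the example; a real AR also
// holds links, the return address and so on ("other AR stuff").
struct Locals {
    int   x;
    int   y[5];
    float z;
};

// Offset of z from the start of the locals: one int plus five ints.
// With 4-byte ints this is 4 + 20 = 24, matching the comment on the slide.
constexpr std::size_t z_offset = offsetof(Locals, z);
```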



Non-locals are located elsewhere

For languages like C and C++:

Subprograms cannot be nested

Besides locals there are global variables

For languages like Ada and Pascal:

Subprograms can be nested to arbitrary depth

A sub can be declared within a sub, which is within a sub, which is within a sub

Using static scope, variables declared in a textual parent sub are accessible from an inner sub

Relative global variables

But the variable locations could be in different places on the run-time stack

How to find them?


Two techniques used to locate the AR

1) Static links

A link is kept in an AR to that AR's textual parent (from the declaration)

To access a single nonlocal, many links may be crossed

2) Display

A single array is kept to indicate all of the currently accessible nested subs

Any nonlocal can be accessed with two indirect accesses


However, the textual parent does NOT have to be the previous call on the run-time stack

So the dynamic link in the AR is not enough (but would work for dynamic scoping)

sub foo

{

sub innerA

{ }

sub innerB

{ innerA; }

innerB;

}

main

{ foo; }

Run-time stack while innerA is executing (top first):

innerA

innerB

foo



Static links connect an AR to the AR of the sub's textual parent, no matter where previously on the RT stack it is

How is this used to access nonlocal variables?

Can be determined and maintained based on the nesting depths of the subprograms that are called

The difference in the nesting depths between the sub using a nonlocal variable and the sub in which the nonlocal is declared is equal to the number of static links that must be crossed to find the correct AR for the variable



This difference can be stored for each variable when the program is compiled, so that at run-time finding the variable is simple

sub parent {

var X, Y

sub child1 {

var X, Z

sub grand1 {

var Z

}

}

sub child2 {

var Y

call child1

}

}

main {

call parent

}

If variable Y is accessed within grand1:

chain offset is 2, since Y is declared two levels outside grand1

so the search for Y only has to be done once at compile-time

at run-time we know to follow two static links, whatever the call sequence is
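A simplified C++ model of the static chain lookup for this example. The AR struct, the nonlocal function and the call sequence parent -> child2 -> child1 -> grand1 are assumptions made for the sketch; a real activation record holds more than a static link and locals.

```cpp
#include <vector>

// Simplified activation record: only the static link and the locals.
struct AR {
    AR* static_link;           // AR of the textual parent
    std::vector<int> locals;   // indexed by local_offset
};

// Follow chain_offset static links, then index by local_offset.
// Both numbers are assumed to have been fixed at compile time.
int& nonlocal(AR* current, int chain_offset, int local_offset) {
    AR* ar = current;
    for (int k = 0; k < chain_offset; ++k) ar = ar->static_link;
    return ar->locals[local_offset];
}

// Call sequence assumed for the demo: parent -> child2 -> child1 -> grand1.
int y_seen_from_grand1() {
    AR parent{nullptr, {10, 20}};    // X = 10, Y = 20
    AR child2{&parent, {99}};        // child2's own Y = 99
    AR child1{&parent, {1, 2}};      // X, Z
    AR grand1{&child1, {3}};         // Z
    (void)child2;                    // on the stack, but not on grand1's static chain
    return nonlocal(&grand1, 2, 1);  // two links up, offset 1: parent's Y
}
```

Even though child2 (with its own Y) sits between parent and child1 on the run-time stack, the two static links lead straight to parent's Y.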



What actually happens when a sub is called?

AR for the textual parent of the sub must be located on the run-time stack, so that the static link can be linked to it

A clear (but inefficient) way to do this is to follow dynamic links down the RTS until the AR for the parent sub is found

A better way can take advantage of the fact that the calling sub and the called sub must be relatives in the declaration tree

Calling sub could be parent of called sub (but not grandparent)

Calling sub could be called sub (direct recursion)

Calling sub could be a sibling of called sub

Calling sub could be a descendant of called sub (indirect recursion)

Calling sub could be a niece of called sub



So instead of following dynamic links, at compile-time we can pre-calculate the number of static links (from the caller) to follow to find the appropriate textual parent AR

Always equal to: nesting_depth(calling sub) - nesting_depth(called sub) + 1

Calling sub could be parent of called sub

X - (X+1) + 1 = 0 static links (use caller's AR)

Calling sub could be called sub (direct recursion)

X - X + 1 = 1 static link: same textual parent

Calling sub could be a sibling of called sub

X - X + 1 = 1 static link: same textual parent

Calling sub could be a descendant of called sub (indirect recursion)

Calling sub could be a niece of called sub

Follow diff. in nesting depth + 1 static links
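The formula is small enough to write down directly. In this sketch the function name links_to_follow is invented, and nesting depths are assumed to be numbered from 0 at the outermost sub.

```cpp
// Number of static links to follow from the CALLER's AR to reach the AR of
// the called sub's textual parent.
int links_to_follow(int caller_depth, int callee_depth) {
    return caller_depth - callee_depth + 1;
}
```

For the Bigsub example that follows, with Bigsub at depth 0, A and C at depth 1, and B and D at depth 2: Bigsub calling A gives 0 (use Bigsub's own AR), A calling C gives 1, and B calling A gives 2 (B to A to Bigsub).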


procedure Bigsub is

procedure A(Flag: Boolean) is

procedure B is

...

A(false);

end; -- B

begin -- A

if flag

then B;

else C;

end; -- A

procedure C is

procedure D is

here

end; -- D

...

D;

end; -- C

begin -- Bigsub

A(true);

end; -- Bigsub

D: dynamic link to C, static link to C, return addr. to C

C: dynamic link to A, static link to Bigsub, return addr. to A

A: param flag ( = false), dynamic link to B, static link to Bigsub, return addr. to B

B: dynamic link to A, static link to A, return addr. to A

A: param flag ( = true), dynamic link to Bigsub, static link to Bigsub, return addr. to Bigsub

Bigsub: dynamic link to caller, static link, return addr.



Evaluation of static links

Maintaining is not too time-consuming

Chain offsets can be calculated at compile time

Local variables can be accessed directly

Non-locals must follow 1 or more static links

Works well if nesting depths do not get too deep

For deep sub nesting, the cost of non-local access can be high

But usually 2 or 3 levels is the max used



Display

Uses a single array to store links to ARs at all relevant nesting depths

To access a nonlocal at a given nesting depth, we just follow the display entry for that depth, then the local_offset

Never more than one link to follow

Array is updated as subs are called and as they terminate

Generally faster than static links if many nesting levels are used

We will skip the details here; read the text



Nested declaration blocks

Idea could be similar to nested subs

Blocks could be treated as parameterless subs

Static links could be used to determine textual parent

But it is actually much easier to handle, since block entry and exit is always the same

Parent block goes to child block

When child block terminates, we revert to parent block



Simply push new block declarations onto the run-time stack, and pop them when the block terminates

But we only have one activation record, so no links are required

"Non-locals" can be accessed just like locals


Data Abstraction


Procedural (process) abstraction:

Action can be performed without requiring detailed knowledge of how it is performed

Data abstraction:

New type can be used without requiring detailed knowledge of how it is implemented

We don't need to know the details of how it is stored in memory

We don't need to know the details of how it is manipulated via operations


More formally, an ADT must satisfy two conditions:

1) The declarations of the type and operations (interface) are contained in a single syntactic unit: ENCAPSULATION

The interface does not depend on how the objects are represented or how the operations are implemented

2) The representation of the objects is hidden from users of the ADT: DATA HIDING

Objects can only be manipulated via the provided interface


Ex: Stack

Data: something that can store and access multiple data values in the manner dictated by the operations

Operations:

Push: add new value to top of stack

Pop: remove top value from stack

Top: view top value (or a copy) without removing

Empty: is stack empty

User of stack only needs to know the parameters and effect of each operation to use a stack correctly

Implementation could be an array, a linked-list, or maybe something different

Does not affect use

Implementer can hide these details from the user through private declarations
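One possible rendering of the Stack ADT as a C++ class. The class name IntStack, the choice of int elements, the vector representation and the exception behaviour on an empty stack are all assumptions of this sketch.

```cpp
#include <stdexcept>
#include <vector>

class IntStack {
public:
    void push(int v) { items_.push_back(v); }
    void pop() {
        if (items_.empty()) throw std::out_of_range("pop on empty stack");
        items_.pop_back();
    }
    int top() const {
        if (items_.empty()) throw std::out_of_range("top on empty stack");
        return items_.back();
    }
    bool empty() const { return items_.empty(); }

private:
    std::vector<int> items_;   // hidden representation; could be a linked list
};
```

Users see only push, pop, top and empty; swapping the private vector for another structure would not change any code that uses the class.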


Newer languages added true data abstraction

Ada via packages

C++, Java, C#, Ada95 via classes / objects

Encapsulation units that contain all details of the new type

Access modifiers that prevent access to internal details of the ADT from outside the encapsulation unit

See text for more details

Object-Oriented Programming (OOP)


Characteristics of OOP

1) Data abstraction: encapsulation + information-hiding

The operations for manipulating data are considered to be part of the data type (encapsulated)

The implementation details of the data type (both the structure of the data and the implementation of the operations) are separate from their specifications and (possibly) hidden from the user

As we discussed with ADTs


2) Inheritance

The characteristics of an ADT (data + operations) can be passed on to a subtype

Subtype can also add new data and operations

Allows programmer to build new (derived) types from old (parent) ones

Common data/operations do not have to be rewritten (or copied)

Operations that are slightly different in derived type can be rewritten (overridden) for that type

New data/operations tailor the derived type to the problem at hand

Parent type is unchanged and may (sometimes) be used together with derived type


3) Polymorphism

Variables of a parent class can also be assigned objects of a subclass (or subclass of a subclass)

Operations used with a variable are based upon the class of the object currently stored (could be a parent type object or a derived type object)

Operations may have been overridden in the derived class

Dynamic binding allows parent and derived objects to be used together in a logical way


Ex: Shape class

We could declare:

Shape shapelist[100];

shapelist[0] = new Rectangle(0, 0, 10, 20);

shapelist[1] = new Square(50, 100, 30, 30);

shapelist[2] = new Circle(100, 50, 25);

for (int i = 0; i < 3; i++)

shapelist[i].Draw();

Polymorphism allows these different objects to be accessed consistently within the same array

Think about how you could do the code above in C or Pascal

It would not be easy!
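A compilable C++ variation on the Shape idea. To keep the result checkable it uses a hypothetical area() operation instead of Draw(), drops the position arguments, and stores the shapes through std::unique_ptr; these are choices of this sketch, not of the slide.

```cpp
#include <memory>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;   // overridden in each derived class
};

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
private:
    double w_, h_;
};

class Square : public Rectangle {
public:
    explicit Square(double side) : Rectangle(side, side) {}
};

// The element type is "pointer to Shape"; the call to area() is dynamically
// bound to the class of the object each pointer actually refers to.
double total_area(const std::vector<std::unique_ptr<Shape>>& list) {
    double sum = 0.0;
    for (const auto& s : list) sum += s->area();
    return sum;
}
```

The loop in total_area never names Rectangle or Square, so a new derived class can be added without touching it.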


One option: Make one giant struct or record to contain all of the data, including a union or variant

Base class would use only the core data items

Derived classes would use additional data items as provided in the union or variant

To do the operations, we would need a switch or case to test which type the variable is, so that it can be written out appropriately

Now what if we want to add another new derived class, Pentagon?

With OOP, it is simple to add any new data and override the necessary operations

Without OOP we would have to change the overall structure of the data and operations; old types would change, possibly causing problems
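A sketch of the non-OOP alternative: one record with a tag and a union, and a switch in every operation. The names Kind, RectData, CircleData, ShapeRec and area are hypothetical.

```cpp
enum class Kind { Rectangle, Circle };

struct RectData   { double w, h; };
struct CircleData { double r; };

// One record for every kind of shape: core data plus a union of variants.
struct ShapeRec {
    Kind kind;
    double x, y;             // core data used by every shape
    union {
        RectData rect;       // used only when kind == Kind::Rectangle
        CircleData circle;   // used only when kind == Kind::Circle
    };
};

// Every operation needs a switch on the tag.
double area(const ShapeRec& s) {
    switch (s.kind) {
        case Kind::Rectangle: return s.rect.w * s.rect.h;
        case Kind::Circle:    return 3.141592653589793 * s.circle.r * s.circle.r;
    }
    return 0.0;   // not reached for valid tags
}
```

Adding Pentagon here means editing the enum, the union and every switch in the program, which is exactly the maintenance problem described above.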


OO Languages

1) Smalltalk was the first and purest OOL

All data (even numeric literals) are objects, and are all descendants of class Object

Objects are all allocated from the heap, and implicitly deallocated (garbage collection)

Variables are references, with implicit dereferencing

Execution of a program (logically) involves objects sending messages to each other, executing methods, and responding back

So the data is driving the execution, not the control statements


Equivalent to previous code:

| letters |

letters := 0.

(Prompter prompt: 'Enter your name' default: '')

do: [ :c | c isLetter

ifTrue: [ letters := letters + 1 ].

].

letters printNl.

Now we cascade the messages to allow fewer statements (also, the do: loop iterates through the characters in a string, so we don't need the loop counter)

((Prompter prompt: 'Enter your name' default: '')

select: [ :c | c isLetter ]) size printNl.

Now the select: loop generates a string based on the condition in the block


More on Smalltalk (classes and objects)

Data in an object can be an instance variable or a class variable

Instance variables are associated with objects

Separate data for each object

Accessible only through the methods defined for that object; always private to the class

Class variables are associated with classes

Shared data for all objects of the same class

Accessible from all objects, but still private to the class

Methods have a similar grouping, but are public

Instance methods associated with objects

Class methods associated with entire class


More on Smalltalk (inheritance)

Object is the base class of all others

Only single inheritance allowed

All inheritance is implementation inheritance

Data and methods of parent class are always accessible to the derived class

i.e. Cannot hide implementation details from derived class

Advantage: Derived class can likely implement its methods more efficiently with access to parent data

Disadvantage: Change in parent class implementation will likely require change in derived class implementation

Ex. Traversable stack


More on Smalltalk (polymorphism)

All messages are dynamically bound to methods

At run-time, when a message is received, the object's class is searched for a method, then, if necessary, its superclass, its super-superclass and so on up to Object

Variables have no types since they are only used to refer to objects, not to determine the messages an object can receive

Clearly some liabilities with this approach

Slows language down due to run-time overhead

Programmer type errors cannot be caught until execution time


Let's look at some examples:

person.cls as an example of a new class

See personTest.st

student.cls as an example of a subclass

studentTest.st as an example showing polymorphic access

twodarry.cls as another subclass example

See twodTest.st

For more information, see the GNU Smalltalk User's Guide: http://www.gnu.org/software/smalltalk/gst-manual/gst.html


2) C++ is an imperative/OO mix

Had to be backward compatible with C

Wanted to add object-oriented features

Result is that programmer can use as few or as many OO features as he/she wants to

C++ Classes and Objects

Can be static, stack-dynamic or heap-dynamic

Member data and member functions can be private, protected or public

Allows programmer to decide

Like Smalltalk, has notion of class variables

Declared as static in C++

Destructor needed if object uses dynamic memory


C++ Inheritance

Do not need a superclass (no Object base class for all other classes)

Multiple inheritance is allowed

Complex and difficult to use

Implementation inheritance or interface inheritance are allowed

With interface inheritance, all data and functions are still inherited, but only public ones are directly accessible to the derived class

Advantage: Modifications to parent class do not affect derived class, as long as they do not change the interface

Disadvantage: Operations may be slower, since they cannot access the data directly


3) Java falls in between Smalltalk and C++

Like Smalltalk:

Object is base class to other classes

Single inheritance only

Objects are (almost) all dynamic, with garbage collection

References used to access

Method names are (by default) dynamically bound

Like C++:

Access can be private, public or protected

Static binding can optionally be used to improve run-time speed

Overall syntax for member data and function access

Variables are typed


Other Java OOP features:


Interfaces allow for a simplified form of multiple inheritance

An interface is in a sense a base class with no data and only abstract (pure virtual) methods

A class that implements an interface simply implements the methods specified therein

Advantages: Objects that implement an interface can be used wherever the interface is specified. This allows for a type of generic behavior

Ex: Comparable interface, Runnable interface

Disadvantage: Can become complicated when interfaces and inheritance are both used

Reflection that allows us to manipulate the classes themselves

See poly.java


OOL Implementation

Data:

Typically a record/struct type of storage is used: Class Instance Record (CIR)

Data members are accessed by name, in the same way as records

Subclass adds extra data to CIR of parent class

Private access enforced by limiting visibility of the data


Subprograms:

Static binding:

  Subprograms that will be called are determined by the variable type

  Variable types are known at compile time and code can be determined then

Dynamic binding:

  Subprograms that will be called are determined by the object's type, not the variable's type

  Objects stored in a variable are determined at run time

  Appropriate links must be stored with the object

    But they are the same for all objects of that class

    Virtual Method Table (VMT) used to store links to all pertinent subprograms
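The difference between the two bindings can be observed in Java. This is a sketch under assumptions: the Shape and Circle classes are hypothetical, and Java static methods stand in for statically bound subprograms. How the JVM lays out its method tables internally is not shown.

```java
class BindingDemo {
    // Static binding: the call is resolved from the declared type of s.
    static String staticCall() {
        Shape s = new Circle();
        return s.kind();      // resolves to Shape.kind at compile time
    }

    // Dynamic binding: the call is resolved from the object stored in s.
    static String dynamicCall() {
        Shape s = new Circle();
        return s.describe();  // resolves to Circle.describe at run time
    }

    public static void main(String[] args) {
        System.out.println(staticCall());
        System.out.println(dynamicCall());
    }
}

class Shape {
    static String kind() { return "Shape"; }
    String describe() { return "generic shape"; }
}

class Circle extends Shape {
    static String kind() { return "Circle"; }   // hides Shape.kind
    @Override
    String describe() { return "circle"; }      // overrides Shape.describe
}
```

Calling a static method through a variable is legal Java but normally discouraged; it is done here only to make the compile-time choice visible.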

Parallelism

Parallelism is incorporated into programs for 2 primary reasons:

1) Program is running in a multiprocessing or distributed environment

  Many computers now have multiple CPUs

  Many jobs are distributed over multiple computers in a network

  A programming language should be able to take advantage of this parallelism

    Many algorithms can be improved if designed for parallel execution

This is PHYSICAL PARALLELISM
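As one illustration, a sum can be split across the available CPUs. This is a sketch using Java's stream library, which is our choice and not something the notes prescribe. Whether the parallel version is actually faster depends on the hardware and the problem size.

```java
import java.util.stream.LongStream;

class ParallelSumDemo {
    // Adds 1..n on a single thread.
    static long sequentialSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same computation, but the range may be split across several CPUs.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sequentialSum(1000));
        System.out.println(parallelSum(1000));
    }
}
```

Both calls give the same result; only the way the work is scheduled differs.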

Parallelism

2) Program is running in a simulated parallel environment, allowing for asynchronous activity

  Ex: Two windows are displayed to the user. One shows the current time (incremented by seconds) and one allows the user to draw images on the screen

    We don't want the act of the user drawing to stop the clock

    We don't want the clock running to prevent the user from drawing

  Even with a single processor, we want both of these activities to execute in parallel

This is LOGICAL PARALLELISM
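A rough Java sketch of the clock/drawing example, with two threads standing in for the two windows. The class and method names are hypothetical, and the printed lines replace any real GUI work. The order in which the two threads' lines appear is not fixed.

```java
class LogicalParallelismDemo {
    // Runs a "clock" task and a "drawing" task as separate threads and
    // reports how many steps each one completed.
    static int[] runBoth(int ticks, int strokes) {
        int[] done = new int[2];   // each thread writes only its own slot
        Thread clock = new Thread(() -> {
            for (int i = 0; i < ticks; i++) {
                System.out.println("clock: tick " + i);
                done[0]++;
            }
        });
        Thread drawer = new Thread(() -> {
            for (int i = 0; i < strokes; i++) {
                System.out.println("draw: stroke " + i);
                done[1]++;
            }
        });
        clock.start();
        drawer.start();
        try {
            clock.join();          // wait for both tasks to finish
            drawer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done;
    }

    public static void main(String[] args) {
        runBoth(3, 3);
    }
}
```

Neither task waits for the other, even on a machine with a single processor.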

Parallelism

If the tasks have some dependencies, there can be a problem

  Most common dependency is shared data

  To handle this we must synchronize the tasks

Cooperation Synchronization

  One task is dependent upon an output/outcome of another

  Ex: Task B must process data produced by Task A

    Contractor B cannot put up drywall until contractor A has finished the wiring

    Task to count ballots cannot proceed until task that collects ballots provides it with some

  We must have a mechanism that allows Task B to pause until the data is available

    B could loop and keep checking for data

    B could wait for some signal from A
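The ballot example can be sketched in Java with a blocking queue, which plays the role of the signal from A to B. The names are hypothetical, and the queue is our choice of mechanism; the notes do not name one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BallotDemo {
    // The collector task produces ballots; the counting task pauses in
    // take() until a ballot is actually available.
    static int countBallots(int total) {
        BlockingQueue<String> box = new ArrayBlockingQueue<>(2);
        Thread collector = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    box.put("ballot " + i);   // waits if the box is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        collector.start();

        int counted = 0;
        try {
            while (counted < total) {
                box.take();                   // waits if the box is empty
                counted++;
            }
            collector.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counted;
    }

    public static void main(String[] args) {
        System.out.println(countBallots(5));
    }
}
```

The counter never loops checking for data; it simply sleeps inside take() until the collector supplies something.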

Parallelism

Competition Synchronization

Both tasks are competing for the same shared resource

If one or both tasks modify the data, it could cause data inconsistencies

Ex: Task A and Task B are MAC machine (ATM) accesses of the same bank account

  Task A checks the balance: $200

  Task B checks the balance: $200

  Task A withdraws $200

  Task A updates balance to $0

  Task B withdraws $200

  Task B updates balance to -$200

We must have some mechanism that ensures MUTUAL EXCLUSION for CRITICAL DATA

  We could have a LOCK on the data, or a similar mechanism allowing only one task to access it at a time
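A Java sketch of the bank account example. The first method replays the bad interleaving step by step in a single thread, so the outcome is repeatable. The second uses real threads with a lock around the check and the update. All names are hypothetical, and ReentrantLock is just one possible locking mechanism.

```java
import java.util.concurrent.locks.ReentrantLock;

class AccountDemo {
    static int balance;
    static final ReentrantLock lock = new ReentrantLock();

    // Replays the unsafe interleaving: both tasks check before either updates.
    static int unsafeInterleaving() {
        int bal = 200;
        boolean aOk = bal >= 200;   // Task A checks the balance: $200
        boolean bOk = bal >= 200;   // Task B checks the balance: $200
        if (aOk) bal -= 200;        // Task A withdraws, balance is $0
        if (bOk) bal -= 200;        // Task B withdraws, balance is -$200
        return bal;
    }

    // Check and update form one critical section guarded by the lock.
    static boolean withdraw(int amount) {
        lock.lock();
        try {
            if (balance >= amount) {
                balance -= amount;
                return true;
            }
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Two tasks try to withdraw $200 each from a $200 account.
    static int lockedWithdrawals() {
        balance = 200;
        Thread a = new Thread(() -> withdraw(200));
        Thread b = new Thread(() -> withdraw(200));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return balance;
    }

    public static void main(String[] args) {
        System.out.println(unsafeInterleaving());
        System.out.println(lockedWithdrawals());
    }
}
```

With the lock, whichever task arrives second sees the updated balance and its withdrawal is refused.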

Parallelism

Synchronization Mechanisms

Semaphores

  Devised by Dijkstra

  Basically guards that are placed around code

    P must succeed to gain access to code

      Decrements a counter when it succeeds

    V executes when critical section ends

      Increments the counter, allowing a waiting task to proceed

    Based on initial value of counter, we can control how many tasks are allowed to access the critical section at once

  If used properly, can guarantee either cooperation or competition synchronization

  However, it is easy to NOT use them properly

    Can cause problems
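In Java, P and V correspond to acquire and release on java.util.concurrent.Semaphore. The sketch below is only illustrative (the class and method names are hypothetical): it records the largest number of tasks ever inside the guarded section at once.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreDemo {
    // Starts the given number of tasks and returns the largest number that
    // were inside the guarded section at the same moment.
    static int maxInside(int permits, int tasks) {
        Semaphore guard = new Semaphore(permits);   // initial counter value
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();

        Runnable task = () -> {
            try {
                guard.acquire();                    // P: waits while counter is 0
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            try {
                int now = inside.incrementAndGet();
                max.accumulateAndGet(now, Math::max);
                inside.decrementAndGet();
            } finally {
                guard.release();                    // V: critical section ends
            }
        };

        Thread[] threads = new Thread[tasks];
        for (int i = 0; i < tasks; i++) {
            threads[i] = new Thread(task);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println(maxInside(2, 8));
    }
}
```

Forgetting the release, or placing it where an exception can skip it, is exactly the kind of misuse warned about above; the finally block guards against the second case.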

Parallelism

Monitors

Devised by Brinch Hansen and Hoare

Critical data section is part of a data object that allows only one task entry at a time

Better than semaphores for competition synchronization, because mechanism is built into the monitor

  Harder for programmer to mess up

No better for cooperation synchronization

  Still must be done manually

Used in Concurrent Pascal, Modula-2 and (somewhat) in Java
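Java's version of the idea is a class whose methods are declared synchronized. In this sketch (Slot and MonitorDemo are hypothetical names), mutual exclusion comes from the synchronized keyword alone, while the cooperation part still has to be written by hand with wait and notifyAll.

```java
class MonitorDemo {
    // A producer thread puts 1..count into the slot; the caller takes
    // them out and returns their sum.
    static int transfer(int count) {
        Slot slot = new Slot();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    slot.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += slot.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(transfer(4));
    }
}

// One-element buffer. Only one thread at a time can be inside put or take.
class Slot {
    private Integer value;   // null means the slot is empty

    synchronized void put(int v) throws InterruptedException {
        while (value != null) {
            wait();          // manual cooperation: wait until empty
        }
        value = v;
        notifyAll();
    }

    synchronized int take() throws InterruptedException {
        while (value == null) {
            wait();          // manual cooperation: wait until full
        }
        int v = value;
        value = null;
        notifyAll();
        return v;
    }
}
```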

Parallelism

Message Passing

Proposed by Brinch Hansen and Hoare

More general than either of the two previous techniques

Tasks are synchronized via messages sent to each other

Message is similar in look/execution to a subprogram call, but with restrictions:

  Caller (or passer) of the message is blocked at the call until the receiver is ready to receive it

  Receiver (or executer) of the message is blocked at the message code until the message is called

  Caller and Receiver meet at a rendezvous

Parallelism

Idea is that we know exactly where in the code both tasks will be when a rendezvous occurs

  So even though tasks execute asynchronously, we synchronize them with respect to each other at a rendezvous

  Ex: Ada

Still much of the work is up to the programmer
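Java has no Ada-style rendezvous, but a SynchronousQueue gives a similar meeting point and is used here as a stand-in. Its put blocks until another thread calls take, and take blocks until another thread calls put. Unlike an Ada accept, it only hands a value over and does not run a body of code while the caller waits. The names below are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

class RendezvousDemo {
    // The caller thread and the receiving thread meet at the queue:
    // neither gets past its put/take until the other has arrived.
    static String exchange(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        Thread caller = new Thread(() -> {
            try {
                channel.put(message);             // blocked until received
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        caller.start();
        try {
            String received = channel.take();     // blocked until sent
            caller.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("hello"));
    }
}
```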

Parallelism

Parallel processing concerns

Data consistency

  We have already discussed this

  Mutual exclusion is needed to prevent multiple tasks from accessing critical data at the same time

However, efforts to ensure data consistency can cause other problems, such as DEADLOCK and STARVATION

Parallelism

Deadlock

When a (shared) resource has restricted access, it can cause a task to stop execution

  Wait in a semaphore queue

  Wait in a monitor queue

  Wait in an accept queue

If a circular resource dependency exists, we can get deadlock

  Ex:

    Task A has acquired binary semaphore S1

    Task B has acquired binary semaphore S2

    Task A is waiting for binary semaphore S2

    Task B is waiting for binary semaphore S1
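The circular dependency can be replayed safely in Java without hanging the program. This sketch relies on two assumptions: Java semaphores are not owned by a particular thread, so one thread can play both tasks, and tryAcquire is used in place of a blocking acquire so that each failed request is reported instead of waiting forever.

```java
import java.util.concurrent.Semaphore;

class DeadlockDemo {
    // Returns true when both tasks would be stuck waiting on each other.
    static boolean circularWait() {
        Semaphore s1 = new Semaphore(1);   // binary semaphore S1
        Semaphore s2 = new Semaphore(1);   // binary semaphore S2

        s1.tryAcquire();                   // Task A has acquired S1
        s2.tryAcquire();                   // Task B has acquired S2

        boolean aGetsS2 = s2.tryAcquire(); // Task A requests S2
        boolean bGetsS1 = s1.tryAcquire(); // Task B requests S1

        // With a blocking acquire, both tasks would now wait forever.
        return !aGetsS2 && !bGetsS1;
    }

    public static void main(String[] args) {
        System.out.println(circularWait());
    }
}
```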

Parallelism

Starvation

To combat deadlock, most languages allow a task to release a resource prematurely in some circumstances

  Ex: If one of the tasks in the previous example releases the semaphore, the other can proceed

Under these circumstances there is the possibility that a task may never acquire all of the resources that it needs at the time it needs them: starvation

We must be careful to avoid all of these problems when programming in parallel

Prolog

Rules are predicates that consist of a head and a body

  In order for the head to "succeed" in its evaluation, all of the goals in the body must be satisfied

    These goals could be facts, or could be other rules

  Ex from ex1.pl:

    sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).

    The :- can be thought of as "if"

Execution of a program is in fact a sequence of questions, or assertions

Database is searched in an effort to satisfy all of the assertions

Prolog

If assertions can be satisfied, answer is yes

  Otherwise, answer is no

If a given assertion succeeds, execution proceeds to the next one

If a given assertion fails, execution backtracks and attempts to re-satisfy the previous assertion
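To make the search concrete, here is a rough Java imitation of how a sibling query might be answered when both arguments are given. It is only an analogy: real Prolog does this through unification, and the facts below are hypothetical, not the contents of ex1.pl.

```java
import java.util.Arrays;
import java.util.List;

class SiblingDemo {
    // Hypothetical database of facts; each pair means parent(P, Child).
    static final List<String[]> PARENT = Arrays.asList(
        new String[] {"pat", "ann"},
        new String[] {"pat", "bob"},
        new String[] {"lee", "cal"});

    // Imitates: sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).
    static boolean sibling(String x, String y) {
        if (x.equals(y)) {
            return false;                   // goal X \== Y fails
        }
        for (String[] f1 : PARENT) {        // search for parent(P, X)
            if (!f1[1].equals(x)) {
                continue;
            }
            String p = f1[0];               // P now has a value
            for (String[] f2 : PARENT) {    // search for parent(P, Y)
                if (f2[0].equals(p) && f2[1].equals(y)) {
                    return true;            // every goal satisfied: yes
                }
            }
            // parent(P, Y) failed: go back and try another parent(P, X)
        }
        return false;                       // no choices left: no
    }

    public static void main(String[] args) {
        System.out.println(sibling("ann", "bob"));
        System.out.println(sibling("ann", "cal"));
    }
}
```

The outer loop moving on to its next fact after the inner loop fails plays the role of backtracking to re-satisfy the previous goal.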

So what about variable assignments?

These are in fact just side effects that occur in an effort to satisfy the query

In fact variables are not assigned in the traditional (imperative language) sense

Prolog

Variables in Prolog are dynamically typed and have two states:

  Uninstantiated: Variable is not associated with a value

  Instantiated: Variable is associated with a value

    Once a variable is instantiated, it keeps that value, and all occurrences of that variable within the same scope have that value

      Cannot be re-assigned in sense of imperative languages

      However, if execution backtracks past the point at which it was instantiated, it can again become uninstantiated

Let's look again at ex1.pl

Prolog

Recursion and database search

Recursion is a fundamental part of programming in Prolog

  Execution is simply satisfaction of goals, and there are no loops as in imperative languages

  Thus, to build complex "programs" we must utilize recursive programming

Each attempt to satisfy a goal initiates a search of the database

Prolog Lists

As in Lisp, the list is an important data structure in Prolog

A list consists of a head and a tail

  Tail could be the empty list
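The head/tail view leads naturally to recursive definitions. A Java sketch of the same pattern follows; the method names are hypothetical, and a Prolog version would be written as two clauses, one for the empty list and one for a head with a tail.

```java
import java.util.Arrays;
import java.util.List;

class ListDemo {
    // Length of the empty list is 0; otherwise 1 plus the length of the tail.
    static int length(List<Integer> list) {
        if (list.isEmpty()) {
            return 0;
        }
        List<Integer> tail = list.subList(1, list.size());
        return 1 + length(tail);
    }

    // Sum of the empty list is 0; otherwise the head plus the sum of the tail.
    static int sum(List<Integer> list) {
        if (list.isEmpty()) {
            return 0;
        }
        int head = list.get(0);
        List<Integer> tail = list.subList(1, list.size());
        return head + sum(tail);
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(4, 5, 6);
        System.out.println(length(items));
        System.out.println(sum(items));
    }
}
```

No loop is needed: each call deals with the head and passes the tail on, stopping when the tail is the empty list.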
