cs1621b

Upload: paulo-laxamana

Post on 05-Apr-2018


  • 7/31/2019 cs1621b

    1/175

    Course Notes for CS1621: Structure of Programming Languages, Part B

    By John C. Ramirez

    Department of Computer Science, University of Pittsburgh


    These notes are intended for use by students in CS1621 at the University of Pittsburgh and no one else

    These notes are provided free of charge and may not be sold in any shape or form

    Material from these notes is obtained from various sources, including, but not limited to, the following textbooks:

    Concepts of Programming Languages, Seventh Edition, by Robert W. Sebesta (Addison Wesley)

    Programming Languages, Design and Implementation, Fourth Edition, by Terrence W. Pratt and Marvin V. Zelkowitz (Prentice Hall)

    Compilers: Principles, Techniques, and Tools, by Aho, Sethi and Ullman (Addison Wesley)


    Expressions

    Expressions are vital to programs

    Allow programmer to specify the calculations that the computer is to perform

    It is important that the programmer understand how a language evaluates expressions

    Things to consider:

    Precedence and associativity

    Order of operand evaluation

    Side-effects of evaluation

    Overloadings and coercions


    Expressions

    Precedence and Associativity

    We always learn these rules for any new language

    Vital to using expressions correctly

    Most languages have similar precedence for the standard operators: * / then + -

    But programmer needs to understand precedence and associativity for all operators, especially those that may be unusual


    Expressions

    Ex: boolean and relational operators

    and or not < > = != ==

    In Pascal, the boolean operators have higher precedence than the relational operators (opposite of C++)

    if x < y then writeln('Less');

    if x < y and y < z then writeln('Middle');

    Above is an error in Pascal, since the first sub-expression evaluated would be y and y

    if (x < y) and (y < z) then writeln('Middle');

    Now it is ok

    In C++: if (x < y && y < z) cout


    Expressions

    Ex: unary ++ and -- in C++

    Precedence and associativity are wacky!

    #include <iostream>

    using namespace std;

    int main()

    {

    unsigned int i1 = 0, i2, i3, i4, i5, j, k, m1, m2, m3, m4, m5;

    j = i1++; k = ++i1;

    cout


    Expressions

    Output? See plusplus.cpp and try it on different platforms

    http://www.cppreference.com/operator_precedence.html

    See problem in Assignment 3

    Compare to plusplus.java and plusplus.pl


    Expressions

    In some cases, an expression is ambiguous and the compiler will not let you do it, or will warn you about it

    Ex: A ** B ** C in Ada

    Must have parentheses

    Ex: Mixing bitwise operators in C++

    Warning to use parentheses

    Sometimes you could probably figure it out, but you're better off not trying

    Ex: If more than one coercion can occur in C++

    May have defined constructor and conversion fn


    Expressions

    Sometimes you don't think you should care about precedence and associativity, but you should

    In math, addition and multiplication are associative and commutative

    On computer, overflow can cause this to not always be the case:

    floats A = 1e+30, B = 1.0/1e+30, C = 1e+30

    A * B * C ~= 1e+30

    A * C * B = infinity

    see Overflow.cpp

    F1.add(F2); F2.add(F1)

    -- If F1 and F2 are from different classes, the operations may be different or perhaps not even legal


    Expressions

    Side-effects can also cause evaluation order problems

    Expressions can involve function calls, which can change variable values

    Y = f(X) + X;

    Y = X + f(X);

    Without side-effects, the results are the same, but if f(X) changes the value of X, the results could be different

    Most languages allow reference parameters with functions

    These can cause logic errors if used improperly

    See side.cpp


    Expressions

    How to handle this?

    Leave it up to the programmer, as in Pascal and C++

    Limits compiler optimizations, some of which may include reordering of operations

    Compiler cannot reorder if it could possibly change result

    Do not allow (most) side-effects to occur, as in Ada

    Ada functions cannot change parameters

    Now optimizations can reorder expressions without changing result (at least due to this)

    Best advice is to program in such a way as to either avoid all side-effects, or to only allow them in cases where they will not affect expression evaluation


    Expressions

    Operator Overloading

    Used in many newer high-level languages

    Can be good and bad

    Good:

    Aids in readability and simplifies code if used correctly

    Ex: New class Complex variables A, B and C

    A + B + C is more clear than (A.add(B)).add(C)

    Ex: String variables can be compared

    if (A < B)

    is clearer than

    if (A.compareTo(B) < 0)


    Expressions

    Bad:

    Can harm readability if used incorrectly

    Ex: + defined to do multiplication

    But methods could be improperly named as well

    Function calls are not obvious, especially if other versions of the function exist

    In C++ we could have a member function + and also a friend function +; which is used?

    Can allow some logic errors to go undetected

    Ex: C++ uses / for float and integer division

    If user expects a value between 0 and 1, it's not going to happen if integer division is used


    Expressions

    Some languages like C++ and Ada allow

    programmer-defined operator overloading

    Others like Java do not

    Both positions have support


    Expressions

    Coercion and conversion

    In many expressions we use more than one datatype: mixed expressions

    This seems a reasonable thing to allow

    However, often the operators and functions used are defined for only a single type

    In this case, to allow mixed expressions to be used, some types must be converted to other types

    The differences in languages are whether these conversions should be IMPLICIT or EXPLICIT


    Expressions

    Explicit conversion

    In this case the language allows few or no mixed expressions in the code

    To allow mixing of data types, the programmer must convert through an operation or function call

    Ex: Ada does not even allow mixing of floats and integers

    Good:

    Everything is clear: no uncertainty or ambiguity

    Programmer can more easily verify correctness of programs

    Easier to avoid logic errors


    Expressions

    Bad:

    Makes language very wordy

    Can be annoying, especially when the types are similar (ex. addition of integers and floats)

    Implicit conversion (coercion)

    In this case mixed expressions are allowed, and the language coerces types where needed to allow types to match

    Usually a language has some rules by which the coercions are performed

    Good:

    Less wordy: makes programs shorter and sometimes easier to write


    Expressions

    Bad:

    Programs are harder to verify for correctness

    It is not always clear which coercion is being done, especially when programmer-defined coercions are allowed

    Can lead to logic errors in programs

    Ex: In C++ expressions are always coerced if they can be

    Standard rules of promotion for predefined types can be easily remembered

    However, programmer can also define functions that will be used for coercion

    Constructors for classes and conversion functions are both implicitly called if necessary

    Now the rules are less clear and can lead to ambiguity and logic errors


    Expressions

    Consider A = B + C where A, B and C are all of different types

    Any/all of the following could exist:

    + operator with two type B arguments

    + operator with two type C arguments

    Constructor for type B with argument type C

    Constructor for type C with argument type B

    Coercion function from C to B

    Coercion function from B to C

    Constructor for type A with argument type B

    Constructor for type A with argument type C

    How does programmer know which will be used?

    Should NOT assume any particular coercion will occur in this case

    Here explicit coercion should be used to remove ambiguity

    See coercion.cpp and rational.h


    Expressions

    Boolean expressions

    Expressions that evaluate to TRUE or FALSE

    Formed using relational operators and boolean operators

    Relational operators: operators which compare values

    Operands can be most primitive types and complex types as well in some cases

    Boolean operators: operators used to combine boolean results

    Operands must be boolean values

    Exception is C/C++


    Expressions

    Short-Circuit Evaluation

    Important note (that we may not have emphasized earlier):

    Operator precedence and associativity are for OPERATORS, not OPERANDS

    The operators simply indicate how the operands are combined/utilized, NOT the order in which they are accessed/determined

    For example: A + B + C + D

    We know we first add A and B, then add C, then add D

    But the VALUES for A, B, C and D could be obtained in ANY ORDER

    Done to optimize execution (ex. in parallel)


    Expressions

    This is significant in (at least) 2 situations:

    1) Operand evaluation produces a side-effect that changes result of subsequent operand evaluation

    As we discussed previously, operand could be a function call with a reference parameter

    Operand could be used/modified more than once, as with ++ example

    2) An operand may not even be valid if a previous operand evaluates in a certain way

    Ex: if ((X != 0) && (Y/X < 1)) cout


    Expressions

    Idea of SSE is simple:

    Evaluate boolean expressions only until a final answer can be determined

    For example with &&, we know that FALSE && ANYTHING == FALSE

    so we would not get the division by zero error

    SSE is nice because it makes our code simpler

    If we know compiler uses SSE, we can put into a single expression what otherwise would require two


    Expressions

    Ex: if ((X != 0) && (Y/X < 1)) cout


    Expressions

    Solution is to offer programmer the choice

    Ada uses arbitrary evaluation of operands normally

    But special operators "and then" and "or else" provide short-circuit evaluation if desired

    C++ and Java use SSE for && and || but arbitrary evaluation for bitwise & and |


    Expressions

    Assignment

    Central to Imperative Languages

    Gives a value to a variable

    Typical syntax:

    Semantics:

    1) Compute lvalue of variable

    2) Compute rvalue of expression

    3) Store computed rvalue in lvalue location


    Expressions

    Variations

    Some languages allow multiple targets

    C++ and Java allow conditional targets

    Wacky ?: operator

    C, C++ and Java have many assignment variations for convenience

    Ex: ++, +=, *=

    C, C++ and Java return the rvalue as operation result

    Allows assignment to be mixed within other expressions

    As with many features from C, C++, this is both good and bad


    Expressions

    Allows shorter code in cases such as:

    A = B = C

    while ((ch = getchar()) != EOF)

    Since it is changing the value of a variable, order of evaluation is critical

    Typically associates right to left, and it is a good idea to parenthesize (as above)

    Famous C/C++ bug that we mentioned before: if (x = y) is wacky!

    Will ALWAYS be true if y is non-zero

    Will ALWAYS be false if y is zero

    Newer compilers warn you about it

    Not possible in Java since if requires a boolean

    Concern also must be given for overloading the assignment operator (legal in C++ and Ada)

    It is possible to cause it to behave differently from what is normally expected

    Care has to be taken so that it works in all cases


    Expressions

    Ex: Overloading = for a linked list variable

    LList A, B;

    // Fill B with various nodes

    A = B;

    If we want to use this assignment as with other assignments, we need to return the assigned result as the result of the assignment

    In C++ this is typically a reference return value, so that we can cascade the operator effectively

    A = (B = C); (A = B) = C;

    On the left, when the assignment B = C is finished, we need the rvalue of the result

    On the right, when the assignment A = B is finished, we need the lvalue of the result

    Reference allows both (even though right seems silly to do)

    Also, how about A = A;

    If we destroy old LL before assigning new one, this could destroy the value


    Expressions

    One issue that you may not normally consider: How is the rvalue evaluated?

    For statically typed languages, there is usually no ambiguity: expression result type must match the type of the variable

    But for dynamically typed languages, it is no longer clear

    Ex: in Prolog

    A = 5 + 3

    Since A is not necessarily an integer, 5 + 3 could be taken as a string just as reasonably as it could be taken as an arithmetic expression

    See assig.pl


    Control Statements

    Primary types of control in imperative

    languages

    Selection

    Choose between 1 or more different actions

    Iteration

    Repeat an action 0 or more times


    Control Statements

    Selection

    One-way selection

    if statement exists in virtually every imperative language

    Idea here is that we either execute a statement or do not

    In modern languages this is achieved using an if without the optional else

    Two-way selection

    Now we incorporate the else with the if


    Control Statements

    Typical syntax:

    if

    else

    Interesting issues:

    1) Form of condition?

    2) What kinds of statements are allowed?

    3) Is nesting allowed and how is it interpreted?


    Control Statements

    1) Form of condition

    Most languages require a boolean expression (true or false only)

    C/C++ are exceptions: int values are allowed

    2) Kinds of statements

    Original FORTRAN and BASIC allowed only a single statement

    This is not conducive to good programming techniques

    Only way to have multiple statements is by using an unconditional branch, i.e. GO TO


    Control Statements

    ALGOL 60 introduced the compound statement

    Now an arbitrary number of statements can be used

    All newer imperative languages (and updates of older languages) either use compound statements or allow multiple statements within the if

    3) Nesting

    It logically follows that a statement within an if clause or else clause could be another if statement

    Remember orthogonality

    What issues occur in this case?


    Control Statements

    Only problem of interest is one we have already discussed

    If the number of if clauses and else clauses are not equal, how are they associated?

    There are two main approaches to handling this:

    1) Use a rule (static semantics) to determine how this is handled

    This is the approach taken in Pascal, C, C++ and Java

    System handles the rule consistently, so there is no ambiguity, but, like rules of precedence and associativity, the programmer could forget it or make a mistake that is not caught

    Can lead to logic errors

    We have already seen this example


    Control Statements

    2) Use syntax to determine how it is handled

    This is the approach taken in Ada, BASIC, Modula-2, ALGOL 68

    Every if statement must be syntactically terminated (ex: end if)

    Now an inner if clause without an else clause must still have an end if, and syntactically the outer else can only be associated with the outer if

    Perl has a slightly different approach: the statement for an if MUST be a compound statement. Result is the same, since the inner if will now be within a compound statement


    Control Statements

    Multiple Selection

    Idea is to choose from many possible options

    Clearly one way of doing this is through nested if statements

    Often preferable, especially if the means of selection is a series of separate boolean expressions

    // Break tie for A and B in some sport
    if (A beat B twice) then
        A wins tie
    else if (B beat A twice) then
        B wins tie
    else if (A scored more points than B) then
        A wins tie
    else if (B scored more points than A) then
        B wins tie


    Control Statements

    However, in some situations, the options are based on different result values of a single expression:

    Ex: Menu in which user chooses an option from 1 to 5; each option causes a different action

    In these instances, nested ifs could be used

    In fact these are all we really need

    But the nesting gets complicated, often making the statements harder to follow and making them more prone to logic errors

    So many languages supply a case statement

    Specifically designed for multiple alternative selection based on different results of a single expression


    Control Statements

    There are some interesting issues to consider here

    Many are the same as for two-way selection

    Text discusses them at length

    A few that we will look at

    What happens after the code for the matched selection is executed?

    One option is to break out of the structure, continuing with the next statement after it

    This makes each option mutually exclusive

    This approach is taken by Algol W, Pascal, Ada

    Probably the most intuitive idea: the choices are mutually exclusive by default


    Control Statements

    C, C++ and Java do not automatically break out after the selection has been executed

    This is good and bad (as usual)

    Adds flexibility

    If the execution for one selection is a superset of another, it makes sense to allow the flow to continue within the selection statement

    Causes potential logic problems

    Programmer must manually add breaks

    If one is missed no syntax error occurs

    What happens if no match is found?

    Two logical alternatives:

    1. Do nothing

    2. Error


    Control Statements

    C, C++, Java adopt the do-nothing approach

    Seems logical that if nothing matches nothing should be done

    ANSI Standard Pascal and Ada adopt the error approach

    More reliable, since now an accidental out of range value will be detected as an error rather than just a do nothing

    C, C++, Java, Ada, Turbo Pascal, BASIC also provide a default choice

    Good idea to always use so you can detect an out of range value without causing a runtime or logic error


    Control Statements

    Iteration

    Three primary types of iterative loops: conditional loops, counting loops and arbitrary loops

    1) Conditional (logically controlled) loops

    Number of iterations is determined by a boolean condition, and cannot (usually) be precalculated

    ex: while (infile && valid == 1)

    Note that we cannot predict when this condition will become false


    Control Statements

    Many languages have two versions of the conditional loop

    Pretest: condition is tested prior to entering the loop body

    May execute loop body 0 times

    Posttest: condition is tested immediately after executing loop body

    Will always execute loop body at least 1 time

    Ada does not have this version

    Two versions are provided for convenience: we can always simulate one loop with the other (plus some conditionals)

    See loops.cpp

    Clearly the difference is where each is more appropriate


    Control Statements

    Conditional loops are the most general kind of loops, and are really all that is needed in an imperative programming language

    However, many looping applications deal with arrays and sequences of values

    For convenience and efficiency it is prudent to provide a looping structure geared toward these applications

    2) Counting Loops (counter-controlled loops)

    Number of iterations determined by a control variable, an initial value, a terminal value, and an increment


    Control Statements

    We can (usually) precalculate the number of iterations based on the initial value, terminal value and increment

    Ex: for (int i = 3; i


    Control Statements

    Machine can use a register for the iteration count and not have to worry about obtaining operands for the comparisons at each iteration of the loop, something that must be done with a conditional loop

    To allow precalculation and iteration counts to work, some restrictions must be made on the loop

    Loop control variable cannot be altered by the programmer within the loop body

    Terminal value must be calculated only one time, when loop is first entered

    It will also speed things up if the loop control variable is an integer (or integral type) so no float operations are necessary

    This is the approach taken in Pascal and Ada

    See for.p


    Control Structures

    Pascal and Ada also do not allow an increment other than 1 or -1, and do not carry the value of the control variable past the end of the loop

    In Pascal, the value is officially undefined, but in any Pascal implementation it will typically be one of two things: 1) The terminal value of the loop or 2) The terminal value + 1 or - 1. 1) typically indicates that iteration counts are being used

    In Ada, the loop control variable is implicitly declared in the loop header, and becomes really undefined at the end of the loop: accessing it afterward will cause an undeclared variable error

    This is now generally accepted as a good idea, since it reduces side-effect problems of using loop control variables that were declared and assigned elsewhere. C++ and Java both allow (but do not require) this as well


    Control Structures

    Attitude in Pascal and Ada is that if you want more complex iteration (ex. increment other than 1 or -1, option of changing number of iterations during the loop's execution) you should use a while loop

    C, C++ and Java have a different approach

    For loop is not really a for loop in the traditional sense

    It is a very general loop that can be used for any looping application

    It more appropriately is a while loop with the addition of an initialization-statement and a post-body statement


    Control Statements

    for (init-expr; pretest-expr; post-body-expr)

    Now really anything goes and the pretest-expr and post-body-expr are evaluated for each iteration of the loop

    Can certainly be used for a counting loop, as most of you have used it

    Can also be used as an arbitrary loop to do more or less whatever programmer wants it to do

    Added flexibility, with added danger

    The usual for C, C++

    see for.cpp


    "foreach" loop

    Newer languages also have included a

    "foreach" loop to iterate through data

    Key difference between "for" and "foreach"

    "for" iterates through indexes (typically),

    which can be used to access an array / collection if desired

    Loop control variable is typically an integer

    "foreach" iterates through the values in the collection directly

    No indexing is used, at least not directly

    Loop control variable is the data type we are accessing in the collection


    "foreach" loop

    foreach loop has its advantages and disadvantages

    Advantages:

    Since no counter is used, we eliminate the possibility of index out of bounds problems

    We can iterate over a collection without having to know the implementation details of the collection

    Allows for data hiding and improves error prevention

    We will likely discuss this more when we discuss object-oriented programming


    "foreach" loop

    Disadvantage

    When accessing an array, we may want or need the index value

    Ex: What if we want to change the data in the array or reorganize it

    Ex: Sorting would be difficult using "foreach"

    See forEach.java and foreach.pl


    Control Statements

    3) Arbitrary Loops

    Now the loop is basically an infinite loop, with the programmer expected to break out of it explicitly at some point

    Ada allows this with the

    loop

    end loop;

    exit statement will break out of the loop, and can be put into an if statement

    Thus we can break out of the loop from more than one place


    Control Statements

    Although C, C++ and Java do not explicitly have this construct, you can certainly build it by making a while or for loop an infinite loop and using the break statement to break out

    while (1) // C          while (true) // Java
    {                       {
    }                       }

    Again this feature adds flexibility, but makes code less readable and harder to debug


    Control Statements

    Unconditional Branching

    Transfer execution from one section of code to another section of code

    Commonly known as the goto

    Used extensively in early languages which lacked block control structures

    Ex. early FORTRAN and BASIC programs relied heavily on the goto

    It was necessary then, but most modern languages contain block control structures


    Control Statements

    Even then computer scientists were aware of how problematic gotos could be

    Spaghetti code that results is very difficult to read

    Modification of one code segment can significantly impact many parts of the program: programmer must be aware of all places that can go to that code segment

    Debugging is very difficult: it is hard to find and fix logic errors since all possible execution paths are difficult to trace

    Now languages have blocks and extensive control structures

    It has been shown that goto adds no functionality (i.e. nothing can be done with it that cannot be done without it)

    However, many languages still have goto


    Control Statements

    Unrestricted goto allows code segments that normally have only one entry and exit point to have many

    Ex: What happens if you jump into the middle of a procedure (what about parameters?) or a while loop (condition is skipped)

    Most newer languages that have the goto have restrictions on it

    Ex: Cannot jump into an inactive statement or block in Pascal

    If restricted and used infrequently, can actually be useful in some languages

    Ex: Pascal does not have a break statement. If an exceptional situation would cause an exit from a loop, using a goto may be more readable than adding extra convoluted logic


    Control Statements

    Some (newer) languages do not have goto at all

    Ex: Java

    Allows breaks from loops

    Has exception handlers


    Subprograms

    Subprograms

    Semi-independent blocks of code with the following basic characteristics:

    Only one entry point (the beginning of the subprogram), and execute when called:

    Parameter information is passed to subprogram

    Caller execution is temporarily suspended, and subprogram executes

    When subprogram terminates, caller execution resumes at point directly following the subprogram call


    Subprograms

    What types of subprograms can we have?

    Most languages have two different types, procedures and functions

    Procedures can be thought of as new named statements that can supplement the predefined statements in the language

    Ex: Statements to search or sort an array

    Once defined, these can be used anywhere they are needed in a program


    Subprograms

    In order to have an effect on the overall program, a procedure needs to act on something other than just the variables local to the procedure. This can be done through:

    Outputting data to the display or to a file

    Altering a (relatively) global variable that will be accessed/used later by a different part of the program

    Altering formal parameters such that the actual parameters in the caller are modified

    This will be discussed in more detail soon


    Subprograms

    Functions can be thought of as code segments that calculate and return a single result

    Modeled after math functions

    Used within expressions, where result value is substituted for the call

    The effect of functions on the overall program is the value returned by them. Thus, from an ideal (and mathematical) point of view, functions should have NO OTHER effect on the overall program


    Subprograms

    Should NOT modify global variables

    Should NOT alter actual parameters

    Naturally, both of the above are allowed in many languages

    In these cases it is up to the programmer to decide how he/she wants to use functions

    Again the tradeoff for the increased flexibility is more potential for logic errors and more difficulty in debugging

    C/C++/Java

    Only have functions, no procedures

    void functions can mimic the behavior of procedures


    Subprograms

    Local variables

    How/when are they allocated?

    Stack-dynamic:

    Default in most modern imperative languages

    Required for recursive calls, since memory must be associated with each call, not each subprogram

    Ex: Binary Search

    mid = (left + right)/2;

    Many different values for mid must be able to coexist, one for each call on the run-time stack

    Could not do it if memory was statically allocated
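    A minimal C++ sketch of the binary search idea, assuming a sorted vector of ints. The function name and signature are illustrative assumptions, not taken from the course files. Each active call has its own left, right and mid in its own activation record.

```cpp
#include <vector>

// Recursive binary search over a sorted vector (hypothetical helper).
// Returns the index of key, or -1 if key is not present.
int binarySearch(const std::vector<int>& data, int key, int left, int right)
{
    if (left > right)
        return -1;                 // empty range: key is not present
    int mid = (left + right) / 2;  // stack-dynamic: one mid per active call
    if (data[mid] == key)
        return mid;
    if (data[mid] < key)
        return binarySearch(data, key, mid + 1, right);
    return binarySearch(data, key, left, mid - 1);
}
```

    A call such as binarySearch(v, 7, 0, v.size() - 1) starts the search over the whole vector. If mid were a single static location, nested calls would overwrite each other's value.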


    Subprograms

    Overhead is time for allocation and deallocation each time a subprogram is called

    May not seem like a lot of time is needed, but it can add up if many calls are made in a program

    Access must be indirect since actual memory location of variable will not be known until a subprogram call is made

    Location in run-time stack depends upon calls made prior to current one, which can differ from run to run

    Also adds some time overhead

    Static:

    Used in languages that do not support recursion (ex. older FORTRAN)


    Subprograms

    Also optional in other languages, such as C and C++

    Allow variables to retain values from call to call

    Remember the lifetime is the duration of the program

    Ex: In CS1501 LZW algorithm writing codewords to a file, the bit buffer is static

    The leftover bits are kept in the buffer for the next call
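    A small C++ sketch of a static local retaining its value from call to call, next to the same logic with a stack-dynamic local. Both function names are made up for illustration; this is not the CS1501 bit buffer code.

```cpp
// Returns 1, 2, 3, ... on successive calls.
// 'calls' is static: allocated once, lifetime is the whole program.
int nextTicket()
{
    static int calls = 0;   // initialized only once
    calls = calls + 1;
    return calls;
}

// Same logic with a stack-dynamic local: the value is lost after each call.
int nextTicketStackDynamic()
{
    int calls = 0;          // re-created and re-initialized on every call
    calls = calls + 1;
    return calls;
}
```

    The static version keeps counting across calls, while the stack-dynamic version returns 1 every time.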


    Subprograms

    Parameters

    Parameters are vital to subprograms

    Allow information to be:

    Passed IN to the subprogram

    Passed OUT from the subprogram

    Passed IN and OUT to and from the subprogram

    When writing subprograms, programmer decides which is required for a given subprogram


    Subprograms

    Then programmer utilizes syntax/rules in language being used to achieve the desired option

    Sometimes the syntax/rules of the language do not fit exactly with the 3 use options given

    In these cases programmer must be careful to use the parameters as he/she intends

    Some definitions:

    Formal Parameter:

    Parameter specified in the subprogram header

    Only exists during subprogram execution

    Sometimes called just "parameter"


    Subprograms

    Actual Parameter:

    Parameter specified in call of the subprogram

    May exist outside of the scope of the procedure

    Sometimes called just "argument"

    Rules for Formal and Actual parameters differ, as we will discuss


    Subprograms

    Parameter Passing Options

    Pass-by-Value

    Pass-by-Reference

    Pass-by-Result

    Pass-by-Value-Result

    Pass-by-Name

    You should be familiar with Pass-by-Value and Pass-by-Reference

    Others may be new to you

    We'll discuss each


    Subprograms

    Pass-by-Value

    Formal parameter is a copy of the actual parameter

    i.e. get r-value of actual parameter and copy it into the formal parameter

    Default in many imperative languages

    Only kind used in C and Java

    Used for IN parameter passing

    Actual can typically be a variable, constant or expression


    Subprograms

    Benefit is that actual parameters cannot be altered through manipulation of the formals

    Also useful in some recursive calls, since a new copy is made with each call

    Problem is that copying a parameter can be quite expensive, both in terms of time and memory

    Ex: Consider an object with an array of 1000 floats

    Object is copied with each call to the function

    If, for example, recursive calls are made, a lot of memory can be consumed very quickly


    Subprograms

    Implementation:

    Using a run-time stack, this is straightforward

    When subprogram is called, copy of actual parameter is placed into a local variable, which is stored on the run-time stack (in the activation record for the subprogram)

    During subprogram execution, formal parameter is used like any other local variable for the subprogram

    Only difference is that it is initialized via the actual parameter
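    A short C++ sketch of pass-by-value: the formal is a local copy initialized from the actual, so changing it has no effect in the caller. The function names are illustrative only.

```cpp
// The formal parameter n is a local copy initialized from the actual.
int doubled(int n)
{
    n = n * 2;      // changes only the local copy
    return n;
}

// Calls doubled() and reports what the caller's variable holds afterwards.
int callerValueAfterCall()
{
    int actual = 21;
    doubled(actual);   // return value deliberately ignored
    return actual;     // the actual was never touched
}
```

    Because only the r-value is needed, the actual can also be a constant or an expression, as in doubled(3 + 4).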


    Subprograms

    Pass-by-Reference

    Formal parameter is a reference to (or address of) the actual parameter variable

    i.e. get l-value of actual param and copy it into the formal param, then access the actual param indirectly through the formal param

    Used in Pascal (var parameters), in C (using explicit pointers) and C++ and PHP (&)

    Most appropriate for IN and OUT parameter passing, but can be used for all

    Actual param usually restricted to a variable


    Subprograms

    Benefit is that we can change or not change the actual parameter using the formal; it is up to the programmer

    Also good that memory is saved: only an address is copied

    Problem is that we can miss logic errors if we accidentally alter an actual parameter through the formal parameter

    Also some applications (ex: some recursion) don't work as well

    We may not want change at one call to affect another call
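    A minimal C++ sketch of the two reference styles mentioned above: a C++ reference parameter (&) and a C-style explicit pointer. Function names are illustrative.

```cpp
// C++ reference parameter: n is another name for the caller's variable.
void incrementByRef(int& n)
{
    n = n + 1;
}

// C style: the address is passed explicitly and dereferenced inside.
void incrementByPtr(int* n)
{
    *n = *n + 1;
}

// Both calls change the caller's x through its address.
int demoReference()
{
    int x = 5;
    incrementByRef(x);    // x is now 6
    incrementByPtr(&x);   // x is now 7
    return x;
}
```

    In both versions only an address is copied, and the actual must be something with an l-value (a variable), not an arbitrary expression.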


    Subprograms

    Constant Reference Parameters

    Developers of C++ realized that value parameters are not practical for large data objects (too much time and memory, esp. for recursive algorithms)

    Reference parameters have danger of accidental side effects (when used for IN parameters)

    Solution is to pass parameters by reference, but not allow them to be altered: constant reference

    Now compiler gives error if parameter is changed within subprogram

    Copy made if passed by reference to another sub
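    A brief C++ sketch of a constant reference parameter, assuming a vector of doubles as the "large object". The function name is hypothetical. No copy of the vector is made, and the commented-out assignment would be rejected by the compiler.

```cpp
#include <vector>

// data is passed by constant reference: only an address is passed,
// and the function is not allowed to modify the caller's vector.
double average(const std::vector<double>& data)
{
    if (data.empty())
        return 0.0;
    double sum = 0.0;
    for (double value : data)
        sum += value;
    // data[0] = 0.0;   // would not compile: data is const here
    return sum / data.size();
}
```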


    Subprograms

    Good concept, but not perfect

    Programmer can get around it by casting to a pointer and altering indirectly

    See params.cpp

    Ada IN parameters have a similar idea

    Cannot be assigned/altered within the function

    Cannot be passed by out or in out to another sub

    More on Ada params shortly

    Implementation:

    Using run-time stack, address of actual is stored in activation record

    Actual is accessed indirectly in sub through its address


    Subprograms

    Pass-by-Result

    Reference parameters are not an exact fit for out parameters

    Ex: A procedure designed to read data from a file into an object

    Here we don't care about what used to be in the object; we just want to be sure that at the end the appropriate value is assigned

    With reference parameters we COULD access the old value and use it if we wanted to (or by mistake)

    Pass-by-Result prevents this


    Subprograms

    In Pass-by-Result, actual parameter is not actually passed to the subprogram; it only waits to have a value passed back to it

    Formal parameter is a local variable

    During life of subprogram its value does not affect actual parameter at all

    At end of subprogram its value is passed back to the actual parameter

    So what is actually needed of actual parameter is its address (l-value)

    When address is obtained can affect result for some contrived examples


    Subprograms

    // Note: This is NOT real code
    int A[8];
    for (int i = 0; i < 8; i++) A[i] = i;
    global int j = 2;
    foo(A[j]);
    output(A[]);

    sub foo(int param)
    {
        int temp = 25;
        j = 5;
        param = temp;
    }
    ------------------------------------------------
    Output: 0 1 25 3 4 5 6 7 // if address obtained at call

    Output: 0 1 2 3 4 25 6 7 // if address obtained at return
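    C++ has no pass-by-result, so the following is only a simulation of the pseudocode above under stated assumptions: the copy-out step is written by hand, once with the address of A[j] taken before the body runs and once with it taken afterwards. All function names are hypothetical.

```cpp
#include <array>

int j = 2;   // global index, as in the pseudocode

// Body of foo: sets the global j and returns the final value of the formal.
int fooBody()
{
    int temp = 25;
    j = 5;
    return temp;
}

// Builds the array 0..7 used by both variants.
std::array<int, 8> makeArray()
{
    std::array<int, 8> A{};
    for (int i = 0; i < 8; i++) A[i] = i;
    return A;
}

// Variant 1: l-value of A[j] is fixed at the call (j == 2).
std::array<int, 8> addressAtCall()
{
    std::array<int, 8> A = makeArray();
    j = 2;
    int* target = &A[j];
    *target = fooBody();     // copy-out goes to A[2]
    return A;
}

// Variant 2: l-value of A[j] is computed at return (j == 5 by then).
std::array<int, 8> addressAtReturn()
{
    std::array<int, 8> A = makeArray();
    j = 2;
    int result = fooBody();  // body runs first and sets j = 5
    A[j] = result;           // copy-out goes to A[5]
    return A;
}
```

    The two variants reproduce the two output lines shown above, which is why a language definition has to say when the address is obtained.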


    Subprograms

    If used, address is typically obtained at call

    Ada 83 out parameters for simple types are ALMOST this, but the formal parameter value cannot be accessed within the sub (so it is not really a local variable)

    Ada 95 changed out parameters to allow them to be accessed, fitting the Pass-By-Result model more closely

    Implementation:

    At sub call, actual param address is calculated and stored in run-time stack, as is the formal param (as a local)

    Final result of formal is copied back to actual address at end of sub


    Subprograms

    Pass-by-Value-Result

    Now the actual parameter's value is passed to the formal parameter when subprogram is called, being stored and used as a local variable

    At the end of the subprogram the value is passed back to the actual parameter

    As the name indicates, this is a combination of Pass-by-Value and Pass-by-Result

    Used for IN and OUT parameters


    Subprograms

    If aliasing is NOT allowed/used, and if no exceptions occur in the subprogram, the effect of value-result and reference is the same

    Precondition: Actual parameter has value obtained previous to call

    During subprogram: Only formal parameter is accessed, updated as desired

    Postcondition: Actual parameter has last value assigned within subprogram
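    To see why aliasing matters, here is a hedged C++ sketch. Value-result is not a C++ feature, so it is simulated by hand with copy-in and copy-out through pointers; the function names are made up for illustration. The same variable is passed for both parameters.

```cpp
// By reference: both formals are the same memory location as the actual.
void bumpBothRef(int& a, int& b)
{
    a = a + 1;
    b = b + 1;
}

// Simulated value-result: copy in, work on locals, copy out at the end.
void bumpBothValueResult(int* actualA, int* actualB)
{
    int a = *actualA;   // copy in
    int b = *actualB;
    a = a + 1;
    b = b + 1;
    *actualA = a;       // copy out
    *actualB = b;
}

int aliasedByReference()
{
    int x = 10;
    bumpBothRef(x, x);             // x incremented twice
    return x;
}

int aliasedByValueResult()
{
    int x = 10;
    bumpBothValueResult(&x, &x);   // each local copy incremented once
    return x;
}
```

    With the alias the two mechanisms give different answers (12 versus 11); without the alias they would agree.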


    Subprograms

    Idea is that language creators did not want to require the params to be passed in any specific way

    They just wanted to require the in-out effect

    If the result could differ based on whether params are value-result or reference, then the program is erroneous

    Up to programmer to NOT use aliases

    Ada 95 clarified, requiring all structured in-out parameters to be reference

    See params.adb

    Implementation:

    Value + Result


    Subprograms

    Pass-by-Name

    Definitely wackiest way of param passing

    Used for IN and OUT parameters, and only in Algol

    Idea is that actual parameter is textually substituted for the formal in all places that it is accessed in the subprogram

    Kind of like a macro substitution

    It is only evaluated at the point of use in the subprogram

    Evaluated EACH TIME it is used in subprogram


    Subprograms

    Thus the parameter value or address could change based on where/when in the subprogram it is evaluated

    However, the referencing environment used is that of the CALLER, not of the subprogram

    So only changes within the subprogram that have a global effect will change its evaluation

    This also makes implementation more difficult

    For simple variables this is equivalent to pass-by-reference

    Variable address evaluates the same way regardless of where in the subprogram it is located


    Subprograms

    global int i = 0, var = 11, n = 5;
    global int A[2] = {4, 8};

    foo(var, 2*n, A[i]); // all pass by name

    void foo(int x, int y, int z)
    {
        x = x + 1; output(var);
        output(y); n = n + 1; output(y);
        output(z); z = z + 1; output(z);
        i = i + 1; z = z + 1; output(z);
    }


    Subprograms

    Implementation:

    It is not trivial to allow macro to be evaluated and reevaluated in environment of the caller

    Parameterless subprograms called thunks are used

    Thunk evaluates parameter in current state of caller's referencing environment

    Returns the resulting address or value

    Clearly this is a lot of overhead

    Overhead and confusing results are why this is not used in newer languages
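    The thunk idea can be imitated in C++ with lambdas. The sketch below replays the earlier foo(var, 2*n, A[i]) example under that assumption: each by-name parameter becomes a pair of thunks (one for the value, one for the address) that are re-evaluated at every use. The ByName struct and runExample function are hypothetical names, and the values that output() would print are collected in a vector instead.

```cpp
#include <functional>
#include <vector>

// Globals from the pass-by-name example.
int i = 0, var = 11, n = 5;
int A[2] = {4, 8};

// A by-name parameter: thunks evaluated in the caller's environment.
struct ByName {
    std::function<int()>  value;    // r-value thunk
    std::function<int*()> address;  // l-value thunk (nullptr if not assignable)
};

std::vector<int> foo(ByName x, ByName y, ByName z)
{
    std::vector<int> out;                          // what output() would print
    *x.address() = x.value() + 1;   out.push_back(var);
    out.push_back(y.value());   n = n + 1;   out.push_back(y.value());
    out.push_back(z.value());   *z.address() = z.value() + 1;   out.push_back(z.value());
    i = i + 1;   *z.address() = z.value() + 1;   out.push_back(z.value());
    return out;
}

std::vector<int> runExample()
{
    i = 0; var = 11; n = 5; A[0] = 4; A[1] = 8;    // reset so the run is repeatable
    ByName x{ [] { return var; },   [] { return &var; } };
    ByName y{ [] { return 2 * n; }, [] { return static_cast<int*>(nullptr); } };
    ByName z{ [] { return A[i]; },  [] { return &A[i]; } };
    return foo(x, y, z);
}
```

    Tracing by hand gives 12, then 10 and 12 (2*n before and after n changes), then 4 and 5 (A[0]), then 9 (A[1] after i changes). Note how z refers to a different array element once i is incremented.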


    Subprograms

    Subprograms as Parameters

    We allow variables as parameters so that we can access their values (or addresses) from within a subprogram

    Why not allow subprograms so that we can execute them from within a subprogram?

    Some languages do allow this (ex. Pascal, C++, PHP)

    However, there are some issues to consider


    Subprograms

    Can the parameter subprogram arguments differ in form from each other?

    If so, how to type check and even check the number of arguments when the subprogram is actually called?

    Easiest solution is to require the arguments to all have the same form

    Header of parameter subprogram must be given within the header of the subprogram it is being passed to

    Scope is also an issue: what is the referencing environment of the subprogram that is being passed as a parameter? Three reasonable possibilities exist:


    Subprograms

    1) The referencing environment in which the parameter subprogram is CALLED: shallow binding

    2) The referencing environment in which the parameter subprogram is DEFINED: deep binding

    3) The referencing environment in which the parameter subprogram is PASSED as an argument: ad hoc binding

    Note that shallow binding fits well with dynamic scoping and deep binding fits well with static scoping


    Subprograms

    Pascal and C++ both use deep binding

    Shallow binding is used by SNOBOL, which also uses dynamic scoping

    Ad hoc binding has never been used

    See fnparams.cpp
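    A minimal C++ sketch of a subprogram passed as a parameter, using a plain function pointer. It is not the contents of fnparams.cpp; the names are illustrative. The header of the parameter subprogram (one double in, double out) appears inside the header of applyTwice, which is what lets the compiler type check calls made through f.

```cpp
// f's full header is part of applyTwice's own header.
double applyTwice(double (*f)(double), double x)
{
    return f(f(x));
}

double half(double v)   { return v / 2.0; }
double square(double v) { return v * v; }
```

    Any function with a matching header can be passed, for example applyTwice(half, 8.0) or applyTwice(square, 3.0). A function with a different parameter list would be rejected at compile time.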


    Subprograms

    Overloading (ad hoc polymorphism)

    Using the same subprogram name with different parameter lists

    When a subprogram is called, the compiler selects the correct version based on the parameter lists

    In Ada, return type for a function is also used, since coercion is not done in Ada and function return values cannot be ignored

    Enables programmer to use the same name for similar functions that take different argument types

    Subprograms


    Use: Make it easier for the programmer to use consistent names for subprograms

    Without overloading: Programmer must make up different but similar names for subprograms that do similar things but for different types

    Ex: abs(int) fabs(float) labs(long)

    Ex: ISort(int * A) FSort(float * A)

    With overloading: Programmer uses the same name and the compiler decides which to use

    Ex: abs(int) abs(float) abs(long)

    Ex: Sort(int * A) Sort(float * A)
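    A small C++ sketch of overload selection. The describe functions are invented for illustration; each returns the name of the parameter type so it is visible which version the compiler picked from the argument list.

```cpp
#include <string>

// Three subprograms sharing one name; the parameter lists differ.
std::string describe(int)    { return "int"; }
std::string describe(long)   { return "long"; }
std::string describe(double) { return "double"; }
```

    The literal in the call decides the version: describe(3) selects the int version, describe(3L) the long version and describe(3.5) the double version.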

    Subprograms


    But programmer must be careful:

    Ada and C++ both allow overloading and default parameters

    Leaving out some parameters in the call could make a call ambiguous

    i.e. it matches more than one function header

    Call can also be ambiguous if implicit casting of arguments is done

    Operator Overloading is the same idea, but with symbols rather than identifiers

    We discussed these issues previously

    See Slide 12 of cs1621b.ppt


    Generics


    Motivation:

    Programmers often apply data structures and algorithms to more than one data type

    Ex. Sorting, Searching algos

    Ex. BST, PQ, Stack, Queue data structures

    Even with overloading, the programmer must still write different (identical except for type) versions of the code

    Generics simply transfer the job of making the different versions from the programmer to the compiler: this automates the overloading process

    Note that DIFFERENT VERSIONS of the code MUST STILL BE generated

    Generics


    So the reason we have generics is to save the programmer some time (and perhaps some confusion)

    Ada vs. C++:

    In Ada, template instantiations must be explicit

    Programmer specifies template arguments using the new statement

    Ex: package int_io is new integer_io(integer);

    The generic package is integer_io

    The instantiated package is int_io

    The type argument is integer

    As is usual in Ada, if declaration is explicit, there will be no surprises

    Generics


    In C++, template instantiations can be explicit or implicit

    Implicit: generated automatically by the compiler when a call is seen with the appropriate arguments

    Duplicate instantiations are merged into a single code segment

    Coercion cannot be done, since the types won't match the template correctly

    Saves programmer some typing

    Explicit: programmer declares each version

    Coercion can be done using regular C++ promotion and conversion rules

    Programmer is aware of each version

    See template.cpp and tordlist.h
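    A hedged C++ sketch of the implicit/explicit distinction. It is not the code in template.cpp or tordlist.h; the function name largest is an assumption for illustration.

```cpp
// One template definition; the compiler generates a version per type used.
template <typename T>
T largest(T a, T b)
{
    return (a < b) ? b : a;
}

// Explicit instantiation: the programmer asks for the long version by name.
template long largest<long>(long, long);
```

    A call such as largest(3, 7) implicitly instantiates the int version and largest(2.5, 1.5) the double version. A mixed call largest(3, 7.5) does not compile, because no single T matches both arguments. Naming the type, as in largest<double>(3, 7.5), lets the int argument be converted by the usual rules.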


    Generics


    However, retrieving objects back from the collection required explicit casting to the actual type if we wanted full access to them

    ArrayList A = new ArrayList();
    A.add(new String("Wacky"));
    String S = (String) A.remove(0);

    Also any typing mistakes (mixing types in the collection unintentionally) could only be caught at run-time (via casting exceptions)

    Overall not bad, but some people thought type parameters should be allowed

    Generics


    JDK 1.5 added syntax very similar to that for C++ templates

    However, it is very different from C++ templates (and Ada generics as well)

    It is not really adding any new generic abilities to the language

    It is not creating new code for each version of the class or method

    It is designed to make collections of objects more type-safe

    See more details in the handout

    Implementing Subprograms


    What is involved when a subprogram is called, during its execution, and when it terminates?

    This will differ depending on if recursion is allowed in a language or not

    Most modern languages allow recursion, but original FORTRAN (up to FORTRAN 77) did not allow it

    Implementing Subprograms


    FORTRAN 77 (and before)

    All variables within a subprogram were static, and recursive calls were not allowed

    Activation records were still used, but they also could be static

    Since all data was static, the size was known at compile time

    Run-time stack not needed, since at most one call per sub could be performed at a time

    What do we need to know when a subprogram is called?

    Implementing Subprograms


    Return Value: If sub is a function

    Local Variables: Static

    Parameters: Like local variables that are initialized

    Return Address: Where to go back to when subprogram ends


    Implementing Subprograms


    So the activation record looks similar to that used in FORTRAN

    With additional link location to access global variables

    Now multiple instances of an activation record can occur at the same time, so they must be created dynamically (at run-time), unlike in FORTRAN

    Let's look at some of the contents of an activation record

    Implementing Subprograms


    Temporaries

    Local Variables

    Parameters

    Dynamic Link to previous call

    Static Link to Non-Locals

    Return Address

    Temps and local variables are allocated within the subprog. call. In Pascal, C and C++, the local variables must be of fixed size. In Ada, they can be variable size (ex. arrays)

    Parameters, links to non-Locals and the return address are placed into the AR by the caller of the subprogram, so they are lower in the record

    Implementing Subprograms


    See rtstack.cpp

    Accessing non-local variables within a subprogram

    Local variables are located within the activation record (AR)

    Can be accessed by knowing the base address of the AR plus a local_offset for each variable

    Ex: Base address of AR = 162

    int x, y[5]; // address of x is 162 + (other AR stuff)
    float z;     // address of z is 162 + (other AR stuff) + 4 + 20

    Implementing Subprograms


    Non-locals are located elsewhere

    For languages like C and C++:

    Subprograms cannot be nested

    Besides locals there are global variables

    For languages like Ada and Pascal:

    Subprograms can be nested to arbitrary depth

    A sub can be declared within a sub, which is within a sub, which is within a sub

    Using static scope, variables declared in a textual parent sub are accessible from an inner sub

    Relative global variables

    But the variable locations could be in different places on the run-time stack

    How to find them?


    Implementing Subprograms


    Two techniques used to locate AR

    1) Static links

    A link is kept in an AR to that AR's textual parent (from the declaration)

    To access a single nonlocal many links may be crossed

    2) Display

    A single array is kept to indicate all of the currently accessible nested subs

    Any nonlocal can be accessed with two indirect accesses


    Implementing Subprograms


    However, textual parent does NOT have to be previous call on run-time stack

    So dynamic link in AR is not enough (but would work for dynamic scoping)

    sub foo
    {
        sub innerA
        { }
        sub innerB
        { innerA; }
        innerB;
    }
    main
    { foo; }

    Run-time stack (top first):

    innerA

    innerB

    foo

    Implementing Subprograms


    Static links connect an AR to the AR of the sub's textual parent, no matter where previously on the RT stack it is

    How is this used to access nonlocal variables?

    Can be determined and maintained based on the nesting depths of the subprograms that are called

    The difference in the nesting depths between the sub using a nonlocal variable and the sub in which the nonlocal is declared is equal to the number of static links that must be crossed to find the correct AR for the variable

    Implementing Subprograms


    This difference can be stored for each variable when the program is compiled, so that at run-time finding the variable is simple

    sub parent {
        var X, Y
        sub child1 {
            var X, Z
            sub grand1 {
                var Z
            }
        }
        sub child2 {
            var Y
            call child1
        }
    }
    main {
        call parent
    }

    If variable Y is accessed within grand1

    chain offset is 2, since Y is declared two levels outside grand1

    so search for Y only has to be done once at compile-time

    at run-time we know to follow two static links, whatever call sequence is

    Implementing Subprograms


    What actually happens when a sub is called?

    AR for textual parent of sub must be located on the run-time stack, so that the static link can be linked to it

    A clear (but inefficient) way to do this is to follow dynamic links down the RTS until the AR for the parent sub is found

    A better way can take advantage of the fact that the calling sub and the called sub must be relatives in the declaration tree

    Calling sub could be parent of called sub (but not grandparent)

    Calling sub could be called sub (direct recursion)

    Calling sub could be a sibling of called sub

    Calling sub could be a descendent of called sub (indirect recursion)

    Calling sub could be a niece of called sub

    Implementing Subprograms


    So instead of following dynamic links, at compile-time we can pre-calculate the number of static links (from caller) to follow to find the appropriate textual parent AR

    Always equal to: nesting_depth(calling sub) - nesting_depth(called sub) + 1

    Calling sub could be parent of called sub

    X - (X+1) + 1 = 0 static links (use caller's AR)

    Calling sub could be called sub (direct recursion)

    X - X + 1 = 1 static link: same textual parent

    Calling sub could be a sibling of called sub

    X - X + 1 = 1 static link: same textual parent

    Calling sub could be a descendent of called sub (indirect recursion)

    Calling sub could be a niece of called sub

    Follow diff. in nesting depth + 1 static links

    Implementing Subprograms

    procedure Bigsub is
      procedure A(Flag: Boolean) is
        procedure B is
          ...
          A(false);
        end; -- B
      begin -- A
        if flag
          then B;
          else C;
      end; -- A
      procedure C is
        procedure D is
          here
        end; -- D
        ...
        D;
      end; -- C
    begin -- Bigsub
      A(true);
    end; -- Bigsub

    Run-time stack when execution reaches "here" (top first):

    D       dynamic link to C
            static link to C
            return addr. to C

    C       dynamic link to A
            static link to Bigsub
            return addr. to A

    A       param flag ( = false)
            dynamic link to B
            static link to Bigsub
            return addr. to B

    B       dynamic link to A
            static link to A
            return addr. to A

    A       param flag ( = true)
            dynamic link to Bigsub
            static link to Bigsub
            return addr. to Bigsub

    Bigsub  dynamic link to caller
            static link
            return addr.
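    The static-link arithmetic can be checked against the Bigsub example with a few lines of C++. This is only a sketch: the depth constants are an assumed numbering (Bigsub at depth 0, A and C at 1, B and D at 2) and the function names are hypothetical.

```cpp
// Assumed nesting depths for the Bigsub example.
const int DEPTH_BIGSUB = 0;
const int DEPTH_A = 1;
const int DEPTH_C = 1;
const int DEPTH_B = 2;
const int DEPTH_D = 2;

// Static links to follow, starting at the caller's AR, to reach the AR
// of the called sub's textual parent.
int linksToParentAR(int callerDepth, int calledDepth)
{
    return callerDepth - calledDepth + 1;
}

// Static links to follow to reach the AR holding a nonlocal variable.
int chainOffset(int usingDepth, int declaringDepth)
{
    return usingDepth - declaringDepth;
}
```

    For the call from B to A the result is 2 (B to A to Bigsub), which matches the "static link to Bigsub" entry in the second AR for A. For the call from A to C it is 1, and for C calling D it is 0, so D's static link is simply C's AR.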

    Implementing Subprograms


    Evaluation of static links

    Maintaining is not too time-consuming

    Chain offsets can be calculated at compile time

    Local variables can be accessed directly

    Non-locals must follow 1 or more static links

    Works well if nesting depths do not get too deep

    For deep sub nesting, cost of non-local access can be high

    But usually 2 or 3 levels is max used

    Implementing Subprograms


    Display

    Uses a single array to store links to ARs at all relevant nesting depths

    To access a nonlocal at a given nesting depth, we just follow the display entry for that depth, then the local_offset

    Never more than one link to follow

    Array is updated as subs are called and as they terminate

    Generally faster than static links if many nesting levels are used

    We will skip the details here; read the text

    Implementing Subprograms


    Nested declaration blocks

    Idea could be similar to nested subs

    Blocks could be treated as parameterless subs

    Static links could be used to determine textual parent

    But it is actually much easier to handle, since block entry and exit is always the same

    Parent block goes to child block

    When child block terminates, we revert to parent block

    Implementing Subprograms


    Simply push new block declarations onto run-time stack, and pop them when block terminates

    But we only have one activation record, so no links are required

    "Non-locals" can be accessed just like locals


    Data Abstraction


    Procedural (process) abstraction:

    Action can be performed without requiring detailed knowledge of how it is performed

    Data abstraction:

    New type can be used without requiring detailed knowledge of how it is implemented

    We don't need to know the details of how it is stored in memory

    We don't need to know the details of how it is manipulated via operations

    Data Abstraction


    More formally, an ADT must satisfy two conditions:

    1) The declarations of the type and operations (interface) are contained in a single syntactic unit: ENCAPSULATION

    The interface does not depend on how the objects are represented or how the operations are implemented

    2) The representation of the objects is hidden from users of the ADT: DATA HIDING

    Objects can only be manipulated via the provided interface

    Data Abstraction


    Ex: Stack

    Data: something that can store and access multiple data values in the manner dictated by the operations

    Operations:

    Push: add new value to top of stack

    Pop: remove top value from stack

    Top: view top value (or a copy) without removing

    Empty: is stack empty

    User of stack only needs to know the parameters and effect of each operation to use a stack correctly

    Implementation could be an array, a linked-list, or maybe something different

    Does not affect use

    Implementer can hide these details from the user through private declarations
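    One possible C++ rendering of the stack ADT, as a sketch only. The class name IntStack and the choice of std::vector for the hidden representation are assumptions; a linked list would serve equally well without changing the public interface.

```cpp
#include <cassert>
#include <vector>

class IntStack {
public:
    // Push: add new value to top of stack
    void push(int value) { items.push_back(value); }

    // Pop: remove top value from stack (stack must not be empty)
    void pop()
    {
        assert(!items.empty());
        items.pop_back();
    }

    // Top: view a copy of the top value without removing it
    int top() const
    {
        assert(!items.empty());
        return items.back();
    }

    // Empty: is stack empty
    bool empty() const { return items.empty(); }

private:
    std::vector<int> items;   // representation hidden from users of the ADT
};
```

    Code using IntStack can only go through push, pop, top and empty, so the private member could later be replaced without affecting any caller.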


    Data Abstraction


    Newer languages added true data abstraction

    Ada via packages

    C++, Java, C#, Ada95 via classes / objects

    Encapsulation units that contain all details of the new type

    Access modifiers that prevent access to internal details of the ADT from outside the encapsulation unit

    See text for more details

    Object-Oriented Programming (OOP)


    Characteristics of OOP

    1) Data abstraction: encapsulation + information-hiding

    The operations for manipulating data are considered to be part of the data type (encapsulated)

    The implementation details of the data type (both the structure of the data and the implementation of the operations) are separate from their specifications and (possibly) hidden from the user

    As we discussed with ADTs

    OOP


    2) Inheritance

    The characteristics of an ADT (data + operations) can be passed on to a subtype

    Subtype can also add new data and operations

    Allows programmer to build new (derived) types from old (parent) ones

    Common data/operations do not have to be rewritten (or copied)

    Operations that are slightly different in derived type can be rewritten (overridden) for that type

    New data/operations tailor the derived type to the problem at hand

    Parent type is unchanged and may (sometimes) be used together with derived type


    OOP


    3) Polymorphism

    Variables of a parent class can also be assigned objects of a subclass (or subclass of a subclass)

    Operations used with a variable are based upon the class of the object currently stored (could be a parent type object or a derived type object)

    Operations may have been overridden in the derived class

    Dynamic binding allows parent and derived objects to be used together in a logical way

    OOP


    Ex: Shape class

    We could declare:

    Shape shapelist[100];
    shapelist[0] = new Rectangle(0, 0, 10, 20);
    shapelist[1] = new Square(50, 100, 30, 30);
    shapelist[2] = new Circle(100, 50, 25);
    for (int i = 0; i < 3; i++)
        shapelist[i].Draw();

    Polymorphism allows these different objects to be accessed consistently within the same array

    Think about how you could do the code above in C or Pascal

    It would not be easy!
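    A hedged C++ version of the shape idea. To keep the sketch short, the constructors take no coordinates and draw() returns the class name instead of drawing anything; those simplifications, and the drawAll helper, are assumptions made for illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string draw() const { return "Shape"; }
};

class Rectangle : public Shape {
public:
    std::string draw() const override { return "Rectangle"; }
};

class Square : public Rectangle {   // subclass of a subclass
public:
    std::string draw() const override { return "Square"; }
};

class Circle : public Shape {
public:
    std::string draw() const override { return "Circle"; }
};

// Calls draw() on every element; dynamic binding picks the version
// belonging to the object actually stored, not to the declared type.
std::vector<std::string> drawAll(const std::vector<std::unique_ptr<Shape>>& shapes)
{
    std::vector<std::string> drawn;
    for (const auto& shape : shapes)
        drawn.push_back(shape->draw());
    return drawn;
}
```

    A list holding a Rectangle, a Square and a Circle is processed by one loop, and adding a Pentagon class later would not require touching drawAll.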

    OOP


    One option: Make one giant struct or record to contain all of the data, including a union or variant

    Base class would use only the core data items

    Derived classes would use additional data items as provided in the union or variant

    To do the operations, we would need a switch or case to test which type the variable is, so that it can be written out appropriately

    Now what if we want to add another new derived class, Pentagon?

    With OOP, it is simple to add any new data and override the necessary operations

    Without OOP we would have to change the overall structure of the data and operations; old types would change, possibly causing problems
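    For contrast, here is a rough Python sketch of the non-OOP option described above. The dictionary stands in for the "giant record", the "kind" field is the type tag, and the if/elif chain plays the role of the switch or case. The function names and the choice of an area operation are assumptions for illustration only.

```python
import math

# One record layout for every variant: a type tag plus whatever
# fields that variant needs.
def make_rectangle(x, y, w, h):
    return {"kind": "rectangle", "x": x, "y": y, "w": w, "h": h}

def make_circle(x, y, r):
    return {"kind": "circle", "x": x, "y": y, "r": r}

def area(shape):
    # Every operation needs a case analysis on the tag. Adding a new
    # variant (say, a pentagon) means editing this function and every
    # other operation written the same way.
    kind = shape["kind"]
    if kind == "rectangle":
        return shape["w"] * shape["h"]
    elif kind == "circle":
        return math.pi * shape["r"] ** 2
    raise ValueError(f"unknown shape kind: {kind}")
```

    A record tagged "pentagon" is rejected by area until the function itself is edited, which is the maintenance problem the slide points out.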


    OO Languages

    1) Smalltalk was the first language to fully support OOP, and is the purest OOL

    All data (even numeric literals) are objects, and are all descendants of class Object

    Objects are all allocated from the heap, and implicitly deallocated (garbage collection)

    Variables are references, with implicit dereferencing

    Execution of a program (logically) involves objects sending messages to each other, executing methods, and responding back

    So the data is driving the execution, not the control statements


    Equivalent to previous code:

    | letters |
    letters := 0.
    (Prompter prompt: 'Enter your name' default: '')
        do: [ :c | c isLetter
            ifTrue: [ letters := letters + 1 ].
        ].
    letters printNl.

    Now we cascade the messages to allow fewer statements (also, the do: loop iterates through the characters in a string, so we don't need the loop counter)

    ((Prompter prompt: 'Enter your name' default: '')
        select: [ :c | c isLetter ]) size printNl.

    Now the select: loop generates a string based on the condition in the block


    More on Smalltalk (classes and objects)

    Data in an object can be an instance variable or a class variable

    Instance variables are associated with objects

    Separate data for each object

    Accessible only through the methods defined for that object; always private to the class

    Class variables are associated with classes

    Shared data for all objects of the same class

    Accessible from all objects, but still private to the class

    Methods have a similar grouping, but are public

    Instance methods associated with objects

    Class methods associated with entire class
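    The instance/class distinction can be illustrated with a small Python sketch. This is not Smalltalk, and Python does not enforce the privacy rules described above; the Account class and its members are hypothetical names chosen for the example.

```python
class Account:
    count = 0  # class variable: one copy shared by all Account objects

    def __init__(self, owner):
        self.owner = owner   # instance variable: separate data per object
        Account.count += 1   # every new object updates the shared counter

    def get_owner(self):     # instance method: works on one object's data
        return self.owner

    @classmethod
    def how_many(cls):       # class method: works on the class as a whole
        return cls.count


a = Account("Ann")
b = Account("Bob")
```

    After the two objects are created, each has its own owner, while Account.how_many() reports the single shared count.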


    More on Smalltalk (inheritance)

    Object is the base class of all others

    Only single inheritance allowed

    All inheritance is implementation inheritance

    Data and methods of parent class are always accessible to the derived class

    i.e. Cannot hide implementation details from derived class

    Advantage: Derived class can likely implement its methods more efficiently with access to parent data

    Disadvantage: Change in parent class implementation will likely require change in derived class implementation

    Ex. Traversable stack


    More on Smalltalk (polymorphism)

    All messages are dynamically bound to methods

    At run-time, when a message is received, the object's class is searched for a method, then, if necessary, its superclass, its super-superclass and so on up to Object

    Variables have no types since they are only used to refer to objects, not to determine the messages an object can receive

    Clearly some liabilities with this approach

    Slows language down due to run-time overhead

    Programmer type errors cannot be caught until execution time
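    The lookup rule above can be modelled in a few lines. The following Python sketch is a toy model, not how any real Smalltalk system is implemented: Klass, Obj, send and the sample classes are all hypothetical names. Each class holds a method dictionary and a superclass link, and send walks the chain at run time.

```python
class Klass:
    """A toy class object: a method dictionary plus a superclass link."""
    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}


class Obj:
    """A toy object: it only knows which class it belongs to."""
    def __init__(self, klass):
        self.klass = klass


def send(receiver, selector, *args):
    # Search the receiver's class, then its superclass, and so on.
    k = receiver.klass
    while k is not None:
        if selector in k.methods:
            return k.methods[selector](receiver, *args)
        k = k.superclass
    # The error is only discovered now, at run time.
    raise AttributeError(f"{receiver.klass.name} does not understand {selector}")


Object_ = Klass("Object", methods={"printString": lambda self: "an Object"})
Person = Klass("Person", Object_, {"greet": lambda self: "hello"})
Student = Klass("Student", Person, {"printString": lambda self: "a Student"})

s = Obj(Student)
```

    Sending greet to s finds the method one level up in Person, while an unknown selector fails only when the message is actually sent, which is the liability noted on the slide.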


    Let's look at some examples:

    person.cls as an example of a new class

    See personTest.st

    student.cls as an example of a subclass

    studentTest.st as an example showing polymorphic access

    twodarry.cls as another subclass example

    See twodTest.st

    For more information, see the GNU Smalltalk User's Guide: http://www.gnu.org/software/smalltalk/gst-manual/gst.html


    2) C++ is an imperative/OO mix

    Had to be backward compatible with C

    Wanted to add object-oriented features

    Result is that programmer can use as few or as many OO features as he/she wants to

    C++ Classes and Objects

    Can be static, stack-dynamic or heap-dynamic

    Member data and member functions can be private, protected or public

    Allows programmer to decide

    Like Smalltalk, has notion of class variables

    Declared as static in C++

    Destructor needed if object uses dynamic memory


    C++ Inheritance

    Do not need a superclass (no Object base class for all other classes)

    Multiple inheritance is allowed

    Complex and difficult to use

    Implementation inheritance or interface inheritance are allowed

    With interface inheritance, all data and functions are still inherited, but only public ones are directly accessible to the derived class

    Advantage: Modifications to parent class do not affect derived class, as long as they do not change the interface

    Disadvantage: Operations may be slower, since they cannot access the data directly


    3) Java falls in between Smalltalk and C++

    Like Smalltalk:

    Object is base class to other classes

    Single inheritance only

    Objects are (almost) all dynamic, with garbage collection

    References used to access objects

    Method names are (by default) dynamically bound

    Like C++:

    Access can be private, public or protected

    Static binding can optionally be used to improve run-time speed

    Overall syntax for member data and function access

    Variables are typed


    Other Java OOP features:


    Interfaces allow for a simplified form of multiple inheritance

    An interface is in a sense a base class with no data and only abstract (pure virtual) methods

    A class that implements an interface simply implements the methods specified therein

    Advantages: Objects that implement an interface can be used wherever the interface is specified. This allows for a type of generic behavior

    Ex: Comparable interface, Runnable interface

    Disadvantage: Can become complicated when interfaces and inheritance are both used

    Reflection allows us to manipulate the classes themselves

    See poly.java
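    The interface idea can be approximated outside Java. The Python sketch below uses an abstract base class with no data and one abstract method as a stand-in for an interface; it is only an analogy to Java's Comparable, and compare_to, Money and largest are hypothetical names introduced for the example.

```python
from abc import ABC, abstractmethod


class Comparable(ABC):
    """Plays the role of an interface: no data, only an abstract method."""
    @abstractmethod
    def compare_to(self, other):
        """Return a negative, zero or positive number."""


class Money(Comparable):
    def __init__(self, cents):
        self.cents = cents

    def compare_to(self, other):  # supplies the method the interface requires
        return self.cents - other.cents


def largest(items):
    # Generic behavior: works for any objects that implement Comparable.
    best = items[0]
    for item in items[1:]:
        if item.compare_to(best) > 0:
            best = item
    return best
```

    largest never mentions Money; it relies only on compare_to being present, which is the "generic behavior" advantage listed above.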


    OOL Implementation

    Data:

    Typically a record/struct type of storage is used

    Class Instance Record (CIR)

    Data members are accessed by name, in the same way as records

    Subclass adds extra data to CIR of parent class

    Private access enforced by limiting visibility of the data


    Subprograms:

    Static binding:

    Subprograms that will be called are determined by the variable type

    Variable types are known at compile time and code can be determined then

    Dynamic binding:

    Subprograms that will be called are determined by the object's type, not the variable's type

    Objects stored in a variable are determined at run time

    Appropriate links must be stored with the object

    But they are the same for all objects of that class

    Virtual Method Table (VMT) used to store links to all pertinent subprograms
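    The CIR and VMT arrangement can be mimicked with plain dictionaries. This Python sketch is only a model of the idea, not how any particular compiler lays out objects; all names in it are hypothetical. Each class has one shared table of function references, and each object record stores a link to its class's table next to its data.

```python
# Subprograms for the parent "class".
def shape_draw(obj):
    return "generic shape"

def shape_name(obj):
    return "shape"

# Overriding subprogram for the derived "class".
def circle_draw(obj):
    return f"circle r={obj['r']}"

SHAPE_VMT = {"draw": shape_draw, "name": shape_name}
# The subclass table starts as a copy of the parent's, then overrides draw.
CIRCLE_VMT = {**SHAPE_VMT, "draw": circle_draw}

def new_shape():
    return {"vmt": SHAPE_VMT}

def new_circle(r):
    # Record = link to the class's table + the extra data the subclass adds.
    return {"vmt": CIRCLE_VMT, "r": r}

def call(obj, method):
    # Dynamic binding: follow the object's link, then index the table.
    return obj["vmt"][method](obj)
```

    Two circles share the same table object, so the per-object cost is a single link regardless of how many methods the class has.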

    Parallelism


    Parallelism is incorporated into programs for 2 primary reasons:

    1) Program is running in a multiprocessing or distributed environment

    Many computers now have multiple CPUs

    Many jobs are distributed over multiple computers in a network

    A programming language should be able to take advantage of this parallelism

    Many algorithms can be improved if designed for parallel execution

    This is PHYSICAL PARALLELISM


    2) Program is running in a simulated parallel environment, allowing for asynchronous activity

    Ex: Two windows are displayed to the user. One shows the current time (incremented by seconds) and one allows the user to draw images on the screen

    We don't want the act of the user drawing to stop the clock

    We don't want the clock running to prevent the user from drawing

    Even with a single processor, we want both of these activities to execute in parallel

    This is LOGICAL PARALLELISM


    If the tasks have some dependencies, there can be a problem

    Most common dependency is shared data

    To handle this we must synchronize the tasks

    Cooperation Synchronization

    One task is dependent upon an output/outcome of another

    Ex: Task B must process data produced by Task A

    Contractor B cannot put up drywall until contractor A has finished the wiring

    Task to count ballots cannot proceed until task that collects ballots provides it with some

    We must have a mechanism that allows Task B to pause until the data is available

    B could loop and keep checking for data

    B could wait for some signal from A
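    The second option above (B waits for a signal from A) might look like the following in Python. This is a sketch using the standard threading module; the task names follow the slide, while run, the data values and the summing step are assumptions for illustration.

```python
import threading


def run():
    data, result = [], []
    ready = threading.Event()      # the "signal from A" that B waits for

    def task_a():
        data.extend([1, 2, 3])     # produce the data
        ready.set()                # tell B the data is available

    def task_b():
        ready.wait()               # pause, without busy-looping, until signalled
        result.append(sum(data))   # safe: A finished producing before set()

    b = threading.Thread(target=task_b)
    a = threading.Thread(target=task_a)
    b.start()                      # B is started first but still waits for A
    a.start()
    a.join()
    b.join()
    return result[0]
```

    Waiting on the event avoids the wasted CPU time of the loop-and-check alternative.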


    Competition Synchronization


    Both tasks are competing for the same shared resource

    If one or both tasks modify the data, it could cause data inconsistencies

    Ex: Task A and Task B are MAC machine (ATM) accesses of the same bank account

    Task A checks the balance: $200

    Task B checks the balance: $200

    Task A withdraws $200

    Task A updates balance to $0

    Task B withdraws $200

    Task B updates balance to $-200

    We must have some mechanism that ensures MUTUAL EXCLUSION for CRITICAL DATA

    We could have a LOCK on the data, or a similar mechanism allowing only one task to access it at a time
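    A lock on the bank-account data could be sketched as follows in Python. The Account class and run helper are hypothetical; the point is that the balance check and the update sit inside one critical section, so the interleaving shown in the example cannot occur.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # Check and update form one critical section: no other task
        # can read the balance between the two steps.
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def run():
    acct = Account(200)
    results = []

    def task():
        results.append(acct.withdraw(200))

    tasks = [threading.Thread(target=task) for _ in range(2)]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    return acct.balance, sorted(results)
```

    Whichever task enters first succeeds; the other sees the updated balance and is refused, so the account never goes negative.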


    Synchronization Mechanisms

    Semaphores

    Devised by Dijkstra

    Basically guards that are placed around code

    P must succeed to gain access to code

    Decrements a counter when it succeeds

    V executes when critical section ends (increments the counter)

    Based on initial value of counter, we can control how many tasks are allowed to access the critical section at once

    If used properly, can guarantee either cooperation or competition synchronization

    However, it is easy to NOT use them properly

    Can cause problems
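    As a rough illustration of P and V, Python's threading.Semaphore provides acquire (P) and release (V). In the sketch below the run function, its parameters and the bookkeeping dictionary are hypothetical; the initial counter value limits how many tasks can be inside the guarded section at once.

```python
import threading
import time


def run(num_tasks=6, limit=2):
    sem = threading.Semaphore(limit)   # initial counter value = limit
    guard = threading.Lock()           # protects the bookkeeping below
    stats = {"inside": 0, "max_inside": 0}

    def task():
        sem.acquire()                  # P: wait until counter > 0, then decrement
        try:
            with guard:
                stats["inside"] += 1
                stats["max_inside"] = max(stats["max_inside"], stats["inside"])
            time.sleep(0.01)           # stand-in for work in the critical section
            with guard:
                stats["inside"] -= 1
        finally:
            sem.release()              # V: increment counter, wake a waiting task

    threads = [threading.Thread(target=task) for _ in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["max_inside"]
```

    Forgetting the release, or placing it on only one code path, is exactly the kind of misuse the slide warns about; the try/finally is one way to guard against it.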


    Monitors

    Devised by Brinch Hansen and Hoare

    Critical data section is part of a data object that allows only one task entry at a time

    Better than semaphores for competition synchronization, because mechanism is built into the monitor

    Harder for programmer to mess up

    No better for cooperation synchronization

    Still must be done manually

    Used in Concurrent Pascal, Modula-2 and (somewhat) in Java
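    Python has no monitor construct, but the idea can be imitated with a class whose methods all run under one lock. In this sketch (BufferMonitor and run are hypothetical names) mutual exclusion comes from entering the condition's lock in every method, while the wait/notify calls for cooperation still have to be written by hand, as the slide notes.

```python
import threading


class BufferMonitor:
    """Every public method runs under the same lock, monitor-style."""
    def __init__(self):
        self._cond = threading.Condition()  # a lock plus a wait/notify queue
        self._items = []

    def deposit(self, item):
        with self._cond:                    # competition: one task inside at a time
            self._items.append(item)
            self._cond.notify()             # cooperation: coded manually

    def fetch(self):
        with self._cond:
            while not self._items:          # cooperation: wait until data exists
                self._cond.wait()
            return self._items.pop(0)


def run():
    buf, got = BufferMonitor(), []
    consumer = threading.Thread(
        target=lambda: got.extend(buf.fetch() for _ in range(3)))
    consumer.start()
    for i in range(3):
        buf.deposit(i)
    consumer.join()
    return got
```

    Unlike a true monitor, nothing stops code from touching _items without the lock; in Python the discipline is by convention only.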


    Message Passing

    Proposed by Brinch Hansen and Hoare

    More general than either of the two previous techniques

    Tasks are synchronized via messages sent to each other

    Message is similar in look/execution to a subprogram call, but with restrictions:

    Caller (or passer) of the message is blocked at the call until the receiver is ready to receive it

    Receiver (or executer) of the message is blocked at the message code until the message is called

    Caller and Receiver meet at a rendezvous


    Idea is that we know exactly where in the code both tasks will be when a rendezvous occurs

    So even though tasks execute asynchronously, we synchronize them with respect to each other at a rendezvous

    Ex: Ada

    Still much of the work is up to the programmer
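    Python has no built-in rendezvous, but the blocking behavior described above can be approximated with queues. The Entry class below is a hypothetical stand-in for an Ada-style entry, not Ada's actual mechanism: the caller blocks until the receiver has accepted the message and finished its body, and the receiver blocks until some caller arrives.

```python
import queue
import threading


class Entry:
    """Hypothetical stand-in for an entry that tasks rendezvous on."""
    def __init__(self):
        self._requests = queue.Queue()

    def call(self, arg):
        reply = queue.Queue(maxsize=1)
        self._requests.put((arg, reply))    # wait in the entry's queue
        return reply.get()                  # blocked until the receiver finishes

    def accept(self, body):
        arg, reply = self._requests.get()   # blocked until a caller arrives
        reply.put(body(arg))                # run the body, then release the caller


def run():
    entry, log = Entry(), []

    def double(x):
        log.append(("served", x))
        return x * 2

    server = threading.Thread(target=lambda: entry.accept(double))
    server.start()
    answer = entry.call(21)                 # both tasks meet here
    server.join()
    return answer, log
```

    When call returns, the caller knows the receiver has completed the accept body, which is the "we know exactly where both tasks are" property of a rendezvous.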


    Parallel processing concerns

    Data consistency

    We have already discussed this

    Mutual exclusion is needed to prevent multiple tasks from accessing critical data at the same time

    However, efforts to ensure data consistency can cause other problems, such as DEADLOCK and STARVATION


    Deadlock

    When a (shared) resource has restricted access, it can cause a task to stop execution

    Wait in a semaphore queue

    Wait in a monitor queue

    Wait in an accept queue

    If a circular resource dependency exists, we can get deadlock

    Ex:

    Task A has acquired binary semaphore S1

    Task B has acquired binary semaphore S2

    Task A is waiting for binary semaphore S2

    Task B is waiting for binary semaphore S1
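    One common way to rule out the circular wait in the S1/S2 example is to make every task acquire the resources in the same fixed order. This technique is not taken from the notes; the Python sketch below (use_both and run are hypothetical names, with locks standing in for binary semaphores) shows it under that assumption.

```python
import threading

s1, s2 = threading.Lock(), threading.Lock()


def use_both(first, second, log, name):
    # Deadlock-prone pattern: Task A takes s1 then s2 while Task B takes
    # s2 then s1, so each can hold one and wait forever for the other.
    # Sorting by a fixed key gives every task the same acquisition
    # order, which makes a circular wait impossible.
    lo, hi = sorted((first, second), key=id)
    with lo:
        with hi:
            log.append(name)


def run():
    log = []
    a = threading.Thread(target=use_both, args=(s1, s2, log, "A"))
    b = threading.Thread(target=use_both, args=(s2, s1, log, "B"))
    a.start()
    b.start()
    a.join(timeout=5)
    b.join(timeout=5)
    return sorted(log)
```

    Even though A and B ask for the locks in opposite orders, both finish, because internally they always take the same lock first.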


    Starvation

    To combat deadlock, most languages allow a task to release a resource prematurely in some circumstances

    Ex: If one of the Tasks in the previous example releases its semaphore, the other can proceed

    Under these circumstances there is the possibility that a task may never acquire all of the resources that it needs at the time it needs them: this is starvation

    We must be careful to avoid all of these problems when programming in parallel


    Prolog

    Rules are predicates that consist of a head and a body

    In order for the head to "succeed" in its evaluation, all of the goals in the body must be satisfied

    These goals could be facts, or could be other rules

    Ex from ex1.pl:

    sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).

    The :- can be thought of as "if"

    Execution of a program is in fact a sequence of questions, or assertions

    Database is searched in an effort to satisfy all of the assertions

    If assertions can be satisfied, answer is yes

    Otherwise, answer is no

    If a given assertion succeeds, execution proceeds to the next one

    If a given assertion fails, execution backtracks and attempts to re-satisfy the previous assertion

    So what about variable assignments?

    These are in fact just side effects that occur in an effort to satisfy the query

    In fact variables are not assigned in the traditional (imperative language) sense

    Variables in Prolog are dynamically typed and have two states:

    Uninstantiated: Variable is not associated with a value

    Instantiated: Variable is associated with a value

    Once a variable is instantiated, it keeps that value, and all occurrences of that variable within the same scope have that value

    Cannot be re-assigned in sense of imperative languages

    However, if execution backtracks past the point at which it was instantiated, it can again become uninstantiated

    Let's look again at ex1.pl
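    To make the search-and-backtrack behavior concrete, here is a rough Python imitation of the sibling rule. It is not Prolog and not the contents of ex1.pl: the facts in PARENT are invented for illustration, None stands in for an uninstantiated variable, and this version assumes X and Y are already instantiated when sibling is queried.

```python
# Hypothetical facts; the real database in ex1.pl is not shown in the notes.
PARENT = [("pat", "ann"), ("pat", "bob"), ("sue", "cal")]


def parent(p=None, c=None):
    # Yield every fact consistent with the arguments that are already
    # instantiated (None plays the role of an uninstantiated variable).
    for fp, fc in PARENT:
        if (p is None or p == fp) and (c is None or c == fc):
            yield fp, fc


def sibling(x, y):
    # sibling(X,Y) :- X \== Y, parent(P,X), parent(P,Y).
    if x == y:
        return                      # first goal fails, so the rule fails
    for p, _ in parent(None, x):    # P becomes instantiated from the database
        for _ in parent(p, y):      # try to satisfy the last goal with that P
            yield p
        # Leaving the inner loop is the backtracking step: this value of
        # P is abandoned and the outer loop tries the next matching fact.
```

    A query succeeds if the generator yields at least one value; an empty result corresponds to the answer "no".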


    Recursion and database search

    Recursion is a fundamental part of programming in Prolog

    Execution is simply satisfaction of goals, and there are no loops as in imperative languages

    Thus, to build complex "programs" we must utilize recursive programming

    Each attempt to satisfy a goal initiates a search of the database


    Prolog Lists

    As in Lisp, the list is an important data structure in Prolog

    A list consists of a head and a tail

    Tail could be the empty list