
Chapter 9: Functions

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

A. Perlis

Programming Languages, 2nd edition, Tucker and Noonan

9.1 Basic Terminology
9.2 Function Call and Return
9.3 Parameters
9.4 Parameter Passing Mechanisms
9.5 Activation Records
9.6 Recursive Functions
9.7 Run Time Stack

Contents

“Subprogram”: an independent, reusable program unit that performs a single logical task.

Subprogram types:
◦ Functions: modeled after mathematical functions, which return a single value; e.g., f(x) = x^2 + x. Purpose: use in an expression; e.g., y = f(x) * x;
◦ Subroutines: didn’t return values directly; worked through side effects, e.g., compute(X, Y). Called as independent statements.

Subprograms

Value-returning functions:
◦ The “non-void functions/methods” in C/C++/Java
◦ Called from within an expression; e.g., x = (b*b - sqrt(4*a*c))/2*a

Non-value-returning functions:
◦ Known as “procedures” in Ada, “subroutines” in Fortran, “void functions/methods” in C/C++
◦ Called as a separate statement; e.g., strcpy(s1, s2);

9.1 Modern Terminology

Fig. 9.1: Example C/C++ Program

int h, i;
void B(int w) {
    int j, k;
    i = 2*w;
    w = w+1;
}
void A(int x, int y) {
    bool i, j;
    B(h);
}
int main() {
    int a, b;
    h = 5; a = 3; b = 2;
    A(a, b);
}

9.2 Function Call and Return

Definitions
◦ A parameter is an identifier that appears in a function declaration.
◦ An argument is an expression that appears in a function call.

Example: in Figure 9.1
◦ The function declaration A(int x, int y) has parameters x and y.
◦ The call A(a, b) has arguments a and b.

9.3 Parameters

Usually by number and by position.
◦ I.e., any call to A must have two arguments, and they must match the corresponding parameters’ types.

Exceptions:
◦ Python: parameters aren’t typed.
◦ Perl: parameters aren’t declared in a function header. Instead, parameters are stored in an array @_, and are accessed using an array index.
◦ Ada: arguments and parameters can be linked by name; e.g., the call A(y=>b, x=>a) is the same as A(a,b).

Parameter-Argument Matching

◦ By value
◦ By reference
◦ By value-result
◦ By result
◦ By name

9.4 Parameter Passing Mechanisms

Compute the value of the argument at the time of the call and assign that value to the parameter.

E.g., in the call A(a, b) in Fig. 9.1, a and b are passed by value. So the values of parameters x and y become 3 and 2, respectively, when the call begins.

Pass (Call) by Value

Pass by value doesn’t allow the called function to modify an argument’s value in the caller’s environment.

Technically, all arguments in C and Java are passed by value.

But references (addresses of arguments) can be passed to allow argument values to be modified.

Pass by Value
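A minimal C++ sketch of the two points above: a value parameter is a private copy, while passing an address (itself passed by value) lets the callee reach the caller's variable. The function names increment_copy, increment_through_pointer and demo_pass_by_value are hypothetical, chosen only for illustration.

```cpp
#include <cassert>

// Value parameter: n is a copy of the argument.
void increment_copy(int n) { n = n + 1; }

// Pointer parameter: the pointer is passed by value, but
// dereferencing it reaches the caller's variable.
void increment_through_pointer(int* n) { *n = *n + 1; }

// Hypothetical driver: returns the caller's variable after both calls.
int demo_pass_by_value() {
    int a = 3;
    increment_copy(a);              // the copy is changed, a is not
    assert(a == 3);
    increment_through_pointer(&a);  // a is changed through its address
    return a;
}
```

Calling demo_pass_by_value() returns 4: only the call that received the address changed a.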

Compute the address of the argument at the time of the call and assign it to the parameter.

Simulating Pass by Reference in C

int h, i;
void B(int* w) {
    int j, k;
    i = 2*(*w);
    *w = *w+1;
}
void A(int* x, int* y) {
    bool i, j;
    B(&h);
}
int main() {
    int a, b;
    h = 5; a = 3; b = 2;
    A(&a, &b);
}

Pass by reference means the memory address of the argument is copied to the corresponding parameter so the parameter is an indirect reference (a pointer) to the actual argument.

Assignments to the parameter affect the value of the argument directly, rather than a copy of the value. This is an example of a side effect.

Pass By Reference

In languages like C++ that support both value and reference parameters, there must be a way to indicate which is which.
◦ In C++, this is done by preceding the parameter name in the function definition with an ampersand (&) if the parameter is a reference parameter. Otherwise, it is a value parameter.

Pass By Reference
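As a small illustration of the ampersand notation, the sketch below mixes one reference parameter and one value parameter in the same function. The names add_to and demo_reference_parameter are assumptions made for this example only.

```cpp
#include <cassert>

// 'total' is a reference parameter (note the &);
// 'amount' is a value parameter.
void add_to(int& total, int amount) {
    total = total + amount;  // changes the caller's variable
    amount = 0;              // changes only the local copy
}

// Hypothetical driver showing which argument is affected.
int demo_reference_parameter() {
    int sum = 10;
    int step = 5;
    add_to(sum, step);
    assert(step == 5);       // the value argument is untouched
    return sum;
}
```

Here demo_reference_parameter() returns 15, while step keeps its original value.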

Pass by value-result: Pass by value at the time of the call and copy the result back to the argument at the end of the call.
◦ E.g., Ada’s in out parameter can be implemented as value-result.
◦ Value-result is often called copy-in-copy-out.

Pass by result: Copy the final value of the parameter out to the argument at the end of the function call.

Pass by Value-Result and Result
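C++ has no value-result or result parameters, so the following is only a simulation: each function takes a reference to stand for "the argument" and performs the copy-in and copy-out steps by hand on a local variable that plays the role of the parameter. All names are hypothetical.

```cpp
#include <cassert>

// Simulated value-result: copy in at the call, copy out at the return.
void double_value_result(int& argument) {
    int parameter = argument;   // copy in
    parameter = parameter * 2;  // the body works on the local copy only
    argument = parameter;       // copy out
}

// Simulated pass by result: nothing is copied in;
// only the final value of the parameter is copied out.
void produce_result(int& argument) {
    int parameter;              // no initial value from the caller
    parameter = 42;
    argument = parameter;       // copy out
}

// Hypothetical driver.
int demo_value_result() {
    int x = 7;
    int y = 0;
    double_value_result(x);     // x becomes 14
    produce_result(y);          // y becomes 42
    assert(x == 14 && y == 42);
    return x + y;
}
```

The caller's variables change only at the copy-out step, which is the property that distinguishes these mechanisms from pass by reference.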

Pass by reference and pass by value-result give the same results, except when aliasing occurs.

Aliasing: referring to the same variable by two names; e.g.,
◦ the same variable is both passed and globally referenced from the called function,
◦ the same variable is passed to two different parameters using a parameter method other than pass by value,
◦ two pointers refer to the same location.

Aliasing

Example parameter aliases in C++:

void shift(int &a, int &b, int &c)
{ a = b; b = c; }

The result of shift(x, y, z) is that x is set to y and y is set to z.

The result of shift(x, y, x) is that x is set to y but y is unchanged, because c now names the same variable as a.

Parameter Aliasing in C++
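The two calls can be checked with a short program. The sketch below uses the shift function from the slide; the starting values x = 1, y = 2, z = 3 and the helper names are assumptions for illustration.

```cpp
#include <cassert>

void shift(int& a, int& b, int& c) { a = b; b = c; }

struct Values { int x, y, z; };

// No aliasing: three distinct variables.
Values shift_distinct() {
    int x = 1, y = 2, z = 3;
    shift(x, y, z);          // x = y (2), then y = z (3)
    return {x, y, z};
}

// Aliasing: a and c both refer to x.
Values shift_aliased() {
    int x = 1, y = 2, z = 3;
    shift(x, y, x);          // x = y (2), then y = x, which is already 2
    assert(y == 2);          // y ends with the value it started with
    return {x, y, z};
}
```

With these starting values, shift_distinct() yields (2, 3, 3) and shift_aliased() yields (2, 2, 3).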

Example of Effect (a = 1, b = 1)

Reference parameters (C++):

void f (int& x, int& y)
{
    x = x + 1;
    y = y + 1;
}

C++: f(a,b) versus f(a,a) gives a = 2, b = 2; or a = 3, b = 1

In-out parameters (Ada):

procedure f (x, y: in out Integer) is
begin
    x := x + 1;
    y := y + 1;
end f;

Ada: f(a,b) versus f(a,a) gives a = 2, b = 2; or a = 2, b = 1
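The difference between the two results for f(a,a) can be reproduced in C++ by simulating copy-in-copy-out by hand. This is a sketch, not Ada semantics in full: the copy-out order (x first, then y) is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>

// Reference parameters: x and y may name the same variable.
void f_reference(int& x, int& y) {
    x = x + 1;
    y = y + 1;
}

// Simulated in-out (value-result) parameters.
void f_value_result(int& arg_x, int& arg_y) {
    int x = arg_x;   // copy in
    int y = arg_y;   // copy in
    x = x + 1;
    y = y + 1;
    arg_x = x;       // copy out (order assumed: x, then y)
    arg_y = y;
}

int aliased_call_by_reference() {
    int a = 1;
    f_reference(a, a);     // both increments hit the same variable
    return a;
}

int aliased_call_by_value_result() {
    int a = 1;
    f_value_result(a, a);  // each local copy is incremented once
    return a;
}
```

Under these assumptions the aliased reference call leaves a = 3, while the aliased value-result call leaves a = 2, matching the slide.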

Textually substitute the argument for every instance of its corresponding parameter in the function body.
◦ Originated with Algol 60, but was dropped by Algol’s successors: Pascal, Ada, Modula.
◦ Exemplifies late binding, since evaluation of the argument is delayed until its occurrence in the function body is actually executed.
◦ Associated with lazy evaluation in functional languages (see, e.g., Haskell discussion in Chapter 14).

Pass by Name

procedure swap(a, b);
integer a, b;       // declare parameter types
begin
    integer t;      // declare local variable
    t = a;          // t = i       (t = 3)
    a = b;          // i = a[i]    (i = a[3] = 1)
    b = t;          // a[i] = t    (a[1] = 3, because i is now 1)
end;

Consider the call swap(i, a[i]) where i = 3 and a = (9, 4, -1, 1, 14), indexed from 0, so a[3] = 1.

Instead of the expected result i = 1 and a[3] = 3, we get i = 1 and a[1] = 3.

Example from Algol
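C++ has no pass by name, but the behaviour of the Algol swap can be approximated with thunks: each by-name parameter becomes a pair of closures that re-evaluate the argument expression every time the parameter is used. The ByName type and all function names below are hypothetical, and the array is taken to be (9, 4, -1, 1, 14) indexed from 0.

```cpp
#include <array>
#include <cassert>
#include <functional>

// A by-name parameter modelled as a getter and a setter that
// re-evaluate the argument expression on every use.
struct ByName {
    std::function<int()> get;
    std::function<void(int)> set;
};

void swap_by_name(ByName a, ByName b) {
    int t = a.get();  // t = i
    a.set(b.get());   // i = a[i]
    b.set(t);         // a[i] = t, evaluated with the NEW value of i
}

struct Outcome {
    int i;
    std::array<int, 5> a;
};

Outcome demo_swap_by_name() {
    int i = 3;
    std::array<int, 5> a = {9, 4, -1, 1, 14};
    ByName name_i  = { [&] { return i; },    [&](int v) { i = v; } };
    ByName name_ai = { [&] { return a[i]; }, [&](int v) { a[i] = v; } };
    swap_by_name(name_i, name_ai);
    assert(a[3] == 1);  // the intended target a[3] was never written
    return {i, a};
}
```

After the call, i is 1 and a[1] is 3, while a[3] still holds 1: the second use of a[i] was evaluated after i had changed.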

C/C++ macros (also called defines) adopt call by name.
◦ For example: #define max(a,b) ( (a)>(b) ? (a) : (b) )
◦ A "call" to the macro replaces the macro by the body of the macro (called macro expansion); for example, max(n+1, m) is replaced by ((n+1)>(m)?(n+1):(m)) in the program text.
◦ Macro expansion is applied to the program source text and amounts to the substitution of the formal parameters with the actual parameter expressions.

C++ Macros/Call by Name
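One visible consequence of textual substitution is that an argument with a side effect may be evaluated more than once. The sketch below contrasts a macro with an ordinary function; the macro is spelled MAX here (an assumption, to avoid clashing with library names), and max_fn and the demo functions are hypothetical.

```cpp
#include <cassert>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

// Ordinary function: each argument is evaluated exactly once.
int max_fn(int a, int b) { return a > b ? a : b; }

// Returns n after the macro "call".
int demo_macro_double_evaluation() {
    int n = 5, m = 3;
    // Expands to ((n++) > (m) ? (n++) : (m)): n++ runs twice.
    int r = MAX(n++, m);
    assert(r == 6);  // the second n++ yields 6
    return n;
}

// Returns n after the function call.
int demo_function_single_evaluation() {
    int n = 5, m = 3;
    int r = max_fn(n++, m);
    assert(r == 5);  // n++ ran once and yielded 5
    return n;
}
```

The macro version leaves n at 7, the function version leaves it at 6.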

Some methods of parameter passing cause side effects, in this case meaning a change to the non-local environment.
◦ Call by value is “safe”: there are no side effects.
◦ Pass by reference can cause side effects.

Side effects may compromise readability and reliability.
◦ Example: p = (y*z) + f(x, y);
◦ If y is a reference parameter, results could depend on the operand evaluation order, which is not specified in any grammar.

Using global variables in functions is dangerous. Parameter lists and calls using actual arguments clarify effects of the function.

Side Effects

Example: p = (y*x) + f(x, y);

Suppose f(x, y) returns x + y and increments y. Assume that when the call executes, x = 2 and y = 3.

Sub-expressions evaluated left-to-right:
◦ y*x = 6, f(x, y) returns 5 and sets y to 4, so p = 6 + 5 = 11

Or, sub-expressions evaluated right-to-left:
◦ f(2, 3) sets y to 4 and returns 5, y*x = 4 * 2 = 8, so p = 8 + 5 = 13

Remember there are no grammar rules that specify the order of evaluation between (y*x) and f(x,y).

Side Effects (see semantic discussion)
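The two outcomes can be reproduced by forcing each evaluation order with separate statements. This is a sketch: f is written here with y as a reference parameter, as the example assumes, and the two driver names are hypothetical.

```cpp
#include <cassert>

// Returns x + y and increments y (y is a reference parameter).
int f(int x, int& y) {
    int result = x + y;
    y = y + 1;
    return result;
}

// Force left-to-right: evaluate y*x first, then the call.
int p_left_to_right() {
    int x = 2, y = 3;
    int left = y * x;     // 6
    int right = f(x, y);  // returns 5, y becomes 4
    return left + right;
}

// Force right-to-left: evaluate the call first, then y*x.
int p_right_to_left() {
    int x = 2, y = 3;
    int right = f(x, y);  // returns 5, y becomes 4
    int left = y * x;     // 8
    return left + right;
}
```

Written as the single expression (y*x) + f(x, y), C++ leaves the order of the two operands unspecified, so either 11 or 13 is a permitted result.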

Implementing Functions: Activation Records and the Run-time Stack

Two kinds of subprograms
◦ Those that act like a mathematical function
◦ Those that act by causing side effects

Parameter passing mechanisms:
◦ Call by value
◦ Call by reference
◦ Call by value-result
◦ Call by result
◦ Call by name

Dangers of side effects

Review

Types of Data Storage
• Static: permanent allocation (e.g., h and i in the sample program).
• Stack (stack-dynamic allocation): storage that contains information about memory allocated due to function calls. When a function is called, a block of storage is pushed onto the stack; when the function exits, this storage is popped.
• Heap: storage that contains dynamically allocated objects (pointer-referenced), allocated/deallocated in a less predictable order (dynamic memory allocation). More about this in Chapter 10.

• Stack storage: a collection of activation records.
• Function activation: a particular execution of a function.
• If the function is recursive, several activations may exist at the same time.
• When a function is called, storage is allocated to hold information about that activation; when the function terminates, the storage is deallocated. (The stack top pointer is adjusted.)
• What should be in the activation records?

9.5 Activation Records

• Function call semantics requires the program to
  • Save state of calling function
  • Compute and pass parameters and return address
  • Pass control to the function
• Function return semantics
  • Values of pass-by-value-result or out-parameters are copied back to the arguments
  • For value-returning functions, the returned value is made available to the caller
  • Restore state and pass control to the calling function


A block of information associated with each function call, which includes some or all of:
• Parameters and local variables
• Return address
• Saved registers
• Temporary variables
• Return value (if any)
• Static link
• Dynamic link

• Usually the format of the AR is known at compile time.


Static link: points to the bottom of the Activation Record (AR) of the static parent; used to resolve non-local references.
◦ Needed in languages that allow nested function definitions (Pascal, Algol, Ada, Java’s inner classes) and for languages that have global variables or nested blocks.
◦ Static link reflects static scope rules.

Dynamic link: points to the top of the AR of the calling function; used to reset the runtime stack.

Static & Dynamic Links

Simplified structure of a Called Method’s Stack Frame

Activation records are created when a function (or block) is entered and deleted when the function returns to its caller (based on a template prepared by the compiler)

The stack is a natural structure for storing the activation records (sometimes called stack frames).

The AR at the top of the stack contains information about the currently executing function/block.

Activation Record Stack

Early languages did not use this approach – all data needed for a function’s activation was allocated statically at compile time.

Result: Only one set of locations for each function
◦ One set of locations for parameters
◦ One set of locations for local variables
◦ One set of locations for return addresses
◦ Etc.

What about recursive functions?

Why Stacks?
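To see why a single set of locations cannot support recursion, the sketch below imitates static allocation in C++: the parameter n lives in one global slot shared by every activation. The names n_slot, factorial_static, factorial_with_static_storage and factorial_with_stack are hypothetical, and the simulation only models the parameter, not return addresses.

```cpp
#include <cassert>

// The one and only storage location for the parameter n.
static int n_slot;

int factorial_static() {
    if (n_slot < 2) return 1;
    n_slot = n_slot - 1;            // "pass" n-1 by overwriting the slot
    int rest = factorial_static();
    return n_slot * rest;           // the slot no longer holds this call's n
}

int factorial_with_static_storage(int n) {
    n_slot = n;
    return factorial_static();
}

// Ordinary C++ version: each activation gets its own n on the stack.
int factorial_with_stack(int n) {
    if (n < 2) return 1;
    return n * factorial_with_stack(n - 1);
}
```

For an argument of 3 the static version returns 1, because every activation reads the value left behind by the innermost call, while the stack-based version returns 6.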

A function that can call itself, either directly or indirectly, is a recursive function; e.g.,

int factorial (int n) {
    if (n < 2)
        return 1;
    else
        return n*factorial(n-1);
}

9.6 Recursive Functions

Recursive call

When the first call is made, create an activation record to hold its information

Each recursive call from the else will cause another activation record to be added.

else return n*factorial(n-1);

9.6 Recursive Functions

Recursive self-call

When a function call is made, the runtime system
◦ Allocates space for the stack frame (activation record) by adjusting the stack top pointer
◦ Stores argument values (if any) in the frame
◦ Stores the return address
◦ Stores a pointer to the static memory (the static link) or enclosing scope
◦ Stores a pointer to the stack frame of the calling method (the dynamic link)

9.7 Managing the Run Time Stack

Figure 9.5: Structure of a Called Function’s Activation Record (fields: Parameters, Local Variables, Return Address, Saved Registers, Temporary Values, Return Value, Static Link, Dynamic Link)

Consider the call factorial(3).
• This places one activation record onto the stack and generates a second call, factorial(2).
• This call generates the call factorial(1), so that the stack has three activation records.

Another call, say factorial(6), would require 6 ARs. With static storage allocation (no stack), there is only one AR per function, so recursion isn’t supported.

Simple Example
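The growth and shrinkage of the stack can be made explicit by managing the activation records by hand. In this sketch an ActivationRecord holds only the parameter n and the return value (links, registers and return address are omitted), and the function name and the max_depth output parameter are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Deliberately incomplete activation record: parameter and return value only.
struct ActivationRecord {
    int n;
    int return_value;
};

// Computes factorial(n) with an explicit stack of activation records
// and reports the largest number of records alive at once.
int factorial_with_explicit_stack(int n, std::size_t& max_depth) {
    std::vector<ActivationRecord> stack;

    // Call phase: push one record per call until the base case is reached.
    stack.push_back({n, 0});
    while (stack.back().n >= 2) {
        stack.push_back({stack.back().n - 1, 0});
    }
    max_depth = stack.size();

    // Base case: the innermost call returns 1.
    stack.back().return_value = 1;

    // Return phase: pop each record and hand its value to the caller.
    while (stack.size() > 1) {
        int result = stack.back().return_value;
        stack.pop_back();
        stack.back().return_value = stack.back().n * result;
    }
    return stack.back().return_value;
}
```

For n = 3 the sketch reaches a depth of three records and returns 6; for n = 6 it reaches six records and returns 720.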

int factorial (int n) {
    if (n < 2)
        return 1;
    else
        return n*factorial(n-1);
}

Recursive Function Call

Stack Activity for factorial(3), Fig. 9.7 (activation records are incomplete; link fields are represented by blank entries)

Step                            Stack contents (bottom to top)
First call                      n = 3
Second call                     n = 3 | n = 2
Third call, returns 1           n = 3 | n = 2 | n = 1
Second call returns 2*1 = 2     n = 3 | n = 2
First call returns 3*2 = 6      n = 3

if (n < 2) return 1;
else return n*factorial(n-1);

Stacks for Non-Recursive Functions

Consider the program from Figure 9.1: main calls A, A calls B.

The stack grows and shrinks based on the dynamic calling sequence.

On the next slide, we see the stack when B is executing.

As each function finishes, its AR is popped from the stack.

int h, i;
void B(int w) {
    int j, k;
    i = 2*w;
    w = w+1;
}
void A(int x, int y) {
    bool i, j;
    B(h);
}
int main() {
    int a, b;
    h = 5; a = 3; b = 2;
    A(a, b);
}

Figure 9.8: Run-Time Stack with Stack Frames for Method Invocations (Note: h shouldn’t be undefined; it is initialized when a and b are.)

Three versions of the stack: one after main() is called but before it calls A, one after A is called, one after B is called. Consider lifetime and scope. Any variable not in the current activation record must be in static memory to be in scope.

Passing an Argument by Reference Example

Suppose, in our sample program, w had been a reference parameter. Now, when A calls B and passes in h as a parameter, the address of h is copied onto the stack. The statement w = w + 1 will change the actual value of h.

◦ Static versus dynamic scoping
◦ Concrete syntax for functions

Extra Slides

Static links implement static scoping (nested scopes):
◦ In statically scoped languages, when B assigns to i, the reference is to the global i.

Dynamic scoping is based on the calling sequence, shown in the dynamic linkage.
◦ In dynamically scoped languages, when B assigns to i, the reference would be to the i defined in A (most recent in calling chain).

In either case the links allow a function to refer to non-local variables.

Static v Dynamic Scoping
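C++ itself is statically scoped, so the contrast can only be simulated. The sketch below models the run-time stack of the Figure 9.1 program at the moment B executes i = 2*w, as a vector of name-to-value frames (globals, main, A, B), and resolves i under each rule. Frame, the lookup functions and run_b_assignment are hypothetical names, and A's bool variables are stored as int 0 for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Frame = std::map<std::string, int>;

// Dynamic scoping: search from the most recent call back to the globals.
int& lookup_dynamic(std::vector<Frame>& frames, const std::string& name) {
    for (std::size_t k = frames.size(); k-- > 0;) {
        auto it = frames[k].find(name);
        if (it != frames[k].end()) return it->second;
    }
    return frames.front()[name];  // assumption: unknown names become globals
}

// Static scoping for this program: B is declared at the top level,
// so a name is either local to B or global.
int& lookup_static(std::vector<Frame>& frames, const std::string& name) {
    auto it = frames.back().find(name);
    if (it != frames.back().end()) return it->second;
    return frames.front()[name];
}

struct ScopingOutcome {
    int global_i;
    int a_local_i;
};

ScopingOutcome run_b_assignment(bool dynamic_scoping) {
    std::vector<Frame> frames = {
        {{"h", 5}, {"i", 0}},                      // globals
        {{"a", 3}, {"b", 2}},                      // main
        {{"x", 3}, {"y", 2}, {"i", 0}, {"j", 0}},  // A
        {{"w", 5}, {"j", 0}, {"k", 0}},            // B
    };
    int w = frames.back()["w"];
    int& target = dynamic_scoping ? lookup_dynamic(frames, "i")
                                  : lookup_static(frames, "i");
    target = 2 * w;                                // B's statement i = 2*w
    return {frames[0]["i"], frames[2]["i"]};
}
```

With static scoping the global i becomes 10 and A's i is untouched; with dynamic scoping A's i becomes 10 and the global i is untouched.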

Program → { Type Identifier FunctionOrGlobal } MainFunction
Type → int | boolean | float | char | void
FunctionOrGlobal → ( Parameters ) { Declarations Statements } | Global
Parameters → [ Parameter { , Parameter } ]
Global → { , Identifier } ;
MainFunction → int main ( ) { Declarations Statements }

Clite Concrete Grammar:Functions and Globals (new elements underlined)

Statement → ; | Block | Assignment | IfStatement | WhileStatement | CallStatement | ReturnStatement
CallStatement → Call ;
ReturnStatement → return Expression ;
Factor → Identifier | Literal | ( Expression ) | Call
Call → Identifier ( Arguments )
Arguments → [ Expression { , Expression } ]

Concrete Syntax Continued
