arrays and pointers

Arrays and Pointers Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 23


Post on 06-Jan-2016


Page 1: Arrays and Pointers

Arrays and Pointers

Prepared by

Manuel E. Bermúdez, Ph.D.
Associate Professor
University of Florida

Programming Language Principles
Lecture 23

Page 2: Arrays and Pointers

Arrays

• Most common composite data type.
• Semantically, viewed as a mapping from the index type to the element type.
• Some languages permit only integer as the index type; others allow any scalar.

Page 3: Arrays and Pointers

Array Declaration Syntax

C:

char upper[26]; /* array of 26 chars, 0..25 */

Fortran:

character, dimension (1:26) :: upper

Pascal:

var upper: array['a' .. 'z'] of char;

Ada:

upper: array (character range 'a' .. 'z') of character;

Page 4: Arrays and Pointers

Arrays and functions in Ada

Whether upper is declared as an array or as a function, upper('a') returns 'A'.

Page 5: Arrays and Pointers

Multi-Dimension Arrays

Ada:
matrix: array (1..10, 1..10) of real;

Modula-3:
VAR matrix: ARRAY [1..10],[1..10] OF REAL;
(same as)
VAR matrix: ARRAY [1..10] OF ARRAY [1..10] OF REAL;
and matrix[3,4] is the same as matrix[3][4].

Page 6: Arrays and Pointers

Multi-Dimension Arrays (cont’d)

In Ada,
matrix: array(1..10,1..10) of real;
is NOT the same as
matrix: array(1..10) of array (1..10) of real;

matrix(3)(4) is not legal in the first form;

matrix(3,4) is not legal in the second form.

Page 7: Arrays and Pointers

Multi-Dimension Arrays (cont’d)

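With an array of arrays, a single row can be referenced on its own as a slice.

In C, double matrix[10][10].

However, C integrates arrays and pointers, so in most expressions matrix[3] does not behave as an array of 10 doubles.

It converts to a pointer to the first element of row 3 of matrix (the fourth row, since C indexes from 0), that is, &matrix[3][0]. Dereferencing that pointer yields the value of matrix[3][0]. Only under sizeof or unary & does matrix[3] act as the whole row.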
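A minimal C sketch of how a row expression behaves in different contexts. The function names (row_size, row_decays_to_first_element, first_of_row) are illustrative and not part of the lecture.

```c
#include <assert.h>
#include <stddef.h>

static double matrix[10][10];

/* Under sizeof, matrix[3] is still a whole row: an array of 10 doubles. */
size_t row_size(void) {
    return sizeof matrix[3] / sizeof matrix[3][0];
}

/* In an ordinary expression, matrix[3] converts to a pointer to matrix[3][0]. */
int row_decays_to_first_element(void) {
    double *p = matrix[3];              /* array-to-pointer conversion */
    return p == &matrix[3][0];
}

/* Dereferencing the converted pointer yields the value of matrix[i][0]. */
double first_of_row(int i) {
    assert(i >= 0 && i < 10);
    return *matrix[i];
}
```

Calling row_size() reports 10 elements, while row_decays_to_first_element() confirms that the same expression yields a plain pointer once it is used as a value.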

Page 8: Arrays and Pointers

Slices in Fortran

Page 9: Arrays and Pointers

Array Dimensions, Bounds and Allocation

Five cases:
• Global lifetime, static shape: static allocation (easy).
• Local lifetime, static shape: space allocated on a stack frame.
• Local lifetime, shape bound at elaboration: stack frame needs fixed-size part and variable-size part.
• Arbitrary lifetime, shape bound at elaboration: Java, programmer allocates space.
• Arbitrary lifetime, dynamic shape: array lives on the heap.
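Several of these cases can be sketched in C. This is only an illustration: the names are made up, the variable-length array assumes a compiler with C99 VLA support, and realloc stands in for "dynamic shape".

```c
#include <assert.h>
#include <stdlib.h>

/* Global lifetime, static shape: allocated statically. */
static int global_table[8];
int *global_array(void) { return global_table; }

/* Local lifetime, static shape: fixed-size slot in the stack frame. */
int sum_fixed(void) {
    int local[4] = {1, 2, 3, 4};
    int s = 0;
    for (int i = 0; i < 4; i++) s += local[i];
    return s;
}

/* Local lifetime, shape bound at elaboration: a C99 variable-length array. */
int sum_vla(int n) {
    assert(n > 0);
    int local[n];
    int s = 0;
    for (int i = 0; i < n; i++) local[i] = i;
    for (int i = 0; i < n; i++) s += local[i];
    return s;
}

/* Arbitrary lifetime, dynamic shape: heap array whose size changes later. */
int *make_and_grow(size_t n, size_t bigger) {
    assert(n > 0 && bigger >= n);
    int *a = malloc(n * sizeof *a);
    if (a == NULL) return NULL;
    for (size_t i = 0; i < n; i++) a[i] = (int)i;
    int *b = realloc(a, bigger * sizeof *b);
    if (b == NULL) { free(a); return NULL; }
    for (size_t i = n; i < bigger; i++) b[i] = (int)i;
    return b;                           /* caller must free */
}
```

The heap case is the only one where the caller is responsible for releasing the storage.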

Page 10: Arrays and Pointers

Array allocation in Ada (shape bound at elaboration time)

Page 11: Arrays and Pointers

Conformant Arrays in Pascal

• Array shape determined at time of call.
• Pascal doesn’t allow local dynamic-shaped arrays.
• Ada DOES allow local dynamic-shaped arrays (see textbook).

Page 12: Arrays and Pointers

Other Forms of Dynamic Arrays

• C: Arrays are passed by reference, so bounds are irrelevant (the programmer’s problem)!
• Java strings:
  String s = "short";
  s = s + " and sweet"; // immutable: + creates a new String

Page 13: Arrays and Pointers

Resizing Arrays in Java

• Create a new array of proper length and data type:

  Integer[] a = new Integer[10];
  Integer[] newArray = new Integer[newLength];

• Copy all elements from old array into new one:

  System.arraycopy(a, 0, newArray, 0, a.length);

• Rename array:

  a = newArray; // old space reclaimed by garbage collector.

Page 14: Arrays and Pointers

Dynamic Arrays in Fortran 90

Arrays sized at run time, but can’t be changed once set.

Page 15: Arrays and Pointers

Classic Array Memory Layouts

Page 16: Arrays and Pointers

Memory Layout in C

Page 17: Arrays and Pointers

Address calculation (static array bounds)

Page 18: Arrays and Pointers

Virtual Location of Array

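With static array bounds, the constant part of the address calculation can be folded at compile time. In effect, we’ve “moved” the array (in all three dimensions, for a 3-D array) to a virtual location: the address that its element at index zero in every dimension would have.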
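The arithmetic behind a virtual location can be sketched as follows, assuming a row-major 3-D array. The shape3 type and the function names are hypothetical, and offsets are computed as byte counts relative to the array's real base rather than as actual pointers.

```c
#include <assert.h>

/* Hypothetical descriptor for a 3-D row-major array with static bounds. */
typedef struct {
    long lo[3], hi[3];   /* inclusive bounds of each dimension */
    long elem_size;      /* bytes per element */
} shape3;

/* Stride in bytes of dimension d: element size times the extents to its right. */
long stride(const shape3 *s, int d) {
    assert(d >= 0 && d < 3);
    long st = s->elem_size;
    for (int k = 2; k > d; k--) st *= s->hi[k] - s->lo[k] + 1;
    return st;
}

/* Direct form: subtract each lower bound at run time. */
long offset_direct(const shape3 *s, long i, long j, long k) {
    return (i - s->lo[0]) * stride(s, 0)
         + (j - s->lo[1]) * stride(s, 1)
         + (k - s->lo[2]) * stride(s, 2);
}

/* Constant part of the calculation; foldable when the bounds are static. */
long virtual_origin(const shape3 *s) {
    return -(s->lo[0] * stride(s, 0)
           + s->lo[1] * stride(s, 1)
           + s->lo[2] * stride(s, 2));
}

/* Same offset, computed from the virtual origin without any subtractions. */
long offset_via_origin(const shape3 *s, long i, long j, long k) {
    return virtual_origin(s)
         + i * stride(s, 0) + j * stride(s, 1) + k * stride(s, 2);
}
```

For bounds 1..10 in every dimension and 8-byte elements, the virtual origin lies 888 bytes before the real base, which is why it may fall outside the array's actual storage.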

Page 19: Arrays and Pointers

Dope Vectors

• A “run-time” descriptor for the array.
• Contains, for each dimension (except last one, always statically known):
  • Lower bound
  • Size
  • Upper bound (if dynamic checks are required)
• Size of dope vector depends on # of dimensions (i.e. static).
• Typically placed next to the array pointer, in the fixed-size portion of the stack frame.
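One possible shape for a dope vector, sketched in C for a 2-D array of double with run-time bounds. The dope2 type and its functions are assumptions made for illustration; a real compiler would lay this out in the stack frame rather than in a user-visible struct.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical dope vector for a 2-D array of double. */
typedef struct {
    double *data;        /* pointer to the array's storage */
    long lo[2], hi[2];   /* inclusive lower and upper bounds per dimension */
} dope2;

/* Fills in the descriptor and allocates zeroed storage; returns 0 on failure. */
int dope2_init(dope2 *d, long lo0, long hi0, long lo1, long hi1) {
    assert(d != NULL && hi0 >= lo0 && hi1 >= lo1);
    d->lo[0] = lo0; d->hi[0] = hi0;
    d->lo[1] = lo1; d->hi[1] = hi1;
    size_t n = (size_t)(hi0 - lo0 + 1) * (size_t)(hi1 - lo1 + 1);
    d->data = calloc(n, sizeof *d->data);
    return d->data != NULL;
}

/* Returns the element's address, or NULL if an index is out of bounds. */
double *dope2_at(const dope2 *d, long i, long j) {
    if (i < d->lo[0] || i > d->hi[0] || j < d->lo[1] || j > d->hi[1])
        return NULL;
    long cols = d->hi[1] - d->lo[1] + 1;
    return d->data + (i - d->lo[0]) * cols + (j - d->lo[1]);
}
```

The upper bounds are stored only so that dope2_at can perform the dynamic check; without checking, lower bounds and sizes would suffice.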

Page 20: Arrays and Pointers

Strings

• Usually an array of characters.
• Many languages allow more flexibility with strings than with other types of arrays.
• Single-character string vs. single character:
  • Pascal: no distinction.
  • C: *very* different.
• String constants: 'abc', "abc".
• Rules for embedding special characters:
  • Pascal: double the character: 'ab''cd'
  • C: escape sequence: "ab\"cd".
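A short C sketch of the difference between a character and a one-character string, and of an embedded escape sequence. The function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "a" is an array of two chars: the letter plus the terminating '\0'. */
size_t one_char_string_bytes(void) { return sizeof "a"; }

/* 'a' is a character constant, which has type int in C. */
size_t char_constant_bytes(void) { return sizeof 'a'; }

/* The escape \" embeds a double quote and counts as one character. */
size_t escaped_length(void) { return strlen("ab\"cd"); }

/* A string can only be compared with a char one element at a time. */
int starts_with(const char *s, char c) {
    assert(s != NULL);
    return s[0] == c;
}
```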

Page 21: Arrays and Pointers

Strings

• C, Pascal, Ada: string length bound no later than elaboration time (allocate in stack frame).

• Lisp, Icon, ML, Java: allow dynamically-bound strings, stored in the heap.

• Pascal supports lexicographically-ordered comparison of strings ('abc' < 'abd'). Ada supports it on all 1D discrete-valued arrays.

• C: no string assignment, elements copied individually (library functions).
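Because C has no string assignment or comparison operators, both are done through library functions. A minimal sketch follows; copy_string and less_than are hypothetical wrappers around the standard strncpy and strcmp.

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst (capacity cap), truncating if needed; always terminates. */
void copy_string(char *dst, size_t cap, const char *src) {
    assert(dst != NULL && src != NULL && cap > 0);
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';
}

/* Lexicographic order, like Pascal's 'abc' < 'abd', via the library. */
int less_than(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    return strcmp(s, t) < 0;
}
```

The caller has to supply the destination's capacity, since the array itself carries no bounds.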

Page 22: Arrays and Pointers

Strings in C

Page 23: Arrays and Pointers

Sets

Pascal supports sets of any discrete type:

var a, b, c: set of char;
    d, e: set of weekday;

a := b + c; (* union *)

a := b * c; (* intersection *)

a := b - c; (* difference *)

Page 24: Arrays and Pointers

Set implementations

• Arrays, hash tables, trees.
• Bit-vectors: each entry true (element in the set), or false (element not in the set).
• Efficient operations:
  • Union is inclusive bit-wise OR.
  • Intersection is bit-wise AND.
  • Difference is NOT, followed by AND.
• Won’t work for large base types:
  • A set of 32-bit integers needs 2^32 bits, about 500 MB.
  • A set of 64-bit integers needs 2^64 bits, about 2^41 MB.
• Usually limited to 128 or 512 elements.
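The bit-vector operations above map directly onto C's bitwise operators. This sketch assumes a base type of 0..63 so that one 64-bit word holds the whole set; the bitset type and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* A set over the base type 0..63, one bit per possible element. */
typedef uint64_t bitset;

bitset set_add(bitset s, unsigned e) {
    assert(e < 64);
    return s | (UINT64_C(1) << e);
}

int set_has(bitset s, unsigned e) {
    assert(e < 64);
    return (int)((s >> e) & 1u);
}

bitset set_union(bitset a, bitset b)        { return a | b; }   /* OR */
bitset set_intersection(bitset a, bitset b) { return a & b; }   /* AND */
bitset set_difference(bitset a, bitset b)   { return a & ~b; }  /* NOT, then AND */
```

Each operation costs a single machine instruction per word, which is what makes the representation attractive for small base types.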

Page 25: Arrays and Pointers

Pointers and Recursive Types

• Most recursive types are records.
• Reference model languages (Lisp, ML, Clu, Java): every field is a reference.
  • A record of type f contains a reference to another record of type f.
• Value model languages (C, Pascal, Ada): need a pointer (a variable whose value is a reference).
• Don’t confuse pointer with address: an address may be segmented.

Page 26: Arrays and Pointers

Storage Reclamation

• Explicit (C, C++, Pascal, Modula-2): programmer must reclaim unused heap space.
  • Can be done efficiently.
  • Easy to get wrong; if so, can lead to memory leaks.
• Implicit (Lisp, ML, Modula-3, Ada, Java): heap space reclaimed automatically.
  • Not so efficient (but getting better).
  • Simplifies programmer’s task a LOT.

Page 27: Arrays and Pointers

Reference Model (ML)

node (#"R", node (#"X", empty, empty),
            node (#"Y", node (#"Z", empty, empty),
                        node (#"W", empty, empty)))

Page 28: Arrays and Pointers

Reference Model (Lisp)

'(#\R(#\X()())(#\Y(#\Z()())(#\W()())))

Page 29: Arrays and Pointers

Value Model

• Pascal:
  type chr_tree_ptr = ^chr_tree;
       chr_tree = record
         left, right: chr_tree_ptr;
         val: char
       end;
• C:
  struct chr_tree {
    struct chr_tree *left, *right;
    char val;
  };
• In C, struct names are not quite type names. Shorthand:
  typedef struct chr_tree chr_tree_type;

Page 30: Arrays and Pointers

Memory Allocation

• Pascal: new(my_ptr);

• Ada: my_ptr:=new chr_tree;

• C: my_ptr=(struct chr_tree *) malloc(sizeof (struct chr_tree));

• C++, Java: my_ptr = new chr_tree(args);

Page 31: Arrays and Pointers

Pointer References

Pascal:
  my_ptr^.val := 'X';
C:
  (*my_ptr).val = 'X';
  my_ptr->val = 'X';
Ada:
  T: chr_tree;
  P: chr_tree_ptr;
  T.val := 'X';
  P.val := 'X';    (the same notation is good for a record or a pointer to one)
  T := P.all;      (if need to reference the whole record)
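Putting the value-model declaration, heap allocation, and pointer dereference together, here is a sketch in C. The struct matches the chr_tree declaration from the lecture; make_node, free_tree, count_nodes, and root_val are hypothetical helpers added for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *left, *right;
    char val;
};

/* Allocates one node on the heap; returns NULL if malloc fails. */
struct chr_tree *make_node(char val, struct chr_tree *left,
                           struct chr_tree *right) {
    struct chr_tree *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->val = val;                /* same as (*p).val = val */
    p->left = left;
    p->right = right;
    return p;
}

/* Explicit reclamation: free the children before the node itself. */
void free_tree(struct chr_tree *t) {
    if (t == NULL) return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}

int count_nodes(const struct chr_tree *t) {
    return t == NULL ? 0 : 1 + count_nodes(t->left) + count_nodes(t->right);
}

char root_val(const struct chr_tree *t) {
    assert(t != NULL);
    return t->val;
}
```

Nesting calls to make_node builds the same R, X, Y, Z, W tree that the reference-model slides express as a single ML or Lisp value.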

Page 32: Arrays and Pointers

Pointers and Arrays in C

int n;
int *a;
int b[10];

All are valid:

a = b;
n = a[3];
n = *(a+3);
n = b[3];
n = *(b+3);

Page 33: Arrays and Pointers

Pointers and Arrays in C (cont’d)

Interoperable, but not the same:

int *a[n] allocates n pointers.
int a[n][m] allocates a full 2D array.

In fact, assuming int a[n];
*(a+i)
*(i+a)
a[i]
i[a]
are all equivalent!
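Both claims can be checked with a few lines of C. The function names and the constants N and M are chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if all four spellings name the same element of a. */
int all_equivalent(int *a, int n, int i) {
    assert(a != NULL && i >= 0 && i < n);
    return &a[i] == &*(a + i)
        && &*(a + i) == &*(i + a)
        && &*(i + a) == &i[a];
}

enum { N = 3, M = 4 };

/* An array of N pointers: storage for the pointers only. */
size_t pointer_array_bytes(void) { int *p[N]; return sizeof p; }

/* A full 2-D array: storage for all N*M ints, contiguous. */
size_t full_2d_bytes(void) { int q[N][M]; return sizeof q; }
```

The odd-looking i[a] works because a[i] is defined as *(a+i) and addition is commutative.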

Page 34: Arrays and Pointers

Pointers and Arrays in C (cont’d)

• In C, arrays are effectively passed by reference: the array name converts to a pointer to its first element, and that pointer is what the function receives.

• It’s customary to pass the array name, and its dimensions:

double det (double *M, int rows, int cols)
{
  int i, j;
  double val;
  ...
  val = *(M + i*cols + j); /* M[i][j] */
}

Page 35: Arrays and Pointers

Tombstones

Technique for catching dangling references

Page 36: Arrays and Pointers

Tombstones

• Advantages:
  • Catch dangling references.
  • Prevent memory leaks.
  • Helpful in heap compaction.
• Disadvantages:
  • Cheap on the heap, expensive on the stack (procedure entry/return).
  • Tombstones themselves can dangle.
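A simplified sketch of the tombstone idea for heap objects, assuming every program pointer refers to the tombstone rather than to the object. The tombstone type and the ts_ functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tombstone: the one indirection cell shared by all pointers. */
typedef struct { int *target; } tombstone;

/* Allocates an int object plus its tombstone; pointers are tombstone*. */
tombstone *ts_new(int value) {
    tombstone *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->target = malloc(sizeof *t->target);
    if (t->target == NULL) { free(t); return NULL; }
    *t->target = value;
    return t;
}

/* Reclaims the object but keeps the tombstone, marking it dead. */
void ts_free(tombstone *t) {
    assert(t != NULL);
    free(t->target);
    t->target = NULL;
}

/* Checked dereference: returns 0 on a dangling reference, 1 on success. */
int ts_read(const tombstone *t, int *out) {
    assert(t != NULL && out != NULL);
    if (t->target == NULL) return 0;
    *out = *t->target;
    return 1;
}
```

Every copy of the pointer sees the reclamation at once, because they all go through the same cell. The tombstone itself must outlive the object, which is where the extra space cost comes from.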

Page 37: Arrays and Pointers

Locks and Keys

Page 38: Arrays and Pointers

Locks and Keys

• Advantage:
  • No need to keep tombstones around.
• Disadvantages:
  • Only work for heap objects.
  • Significant overhead.
  • Increase the cost of copying a pointer.
  • Increase the cost of every access.
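A sketch of locks and keys under simplifying assumptions: objects live in a small fixed pool so that a reclaimed slot can still be inspected safely, and keys are drawn from a counter. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { POOL_SIZE = 4 };

typedef struct { unsigned lock; int in_use; int value; } slot;
typedef struct { slot *addr; unsigned key; } keyed_ptr;  /* pointer = address + key */

static slot pool[POOL_SIZE];
static unsigned next_key = 1;       /* 0 is reserved for "no valid lock" */

/* Allocation stamps the object with a fresh lock and copies it into the key. */
keyed_ptr lk_alloc(int value) {
    keyed_ptr p = { NULL, 0 };
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].lock = next_key++;
            pool[i].value = value;
            p.addr = &pool[i];
            p.key = pool[i].lock;
            return p;
        }
    }
    return p;                       /* pool exhausted: addr stays NULL */
}

/* Deallocation invalidates the lock, so old keys no longer match. */
void lk_free(keyed_ptr p) {
    assert(p.addr != NULL && p.addr->lock == p.key);
    p.addr->lock = 0;
    p.addr->in_use = 0;
}

/* Every access compares key and lock first. */
int lk_read(keyed_ptr p, int *out) {
    if (p.addr == NULL || p.addr->lock != p.key) return 0;
    *out = p.addr->value;
    return 1;
}
```

The sketch makes both costs visible: each pointer is twice as wide, and each read performs a comparison before touching the object.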

Page 39: Arrays and Pointers

Reference Counts

• Set count to 1 upon object creation.
• Upon assignment:
  • Decrement count of object on left.
  • Increment count of object on right.

• Upon subroutine entry, increment counts for local pointers.

• Upon subroutine return, decrement counts for local pointers.

• Need type descriptors for this: objects can be deeply structured.

• WILL FAIL ON CIRCULAR STRUCTURES !
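The counting rules, and the way they break down on a cycle, can be sketched in C. The node type, the rc_ functions, and the live_nodes counter (present only to make the leak observable) are all assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int refs;              /* number of pointers to this node */
    struct node *next;     /* a counted reference to another node */
} node;

static int live_nodes = 0; /* instrumentation only */

node *rc_new(void) {
    node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->refs = 1;           /* count starts at 1 for the creating pointer */
    n->next = NULL;
    live_nodes++;
    return n;
}

/* Drops one reference; reclaims the node when the count reaches zero. */
void rc_release(node *n) {
    if (n == NULL) return;
    assert(n->refs > 0);
    if (--n->refs == 0) {
        rc_release(n->next);   /* drop the reference this node holds */
        free(n);
        live_nodes--;
    }
}

/* Assignment *dst = src: count the right side, store, then drop the old left. */
void rc_assign(node **dst, node *src) {
    assert(dst != NULL);
    node *old = *dst;
    if (src != NULL) src->refs++;
    *dst = src;
    rc_release(old);
}
```

If two nodes point at each other and every outside pointer is released, each count stays at 1 and neither node is reclaimed, even though the program can no longer reach them.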

Page 40: Arrays and Pointers

Reference Counts Fail on Circular Structures

Page 41: Arrays and Pointers

Garbage Collection

• System determines which memory is not in use and returns that memory to the pool of free storage.
• Done in two or three steps:
  • Mark nodes that are in use.
  • Compact free space (optional).
  • Move free nodes to storage pool.

Page 42: Arrays and Pointers

Marking

• Unmark all nodes (set all mark bits to false).
• Start at each program variable that contains a reference, follow all pointers, mark nodes that are reached.

[Figure: nodes c, a, e, d, b in memory, with the program variable firstNode pointing into them]
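The marking steps, followed by returning unmarked nodes to the pool, can be sketched on a toy heap. This assumes a fixed array of nodes with a single pointer field each; the names are hypothetical, and a real collector would also need type information to find every pointer.

```c
#include <assert.h>
#include <stddef.h>

enum { HEAP_NODES = 5 };

typedef struct gc_node {
    int marked;
    int free;                    /* 1 once returned to the storage pool */
    struct gc_node *next;        /* the only pointer field in this toy node */
} gc_node;

static gc_node heap[HEAP_NODES];

/* Test helper: make node `from` point at node `to`. */
void link_nodes(int from, int to) {
    assert(from >= 0 && from < HEAP_NODES && to >= 0 && to < HEAP_NODES);
    heap[from].next = &heap[to];
}

/* Step 1: clear every mark bit. */
void unmark_all(void) {
    for (int i = 0; i < HEAP_NODES; i++) heap[i].marked = 0;
}

/* Step 2: follow pointers from one root, marking each node reached. */
void mark_from(gc_node *root) {
    for (gc_node *p = root; p != NULL && !p->marked; p = p->next)
        p->marked = 1;
}

/* Step 3: hand every unmarked node back to the pool; returns how many. */
int sweep(void) {
    int reclaimed = 0;
    for (int i = 0; i < HEAP_NODES; i++) {
        if (!heap[i].marked && !heap[i].free) {
            heap[i].free = 1;
            heap[i].next = NULL;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Stopping at an already marked node is what keeps the traversal from looping forever on circular structures, the case that defeats reference counts.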

Page 43: Arrays and Pointers

Compaction

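Move all marked nodes (i.e., nodes in use) to one end of memory, updating all pointers as necessary.

[Figure: node layout before and after compaction; after compaction the in-use nodes reachable from firstNode sit together at one end, followed by Free Memory]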

Page 44: Arrays and Pointers

Lists in Lisp and ML

Page 45: Arrays and Pointers

Equality Testing and Assignment

• Equality comparison is easy for scalars.
• For complex or abstract data types, say, strings s and t, s = t could mean:
  • s and t are aliases
  • s and t occupy the same storage
  • s and t contain the same sequence of characters
  • s and t print the same

Page 46: Arrays and Pointers

Deep and Shallow Comparisons

• Shallow Comparison:
  • Both expressions refer to the same object.
• Deep Comparison:
  • Expressions refer to objects that are “equal” in content somehow.
• Most PLs use shallow comparisons, and shallow assignments.
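For C strings the two kinds of comparison look like this. The function names shallow_equal and deep_equal are illustrative; the deep version relies on the standard strcmp.

```c
#include <assert.h>
#include <string.h>

/* Shallow: do both expressions refer to the same object? */
int shallow_equal(const char *s, const char *t) {
    return s == t;
}

/* Deep: do the two objects hold the same sequence of characters? */
int deep_equal(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    return strcmp(s, t) == 0;
}
```

Two separate arrays that both contain "abc" are deep-equal but not shallow-equal, while an alias of either one is both.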
