CS 331, Principles of Programming Languages Chapter 4 Types: Data Representation

Upload: diana-wilkerson

Post on 17-Dec-2015


Page 1: CS 331, Principles of Programming Languages Chapter 4 Types: Data Representation

CS 331, Principles of Programming Languages

Chapter 4

Types: Data Representation


Data Representation Issues

• Storage management
  – automatic, static

• Scope

• Internal vs. External


Type Concepts

• Instances of a type
  – individual constants, variables, expressions

• Basic types
  – built into a language, e.g. float and int in C
  – instances of basic types are also known as first-class objects

• User-defined types
  – defined using type expressions involving existing types


Basic and User-defined Types

• In most modern languages, basic types include integer, real, character, and Boolean

• Most PLs have two mechanisms for user-defined types, namely array and record


Internal vs. External Representations

• Integers in C are not the same as integers in arithmetic
  – word length introduces issues related to overflow, e.g. a 16-bit int covers only -32768..32767
  – so some operations don’t act as they should

• Floats and doubles are not the same as real numbers (or rationals!)
  – conversion to binary introduces small errors


Storage Classes

• Typically, automatic variables are associated with a given block
  – allocated (and initialized) when block is entered
  – known only within that block
  – freed when block is exited

• Compare to static variables, which are associated with a given block, but are allocated and initialized only once


User-directed Allocation

• In C, we have malloc and free
  – memory leaks can be a problem
  – uninitialized or invalid pointers can be problems

• In C++ we have new and delete, which invoke constructors and destructors

• In Pascal and Modula, we have new() and dispose()

• In Lisp and Java, garbage collection is used


Data Aggregates

• Arrays are homogeneous
  – all the elements are of the same type

• Records (known as structs in C) are heterogeneous
  – components may be of different types

• Sets are available in languages like Pascal and Modula

• Type expressions are used to define these


Storage Allocation for Arrays

• Typically, array elements occupy a contiguous block of memory
  – hence C’s use of subscript 0

• For multi-dimensional arrays, there are two schemes
  – row-major: the rightmost subscript varies fastest
  – column-major: the leftmost subscript varies fastest


Records

• An example in C

    struct date {
        int day;
        int month;
        int year;
    } DoB, DoD;

• In Modula, it would be

    TYPE date = RECORD
        day, month, year: CARDINAL
    END;
    VAR DoB, DoD: date;


linuxbeta.gl.umbc.edu> cat -n recordeg.mod
     1  MODULE recordeg;
     2
     3  FROM InOut IMPORT WriteLn, WriteCard;
     4
     5  TYPE date = RECORD
     6      day, month, year: CARDINAL
     7  END;
     8  VAR DoB, DoD: date;
     9
    10  BEGIN
    11      DoB.year := 1808;
    12      DoD.year := 1865;
    13      WriteCard(DoB.year,10);
    14      WriteCard(DoD.year,10);
    15      WriteLn;
    16  END recordeg.
linuxbeta.gl.umbc.edu> source modulasetup
linuxbeta.gl.umbc.edu> gpmodula recordeg.mod
linuxbeta.gl.umbc.edu> build recordeg
linuxbeta.gl.umbc.edu> recordeg
      1808      1865
linuxbeta.gl.umbc.edu>


Varying Records in C

• Note that foo.data.I or foo.data.F are defined, but not both

• Allows data of different types to share storage

• Polymorphic data types!

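    #include <stdio.h>

    enum field_type { integer, real };

    struct {
        enum field_type ft;    /* tag: says which member of data is valid */
        union {
            int I;
            float F;
        } data;
    } foo;

    int main(void)
    {
        foo.ft = integer;      /* set the tag together with the matching member */
        foo.data.I = 42;

        if (foo.ft == integer)
            printf("%d", foo.data.I);
        else if (foo.ft == real)
            printf("%f", foo.data.F);
        else
            printf("error");
        return 0;
    }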


Varying Records in Modula

    TYPE field_type = (integer, real);
    TYPE foo_type = RECORD
        CASE ft: field_type OF
            integer: I: INTEGER |
            real:    F: REAL
        END (* of CASE *)
    END; (* of RECORD *)
    VAR foo: foo_type;

    IF foo.ft = integer THEN
        WriteInt(foo.I, 10)
    ELSIF foo.ft = real THEN
        WriteReal(foo.F, 10)
    ELSE
        WriteString("error")
    END;


Sets as Types

• There are situations where sets come in handy
  – when only certain data values are allowed, e.g. program options or file permissions, and no existing subrange type is appropriate
  – [Mon..Sun] is not a set (Sethi book has a mistake on p. 123)


Example: Sets in Modula or Pascal

• Operations of membership (IN), union (+), intersection (*), and set difference (-) are common

• Commonly implemented as bit strings

    TYPE colors = (white,yellow,blue,green,cyan,black,red);
    (* note that white < yellow < blue < … < red *)
    VAR CRT1, CRT2: SET OF colors;
    VAR testColor: colors;

    CRT1 := {cyan,yellow,green};
    CRT2 := {red,green,blue};
    IF (testColor IN CRT1) THEN WriteLn;


Type Coercion

• PLs differ in their approach to type coercion
  – coercion refers to automatic conversion from one type to another
  – if a PL is strongly typed, then coercion is restricted and/or explicit
  – if a PL is weakly typed, then coercion is taken care of by the compiler, which might cause errors
  – in recent years, the trend is towards strong typing


Determining an Object’s Type

• For static or automatic objects it’s easy
  – static float x; int y;

• For other objects it can be hard

    if (x)
        float *f = new float[250];
    else
        char *f = new char[1000];
    … /* other statements */
    /* the following statement must have a semantic error, but at compile
       time C++ can't tell which one is wrong since it can't know x in advance */
    cout << sqrt(f[0]) << strlen(f);


Static and Dynamic Checking

• Type checking is needed to make sure operations are well-defined on the objects to which they’re being applied

• Static type-checking is done once, typically at compile-time

• Dynamic type-checking is done whenever an operation is applied to an object whose type could not be determined in advance


How is Dynamic Type-Checking Done?

• When an object is created, a “tag” is attached to that object to indicate its type

• When that object is involved in some operation, the tag is checked to make sure that the operation is defined on such objects

• A hassle in terms of storage (every object carries a tag) and execution time (every operation checks it)

• Smalltalk and other O-O PLs do this, but not C++