cs461 lecture week 6 - msu - department of computer...

Lecture week 64.1 Data Types – data, types and operations usually built-into language

1) investigate properties that define data objects2) typical types in most language (and Java)

a) elementary = atomic; manipulated as a unitb) structured = aggregate of other types

4.1.1 Data Object: run time grouping of one or more pieces of data in a virtual computera) programmer defined: variables, constants, arrays and filesb) system defined: run-time stack, activation records, file buffers, etc.

Data Object:- represents a container for data value(s)- has 6 attributes

1) lifetime2) type (associated with set of data values)3) location (in memory; not modifiable by programmer; may be changed by storage mgmt)4) value (usually result of an assignment op)5) name (set up in declarations; may be modified by subroutine calls/returns)6) component (belongs to another object? Often represented by ptr – modified by ptr change)

variables and constants

variable- binds to name and type at compile time- simple variable: elementary data object with a name- may be case sensitive/insensitive- binds to location at run time (unless Fortran)- binds value at run time, so value may be changed (usually by assignment stmts)

constant- binds to name, type and value at compile time- binds to location at run time (may not be changed, SUPPORTS RELIABILITY PRINCIPLE)

Example in ADA: CurrentSize: integer := 0; MaxSize: CONSTANT integer := 500;

Example in C++: int currentSize = 0; const int maxSize = 500;

Example in Java: class variable: static int currentSize = 0; instance variable: int currentSize = 0; constant: public static final int maxSize = 500;

Constant literal- name is written representation of value- “21” is representation of constant which is data object with value of 21

Persistent DataLifetimes of vars determined by execution time; data may have longer lifetime.Persistent language could declare vars whose lifetime extends beyond prog execution.Ex: airline reservation systemPersistent languages are an active research area.

Design Issues for variables1) permit aliasing? And how? A=attributes of variable V=value of variable R=reference NOTE: Java is in wrong place in diagram.

2) dereferencing (take reference and return its value)X := .X + 1 (BLISS)Context (Pascal and Ada and Java)

*Xaddr &X and context in C++ and Java*Xaddr (value of variable located at address that follows so X is a pointer)&X (returns address (X integer or any built in type)

Design issues for Constants (4)1) use only simple types for constants or can you have constant arrays and records?2) expressions used to assign constant a value3) when is value assigned? @compile time or @block entrance4) any predefined constants

Pascal example:1) only simple types2) no expressions3) @block entrance4) false, true, maxintOTHER: constants must be before all other declarations

Ada example:1) array and record constants as well2) expressions and functions may be used to assign constant a value3) run – time4) yes, maxintOTHER: just use word CONSTANT in declaration

X: constant INTEGER := 17Y: INTEGER := 17

C++/ Java example:1) array and record constants as well 1) simple types only2) yes 2) yes3) run time 3) compile, if possible otw runtime4) yes, in libaries (like math.h for pi, etc.) 4) yes

4.1.2 Data Types

Data Types – class of data objects together with a set of operations for creating them and manipulating them

Every language – set of primitive data types– may provide ability to produce programmer-defined data types

LOOK AT SPECIFICATION/IMPLEMENTATION/SYNTAX ISSUES

4.1.3 Specification of Elementary Types- elementary data object contains a single data value- class of such data objects with operations is an elementary data type- most languages: integer, real, character, boolean, enumerated and pointer

1) attributes: basic attributes such as type and name usually invariant during lifetimesome may be stored in a descriptor (or dope vector)ex. Array: #dimensions, index range, component type

2) values- type determines set of possible values- and in elementary data types they are ordered- ex: C has int, short, long, char

3) operations- determine how data objects of a given type can be manipulated- concerned with primitive operations, but can be programmer-defined- ideally, op is a math function: given input arguments it will uniquely determine result

- domain : set of possible input arguments- range: set of possible results

- arity of op = number of args: binary (or dyadic), unary (or monadic)

Signature of an operation (informal description /also needs implementation) opname: arg type x arg type x …arg type result type (= C function prototype) Ex. + : integer x integer integer ==: integer x integer Boolean

SQRT: real real

4 main factors for obscured specifications

1) ops undefined for certain inputs (ex. Underflow errors; SQRT of negative numbers)2) implicit arguments (ex. Use of globals); makes it hard to determine all data that may affect result3) side effects (ex. Modify input args as well as return)4) self-modification (ex. Random generator - of data and of code; LISP may modify code)

SubtypesInherit ops from parent typesForerunner of OOP inheritance

4.1.4 Implementation of Elementary Data Types

1) storage representation- use underlying hardware storage rep- software simulated (less efficient)- attributes may be determined by compiler for efficiency – not stored in runtime rep (as in C)- attributes may be stored in descriptor for flexibility (PROLOG, LISP)- rep is usually independent of location in memory

- usually described in terms of size & layout of atts & data values within block of memory- address of 1st word or byte represents location

2) operation implementations- directly use built-in hardware op- subprogram (ex. sqrt)- in-line code sequence (like macros; ex: abs(x) = if x < 0 then –x else x)

4.1.5 Declarations Program statement communicates to compiler about name and type of data objects .- may also indicate lifetime (how? By placement in program)

ADA: a, b: integer; explicit declaration x: float; bind names and type Ada assigns default values, if not init (0’s, blanks, etc.)

C++/Java: int a, b; explicit declaration float x; bind names and type

C++ doesn’t assign default values, if not init (uses leftover)Java assigns default values

FORTRAN has implicit declarations (vars starting w/ I-N) are integer.PERL: assigning a value to a variable declares it

Ex: $abc = ‘a string’; $abc = 7

Declarations of ops- compiler needs signature- no explicit declaration needed for primitives- programmer-defined ops need “prototype” or “heading”

ADA: function f (p1 in: integer) return float isSignature: f: integer float

C++/Java: float f (int p1);Signature: f: integer float

Purposes for Declarations1. choice of storage reps by translator2. storage management (dynamic/static)3. polymorphic ops (overloaded ops resolved; Smalltalk has no type declarations; must decide which + to

use each time it is encountered)4. type checking (allow static checking)

4.1.6 Type checking and Type conversionBit sequence in hw: Is it an integer or float or char? Or instruction? Hw can’t check type

Type Issues (2)

1) Type checking- each op checked to receive proper number of args and types- static vs. dynamic type checking- Advantages of static type checking

- Polymorphic op replaced by type-specific- All paths checked

- Advantages of dynamic type checking (Lisp, Prolog)- Flexibility in program design (type of variable can change during runtime)

- Disadvantages of dynamic type checking- Difficult to debug- Extra storage for keeping type information during execution- Must be in software / slow

- Concern for static type checking affects many aspects of lang: declarations, data control structures, provisions for separate compilation of subroutines, etc. Usually some things cannot be checked statically. Two ways to handle:- Dynamic type checking- Leaving ops unchecked

Strong typing – detect all type errors statically; few languages are truly strongly typed

2) Type inference- ML infers data types if interpretation is unambiguous: p166, 4th ed

3) Type conversion or coercion- mismatch of types

- flag as error- coercion

- explict conversion- Pascal i := round (x);- C++ /Java: r = (float) (i);

- implicit conversion- Pascal r := 2/3; 2 implicitly converted and does real division- C++ r = 2/3; 2 and 3 not implicitly converted and does integer division

- two types of coercions- no info lost/widening short int long int- narrowing 1.0 1 OK, IN THIS CASE- ADA: almost no coercions provided avoid tedious detail- C++ : coercions are the rule mask programming errors

AssignmentBasic op for changing the binding of a value to a data object- this is a side effect- In Pascal, assignment does not return a value: assignment (:=): integer1 x integerx void- In C and Lisp, it returns a value – a data object holding a copy of the value assigned: assignment (=):

integer1 x integer2 integer3- Define r-value as righr hand side of expression (= value contained in data object)- Define l-value as left hand side of expression (= location of data object)- Assignment op is then:1. Compute l-value of 1st operand expression2. Compute r-value of 2nd operand expression3. Assign computed r-value to computed l-value object4. Return computed r-value as the result of the operation

- C contains a set of unary operators for manipulating l-values and r-valuesMany “useful, and strange, manipulations of such assignments”Dual use of assignment op, as mech to change data object’s value and as a function that returns a value

is heavily exploited by C- An r-value may be the l-value of some other data object (ex: A=B, where A & B are ptrs; seeFig 5.2)

Semantics of AssignmentA 2 + 3 Does this mean assign ‘5’ to A, or assign the operation ‘2+3’ to A? This is unambiguous if language has static types – A is either type int or op. But it can be ambiguous if lang is dynamically typed. In Prolog, ‘is’ means assign the value, while ‘=’ means assign the pattern. So the following clause 1 succeeds, because X is first assigned the value 5 & its type is set to int. The second clause fails because the two parts are different.

1. X is 2 + 3, X = 5.2. X = 2 + 3, X = 5.

4.2 Elementary Data Types

4.2.1 Numeric

Integers Specification- set of integer values ordered subset within finite bounds (MININT..MAXINT)- ops: Binary ops: integer1 x integer2 integer3

ADA: + , -, *, / mod rem **C++/Java: + , -, *, / , %, pow (library)

Unary ops: integer integerADA: -, + sign abs notC++/Java: -, + sign abs(math.h), !, ++, --

relops: integer x integer booleanADA: /= , =, <=, >=, <, > in not inC++/Java: != , =, <=, >=, <, >

Implementation- hardware-defined storage rep- hardware-arithmetic/relational primitive ops

Example: ADA: a: integer;

type Tresult is digits 7; x : Tresult; number of digits can be specified in ADA C++, Java: int a;

Integer issues1) machine implementation reliance ADV: speed DISADV: portability2) predefined MaxInt and MinInt: ADA: Yes C++: Yes, INT_MAX and INT_MIN (include limits.h)

Java: yes, MAX_VALUE, MIN_VALUE, POSITIVE_INFINTY, NEGATIVE_INFINITY

3) See p 174 for examples of integer storage reps:a. Has no descriptor; ok where lang provides declarations & static type checking for

intsb. Stores descriptor in separate mem loc, w/ptr to full word int value; often used in

LISPDisadv: May double the storage required for a single intAdv: Value is stored using built-in hw rep, so hw arith can be used

c. Stores descriptor & val in single mem loc – shortens size of val to provide desc spaceAdv: Conserves spaceDisadv: Before doing hw arith, must clear desc, do arith, then put desc back – so arith is inefficient. This method is only practical for hw-implemented type descriptors (not common)

4) Subranges (A : 1..10 in Pascal or A : integer range 1..10 in Ada; in C programmer decides size of storage: char, short, int, long).Subranges have 2 important effects on imps:

1. Smaller storage requirement. But subrange vals are often stored as smallest num of bits for which hw implements arith ops (usually 8 or 16 bits).

2. Better type checking.Ex: Month: 1..12 then Month := 0 is invalid and can be caught at compile timeBut some checks must still be done at run time, for ex: Month := Month + 1. In this case, bounds must still be available at run time.

Floating Point numbersSpecification- set of ordered real values- ops: same as with integers, except some boolean ops may be restricted due to roundoff errors (== test)- other ops cos : real real max : real real

Implementation- hardware-defined storage rep- hardware-arithmetic/relational primitive ops (some may be software-simulated)

Example: ADA: x : float;

type Tresult is float range -1.0E12..1.0E12; range can be specified in ADA answer : Tresult;

C++, Java: float x; double y;

Fixed point real numbers (example: dollar amounts) Specification: avoids round off errors Implementation: Hardware or software simulated Stored as ints, with decimal point an attribute of the data object

Ex: X = 103.421; r-val is 103421, and X has an att Scale Factor of 3. So:value(X) = rvalue(X) * 10-SF

Must keep track of scale factors in arith. Ex: Product = rval(X) * rval(Y); SF = SF(X) + SF(Y)

Example: COBOL: X PICTURE 999V99 ADA: float point type Tresult is float range -1.0E12..1.0E12;

Fixed point type TsmallAmt is delta 0.01 range –100.00..100.00;

MORE numeric issues1) mixed mode arithmetic handling?2) Coercion implicit?3) Can specify number of digits or range in #?

ADA: C++: Java:1) Yes 1) yes 1) yes 2) very little 2) yes 2) very little3) yes 3) no 3) no

Enumerations Specification – ordered list of distinct values; defines literal names for values and ordering Implementation – storage rep as integers /uses only enough bits Examples: ADA: C++:type classification is (fresh, soph junior, senior); typedef classification={fresh,soph,junior,senior); c: classification; classification c; Ops: relational, assignment, succ, pred, ord ops: relational, assignment, castingclassification’ord (fresh) will return 0

Issues:1) same member in different enumerations in same scope

Example: type bread is (fresh, stale); 2) permit ordering?

ADA: C++: Java: no enums1) yes 1) no2) yes 2) yes

Boolean Specification – data objects consist of one of two values Implementation – hardware single bit sometimes/byte or word 0 – false 1 – true (any other value other than 0)

Examples: ADA: found:boolean; Ops: and, or, xor, and then or else, not C++: not built-in, must simulate using define statements or enums C++5.0: bool Java: boolean

Ops: &&, ||, ! ops: &&, |, !, ^ (XOR), ?:

Issues1) ordering? False < true?2) Can true/false be redefined

ADA: C++: Java:1) Yes 1) yes 1) no2) No 2) yes 2) yes

4.2.4 Characters Specifications: data objects have a single character value, ordering determined by collating sequence Implementation: hardware representation/primitives for I/O purposes Operations: relational ops, assg, member, ord, chr

Examples: ADA: a:character; C++/Java: char a;

Issues:1) Is char variable compatible with string of 1 char?

ADA: yesC++: noJava: no

Subtypes Data type part of a larger class Operations available to larger class also available to sub type

ADA: subtype Tscore is integer range 0..100; C++/Java: none

Issue:1) can variables of subrange type be intermixed with “parent” type?

ADA: yesC++/Java: n/a

4.1.6 Assignment and Initialization

Most operations for common elementary data types take 1 or 2 args, simple arithmetic, relationals ops Assignment more subtle.

Example: Pascal, Ada : assignment (:=) integer x integer voidC++: assignment (=) integer x integer integer

X := X + 1 (first X is an Lvalue, second is Rvalue)Lvalue or Rvalue can be resolved by context (implicit) or explicit

Design issues1) Which is evaluated first?

Suppose we have an array A and a function P(u)A[x] := P(x); What if x is a reference parameter and it is changed by function P?

2) Multiple assignments n1, n2 := exp1, exp2; n1 exp1 and n2 exp2 p, q := q, p; p q q ? new p or old p?

3) Assignment to array/records

ADA:1) Lvalue first2) ?3) Yes, but only at declaration. Example: A: array(1..3) of float := (17.2, 20.4, 23.6);

Pascal:1) Lvalue2) No3) Yes , can do A = B;

C++:1) evaluates Lvalue first2) no3) Yes, but only in declaration. Example: int a[2]= {1, 2};

Java:1) Lvalue2) No3) only in declaration Example: int a[ ] = {1, 2};

can use clones Ex. Vector b = new vector; c = b.clone ( );

cs461 lecture week 6 - msu - department of computer...

Documents