2 programming language concepts using c and c++_data level structure - wikibooks, open books for an...

Upload: paras-dorle

Post on 02-Jun-2018


  • 8/10/2019 2 Programming Language Concepts Using C and C++_Data Level Structure - Wikibooks, open books for an open


    Programming Language Concepts Using C and

    C++/Data Level Structure

    In this chapter, we will start with defining properties common to all data items usable in a programming context and

    then move on to classifying data according to their structure and type. While doing so, we will also try to give an idea

    of how they can be laid out in memory.

    Contents

    1 General Properties

    1.1 Mutability

    1.1.1 Immutable Data

    1.1.2 Mutable Data

    1.2 Visibility

    1.3 Accessibility

    2 Data Categories

    2.1 Data Elements

    2.1.1 Primitive Data Elements

    2.1.2 Data Elements that Are Addresses

    2.1.3 Compound Data Elements

    2.2 Structures

    2.2.1 Data Structures

    2.2.2 Storage (Memory) Structures

    2.2.3 File Structures

    3 Data Types

    3.1 Relationships between Data Types

    3.1.1 Equivalence

    3.1.2 Extension

    3.1.3 Implementation

    3.2 Declarations

    3.2.1 Explicit Type Declarations

    3.2.2 Implicit Type Declarations

    3.3 Scalar Data Types

    3.3.1 Numeric Types

    3.3.2 Logical (Boolean) Type

    3.3.3 Pointer Type

    3.4 Structured Data Types

    3.4.1 Strings

    3.4.2 Arrays

    3.4.2.1 Multidimensional Arrays

    3.4.2.2 Associative Arrays

    3.4.3 Lists

    3.4.3.1 Multilists

    3.4.4 Dynamic Arrays

    3.4.5 Records

    3.4.5.1 Variant Records

    3.4.5.2 Variable-length Records

    3.4.5.3 Alignment Requirements

    3.4.6 Sets

    3.4.7 Trees

    3.4.8 Graphs

    3.4.9 User-defined Types

    3.4.10 Abstract Data Types

    gramming Language Concepts Using C and C++/Data Level Structure... http://en.wikibooks.org/wiki/Programming_Language_Concepts_Using...

    30 20-11-2014 13:00


    4 Notes

    General Properties

    Mutability

    Immutable Data

    A constant is a data item that remains unchanged throughout its lifetime. A constant may be used literally or may be
    named. Named constants are sometimes termed symbolic constants or figurative constants.

    Literal and named constants.

    3, '5', .TRUE., and "String literal" are examples of literal constants, whereas const double pi = 3.141592654; is the C++ definition of [an approximation to] π as a named constant.

    Some programming languages make a distinction between constants whose values are determined at compile-time

    and those whose values are determined at run-time. In C#, for example, the former is tagged with the keyword const while the latter with readonly. In Java, constancy of a field (or local identifier) is flagged with the final
    keyword, and presence or absence of the static modifier classifies the associated data item to be a compile-time or
    run-time constant, respectively.

    Example: Compile-time constants in C#.

    public class Math {
      ...
      public const /* static final in Java */ double pi = 3.141592654;
      public const double e = 2.718281;
      ...
    }

    Here the values of pi and e (Euler's number) can be determined even before the program starts to run. So they are defined to be const.

    Note that the same definitions are valid for all instances, if there are any, of the Math class. That is, the value of pi or e does not change from one instance to another. As a matter of fact, they exist independently of the instances as class
    fields. In other words, they are [implicitly] static. Explicit use of static together with const in C++, which is a source of inspiration for C#, is a manifestation of this.

    Example: Run-time constants in C#.

    public class Citizen {
      ...
      Citizen(..., long SSNoftheNewCitizen, ...) {
        ...
        SSN = SSNoftheNewCitizen;
        ...
      }
      ...
      private readonly long SSN; /* private final long SSN in Java */
    } // end of class Citizen

    Here, the value of SSN cannot be determined before the related Citizen object is created. Once the object is created,


      {
        double i1;
        double d;
        ...
      } /* end of inner block */
      ...
    } /* end of outer block */
    ...

    In the above code fragment, the scopes of the variables (identifiers) are:

    i1 (of outer block), i2: from its declaration point to the end of the outer block
    i1 (of inner block), d: from its declaration point to the end of the inner block

    The visibilities of the variables are:

    i2, i1 (of inner block), d: same as their scope
    i1 (of outer block): its scope minus the inner block

    Accessibility

    A notion found in modular and object-oriented languages, accessibility constraints can be applied on data. This is

    done for two purposes:

    1. To preserve the consistency of the data, and

    2. To give the implementer the freedom of changing the implementation details.

    Example: Access constraints in Java.

    public class Stack {
      ...
      private int _top;
      private Object[] _contents;
    } // end of class Stack

    Declaring the implementation details in the above class definition as private marks them off-limits to the user,
    which means the statements below are not permitted.

    stk._top = 0;
    stk._contents[7] = new Integer(5);

    Issuing the first statement would probably cause a nonempty Stack object to look empty. Similarly, the second statement would probably populate the Stack object by adding a new element to some location other than that indicated by _top, which is definitely against the definition of a stack.

    Additionally, such a definition enables the implementer of the Stack class to change the implementation details. For instance, she may choose to use a Vector as the underlying data structure. Since this choice is not known to the outside world, the users of the Stack class will not be affected by this decision.

    Data Categories

    Data Elements

    The most basic data entity is the data element. These data entities may be grouped together to form structures.

    Primitive Data Elements


    Primitive data elements are those that can be directly operated on by machine-language instructions and are broadly
    divided into numeric, character, and logical (boolean). Numeric data elements can further be divided into integers and
    reals.

    Example: Primitive data elements in C/C++.

    In C/C++, char, short, int, long, and long long are used for representing integers. char can also be interpreted as holding a single-byte character. float, double, and long double represent floating-point
    values of single, double, and extended precision. bool is used to represent boolean values.[1]

    Data Elements that Are Addresses

    An address is a value that indicates a location in the process image created as a result of running the program.
    Depending on the program segment it points to, an address falls into one of two groups.

    1. A label is actually the address of a program statement, an address in the code segment of a program. It may also
    be thought of as a data element that may be operated upon by a goto operation.

    2. Pointers reference or point to other elements in the code and data segments. In the former case, a pointer
    indicating the start of a subprogram can be used to invoke the subprogram dynamically, which enables implementation of polymorphic subprograms by means of callbacks. Pointers into the data segment often refer
    to unnamed data elements. They are heavily used in constructing dynamic data structures and recursive
    algorithms. Handles, which may be considered as intelligent pointers, serve the same purpose.

    Compound Data Elements

    Strings of characters, generally referred to simply as strings, are linear sequences of characters. They are sometimes
    considered primitive data elements, because they are directly operated on by machine-language instructions, and
    sometimes classified as data structures.

    Structures

    Data Structures

    Data structures are organized collections of data elements that are subject to certain allowable operations. Data
    structures are logical entities in the sense that they are created by programmers and are operated on by high-level
    programs. These may have little bearing on the physical entities, that is, the storage structures operated on by
    machine-language code.

    Data structures may be classified as:

    Linear vs. nonlinear: A linear, as opposed to nonlinear, data structure is one in which the individual components form an ordered sequence. Examples of linear data structures are strings, arrays, and lists. Examples
    of nonlinear data structures include trees, graphs, and sets.

    Static vs. dynamic: A static structure is one that has no capacity for change, specifically with regard to its size,
    during the course of execution. Arrays and records are examples of static data structures. Lists are an example of
    dynamic data structures.

    Storage (Memory) Structures

    Storage structures are data structures after they have been mapped to memory. While the data structure is the logical

    organization of your data, the storage structure represents the way in which your data is physically stored in memory

    during the execution of your program.


    Possible layouts for multi-dimensional arrays.

    Suppose you are working with a two-dimensional array. Your array, say n-by-m, is not actually stored in two

    dimensions in the memory, but as a linear sequence of elements. In row-major order, the array is stored as

    follows:

    A(1, 1), A(1, 2), ..., A(1, m), A(2, 1), ..., A(n, 1), A(n, 2), ..., A(n, m)

    In column-major order, the same array is laid out in memory as follows:

    A(1, 1), A(2, 1), ..., A(n, 1), A(1, 2), ..., A(1, m), A(2, m), ..., A(n, m)

    Storage can be allocated in two ways:

    Sequential: A structure allocated this way may also be called a static structure since it is incapable of change

    throughout its lifetime. Such structures use implicit ordering: components are ordered by virtue of their

    sequential ordering in the structure (indexing).

    Linked: A structure allocated this way may also be called a dynamic structure since the data structure can

    grow and shrink during its lifetime. They use explicit ordering: each component contains within itself the

    address of the next item so that it is in effect "pointing" to its own successor.

    As for the pros and cons of each method:

    In static structures, full storage remains allocated during the entire lifetime of the structure. Additionally, in

    order to avoid overflow, maximum theoretical size is used in the declaration of the structure. For these reasons,

    static structures are not memory efficient. In dynamic structures, there is a space overhead due to the pointer

    field(s).

    Due to the possibility of shifting, which can be avoided at the expense of some extra memory, insertions into

    and deletions from any position other than the end of a static structure are expensive. Such is not the case for

    dynamic structures.

    Using the index value, direct access is possible in static structures, which means accessing any item in the
    structure takes constant time. This, however, is not valid for dynamic structures, where reaching an item requires following the links that lead to it. This weakness can be alleviated by imposing a hierarchical structure on the data.

    File Structures

    File structures refer to data residing in secondary storage. When program execution terminates, file structures are

    the only structures to survive the termination of the program.

    The data hierarchy refers to the logical organization of data that is probably stored on secondary or external storage

    media such as magnetic tape.

    A file is a collection of related records, related to a particular application. A record is a collection of related data

    items, or fields, related to a single object of processing. A field is a data item, a piece of information (either a data

    element or a structured data item) that is contained within the record.

    A file can be accessed for input, output, input/output, and append. It can be processed in two different modes: batch
    mode and query mode. In batch mode, the component records of the file are operated on in sequential order. Report
    generation for the test results of a class is an example of this type of processing. In query mode, individual records are
    manipulated by accessing them directly. Retrieving the record of one single student falls into this category.

    Another important issue is the organization of the file on secondary storage. This (physical) ordering of records can

    be done in three ways:

    Sequential: The file is seen as a linear sequence of records. Such files cannot be accessed in input/output access mode. Text files are typical examples of sequential files.

    Relative: Each record in the file can be accessed directly by location. Naturally, such organization becomes

    possible only when the file is stored on a direct-access storage device (DASD). The mapping between the key

    field in the record and the location on disk can be done in two ways: direct-mapping and hashing.


    Indexed sequential: This is a compromise between the first two methods. Files are stored sequentially on a

    DASD but there is also an index file that allows optimum direct access by way of a search on index.

    Each file organization technique must be evaluated on the basis of:

    Access time: The time it takes to find a particular data item.

    Insertion time: The time it takes to insert a new data item. This includes the time it takes to find the correct

    place to insert the new data item as well as the time it takes to update the index structure.

    Deletion time: The time it takes to delete a data item. This includes the time it takes to find the item to be deleted as well as the time it takes to update the index structure.

    Space overhead: The additional space occupied by an index structure.

    These in turn are affected by three factors. The system must first move the head to the appropriate track or cylinder.

    This head movement is called a seek and the time it takes to complete is the seek time. Once the head is at the right track, it must

    wait until the desired block rotates under the read-write head. This delay is called latency time. Finally, the actual

    transfer of data between the disk and main memory can take place. This last part is transfer time.

    Typical operations on files are: open, close, read, write, EOF, and maintenance operations such as sorting, merging,

    updating, and backup.

    Data Types

    A data type, usually referred to simply as a type, is composed of a domain of data elements and a set of operations that
    act on those elements: operations that can construct, destroy, or modify instances of those data elements.

    In addition to built-in types, which can be structured as well as primitive data types, most languages have facilities

    for the definition of new data types by the user. [ALGOL68 and Pascal were the first programming languages to

    provide this.]

    A type system is a facility for defining new types and for declaring variables to be of such types. A type system may

    also have the capability for type checking, which may be static or dynamic depending on whether this checking is

    done at compile time or during execution.

    Example: (Restricted) type system of C.

    Basic types: char, short, int, long, float, double

    Type constructors: *, [], (), struct, union

    typedef char Performance[2];
    typedef char* String;

    typedef struct {
      String first_name;
      String last_name;
      Performance grades;
    } Student;

    typedef void (*EVAL_PERF)(Student* Class);
    ...

    A strongly typed programming language is one in which the types of all variables are determined at compile time.
    Programs written in such a language, which is said to have static typing, must explicitly declare all programmer-defined
    words. Storage requirements for global and local variables are determined completely during compile time. A strongly typed programming language may include a typing facility for defining new types.

    With dynamic typing, variables are not bound to a type at compile time. Languages that allow for dynamic typing of
    variables, which are also classified as weakly typed, may utilize dynamic type checking, which requires the presence


    var
      v1, v2: arrtype1;
      v3: arrtype2;

    In the above fragment, v1 and v2 have equivalent types, while the type of v3 is different from that of these variables.

    Extension

    A type is said to extend another if all of its instances can be seen as instances of the type it extends. The base type,

    the one being extended, is called a generalization of the extended type, while the extended type is said to be a
    specialization of its base type.

    Thanks to the nature of the relation, an expression of the extended type is assignment-compatible with a variable of
    the base type.

    The most popular technique used to provide this relation is inheritance, which is common to most object-oriented
    programming languages.

    Implementation

    A type is said to implement another if it provides an implementation for all operations listed in the interface of the

    implemented type.

    Implementation can be seen as a special case of extension, where the base type does not provide any implementation

    of its operations.

    In many object-oriented programming languages, such a relation is found between an interface and its implementing
    class.

    Declarations

    In languages with static type checking, the program must somehow communicate the types of the identifiers it uses.

    Complete knowledge of identifier types at compile time leads to a more efficient program for the following
    reasons.

    More efficient allocation of storage. For instance, all integer types can be stored similarly using the largest such type.
    But, if you know the exact type, you don't have to allocate the largest size.

    More efficient routines at run time. A + B is handled differently depending upon whether A and B are integers or real numbers.

    Compile-time checks. Many invalid uses of the programming constructs are spotted before the program even starts to run.

    On the other hand, as the price of ensuring type safety, compilers may at times reject valid programs.

    Identifier types can be communicated in two ways.

    Explicit Type Declarations

    In this approach, which is widely preferred over the alternative, the programmer uses declarative statements to communicate the types of
    variables, functions, and so on. Some programming languages, such as ML and Haskell, do not require the
    programmer to provide type declarations for all identifiers. Through a process called type inference, the compiler does its
    best to figure out the types of expressions.

    Implicit Type Declarations

    In (some versions of) languages like FORTRAN and BASIC, the way a variable is named reveals its type.


    Implicit type declarations in Fortran.

    In versions of FORTRAN before FORTRAN 90, an identifier beginning with I-N is taken to be an integer, and an

    identifier beginning with any other letter is taken to be real.

    Scalar Data Types

    A scalar data type has a domain composed only of individual primitive data elements.

    Numeric Types

    Numeric types are related to or represent quantities in the outside world. However, whereas in real life these types

    may have infinite domains, in the world of computers their domains are finite.

    Numeric types.

    Integer, floating-point, fixed-point, and complex numbers.

    Logical (Boolean) Type

    Variables of such a type can take on only two values, true or false, which may be represented in the machine as 1
    and 0, or as non-zero and zero. Typical operations on booleans are and, or, and not.

    Pointer Type

    A pointer is a reference to an object or data element. A pointer variable is an identifier whose value is a reference to

    an object.

    Pointers are important for the dynamic allocation of a previously undetermined amount of data. Additionally, they

    permit the same data to reside on different structures simultaneously. In other words, they make sharing of data

    possible.

    Pointer variables point to and provide the means for accessing unnamed, or anonymous variables. Consequently,

    operations on pointers must distinguish between operations on the pointer variable itself and operations on the

    quantity to which the pointer is pointing.

    Example: Pointers in C/C++

    1. int num1, num2, *pnum;
    2. ...
    3. ...
    4. pnum = &num1;
    5. num1 = 15;
    6. num2 = *pnum;
    7. ...

    [Figure: Memory layout after line 6]

    pnum in the above example is used to access a location holding an int value. As a matter of fact, it can hold a reference to ints only. This is the most important difference between a pointer and a plain address: while an address can be used to refer to values of any type, a pointer holds a reference to a specific data type. However, the type agnosticism of addresses, a precious
    tool in implementing generic collections in C, can be reclaimed by disciplined use of
    void *.

    In C, where pointers are heavily used, advantages may turn into maintenance nightmares. The following program is an
    example of this.

    Example: Pointer pitfall in C.


    1.  #include <stdio.h>
    2.  #include <stdlib.h>
    3.  int main(void) {
    4.    int ch = 65;
    5.    int *p2i;
    6.    const int *p2ci = &ch;
    7.
    8.    p2i = p2ci; /* !!! discards the const qualifier */
    9.    printf("p2ci: %i\tp2i: %i\n", *p2ci, *p2i);
    10.   *p2i = ch + 1; /* !!! writes through the non-const alias */
    11.   printf("p2ci: %i\tp2i: %i\n", *p2ci, *p2i);
    12.
    13.   exit(0);
    14. } /* end of int main(void) */

    When we compile and run this program, it will produce the following output.

    p2ci: 65 p2i: 65

    p2ci: 66 p2i: 66

    This is a violation of the contract we have made. On line 6, we guarantee that the content of the location pointed to
    by p2ci will not change through it. On line 8, we assign p2ci to p2i, which is another pointer that allows the location to be updated. We later go on to change the value found at the location through the non-constant pointer p2i, which means the value pointed to by p2ci is also changed.

    The compiler may issue a warning for this error, which is the case with the GNU C compiler. But you cannot rely on
    this: not all compilers will issue a warning. Sometimes, the programmer will turn off the warnings to avoid reading
    annoying messages and the message will go unnoticed.

    Structured Data Types

    A structured data type has domain elements that are themselves composed of other scalar or structured type

    elements.

    Strings

    A string is an ordered set of data elements of dynamically changing size.

    Two attributes can be associated with a string: type and length. Type refers to the domain of the individual elements while length refers to the number of elements. (Note that length and size are two different things.)

    The type attribute is generally character, although bit strings are also commonly used for implementing sets.

    Example: Character strings in C/C++.

    const char *name = "Atilla";

    Another possible representation of the same string in a different language would be:

    Note that number of bytes reserved for the length prefix can change from implementation to implementation.

    Another point to keep in mind: ASCII is not without competition. Alternatives include Unicode and the ISO/IEC 8859


    series of ASCII-based encodings. In the case of Unicode encoded in 16-bit units (UCS-2, or UTF-16 for the most common characters), each character takes up two bytes in memory.[2]

    Example: BSTR type used in [OLE] Automation.[3]

    CString name = _T("Atilla");
    BSTR bstrName = name.AllocSysString();

    As hinted by the above figure, BSTR is used for exchanging character string data between components written possibly using different languages. As a matter of fact, the length prefix is inherited from Visual Basic while the
    terminating null character is taken from C.

    Note that the length prefix holds the number of bytes the character string proper occupies and the bstrName identifier is actually a pointer to the first character.

    Typical operations on strings are:

    Concatenation creates a string from smaller strings.

    Substring creates a string from a subsequence of another string.

    Index tests for containment of a smaller string in a larger string. It returns the index value where the
    subsequence containing the smaller string starts.

    Length returns the number of components in the string.

    Insert, delete, replace, ...

    Arrays

    An array may be defined as a fixed-size, ordered set of homogeneous data. It is stored in contiguous memory

    locations and this makes direct access possible. Its fixed-size makes an array a static structure.

    In some programming languages such as Standard Pascal, size must be known at compile time while in many this size can also be given at run time. But, even in the latter case, the size of the array does not change during its lifetime.

    Example: Static nature of arrays in Standard Pascal.

    program primes(input, output);
      const N = 1000;
      var a: array [1..N] of boolean;
      ...
    begin
      ...
    end

    This Pascal fragment must be recompiled and run for a different value of N.

    Example: Open arrays in Oberon.

    PROCEDURE SetZero (VAR v: ARRAY OF REAL);
      VAR j: INTEGER;
    BEGIN
      j := 0;
      WHILE j < LEN(v) DO v[j] := 0; INC(j) END
    END SetZero;

    In the above Oberon fragment, any actual (one-dimensional) array parameter with element type REAL is compatible

    with v. The subprogram can be called with a one-dimensional array of any size. But, once the array is passed as an


    argument its size cannot change.

    Among the important attributes of an array are component type, dimensionality of the array, and size in each

    dimension.

    Arrays can be represented in the memory in two different ways:

    1. Row-major (almost all major programming languages)

    2. Column-major (FORTRAN)

    In order to minimize the access time to individual array components, we should make sure that fastest changing

    indexes in our programs (that is, the innermost loop variables) correspond to the fastest changing indexes in the

    memory layout. If we ignore the loop ordering, with large multidimensional arrays, the virtual memory performance

    may suffer badly.

    Example: Multidimensional array usage in Pascal.

    var a: array [3..5, 1..2, 1..4] of integer;
    ...
    for i := 3 to 5 do
      for j := 1 to 2 do
        for k := 1 to 4 do
          {some processing done using a[i, j, k]}

    This program fragment shows the correct loop order for a language that represents arrays using row-major
    representation. Note that the innermost loop variable (the fastest changing one) in the fragment corresponds to the
    fastest changing index in the memory layout. This correspondence must be extended to the other loop variables and
    indexes, i.e. the second innermost loop variable must correspond to the second fastest changing index in the memory
    layout, and so on.

    Example: Multidimensional array usage in Fortran.

    DO k = 1, 4
      DO j = 1, 2
        DO i = 3, 5
          {some processing done using a(i, j, k)}
        END DO
      END DO
    END DO

    This second fragment gives the correct loop order for a language with column-major representation.


    It should be noted that the header information included in the array representation is not standard and may vary or even
    disappear in some programming languages. The most notable example of this is Java, in which the size of the array is not
    used in type checking. This means one can use an array handle to manipulate arrays of different sizes.

    Example: Type-compatibility of arrays in Java.

    int[] intArray = new int[10];
    ... // use intArray as an array of 10 ints
    intArray = new int[20]; // OK!
    ... // use intArray as an array of 20 ints

    This may at first seem to contradict the static nature of arrays. After all, the size of intArray has been changed from 10 to 20. Not really! What we have done in the previous code fragment is to make intArray indicate two different array objects in the heap. In other words, we have changed the array handle, not the array object itself.


    Determining the address of an element in a Pascal-style array.

    Given the pseudo-Pascal declaration

    var a: array [5..10, 0..3, -2..2] of integer;

    calculate the address of a[7, 2, 1] assuming a) column-major representation and b) row-major representation.
    For both cases, assume that the base address of a is 200 and that an integer is represented in four bytes.

    a) (10 - 5 + 1) * (3 - 0 + 1) * (1 - (-2)) + 1 gives us the order of the component at [5, 0, 1]. We need
    (10 - 5 + 1) * (2 - 0) more to get to [5, 2, 1]. (7 - 5) more and we reach [7, 2, 1]. So, [7, 2, 1] is the
    (6 * 4 * 3 + 1) + (6 * 2) + 2 = 87th component. It can be found at address 200 + (87 - 1) * 4 = 544.

    The above calculation can be generalized as follows:

    In C-based programming languages, where the lower bound is always taken to be 0, this formula can be

    simplified to:

    Memory address of a particular component is given as follows:[4]
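For a three-dimensional array declared as a[l1..u1, l2..u2, l3..u3], the column-major calculation in part a) can be written in general form as sketched below. The symbols n_k, l_k, u_k and i_k are notation introduced here for illustration, and the "add 1, then subtract 1" shape follows the worked example rather than any particular compiler.

```latex
\begin{aligned}
n_k &= u_k - l_k + 1 \qquad (k = 1, 2, 3) \\
\mathrm{order}_{\mathrm{col}}(i_1, i_2, i_3) &= n_1 n_2 (i_3 - l_3) + n_1 (i_2 - l_2) + (i_1 - l_1) + 1 \\
\mathrm{order}_{\mathrm{col}}(i_1, i_2, i_3) &= n_1 n_2 i_3 + n_1 i_2 + i_1 + 1 \qquad \text{(C-based languages, all } l_k = 0\text{)} \\
\mathrm{address} &= \mathrm{base} + (\mathrm{order} - 1) \times \mathrm{size}
\end{aligned}
```

With n_1 = 6, n_2 = 4 and the component [7, 2, 1], the second line gives 6 * 4 * 3 + 6 * 2 + 2 + 1 = 87, and the last line gives 200 + 86 * 4 = 544, in agreement with part a).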

    b) (2 - (-2) + 1) * (3 - 0 + 1) * (7 - 5) + 1 gives us the order of the component at [7, 0, -2].
    (2 - (-2) + 1) * (2 - 0) more and we are at [7, 2, -2]. [7, 2, 1] is (1 - (-2)) locations after [7, 2, -2].
    So, [7, 2, 1] is the (5 * 4 * 2 + 1) + (5 * 2) + 3 = 54th component. Its address is 200 + (54 - 1) * 4 = 412.

    After generalization we have:

    For C-based programming languages this is simplified to:
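The row-major calculation in part b) generalizes in the same way for a three-dimensional array a[l1..u1, l2..u2, l3..u3]. As before, n_k, l_k, u_k and i_k are notation assumed here for illustration; only the roles of the indexes are reversed, so that the rightmost index now varies fastest.

```latex
\begin{aligned}
n_k &= u_k - l_k + 1 \qquad (k = 1, 2, 3) \\
\mathrm{order}_{\mathrm{row}}(i_1, i_2, i_3) &= n_2 n_3 (i_1 - l_1) + n_3 (i_2 - l_2) + (i_3 - l_3) + 1 \\
\mathrm{order}_{\mathrm{row}}(i_1, i_2, i_3) &= n_2 n_3 i_1 + n_3 i_2 + i_3 + 1 \qquad \text{(C-based languages, all } l_k = 0\text{)} \\
\mathrm{address} &= \mathrm{base} + (\mathrm{order} - 1) \times \mathrm{size}
\end{aligned}
```

With n_2 = 4, n_3 = 5 and the component [7, 2, 1], the second line gives 4 * 5 * 2 + 5 * 2 + 3 + 1 = 54, and the address is 200 + 53 * 4 = 412, in agreement with part b).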


    Determining the address of an element in a C-style array.

    Given the pseudo-C declaration

    double a[13][10][9];

    calculate the address of a[7][6][8] assuming a) column-major representation and b) row-major representation.
    For both cases, assume that the base address of a is 200 and that a double is represented in eight bytes.

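    a) 200 + (8 * 13 * 10 + 6 * 13 + 7) * 8 = 200 + 1125 * 8 = 9200

    b) 200 + (7 * 10 * 9 + 6 * 9 + 8) * 8 = 200 + 692 * 8 = 5736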
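Both address calculations can be checked mechanically. The following is a minimal sketch, assuming a three-dimensional array described by inclusive bounds; the type `Bounds` and the functions `column_major_address` and `row_major_address` are names made up for this example. It works with zero-based offsets directly instead of adding 1 and subtracting it again.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One dimension of an array: inclusive lower and upper bounds.
struct Bounds {
    long lower;
    long upper;
    long extent() const { return upper - lower + 1; }
};

// Address of a[index[0], index[1], index[2]] when the LEFTMOST index varies fastest.
long column_major_address(const std::array<Bounds, 3>& dims,
                          const std::array<long, 3>& index,
                          long base, long element_size) {
    long offset = 0;
    for (std::size_t d = 3; d-- > 0;) {  // Horner's scheme, last dimension first
        assert(dims[d].lower <= index[d] && index[d] <= dims[d].upper);
        offset = offset * dims[d].extent() + (index[d] - dims[d].lower);
    }
    return base + offset * element_size;
}

// Address of the same component when the RIGHTMOST index varies fastest.
long row_major_address(const std::array<Bounds, 3>& dims,
                       const std::array<long, 3>& index,
                       long base, long element_size) {
    long offset = 0;
    for (std::size_t d = 0; d < 3; ++d) {  // Horner's scheme, first dimension first
        assert(dims[d].lower <= index[d] && index[d] <= dims[d].upper);
        offset = offset * dims[d].extent() + (index[d] - dims[d].lower);
    }
    return base + offset * element_size;
}
```

For the C-style declaration, the bounds are {0, 12}, {0, 9} and {0, 8}; with index {7, 6, 8}, base 200 and element size 8, the two functions yield 9200 and 5736 respectively.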

    Multidimensional Arrays

    Support for multidimensional arrays can be provided in two ways: jagged arrays (sometimes referred to as
    ragged arrays) and rectangular arrays.[5]

    Example: Jagged arrays in Java.

    int[][] numArr = { {1, 2, 3}, {4, 5, 6, 7}, {8, 9} };

    What we have here is actually an array of arrays. Since different arrays can have differing component
    counts, the sub-arrays in our example can and do have different lengths. The same array can be formed using
    the following code, which reflects the way arrays are treated in Java.

    int[][] numArr = new int[3][];
    numArr[0] = new int[3];
    for (int i = 0; i


    This leads us to the conclusion, which is also reflected in the layout given before, that Java-style multidimensional

    arrays are not necessarily allocated in contiguous memory locations.[6]
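A comparable jagged structure can be sketched in C++ with a vector of vectors. This is an illustration under assumptions, not the Java mechanism itself: the helper names `make_jagged`, `component_at` and `row_is_contiguous` are invented here. Each row is a separate heap allocation, so only components within one row are known to be adjacent.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Builds the jagged array {{1, 2, 3}, {4, 5, 6, 7}, {8, 9}} row by row.
std::vector<std::vector<int>> make_jagged() {
    const std::size_t lengths[] = {3, 4, 2};  // each row has its own length
    std::vector<std::vector<int>> rows;
    int next = 1;
    for (std::size_t len : lengths) {
        std::vector<int> row(len);            // a separate heap allocation per row
        for (int& component : row) component = next++;
        rows.push_back(std::move(row));
    }
    return rows;
}

// Bounds-checked access to the component in row r, column c.
int component_at(const std::vector<std::vector<int>>& rows,
                 std::size_t r, std::size_t c) {
    assert(r < rows.size() && c < rows[r].size());
    return rows[r][c];
}

// Components of a single row are adjacent in memory.
bool row_is_contiguous(const std::vector<int>& row) {
    for (std::size_t i = 1; i < row.size(); ++i)
        if (&row[i] != &row[i - 1] + 1) return false;
    return true;
}
```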

    Example: Rectangular arrays in C#.

    int[,] numMatrix = { {1, 2}, {3, 4}, {5, 6} };

    Here, we have a 3-by-2 matrix. It is also guaranteed that the entire matrix is allocated in contiguous
    memory locations.

    Associative Arrays

    The index of an array is usually of a scalar type, although languages such as Perl and Tcl provide for
    non-scalar indexes through the use of hashing. Such arrays are called associative arrays.

    Example: Associative arrays in Perl.

    $dictionary{'word'} = 'sözcük, kelime';
    $grade{'Emrah'} = 90;
    $dictionary{'sentence'} = 'tümce, cümle';

    Thanks to operator overloading, languages such as C++ and C# offer the same facility by means of a class
    that overloads the subscript operator ([]).

    Example: Associative arrays in C#.

    Hashtable dictionary = new Hashtable();
    ...
    dictionary["word"] = "sözcük, kelime";
    dictionary["sentence"] = "tümce, cümle";
    ...
    Console.Write("sentence in Turkish is {0}", dictionary["sentence"]);
    // will print "sentence in Turkish is tümce, cümle" on the standard output
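In C++, the standard container std::map is one class that overloads the subscript operator in this way. The sketch below mirrors the dictionary example; the function names `make_dictionary` and `translate` and the "(unknown)" fallback are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds a small English-to-Turkish dictionary keyed by strings.
std::map<std::string, std::string> make_dictionary() {
    std::map<std::string, std::string> dictionary;
    dictionary["word"] = "sözcük, kelime";    // operator[] inserts the key if it is absent
    dictionary["sentence"] = "tümce, cümle";
    assert(dictionary.size() == 2);           // two distinct keys were stored
    return dictionary;
}

// Looks a word up without inserting anything; returns a fallback for missing keys.
std::string translate(const std::map<std::string, std::string>& dictionary,
                      const std::string& word) {
    auto it = dictionary.find(word);
    return it != dictionary.end() ? it->second : "(unknown)";
}
```

Note that `operator[]` on a std::map inserts a default value when the key is missing, which is why the read-only lookup uses `find` instead.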

    In languages that lack operator overloading, such as Java, one has to resort to using the relevant class
    through its messages (method calls).

    Example: Associative arrays in Java.

    Map dictionary = new HashMap();
    ...
    dictionary.put("word", "sözcük, kelime");
    dictionary.put("sentence", "tümce, cümle");
    ...
    System.out.print("sentence in Turkish is " + dictionary.get("sentence"));
    // will print "sentence in Turkish is tümce, cümle" on the standard output

    Lists

    Lists are generalized, dynamic, linear data structures. A (linked) list is a set of items organized sequentially, just like

    an array. In an array, the sequential organization is provided implicitly (by the position in the array); in a list, we use

    an explicit arrangement in which each item is part of a node that also contains a link to the next node.

    Example: Linked lists in Pascal.

    type
      link = ^node;


      node = record key: integer; next: link end;
    var
      head: link;
    ...

    Depending on how links are provided, we have:

    Singly linked lists: These provide us only with a forward pointer pointing to the next node in the list.

    Doubly linked lists: These provide us with two pointers, one pointing to the next and one pointing to the

    previous node in the list.

    Another classification is possible with how the first and last nodes in the list are related:

    Circular lists: In circular lists, the first and last nodes in the list are connected with link(s). In a circular doubly

    linked list, the next link of the last node points to the first node and previous link of the first node points to the

    last node of the list. In the case of a circular singly linked list, there is no pointer from the first node to the last

    one.

    Linear lists: In linear lists, there are no links between the first and last nodes. The next field of the last node

    and the previous field of the first node point nowhere, i.e. they are null.

    In the implementation of a list, one can make use of a header and/or a dummy end node.

    Operations on lists are:

    Creation/destruction operations.

    Insert: Before/after a node with a certain key value; at the end; at the beginning.

    Remove: A component with a certain key value; the first, last component.

    Search: A component with a certain key value.

    Empty: Test whether or not the list is empty.
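Some of the operations listed above can be sketched for a linear singly linked list in C++. This is one possible arrangement, not a reference implementation: the class name `IntList`, its member names, and the use of std::unique_ptr for the links are choices made here for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// A node holds a key and an owning link to the next node.
struct Node {
    int key;
    std::unique_ptr<Node> next;
};

// A linear singly linked list; an empty head link means an empty list.
class IntList {
public:
    bool empty() const { return head_ == nullptr; }

    int front() const {
        assert(!empty());
        return head_->key;
    }

    // Insert at the beginning.
    void push_front(int key) {
        auto node = std::make_unique<Node>();
        node->key = key;
        node->next = std::move(head_);
        head_ = std::move(node);
    }

    // Search: true if some node carries the key.
    bool contains(int key) const {
        for (const Node* p = head_.get(); p != nullptr; p = p->next.get())
            if (p->key == key) return true;
        return false;
    }

    // Remove the first node with the given key; false if there is none.
    bool remove(int key) {
        std::unique_ptr<Node>* link = &head_;
        while (*link != nullptr && (*link)->key != key)
            link = &(*link)->next;
        if (*link == nullptr) return false;
        *link = std::move((*link)->next);  // unlink; the removed node is destroyed here
        return true;
    }

private:
    std::unique_ptr<Node> head_;
};
```

Creation and destruction need no explicit operations in this sketch: the constructor starts with an empty head link, and destroying the head releases the nodes that follow it.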

    Some commonly used, specialized (restricted) forms of lists are stack (LIFO), queue (FIFO), deque (Double-ended

    queue), output-restricted deque, and input-restricted deque.

    Multilists

    Multilists are similar to lists. The only difference is that nodes reside on more than one list simultaneously.

    Example: Pascal representation of a sparse matrix using a multilist.

    type
      link = ^node;
      node = record
        row_no, column_no: integer;
        key: integer;
        next_row, next_column: link
      end;


    Dynamic Arrays

    A cross between arrays and lists, dynamic arrays can grow or shrink at run time. Known in the Java world as
    Vector and in the C# world as ArrayList, a dynamic array maintains an internal, heap-allocated array and
    replaces it with a larger (or smaller) one as the need arises.
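The replace-with-a-larger-array step can be sketched as follows. The class `IntVector` is a hypothetical, stripped-down stand-in for containers such as std::vector; the initial capacity of 4 and the doubling policy are assumptions chosen for the example, and shrinking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// A minimal growable array of ints.
class IntVector {
public:
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

    // Appends a value, replacing the internal array with a larger one when it is full.
    void push_back(int value) {
        if (size_ == capacity_) grow();
        data_[size_++] = value;
    }

private:
    void grow() {
        // Growth policy (start at 4, then double) is an assumption of this sketch.
        std::size_t new_capacity = capacity_ == 0 ? 4 : capacity_ * 2;
        auto bigger = std::make_unique<int[]>(new_capacity);
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
        data_ = std::move(bigger);  // the old, smaller array is released here
        capacity_ = new_capacity;
    }

    std::unique_ptr<int[]> data_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};
```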

    Records

    A record, as a logical data structure, is a fixed-size, ordered collection of possibly heterogeneous
    components that may themselves be structured. It may also be called a hierarchical or structured type.
    Record components, often called fields, are accessed by name rather than by subscript.

    Example: Record type definition in COBOL.

    01 STUDENT-RECORD.
       02 NAME.
          03 FIRST-NAME     PIC X(15).
          03 MIDDLE-INITIAL PIC X.
          03 LAST-NAME      PIC X(15).
       02 STUDENT-NO        PIC 9(9).
       02 TEST-SCORES.
          03 MIDTERM        PIC 9(3).99.
          03 FINAL          PIC 999.99.
       02 ASSIGNMENTS OCCURS 5 TIMES PICTURE IS 9(3).9(2).

    Variant Records

    Some languages allow us to have variations of a record. Such a record with variations is called a variant record. This

    means that there will be some fields that are common to all variations and some fields that are unique to each

    variation.

    Example: Variant records in C.

    enum shape_tag {Circle, Square};
    struct SHAPE {
      float area;
      enum shape_tag tag;
      union {
        float radius;
        float edge_length;


    Before we see examples of how records are laid out in memory, let's look at a rather important issue
    affecting the layout: alignment requirements.

    Alignment Requirements

    For reasons of efficiency, some architectures do not let a data element start at an address that is not a
    multiple of the data element's size. As a result of this alignment, it takes fewer memory accesses to fetch
    the data required by the program. This improvement in running speed, however, comes at the expense of more
    memory.

    It should be noted that the alignment scheme used in the following examples is not the only one. Programming
    environments may let you alter the way data is aligned. As a matter of fact, programming environments built
    on top of the Intel architecture may even let you turn alignment on and off.

    Alignment requirement of double

    An IEEE754 double precision floating point number can be represented in 8 bytes. A data type compatible with

    this specification and implemented on an architecture with an alignment requirement stated as above cannot start

    at memory location, say, 4 or 6. It should start at locations whose addresses are divisible by 8. So, it can start at 0,

    8, ..., 33272, ... .

    The alignment requirement for a record type will be at least as stringent as for the component having the most

    stringent requirements. The record must terminate on the same alignment boundary on which it started.

    Example: Alignment in records.

    struct S { double value; char name[10]; };

    struct S { char name[10]; double value; };

    Note that the alignment requirement of an array is equal to that of its component type. Whether the array
    size is 1 or a million bytes, its alignment requirement is always the same. This is due to the fact that an
    array declaration can be seen as shorthand for defining multiple variables. The above fragment should
    actually be considered as given below:

    struct S { double value; char name0, name1, name2, ..., name9; };

    So, the alignment requirement of both structure definitions is equal to that of value: 8.
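The layout rule can be turned into a small calculation. The sketch below models only the scheme described in this section (each field starts at a multiple of its own alignment, and the record ends on the boundary of its strictest field); the names `align_up`, `Field` and `record_size` are invented here, and real compilers may apply different rules.

```cpp
#include <cassert>
#include <cstddef>

// Smallest address >= addr that is a multiple of alignment.
constexpr std::size_t align_up(std::size_t addr, std::size_t alignment) {
    return (addr + alignment - 1) / alignment * alignment;
}

// Size and alignment requirement of one record field, in bytes.
struct Field {
    std::size_t size;
    std::size_t alignment;
};

// Size of a record whose fields are laid out in declaration order, each at the
// next suitably aligned address, with padding at the end so that the record
// terminates on the alignment boundary of its strictest field.
std::size_t record_size(const Field* fields, std::size_t count) {
    std::size_t offset = 0, strictest = 1;
    for (std::size_t i = 0; i < count; ++i) {
        assert(fields[i].alignment > 0);
        offset = align_up(offset, fields[i].alignment) + fields[i].size;
        if (fields[i].alignment > strictest) strictest = fields[i].alignment;
    }
    return align_up(offset, strictest);
}
```

Under this model, with an 8-byte double aligned to 8 and char name[10] aligned to 1, both orderings of struct S come out at 24 bytes: the padding sits after name in one case and between name and value in the other.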

    Example: Alignment requirements in variant records.

    type
      tagtype = (first,


        second);
      vtype = record
        f1: integer;
        f2: real;
        case c: tagtype of
          first: (f3, f4: integer);
          second: (f5, f6: real)
      end;

    var
      v: vtype;

    If we change the selector of the variant part to case tagtype of, the tag field c is no longer stored in the
    record. Depending on the architecture, this saves us 4 to 8 bytes for each record. But it leaves us without
    type safety.
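The role of the tag can be illustrated with a tagged union in C++. This is a sketch under assumptions rather than a transcription of any fragment in this chapter: the names `ShapeTag`, `Shape`, `make_circle`, `make_square` and `area` are chosen here. Keeping the tag costs a few bytes per record, but it lets the code decide which variant is valid before reading it.

```cpp
#include <cassert>

enum class ShapeTag { Circle, Square };

// A variant record: tag says which member of the union is currently meaningful.
struct Shape {
    ShapeTag tag;
    union {
        float radius;       // valid when tag == ShapeTag::Circle
        float edge_length;  // valid when tag == ShapeTag::Square
    };
};

Shape make_circle(float radius) {
    Shape s;
    s.tag = ShapeTag::Circle;
    s.radius = radius;
    return s;
}

Shape make_square(float edge_length) {
    Shape s;
    s.tag = ShapeTag::Square;
    s.edge_length = edge_length;
    return s;
}

// Reads the variant part only after consulting the tag.
float area(const Shape& s) {
    switch (s.tag) {
        case ShapeTag::Circle: return 3.14159265f * s.radius * s.radius;
        case ShapeTag::Square: return s.edge_length * s.edge_length;
    }
    assert(false && "unknown tag");
    return 0.0f;
}
```

Without the tag member, nothing would stop `area` from interpreting a square's edge length as a radius.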

    Sets

    A set is a nonlinear structure containing an unordered collection of distinct values. Typical operations on
    sets are insertion of an individual component, deletion of an individual component, union, intersection,
    difference, and membership.

    Example: Sets in Pascal.

    type
      days = (sun, mon, tues, wed, thur, fri, sat);
      dayset = set of days;
    var
      weekdays, weekend: dayset;
    ...
    begin
      weekdays := [mon, tues, wed, thur, fri];
      weekend := [sun, sat];
      ...
      ...
    end.

    Base types in sets are restricted to scalar types and, due to the storage requirements, the number of
    potential members is severely limited.
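One common way to store such a set is one bit per potential member, which also shows why the number of potential members matters for storage. The following C++ sketch uses std::bitset for this; the alias `DaySet`, the sentinel enumerator `day_count`, and the helpers `make_set` and `is_member` are assumptions introduced for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

enum Day { sun, mon, tues, wed, thur, fri, sat, day_count };

// One bit per potential member of the base type.
using DaySet = std::bitset<day_count>;

// Builds a set from a list of members.
DaySet make_set(std::initializer_list<Day> members) {
    DaySet s;
    for (Day d : members) s.set(static_cast<std::size_t>(d));
    return s;
}

// Membership test.
bool is_member(const DaySet& s, Day d) {
    assert(d < day_count);
    return s.test(static_cast<std::size_t>(d));
}
```

With this representation, union, intersection and difference map onto the bitwise operators: `a | b`, `a & b` and `a & ~b`.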


    Trees

    A tree is a nonempty collection of nodes and edges that satisfies certain requirements. A node is a simple
    object containing link(s) to other nodes and some value; a link to another node is called an edge.

    In a tree,

    The predecessor of a node is called its parent.

    The successor(s) of a node is (are) called its child(ren).

    Nodes without successors are called leaves.

    The node without a predecessor is called the root.

    No node is unreachable from the root.

    There is only one path between the root and any other node.

    Trees are encountered frequently in everyday life. For example, many people keep track of their ancestors
    and/or descendants with a family tree. As a matter of fact, much of the terminology derives from this usage.
    Another example is found in the organization of sports tournaments.

    Example: Representing binary trees in Pascal.

    type
      link = ^node;
      node = record info: char; left_child, right_child: link end;

    Graphs

    Similar to a tree, a graph is a collection of nodes and edges. Unlike trees, graphs do not have the notion of
    a root, a leaf, a parent, or a child. A node, also called a vertex, can live in isolation; that is, it can
    be unreachable. For some vertex pairs, there can be more than one path connecting them to each other.

    A great many problems are naturally formulated using graphs. For example, given an airline route map of Europe, we

    might be interested in questions like: "What's the fastest way to get from Izmir to St. Petersburg?" It's very likely that

    many city pairs have more than one path connecting them. Another example is found in finite-state machines.

    A graph can be represented in two ways:

    1. Adjacency matrix representation

    A V-by-V array of boolean values, where V is the number of vertices, is maintained, with a[x, y] set to
    true if there is an edge from vertex x to vertex y and false otherwise.

    Example: Adjacency matrix representation in Pascal.

    program adjmatrix (input, output);
    const max_no_of_vertices = 50;
    type
      matrix_type = array [1..max_no_of_vertices, 1..max_no_of_vertices] of boolean;
    var a: matrix_type;
    ...
    ...


    2. Adjacency-structure representation

    In this representation all the vertices connected to each vertex are listed on an adjacency list for that vertex.

    Example: Adjacency structure representation in Pascal.

    program adjlist (input, output);
    const max_no_of_vertices = 100;
    type
      link = ^node;
      node = record v: integer; next: link end;
      bucket_array = array [1..max_no_of_vertices] of link;
    var adj: bucket_array;
    ...
    ...

    While the adjacency matrix is the better choice for dense graphs, the adjacency-structure representation
    turns out to be the more suitable solution for sparse graphs.
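The storage trade-off between the two representations can be made concrete with a short C++ sketch. The class names `MatrixGraph` and `ListGraph` and the `cells` member, which counts the stored entries, are assumptions made for this illustration; vertices are numbered from 0 here rather than from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adjacency matrix: V * V booleans, regardless of how many edges exist.
class MatrixGraph {
public:
    explicit MatrixGraph(std::size_t v) : v_(v), a_(v * v, false) {}
    void add_edge(std::size_t x, std::size_t y) {
        assert(x < v_ && y < v_);
        a_[x * v_ + y] = true;
    }
    bool has_edge(std::size_t x, std::size_t y) const {
        assert(x < v_ && y < v_);
        return a_[x * v_ + y];
    }
    std::size_t cells() const { return a_.size(); }

private:
    std::size_t v_;
    std::vector<bool> a_;
};

// Adjacency structure: one neighbor list per vertex, so storage grows with the edge count.
class ListGraph {
public:
    explicit ListGraph(std::size_t v) : adj_(v) {}
    void add_edge(std::size_t x, std::size_t y) {
        assert(x < adj_.size() && y < adj_.size());
        adj_[x].push_back(y);
    }
    bool has_edge(std::size_t x, std::size_t y) const {
        assert(x < adj_.size() && y < adj_.size());
        for (std::size_t neighbor : adj_[x])
            if (neighbor == y) return true;
        return false;
    }
    std::size_t cells() const {
        std::size_t total = 0;
        for (const auto& neighbors : adj_) total += neighbors.size();
        return total;
    }

private:
    std::vector<std::vector<std::size_t>> adj_;
};
```

For a 50-vertex graph with two edges, the matrix holds 2500 entries while the lists hold 2; the matrix, in exchange, answers `has_edge` without scanning a list.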

    User-defined Types

    Besides enhancing the readability and clarity of the program text, user-defined types make it possible,
    first, to compose a complicated data structure once and then create as many instances (variables of that
    type) of it as necessary and, secondly, to use the language's own type-checking facility for input data
    validation, such as range or consistency checking.

    User-defined types come in two flavors:

    1. Enumeration types

    An enumeration type provides for the enumeration of the domain of the type by the programmer. The domain

    values are listed in a declarative statement.

    Example: Enumeration type in C++.

    In C++, instead of

    const int input = 1;
    const int output = 2;
    const int append = 3;
    ...
    bool open_file(string file_name, int open_mode);
    ...
    if (open_file("SalesReport", append)) ...

    we can have

    enum open_modes {input = 1, output, append};
    ...
    bool open_file(string file_name, open_modes om);
    ...
    if (open_file("SalesReport", input)) ...

    2. Subtypes


    A subtype is the specification of the domain as a subrange of another already existing type.

    Example: Subtypes in Pascal.

    type ShoeSizeType = 35..46;
    var ShoeSize: ShoeSizeType;

    With these declarations in place, we cannot assign any value outside the range to the ShoeSize variable.
    Any such attempt would be caught by the type system and cause the program to terminate with an error. [In
    programming languages supporting exceptions, this could be handled more gracefully without terminating the
    program.]
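C++ has no built-in subrange types, but the effect, including the exception-based handling mentioned above, can be approximated with a small class. The class `ShoeSize`, its constants, and the helper `is_rejected` are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <stdexcept>

// A hypothetical counterpart of the Pascal subtype ShoeSizeType = 35..46.
class ShoeSize {
public:
    static constexpr int kMin = 35;
    static constexpr int kMax = 46;

    // Throws std::out_of_range instead of terminating the program.
    explicit ShoeSize(int value) : value_(checked(value)) {}

    int value() const {
        assert(kMin <= value_ && value_ <= kMax);  // class invariant
        return value_;
    }

private:
    static int checked(int value) {
        if (value < kMin || value > kMax)
            throw std::out_of_range("shoe size must lie in 35..46");
        return value;
    }

    int value_;
};

// Returns true if constructing a ShoeSize from value is refused.
bool is_rejected(int value) {
    try {
        ShoeSize size(value);
        (void)size;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Unlike the Pascal subtype, the check here is written by the programmer rather than supplied by the language, so it covers only the operations the class chooses to expose.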

    Note that enumeration and subtype definitions do not allow the programmer to specify operations on the newly
    defined data type.

    Abstract Data Types

    The defining characteristic of an abstract data type (ADT) is that nothing outside of the definition of the
    data structure and the algorithms operating on it should refer to anything inside, except through function
    and procedure calls for the fundamental operations.

    Example: Stack implementation in Pascal.

    type
      link = ^node;
      node = record key: integer; next: link end;
    var head, z: link;

    procedure stackinit;
    begin
      new(head); new(z);
      head^.next := z; z^.next := z
    end;

    procedure push(v: integer);
    var t: link;
    begin
      new(t);
      t^.key := v; t^.next := head^.next;
      head^.next := t
    end;

    function pop: integer;
    var t: link;
    begin
      t := head^.next;
      pop := t^.key;
      head^.next := t^.next;
      dispose(t)
    end;


    function peek: integer;
    begin
      peek := head^.next^.key
    end;

    function stackempty: boolean;
    begin
      stackempty := (head^.next = z)
    end;

    The above Pascal fragment is unfortunately not an ideal implementation of an ADT. The fields of the record
    definition are open to manipulation; there is no language construct that prohibits changing the underlying
    structure directly by changing its components. It is also possible for the programmer to use any
    subprogram, be it one meant for export or one meant for implementing some auxiliary functionality. A
    mechanism to provide controlled access to subprograms and record fields, such as the access specifiers of
    object-oriented programming languages, is missing.

    Organization of the related subprograms into a compilation unit is not enforced by the language. One could
    add subprograms that are irrelevant to the ADT being implemented. There is no compiler-enforced rule
    relating the pre-existing and/or newly added subprograms to the ADT in question. The best we can do is to
    stick to conventions for better organizing our programs.

    One other weakness, one that can be remedied very easily, of the preceding fragment is the fact that we cannot

    create more than one stack.

    What we need is something like the module construct found in modular programming languages or the class construct

    of object-oriented programming languages.

    Example: Stack implementation in Ada83.

    PACKAGE Stack_Package IS
      TYPE Stack_Type IS PRIVATE;
      PROCEDURE Init (Stack: IN OUT Stack_Type);
      PROCEDURE Push (Stack: IN OUT Stack_Type; Item: IN Integer);
      FUNCTION Pop (Stack: IN OUT Stack_Type) RETURN Integer;
      FUNCTION Peek (Stack: IN Stack_Type) RETURN Integer;
      FUNCTION Empty (Stack: IN Stack_Type) RETURN Boolean;
    PRIVATE
      Stack_Size: CONSTANT Integer := 10;
      TYPE Integer_List_Type IS ARRAY (1..Stack_Size) OF Integer;
      TYPE Stack_Type IS RECORD
        Top: Integer RANGE 0..Stack_Size;
        Elements: Integer_List_Type;
      END RECORD;
    END Stack_Package;

    PACKAGE BODY Stack_Package IS
      PROCEDURE Init (Stack: IN OUT Stack_Type) IS
      BEGIN
        Stack.Top := 0;
      END Init;

      PROCEDURE Push (Stack: IN OUT Stack_Type; Item: IN Integer) IS
      BEGIN
        Stack.Top := Stack.Top + 1;
        Stack.Elements (Stack.Top) := Item;


      END Push;

      FUNCTION Pop (Stack: IN OUT Stack_Type) RETURN Integer IS
        Item: Integer := Stack.Elements (Stack.Top);
      BEGIN
        Stack.Top := Stack.Top - 1;
        RETURN Item;
      END Pop;

      FUNCTION Peek (Stack: IN Stack_Type) RETURN Integer IS
      BEGIN
        RETURN Stack.Elements (Stack.Top);
      END Peek;

      FUNCTION Empty (Stack: IN Stack_Type) RETURN Boolean IS
      BEGIN
        IF Stack.Top = 0 THEN
          RETURN True;
        ELSE
          RETURN False;
        END IF;
      END Empty;

    END Stack_Package;

    Using this implementation, a procedure reading ten integers and printing them in reverse order can be implemented

    as given below.

    WITH Basic_IO, Stack_Package;

    PROCEDURE Reverse_Integers IS
      x: Integer;
      St: Stack_Package.Stack_Type;
    BEGIN
      Stack_Package.Init(St);
      FOR i IN 1..10 LOOP
        Basic_IO.Get(x);
        Stack_Package.Push(St, x);
      END LOOP;

      FOR i IN 1..10 LOOP
        x := Stack_Package.Pop(St);
        Basic_IO.Put(x);
        Basic_IO.New_Line;
      END LOOP;
    END Reverse_Integers;

    The above implementation is a lot better than the previous one: The only way a stack can be manipulated is through

    the operations defined on it. The representation of the ADT is declared to belong to the private part of the package.

    This certainly is a great improvement over the first example. But, it still requires the user to perform creation and

    initialization in two separate steps, a rather error-prone process. What if we forget to call the initialization routine?[7]
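A class with a constructor removes the separate initialization step. The following C++ sketch mirrors the bounded stack of the Ada package; the class name `Stack`, the constant `kCapacity` and the use of assertions for overflow and underflow are choices made for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// A bounded stack of ints whose representation is hidden behind its operations.
class Stack {
public:
    static constexpr std::size_t kCapacity = 10;  // mirrors Stack_Size in the Ada package

    Stack() : top_(0) {}  // creation and initialization happen in one step

    bool empty() const { return top_ == 0; }

    void push(int item) {
        assert(top_ < kCapacity);
        elements_[top_++] = item;
    }

    int pop() {
        assert(!empty());
        return elements_[--top_];
    }

    int peek() const {
        assert(!empty());
        return elements_[top_ - 1];
    }

private:
    std::size_t top_;
    int elements_[kCapacity];
};
```

Because the constructor runs whenever a Stack object is created, there is no way to obtain a stack with an uninitialized top, and each object is an independent stack.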

    One other problem with this approach is its lack of adaptability. Once the type has been defined, the user
    cannot alter its behavior except by modifying its definition. This may at times be very restrictive.
    Consider defining a shape type in a graphics system. Although shapes may differ in many aspects, there are
    certain operations that can be applied to all shapes, such as drawing, rotating, translating, etc. At first
    we might come up with a central routine where we call the appropriate drawing routine of the specific shape
    type. An implementation, which is typical of non-object-oriented programming languages, would look as
    follows:


    enum kind { circle, triangle, rectangle };

    class Shape {
      // attributes common to all shapes
      kind k;
      ...
    public:
      // common interface of all shapes
      void draw(void);
      void rotate(int degree);
      ...
    };

    void Shape::draw(void) {
      switch (k) {
        case circle: // draw the circle
          break;
        case triangle: // draw the triangle
          break;
        case rectangle: // draw the rectangle
          break;
        default: error(...);
      } // end of switch(k)
    } // end of void Shape::draw(void)

    void Shape::rotate(int howManyDegrees) {
      switch (k) {
        case circle: // rotate the circle
          break;
        case triangle: // rotate the triangle
          break;
        case rectangle: // rotate the rectangle
          break;
        default: error();
      } // end of switch(k)
    } // end of void Shape::rotate(int)

    One weakness of the preceding solution is the requirement that the central functions draw and rotate must
    know about all kinds of different shapes. If you define a new shape, every operation on a shape must be
    examined and possibly modified. As a matter of fact, you cannot even add a new shape to the system unless
    you have access to the source code. And generally, you do not have such a privilege when you are a mere
    user of the class. So, the implementer faces a dilemma: either ship the implementation with missing shapes
    or indefinitely postpone shipping it until you make sure you have an exhaustive list of possible
    shapes.[8] Ideally, you would like to ship your software as early as you can and as fully functional as
    possible, while leaving it open to extension without modification of what has been shipped. This is
    referred to as the open-closed principle. But how can you possibly provide exhaustive functionality when
    you have limited time and you don't have much idea about the alternatives? The answer is to get the user to
    extend your software, and the keyword is inheritance.[9] In C++, you would provide the following solution:


    class Shape {
      // attributes common to all shapes. No attributes for holding the kind!
      ...
    public:
      // interface common to all shapes
      virtual void draw(void);
      virtual void rotate(int degree);
      ...
    }; // end of class Shape

    class Circle : public Shape {
      // attributes special to Circle.
      ...
    public:
      void draw(void) { /* circle-specific implementation of draw */ }
      void rotate(int degree) { /* circle-specific impl. of rotate */ }
      ...
    }; // end of class Circle

    class Triangle : public Shape {
      // attributes special to Triangle.
      ...
    public:
      void draw(void) { /* triangle-specific implementation of draw */ }
      void rotate(int degree) { /* triangle-specific impl. of rotate */ }
      ...
    }; // end of class Triangle

    class NewShape : public Shape {
      // attributes special to the new shape.
      ...
    public:
      void draw(void) { /* new shape-specific implementation of draw */ }
      void rotate(int degree) { /* new shape-specific impl. of rotate */ }
      ...
    }; // end of class NewShape

    In addition to the aforementioned advantages, it is the compiler that controls the whole process, whereas
    in the previous case it was the user doing all the bookkeeping.
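A compact, runnable variation on this design shows the dispatch at work. In this sketch, draw returns a description instead of drawing, and the names `Hexagon` and `draw_all` are invented for the example; the point is that `draw_all` never mentions a concrete shape, so a class added later needs no change to existing code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string draw() const = 0;  // returns a description instead of drawing
};

class Circle : public Shape {
public:
    std::string draw() const override { return "circle"; }
};

class Triangle : public Shape {
public:
    std::string draw() const override { return "triangle"; }
};

// Added later, without touching Shape, Circle, Triangle, or draw_all.
class Hexagon : public Shape {
public:
    std::string draw() const override { return "hexagon"; }
};

// Knows nothing about concrete shapes; the virtual call selects the right draw().
std::string draw_all(const std::vector<std::unique_ptr<Shape>>& shapes) {
    std::string out;
    for (const auto& shape : shapes) {
        assert(shape != nullptr);
        if (!out.empty()) out += ' ';
        out += shape->draw();
    }
    return out;
}
```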

    Notes

    1. Although C is a language conceived in the early 70's, bool was added to it only in 1999. Therefore, you
    will probably not see it being used very frequently. Instead, you will see conventional uses of integer
    values through macros and typedefs.

    2. A character taking up two bytes in memory does not mean that it will be serialized as a sequence of two
    bytes; that is, it will probably not consume two bytes of disk space or be transmitted as two bytes over the
    network. Different encoding techniques may be used. A commonly used technique is UTF-8, which assumes that
    the most commonly used characters are those found in the ASCII subset of the Unicode standard. In this
    scheme, the following mappings are used to encode individual characters:
    0000 0000 0xxx xxxx → 0xxx xxxx;
    0000 0xxx xxxx xxxx → 110x xxxx 10xx xxxx;
    xxxx xxxx xxxx xxxx → 1110 xxxx 10xx xxxx 10xx xxxx

    3. Automation is a Microsoft technology that allows you to take advantage of an existing program's content
    and functionality, and to incorporate it into your own applications. A typical example is the automation of
    MS Office applications (automation objects) by VBA (automation controller).


    4. The formulas given here are made slightly more complex to avoid having a 0th component. Normally, the
    compiler implementer would not add 1, just to subtract it in the next computation.

    5. This term admittedly evokes the image of a two-dimensional array. However, it covers arrays of any rank.

    6. This is not true for components of the same sub-array. numArr[0] and numArr[1], numArr[1][2] and
    numArr[1][3], and so on are guaranteed to reside in contiguous locations; otherwise, random access would
    have been ruled out. However, numArr[0][2] and numArr[1][0], which are neighboring components living in
    different sub-arrays, cannot be guaranteed to occupy adjacent locations.

    7. As a matter of fact, this is exactly why constructors are provided in object-oriented programming
    languages. Similarly, object-oriented programming languages without automatic garbage collection provide a
    destructor routine for each class.

    8. A third option would be giving out the source code of the class, which would spell disaster for a
    software company: bankruptcy.

    9. Note that inheritance is not the only way to extend your software. Composition can be used as an
    alternative to inheritance.

    Retrieved from "http://en.wikibooks.org/w/index.php?title=Programming_Language_Concepts_Using_C_and_C

    %2B%2B/Data_Level_Structure&oldid=2717092"

    This page was last modified on 23 October 2014, at 20:47.

    Text is available under the Creative Commons Attribution-ShareAlike License.; additional terms may apply.
