data structures using c


Upload: pdr-patnaik

Post on 25-May-2015

Category:

Education



DESCRIPTION

Data structures using C

TRANSCRIPT

Page 1: Data structures using C

1

DATA STRUCTURES USING C

Chapter 1 – Basic Concepts

Page 2: Data structures using C

2

Overview

Data: Data is a simple value or a set of values.

Entity: An entity is something that has certain attributes or properties which may be assigned values.

Field: It is a single elementary unit of information representing an attribute of an entity.

Record: A record is a collection of field values of a given entity.

File: A file is a collection of records of the entities in a given entity set.

Page 3: Data structures using C

3

Why Data Structures?

Problem solving consists of understanding the problem, designing a solution, and implementing the solution.

What exactly is a solution? Put crisply, a solution is equal to a program, which in turn is approximately equal to an algorithm.

Page 4: Data structures using C

4

Algorithm

An algorithm is a sequence of steps that takes us from the input to the output. An algorithm must be:

Correct. It should provide a correct solution according to the specifications.

Finite. It should terminate.

General. It should work for every instance of a problem.

Efficient. It should use few resources (such as time or memory).

Page 5: Data structures using C

5

What is Information?

It is an undefined term that cannot be explained in terms of more elementary concepts.

In computer science we can measure quantities of information.

The basic unit of information is the bit, a contraction of the term binary digit.

Page 6: Data structures using C

6

Definition

Data may be organized in many different ways; the logical or mathematical model of a particular organization of data is called a data structure.

A data structure can also be defined as a mathematical model that helps to store and retrieve data efficiently from primary memory.

Page 7: Data structures using C

7

Page 8: Data structures using C

8

System Life Cycle

Large-Scale program contains many complex interacting parts.

Programs undergo a development process called the system life cycle.

A solid foundation is needed in the methodologies of:

1. data abstraction
2. algorithm specification
3. performance analysis and measurement

Page 9: Data structures using C

9

5 phases of system life cycle:

1. Requirements: Begins with the purpose of the project. What are the inputs (needs) and the outputs (results)?

2. Analysis: Break the problem down into manageable pieces. Approaches: bottom-up vs. top-down; divide and conquer.

3. Design: First, identify the data objects and the operations on them, which leads to the creation of abstract data types; second, specify the algorithms and consider algorithm design strategies.

4. Refinement & coding: Choose representations for the data objects and write algorithms for each operation on them.

We should first write those algorithms that are independent of the data objects.

Page 10: Data structures using C

10

5. Verification: consists of developing correctness proofs for the program, testing the program with a variety of input data and removing errors.

Correctness proofs: Proofs are very time-consuming and difficult to develop for large projects. Scheduling constraints prevent the development of a complete set of proofs for a large system.

Testing: Testing is done before and during the coding phase. It is used at key checkpoints in the overall process to determine whether objectives are being met.

Error removal: System tests will indicate erroneous code. How easily errors can be removed depends on the design and coding decisions made earlier.

Page 11: Data structures using C

11

Pointers

For any type T in C there is a corresponding type pointer-to-T. The actual value of a pointer type is an address of memory.

& is the address operator.
* is the dereferencing (or indirection) operator.

Declaration

• int i, *pi; declares i as an integer variable and pi as a pointer to an integer.

• pi = &i; takes the address of i with &i and assigns it as the value of pi.

Page 12: Data structures using C

12

• i = 10; or
• *pi = 10;

In both cases the integer 10 is stored as the value of i. In the second case, the * in front of the pointer pi causes it to be dereferenced, by which we mean that instead of storing 10 into the pointer, 10 is stored into the location pointed at by the pointer pi.

The size of a pointer can be different on different computers.

On some computers, the size of a pointer to a char can even be longer than that of a pointer to a float.

To test for the null pointer in C, use if (pi == NULL) or if (!pi).
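The pointer operations above can be sketched in a few lines of C. This is only an illustration: the helper names store_through and pointer_demo are hypothetical and do not come from the slides.

```c
#include <stddef.h>

/* Stores value into the int that p points at.
   Returns 0 on success, -1 if p is the null pointer.
   (Hypothetical helper, for illustration only.) */
int store_through(int *p, int value)
{
    if (!p)            /* same test as (p == NULL) */
        return -1;
    *p = value;        /* dereference: write to the pointed-at location */
    return 0;
}

/* Mirrors the slide: i is an int and pi points at i. */
int pointer_demo(void)
{
    int i = 0;
    int *pi = &i;      /* &i yields the address of i */
    store_through(pi, 10);
    return i;          /* i now holds 10 although only *pi was assigned */
}
```

Calling pointer_demo() shows that writing through *pi changes i itself, which is the point made about dereferencing.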

Page 13: Data structures using C

13

Dynamic Memory Allocation

When you write a program you may not know how much space you will need. C provides a mechanism, called a heap, for allocating storage at run time.

The function malloc is used to allocate a new area of memory. If memory is available, a pointer to the start of an area of memory of the required size is returned; otherwise NULL is returned.

When memory is no longer needed you may free it by calling the free function.

In the example that follows, each call to malloc uses sizeof to determine the size of storage required to hold an int or a float.

The notations (int *) and (float *) are type cast expressions.

Page 14: Data structures using C

14

Example…!

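/* requires <stdio.h> and <stdlib.h> */
int i, *pi;
float f, *pf;
pi = (int *) malloc(sizeof(int));
pf = (float *) malloc(sizeof(float));
if (pi == NULL || pf == NULL) { /* malloc returns NULL when no memory is available */
    fprintf(stderr, "Insufficient memory\n");
    exit(EXIT_FAILURE);
}
*pi = 1024;
*pf = 3.124;
printf("an integer = %d, a float = %f\n", *pi, *pf);
free(pi);
free(pf);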

Page 15: Data structures using C

15

When programming in C it is wise practice to set all pointers to NULL when they are not pointing to an object.

Use explicit type casts when converting between pointer types:

pi = malloc(sizeof(int)); /* assign to pi a pointer to int */
pf = (float *) pi; /* casts an int pointer to a float pointer */

In many systems, pointers have the same size as type int. In C dialects before C99, int is also the default type specifier.

Page 16: Data structures using C

16

Algorithm Specification

An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.

Algorithms must satisfy the following criteria:

Input: There are zero or more quantities that are externally supplied.
Output: At least one quantity is produced.
Definiteness: Each instruction is clear and unambiguous.
Finiteness: For all cases, the algorithm terminates after a finite number of steps.
Effectiveness: Every instruction must be basic enough to be carried out; it must also be feasible.

Page 17: Data structures using C

17

Difference between an algorithm and a program: the program does not have to satisfy the fourth condition (finiteness).

Describing an algorithm: A natural language such as English will do. However, natural language tends to be wordy when a statement must be made definite. That is where the code of a programming language fits in.

Flowcharts: These work well only if the algorithm is small and simple.

Page 18: Data structures using C

18

Example: Translating a Problem into an Algorithm

Problem: Devise a program that sorts a set of n integers, where n >= 1, from smallest to largest.

Solution I (looks good, but it is not an algorithm): From those integers that are currently unsorted, find the smallest and place it next in the sorted list.

Solution II (an algorithm, written partially in C and partially in English):

for (i = 0; i < n; i++) {
    Examine list[i] to list[n-1] and suppose that the smallest integer is list[min];
    Interchange list[i] and list[min];
}

Page 19: Data structures using C

19

Example: Program

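#define SWAP(x, y, t) ((t) = (x), (x) = (y), (y) = (t))

/* selection sort: sorts list[0..n-1] into ascending order */
void sort(int list[], int n)
{
    int i, j, min, temp;
    for (i = 0; i < n - 1; i++) {
        min = i; /* index of the smallest element seen so far */
        for (j = i + 1; j < n; j++) {
            if (list[j] < list[min])
                min = j;
        }
        SWAP(list[i], list[min], temp);
    }
}

/* function alternative to the SWAP macro; call as swap(&list[i], &list[min]) */
void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}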

Page 20: Data structures using C

20

Recursive Algorithms

Direct recursion: Functions call themselves.

Indirect recursion: Functions call other functions that invoke the calling function again.

Any function that we can write using assignment, if-else, and while statements can be written recursively.

Page 21: Data structures using C

21

int compare(int x, int y)
{ /* returns -1, 0, or 1 as x is less than, equal to, or greater than y */
    if (x < y) return -1;
    else if (x == y) return 0;
    else return 1;
}

int binsearch(int list[], int searchnum, int left, int right)
{ /* search list[0] <= list[1] <= ... <= list[n-1] for searchnum;
     return its position if found, otherwise return -1 */
    int middle;
    while (left <= right) {
        middle = (left + right) / 2;
        switch (compare(list[middle], searchnum)) {
        case -1: left = middle + 1;
                 break;
        case 0:  return middle;
        case 1:  right = middle - 1;
        }
    }
    return -1;
}

Page 22: Data structures using C

22

Recursive Implementation of Binary Search

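#define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

int binsearch(int list[], int searchnum, int left, int right)
{ /* recursive version: each call searches list[left..right] */
    int middle;
    if (left <= right) {
        middle = (left + right) / 2;
        switch (COMPARE(list[middle], searchnum)) {
        case -1: return binsearch(list, searchnum, middle + 1, right);
        case 0:  return middle;
        case 1:  return binsearch(list, searchnum, left, middle - 1);
        }
    }
    return -1;
}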

Page 23: Data structures using C

23

Towers of Hanoi

Page 24: Data structures using C

24

void Hanoi(int n, char x, char y, char z)
{ /* move n disks from peg x to peg z, using peg y as the spare */
    if (n > 1) {
        Hanoi(n - 1, x, z, y);
        printf("Move disk %d from %c to %c.\n", n, x, z);
        Hanoi(n - 1, y, x, z);
    } else {
        printf("Move disk %d from %c to %c.\n", n, x, z);
    }
}
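The recursion above solves two sub-problems of size n-1 and makes one move in between, so the number of moves M(n) satisfies M(1) = 1 and M(n) = 2M(n-1) + 1, which gives 2^n - 1. The sketch below counts the moves with the same recursive shape; hanoi_moves is a hypothetical name that is not part of the slides.

```c
/* Number of single-disk moves made by the recursive scheme for n disks.
   (Hypothetical helper; it follows the structure of Hanoi but counts
   moves instead of printing them.) */
unsigned long hanoi_moves(int n)
{
    if (n <= 0)
        return 0;                      /* nothing to move */
    if (n == 1)
        return 1;                      /* base case: a single move */
    return 2 * hanoi_moves(n - 1) + 1; /* two sub-solutions plus one move */
}
```

For example, three disks need 7 moves and ten disks need 1023, which is why the running time grows exponentially with n.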

Page 25: Data structures using C

25

Data Abstraction

Data Type

A data type is a collection of objects and a set of operations that act on those objects.

Whether a program deals with predefined or user-defined data types, two aspects must be considered: the objects and the operations.

Example of "int"
Objects: 0, +1, -1, ..., INT_MAX, INT_MIN
Operations: arithmetic (+, -, *, /, and %), testing (equality/inequality), assignment, functions

Page 26: Data structures using C

26

Data Abstraction & Encapsulation

Abstraction: the extraction of essential information for a particular purpose while ignoring the remainder of the information.

Encapsulation: the process of binding together data and code. It can also be defined as the concept that an object totally separates its interface from its implementation.

It has been observed by many software designers that hiding the representation of objects of a data type from its users is a good design strategy.

Page 27: Data structures using C

27

Abstract Data Type

An abstract data type (ADT) is a data type that is organized in such a way that the specification of the objects and the operations on the objects is separated from the representation of the objects and the implementation of the operations.

ADT operations: only the operation name and its parameters are visible to the user, through the interface.

Why abstract data types? They are implementation-independent.

Abstraction is a generalization of operations with unspecified implementation.

Page 28: Data structures using C

28

Abstract data type model

Page 29: Data structures using C

29

Classifying the Functions of an ADT

Creator/constructor: Create a new instance of the designated type.

Transformers: Also create an instance of the designated type, by using one or more other instances.

Observers/reporters: Provide information about an instance of the type, but do not change the instance.

Page 30: Data structures using C

30

Abstract data type NaturalNumber

Structure NaturalNumber (denoted by Nat_No) is
  objects: an ordered subrange of the integers starting at zero and ending at the maximum integer (INT_MAX) on the computer
  functions: for all x, y ∈ Nat_No; TRUE, FALSE ∈ Boolean, and where +, -, <, and == are the usual integer operations

  Nat_No Zero()           ::= 0                                                        (creator)
  Boolean IsZero(x)       ::= if (x) return FALSE else return TRUE                     (observer)
  Nat_No Add(x, y)        ::= if ((x+y) <= INT_MAX) return x+y else return INT_MAX     (transformer)
  Boolean Equal(x, y)     ::= if (x == y) return TRUE else return FALSE                (observer)
  Nat_No Successor(x)     ::= if (x == INT_MAX) return x else return x+1               (transformer)
  Nat_No Subtract(x, y)   ::= if (x < y) return 0 else return x-y                      (transformer)
end NaturalNumber

Page 31: Data structures using C

31

Cont’d..

An ADT definition begins with the name of the ADT.

There are two main sections in the definition: the objects and the functions.

Objects are defined in terms of the integers, but with no explicit reference to their representation.

In the function definitions, x and y denote two elements of the data type NaturalNumber, while TRUE and FALSE are elements of the data type Boolean.

The symbol "::=" should be read as "is defined as."

Page 32: Data structures using C

32

Cont’d…

The first function, Zero, has no arguments and returns the natural number zero (a constructor function).

The function Successor(x) returns the next natural number in sequence (an example of a transformer function).

Other transformer functions are Add and Subtract.
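One possible C realization of the NaturalNumber ADT is sketched below. It assumes a plain non-negative int as the representation; the nat_ names are hypothetical and are not given in the slides. Note that the specification's test (x+y) <= INT_MAX cannot be written literally in C, because x+y would already overflow, so the sketch compares against INT_MAX - y instead.

```c
#include <limits.h>
#include <stdbool.h>

typedef int nat_no;   /* assumed representation: an int in 0..INT_MAX */

/* creator */
nat_no nat_zero(void) { return 0; }

/* observers */
bool nat_is_zero(nat_no x) { return x == 0; }
bool nat_equal(nat_no x, nat_no y) { return x == y; }

/* transformers */
nat_no nat_add(nat_no x, nat_no y)
{
    /* saturate at INT_MAX without ever evaluating an overflowing x + y */
    return (x <= INT_MAX - y) ? x + y : INT_MAX;
}

nat_no nat_successor(nat_no x) { return (x == INT_MAX) ? x : x + 1; }

nat_no nat_subtract(nat_no x, nat_no y) { return (x < y) ? 0 : x - y; }
```

A caller only needs these six signatures; swapping the typedef for another representation would not change user code, which is the implementation independence the ADT is meant to provide.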

Page 33: Data structures using C

33

Performance Analysis

Criteria upon which we can judge a program:

Does the program meet the original specifications of the task?
Does it work correctly?
Does the program contain documentation that shows how to use it and how it works?
Does the program effectively use functions to create logical units?
Is the program's code readable?

Space & Time

Does the program efficiently use primary and secondary storage?
Is the program's running time acceptable for the task?

Page 34: Data structures using C

34

Cont’d…

Performance evaluation is loosely divided into two fields:

The first field (performance analysis) focuses on obtaining estimates of time and space that are machine-independent.

The second field (performance measurement) obtains machine-dependent running times, which are used to identify inefficient code segments.

Page 35: Data structures using C

35

Cont’d…

Evaluating a program: meets specifications, works correctly, good user interface, well documented, readable, effectively uses functions, running time acceptable, efficiently uses space/storage.

How to achieve them? Good programming style, experience, and practice; discuss and think.

The space complexity of a program is the amount of memory that it needs to run to completion.

The time complexity of a program is the amount of computer time that it needs to run to completion.

Page 36: Data structures using C

36

Space Complexity

The space needed is the sum of the fixed space and the variable space.

Fixed space: C
Includes the instructions, variables, and constants.
Independent of the number and size of inputs and outputs.

Variable space: $S_P(I)$
Includes dynamic allocation and the stack space used by recursion.

Total space of any program P:

$S(P) = C + S_P(I)$

where C is a constant and $S_P(I)$ depends on the characteristics of the instance I.

Page 37: Data structures using C

37

Example1 to find space complexity

Algorithm P1(P, Q, R)
1. Start
2. Total = (P + Q + R*P + Q*R + P) / (P*Q);
3. End

In the above algorithm there are no instance characteristics: the space needed by P, Q, R and Total is independent of the instance. Therefore

S(P1) = 4 + 0

where one unit of space is required for each of P, Q, R and Total.

Page 38: Data structures using C

38

Ex 2

Algorithm SUM()
1. [Sum the values in an array A]
   Sum = 0
   Repeat for I = 1, 2, ..., N
       Sum = Sum + A[I]
2. [Finished]
   Exit.

In the above algorithm there is an instance characteristic, N, since A must be large enough to hold the N elements to be summed. The space needed by Sum, I, and N is independent of the instance characteristics, so we can write

S(Sum) = 3 + N

with N units for A[] and 3 units for N, I, and Sum.

Page 39: Data structures using C

39

Program 1.11: Recursive function for summing a list of numbers

float rsum(float list[], int n)
{
    if (n) return rsum(list, n - 1) + list[n - 1];
    return 0;
}

Figure 1.1: Space needed for one recursive call of Program 1.11

Type                                Name      Number of bytes
parameter: float                    list[]    4
parameter: integer                  n         4
return address (used internally)              4 (unless a far address)
TOTAL per recursive call                      12

With 12 bytes for each of the n recursive calls, $S_{rsum}(I) = S_{rsum}(n) = 12n$.

Page 40: Data structures using C

40

Time Complexity

The time T(P) taken by a program P is the sum of its compile time and its run (execution) time.

It depends on several factors:
The input to the program.
The time required by the compiler to generate the object code.
The speed of the CPU used to execute the program.

If we must know the running time, the best approach is to use the system clock to time the program.

Total time T(P) = compile time + run (or execution) time. A compiled program may be run many times without recompilation, so we are mainly concerned with the run time.

Page 41: Data structures using C

41

Cont’d…

How to evaluate?
Count the operations (+, -, *, /, ...) performed.
Use the system clock.
Count the number of steps performed, as a machine-independent function of the instance characteristics.

Definition of a program step: A program step is a syntactically or semantically meaningful program segment whose execution time is independent of the instance characteristics.

We cannot express the running time in standard time units such as hours, minutes, and seconds; rather, we write statements such as "the running time of such and such an algorithm is proportional to n".

Page 42: Data structures using C

42

Cont’d..

The time complexity of a given algorithm can be calculated in two steps:

1. Isolate a particular operation, called the ACTIVE operation, that is central to the algorithm and that is executed at least as often as any other.

2. Other operations, such as assignments, the manipulation of the index I, and accessing values in an array, are called BOOKKEEPING operations and are not generally counted.

After the active operation is isolated, the number of times that it is executed is counted.

Page 43: Data structures using C

43

Ex 1

Consider the algorithm to sum the values in a given array A that contains N values. I is an integer variable used as the index of A.

ALGORITHM SUM()
1. [Sum the values in an array A]
   sum = 0
   repeat for I = 1, 2, ..., N
       sum = sum + A[I]
2. [Finished]
   exit.

Page 44: Data structures using C

44

Explanation

In the above example, the operation to isolate is the addition that occurs when another array value is added to the partial sum.

After the active operation is isolated, the number of times that it is executed is counted.

The number of additions of values in the algorithm is N.

Execution time increases in proportion to the number of times the active operation is executed.

Thus the above algorithm has execution time proportional to N.

Page 45: Data structures using C

45

Tabular Method

Step count table for the iterative function to sum a list of numbers (s/e = steps per execution)

Statement                          s/e   Frequency   Total steps
float sum(float list[], int n)     0     0           0
{                                  0     0           0
    float tempsum = 0;             1     1           1
    int i;                         0     0           0
    for (i = 0; i < n; i++)        1     n+1         n+1
        tempsum += list[i];        1     n           n
    return tempsum;                1     1           1
}                                  0     0           0
Total                                                2n+3
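The total of 2n+3 can be checked by instrumenting the function with a counter that is incremented once for every step the table charges. The sketch below does this; the global count and the names sum_counted and steps_for are assumptions made for illustration only.

```c
static long count;   /* global step counter (an assumption of this sketch) */

/* Same logic as sum(), with one count++ per step charged in the table. */
float sum_counted(const float list[], int n)
{
    int i;
    float tempsum = 0;
    count++;                       /* tempsum = 0: 1 step */
    for (i = 0; i < n; i++) {
        count++;                   /* loop test that succeeded: n times */
        tempsum += list[i];
        count++;                   /* accumulation: n times */
    }
    count++;                       /* final, failing loop test: 1 step */
    count++;                       /* return: 1 step */
    return tempsum;
}

/* Resets the counter, runs the sum, and reports the steps used. */
long steps_for(const float list[], int n)
{
    count = 0;
    (void) sum_counted(list, n);
    return count;
}
```

For a list of 5 values the counter ends at 13, which matches 2n+3 with n = 5.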

Page 46: Data structures using C

46

Step count table for the recursive function to sum a list of numbers

Statement                                        s/e   Frequency   Total steps
float rsum(float list[], int n)                  0     0           0
{                                                0     0           0
    if (n)                                       1     n+1         n+1
        return rsum(list, n - 1) + list[n - 1];  1     n           n
    return 0;                                    1     1           1
}                                                0     0           0
Total                                                              2n+2

Page 47: Data structures using C

47

Asymptotic Notation (O, Ω, Θ)

Exact step counts: Using exact step counts to compare the time complexity of two programs that compute the same function is a difficult task for most programs, so asymptotic notation is used instead.

Big "oh" (O): upper bound (current trend)

Omega (Ω): lower bound

Theta (Θ): upper and lower bound

Page 48: Data structures using C

48

Cont’d…

Asymptotic notation is the terminology that enables meaningful statements about time and space complexity.

The time required by a given algorithm falls into three cases:
1. Best-case time: the minimum time required to execute the program.
2. Average-case time: the average time required to execute the program.
3. Worst-case time: the maximum time required to execute the program.

Page 49: Data structures using C

49

Break Even Point

Suppose we have two programs with complexities $c_1 n^2 + c_2 n$ and $c_3 n$ respectively.

We know that the one with complexity $c_3 n$ will be faster than the one with complexity $c_1 n^2 + c_2 n$ for large values of n.

For small values of n, either could be faster.
If $c_1 = 1$, $c_2 = 2$ and $c_3 = 100$, then $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 98$.
If $c_1 = 1$, $c_2 = 2$ and $c_3 = 1000$, then $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 998$.
No matter what the values of $c_1$, $c_2$ and $c_3$, there will be an n beyond which the program with complexity $c_3 n$ will be faster.

This value of n is called the break-even point. (The term comes from economics, where the break-even point (BEP) is the point at which cost or expenses and revenue are equal: there is no net loss or gain.)
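The two numeric cases above can be reproduced with a short search. The function below is a sketch under the assumption that all three constants are positive integers; break_even is a hypothetical name, and it returns the first n from which the linear program is strictly faster.

```c
/* Smallest n >= 1 for which c3*n < c1*n*n + c2*n.
   Assumes c1 > 0, so the quadratic eventually dominates and the loop ends. */
long break_even(long c1, long c2, long c3)
{
    long n = 1;
    while (c1 * n * n + c2 * n <= c3 * n)
        n++;                /* quadratic program is still at least as fast */
    return n;
}
```

With c1 = 1, c2 = 2 and c3 = 100 the search stops at n = 99, one past the bound n <= 98 stated above; with c3 = 1000 it stops at n = 999.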

Page 50: Data structures using C

50

Asymptotic Notation BIG OH (O)

Definition
$f(n) = O(g(n))$ iff there exist two positive constants c and $n_0$ such that $f(n) \le c\,g(n)$ for all n, $n \ge n_0$.

Examples
$3n + 2 = O(n)$ as $3n + 2 \le 4n$ for all $n \ge 2$
$10n^2 + 4n + 2 = O(n^2)$ as $10n^2 + 4n + 2 \le 11n^2$ for $n \ge 5$
$3n + 2 \ne O(1)$, $10n^2 + 4n + 2 \ne O(n)$

Remarks
g(n) is an upper bound; it should be the least upper bound we can find, since the chain $n = O(n^2) = O(n^{2.5}) = O(n^3) = O(2^n)$ is true but uninformative.
$O(1)$: constant, $O(n)$: linear, $O(n^2)$: quadratic, $O(n^3)$: cubic, and $O(2^n)$: exponential

Page 51: Data structures using C

51

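Asymptotic Notation BIG OMEGA (Ω)

Definition
$f(n) = \Omega(g(n))$ iff there exist two positive constants c and $n_0$ such that $f(n) \ge c\,g(n)$ for all n, $n \ge n_0$.

Examples
$3n + 2 = \Omega(n)$ as $3n + 2 \ge 3n$ for $n \ge 1$
$10n^2 + 4n + 2 = \Omega(n^2)$ as $10n^2 + 4n + 2 \ge n^2$ for $n \ge 1$
$6 \cdot 2^n + n^2 = \Omega(2^n)$ as $6 \cdot 2^n + n^2 \ge 2^n$ for $n \ge 1$

Remarks
g(n) is a lower bound; it should be the largest lower bound we can find (Ω is typically used to bound problems). The following are true but uninformative:
$3n + 3 = \Omega(1)$, $10n^2 + 4n + 2 = \Omega(n)$, $6 \cdot 2^n + n^2 = \Omega(n^{100})$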

Page 52: Data structures using C

52

Asymptotic Notation BIG THETA (Θ)

Definition
$f(n) = \Theta(g(n))$ iff there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for all n, $n \ge n_0$.

Examples
$3n + 2 = \Theta(n)$ as $3n + 2 \ge 3n$ for $n \ge 1$ and $3n + 2 \le 4n$ for all $n \ge 2$
$10n^2 + 4n + 2 = \Theta(n^2)$; $6 \cdot 2^n + n^2 = \Theta(2^n)$

Remarks
Θ is both an upper and a lower bound.
$3n + 2 \ne \Theta(1)$; $10n^2 + 4n + 2 \ne \Theta(n)$
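A proposed choice of constants can be sanity-checked numerically. The sketch below tests the witness c1 = 3, c2 = 4, n0 = 2 for f(n) = 3n + 2 and g(n) = n over a finite range; the function names are hypothetical, and a finite check can only refute a bad choice of constants, it does not prove the bound for all n.

```c
#include <stdbool.h>

/* f(n) = 3n + 2 and g(n) = n, the first Theta example from the slides */
static long f(long n) { return 3 * n + 2; }
static long g(long n) { return n; }

/* Returns true if c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit]. */
bool theta_witness_holds(long c1, long c2, long n0, long limit)
{
    for (long n = n0; n <= limit; n++)
        if (c1 * g(n) > f(n) || f(n) > c2 * g(n))
            return false;   /* found a counterexample */
    return true;
}
```

Starting the same check at n0 = 1 fails, because 3(1) + 2 = 5 exceeds 4(1); this is why the upper bound is stated for n >= 2.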

Page 53: Data structures using C

53

Example of Time Complexity Analysis

Statement                                    Asymptotic complexity
void add(int a[][Max.......)                 0
{                                            0
    int i, j;                                0
    for (i = 0; i < rows; i++)               Θ(rows)
        for (j = 0; j < cols; j++)           Θ(rows*cols)
            c[i][j] = a[i][j] + b[i][j];     Θ(rows*cols)
}                                            0
Total                                        Θ(rows*cols)

Page 54: Data structures using C

54

Function values

log n    n     n log n   n^2     n^3      2^n
0        1     0         1       1        2
1        2     2         4       8        4
2        4     8         16      64       16
3        8     24        64      512      256
4        16    64        256     4096     65536
5        32    160       1024    32768    4,….

Page 55: Data structures using C

55

Plot of function values (figure not reproduced; the curves shown are n, log n, and n log n)

Page 56: Data structures using C

56

Performance Measurement

Clocking

The functions we need to time events are part of C's library and are accessed through the statement #include <time.h>.

There are actually two different methods for timing events in C. Figure 1.10 shows the major differences between these two methods.

Method 1 uses clock to time events. This function gives the amount of processor time that has elapsed since the program began running.

Method 2 uses time to time events. This function returns the time, measured in seconds, as the built-in type time_t.

The exact syntax of the timing functions varies from computer to computer and also depends on the operating system and compiler in use.

Page 57: Data structures using C

57

                     Method 1                                                Method 2
Start timing         start = clock();                                        start = time(NULL);
Stop timing          stop = clock();                                         stop = time(NULL);
Type returned        clock_t                                                 time_t
Result in seconds    duration = ((double)(stop - start)) / CLOCKS_PER_SEC;   duration = (double) difftime(stop, start);

(CLOCKS_PER_SEC is the standard macro; some older compilers call it CLK_TCK.)
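Method 1 can be wrapped in a small helper as sketched below. The workload busy_sum is a hypothetical stand-in for the code being measured, and time_busy_sum is likewise an illustrative name; only clock, clock_t and CLOCKS_PER_SEC come from <time.h>.

```c
#include <time.h>

/* Work to be timed: a hypothetical stand-in that sums 0..n-1.
   volatile keeps the compiler from optimizing the loop away. */
static double busy_sum(long n)
{
    volatile double s = 0;
    for (long i = 0; i < n; i++)
        s += (double) i;
    return s;
}

/* Returns the processor time, in seconds, used by one call busy_sum(n). */
double time_busy_sum(long n)
{
    clock_t start = clock();
    (void) busy_sum(n);
    clock_t stop = clock();
    return ((double) (stop - start)) / CLOCKS_PER_SEC;
}
```

For short runs the result may be 0 because of the clock's resolution, so in practice the measured code is usually repeated many times and the total divided by the repetition count.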

Page 58: Data structures using C

58

Generating Test Data

In some cases it is necessary to use a computer program to generate the worst-case data.

In other cases even this is difficult, and another approach to estimating worst-case performance is taken:

For each set of values of the instance characteristics of interest, we generate a suitably large number of random test data.

The run times for each of these test data are obtained.

The maximum of these times is used as an estimate of the worst-case time for that set of values.

Page 59: Data structures using C

59

Polynomials

Arrays are not only data structures in their own right; they can also be used to implement other abstract data types.

One of the simplest and most commonly found data structures is the ordered, or linear, list.

Examples:
Days of the week: (Sun, Mon, Tue, ...)
Values in a deck of cards: (Ace, 2, 3, 4, 5, 6, 7)
Years Switzerland fought in World War II: ()

The last example is an empty list, which we denote as (). The other lists all contain items that are written in the form $(item_0, item_1, \ldots, item_{n-1})$.

Page 60: Data structures using C

60

Operations on lists include:
Finding the length, n, of a list.
Reading the items in a list from left to right (or right to left).

A polynomial is a sum of terms, where each term has the form $a x^e$, where x is the variable, a is the coefficient, and e is the exponent.

Example: $A(x) = 3x^{20} + 2x^5 + 4$

The largest exponent of a polynomial is called its degree.

The term with exponent equal to zero does not show the variable, since x raised to the power zero is 1.

Page 61: Data structures using C

61

There are standard mathematical definitions for the sum and product of polynomials.

Assume that we have two polynomial...

Similarly, we can define subtraction and division on polynomials, as well as other operations.

Page 62: Data structures using C

62

Structure Polynomial is
  objects: a set of ordered pairs <e_i, a_i>, where a_i is in Coefficients and e_i is in Exponents; the e_i are integers >= 0
  functions: for all poly, poly1, poly2 ∈ Polynomial, coef ∈ Coefficients, expon ∈ Exponents

  Polynomial Zero() ::= return the polynomial p(x) = 0
  Coefficient Coef(poly, expon) ::= if (expon ∈ poly) return its coefficient else return zero
  Exponent Lead_Exp(poly) ::= return the largest exponent in poly
  Polynomial Attach(poly, coef, expon) ::= if (expon ∈ poly) return error else return the polynomial poly with the term <coef, expon> inserted

Example polynomials: $A(x) = 3x^{20} + 2x^5 + 4$, $B(x) = x^4 + 10x^3 + 3x^2 + 1$

Page 63: Data structures using C

63

  Polynomial Remove(poly, expon) ::= if (expon ∈ poly) return the polynomial poly with the term whose exponent is expon deleted else return error
  Polynomial SingleMult(poly, coef, expon) ::= return the polynomial poly • coef • x^expon
  Polynomial Add(poly1, poly2) ::= return the polynomial poly1 + poly2
  Polynomial Mult(poly1, poly2) ::= return the polynomial poly1 • poly2

Page 64: Data structures using C

64

Polynomial representation

A very reasonable first decision is to require unique exponents arranged in decreasing order.

The addition algorithm works by comparing terms from the two polynomials until one or both of the polynomials become empty.

The switch statement performs the comparison and adds the proper term to the new polynomial d.

One way to represent polynomials in C is to use typedef to create the type polynomial.

Page 65: Data structures using C

65

#define MAX_DEGREE 101 /* maximum degree of polynomial + 1 */
typedef struct {
    int degree;
    float coef[MAX_DEGREE];
} polynomial;

/* d = a + b, where a, b, and d are polynomials (pseudocode using the ADT functions) */
d = Zero();
while (!IsZero(a) && !IsZero(b)) do {
    switch COMPARE(Lead_Exp(a), Lead_Exp(b)) {
    case -1: d = Attach(d, Coef(b, Lead_Exp(b)), Lead_Exp(b));
             b = Remove(b, Lead_Exp(b));
             break;
    case 0:  sum = Coef(a, Lead_Exp(a)) + Coef(b, Lead_Exp(b));
             if (sum) {
                 d = Attach(d, sum, Lead_Exp(a));
             }
             /* remove both leading terms even when they cancel */
             a = Remove(a, Lead_Exp(a));
             b = Remove(b, Lead_Exp(b));
             break;
    case 1:  d = Attach(d, Coef(a, Lead_Exp(a)), Lead_Exp(a));
             a = Remove(a, Lead_Exp(a));
    }
}
insert any remaining terms of a or b into d
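With the typedef above, where coef[e] is taken to hold the coefficient of x^e, addition can also be done directly on the arrays instead of through the Lead_Exp/Attach/Remove loop. The sketch below shows that simpler variant; the indexing convention and the names poly_zero and poly_add are assumptions, not part of the slides.

```c
#define MAX_DEGREE 101   /* maximum degree of polynomial + 1 */

typedef struct {
    int degree;
    float coef[MAX_DEGREE];   /* assumed convention: coef[e] multiplies x^e */
} polynomial;

/* Returns the polynomial p(x) = 0. */
polynomial poly_zero(void)
{
    polynomial p = { 0, { 0 } };
    return p;
}

/* Returns a + b by adding coefficients of equal exponents. */
polynomial poly_add(const polynomial *a, const polynomial *b)
{
    polynomial d = poly_zero();
    int hi = (a->degree > b->degree) ? a->degree : b->degree;
    for (int e = 0; e <= hi; e++) {
        float ca = (e <= a->degree) ? a->coef[e] : 0;
        float cb = (e <= b->degree) ? b->coef[e] : 0;
        d.coef[e] = ca + cb;
    }
    /* leading terms may cancel, so search down for the true degree */
    d.degree = hi;
    while (d.degree > 0 && d.coef[d.degree] == 0)
        d.degree--;
    return d;
}
```

This representation is simple but stores MAX_DEGREE coefficients for every polynomial, which wastes space when most coefficients are zero, as in a polynomial with only a few high-degree terms.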

Page 66: Data structures using C

66

Array representation of two polynomials

$A(x) = 2x^{1000} + 1$, $B(x) = x^4 + 10x^3 + 3x^2 + 1$

index    0      1    2    3    4    5    6
coef     2      1    1    10   3    1
exp      1000   0    4    3    2    0

starta = 0, finisha = 1, startb = 2, finishb = 5, avail = 6

specification    representation
poly             <start, finish>
A                <0, 1>
B                <2, 5>
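The figure corresponds to one shared global array of terms plus a few index variables. A minimal set of declarations is sketched below; the capacity MAX_TERMS = 100 and the type name term_t are assumptions, since the slides do not show these declarations.

```c
#define MAX_TERMS 100   /* assumed capacity; the slides do not give a value */

typedef struct {
    float coef;
    int expon;
} term_t;               /* hypothetical name for one <coef, expon> pair */

/* All polynomials share this array; each is a <start, finish> index pair. */
term_t terms[MAX_TERMS] = {
    {2, 1000}, {1, 0},                  /* A(x): indices 0..1 */
    {1, 4}, {10, 3}, {3, 2}, {1, 0}     /* B(x): indices 2..5 */
};
int starta = 0, finisha = 1;
int startb = 2, finishb = 5;
int avail = 6;                          /* next free slot in terms[] */
```

A(x) needs only two slots here even though its degree is 1000, which is the advantage of storing nonzero terms only.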

Page 67: Data structures using C

67

Polynomial Addition

Add two polynomials: D = A + B

void padd(int starta, int finisha, int startb, int finishb, int *startd, int *finishd)
{ /* add A(x) and B(x) to obtain D(x) */
    float coefficient;
    *startd = avail;
    while (starta <= finisha && startb <= finishb)
        switch (COMPARE(terms[starta].expon, terms[startb].expon)) {
        case -1: /* a expon < b expon */
            attach(terms[startb].coef, terms[startb].expon);
            startb++;
            break;

Page 68: Data structures using C

68

        case 0: /* equal exponents */
            coefficient = terms[starta].coef + terms[startb].coef;
            if (coefficient)
                attach(coefficient, terms[starta].expon);
            starta++;
            startb++;
            break;
        case 1: /* a expon > b expon */
            attach(terms[starta].coef, terms[starta].expon);
            starta++;
        }

Page 69: Data structures using C

69

    /* add in remaining terms of A(x) */
    for (; starta <= finisha; starta++)
        attach(terms[starta].coef, terms[starta].expon);
    /* add in remaining terms of B(x) */
    for (; startb <= finishb; startb++)
        attach(terms[startb].coef, terms[startb].expon);
    *finishd = avail - 1;
}

Analysis: O(n + m), where n and m are the numbers of nonzero terms in A and B respectively.

*Program 2.5: Function to add two polynomials (p.64)

Page 70: Data structures using C

70

void attach(float coefficient, int exponent)
{ /* add a new term to the polynomial */
    if (avail >= MAX_TERMS) {
        fprintf(stderr, "Too many terms in the polynomial\n");
        exit(1);
    }
    terms[avail].coef = coefficient;
    terms[avail++].expon = exponent;
}

Problem: Compaction is required when polynomials are no longer needed (data movement takes time).

*Program 2.6: Function to add a new term (p.65)

Page 71: Data structures using C

71

Sparse Matrices

Some problems require a lot of zeros to be stored as part of the solution.

Storing a large number of zeros is a waste of memory.

Matrices with a relatively high proportion of zero entries are called sparse matrices.

Sparse matrices are used in almost all areas of the natural sciences.

Page 72: Data structures using C

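72

Sparse Matrix

(Figure: two example matrices. The first is a dense 5*3 matrix in which 15/15 entries are nonzero; its entries are not recoverable here. The second is the sparse 6*6 matrix below, in which 8/36 entries are nonzero.)

        col0  col1  col2  col3  col4  col5
row0      15     0     0    22     0   -15
row1       0    11     3     0     0     0
row2       0     0     0    -6     0     0
row3       0     0     0     0     0     0
row4      91     0     0     0     0     0
row5       0     0    28     0     0     0

What is a good data structure for a sparse matrix?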

Page 73: Data structures using C

73

SPARSE MATRIX ABSTRACT DATA TYPE

Structure Sparse_Matrix is
  objects: a set of triples, <row, column, value>, where row and column are integers and form a unique combination, and value comes from the set item.
  functions: for all a, b ∈ Sparse_Matrix, x ∈ item, i, j, max_col, max_row ∈ index

  Sparse_Matrix Create(max_row, max_col) ::= return a Sparse_Matrix that can hold up to max_items = max_row × max_col and whose maximum row size is max_row and whose maximum column size is max_col.

Page 74: Data structures using C

74

  Sparse_Matrix Transpose(a) ::= return the matrix produced by interchanging the row and column value of every triple.
  Sparse_Matrix Add(a, b) ::= if the dimensions of a and b are the same return the matrix produced by adding corresponding items, namely those with identical row and column values; else return error.
  Sparse_Matrix Multiply(a, b) ::= if the number of columns in a equals the number of rows in b return the matrix d produced by multiplying a by b according to the formula $d[i][j] = \sum_k a[i][k] \cdot b[k][j]$, where d[i][j] is the (i,j)th element; else return error.

Page 75: Data structures using C

75

Sparse Matrix Representation

Consider a matrix that contains a large number of zero entries.

It can be stored as an array of triples. The first row of this array stores the following information:
1. Location [0,0] stores the number of rows of the original matrix.
2. Location [0,1] stores the number of columns of the original matrix.
3. Location [0,2] stores the number of nonzero entries of the original matrix.

Page 76: Data structures using C

76

(a)                              (b)
        row  col  value                  row  col  value
a[0]    6    6    8              b[0]    6    6    8
 [1]    0    0    15              [1]    0    0    15
 [2]    0    3    22              [2]    0    4    91
 [3]    0    5    -15             [3]    1    1    11
 [4]    1    1    11              [4]    2    1    3
 [5]    1    2    3               [5]    2    5    28
 [6]    2    3    -6              [6]    3    0    22
 [7]    4    0    91              [7]    3    2    -6
 [8]    5    2    28              [8]    5    0    -15

*Figure 2.4: Sparse matrix and its transpose stored as triples (p.69)

(1) Represented by a two-dimensional array, a sparse matrix wastes space.
(2) Each element is characterized by <row, col, value>.

In a[0], row and col hold the number of rows and columns, and value holds the number of nonzero terms. The remaining triples are stored with rows, and columns within each row, in ascending order. (b) is the transpose of (a).

Page 77: Data structures using C

77

Create Operation

Sparse_Matrix Create(max_row, max_col) ::=

#define MAX_TERMS 101 /* maximum number of terms + 1 */
typedef struct {
    int col;
    int row;
    int value;
} term;
term a[MAX_TERMS];

a[0].row and a[0].col hold the number of rows and columns; a[0].value holds the number of nonzero terms.
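To make the triple representation concrete, the sketch below converts an ordinary two-dimensional array into the term array described above. The fixed dimension N and the function name to_triples are assumptions made for illustration; scanning row by row yields the triples in the required row-major ascending order.

```c
#define MAX_TERMS 101   /* maximum number of terms + 1 */
#define N 6             /* dimension of the example matrix (an assumption) */

typedef struct {
    int col;
    int row;
    int value;
} term;

/* Fills a[] from the dense matrix m and returns the number of nonzero
   terms, or -1 if they do not fit in the MAX_TERMS - 1 available slots. */
int to_triples(int m[N][N], term a[])
{
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (m[i][j] != 0) {
                if (count + 1 >= MAX_TERMS)
                    return -1;          /* no room for another triple */
                count++;
                a[count].row = i;
                a[count].col = j;
                a[count].value = m[i][j];
            }
    a[0].row = N;                       /* header: number of rows */
    a[0].col = N;                       /* header: number of columns */
    a[0].value = count;                 /* header: number of nonzero terms */
    return count;
}
```

Applied to a 6*6 matrix with 8 nonzero entries, this stores 9 triples instead of 36 integers.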

Page 78: Data structures using C

78

Transposing a matrix

(1) For each row i, take element <i, j, value> and store it in element <j, i, value> of the transpose.

Difficulty: where to put <j, i, value>?

    (0, 0, 15)  ====> (0, 0, 15)
    (0, 3, 22)  ====> (3, 0, 22)
    (0, 5, -15) ====> (5, 0, -15)
    (1, 1, 11)  ====> (1, 1, 11)

Elements already placed must be moved down very often.

(2) For all elements in column j, place element <i, j, value> in element <j, i, value>.
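To make the difficulty with method (1) concrete, the sketch below inserts each swapped triple into its sorted position and counts how many already-placed triples have to be shifted. The function name transpose_by_insertion and the move counter are assumptions made for this illustration; this is not the algorithm the slides go on to use.

```c
#define MAX_TERMS 101   /* assumed capacity: maximum number of terms + 1 */

typedef struct {
    int row;
    int col;
    int value;
} term;

/* Method (1): take each triple of a in order, swap row and col, and insert
   it into b so that b stays sorted by (row, col).
   Returns how many already-placed triples had to be moved down. */
long transpose_by_insertion(const term a[], term b[])
{
    int n = a[0].value;
    int i, k;
    long moves = 0;

    b[0].row = a[0].col;
    b[0].col = a[0].row;
    b[0].value = n;

    for (i = 1; i <= n; i++) {
        term t;
        t.row = a[i].col;
        t.col = a[i].row;
        t.value = a[i].value;

        /* shift every larger triple down one slot to make room for t */
        for (k = i - 1;
             k >= 1 && (b[k].row > t.row ||
                        (b[k].row == t.row && b[k].col > t.col));
             k--) {
            b[k + 1] = b[k];
            moves++;
        }
        b[k + 1] = t;
    }
    return moves;
}
```

In the worst case every insertion shifts all earlier triples, which is why method (2) processes the elements column by column instead.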

Page 79: Data structures using C


79

    void transpose(term a[], term b[])
    /* b is set to the transpose of a */
    {
        int n, i, j, currentb;
        n = a[0].value;          /* total number of elements */
        b[0].row = a[0].col;     /* rows in b = columns in a */
        b[0].col = a[0].row;     /* columns in b = rows in a */
        b[0].value = n;
        if (n > 0) {             /* nonzero matrix */
            currentb = 1;
            for (i = 0; i < a[0].col; i++)      /* transpose by columns in a */
                for (j = 1; j <= n; j++)        /* find elements from the current column */
                    if (a[j].col == i) {        /* element is in current column, add it to b */

Page 80: Data structures using C

80

                        b[currentb].row = a[j].col;
                        b[currentb].col = a[j].row;
                        b[currentb].value = a[j].value;
                        currentb++;
                    }
        }
    }

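Running time: the outer loop runs once per column and the inner loop scans all elements, giving O(columns * elements).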

Page 81: Data structures using C

81

    void fast_transpose(term a[], term b[])
    {
        /* the transpose of a is placed in b */
        int row_terms[MAX_COL], starting_pos[MAX_COL];
        int i, j, num_cols = a[0].col, num_terms = a[0].value;
        b[0].row = num_cols;
        b[0].col = a[0].row;
        b[0].value = num_terms;
        if (num_terms > 0) {    /* nonzero matrix */
            for (i = 0; i < num_cols; i++)
                row_terms[i] = 0;
            for (i = 1; i <= num_terms; i++)
                row_terms[a[i].col]++;
            starting_pos[0] = 1;
            for (i = 1; i < num_cols; i++)
                starting_pos[i] = starting_pos[i-1] + row_terms[i-1];

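Loop costs, in order: O(columns) to clear row_terms, O(elements) to count the terms in each column, O(columns) to compute starting_pos.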

Page 82: Data structures using C

82

            for (i = 1; i <= num_terms; i++) {
                j = starting_pos[a[i].col]++;
                b[j].row = a[i].col;
                b[j].col = a[i].row;
                b[j].value = a[i].value;
            }
        }
    }

Compared with the 2-D array representation: O(columns + elements) vs. O(columns * rows).

As elements --> columns * rows, O(columns + elements) --> O(columns * rows).

Cost: additional row_terms and starting_pos arrays are required. Improvement: let the two arrays row_terms and starting_pos share storage.
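One possible reading of the suggestion to share the two arrays is sketched below: a single array first holds the per-column counts, shifted by one slot, and is then turned in place into starting offsets by a running sum. The name fast_transpose_shared, the array pos, the value of MAX_COL, and the int return code are assumptions made for this example.

```c
#define MAX_TERMS 101   /* assumed capacity: maximum number of terms + 1 */
#define MAX_COL 50      /* assumed maximum number of columns */

typedef struct {
    int row;
    int col;
    int value;
} term;

/* Transpose a into b using one auxiliary array instead of two.
   Returns 0 on success, -1 if a has more than MAX_COL columns. */
int fast_transpose_shared(const term a[], term b[])
{
    int pos[MAX_COL + 1];
    int i, j, num_cols = a[0].col, num_terms = a[0].value;

    if (num_cols > MAX_COL)
        return -1;

    b[0].row = num_cols;
    b[0].col = a[0].row;
    b[0].value = num_terms;

    if (num_terms > 0) {
        for (i = 0; i <= num_cols; i++)
            pos[i] = 0;
        /* count the terms of column c in slot c + 1 */
        for (i = 1; i <= num_terms; i++)
            pos[a[i].col + 1]++;
        /* running sum: pos[c] becomes the number of terms in columns before c */
        for (i = 1; i <= num_cols; i++)
            pos[i] += pos[i - 1];
        /* place each term; b is 1-based, hence the "1 +" */
        for (i = 1; i <= num_terms; i++) {
            j = 1 + pos[a[i].col]++;
            b[j].row = a[i].col;
            b[j].col = a[i].row;
            b[j].value = a[i].value;
        }
    }
    return 0;
}
```

The running time stays O(columns + elements); only the auxiliary storage drops from two arrays to one array with one extra slot.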

Page 83: Data structures using C

83

Sparse Matrix Multiplication

Definition: [D]m×p = [A]m×n × [B]n×p

Procedure: Fix a row of A and find all elements in column j of B for j=0, 1, …, p-1.

Example:

    | 1 0 0 |     | 1 1 1 |     | 1 1 1 |
    | 1 0 0 |  ×  | 0 0 0 |  =  | 1 1 1 |
    | 1 0 0 |     | 0 0 0 |     | 1 1 1 |

Page 84: Data structures using C

84

    void mmult(term a[], term b[], term d[])
    /* multiply two sparse matrices */
    {
        int i, j, column, totalb = b[0].value, totald = 0;
        int rows_a = a[0].row, cols_a = a[0].col, totala = a[0].value;
        int cols_b = b[0].col;
        int row_begin = 1, row = a[1].row, sum = 0;
        term new_b[MAX_TERMS];      /* transpose of b, stored as triples */
        if (cols_a != b[0].row) {
            fprintf(stderr, "Incompatible matrices\n");
            exit(1);
        }

Page 85: Data structures using C

85

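        fast_transpose(b, new_b);
        /* set boundary condition */
        a[totala+1].row = rows_a;
        new_b[totalb+1].row = cols_b;
        new_b[totalb+1].col = 0;
        for (i = 1; i <= totala; ) {
            column = new_b[1].row;
            for (j = 1; j <= totalb+1; ) {
                /* multiply row of a by column of b */
                if (a[i].row != row) {
                    /* row of a exhausted: store sum, skip the rest of this column */
                    storesum(d, &totald, row, column, &sum);
                    i = row_begin;
                    for (; new_b[j].row == column; j++)
                        ;
                    column = new_b[j].row;
                }
                else if (new_b[j].row != column) {
                    /* column of b exhausted: store sum, restart the row of a */
                    storesum(d, &totald, row, column, &sum);
                    i = row_begin;
                    column = new_b[j].row;
                }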

Page 86: Data structures using C

86

                else switch (COMPARE(a[i].col, new_b[j].col)) {
                    case -1: /* go to next term in a */
                        i++;
                        break;
                    case 0: /* add terms, go to next term in a and b */
                        sum += (a[i++].value * new_b[j++].value);
                        break;
                    case 1: /* advance to next term in b */
                        j++;
                }
            } /* end of for j <= totalb+1 */
            for (; a[i].row == row; i++)
                ;
            row_begin = i;
            row = a[i].row;
        } /* end of for i <= totala */
        d[0].row = rows_a;
        d[0].col = cols_b;
        d[0].value = totald;
    }

Page 87: Data structures using C

87

We store the matrices A, B and D in the arrays a, b and d respectively.

To place a triple in d and to reset sum to 0, mmult uses storesum.

In addition, mmult uses several local variables, described next.

The variable row is the row of A that we are currently multiplying with the columns in B.

The variable row_begin is the position in a of the first element of the current row, and the variable column is the column of B that we are currently multiplying with a row in A.
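The slides call storesum and COMPARE without showing them. The following is a minimal sketch of what they might look like, inferred only from how mmult uses them: the exact bodies, the overflow message, and the field order of term are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TERMS 101   /* assumed capacity: maximum number of terms + 1 */

typedef struct {
    int row;
    int col;
    int value;
} term;

/* Three-way comparison: -1 if x < y, 0 if x == y, 1 if x > y */
#define COMPARE(x, y) (((x) < (y)) ? -1 : (((x) == (y)) ? 0 : 1))

/* If *sum is nonzero, append <row, column, *sum> to d as the next triple
   and reset *sum to 0. A zero sum is simply not stored. */
void storesum(term d[], int *totald, int row, int column, int *sum)
{
    if (*sum == 0)
        return;
    if (*totald + 1 >= MAX_TERMS) {
        fprintf(stderr, "Number of terms in product exceeds %d\n", MAX_TERMS - 1);
        exit(1);
    }
    (*totald)++;
    d[*totald].row = row;
    d[*totald].col = column;
    d[*totald].value = *sum;
    *sum = 0;
}
```

Passing totald and sum by pointer lets storesum update the caller's counters, which matches the call storesum(d, &totald, row, column, &sum) in mmult.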

Page 88: Data structures using C

88

Compared with matrix multiplication using arrays:

    for (i = 0; i < rows_a; i++)
        for (j = 0; j < cols_b; j++) {
            sum = 0;
            for (k = 0; k < cols_a; k++)
                sum += (a[i][k] * b[k][j]);
            d[i][j] = sum;
        }
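For comparison, the array version can be wrapped in a small function. This is a sketch under the assumption of square N×N matrices; the name dense_mult, the constant N, and the nonzero counter are introduced only for this example.

```c
#define N 3   /* assumed matrix dimension for this sketch */

/* Classic triple loop: d = a * b for N x N integer matrices.
   Returns the number of nonzero entries in d. */
int dense_mult(int a[N][N], int b[N][N], int d[N][N])
{
    int i, j, k, sum, nonzero = 0;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            sum = 0;
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            d[i][j] = sum;
            if (sum != 0)
                nonzero++;
        }
    return nonzero;
}
```

The loop always performs N * N * N multiplications, however many entries are zero. Multiplying a matrix whose first column is all ones by a matrix whose first row is all ones gives a result with no zero entries, so a product of sparse matrices is not necessarily sparse.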