CHAPTER 4: ARRAYS, RECORDS AND POINTERS

Upload: kelly-reed

Post on 27-Dec-2015


Introduction

Data structures are classified as either linear or nonlinear. There are two ways of representing linear structures in memory.

One way is to have the linear relationship between the elements represented by means of sequential memory locations. These structures are called arrays.

The other way is to have the linear relationship between the elements represented by means of pointers or links. These structures are called linked lists.

Here we discuss arrays and the different operations performed on arrays.


Linear Arrays

A linear array is a list of a finite number n of homogeneous data elements. An array has a set of indices and a set of values: for each index, there is a value associated with that index. The index is used to reference an element in memory, and the values are the data elements stored in memory. One possible representation implements the array using consecutive memory locations.

The number n of elements is called the length or size of the array. Unless stated otherwise, we assume the indices are 1, 2, ..., n. In general, the length, or number of data elements, of the array can be obtained from the index set by the formula

Length = UB - LB + 1

where UB is the upper bound, the largest index, and LB is the lower bound, the smallest index. The elements of an array A may be denoted by the subscript notation A1, A2, ..., An. In Pascal the elements are written A[1], A[2], ..., A[n]; we usually use this bracket notation.

The figure shows an array of numbers indexed from 0 to 19, which holds its data in sequential memory locations.

Index:   1     2     3     4
Value:   247   300   500   56

Figure: an array of four data elements stored in memory.
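As a quick check of the length formula, the following minimal Python sketch computes Length = UB - LB + 1 for the two index sets mentioned above (0 to 19 and 1 to 4). The function name `array_length` is an illustrative assumption, not something defined in the text.

```python
def array_length(lb: int, ub: int) -> int:
    """Number of elements in an array indexed lb, lb+1, ..., ub."""
    if ub < lb:
        raise ValueError("upper bound must not be smaller than lower bound")
    return ub - lb + 1


print(array_length(0, 19))  # index set 0..19
print(array_length(1, 4))   # index set 1..4
```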

Some programming languages (e.g. Fortran and Pascal) allocate memory space for arrays statically during program compilation; hence the size of the array is fixed during program execution. On the other hand, some programming languages allow one to read an integer n and then declare an array with n elements; such programming languages are said to allocate memory dynamically.

Representation of Linear Arrays in Memory

Let LA be a linear array in the memory of the computer. The memory of the computer is simply a sequence of addressed locations, and we use the notation

LOC(LA[K]) = address of the element LA[K] of the array LA

Because the elements of LA are stored in successive memory cells, the computer only needs to keep track of the address of the first element of LA, denoted by Base(LA) and called the base address of LA. Using this address, the computer calculates the address of any element of LA by the formula

LOC(LA[K]) = Base(LA) + w(K - lower bound)

where w is the number of words per memory cell for the array LA.
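The address formula can be tried out with a short Python sketch. The function name `loc` and the sample values (base address 200, w = 4, lower bound 1932) are assumptions chosen only for illustration.

```python
def loc(base: int, w: int, k: int, lower_bound: int = 1) -> int:
    """Address of LA[K]: Base(LA) + w * (K - lower bound)."""
    return base + w * (k - lower_bound)


# Hypothetical array indexed 1932, 1933, ... with 4 words per element.
print(loc(base=200, w=4, k=1932, lower_bound=1932))  # first element sits at the base
print(loc(base=200, w=4, k=1965, lower_bound=1932))  # 33 elements further on
```

Note that the time to compute the address is the same for every K, which is why any element of a linear array can be reached without scanning the others.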


Traversing Linear Arrays

Let B be a collection of data elements stored in memory, and suppose we want to print the contents of each element of B or to count the number of elements of B. This can be done by traversing B, that is, by accessing and processing each element of B exactly once. The following algorithm traverses a linear array LA.

(Traversing a Linear Array) Here LA is a linear array with lower bound LB and upper bound UB. This algorithm traverses LA, applying a given process to each element of LA.

1. Set K := LB.
2. Repeat steps 3 and 4 while K ≤ UB.
3. Apply the process to LA[K].
4. Set K := K + 1.
[End of step 2 loop.]
5. Exit.
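The traversal steps can be sketched in Python as follows. Python lists start at index 0, so this sketch assumes that LA[K] is stored at list offset K - LB; the function name `traverse` and the `process` callback are illustrative choices.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def traverse(la: List[T], lb: int, ub: int, process: Callable[[T], None]) -> None:
    """Apply process to LA[LB], ..., LA[UB], each exactly once."""
    k = lb                       # Step 1: set K := LB
    while k <= ub:               # Step 2: repeat while K <= UB
        process(la[k - lb])      # Step 3: LA[K] lives at Python offset K - LB
        k += 1                   # Step 4: set K := K + 1


seen = []
traverse([247, 300, 500, 56], 1, 4, seen.append)
print(seen)
```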


Inserting and Deleting

Inserting an element at the end of a linear array can be done easily, provided the memory space allocated for the array is large enough to accommodate the additional element. If we need to insert an element in the middle of the array, then on average half of the elements must be moved downward to new locations to accommodate the new element and keep the order of the other elements. Similarly, deleting the element at the end of an array is not difficult, but deleting an element in the middle of the array requires that each subsequent element be moved one location upward in order to fill up the array.

Algorithm for inserting into a linear array: INSERT(LA, N, K, ITEM)

1. [Initialize counter.] Set J := N.
2. Repeat steps 3 and 4 while J ≥ K.
3. [Move Jth element downward.] Set LA[J+1] := LA[J].
4. [Decrease counter.] Set J := J - 1.
[End of step 2 loop.]
5. [Insert element.] Set LA[K] := ITEM.
6. [Reset N.] Set N := N + 1.
7. Exit.


The INSERT algorithm inserts a data element ITEM into the Kth position in a linear array LA with N elements. The first four steps create space in LA by moving each element from the Kth position onward one location downward. These elements are moved in reverse order, first LA[N], then LA[N-1], ..., and finally LA[K], so that no element is overwritten before it has been moved. We first set J := N and then, using J as a counter, decrease J each time the loop is executed until J reaches K. Step 5 then inserts ITEM into the space just created. Before exiting, the number N of elements in LA is increased by 1 to account for the new element.
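A minimal Python sketch of INSERT is given here, under two assumptions: slot 0 of the list is left unused so that positions match the 1-based indices of the text, and the list already contains a free slot at position N + 1. The companion `delete` function is not spelled out in the text; it is an assumed counterpart that follows the description of deletion given earlier (each subsequent element moves one location upward).

```python
def insert(la: list, n: int, k: int, item) -> int:
    """Insert item at position k of la[1..n]; return the new element count."""
    if not 1 <= k <= n + 1:
        raise IndexError("k must lie between 1 and n + 1")
    if n + 1 >= len(la):
        raise OverflowError("no free slot left in the array")
    j = n                        # Step 1
    while j >= k:                # Step 2
        la[j + 1] = la[j]        # Step 3: move the Jth element downward
        j -= 1                   # Step 4
    la[k] = item                 # Step 5
    return n + 1                 # Step 6


def delete(la: list, n: int, k: int):
    """Remove the item at position k of la[1..n]; return (item, new count)."""
    if not 1 <= k <= n:
        raise IndexError("k must lie between 1 and n")
    item = la[k]
    for j in range(k, n):        # move each later element one location upward
        la[j] = la[j + 1]
    la[n] = None                 # the last slot is free again
    return item, n - 1


la = [None, 10, 20, 40, 50, None]   # la[0] unused; one free slot at the end
n = 4
n = insert(la, n, 3, 30)
print(la[1:n + 1])
removed, n = delete(la, n, 2)
print(removed, la[1:n + 1])
```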


Sorting; Bubble Sort

Let A be a list of n numbers. Sorting A refers to the operation of rearranging the elements of A so that they are in increasing order, i.e. so that

A[1] < A[2] < A[3] < ... < A[N]

For example, suppose A contains

8, 4, 19, 2, 7, 13, 5, 16

After sorting, A is the list

2, 4, 5, 7, 8, 13, 16, 19

Sorting may seem to be a trivial task, but there are many algorithms for sorting data in an array; here we discuss the bubble sort. The definition above refers to arranging numerical data in increasing order; sorting may also mean arranging numerical data in decreasing order or arranging nonnumeric data in alphabetical order.

Bubble Sort

Suppose the list of numbers A[1], A[2], ..., A[N] is in memory. The bubble sort algorithm works as follows.


Working of Bubble Sort

Step 1. Compare A[1] and A[2] and arrange them in the desired order, so that A[1] < A[2]. Then compare A[2] and A[3] and arrange them so that A[2] < A[3]. Then compare A[3] and A[4] and arrange them so that A[3] < A[4]. Continue until we compare A[N-1] with A[N] and arrange them so that A[N-1] < A[N].

Step 1 involves N-1 comparisons. During this step the largest element is "bubbled up" to the Nth position, so after Step 1 the largest element is at A[N].

Step 2. Repeat Step 1 with one less comparison; that is, stop after comparing and possibly rearranging A[N-2] and A[N-1]. Step 2 involves N-2 comparisons, and when it is completed the second largest element occupies A[N-1].

We continue in this way up to Step N-1.

Step N-1. Compare A[1] with A[2] and arrange them so that A[1] < A[2].

After N-1 steps the list is sorted in increasing order. The process of sequentially traversing through all or part of a list is frequently called a "pass", so each of the above steps is called a pass.


Algorithm & its complexity
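A minimal Python sketch of the passes described above, assuming a 0-based Python list sorted in place. Pass K makes N-K comparisons, so the total number of comparisons is (N-1) + (N-2) + ... + 1 = N(N-1)/2, which is on the order of N². The function name `bubble_sort` and the returned comparison count are illustrative choices.

```python
def bubble_sort(a: list) -> int:
    """Sort a in place in increasing order; return the number of comparisons."""
    n = len(a)
    comparisons = 0
    for k in range(1, n):             # pass K = 1, ..., N-1
        for i in range(n - k):        # pass K makes N-K comparisons
            comparisons += 1
            if a[i] > a[i + 1]:       # neighbours out of order: swap them
                a[i], a[i + 1] = a[i + 1], a[i]
    return comparisons


data = [8, 4, 19, 2, 7, 13, 5, 16]
count = bubble_sort(data)
print(data, count)
```

For the eight-element list used earlier, the sketch makes 8 · 7 / 2 = 28 comparisons.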


Searching; Linear Search

Searching refers to the operation of finding the location LOC of ITEM in DATA, or printing some message that ITEM does not appear there. The search is said to be successful if ITEM is found. There are many searching algorithms; we first discuss a simple algorithm called linear search and then study binary search, which is a faster searching algorithm than linear search. The complexity of a searching algorithm is measured in terms of the number f(n) of comparisons required to find ITEM in DATA, where DATA contains n elements.

Linear Search

Suppose DATA is a linear array with n elements. The most intuitive way to find a given ITEM in DATA is to compare ITEM with each element of DATA one by one. That is, first we test whether DATA[1] = ITEM, then whether DATA[2] = ITEM, and so on, traversing DATA until we find ITEM. To simplify the matter, we first assign ITEM to DATA[N+1], the position following the last element of DATA. Then the outcome LOC = N+1, where LOC denotes the location where ITEM first occurs in DATA, signifies that the search is unsuccessful. The purpose of this initial assignment is to avoid repeatedly testing whether or not we have reached the end of the array DATA. This way, the search must eventually succeed.


Algorithm of Linear Search

(Linear Search) LINEAR(DATA, N, ITEM, LOC). Here DATA is a linear array with N elements and ITEM is a given item of information. This algorithm finds the location LOC of ITEM in DATA, or sets LOC := 0 if the search is unsuccessful.

1. [Insert ITEM at the end of DATA.] Set DATA[N+1] := ITEM.
2. [Initialize counter.] Set LOC := 1.
3. [Search for ITEM.]
   Repeat while DATA[LOC] ≠ ITEM:
      Set LOC := LOC + 1.
   [End of loop.]
4. [Successful?] If LOC = N+1, then: Set LOC := 0.
5. Exit.

The complexity of this algorithm is measured by the number f(n) of comparisons required to find ITEM in DATA, where DATA contains n elements. Two important cases are the average case and the worst case. The worst case occurs when ITEM does not appear in DATA, so that the search runs through the entire array; then f(n) = n + 1.

For the average case, assuming ITEM appears in DATA and is equally likely to occur at any position, f(n) = (1 + 2 + ... + n) · (1/n) = (n(n+1)/2) · (1/n) = (n+1)/2.
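The sentinel technique can be sketched in Python as follows. The sketch assumes a 0-based Python list and returns a 1-based location to match the text; removing the sentinel again before returning is an addition of this sketch, not a step of LINEAR.

```python
def linear_search(data: list, item) -> int:
    """Return the 1-based location of item in data, or 0 if it is absent."""
    n = len(data)
    data.append(item)                 # Step 1: DATA[N+1] := ITEM (sentinel)
    loc = 1                           # Step 2
    while data[loc - 1] != item:      # Step 3: no end-of-array test is needed
        loc += 1
    data.pop()                        # tidy up: drop the sentinel again
    return 0 if loc == n + 1 else loc  # Step 4


names = ["Mary", "Jane", "Diane", "Susan"]
print(linear_search(names, "Diane"))
print(linear_search(names, "Karen"))
```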


Binary Search & Complexity

Suppose DATA is an array which is sorted in increasing numerical order or, equivalently, alphabetically. Then there is an efficient searching algorithm, called binary search, which can be used to find the location LOC of a given ITEM of information in DATA. The binary search algorithm applied to our array DATA works as follows. During each stage of the algorithm, the search for ITEM is reduced to a segment of elements of DATA:

DATA[BEG], DATA[BEG+1], ..., DATA[END]

The variables BEG and END denote, respectively, the beginning and end locations of the segment under consideration. The algorithm compares ITEM with the middle element DATA[MID] of the segment, where MID is obtained by

MID = INT((BEG + END)/2)

If DATA[MID] = ITEM, then the search is successful and we set LOC := MID. Otherwise a new segment of DATA is obtained as follows:

(a) If ITEM < DATA[MID], then ITEM can appear only in the left half of the segment:

DATA[BEG], DATA[BEG+1], ..., DATA[MID-1]

So we reset END := MID-1 and begin searching again.


(b) If ITEM > DATA[MID], then ITEM can appear only in the right half of the segment:

DATA[MID+1], DATA[MID+2], ..., DATA[END]

So we reset BEG := MID+1 and begin searching again.

Initially, we begin with the entire array DATA; i.e. we begin with BEG = 1 and END = n or, more generally, with BEG = LB and END = UB.

If ITEM is not in DATA, then eventually we obtain END < BEG.

This condition signals that the search is unsuccessful, and in such a case we assign LOC := NULL. Here NULL is a value that lies outside the set of indices of DATA.

Complexity of the Binary Search Algorithm

The complexity is measured by the number f(n) of comparisons needed to locate ITEM in DATA, where DATA contains n elements. Observe that each comparison reduces the sample size by half. Hence we require at most f(n) comparisons to locate ITEM, where

f(n) = ⌊log2 n⌋ + 1

and ⌊x⌋ denotes the integer part of x. That is, the running time for the worst case is approximately equal to log2 n. The running time for the average case is approximately equal to the running time for the worst case.
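To get a feel for how slowly this bound grows, the following Python sketch evaluates f(n) = ⌊log2 n⌋ + 1 for a few sizes. It relies on the fact that, for a positive integer n, this quantity equals the number of binary digits of n, which Python exposes as `int.bit_length()`; the function name `max_comparisons` is an illustrative assumption.

```python
def max_comparisons(n: int) -> int:
    """floor(log2 n) + 1 for n >= 1, computed with exact integer arithmetic."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n.bit_length()


for n in (1, 1000, 1_000_000):
    print(n, max_comparisons(n))
```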

Limitations of Binary Search

Since the binary search algorithm is very efficient (e.g. it requires only about 20 comparisons with an initial list of 1,000,000 elements), why would one want to use any other search algorithm? The reason is that the algorithm requires two conditions: (1) the list must be sorted, and (2) one must have direct access to the middle element in any sublist. This means that one must essentially use a sorted array to hold the data. But keeping data in a sorted array is expensive when there are many insertions and deletions; in such cases one may use a different data structure, such as a linked list or a binary search tree.

(Binary Search) BINARY(DATA, LB, UB, ITEM, LOC)

Here DATA is a sorted array with lower bound LB and upper bound UB, and ITEM is a given item of information. The variables BEG, END and MID denote, respectively, the beginning, end and middle locations of a segment of elements of DATA. This algorithm finds the location LOC of ITEM in DATA or sets LOC := NULL.

1. [Initialize segment variables.]
   Set BEG := LB, END := UB and MID := INT((BEG+END)/2).
2. Repeat steps 3 and 4 while BEG ≤ END and DATA[MID] ≠ ITEM.
3. If ITEM < DATA[MID], then:
      Set END := MID-1.
   Else:
      Set BEG := MID+1.
   [End of If structure.]
4. Set MID := INT((BEG+END)/2).
[End of step 2 loop.]
5. If DATA[MID] = ITEM, then:
      Set LOC := MID.
   Else:
      Set LOC := NULL.
   [End of If structure.]
6. Exit.
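A minimal Python sketch of BINARY follows. It assumes DATA[K] is stored at list offset K - LB and uses Python's `None` in place of NULL. One deliberate difference from the steps above: the sketch decides success inside the loop rather than by inspecting DATA[MID] after the loop, so it never reads a position outside LB..UB. The sample values are hypothetical.

```python
from typing import Optional, Sequence


def binary_search(data: Sequence, lb: int, ub: int, item) -> Optional[int]:
    """Return the location of item in sorted data indexed lb..ub, or None."""
    beg, end = lb, ub
    while beg <= end:
        mid = (beg + end) // 2        # MID := INT((BEG + END) / 2)
        value = data[mid - lb]        # DATA[MID] at Python offset MID - LB
        if value == item:
            return mid
        if item < value:
            end = mid - 1             # continue in the left half
        else:
            beg = mid + 1             # continue in the right half
    return None                       # END < BEG: unsuccessful search


DATA = [11, 22, 30, 33, 40, 44, 55, 60, 66, 77, 80, 88, 99]
print(binary_search(DATA, 1, 13, 40))
print(binary_search(DATA, 1, 13, 85))
```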


Pointers; Pointer Arrays

Let DATA be an array. A variable P is called a pointer if P "points" to an element in DATA, i.e. if P contains the address of an element in DATA. An array PTR is called a pointer array if each element of PTR is a pointer. Pointers and pointer arrays are used to facilitate the processing of the information in DATA. This useful tool is best discussed by means of an example: a membership list divided into four groups.

One way to store the list is in a two-dimensional array in which each row (or column) holds one group. Arrays whose rows or columns begin with different numbers of data elements and end with unused storage locations are said to be jagged.

Another way to store the list in memory is to place it in a linear array, one group after another. This method is space efficient, and the entire list can easily be processed element by element. But there is no way to access a particular group; e.g. there is no way to find and print only the names in the third group.

A modified version lists the names in a linear array, group by group, with a marker, such as three dollar signs ($$$), used to indicate the end of each group. This requires a little extra memory, but one can now access a particular group. The drawback is that the list must still be traversed from the beginning in order to recognize a given group.

Membership list divided into four groups:

Group 1   Group 2   Group 3   Group 4
Evans     Conrad    Davis     Baker
Harris    Felt      Segal     Cooper
Lewis     Glass               Ford
Shaw      Hill                Gray
          King                Jones
          Penn                Reed
          Silver
          Troy
          Wagner

The names stored in a linear array MEMBERS, one group after another:

MEMBERS
 1   Evans     (Group 1 begins)
 2   Harris
 3   Lewis
 4   Shaw
 5   Conrad    (Group 2 begins)
 …   …
13   Wagner
14   Davis     (Group 3 begins)
15   Segal
16   Baker     (Group 4 begins)
 …   …
21   Reed


Jagged array, one row per group (* = data element, 0 = unused storage location):

( * * * * 0 0 0 0 0 )
( * * * * * * * * * )
( * * 0 0 0 0 0 0 0 )
( * * * * * * * 0 0 )


The names stored in MEMBERS with $$$ marking the end of each group:

MEMBERS
 1   Evans
 2   Harris
 3   Lewis
 4   Shaw
 5   $$$
 6   Conrad
 …   …
14   Wagner
15   $$$
16   Davis
17   Segal
18   $$$
19   Baker
 …   …
24   Reed
25   $$$
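The drawback of the marker scheme, that a group can only be found by scanning from the beginning, shows up clearly in a short Python sketch. The full list of names is assembled here from the four-group table and the MEMBERS listings; the function name `group_by_scanning` is an illustrative assumption.

```python
MARKER = "$$$"

MEMBERS = [
    "Evans", "Harris", "Lewis", "Shaw", MARKER,
    "Conrad", "Felt", "Glass", "Hill", "King",
    "Penn", "Silver", "Troy", "Wagner", MARKER,
    "Davis", "Segal", MARKER,
    "Baker", "Cooper", "Ford", "Gray", "Jones", "Reed", MARKER,
]


def group_by_scanning(members: list, group_no: int, marker: str = MARKER) -> list:
    """Collect the names of one group by scanning from the start of the list."""
    current = 1                       # number of the group being scanned
    names = []
    for name in members:
        if name == marker:            # end of the current group
            if current == group_no:
                return names
            current += 1
        elif current == group_no:
            names.append(name)
    raise ValueError("no such group")


print(group_by_scanning(MEMBERS, 3))
```

Reaching the third group means stepping over every name in the first two groups, which is the cost the pointer array is meant to remove.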


Pointer Array

The two space-efficient data structures above can easily be modified so that individual groups can be indexed. This is accomplished by using a pointer array, which contains the locations of the different groups or, more specifically, the locations of the first elements in the different groups.


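MEMBERS
 1   Evans
 2   Harris
 3   Lewis
 4   Shaw
 5   Conrad
 …   …
13   Wagner
14   Davis
15   Segal
16   Baker
 …   …
21   Reed
22   $$$

GROUP
1    1
2    5
3   14
4   16
5   22

Here GROUP[L] contains the location in MEMBERS of the first element of group L, and GROUP[5] points to the marker $$$ that ends the list. Group L therefore occupies MEMBERS[GROUP[L]] through MEMBERS[GROUP[L+1] - 1].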
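With the pointer array, a group can be fetched directly, without scanning the groups before it. The Python sketch below assumes slot 0 of both lists is unused so that positions match the 1-based figure, and it fills in the names from the four-group table; `names_in_group` is an illustrative name.

```python
# Slot 0 is unused so that MEMBERS[1] is the first name, as in the figure.
MEMBERS = [
    None,
    "Evans", "Harris", "Lewis", "Shaw",
    "Conrad", "Felt", "Glass", "Hill", "King",
    "Penn", "Silver", "Troy", "Wagner",
    "Davis", "Segal",
    "Baker", "Cooper", "Ford", "Gray", "Jones", "Reed",
    "$$$",
]
GROUP = [None, 1, 5, 14, 16, 22]      # GROUP[L] = location of the first name of group L


def names_in_group(members: list, group: list, l: int) -> list:
    """Names in group L: MEMBERS[GROUP[L]] .. MEMBERS[GROUP[L+1] - 1]."""
    first = group[l]
    last = group[l + 1] - 1
    return members[first:last + 1]


print(names_in_group(MEMBERS, GROUP, 3))
print(len(names_in_group(MEMBERS, GROUP, 2)))
```

The extra entry GROUP[5] is what lets the last group be handled by the same expression as the others.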


Records; Record Structures

Collections of data are frequently organized into a hierarchy of fields, records and files. A record is a collection of related data items, each of which is called a field or attribute, and a file is a collection of similar records. Each data item may itself be a group item composed of subitems; those items which are indecomposable are called elementary items or scalars. The names given to the various data items are called identifiers. Although a record is a collection of data items, it differs from a linear array in the following ways:

(a) A record may be a collection of nonhomogeneous data.

(b) The data items in a record are indexed by attribute names, so there may not be a natural ordering of its elements.
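As one possible illustration of these two points, the Python sketch below models a record with a dataclass. The record layout (a student with a name, an ID, a birthday and a grade average) and all field names are hypothetical; they are not taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Date:                 # a group item made of three elementary items
    month: int
    day: int
    year: int


@dataclass
class Student:              # a record: fields of different types
    name: str               # elementary item
    student_id: int         # elementary item
    birthday: Date          # group item
    average: float          # elementary item


s = Student("Evans", 1001, Date(5, 17, 1998), 3.4)
print(s.name, s.birthday.year)    # fields are reached by name, not by position
```

A file, in the sense used above, would then correspond to a collection of such `Student` records.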


Multidimensional Arrays

Linear arrays are also called one-dimensional arrays, since each element in the array is referenced by a single subscript. Most programming languages allow two-dimensional and three-dimensional arrays, i.e. arrays where elements are referenced, respectively, by two and three subscripts.

Two-Dimensional Arrays

A two-dimensional m x n array A is a collection of m · n data elements such that each element is specified by a pair of integers (such as J, K), called subscripts, with 1 ≤ J ≤ m and 1 ≤ K ≤ n. The element of A with first subscript J and second subscript K is denoted by

A[J, K]

Two-dimensional arrays are called matrices in mathematics and tables in business applications. There is a standard way of drawing a two-dimensional m x n array, in which the elements of A form a rectangular array with m rows and n columns.

         1        2        3        4
1     A[1,1]   A[1,2]   A[1,3]   A[1,4]
2     A[2,1]   A[2,2]   A[2,3]   A[2,4]
3     A[3,1]   A[3,2]   A[3,3]   A[3,4]

Two-dimensional 3 x 4 array A


Representation of Two-Dimensional Arrays in Memory

Let A be a two-dimensional m x n array. Although A is pictured as a rectangular array of elements with m rows and n columns, the array will be represented in memory by a block of m · n sequential memory locations. The programming language will store the array A either (1) column by column, in what is called column-major order, or (2) row by row, in row-major order. The figure shows these two ways of representing a 3 x 4 array A in memory.

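Subscripts of A in storage order:

(a) Column-major order
Column 1: (1,1), (2,1), (3,1)
Column 2: (1,2), (2,2), (3,2)
Column 3: (1,3), (2,3), (3,3)
Column 4: (1,4), (2,4), (3,4)

(b) Row-major order
Row 1: (1,1), (1,2), (1,3), (1,4)
Row 2: (2,1), (2,2), (2,3), (2,4)
Row 3: (3,1), (3,2), (3,3), (3,4)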


For a linear array A, the computer does not keep track of the address LOC(A[K]) of every element A[K]; it keeps track only of Base(A), the address of the first element of A, and computes the address of A[K] by the formula

LOC(A[K]) = Base(A) + w(K - 1)

where w is the number of words per memory cell and the lower bound is 1. A similar situation holds for a two-dimensional m x n array A. That is, the computer keeps track of Base(A), the address of the first element A[1,1] of A, and computes the address LOC(A[J,K]) of A[J,K] by the formula

Column-major order: LOC(A[J,K]) = Base(A) + w[M(K - 1) + (J - 1)]

or the formula

Row-major order: LOC(A[J,K]) = Base(A) + w[N(J - 1) + (K - 1)]

where M is the number of rows and N is the number of columns of A.
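Both address formulas can be checked with a short Python sketch. The function names and the sample values (a 25 x 4 array with base address 200 and w = 4, and a 3 x 4 array with base address 0 and w = 1) are assumptions chosen only for illustration.

```python
def loc_column_major(base: int, w: int, m: int, j: int, k: int) -> int:
    """Address of A[J,K] when an M-row array is stored column by column."""
    return base + w * (m * (k - 1) + (j - 1))


def loc_row_major(base: int, w: int, n: int, j: int, k: int) -> int:
    """Address of A[J,K] when an N-column array is stored row by row."""
    return base + w * (n * (j - 1) + (k - 1))


# Hypothetical 25 x 4 array, base address 200, 4 words per element.
print(loc_row_major(base=200, w=4, n=4, j=12, k=3))

# The 3 x 4 array A with base 0 and w = 1: offsets count elements directly.
print(loc_column_major(base=0, w=1, m=3, j=2, k=3))
print(loc_row_major(base=0, w=1, n=4, j=2, k=3))
```

With base 0 and w = 1 the results are simply the positions of A[2,3] in the two storage orders: seven elements precede it in column-major order and six in row-major order.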
