Sorting and Searching Algorithms
Cutajar & Cutajar
Algorithms 2
Sorting
Sorting means the arrangement of records in a particular order, according to a specified sorting key.
For simplicity we will consider sorting of an array of records, all present in main memory.
With simple methods, sorting generally involves a complexity of O(N²), where N is the number of records. This means that, for a list of N records, sorting normally involves scanning up to N records about N times.
Unsorted records:
ID Card   Surname
712364    Cutajar
345222    Zammit
778879    Sammut
453211    Abela

After sorting (sort key: ID Card):
ID Card   Surname
345222    Zammit
453211    Abela
712364    Cutajar
778879    Sammut
Considerations For Choosing A Sorting Method
Number of items to be sorted.
Initial arrangement of data
Complexity of algorithm (hence its implementation)
The relative speed of a given method
Size of data items to be sorted.
Selection Sort
This is a simple technique similar to the one performed by pencil and paper.
It consists of repeatedly looking through a data array to find the lowest key (for sorting in ascending order). This element is then written to another array. The data element is then cancelled from the original array (sometimes a rogue value is simply written in its place).
This procedure is repeated until all the records are sorted out of the original data array.
Data array and new array after each pass (n = 3 items in each array; "-" marks a cancelled entry):

   pass 1         pass 2         pass 3
 data   new     data   new     data   new
 234    156     234    156      -     156
 320            320    234     320    234
 156             -              -     320
Selection Sort Algorithm
In this algorithm we use a single array instead of two. We scan the data array for the smallest item and swap it with the item in the first place. We then repeat from the second item, placing the next smallest item in the second place, and so on.
Consider the following data array:
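(each array is written left to right, first element first)

Initially:      123 213 456 145 431 100
After pass 1:   100 213 456 145 431 123   (smallest element 100 swapped with 123)
After pass 2:   100 123 456 145 431 213   (123 swapped with 213)
After pass 3:   100 123 145 456 431 213   (145 swapped with 456)
After pass 4:   100 123 145 213 431 456   (213 swapped with 456)
After pass 5:   100 123 145 213 431 456   (431 is already in place)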
Pseudo-Code
Program Selection_Sort
    use variables: numbers array[n] of type integer   {indexed 1 to n}
                   temp, count, pass, lowest of type integer
    for pass := 1 to n-1 do
        lowest := pass
        for count := pass to n-1 do   {find smallest}
            if numbers[count+1] < numbers[lowest] then
                lowest := count+1
            end if
        end for
        temp := numbers[pass]   {swap current with smallest}
        numbers[pass] := numbers[lowest]
        numbers[lowest] := temp
    end for
end program
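As a rough illustration, the selection sort pseudo-code could be written in Python along the following lines. The function name `selection_sort` and the use of 0-based list indices are assumptions of this sketch, not part of the original slides.

```python
def selection_sort(numbers):
    """Sort the list in place in ascending order and return it."""
    n = len(numbers)
    for pass_index in range(n - 1):
        lowest = pass_index
        # find the smallest item in the unsorted part of the list
        for count in range(pass_index + 1, n):
            if numbers[count] < numbers[lowest]:
                lowest = count
        # swap the current item with the smallest one found
        numbers[pass_index], numbers[lowest] = numbers[lowest], numbers[pass_index]
    return numbers


print(selection_sort([123, 213, 456, 145, 431, 100]))
# [100, 123, 145, 213, 431, 456]
```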
Insertion Sort
In this algorithm the first two elements of the array are compared and arranged in order. The third element is compared with the first two and inserted in its correct position. This process is repeated until every element in the list has been inserted in its correct position. This method is similar to when we sort cards by hand.
Initial array:           21 16 35 47 19 12
Sorted part, pass 1:     16 21
Sorted part, pass 2:     16 21 35
Sorted part, pass 3:     16 21 35 47
Sorted part, pass 4:     16 19 21 35 47
Sorted part, pass 5:     12 16 19 21 35 47
Insertion Sort Flowchart
Flowchart Notation:
CP : value of the pointer pointing to the number currently being inserted
PP : value of the pointer pointing to the numbers (in the ordered list) that are being compared to CP(item)
CP(item) : item in the location pointed at by CP
PP(item) : item in the location pointed at by PP

[Flowchart of insertion sort. Box labels: Start; CP := 1; CP := CP+1, PP := 1; PP(item) > CP(item)? (Y/N); Temp := CP(item), exchange consecutive items down to PP+1, PP(item) := temp; PP := PP+1; PP <= CP? (Y/N); CP = max? (Y/N); End.]
Insertion Sort Algorithm
(each array is written left to right, first element first)

Initially:      123 213 456 145 431 100
After pass 1:   123 213 456 145 431 100   (213 stays in place)
After pass 2:   123 213 456 145 431 100   (456 stays in place)
After pass 3:   123 145 213 456 431 100   (145 inserted after 123)
After pass 4:   123 145 213 431 456 100   (431 inserted after 213)
After pass 5:   100 123 145 213 431 456   (100 inserted at the front)

In this algorithm we use the same array and shift the data downwards to make space for the insertion.
Pseudo Code
Program Insertion_Sort
    use variables: numbers array[n] of type integer   {indexed 0 to n-1}
                   current, pass, position of type integer
    for pass := 1 to n-1 do
        current := numbers[pass]
        position := pass
        while (position > 0 AND numbers[position-1] > current) do
            numbers[position] := numbers[position-1]   {shift down}
            position := position-1
        end while
        numbers[position] := current   {insert current element in space}
    end for
end program
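A minimal Python sketch of the insertion sort pseudo-code is given below. The function name `insertion_sort` is an assumption of this sketch; the loop structure follows the pseudo-code, which already uses 0-based indices.

```python
def insertion_sort(numbers):
    """Sort the list in place in ascending order and return it."""
    for pass_index in range(1, len(numbers)):
        current = numbers[pass_index]
        position = pass_index
        # shift larger items one place down to open a gap
        while position > 0 and numbers[position - 1] > current:
            numbers[position] = numbers[position - 1]
            position -= 1
        numbers[position] = current  # insert the current item in the gap
    return numbers


print(insertion_sort([123, 213, 456, 145, 431, 100]))
# [100, 123, 145, 213, 431, 456]
```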
Bubble Sort
This sort is so called because the smallest element rises to the top of the array, then the next smallest "bubbles" up to the next position, and so on.
On the first pass the last two elements (n) and (n-1) of the array are compared and exchanged if necessary. This process is repeated with the (n-1) and (n-2) elements and then with the (n-2) and (n-3) and so on until the smallest element arrives at the top of the array.
The next pass repeats the same procedure with the whole array except the already sorted elements.
The passes end when the array is completely sorted.
Bubble Sort Algorithm
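(each array is written left to right, top of the array first)

Initially:   3 2 0 4 1
pass 1:      3 2 0 1 4   (4 and 1 exchanged)
             3 2 0 1 4   (0 and 1 left as they are)
             3 0 2 1 4   (2 and 0 exchanged)
             0 3 2 1 4   (3 and 0 exchanged)
pass 2:      0 3 2 1 4   (1 and 4 left as they are)
             0 3 1 2 4   (2 and 1 exchanged)
             0 1 3 2 4   (3 and 1 exchanged)
pass 3:      0 1 3 2 4   (2 and 4 left as they are)
             0 1 2 3 4   (3 and 2 exchanged)
pass 4:      0 1 2 3 4   (no exchange)
final:       0 1 2 3 4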
Pseudo-Code
Program bubble_sort
    use variables: numbers array[n] of type integer   {indexed 1 to n}
                   pass, current of type integer
    for pass := 1 to n-1 do   {n-1 passes are needed to place n elements}
        for current := n downto pass+1 do
            if numbers[current] < numbers[current-1]
                then swap(numbers[current], numbers[current-1])
        end for
    end for
end program

Procedure swap (parameters: variable a, b of type integer)
    use variables: temp of type integer
    temp := a
    a := b
    b := temp
end procedure
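The bubble sort described above might look as follows in Python. This is only a sketch: the name `bubble_sort` and the 0-based indices are assumptions, and it runs n-1 passes, which is what is needed to guarantee that the last two items are also compared.

```python
def bubble_sort(numbers):
    """Sort the list in place in ascending order and return it."""
    n = len(numbers)
    for pass_index in range(n - 1):
        # walk from the bottom of the list up to the sorted part
        for current in range(n - 1, pass_index, -1):
            if numbers[current] < numbers[current - 1]:
                # exchange the two neighbouring items
                numbers[current], numbers[current - 1] = (
                    numbers[current - 1],
                    numbers[current],
                )
    return numbers


print(bubble_sort([3, 2, 0, 4, 1]))
# [0, 1, 2, 3, 4]
```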
Computational Complexity
Computational Complexity or simply Complexity is the number of steps or arithmetic operations required to solve the posed problem.
The interesting aspect is usually how complexity scales with the size of the input (the "scalability"), where the size of the input is described by some number N.
Thus an algorithm may have computational complexity O(N²) (of the order of the square of the size of the input), in which case if the input doubles in size, the computation will take four times as many steps. The ideal is a constant time algorithm, O(1), or failing that, O(log₂N) or O(N).
The Big-O Analysis
In complexity, O denotes "in the order of".
It gives us an idea of how the efficiency of an algorithm varies in proportion to N; it is not an equality.
Thus if the number of execution steps of an algorithm is given by:
    a.N + b, we say it is of the order of N, or O(N)
    a.(log₂N) + b, we say it is of the order of log₂N, or O(log₂N)
    a.N.(log₂N) + b, we say it is of the order of N.log₂N, or O(N.log₂N)
    a.N² + b, we say it is of the order of N², or O(N²)
    or simply a, we say it is constant, or O(1).
If the complexity has an order higher than polynomial, say an exponential relation with N, the problem is considered intractable because of the large amount of time required to perform the task.
Complexity Example
Consider the following codes:

Procedure Example1
    …                          {b steps}
    …
    For j := 1 to 2N do
        …                      {a1 steps}
    End for
    For i := 1 to N do
        …                      {a2 steps}
    End for
End Procedure

The execution steps would be b + a.N, where a = 2a1 + a2, and so the complexity is still O(N).

Procedure Example2
    …                          {b steps}
    …
    For i := 1 to 2N do
        For j := 1 to N do
            …                  {a steps}
            …
        End for
    End for
End Procedure

The execution steps would be b + 2.a.N², and so the complexity is still O(N²).
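To make the two loop structures of Example1 and Example2 concrete, the following Python sketch simply counts how many times each loop body runs. The function names are hypothetical, and each loop body is assumed to cost one step (that is, a1 = a2 = a = 1 and b = 0).

```python
def example1_steps(n):
    """Two consecutive loops: 2N + N body executions."""
    steps = 0
    for j in range(2 * n):
        steps += 1
    for i in range(n):
        steps += 1
    return steps


def example2_steps(n):
    """Two nested loops: 2N * N body executions."""
    steps = 0
    for i in range(2 * n):
        for j in range(n):
            steps += 1
    return steps


print(example1_steps(10), example1_steps(20))  # 30 60
print(example2_steps(10), example2_steps(20))  # 200 800
```

Doubling N doubles the step count of the first procedure (linear growth) but multiplies the step count of the second by four (quadratic growth).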
Complexity Classes
Most algorithms fall into one of the following types:
Type                  Complexity       Comment
Logarithmic           O(log₂N)         Very Good
Linear                O(N)             Good
Linear-Logarithmic    O(N.log₂N)       Fairly Good
Quadratic             O(N²)            Ok
Polynomial            O(N^k), k ≥ 1    Poor
Exponential           O(a^N), a > 1    Awful
Complexity of Simple Sorts
To analyse the complexity of the sorting algorithms seen so far, i.e. Insertion Sort, Selection Sort and Bubble Sort, let us analyse the number of comparisons made for a general list of N elements.
In all these algorithms the number of comparisons (in the worst case) is:
    N-1 for pass 1
    N-2 for pass 2
    N-3 for pass 3
    …
    2 for pass N-2
    1 for pass N-1
Sum = (N-1) + (N-2) + (N-3) + … + 2 + 1
Or Sum = 1 + 2 + 3 + … + (N-2) + (N-1)
If we add the two lines above term by term we get 2*Sum:
2*Sum = N + N + N + … + N + N
i.e. N for (N-1) times.
Thus 2*Sum = N*(N-1), and therefore Sum = N*(N-1)/2,
which could also have been obtained as the sum of an arithmetic progression.
Thus
Cplx = O(N²)
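The comparison count can be checked experimentally. The sketch below instruments a bubble sort (written with 0-based indices and N-1 passes, both assumptions of this sketch) and compares the number of comparisons with N*(N-1)/2; the function name `count_bubble_comparisons` is hypothetical.

```python
def count_bubble_comparisons(n):
    """Bubble-sort a reversed list of n items and count the comparisons."""
    numbers = list(range(n, 0, -1))  # n, n-1, ..., 1
    comparisons = 0
    for pass_index in range(n - 1):
        for current in range(n - 1, pass_index, -1):
            comparisons += 1
            if numbers[current] < numbers[current - 1]:
                numbers[current], numbers[current - 1] = (
                    numbers[current - 1],
                    numbers[current],
                )
    return comparisons


for n in (5, 10, 100):
    print(n, count_bubble_comparisons(n), n * (n - 1) // 2)
# 5 10 10
# 10 45 45
# 100 4950 4950
```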
Quick Sort
A more complex routine, used when the list of items to be sorted is large.
Although Quicksort is faster than other methods when sorting large amounts of data, it is often slower (depending on both the implementation and the starting order) with fewer than about a dozen items. Hence quicksort programs sometimes include a switch to another method whenever the number remaining to be sorted drops below some arbitrary figure.
Additionally, if the unsorted list is already somewhat ordered, the quicksort method becomes inefficient: when the first item is chosen as the pivot, the worst case for quicksort is an input list which is already in order!
Algorithm
Select an item, usually the first item, of the unsorted list. This is called the Pivot
Partition the remaining items into TWO sublists.
A LEFT SUBLIST, with data items LESS than the selected item
A RIGHT SUBLIST, with data items GREATER than the selected item.
Place the pivot between these two sublists.
If left sublist contains more than one item
Then Quicksort the left sublist
If right sublist contains more than one item
Then Quicksort the right sublist.
How to Partition
Pivot Value := Table[First]
Up := First
Down := Last
Repeat
    Increment Up until Table[Up] > Pivot Value or Up = Last
    Decrement Down until Table[Down] <= Pivot Value
    If Up < Down exchange their values
Until Up >= Down
Exchange Table[First] and Table[Down]
Define Pivot Index as Down
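The partitioning steps above can be sketched in Python as follows. The function name `partition` and the 0-based indices are assumptions of this sketch; it also stops Up at Last so that the pointer cannot run past the end of the table when the pivot is the largest value.

```python
def partition(table, first, last):
    """Partition table[first..last] around table[first]; return the pivot index."""
    pivot_value = table[first]
    up = first
    down = last
    while True:
        # move Up right until it finds a value greater than the pivot
        while up < last and table[up] <= pivot_value:
            up += 1
        # move Down left until it finds a value not greater than the pivot
        while table[down] > pivot_value:
            down -= 1
        if up < down:
            table[up], table[down] = table[down], table[up]
        else:
            break
    # put the pivot between the two sublists
    table[first], table[down] = table[down], table[first]
    return down


table = [44, 76, 23, 43, 55, 12, 64, 77, 33]
print(partition(table, 0, len(table) - 1))  # 4
print(table)  # [12, 33, 23, 43, 44, 55, 64, 77, 76]
```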
Partitioning Example
Pivot Value := Table[First]; Up := First; Down := Last
    Table: 44 76 23 43 55 12 64 77 33      Pivot = 44
    First and Up at 44; Last and Down at 33
Increment Up until Table[Up] > Pivot Value; decrement Down until Table[Down] <= Pivot Value
    Table: 44 76 23 43 55 12 64 77 33      Pivot = 44
    Up at 76; Down at 33
If Up < Down exchange their values
    Table: 44 33 23 43 55 12 64 77 76      Pivot = 44
    Up at 33; Down at 76
Partitioning (cont…)
Up is < Down so continue
    Table: 44 33 23 43 55 12 64 77 76      Pivot = 44
    Up at 33; Down at 76
Increment Up until Table[Up] > Pivot Value; decrement Down until Table[Down] <= Pivot Value
    Table: 44 33 23 43 55 12 64 77 76      Pivot = 44
    Up at 55; Down at 12
If Up < Down exchange their values
    Table: 44 33 23 43 12 55 64 77 76      Pivot = 44
    Up at 12; Down at 55
Partitioning (cont…)
Up is < Down so continue
    Table: 44 33 23 43 12 55 64 77 76      Pivot = 44
    Up at 12; Down at 55
Increment Up until Table[Up] > Pivot Value; decrement Down until Table[Down] <= Pivot Value
    Table: 44 33 23 43 12 55 64 77 76      Pivot = 44
    Up at 55; Down at 12
Now Down < Up so exchange pivot value with Table[Down]
    Table: 12 33 23 43 44 55 64 77 76      Pivot = 44
    Down at 44 (the Pivot Index)
Partitioning (cont…)
Note that all values below the Pivot Index are smaller than the Pivot Value and all values above the Pivot Index are larger than the Pivot Value.
This gives us two sub-arrays to re-partition.
    Table: 12 33 23 43 | 44 | 55 64 77 76      Pivot = 44
    Partition 1 (First 1 to Last 1): 12 33 23 43
    Pivot Index: position of 44
    Partition 2 (First 2 to Last 2): 55 64 77 76
Quick Sort Algorithm
Procedure QuickSort(use variables First, Last : integer)
    Use Variables PivIndex : integer;
    If (First < Last) then
        PivIndex := Partition(First, Last);
        QuickSort(First, PivIndex-1);
        QuickSort(PivIndex+1, Last);
    Endif
End Procedure;
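A compact Python sketch of the recursive procedure is shown below. The names `quick_sort` and `partition`, the explicit `table` argument, and the 0-based indices are assumptions of this sketch; the partition helper uses the first item as the pivot, as described earlier.

```python
def partition(table, first, last):
    """Partition table[first..last] around table[first]; return the pivot index."""
    pivot_value = table[first]
    up, down = first, last
    while True:
        while up < last and table[up] <= pivot_value:
            up += 1
        while table[down] > pivot_value:
            down -= 1
        if up >= down:
            break
        table[up], table[down] = table[down], table[up]
    table[first], table[down] = table[down], table[first]
    return down


def quick_sort(table, first, last):
    """Sort table[first..last] in place."""
    if first < last:
        piv_index = partition(table, first, last)
        quick_sort(table, first, piv_index - 1)   # left sublist
        quick_sort(table, piv_index + 1, last)    # right sublist


data = [44, 76, 23, 43, 55, 12, 64, 77, 33]
quick_sort(data, 0, len(data) - 1)
print(data)  # [12, 23, 33, 43, 44, 55, 64, 76, 77]
```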
Quick Sort Example
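Consider the following list (already partitioned once around the pivot 44):
12 33 23 43 44 55 64 77 76
[12 33 23 43]   [55 64 77 76]     sublists to the left and right of 44; their pivots are 12 and 55
[33 23 43]      [64 77 76]        remaining sublists; pivot 33 gives 23 33 43, pivot 64 leaves [77 76]
                [77 76]           pivot 77 gives 76 77
12 23 33 43 44 55 64 76 77        sorted list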
Complexity of Quick Sort
Let us analyse the number of comparisons that are made in this algorithm:
N for the first pass, where all the elements are compared with the pivot.
2*N/2 for the next pair of passes, where the N/2 elements in each "half" of the original array are compared to their own pivot values.
4*N/4 for the next four passes, where the N/4 elements in each "quarter" of the original array are compared to their own pivot values.
How many partitions occur?
How Many Partitions
It depends on the order of the original array elements:
If each partition divides the sub-array approximately in half, there will be only log₂N levels of partitioning, and so Quicksort is O(N.log₂N).
But if the original array was sorted to begin with, the recursive calls will partition the array into parts of unequal length, with one part empty and the other part containing all the rest of the array except for the pivot value itself. In this case, there can be as many as N-1 levels of partitioning, and QuickSort will be O(N²).
Best case: 3 (log₂8) passes of comparisons over 8 elements
Worst case: 7 passes of comparisons over 8 elements
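The best and worst cases for 8 elements can be illustrated by counting the levels of partitioning. The sketch below uses a simplified, list-building partition (not the in-place Up/Down scheme) with the first item as pivot; the function name `partition_levels` and the sample input orders are assumptions chosen for illustration.

```python
def partition_levels(items):
    """Return the number of levels of partitioning with a first-item pivot."""
    if len(items) <= 1:
        return 0
    pivot = items[0]
    left = [x for x in items[1:] if x <= pivot]
    right = [x for x in items[1:] if x > pivot]
    return 1 + max(partition_levels(left), partition_levels(right))


already_sorted = [1, 2, 3, 4, 5, 6, 7, 8]
well_mixed = [5, 3, 7, 2, 4, 6, 8, 1]

print(partition_levels(already_sorted))  # 7
print(partition_levels(well_mixed))      # 3
```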
Comparison Of Sorting Routines
Relative speeds for sorting random integers, using different methods.
[Chart comparing Bubble Sort, Insertion Sort, Selection Sort and Quick Sort.]
Merge Sort
In MergeSort, the list to be sorted is successively subdivided in two until each sub-list contains only one or two elements.
Subsequently they are merged together in order in such a way that after successive merges the whole list is recomposed in the desired sorting order.
This algorithm lends itself well to a recursive method of programming.
Merge Sort Algorithm
1. If the input sequence has fewer than two elements, return.
2. Partition the input sequence into two halves.
3. Sort the two subsequences using the same algorithm.
4. Merge the two sorted subsequences to form the output sequence.

MergeSort(list, first, last)
    if (first < last)
        middle = (first + last) div 2
        MergeSort(list, first, middle)
        MergeSort(list, middle+1, last)
        Merge(list, first, middle, last)
    endif

list: 44 76 23 43 55 12 64 77 33   (First at 44, Last at 33)
The Merge Algorithm (Part I)
void Merge(use variables A[] array of integer; f, m, l : integer)
    B[SIZE] : array of integer;   {temporary array}
    first1 := f; last1 := m; first2 := m+1; last2 := l;
    index := first1;
    while ((first1 <= last1) && (first2 <= last2))
        if (A[first1] < A[first2]) then
            B[index] := A[first1];
            first1 := first1+1;
        else
            B[index] := A[first2];
            first2 := first2+1;
        Endif
        index := index+1;
    End While

Example (f at 44, l at 43):
    A: 44 76 | 23 43   (First1 to Last1 = 44 76; First2 to Last2 = 23 43)
    B: 23 43 44 76
The Merge Algorithm (Part II)
    while (first1 <= last1)   {finish off first sub-array if necessary}
        B[index] := A[first1];
        first1 := first1+1;
        index := index+1;
    Endwhile
    while (first2 <= last2)   {finish off second sub-array if necessary}
        B[index] := A[first2];
        first2 := first2+1;
        index := index+1;
    Endwhile
    For index := f to l   {copy temporary array back to original array}
        A[index] := B[index];
    End For
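Putting the two parts of the merge together with the recursive MergeSort procedure, a Python sketch could look like this. The function names and 0-based indices are assumptions of this sketch, and the temporary array B is built as a Python list rather than a fixed-size array.

```python
def merge(a, first, middle, last):
    """Merge the sorted runs a[first..middle] and a[middle+1..last] in place."""
    b = []  # temporary array
    first1, first2 = first, middle + 1
    while first1 <= middle and first2 <= last:
        if a[first1] <= a[first2]:
            b.append(a[first1])
            first1 += 1
        else:
            b.append(a[first2])
            first2 += 1
    b.extend(a[first1:middle + 1])  # finish off first sub-array if necessary
    b.extend(a[first2:last + 1])    # finish off second sub-array if necessary
    a[first:last + 1] = b           # copy temporary array back


def merge_sort(a, first, last):
    """Sort a[first..last] in place."""
    if first < last:
        middle = (first + last) // 2
        merge_sort(a, first, middle)
        merge_sort(a, middle + 1, last)
        merge(a, first, middle, last)


data = [44, 76, 23, 43, 55, 12, 64, 77, 33]
merge_sort(data, 0, len(data) - 1)
print(data)  # [12, 23, 33, 43, 44, 55, 64, 76, 77]
```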
Merge Sort Complexity
The entire array can be subdivided into halves only log₂N times.
Each time it is subdivided, function Merge is called to re-combine the halves. Function Merge uses a temporary array to store the merged elements.
Merging is O(N) because it compares each element in the sub-arrays.
Copying elements back from the temporary array to the values in the array is also O(N).
Thus Merge Sort is O(N.log₂N).
Tree Sort (Tournament Sort)
Algorithm:
Transform the unsorted list into a binary search tree.
In a binary search tree, every node has the following property: all of its left descendants are smaller in value than the value of the node itself, and all of its right descendants are larger than its value.
Traverse the resultant binary search tree in order.
Tree Sort Example
Consider the following unsorted list of numbers:
27 48 13 50 39 77 82 91 65 19 70 66
Creating the binary search tree for this given list:
27
    left: 13
        right: 19
    right: 48
        left: 39
        right: 50
            right: 77
                left: 65
                    right: 70
                        left: 66
                right: 82
                    right: 91
Traversing this resultant tree in order we get the sorted list:
13 19 27 39 48 50 65 66 70 77 82 91.
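The two steps of tree sort (build a binary search tree, then traverse it in order) can be sketched in Python as follows. The class and function names are hypothetical, and placing equal values in the right subtree is an assumption of this sketch.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the tree rooted at root and return the root."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        else:  # equal values go to the right (assumption)
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right


def in_order(node, output):
    """Append the values of the tree to output in ascending order."""
    if node is not None:
        in_order(node.left, output)
        output.append(node.value)
        in_order(node.right, output)


def tree_sort(items):
    root = None
    for item in items:
        root = insert(root, item)
    result = []
    in_order(root, result)
    return result


print(tree_sort([27, 48, 13, 50, 39, 77, 82, 91, 65, 19, 70, 66]))
# [13, 19, 27, 39, 48, 50, 65, 66, 70, 77, 82, 91]
```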
Tree Sort Complexity
Building the tree requires N insertions, and the cost of each insertion depends on the shape (distribution) of the tree.
Reading the tree back into a sorted array requires a single in-order traversal that visits each of the N nodes once, which is O(N).
In the best case the tree is perfectly balanced, i.e. the maximum difference between the lowest leaf and the highest leaf is 1 level. Each insertion then requires about log₂N comparisons, leading to a total complexity of O(N.log₂N).
In the worst case the tree is totally unbalanced, which is similar to a simple linked list. Each of the N insertions then takes on average N/2 comparisons, giving an overall complexity of O(N²). This occurs when the original list is already ordered and thus all children are placed on the right of the parent node.
Note that this algorithm requires N extra memory locations to build the tree.
Complexity Summary
Sorting Algorithm    Best Case      Average Case    Worst Case
Selection Sort       O(N²)          O(N²)           O(N²)
Insertion Sort       O(N)           O(N²)           O(N²)
Bubble Sort          O(N²)          O(N²)           O(N²)
Merge Sort           O(N.log₂N)     O(N.log₂N)      O(N.log₂N)
Quick Sort           O(N.log₂N)     O(N.log₂N)      O(N²)
Tree Sort            O(N.log₂N)     O(N.log₂N)      O(N²)
(The Insertion Sort best case, an already sorted list, needs only one comparison per pass.)
Comments on Sorting Algorithms
Bubble, Insertion and Selection sort routines are preferable when the list of items to be sorted consists of only a few elements.
Bubble sort is the slowest in execution but the easiest method (and simplest implementation).
Insertion and Selection sorts have approximately the same speed and both are usually marginally faster than a bubble sort. Their implementations (programs) are short and simple, which is an advantage.
Quick Sort and Tree Sort are far faster than the above methods when large quantities of items are to be sorted. They are both sensitive to the initial arrangement of the list.
Merge Sort doesn't suffer from the initial arrangement problem. However, Merge Sort and Tree Sort require extra memory space for the temporary array or the binary tree, and the algorithms for the latter three (and their implementations) are much more complex.
Linear Search
This can be a matter of looking through the array, element by element sequentially until the required key is found.
Since the particular key may be absent from the unsorted array, the search may have to go through the entire array.
Thus the complexity of this type of searching is of the order of N, O(N).
This is clearly an inefficient method, called direct or linear search, and is only used for small arrays.
Sorting the array on the search key can improve the efficiency, since the search can stop as soon as the particular key is found or a greater one is found instead.
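The early stop on a sorted array might be sketched in Python as below. The function name `sorted_linear_search` and the convention of returning -1 when the key is absent are assumptions of this sketch.

```python
def sorted_linear_search(numbers, key):
    """Linear search on an ascending list; return the index of key or -1."""
    for item, value in enumerate(numbers):
        if value == key:
            return item
        if value > key:
            break  # a greater key was found, so key cannot appear later
    return -1


ordered = [123, 234, 342, 344, 455, 456, 645, 664, 675, 888]
print(sorted_linear_search(ordered, 456))  # 5
print(sorted_linear_search(ordered, 300))  # -1
```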
Array of N items: 342 234 123 675 455 664 344 888 645 456
Search key: 456 (compared with each item in turn, starting from the first)
Pseudo-Code
Program direct_search
    use variables: numbers array[N] of type integer   {indexed 0 to N-1}
                   key, item of type integer
                   found of type boolean
    found := FALSE
    item := 0
    repeat
        if key = numbers[item] then found := TRUE
        else item := item + 1
    until found = TRUE OR item = N
end program
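A Python sketch of the direct (linear) search follows. The function name `direct_search` and the convention of returning the index of the key, or -1 when it is absent, are assumptions of this sketch; the pseudo-code itself only sets a `found` flag.

```python
def direct_search(numbers, key):
    """Scan the list from the first item; return the index of key or -1."""
    for item, value in enumerate(numbers):
        if value == key:
            return item
    return -1


numbers = [342, 234, 123, 675, 455, 664, 344, 888, 645, 456]
print(direct_search(numbers, 456))  # 9
print(direct_search(numbers, 999))  # -1
```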
Binary Search
This algorithm, although it performs more computation at each step, is more efficient since it performs fewer comparisons.
The method involves finding the center of the array and comparing the search key with that element. If the search key is equal then the element is found.
If the search key is less than the element then the key must be in the lower half of the array so the same procedure is repeated on the lower half.
If the search key is greater than the element then the key must be in the upper half of the array so the same procedure is repeated on the upper half.
Allowance must be made for the case where the key doesn't exist in the array.
The complexity of this algorithm is thus O(log₂N).
NOTE: This algorithm works only on sorted arrays!
Binary Search Flowchart
1. Start: consider the entire input list l.
2. Is the considered list empty? YES: output "not found" and End. NO: continue.
3. Compare S to the middle item of the considered list.
4. S FOUND: output "found" and End.
5. S NOT FOUND: is S < considered item? YES: take the considered list as the lower half of the list. NO: take the considered list as the upper half of the list. Return to step 2.
Binary Search Algorithm
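Sorted array: 123 234 342 344 455 456 645 664 675 888
Search key: 456

pass 1: considered list 123 234 342 344 455 456 645 664 675 888; middle item 455; 456 > 455, so take the upper half
pass 2: considered list 456 645 664 675 888; middle item 664; 456 < 664, so take the lower half
pass 3: considered list 456 645; middle item 456; key found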
Pseudo-Code
Program Binary_Search
    use variables: numbers array[N] of type integer   {sorted, indexed 0 to N-1}
                   start, middle, end, key of type integer
                   found of type boolean
    start := 0
    end := N-1
    found := FALSE
    repeat
        middle := (end + start) div 2
        if key = numbers[middle] then found := TRUE
        else if key > numbers[middle] then start := middle + 1
        else if key < numbers[middle] then end := middle - 1
    until found = TRUE OR start > end
end program
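The binary search pseudo-code could be sketched in Python as follows. The function name `binary_search` and the convention of returning the index of the key, or -1 when it is absent, are assumptions of this sketch; checking `start <= end` before the first comparison also covers an empty list.

```python
def binary_search(numbers, key):
    """Search an ascending list; return the index of key or -1."""
    start = 0
    end = len(numbers) - 1
    while start <= end:
        middle = (start + end) // 2
        if key == numbers[middle]:
            return middle
        if key > numbers[middle]:
            start = middle + 1   # continue in the upper half
        else:
            end = middle - 1     # continue in the lower half
    return -1


ordered = [123, 234, 342, 344, 455, 456, 645, 664, 675, 888]
print(binary_search(ordered, 456))  # 5
print(binary_search(ordered, 124))  # -1
```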