big o notation
DESCRIPTION
Big O notation to understand space and time complexity in order to perform efficient programming and solve complex problems.TRANSCRIPT
![Page 1: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/1.jpg)
Algorithm Analysis: Big O Notation
![Page 2: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/2.jpg)
Determine the running time of simple algorithms
Best case
Average case
Worst case
Profile algorithms Understand O notation's mathematical basis Use O notation to measure running time
John Edgar 2
![Page 3: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/3.jpg)
John Edgar 3
![Page 4: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/4.jpg)
Algorithms can be described in terms of
Time efficiency
Space efficiency
Choosing an appropriate algorithm can make a significant difference in the usability of a system
Government and corporate databases with many millions of records, which are accessed frequently
Online search engines
Real time systems where near instantaneous response is required
▪ From air traffic control systems to computer games
John Edgar 4
![Page 5: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/5.jpg)
There are often many ways to solve a problem
Different algorithms that produce the same results ▪ e.g. there are numerous sorting algorithms
We are usually interested in how an algorithm performs when its input is large
In practice, with today's hardware, most algorithms will perform well with small input
There are exceptions to this, such as the Traveling Salesman Problem ▪ Or the recursive Fibonacci algorithm presented previously …
John Edgar 5
![Page 6: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/6.jpg)
It is possible to count the number of operations that an algorithm performs
By a careful visual walkthrough of the algorithm or by
Inserting code in the algorithm to count and print the number of times that each line executes (profiling)
It is also possible to time algorithms
Compare system time before and after running an algorithm ▪ More sophisticated timer classes exist
Simply timing an algorithm may ignore a variety of issues
John Edgar 6
![Page 7: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/7.jpg)
It may be useful to time how long an algorithm takes to rum
In some cases it may be essential to know how long an algorithm takes on some system
▪ e.g. air traffic control systems
But is this a good general comparison method?
Running time is affected by a number of factors other than algorithm efficiency
John Edgar 7
![Page 8: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/8.jpg)
CPU speed Amount of main memory Specialized hardware (e.g. graphics card) Operating system System configuration (e.g. virtual memory) Programming language Algorithm implementation Other programs System tasks (e.g. memory management) …
John Edgar 8
![Page 9: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/9.jpg)
Instead of timing an algorithm, count the number of instructions that it performs
The number of instructions performed may vary based on
The size of the input
The organization of the input
The number of instructions can be written as a cost function on the input size
John Edgar 9
![Page 10: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/10.jpg)
void printArray(int arr[], int size){
for (int i = 0; i < size; ++i){
cout << arr[i] << endl;
}
}
John Edgar 10
Operations performed on an array of length 10
|
declare and initialize i
perform comparison, print array element, and
increment i:10 times
||| ||| ||| ||| ||| ||| ||| ||| ||| ||| |
make comparison when i = 10
32 operations
![Page 11: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/11.jpg)
Instead of choosing a particular input size we will express a cost function for input of size n
Assume that the running time, t, of an algorithm is proportional to the number of operations
Express t as a function of n
Where t is the time required to process the data using some algorithm A
Denote a cost function as tA(n)
▪ i.e. the running time of algorithm A, with input size n
John Edgar 11
![Page 12: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/12.jpg)
void printArray(int arr[], int size){
for (int i = 0; i < size; ++i){
cout << arr[i] << endl;
}
}
John Edgar 12
Operations performed on an array of length n
1
declare and initialize i
perform comparison, print array element, and
increment i: n times
3n 1
make comparison when i = n
t = 3n + 2
![Page 13: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/13.jpg)
The number of operations usually varies based on the size of the input
Though not always – consider array lookup
In addition algorithm performance may vary based on the organization of the input
For example consider searching a large array
If the target is the first item in the array the search will be very fast
John Edgar 13
![Page 14: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/14.jpg)
Algorithm efficiency is often calculated for three broad cases of input
Best case
Average (or “usual”) case
Worst case
This analysis considers how performance varies for different inputs of the same size
John Edgar 14
![Page 15: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/15.jpg)
It can be difficult to determine the exact number of operations performed by an algorithm Though it is often still useful to do so
An alternative to counting all instructions is to focus on an algorithm's barometer instruction The barometer instruction is the instruction that is executed
the most number of times in an algorithm
The number of times that the barometer instruction is executed is usually proportional to its running time
John Edgar 15
![Page 16: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/16.jpg)
Analyze and compare some different algorithms Linear search
Binary search
Selection sort
Insertion sort
Quick sort
John Edgar 16
![Page 17: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/17.jpg)
![Page 18: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/18.jpg)
It is often useful to find out whether or not a list contains a particular item
Such a search can either return true or false
Or the position of the item in the list
If the array isn't sorted use linear search
Start with the first item, and go through the array comparing each item to the target
If the target item is found return true (or the index of the target element)
John Edgar 18
![Page 19: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/19.jpg)
int linearSearch(int arr[], int size, int x){
for (int i=0; i < size; i++){
if(arr[i] == x){
return i;
}
} //for
return -1; //target not found
}
John Edgar 19
The function returns as soon as the target item is found
return -1 to indicate that the item has not been found
![Page 20: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/20.jpg)
Search an array of n items The barometer instruction is equality checking (or
comparisons for short) arr[i] == x;
There are actually two other barometer instructions ▪ What are they?
How many comparisons does linear search perform?
John Edgar 20
int linearSearch(int arr[], int size, int x){
for (int i=0; i < size; i++){
if(arr[i] == x){
return i;
}
} //for
return -1; //target not found
}
![Page 21: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/21.jpg)
Best case
The target is the first element of the array
Make 1 comparison
Worst case
The target is not in the array or
The target is at the last position in the array
Make n comparisons in either case
Average case
Is it (best case + worst case) / 2, i.e. (n + 1) / 2?
John Edgar 21
![Page 22: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/22.jpg)
There are two situations when the worst case arises
When the target is the last item in the array
When the target is not there at all
To calculate the average cost we need to know how
often these two situations arise
We can make assumptions about this
Though any these assumptions may not hold for a
particular use of linear search
John Edgar 22
![Page 23: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/23.jpg)
The target is not in the array half the time
Therefore half the time the entire array has to be checked to determine this
There is an equal probability of the target being at any array location
If it is in the array
That is, there is a probability of 1/n that the target is at some location i
John Edgar 23
![Page 24: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/24.jpg)
Work done if the target is not in the array
n comparisons
This occurs with probability of 0.5
John Edgar 24
![Page 25: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/25.jpg)
Work done if target is in the array:
1 comparison if target is at the 1st location ▪ Occurs with probability 1/n (second assumption)
2 comparisons if target is at the 2nd location ▪ Also occurs with probability 1/n
i comparisons if target is at the ith location
Take the weighted average of the values to find the total expected number of comparisons (E)
E = 1*1/n + 2*1/n + 3*1/n + … + n * 1/n or
E = (n + 1) / 2
John Edgar 25
![Page 26: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/26.jpg)
Target is not in the array: n comparisons Target is in the array (n + 1) / 2 comparisons Take a weighted average of the two amounts:
= (n * ½) + ((n + 1) / 2 * ½)
= (n / 2) + ((n + 1) / 4)
= (2n / 4) + ((n + 1) / 4)
= (3n + 1) / 4
Therefore, on average, we expect linear search to perform (3n + 1) / 4 comparisons
John Edgar 26
![Page 27: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/27.jpg)
If we sort the target array first we can change the linear search average cost to around n / 2
Once a value equal to or greater than the target is found the search can end
▪ So, if a sequence contains 8 items, on average, linear search compares 4 of them,
▪ If a sequence contains 1,000,000 items, linear search compares 500,000 of them, etc.
However, if the array is sorted, it is possible to do much better than this by using binary search
John Edgar 27
![Page 28: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/28.jpg)
int binSearch(int arr[], int size, int target){
int low = 0;
int high = size - 1;
int mid = 0;
while (low <= high){
mid = (low + high) / 2;
if(target == arr[mid]){
return mid;
} else if(target > arr[mid]){
low = mid + 1;
} else { //target < arr[mid]
high = mid - 1;
}
} //while
return -1; //target not found
} John Edgar 28
Index of the last element in the array
Note the if, else if, else
![Page 29: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/29.jpg)
The algorithm consists of three parts
Initialization (setting lower and upper)
While loop including a return statement on success
Return statement which executes when on failure
Initialization and return on failure require the same amount of work regardless of input size
The number of times that the while loop iterates depends on the size of the input
John Edgar 29
![Page 30: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/30.jpg)
The while loop contains an if, else if, else statement The first if condition is met when the target is found
And is therefore performed at most once each time the algorithm is run
The algorithm usually performs 5 operations for each iteration of the while loop
Checking the while condition
Assignment to mid
Equality comparison with target
Inequality comparison
One other operation (setting either lower or upper)
John Edgar 30
The barometer instructions
![Page 31: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/31.jpg)
In the best case the target is the midpoint element of the array
Requiring one iteration of the while loop
John Edgar 31
index 0 1 2 3 4 5 6 7
arr 1 3 7 11 13 17 19 23
mid = (0 + 7) / 2 = 3
binary search (arr, 11)
![Page 32: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/32.jpg)
What is the worst case for binary search?
Either the target is not in the array, or
It is found when the search space consists of one element
How many times does the while loop iterate in the worst case?
John Edgar 32
index 0 1 2 3 4 5 6 7
arr 1 3 7 11 13 17 19 23
mid =
binary search (arr, 20)
(0 + 7) / 2 = 3 (4 + 7) / 2 = 5 (6 + 7) / 2 = 6 done
![Page 33: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/33.jpg)
Each iteration of the while loop halves the search space
For simplicity assume that n is a power of 2
▪ So n = 2k (e.g. if n = 128, k = 7)
How large is the search space?
The first iteration halves the search space to n/2
After the second iteration the search space is n/4
After the kth iteration the search space consists of just one element, since n/2k = n/n = 1
▪ Because n = 2k, k = log2n
Therefore at most log2n iterations of the while loop are made in the worst case!
John Edgar 33
![Page 34: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/34.jpg)
Is the average case more like the best case or the worst case?
What is the chance that an array element is the target
▪ 1/n the first time through the loop
▪ 1/(n/2) the second time through the loop
▪ … and so on …
It is more likely that the target will be found as the search space becomes small
That is, when the while loop nears its final iteration
We can conclude that the average case is more like the worst case than the best case
John Edgar 34
![Page 35: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/35.jpg)
John Edgar 35
n (3n+1)/4 log2(n)
10 8 3
100 76 7
1,000 751 10
10,000 7,501 13
100,000 75,001 17
1,000,000 750,001 20
10,000,000 7,500,001 24
![Page 36: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/36.jpg)
![Page 37: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/37.jpg)
As an example of algorithm analysis let's look at two simple sorting algorithms
Selection Sort and
Insertion Sort
Calculate an approximate cost function for these two sorting algorithms
By analyzing how many operations are performed by each algorithm
This will include an analysis of how many times the algorithms' loops iterate
John Edgar 37
![Page 38: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/38.jpg)
Selection sort is a simple sorting algorithm that repeatedly finds the smallest item
The array is divided into a sorted part and an unsorted part
Repeatedly swap the first unsorted item with the smallest unsorted item
Starting with the element with index 0, and
Ending with last but one element (index n – 1)
John Edgar 38
![Page 39: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/39.jpg)
John Edgar 39
23 41 33 81 07 19 11 45 find smallest unsorted - 7 comparisons
07 41 33 81 23 19 11 45 find smallest unsorted - 6 comparisons
07 11 33 81 23 19 41 45 find smallest unsorted - 5 comparisons
07 11 19 81 23 33 41 45 find smallest unsorted - 4 comparisons
07 11 19 23 81 33 41 45 find smallest unsorted - 3 comparisons
07 11 19 23 33 81 41 45 find smallest unsorted - 2 comparisons
07 11 19 23 33 41 81 45 find smallest unsorted - 1 comparison
07 11 19 23 33 41 45 81
![Page 40: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/40.jpg)
Unsorted elements Comparisons
n n-1
n-1 n-2
… …
3 2
2 1
1 0
n(n-1)/2
John Edgar 40
![Page 41: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/41.jpg)
void selectionSort(int arr[], int size){
for(int i = 0; i < size -1; ++i){
int smallest = i;
// Find the index of the smallest element
for(int j = i + 1; j < size; ++j){
if(arr[j] < arr[smallest]){
smallest = j;
}
}
// Swap the smallest with the current item
temp = arr[i];{
arr[i] = arr[smallest];
arr[smallest] = temp;
}
}
John Edgar 41
inner loop body n(n – 1)/2 times
outer loop n-1 times
![Page 42: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/42.jpg)
The barometer operation for selection sort must be in the inner loop
Since operations in the inner loop are executed the greatest number of times
The inner loop contains four operations
Compare j to array length
Compare arr[j] to smallest
Change smallest
Increment j
John Edgar 42
The barometer instructions
![Page 43: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/43.jpg)
The barometer instruction is evaluated n(n-1) times Let’s calculate a detailed cost function
The outer loop is evaluated n-1 times ▪ 7 instructions (including the loop statements), cost is 7(n-1)
The inner loop is evaluated n(n – 1)/2 times ▪ There are 4 instructions but one is only evaluated some of the time
▪ Worst case cost is 4(n(n – 1)/2)
Some constant amount of work is performed ▪ Parameters are set and the outer loop control variable is initialized
Total cost: 7(n-1) + 4(n(n – 1)/2) + 3 ▪ Assumption: all instructions have the same cost
John Edgar 43
![Page 44: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/44.jpg)
In broad terms and ignoring the actual number of executable statements selection sort
Makes n*(n – 1)/2 comparisons, regardless of the original order of the input
Performs n – 1 swaps
Neither of these operations are substantially affected by the organization of the input
John Edgar 44
![Page 45: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/45.jpg)
Another simple sorting algorithm
Divides array into sorted and unsorted parts
The sorted part of the array is expanded one element at a time
Find the correct place in the sorted part to place the 1st element of the unsorted part
▪ By searching through all of the sorted elements
Move the elements after the insertion point up one position to make space
John Edgar 45
![Page 46: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/46.jpg)
John Edgar 46
23 41 33 81 07 19 11 45 treats first element as sorted part
07 11 19 23 33 41 45 81 locate position for 45 - 1 comparisons
23 41 33 81 07 19 11 45 locate position for 41 - 1 comparison
23 33 41 81 07 19 11 45 locate position for 33 - 2 comparisons
23 33 41 81 07 19 11 45 locate position for 81 - 1 comparison
07 23 33 41 81 19 11 45 locate position for 07 - 4 comparisons
07 19 23 33 41 81 11 45 locate position for 19- 5 comparisons
07 11 19 23 33 41 81 45 locate position for 11- 6 comparisons
![Page 47: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/47.jpg)
inner loop body how many times?
void insertionSort(int arr[], int size){
for(int i = 1; i < size; ++i){
temp = arr[i];
int pos = i;
// Shuffle up all sorted items > arr[i]
while(pos > 0 && arr[pos - 1] > temp){
arr[pos] = arr[pos – 1];
pos--;
} //while
// Insert the current item
arr[pos] = temp;
}
}
John Edgar 47
maximum: i – 1 times for each iteration, n * (n – 1) / 2
outer loop n-1 times
minimum: just the test for each outer loop iteration, n
![Page 48: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/48.jpg)
Sorted
Elements
Worst-case Search
Worst-case Shuffle
0 0 0
1 1 1
2 2 2
… … …
n-1 n-1 n-1
n(n-1)/2 n(n-1)/2
John Edgar 48
![Page 49: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/49.jpg)
The efficiency of insertion sort is affected by the state of the array to be sorted
In the best case the array is already completely sorted!
No movement of array elements is required
Requires n comparisons
John Edgar 49
![Page 50: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/50.jpg)
In the worst case the array is in reverse order Every item has to be moved all the way to the
front of the array
The outer loop runs n-1 times
▪ In the first iteration, one comparison and move
▪ In the last iteration, n-1 comparisons and moves
▪ On average, n/2 comparisons and moves
For a total of n * (n-1) / 2 comparisons and moves
John Edgar 50
![Page 51: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/51.jpg)
What is the average case cost?
Is it closer to the best case?
Or the worst case?
If random data is sorted, insertion sort is usually closer to the worst case
Around n * (n-1) / 4 comparisons
And what do we mean by average input for a sorting algorithm in anyway?
John Edgar 51
![Page 52: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/52.jpg)
![Page 53: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/53.jpg)
Quicksort is a more efficient sorting algorithm than either selection or insertion sort
It sorts an array by repeatedly partitioning it
Partitioning is the process of dividing an array into sections (partitions), based on some criteria
Big and small values
Negative and positive numbers
Names that begin with a-m, names that begin with n-z
Darker and lighter pixels
John Edgar 53
![Page 54: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/54.jpg)
John Edgar 54
Partition this array into small and big values using a partitioning algorithm
31 12 07 23 93 02 11 18
![Page 55: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/55.jpg)
John Edgar 55
Partition this array into small and big values using a partitioning algorithm
We will partition the array around the last value (18), we'll call this value the pivot
31 12 07 23 93 02 11 18
Use two indices, one at each end of the array, call them low and high
18
![Page 56: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/56.jpg)
John Edgar 56
31 12 07 23 93 02 11 18
arr[low] (31) is greater than the pivot and should be on the right, we need to swap it with something
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 57: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/57.jpg)
John Edgar 57
31 12 07 23 93 02 11 18
arr[low] (31) is greater than the pivot and should be on the right, we need to swap it with something
arr[high] (11) is less than the pivot so swap with arr[low]
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 58: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/58.jpg)
John Edgar 58
31 12 07 23 93 02 11 18 31 11
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 59: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/59.jpg)
John Edgar 59
12 07 23 93 02 18
increment low until it needs to be swapped with something
31 11 12 07
then decrement high until it can be swapped with low
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 60: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/60.jpg)
John Edgar 60
12 07 23 93 02 18
and then swap them
31 23 02 11 12 07
increment low until it needs to be swapped with something
then decrement high until it can be swapped with low
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 61: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/61.jpg)
John Edgar 61
12 07 93 18
repeat this process until
31 23 02 11
high and low are the same
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 62: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/62.jpg)
John Edgar 62
repeat this process until
high and low are the same
We'd like the pivot value to be in the centre of the array, so we will swap it with the first item greater than it
12 07 93 18 31 23 02 11 93 18
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 63: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/63.jpg)
John Edgar 63
smalls bigs pivot
12 07 93 18 31 23 02 11
We will partition the array around the last value (18), we'll call this value the pivot
Use two indices, one at each end of the array, call them low and high
Partition this array into small and big values using a partitioning algorithm
![Page 64: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/64.jpg)
John Edgar 64
Use the same algorithm to partition this array into small and big values
00 08 07 01 06 02 05 09
bigs! pivot
00 08 07 01 06 02 05 09
smalls
![Page 65: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/65.jpg)
John Edgar 65
Or this one:
09 08 07 06 05 04 02 01
bigs pivot
01 08 07 06 05 04 02 09
smalls
![Page 66: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/66.jpg)
The Quicksort algorithm works by repeatedly partitioning an array
Each time a subarray is partitioned there is
A sequence of small values,
A sequence of big values, and
A pivot value which is in the correct position
Partition the small values, and the big values
Repeat the process until each subarray being partitioned consists of just one element
John Edgar 66
![Page 67: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/67.jpg)
The Quicksort algorithm repeatedly partitions an array until it is sorted
Until all partitions consist of at most one element
A simple iterative approach would halve each sub-array to get partitions
But partitions are not necessarily of the same size
So the start and end indexes of each partition are not easily predictable
John Edgar 67
![Page 68: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/68.jpg)
47 70 36 97 03 67 29 11 48 09 53
09 29 48 03 47 97
03 11 29 48 61
11
John Edgar 68
36 11 61 70
36 09 97
08 01 36
11 09 03
09 03
53
70 47
29
11
53
48 61 97 70 47 53
36 29 48 61 97 70 47 53
36 29 48 61 97 70 47 53
![Page 69: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/69.jpg)
One way to implement Quicksort might be to record the index of each new partition
But this is difficult and requires a reasonable amount of space
The goal is to record the start and end index of each partition
This can be achieved by making them the parameters of a recursive function
John Edgar 69
![Page 70: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/70.jpg)
void quicksort(arr[], int low, int high){
if (low < high){
pivot = partition(arr[], low, high)
quicksort(arr[], low, pivot – 1)
quicksort(arr[], pivot + 1, high)
}
}
John Edgar 70
![Page 71: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/71.jpg)
How long does Quicksort take to run?
Let's consider the best and the worst case
These differ because the partitioning algorithm may not always do a good job
Let's look at the best case first
Each time a sub-array is partitioned the pivot is the exact midpoint of the slice (or as close as it can get) ▪ So it is divided in half
What is the running time?
John Edgar 71
![Page 72: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/72.jpg)
John Edgar 72
08 01 02 07 03 06 04 05
bigs pivot
04 01 02 03 05 06 08 07
smalls
First partition
![Page 73: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/73.jpg)
John Edgar 73
big1 pivot1
02 01 04 05 06 08
sm1
04 01 02 03 05 06 08 07
Second partition
07 03
pivot1 pivot2
pivot2 big2 sm2
![Page 74: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/74.jpg)
John Edgar 74
pivot1
02 03 04 05 06 07 08
Third partition
02 01 03 04 05 06 07 08
pivot1 done done done
01
![Page 75: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/75.jpg)
Each sub-array is divided in half in each partition Each time a series of sub-arrays are partitioned n
(approximately) comparisons are made
The process ends once all the sub-arrays left to be partitioned are of size 1
How many times does n have to be divided in half before the result is 1? log2 (n) times
Quicksort performs n * log2 (n) operations in the best case
John Edgar 75
![Page 76: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/76.jpg)
First partition
John Edgar 76
09 08 07 06 05 04 02 01
bigs pivot
01 08 07 06 05 04 02 09
smalls
![Page 77: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/77.jpg)
John Edgar 77
bigs pivot
01 08 07 06 05 04 02 09
smalls
01 08 07 06 05 04 02 09
Second partition
![Page 78: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/78.jpg)
John Edgar 78
bigs pivot
01 02 07 06 05 04 08 09
01 08 07 06 05 04 02 09
Third partition
![Page 79: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/79.jpg)
John Edgar 79
pivot
01 02 07 06 05 04 08 09
smalls
01 02 07 06 05 04 08 09
Fourth partition
![Page 80: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/80.jpg)
John Edgar 80
bigs pivot
01 02 04 06 05 07 08 09
01 02 07 06 05 04 08 09
Fifth partition
![Page 81: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/81.jpg)
John Edgar 81
pivot
01 02 04 06 05 07 08 09
smalls
01 02 04 06 05 07 08 09
Sixth partition
![Page 82: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/82.jpg)
John Edgar 82
pivot
01 02 04 05 06 07 08 09
01 02 04 06 05 07 08 09
Seventh partition!
![Page 83: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/83.jpg)
Every partition step ends with no values on one side of the pivot
The array has to be partitioned n times, not log2(n) times
So in the worst case Quicksort performs around n2 operations
The worst case usually occurs when the array is nearly sorted (in either direction)
John Edgar 83
![Page 84: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/84.jpg)
With a large array we would have to be very, very unlucky to get the worst case
Unless there was some reason for the array to already be partially sorted
The average case is much more like the best case than the worst case
There is an easy way to fix a partially sorted arrays to that it is ready for Quicksort
Randomize the positions of the array elements!
John Edgar 84
![Page 85: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/85.jpg)
![Page 86: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/86.jpg)
Linear search: 3(n + 1)/4 – average case
Given certain assumptions
Binary search: log2n – worst case
Average case similar to the worst case
Selection sort: n((n – 1) / 2) – all cases Insertion sort: n((n – 1) / 2) – worst case
Average case is similar to the worst case
Quicksort: n(log2(n)) – best case
Average case is similar to the best case
John Edgar 86
![Page 87: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/87.jpg)
Let's compare these algorithms for some arbitrary input size (say n = 1,000) In order of the number of comparisons
▪ Binary search
▪ Linear search
▪ Insertion sort best case
▪ Quicksort average and best cases
▪ Selection sort all cases, Insertion sort average and worst cases, Quicksort worst case
John Edgar 87
![Page 88: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/88.jpg)
What do we want to know when comparing two algorithms?
The most important thing is how quickly the time requirements increase with input size
e.g. If we double the input size how much longer does an algorithm take?
Here are some graphs …
John Edgar 88
![Page 89: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/89.jpg)
0
50
100
150
200
250
300
350
400
450
10 11 12 13 14 15 16 17 18 19 20
Nu
mb
er
of
Op
era
tio
ns
n
log2n
5(log2n)
3(n+1)/4
n
n(log2n)
n((n-1)/2)
n2
John Edgar 89
Hard to see what is happening with n so small …
![Page 90: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/90.jpg)
John Edgar 90
n2 and n(n-1)/2 are growing much faster than any of the others
0
2000
4000
6000
8000
10000
12000
10 20 30 40 50 60 70 80 90 100
Nu
mb
er
of
Op
era
tio
ns
n
log2n
5(log2n)
3(n+1)/4
n
n(log2n)
n((n-1)/2)
n2
![Page 91: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/91.jpg)
John Edgar 91
Hmm! Let's try a logarithmic scale …
0
200000000000
400000000000
600000000000
800000000000
1000000000000
1200000000000
10 50 100 500 1000 5000 10000 50000 100000 500000 1000000
Nu
mb
er
of
Op
era
tio
ns
n
log2n
5(log2n)
3(n+1)/4
n
n(log2n)
n((n-1)/2)
n2
![Page 92: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/92.jpg)
John Edgar 92
Notice how clusters of growth rates start to emerge
1
10
100
1000
10000
100000
1000000
10000000
100000000
1000000000
10000000000
100000000000
1000000000000
10 50 100 500 1000 5000 10000 50000 100000 500000 1000000
Nu
mb
er
of
Op
era
tio
ns
n
log2n
5(log2n)
3(n+1)/4
n
n(log2n)
n((n-1)/2)
n2
![Page 93: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/93.jpg)
Exact counting of operations is often difficult (and tedious), even for simple algorithms
And is often not much more useful than estimates due to the relative importance of other factors
O Notation is a mathematical language for evaluating the running-time of algorithms
O-notation evaluates the growth rate of an algorithm
John Edgar 93
![Page 94: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/94.jpg)
Cost Function: tA(n) = n2 + 20n + 100 Which term in the function is the most important?
It depends on the size of n n = 2, tA(n) = 4 + 40 + 100
▪ The constant, 100, is the dominating term
n = 10, tA(n) = 100 + 200 + 100 ▪ 20n is the dominating term
n = 100, tA(n) = 10,000 + 2,000 + 100 ▪ n2 is the dominating term
n = 1000, tA(n) = 1,000,000 + 20,000 + 100 ▪ n2 is still the dominating term
John Edgar 94
![Page 95: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/95.jpg)
O notation approximates a cost function that allows us to estimate growth rate
The approximation is usually good enough
▪ Especially when considering the efficiency of an algorithm as n gets very large
Count the number of times that an algorithm executes its barometer instruction
And determine how the count increases as the input size increases
John Edgar 95
![Page 96: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/96.jpg)
Big-O notation does not give a precise formulation of the cost function for a particular data size
It expresses the general behaviour of the algorithm as the data size n grows very large so ignores
lower order terms and
constants
A Big-O cost function is a simple function of n
n, n2, log2(n), etc.
John Edgar 96
![Page 97: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/97.jpg)
An algorithm is said to be order f(n)
Denoted as O(f(n))
The function f(n) is the algorithm's growth rate function
If a problem of size n requires time proportional to n then the problem is O(n)
▪ e.g. If the input size is doubled so is the running time
John Edgar 97
![Page 98: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/98.jpg)
An algorithm is order f(n) if there are positive constants k and m such that
tA(n) k * f(n) for all n m ▪ i.e. find constants k and m such that the cost function is less than or
equal to k * a simpler function for all n greater than or equal to m
If so we would say that tA(n) is O(f(n))
John Edgar 98
![Page 99: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/99.jpg)
Finding a constant k | tA(n) k * f(n) shows that there is no higher order term than f(n)
e.g. If the cost function was n2 + 20n + 100 and I believed this was O(n)
▪ I would not be able to find a constant k | tA(n) k * f(n) for all values of n
For some small values of n lower order terms may dominate
The constant m addresses this issue
John Edgar 99
![Page 100: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/100.jpg)
The idea is that a cost function can be approximated by another, simpler, function
The simpler function has 1 variable, the data size n
This function is selected such that it represents an upper bound on the value of tA(n)
Saying that the time efficiency of algorithm A tA(n) is O(f(n)) means that
A cannot take more than O(f(n)) time to execute, and
The cost function tA(n) grows at most as fast as f(n)
John Edgar 100
![Page 101: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/101.jpg)
An algorithm’s cost function is 3n + 12
If we can find constants m and k such that:
k * n > 3n + 12 for all n m then
The algorithm is O(n)
Find values of k and m so that this is true
k = 4, and
m = 12 then
4n 3n + 12 for all n 12
John Edgar 101
![Page 102: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/102.jpg)
An algorithm’s cost function is 2n2 + 10n + 6
If we can find constants m and k such that:
k * n2 > 2n2 + 10n + 6 for all n m then
The algorithm is O(n2)
Find values of k and m so that this is true
k = 3, and
m = 11 then
3n2 > 2n2 + 10n + 6 for all n 11
John Edgar 102
![Page 103: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/103.jpg)
John Edgar 103
0
200
400
600
800
1000
1200
1400
5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2n2+10n+6
3n2
After this point 3n2 is always going to be larger than 2n2 +10n + 6
![Page 104: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/104.jpg)
All these expressions are O(n):
n, 3n, 61n + 5, 22n – 5, …
All these expressions are O(n2):
n2, 9n2, 18n2 + 4n – 53, …
All these expressions are O(n log n):
n(log n), 5n(log 99n), 18 + (4n – 2)(log (5n + 3)), …
John Edgar 104
![Page 105: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/105.jpg)
O(k * f) = O(f) if k is a constant
e.g. O(23 * O(log n)), simplifies to O(log n)
O(f + g) = max[O(f), O(g)]
O(n + n2), simplifies to O(n2)
O(f * g) = O(f) * O(g)
O(m * n), equals O(m) * O(n)
Unless there is some known relationship between m and n that allows us to simplify it, e.g. m < n
John Edgar 105
![Page 106: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/106.jpg)
O(1) – constant time
The time is independent of n, e.g. list look-up
O(log n) – logarithmic time
Usually the log is to the base 2, e.g. binary search
O(n) – linear time, e.g. linear search O(n*logn) – e.g. Qquicksort, Mergesort O(n2) – quadratic time, e.g. selection sort O(nk) – polynomial (where k is some constant)
O(2n) – exponential time, very slow!
John Edgar 106
![Page 107: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/107.jpg)
We write O(1) to indicate something that takes a constant amount of time
e.g. finding the minimum element of an ordered array takes O(1) time
▪ The min is either at the first or the last element of the array
Important: constants can be large
So in practice O(1) is not necessarily efficient
It tells us is that the algorithm will run at the same speed no matter the size of the input we give it
John Edgar 107
![Page 108: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/108.jpg)
The O notation growth rate of some algorithms varies depending on the input
Typically we consider three cases:
Worst case, usually (relatively) easy to calculate and therefore commonly used
Average case, often difficult to calculate
Best case, usually easy to calculate but less important than the other cases
John Edgar 108
![Page 109: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/109.jpg)
Linear search Best case: O(1)
Average case: O(n)
Worst case: O(n) Binary search Best case: O(1)
Average case: O(log n)
Worst case: O(log n)
John Edgar 109
![Page 110: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/110.jpg)
Quicksort
Best case: O(n(log2n))
Average case: O(n(log2n))
Worst case: O(n2)
Mergesort
Best case: O(n(log2n))
Average case: O(n(log2n))
Worst case: O(n(log2n))
John Edgar 110
![Page 111: Big O Notation](https://reader031.vdocuments.us/reader031/viewer/2022020403/55cf8617550346484b942d98/html5/thumbnails/111.jpg)
Selection sort
Best Case: O(n2)
Average case: O(n2)
Worst case: O(n2)
Insertion sort
Best case: O(n)
Average case: O(n2)
Worst case: O(n2)
John Edgar 111