big-o and sorting

Post on 04-Jan-2016

Big-O and Sorting

February 6, 2006

Administrative Stuff

• Readings for today: Ch 7.3-7.5

• Readings for tomorrow: Ch 8

Sorting!

• Very common to need data in order
– Viewing, printing
– Faster to search, find min/max, compute median/mode, etc.

• Lots of different sorting algorithms
– From the simple to very complex
– Some optimized for certain situations (lots of duplicates, almost sorted, etc.)
– Typically sort arrays, but algorithms usually can be adapted for other data structures (e.g. linked lists)

Selection sort

• Sort by "selecting" smallest and putting in front
– Search entire array for minimum value
– Min is placed in first slot
– Could move elements over to make space, but faster to just swap with current first
– Repeat for second smallest, third, and so on

Selection sort code

void SelectionSort(int arr[], int n)
{
    for (int i = 0; i < n-1; i++) {
        int minIndex = i;
        for (int j = i+1; j < n; j++) {
            if (arr[j] < arr[minIndex])
                minIndex = j;
        }
        Swap(arr[i], arr[minIndex]);
    }
}

Analyzing selection sort

    for (int i = 0; i < n-1; i++) {
        int minIndex = i;
        for (int j = i+1; j < n; j++) {
            if (arr[j] < arr[minIndex])
                minIndex = j;
        }
        Swap(arr[i], arr[minIndex]);
    }

• Count statements
– First time, inner loop does N-1 comparisons
– N-2 second time, then N-3, …
– Last iteration 1 comparison

Analyzing selection sort

• N-1 + N-2 + N-3 + … + 3 + 2 + 1
– "Gaussian sum"

• Add sum to self

      N-1 + N-2 + N-3 + … +  3  +  2  +  1
  +    1  +  2  +  3  + … + N-3 + N-2 + N-1
  =    N  +  N  +  N  + … +  N  +  N  +  N
  = (N-1)N

Sum = 1/2 * (N-1)N

O(N^2)
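The count above can be checked empirically. The following is a minimal sketch, not part of the original slides: it instruments the selection sort loop to count comparisons and compares the count with the closed form (N-1)N/2. The names SelectionSortCountingComparisons and GaussianSum are made up for this illustration, and std::swap stands in for the slides' Swap helper.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort instrumented to count element comparisons.
// Returns how many times the test arr[j] < arr[minIndex] was evaluated.
long long SelectionSortCountingComparisons(std::vector<int> &arr)
{
    long long comparisons = 0;
    const std::size_t n = arr.size();
    for (std::size_t i = 0; i + 1 < n; i++) {
        std::size_t minIndex = i;
        for (std::size_t j = i + 1; j < n; j++) {
            comparisons++;
            if (arr[j] < arr[minIndex]) minIndex = j;
        }
        std::swap(arr[i], arr[minIndex]);  // stand-in for the slides' Swap
    }
    return comparisons;
}

// Closed form of the Gaussian sum (N-1) + (N-2) + ... + 1.
long long GaussianSum(long long n)
{
    assert(n >= 0);
    return (n - 1) * n / 2;
}
```

For a 10-element array the counter should report 45 comparisons, which is GaussianSum(10), regardless of the initial order of the elements.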

Quadratic growth

• In clock time
– 10,000 elements: 3 sec
– 20,000 elements: 13 sec
– 50,000 elements: 77 sec
– 100,000 elements: 5 min

• Double input -> 4X time
– Feasible for small inputs, quickly unmanageable

• Halve input -> 1/4 time
– Hmm…
– If two sorted half-size arrays, how to produce sorted full array?
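The "double input -> 4X time" rule can be turned into a rough predictor. This is a sketch under the assumption that running time is purely proportional to N^2 (constant factors and lower-order terms ignored); PredictQuadraticSeconds is a name invented for this example.

```cpp
#include <cassert>

// Extrapolate the running time of a quadratic algorithm from one measurement.
// If time(n) is roughly c * n^2, then time(n2) = time(n1) * (n2 / n1)^2.
double PredictQuadraticSeconds(double n1, double seconds1, double n2)
{
    assert(n1 > 0 && seconds1 >= 0);
    double ratio = n2 / n1;
    return seconds1 * ratio * ratio;
}
```

Starting from the 3 sec measured at 10,000 elements, this predicts 12 sec at 20,000, 75 sec at 50,000, and 300 sec (5 min) at 100,000, close to the measured 13 sec, 77 sec, and 5 min.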

Mergesort

• "Divide and conquer" algorithm
– Divide array in half
– Recursively sort each half
– Merge two halves together

• "Easy-split hard-join"
– No complex decision about which goes where, just divide in middle
– Merge step preserves ordering from each half

6 2 8 4 10 7 1 5 9 3

void MergeSort(int array[], int n)
{
    if (n > 1) {
        int n1 = n/2;
        int n2 = n - n1;
        int *arr1 = CopySubArray(array, 0, n1);
        int *arr2 = CopySubArray(array, n1, n2);

        MergeSort(arr1, n1);
        MergeSort(arr2, n2);

        Merge(array, arr1, n1, arr2, n2);

        delete[] arr1;
        delete[] arr2;
    }
}

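CopySubArray

// Create a new array in memory holding the n elements of arr
// that begin at index start. The caller must delete[] the result.
int *CopySubArray(int arr[], int start, int n)
{
    int *copy = new int[n];

    for (int i = 0; i < n; i++) {
        copy[i] = arr[start + i];
    }
    return copy;
}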

Merge code

void Merge(int array[], int arr1[], int n1, int arr2[], int n2)
{
    int p, p1, p2;
    p = p1 = p2 = 0;
    while (p1 < n1 && p2 < n2) {         // Merge until hit
        if (arr1[p1] < arr2[p2]) {       // end of one array
            array[p++] = arr1[p1++];
        } else {
            array[p++] = arr2[p2++];
        }
    }
    while (p1 < n1) {                    // Merge rest of
        array[p++] = arr1[p1++];         // remaining array
    }
    while (p2 < n2) {
        array[p++] = arr2[p2++];
    }
}
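For experimenting outside the course environment, the same divide, recurse, and merge steps can be sketched with std::vector, which avoids the manual new/delete[] bookkeeping. This is an illustrative variant, not the lecture's code; MergeSorted and MergeSortVector are names made up here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merge two already-sorted vectors into one sorted vector.
std::vector<int> MergeSorted(const std::vector<int> &a, const std::vector<int> &b)
{
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {   // merge until one input runs out
        if (a[i] < b[j]) out.push_back(a[i++]);
        else out.push_back(b[j++]);
    }
    while (i < a.size()) out.push_back(a[i++]);  // copy whatever remains
    while (j < b.size()) out.push_back(b[j++]);
    assert(out.size() == a.size() + b.size());
    return out;
}

// Divide in the middle, sort each half recursively, merge the results.
std::vector<int> MergeSortVector(const std::vector<int> &v)
{
    if (v.size() <= 1) return v;             // base case: already sorted
    std::size_t half = v.size() / 2;
    std::vector<int> left(v.begin(), v.begin() + half);
    std::vector<int> right(v.begin() + half, v.end());
    return MergeSorted(MergeSortVector(left), MergeSortVector(right));
}
```

Calling MergeSortVector on {6, 2, 8, 4, 10, 7, 1, 5, 9, 3} returns the values 1 through 10 in order. Returning new vectors keeps the sketch short at the cost of extra copying, the same overhead the array version pays with CopySubArray.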

[Figure: Merge in progress, showing array, arr1, arr2, their sizes n1 and n2, and the cursors p1, p2, and p]

Merge sort analysis

MS(N)                                       = N
MS(N/2)  MS(N/2)                            = N/2 + N/2
N/4  N/4  N/4  N/4                          = 4*N/4
N/8  N/8  N/8  N/8  N/8  N/8  N/8  N/8      = 8*N/8
...
N/2^K  (K levels)

Each level contributes N

N/2^K = 1
N = 2^K
lg N = K

lg N levels * N per level = O(NlgN)
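The level-by-level argument can be replayed numerically. The sketch below assumes N is a power of two so that every split is exact; LevelTotals and MergeSortWorkByLevel are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Result of summing the merge work over all levels of the recursion.
struct LevelTotals {
    int levels;            // number of levels that do merge work (lg N)
    long long totalWork;   // sum over levels of (pieces * piece size)
};

// At each level there are `pieces` subarrays of `size` elements each,
// so the level costs pieces * size = N. Halve until pieces have size 1.
LevelTotals MergeSortWorkByLevel(long long n)
{
    assert(n >= 1 && (n & (n - 1)) == 0);  // power of two, assumed for simplicity
    LevelTotals t = {0, 0};
    long long pieces = 1;
    for (long long size = n; size > 1; size /= 2) {
        t.totalWork += pieces * size;      // always equals n
        t.levels++;
        pieces *= 2;
    }
    return t;
}
```

For N = 1024 this reports 10 levels and a total of 10,240 units of work, which is N * lg N.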

In clock time

• Compare SelectionSort to MergeSort

N            SelectionSort    MergeSort
10,000       3 sec            .05 sec
20,000       13 sec           .15 sec
50,000       78 sec           .38 sec
100,000      5 min            .81 sec
200,000      20 min           1.7 sec
1,000,000    8 hrs (est)      9 sec

• O(NlgN) is looking pretty good! But can we do even better?

Can we do even better than MergeSort?

• O(N log N) is the best possible for a general-purpose sort that works by comparing elements
– So, theoretically, answer is “no”

• But, we can come up with a different O(N log N) sort that is practically faster

• Want to avoid overhead of creating new arrays (as is done in MergeSort)
– Bring on the QuickSort!

Quicksort

5 3 7 4 8 6 2 1

Recursive Insight

• Select “pivot” (here the first element, 5)

• Partition array so:
– everything smaller than pivot is on left
– everything greater than or equal to pivot is on right
– pivot is in-between

2 3 1 4 5 6 8 7

• Now recursively sort the “red” sub-array (left of the pivot: 2 3 1 4)

1 2 3 4 5 6 8 7

• Then recursively sort the “blue” sub-array (right of the pivot: 6 8 7)

1 2 3 4 5 6 7 8

Everything is sorted!

void Quicksort(int arr[], int n)
{
    if (n < 2) return;

    int boundary = Partition(arr, n);

    // Sort subarray up to pivot
    Quicksort(arr, boundary);

    // Sort subarray after pivot to end
    Quicksort(arr + boundary + 1, n - boundary - 1);
}

• “boundary” is the index of the pivot
• This is equal to the number of elements before pivot

int Partition(int arr[], int n)
{
    int lh = 1, rh = n - 1;

    int pivot = arr[0];
    while (true) {
        while (lh < rh && arr[rh] >= pivot) rh--;
        while (lh < rh && arr[lh] < pivot) lh++;
        if (lh == rh) break;
        Swap(arr[lh], arr[rh]);
    }
    if (arr[lh] >= pivot) return 0;
    Swap(arr[0], arr[lh]);
    return lh;
}

Tracing Partition on 5 3 7 4 8 6 2 1 (pivot = arr[0] = 5):

5 3 7 4 8 6 2 1    start: lh = 1, rh = 7
5 3 7 4 8 6 2 1    rh stays at 7 (1 < pivot); lh advances to 2 (7 >= pivot)
5 3 1 4 8 6 2 7    Swap(arr[2], arr[7])
5 3 1 4 8 6 2 7    rh moves down to 6 (2 < pivot); lh advances to 4 (8 >= pivot)
5 3 1 4 2 6 8 7    Swap(arr[4], arr[6])
5 3 1 4 2 6 8 7    rh moves down to 4 and meets lh; loop ends
2 3 1 4 5 6 8 7    arr[4] < pivot, so Swap(arr[0], arr[4])

Returns 4 (index of pivot)
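To replay this trace on a machine, the two routines can be compiled as a standalone sketch. The logic follows the slides; the only assumption is that std::swap replaces the Swap helper, which the slides use but do not define.

```cpp
#include <cassert>
#include <utility>

// Partition arr[0..n-1] around pivot arr[0]; returns the pivot's final index.
int Partition(int arr[], int n)
{
    assert(n >= 2);                  // Quicksort only calls this when n >= 2
    int lh = 1, rh = n - 1;
    int pivot = arr[0];
    while (true) {
        while (lh < rh && arr[rh] >= pivot) rh--;   // skip large values on the right
        while (lh < rh && arr[lh] < pivot) lh++;    // skip small values on the left
        if (lh == rh) break;
        std::swap(arr[lh], arr[rh]);                // out-of-place pair: exchange
    }
    if (arr[lh] >= pivot) return 0;  // nothing is smaller than the pivot
    std::swap(arr[0], arr[lh]);      // drop the pivot between the two groups
    return lh;
}

void Quicksort(int arr[], int n)
{
    if (n < 2) return;
    int boundary = Partition(arr, n);
    Quicksort(arr, boundary);                         // elements before the pivot
    Quicksort(arr + boundary + 1, n - boundary - 1);  // elements after the pivot
}
```

Calling Partition on {5, 3, 7, 4, 8, 6, 2, 1} returns 4 and leaves the array as 2 3 1 4 5 6 8 7, matching the trace; a following call to Quicksort finishes with 1 through 8 in order.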

void Quicksort(int arr[], int n)
{
    if (n < 2) return;                                  // O(1)

    int boundary = Partition(arr, n);                   // O(n)

    // Sort subarray up to pivot
    Quicksort(arr, boundary);                           // T(n/2)

    // Sort subarray after pivot to end
    Quicksort(arr + boundary + 1, n - boundary - 1);    // T(n/2)
}

T(n) = O(1) + O(n) + 2T(n/2)
     = O(n) + 2T(n/2)

Same as MergeSort: O(n log n), assuming the pivot splits the array roughly in half each time. If the pivot is always the smallest or largest element (for example, a first-element pivot on already sorted input), the recursion degrades to O(n^2).

The whole recursion

5 3 7 4 8 6 2 1    Initial array
2 3 1 4 5 6 8 7    First partition
2 3 1 4 5 6 8 7    Recursive sort {2, 3, 1, 4}
1 2 3 4 5 6 8 7    Partition {2, 3, 1, 4}
1 2 3 4 5 6 8 7    Recursive sort {1}: base case
1 2 3 4 5 6 8 7    Recursive sort {3, 4}
1 2 3 4 5 6 8 7    Partition {3, 4}
1 2 3 4 5 6 8 7    Recursive sort {4}: base case
1 2 3 4 5 6 8 7    Recursive sort {6, 8, 7}
1 2 3 4 5 6 7 8    Leap of faith!

Empirical comparison of MergeSort vs QuickSort

N         Merge sort      Quicksort
10        0.54 msec       0.10 msec
20        1.17 msec       0.26 msec
40        2.54 msec       0.52 msec
100       6.90 msec       1.76 msec
200       14.84 msec      4.04 msec
400       31.25 msec      8.85 msec
1000      84.38 msec      26.04 msec
2000      179.17 msec     56.25 msec
4000      383.33 msec     129.17 msec
10,000    997.67 msec     341.67 msec
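A comparison like this can be reproduced with a small timing harness. The sketch below is one possible setup, not the method used to produce the table: TimeSortMsec and StdSortStandIn are hypothetical names, the input is pseudo-random with a fixed seed, and std::sort is only a stand-in so the block compiles by itself. To compare the lecture's algorithms, pass functions with the signature (int *arr, int n), such as MergeSort or Quicksort, in its place.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>
#include <vector>

// Time one call of sortFn on n pseudo-random ints; returns milliseconds.
// sortFn is any callable that accepts (int *arr, int n).
template <typename SortFn>
double TimeSortMsec(SortFn sortFn, int n, unsigned seed = 12345)
{
    assert(n >= 0);
    std::mt19937 gen(seed);                          // fixed seed: repeatable input
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<int> data(n);
    for (int &x : data) x = dist(gen);

    auto start = std::chrono::steady_clock::now();
    sortFn(data.data(), n);
    auto stop = std::chrono::steady_clock::now();

    assert(std::is_sorted(data.begin(), data.end()));  // sanity check on the result
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in sort so the sketch is self-contained (NOT the lecture's code).
void StdSortStandIn(int *arr, int n)
{
    std::sort(arr, arr + n);
}
```

A call such as TimeSortMsec(StdSortStandIn, 10000) returns the elapsed time for one run. Absolute numbers depend heavily on hardware and compiler settings, so only the ratios between algorithms on the same machine are meaningful.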
