
Upload: alexina-bradley

Post on 18-Dec-2015


ADSA: Sorting/3 1

241-423 Advanced Data Structures and Algorithms

• Objective

– examine popular sorting algorithms, with an emphasis on divide and conquer

Semester 2, 2013-2014

3. Sorting Algorithms

Contents

1. Insertion Sort

2. Divide and Conquer Algorithms

3. Merge Sort

4. Quicksort

5. Comparison of Sorting Algorithms

6. Finding the kth Largest Element

1. Insertion Sort

• Each pass inserts an element (x) into a sorted sublist (sub-array) on the left.

• Items larger than x move to the right to make room for its insertion.

Insertion Sort Diagram

Outline Algorithm

• Assume the first array element is in the right position.

• In the ith pass (1 ≤ i ≤ n-1), the elements in the range 0 to i-1 are already sorted.

• Insert the target, arr[i], into its correct position j by moving the elements in the range [j, i-1] one place to the right, which frees arr[j].

Simple Insertion Sort

public static void insertion_srt(int[] arr) {
    int n = arr.length;
    for (int i = 1; i < n; i++) {
        int j = i;
        int target = arr[i];   // element to insert in the ith pass
        while ((j > 0) && (arr[j-1] > target)) {
            arr[j] = arr[j-1];   // move the larger element right
            j--;
        }
        arr[j] = target;   // j is now the insertion position
    }
}

insertionSort()

public static <T extends Comparable<? super T>>
void insertionSort(T[] arr) {
    int n = arr.length;
    for (int i = 1; i < n; i++) {
        int j = i;
        T target = arr[i];
        // shift elements larger than target one place right
        while (j > 0 && target.compareTo(arr[j-1]) < 0) {
            arr[j] = arr[j-1];
            j--;
        }
        arr[j] = target;
    }
}  // end of insertionSort()

Insertion Sort Efficiency

• Best case running time is O(n) – when the array is already sorted

• The worst and average case running times are O(n²).

• Insertion sort is very efficient when the array is "almost sorted".
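
The gap between the best and worst cases can be seen by counting how many elements insertion sort shifts. The following is a minimal sketch, not part of the original slides: the class InsertionShiftCount and the method sortAndCountShifts() are hypothetical names, and the sort works on a copy of an int array.

```java
final class InsertionShiftCount {
    // Insertion sort on a copy of the input; returns the number of
    // element shifts (moves to the right) that were needed.
    static long sortAndCountShifts(int[] input) {
        int[] arr = input.clone();
        long shifts = 0;
        for (int i = 1; i < arr.length; i++) {
            int target = arr[i];
            int j = i;
            while (j > 0 && arr[j - 1] > target) {
                arr[j] = arr[j - 1];
                j--;
                shifts++;
            }
            arr[j] = target;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] sorted = new int[n];
        int[] reversed = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;        // already in ascending order
            reversed[i] = n - i;  // descending order
        }
        System.out.println("sorted:   " + sortAndCountShifts(sorted));
        System.out.println("reversed: " + sortAndCountShifts(reversed));
    }
}
```

A sorted input needs no shifts at all, while a reversed input of n elements needs n(n-1)/2 of them, which is where the O(n²) bound comes from.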

2. Divide and Conquer Algorithms

• Divide a problem into smaller versions of the same problem, using recursion.

• Solve the smaller versions.

• Combine the solutions of the smaller versions to get an answer for the big, original problem.

Examples

• Binary search

• Merge sort and quicksort (here)

• Binary tree traversal
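
Binary search is probably the smallest example of the pattern: the range is divided at its midpoint and only one half is searched further. The sketch below is an illustration only; the class BinarySearchSketch and its search() method are hypothetical names, and the array is assumed to be sorted in ascending order.

```java
final class BinarySearchSketch {
    // Returns the index of target in the sorted range [first, last) of arr,
    // or -1 if target is not in that range.
    static int search(int[] arr, int first, int last, int target) {
        if (first >= last) {
            return -1;                       // empty range: stopping case
        }
        int mid = first + (last - first) / 2;
        if (arr[mid] == target) {
            return mid;
        }
        if (target < arr[mid]) {
            return search(arr, first, mid, target);     // lower half
        }
        return search(arr, mid + 1, last, target);      // upper half
    }

    public static void main(String[] args) {
        int[] arr = {2, 5, 8, 13, 21};
        System.out.println(search(arr, 0, arr.length, 13));
        System.out.println(search(arr, 0, arr.length, 7));
    }
}
```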

3. Merge Sort

• Sort an array with n elements by splitting it into two halves. Keep splitting in half recursively.

• Stop splitting when a sublist has a single element, since a one-element sublist is already sorted.

• Merge the sorted sublists back together, level by level, into a single sorted array.

Merge Sort Diagram

General Sort Methods

• Ford and Topp's Arrays class provides two versions of the merge sort algorithm.

– one version takes an Object array arr[] as input;

– the other version is generic and specifies arr[] as an array of type T

• Both methods call msort() to carry out the merge sort.

sort() - with Object Array

public static void sort(Object[] arr) {
    // create a temporary array
    Object[] tempArr = arr.clone();

    // sort the entire array (the index range [0, arr.length))
    msort(arr, tempArr, 0, arr.length);
}

sort() - Generic Version

public static <T extends Comparable<? super T>>
void sort(T[] arr) {
    // create a temporary array
    T[] tempArr = (T[])arr.clone();

    msort(arr, tempArr, 0, arr.length);
}

msort()

• Split into two lists by computing the midpoint of the index range:

int midpt = (last + first)/2;

• Call msort() recursively on the index range [first, midpt) and on the index range [midpt, last).

• When the resulting lists are small, start merging them back together into sorted order.
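
To see which index ranges the recursion visits, the splitting step can be run on its own. This is a rough sketch under the same half-open range convention as the slides; the class SplitTraceSketch and the method split() are hypothetical names, and no sorting or merging is done here.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitTraceSketch {
    // Records every half-open range [first, last) visited while splitting,
    // in the order the recursive calls are made.
    static void split(int first, int last, List<String> ranges) {
        ranges.add("[" + first + ", " + last + ")");
        if (first + 1 < last) {              // more than one element
            int midpt = (last + first) / 2;
            split(first, midpt, ranges);     // left half
            split(midpt, last, ranges);      // right half
        }
    }

    public static void main(String[] args) {
        List<String> ranges = new ArrayList<>();
        split(0, 5, ranges);
        for (String r : ranges) {
            System.out.println(r);
        }
    }
}
```

For a five-element array the first split is at midpt = 2, so the left half [0, 2) is explored completely before the right half [2, 5) is touched.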

Tracing msort()

(Diagram in two phases: split, then merge.)

msort()

private static void msort(Object[] arr, Object[] tempArr,
                          int first, int last) {
    // if sublist has more than 1 elem.
    if ((first + 1) < last) {
        int midpt = (last + first)/2;

        msort(arr, tempArr, first, midpt);
        msort(arr, tempArr, midpt, last);

        // if arr[] is now sorted, finish
        if (((Comparable)arr[midpt-1]).compareTo(arr[midpt]) <= 0)
            return;

        // indexA scans arr[] in range [first, midpt)
        int indexA = first;

        // indexB scans arr[] in range [midpt, last)
        int indexB = midpt;

        int indexC = first;   // for merged temp list

        /* while both sublists are not finished, compare
           arr[indexA] and arr[indexB]; copy the smaller
           into the temp list */
        while (indexA < midpt && indexB < last) {
            if (((Comparable)arr[indexA]).compareTo(arr[indexB]) < 0) {
                tempArr[indexC] = arr[indexA];
                indexA++;
            }
            else {
                tempArr[indexC] = arr[indexB];
                indexB++;
            }
            indexC++;
        }

        // copy over what's left of sublist A
        while (indexA < midpt) {
            tempArr[indexC] = arr[indexA];
            indexA++;
            indexC++;
        }

        // copy over what's left of sublist B
        while (indexB < last) {
            tempArr[indexC] = arr[indexB];
            indexB++;
            indexC++;
        }

        // copy temp array back to arr[]
        for (int i = first; i < last; i++)
            arr[i] = tempArr[i];
    }
}  // end of msort()

msort() Notes

• Continue splitting only as long as first+1 < last, i.e. while the sublist has more than one element.

• Skip the merge if arr[midpt-1] ≤ arr[midpt]: the two sorted halves are then already in order.

Recursion Tree for Merge Sort

Efficiency of Merge Sort

• Total number of comparisons = no. of levels * no. of comparisons at a level

• msort() starts with a list of size n

• msort() recurses until the sublist size is 1

• Each level roughly halves the sublist size:

– n, n/2, n/4, ..., 1

– no. of levels = log₂ n (roughly)

• No. of msort() calls at a level:

– at level 0: 1 msort() call

– at level 1: 2 calls

– at level 2: 4 calls

– ...

– at level i: 2^i calls

• No. of comparisons in 1 msort() call at a level:

– at level 0: a msort() call compares n elements

– at level 1: n/2 comparisons

– at level 2: n/4 comparisons

– ...

– at level i: n/2^i comparisons

• Total no. of comparisons at a level:

– no. of calls at a level * comparisons in 1 msort() call

– 2^i * n/2^i = n

• Total number of comparisons = no. of levels * no. of comparisons at a level

= log₂ n * n

• So the worst case running time is O(n log₂ n).
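
The estimate can be checked empirically by counting the comparisons made during merging. The following is a sketch under stated assumptions rather than the Ford and Topp code: it sorts an int array, omits the "already sorted" early exit, takes from the left sublist on ties, and uses the hypothetical names MergeSortCountSketch, countFor() and comparisons.

```java
import java.util.Random;

final class MergeSortCountSketch {
    static long comparisons;   // illustrative counter, reset by countFor()

    static void msort(int[] arr, int[] temp, int first, int last) {
        if (first + 1 >= last) {
            return;                              // 0 or 1 element
        }
        int midpt = (first + last) / 2;
        msort(arr, temp, first, midpt);
        msort(arr, temp, midpt, last);

        int a = first, b = midpt, c = first;
        while (a < midpt && b < last) {
            comparisons++;                       // one element comparison
            if (arr[a] <= arr[b]) {
                temp[c++] = arr[a++];
            } else {
                temp[c++] = arr[b++];
            }
        }
        while (a < midpt) temp[c++] = arr[a++];  // rest of left sublist
        while (b < last)  temp[c++] = arr[b++];  // rest of right sublist
        System.arraycopy(temp, first, arr, first, last - first);
    }

    // Sorts arr in place and returns the number of comparisons used.
    static long countFor(int[] arr) {
        comparisons = 0;
        msort(arr, new int[arr.length], 0, arr.length);
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1 << 16;                         // 65536, so log2(n) = 16
        int[] arr = new Random().ints(n, 0, 1000000).toArray();
        System.out.println("comparisons = " + countFor(arr));
        System.out.println("n * log2(n) = " + (long) n * 16);
    }
}
```

The measured count stays below n * log₂ n because a merge stops comparing as soon as one of its two sublists is used up.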

4. Quicksort

• Uses a divide-and-conquer strategy like merge sort.

• But, unlike merge sort, quicksort is an in-place sorting algorithm

– elements are exchanged within the list without the need for temporary lists/arrays

– space efficient

Quicksort Steps

• Pick an element, called a pivot, from the list.

• Reorder the list so that all elements which are less than the pivot come before the pivot and all elements greater than the pivot come after it.

• Recursively call quicksort on the sublist of lesser elements and the sublist of greater elements.

• The stopping cases for the recursion are lists of size zero or one, which are always sorted.
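
The steps above can be written down almost literally if the in-place requirement is dropped for a moment. This sketch is for illustration only: it builds new lists instead of exchanging elements within one array, so it is not the in-place algorithm the slides develop, and QuicksortStepsSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class QuicksortStepsSketch {
    static List<Integer> quicksort(List<Integer> list) {
        if (list.size() <= 1) {
            return list;                     // stopping case: already sorted
        }
        int pivot = list.get(list.size() / 2);   // pick a pivot
        List<Integer> less = new ArrayList<>();
        List<Integer> equal = new ArrayList<>();
        List<Integer> greater = new ArrayList<>();
        for (int x : list) {                 // reorder around the pivot
            if (x < pivot) {
                less.add(x);
            } else if (x > pivot) {
                greater.add(x);
            } else {
                equal.add(x);
            }
        }
        // recursively sort the two sublists and join the pieces
        List<Integer> result = new ArrayList<>(quicksort(less));
        result.addAll(equal);
        result.addAll(quicksort(greater));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(quicksort(Arrays.asList(5, 2, 9, 2, 7, 1)));
    }
}
```

Keeping the elements equal to the pivot in their own list guarantees that both recursive calls work on strictly shorter lists.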

Quicksort Diagram

Partitioning a List

• The pivot is the element at index mid = (first + last)/2.

• Separate the elements of arr[] into two sublists, Sl and Sh.

– Sl contains the elements ≤ pivot (l = low)

– Sh contains the elements ≥ pivot (h = high)

• Exchange arr[first] and arr[mid]

• Scan the list with index range [first+1, last)

– scanUp starts at first+1 and moves up the list, finding elements for Sl.

– scanDown starts at position last-1 and moves down the list, finding elements for Sh.

• When arr[scanUp] ≥ pivot and arr[scanDown] ≤ pivot, the two elements are in the wrong sublists.

• Exchange the elements at the two positions and then resume scanning.

• scanUp and scanDown move toward each other until they meet or pass one another (scanDown ≤ scanUp).

• scanDown is at the place where the pivot should appear

– exchange arr[first] and arr[scanDown] to correctly position the pivot

pivotIndex()

• The method

public static <T extends Comparable<? super T>>
int pivotIndex(T[] arr, int first, int last)

takes array arr and index range [first, last) and returns the index of the pivot after partitioning arr[].

public static <T extends Comparable<? super T>>
int pivotIndex(T[] arr, int first, int last) {
    int mid;    // index for the midpoint
    T pivot;

    if (first == last)           // empty sublist
        return last;
    else if (first == (last-1))  // 1-element sublist
        return first;
    else {
        mid = (last + first)/2;
        pivot = arr[mid];

        // exchange pivot and bottom end of range
        arr[mid] = arr[first];
        arr[first] = pivot;

        int scanUp = first + 1;   // scanning indices
        int scanDown = last - 1;

        while (true) {
            /* move up the lower sublist while scanUp is
               less than or equal to scanDown and the
               array value is less than pivot */
            while ((scanUp <= scanDown) &&
                   (arr[scanUp].compareTo(pivot) < 0))
                scanUp++;

            /* move down upper sublist while array value
               is greater than the pivot */
            while (pivot.compareTo(arr[scanDown]) < 0)
                scanDown--;

            /* if indices are not in their sublists,
               partition is complete */
            if (scanUp >= scanDown)
                break;

            // found two elements in wrong sublists; exchange
            T temp = arr[scanUp];
            arr[scanUp] = arr[scanDown];
            arr[scanDown] = temp;

            scanUp++;
            scanDown--;
        }

        // copy pivot to index posn (scanDown) that
        // partitions the sublists
        arr[first] = arr[scanDown];
        arr[scanDown] = pivot;

        return scanDown;
    }
}  // end of pivotIndex()
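
A quick way to get a feel for pivotIndex() is to run it once and inspect the array. The sketch below is an int[] adaptation of the same scanUp/scanDown scheme, written so that it compiles on its own; the class name PartitionSketch and the sample values are assumptions made for the illustration.

```java
import java.util.Arrays;

final class PartitionSketch {
    // int[] adaptation of the scheme above for the range [first, last);
    // returns the final index of the pivot.
    static int pivotIndex(int[] arr, int first, int last) {
        if (first == last) return last;          // empty sublist
        if (first == last - 1) return first;     // 1-element sublist

        int mid = (last + first) / 2;
        int pivot = arr[mid];
        arr[mid] = arr[first];                   // park pivot at the bottom
        arr[first] = pivot;

        int scanUp = first + 1;
        int scanDown = last - 1;
        while (true) {
            while (scanUp <= scanDown && arr[scanUp] < pivot) scanUp++;
            while (pivot < arr[scanDown]) scanDown--;
            if (scanUp >= scanDown) break;       // indices met or crossed
            int temp = arr[scanUp];
            arr[scanUp] = arr[scanDown];
            arr[scanDown] = temp;
            scanUp++;
            scanDown--;
        }
        arr[first] = arr[scanDown];              // move pivot into place
        arr[scanDown] = pivot;
        return scanDown;
    }

    public static void main(String[] args) {
        int[] arr = {800, 150, 300, 650, 550, 500, 400, 350, 450, 900};
        int p = pivotIndex(arr, 0, arr.length);
        System.out.println("pivot index = " + p + ", pivot = " + arr[p]);
        System.out.println(Arrays.toString(arr));
    }
}
```

After the call, every element to the left of the returned index is ≤ the pivot and every element to its right is ≥ the pivot, but neither side is sorted yet.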

quicksort()

public static <T extends Comparable<? super T>>
void quicksort(T[] arr) {
    qsort(arr, 0, arr.length);
}

• quicksort() sorts a generic array arr[] by calling qsort() with the index range [0, arr.length).

qsort()

• Recursively partition the elements in the index range into smaller and smaller sublists, terminating when the size of a list is 0 or 1.

• For efficiency, handle a list of size 2 by comparing the elements and exchanging them if necessary.

• For larger lists, call pivotIndex() to reorder the elements and determine the pivot.

• Make two calls to qsort():

– the first call specifies the index range for the lower sublist

– the second call specifies the index range for the upper sublist

qsort() Diagram

private static <T extends Comparable<? super T>>
void qsort(T[] arr, int first, int last) {
    // if range is less than two elements
    if ((last - first) <= 1)
        return;

    // if sublist has two elements
    else if ((last - first) == 2) {
        /* compare arr[first] and arr[last-1] and
           exchange if necessary */
        if (arr[last-1].compareTo(arr[first]) < 0) {
            T temp = arr[last-1];
            arr[last-1] = arr[first];
            arr[first] = temp;
        }
        return;
    }
    else {
        int pivotLoc = pivotIndex(arr, first, last);
        qsort(arr, first, pivotLoc);       // lower sublist
        qsort(arr, pivotLoc +1, last);     // upper sublist
    }
}  // end of qsort()

Running Time of Quicksort

• The average case running time is O(n log₂ n).

• With the midpoint pivot used here, the best case occurs when the array is already sorted: each pivot is the median of its sublist, so every partition splits the sublist in half.

• Quicksort is also efficient when the array is in descending order.

• The worst case occurs when the chosen pivot is always the largest or smallest element in its sublist.

– the running time is O(n²)

– highly unlikely
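
How much the pivot choice matters can be seen by measuring the recursion depth on an already sorted array. This is a sketch under stated assumptions: it uses a simpler single-scan (Lomuto-style) partition rather than the scanUp/scanDown scheme of pivotIndex(), and the names PivotChoiceSketch and sortDepth() are hypothetical.

```java
final class PivotChoiceSketch {
    // Quicksorts arr[first, last) and returns the depth of the recursion.
    // The pivot is the midpoint element if useMidpoint is true, otherwise
    // the first element of the range.
    static int sortDepth(int[] arr, int first, int last, boolean useMidpoint) {
        if (last - first <= 1) {
            return 0;
        }
        int p = useMidpoint ? (first + last) / 2 : first;
        int pivot = arr[p];
        arr[p] = arr[first];
        arr[first] = pivot;

        int boundary = first;   // arr[first+1 .. boundary] holds values < pivot
        for (int i = first + 1; i < last; i++) {
            if (arr[i] < pivot) {
                boundary++;
                int t = arr[i];
                arr[i] = arr[boundary];
                arr[boundary] = t;
            }
        }
        arr[first] = arr[boundary];
        arr[boundary] = pivot;

        int left = sortDepth(arr, first, boundary, useMidpoint);
        int right = sortDepth(arr, boundary + 1, last, useMidpoint);
        return 1 + Math.max(left, right);
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
        }
        System.out.println("midpoint pivot depth: "
                + sortDepth(sorted.clone(), 0, n, true));
        System.out.println("first-element pivot depth: "
                + sortDepth(sorted.clone(), 0, n, false));
    }
}
```

On sorted input the first-element pivot is always the smallest value in its range, so each call peels off a single element and the depth grows linearly with n; the midpoint pivot keeps the depth close to log₂ n.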

5. Comparison of Sorting Algorithms

• An inversion in an array, arr[], is an ordered pair (arr[i], arr[j]), i < j, where arr[i] > arr[j].

• When sorting in ascending order, arr[i] and arr[j] are out of order.
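
The definition translates directly into a brute-force count over all pairs i < j. A minimal sketch, with the hypothetical names InversionCountSketch and countInversions():

```java
final class InversionCountSketch {
    // Counts the pairs (i, j) with i < j and arr[i] > arr[j].
    static long countInversions(int[] arr) {
        long count = 0;
        for (int i = 0; i < arr.length; i++) {
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[i] > arr[j]) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countInversions(new int[]{3, 1, 2}));    // (3,1) and (3,2)
        System.out.println(countInversions(new int[]{1, 2, 3, 4})); // sorted: none
    }
}
```

A sorted array has no inversions, and an array in descending order has the maximum possible number, n(n-1)/2.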

• The O(n²) sorting algorithms compare adjacent elements, and generally remove one inversion with each iteration

– e.g. selection and insertion sort

• The O(n log₂ n) sorting algorithms compare non-adjacent elements, and generally remove more than one inversion with each iteration.

– e.g. quicksort and merge sort

Timing Sorts

import java.util.Random;
import ds.util.Arrays;
import ds.time.Timing;

public class TimingSorts
{
    public static void main(String[] args)
    {
        final int SIZE = 75000;
        Integer[] arr1 = new Integer[SIZE],
                  arr2 = new Integer[SIZE],
                  arr3 = new Integer[SIZE];
        Random rnd = new Random();

        /* load each array with the same sequence of
           random numbers in the range 0 to 999999 */
        int rndNum;
        for (int i=0; i < SIZE; i++) {
            rndNum = rnd.nextInt(1000000);
            arr1[i] = arr2[i] = arr3[i] = rndNum;
        }

        // call timeSort() for each sort
        timeSort(arr1, 0, "Merge sort");
        timeSort(arr2, 1, "Quick sort");
        timeSort(arr3, 2, "Insertion sort");
    }  // end of main()

    public static <T extends Comparable<? super T>>
    void timeSort(T[] arr, int sortType, String sortName)
    {
        Timing t = new Timing();
        t.start();

        if (sortType == 0)
            Arrays.sort(arr);        // merge sort in F&T
        else if (sortType == 1)
            Arrays.quicksort(arr);
        else
            Arrays.insertionSort(arr);

        double timeRequired = t.stop();

        outputFirst_Last(arr);
        System.out.print(" " + sortName + " time is "
                             + timeRequired + "\n\n");
    }  // end of timeSort()

    public static void outputFirst_Last(Object[] arr)
    // output first 3 elements and last 3 elements
    {
        int n = arr.length;
        for (int i=0; i < 3; i++)
            System.out.print(arr[i] + " ");

        System.out.print(". . . ");

        for (int i=n-3; i < n; i++)
            System.out.print(arr[i] + " ");

        System.out.println();
    }
}  // end of TimingSorts

Output

26 38 47 . . . 999980 999984 999984
 Merge sort time is 0.109

26 38 47 . . . 999980 999984 999984
 Quick sort time is 0.078

26 38 47 . . . 999980 999984 999984
 Insertion sort time is 100.611

Merge sort and quicksort are O(n log₂ n); insertion sort is O(n²).
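
The ds.util and ds.time packages come from Ford and Topp's library, so the program above will not compile without it. As a rough stand-in, the same kind of measurement can be made with the standard library alone. The sketch below assumes System.nanoTime() for timing and java.util.Arrays.sort() as the fast sort (not the Ford and Topp Arrays class); NanoTimingSketch and timeSeconds() are hypothetical names, and the timings it prints will differ from machine to machine.

```java
import java.util.Arrays;
import java.util.Random;

final class NanoTimingSketch {
    static void insertionSort(int[] arr) {
        for (int i = 1; i < arr.length; i++) {
            int target = arr[i];
            int j = i;
            while (j > 0 && arr[j - 1] > target) {
                arr[j] = arr[j - 1];
                j--;
            }
            arr[j] = target;
        }
    }

    // Runs the task once and returns the elapsed wall-clock time in seconds.
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        final int SIZE = 20000;
        int[] a = new Random().ints(SIZE, 0, 1000000).toArray();
        int[] b = a.clone();   // same data for both sorts

        double libraryTime = timeSeconds(() -> Arrays.sort(a));
        double insertionTime = timeSeconds(() -> insertionSort(b));

        System.out.println("java.util.Arrays.sort time is " + libraryTime);
        System.out.println("Insertion sort time is " + insertionTime);
    }
}
```

A single run like this is only a coarse measurement, since just-in-time compilation and garbage collection can distort short timings.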

6. Finding the kth Largest Element

• Sort the array and then access the element at position k.

– running time is O(n log₂ n) if we use quicksort or merge sort

• For a more efficient solution, locate the position of the kth-largest value by partitioning the elements into two sublists.

• The lower sublist contains k elements that are ≤ the kth-largest.

• The upper sublist contains elements that are ≥ the kth-largest.

• The elements in the sublists do not need to be ordered.

index:   0 ... k-1               k              k+1 ... n-1
value:   values ≤ kth-largest    kth-largest    values ≥ kth-largest

• Use the pivoting technique from the quicksort algorithm to create a partition.

• The algorithm is recursive:

– index = pivotIndex()

– if index == k, done; return arr[index]

– otherwise, repeat on the range [first, index) if k < index, or on the range [index+1, last) if k > index

• Only one of the two sublists is examined at each step.

public static <T extends Comparable<? super T>>
T findKth(T[] arr, int first, int last, int k) {
    // k must lie inside the index range [first, last)
    if (k < first || k >= last)
        return null;

    // partition range [first, last) in arr about the
    // pivot arr[index]
    int index = pivotIndex(arr, first, last);

    // if index == k, we are done. kth largest is arr[k]
    if (index == k)
        return arr[index];   // return array value
    else if (k < index)
        // search in lower sublist [first, index)
        return findKth(arr, first, index, k);
    else
        // search in upper sublist [index+1, last)
        return findKth(arr, index+1, last, k);
}
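
Because only one sublist is followed, the recursion can also be written as a loop that narrows the range. The following is a minimal sketch under stated assumptions: it works on a copy of an int array, uses a simpler single-scan partition instead of pivotIndex(), and the names QuickselectSketch, selectKth() and partition() are hypothetical.

```java
final class QuickselectSketch {
    // Returns the value that would sit at index k (0-based) if the input
    // were sorted in ascending order. The input array is not modified.
    static int selectKth(int[] input, int k) {
        if (k < 0 || k >= input.length) {
            throw new IllegalArgumentException("k out of range: " + k);
        }
        int[] arr = input.clone();
        int first = 0;
        int last = arr.length;               // current range [first, last)
        while (true) {
            int index = partition(arr, first, last);
            if (index == k) {
                return arr[index];
            }
            if (k < index) {
                last = index;                // continue in lower sublist
            } else {
                first = index + 1;           // continue in upper sublist
            }
        }
    }

    // Partitions arr[first, last) about its midpoint element and returns
    // the final index of that pivot.
    static int partition(int[] arr, int first, int last) {
        int mid = first + (last - first) / 2;
        int pivot = arr[mid];
        arr[mid] = arr[first];
        arr[first] = pivot;
        int boundary = first;                // arr[first+1 .. boundary] < pivot
        for (int i = first + 1; i < last; i++) {
            if (arr[i] < pivot) {
                boundary++;
                int t = arr[i];
                arr[i] = arr[boundary];
                arr[boundary] = t;
            }
        }
        arr[first] = arr[boundary];
        arr[boundary] = pivot;
        return boundary;
    }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 1};
        System.out.println(selectKth(data, 2));
    }
}
```

The range [first, last) always contains index k, so it never becomes empty and the loop is guaranteed to end.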

Running Time of findKth()

• The average running time is O(n)

– no. of comparisons = n + n/2 + n/4 + n/8 + ... ≤ 2n, assuming each partition roughly halves the range

• This is faster than the O(n log₂ n) needed to sort the array first

– this is to be expected since findKth() only uses one of its sublists at each recursive call, compared to quicksort or merge sort which use both
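
The bound of about 2n is the sum of a geometric series. As a sketch, under the assumption that every partition exactly halves the range and that the recursion stops after m further levels:

```latex
\[
  n + \frac{n}{2} + \frac{n}{4} + \cdots + \frac{n}{2^{m}}
  \;=\; n \sum_{i=0}^{m} \frac{1}{2^{i}}
  \;=\; 2n \left( 1 - \frac{1}{2^{m+1}} \right)
  \;<\; 2n
\]
```

The sum approaches 2n as m grows but never exceeds it, which gives the O(n) result. If the pivots are consistently poor, the ranges shrink by only one element at a time and the cost rises to O(n²), as for quicksort.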