ITEC 2620A Introduction to Data Structures
Instructor: Prof. Z. Yang
Course Website: http://people.math.yorku.ca/~zyang/itec2620a.htm
Office: TEL 3049

Upload: katherine-charles

Post on 29-Dec-2015



Sorting


Key Points

• Non-recursive sorting algorithms
  – Selection Sort
  – Insertion Sort
  – Best, Average, and Worst cases


Sorting

• Why is sorting important?
  – Easier to search sorted data sets
  – Searching and sorting are primary problems of computer science
• Sorting in general
  – Arrange a set of elements by their “keys” in increasing/decreasing order.
• Example: How would we sort a deck of cards?


Selection Sort

• Find the smallest unsorted value, and move it into position.
• What do you do with what was previously there?
  – “swap” it to where the smallest value was
  – sorting can work on the original array (in place)
• Example


Pseudocode

• Loop through all elements (have to put n elements into place)
  – for loop
• Loop through all remaining elements and find smallest
  – initialization, for loop, and branch
• Swap smallest element into correct place
  – single method
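
The three steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the course's official solution; the class name SelectionSortSketch and the helper name swap are assumptions made for this example.

```java
// A sketch of selection sort on an int array (names are illustrative).
class SelectionSortSketch {

    // Sorts the array in place, in increasing order.
    static void selectionSort(int[] a) {
        // Outer loop: put one element into its final place per pass.
        // The last element falls into place automatically, so n - 1 passes suffice.
        for (int i = 0; i < a.length - 1; i++) {
            // Initialization: assume the first unsorted element is the smallest.
            int smallest = i;
            // Inner loop and branch: scan the remaining elements for a smaller one.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            // Swap the smallest element into position i.
            swap(a, i, smallest);
        }
    }

    // Exchanges the values stored at positions i and j.
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Calling SelectionSortSketch.selectionSort(data) rearranges data itself, so no second array is needed.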


Insertion Sort

• Get the next value, and push it over until it is “semi” sorted.
  – elements in selection sort are in their final position
  – elements in insertion sort can still move
    • which do you use to organize a pile of paper?
• Example


Pseudocode

• Loop through all elements (have to put n elements into place)
  – for loop
• Loop through all sorted elements, and swap until slot is found
  – while loop
  – swap method
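
A possible Java rendering of this pseudocode is shown below. It is a sketch under the assumption that "swap until slot is found" means exchanging the new value with its left neighbour while that neighbour is larger; the class name InsertionSortSketch and the helper swap are illustrative.

```java
// A sketch of insertion sort on an int array (names are illustrative).
class InsertionSortSketch {

    // Sorts the array in place, in increasing order.
    static void insertionSort(int[] a) {
        // For loop: take the next value; a[0..i-1] is already sorted.
        for (int i = 1; i < a.length; i++) {
            int j = i;
            // While loop: swap the value leftwards until its slot is found.
            while (j > 0 && a[j - 1] > a[j]) {
                swap(a, j - 1, j);
                j--;
            }
        }
    }

    // Exchanges the values stored at positions i and j.
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Unlike selection sort, an element placed in an early pass can still be moved later when a smaller value arrives.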


Bubble Sort

• Slow and unintuitive…useless
• Relay race
  – pair-wise swaps until smaller value is found
  – smaller value is then swapped up
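
For comparison, here is the common textbook form of bubble sort in Java. It may not match the exact variant the slide describes (this version moves the largest remaining value to the end on each pass), and the class name BubbleSortSketch and the early-exit flag are assumptions of this sketch.

```java
// A sketch of the common textbook bubble sort (names are illustrative).
class BubbleSortSketch {

    // Sorts the array in place, in increasing order.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            boolean swapped = false;
            // Pair-wise compares: after this loop the largest remaining
            // value has moved to the end of the unsorted part.
            for (int j = 0; j < a.length - 1 - pass; j++) {
                if (a[j] > a[j + 1]) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                    swapped = true;
                }
            }
            // No swaps in a full pass means the array is already sorted.
            if (!swapped) {
                return;
            }
        }
    }
}
```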


Cost of Sorting

• Selection Sort
  – Is there a best, worst, and average case?
    • no: the two for loops always do the same work
  – n elements in outer loop
  – n-1, n-2, n-3, …, 2, 1 elements in inner loop
    • average n/2 elements for each pass of the outer loop
  – roughly n * n/2 compares, regardless of the input order
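
The n * n/2 estimate can be checked by adding up the inner-loop lengths directly. The derivation below is a worked version of the slide's argument, assuming one compare per inner-loop step:

```latex
\sum_{k=1}^{n-1} k \;=\; (n-1) + (n-2) + \dots + 2 + 1 \;=\; \frac{n(n-1)}{2} \;\approx\; \frac{n^2}{2}
```

For example, n = 10 gives 45 compares, close to the estimate 10 * 10/2 = 50.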


Cost of Sorting (Cont’d)

• Insertion Sort
  – Worst – same as selection sort, next element swaps till end
    • n * n/2 compares
  – Best – next element is already sorted, no swaps
    • n * 1 compares
  – Average – like a linear search, the next element travels through about 50% of the sorted values (n/2 on average), so about n/4 compares per element
    • n * n/4 compares
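
The best and worst cases can be observed by counting compares. The sketch below instruments a swap-based insertion sort; the class name InsertionSortCost and the method countCompares are hypothetical names introduced for this illustration.

```java
// Counts key comparisons made by a swap-based insertion sort (illustrative).
class InsertionSortCost {

    // Sorts a copy of the input and returns the number of compares performed.
    static long countCompares(int[] input) {
        int[] a = input.clone();   // leave the caller's array untouched
        long compares = 0;
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0) {
                compares++;        // one compare of a[j - 1] against a[j]
                if (a[j - 1] <= a[j]) {
                    break;         // slot found, stop moving left
                }
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                j--;
            }
        }
        return compares;
    }
}
```

With five elements, already-sorted input costs 4 compares (about n * 1), while reverse-sorted input costs 1 + 2 + 3 + 4 = 10 compares (about n * n/2).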


Value of Sorting

• Current cost of sorting is roughly n^2 compares
• Toronto phone book
  – 2 million records
  – 4 trillion compares to sort
• Linear search
  – 1 million compares (on average)
• Binary search
  – about 20 compares (log2 of 2 million is roughly 21)
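
The gap between the two searches can be reproduced with a small counting experiment. This is a sketch only: the class name SearchCost and both method names are assumptions, and the data is assumed to be a sorted int array.

```java
// Counts the work done by linear and binary search (illustrative names).
class SearchCost {

    // Returns how many elements linear search inspects before stopping.
    static int linearSearchCompares(int[] a, int key) {
        int compares = 0;
        for (int value : a) {
            compares++;
            if (value == key) {
                break;
            }
        }
        return compares;
    }

    // Returns how many middle elements binary search probes on sorted data.
    static int binarySearchCompares(int[] a, int key) {
        int lo = 0;
        int hi = a.length - 1;
        int probes = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids int overflow of lo + hi
            probes++;
            if (a[mid] == key) {
                break;
            } else if (a[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return probes;
    }
}
```

On a sorted array of 2,000,000 values, finding the middle record by linear search takes about 1,000,000 compares, while binary search never needs more than 21 probes.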


Trade-Offs

• Write a method, or re-implement each time?

• Buy a parking pass, or pay cash each time?

• Sort in advance, or do linear search each time?

• Trade-offs are an important part of program design
  – which component should you optimize?
  – is the cost of optimization worth the savings?


Complexity Analysis


Key Points

• Analysis of non-recursive algorithms
  – Estimation
  – Complexity Analysis
  – Big-Oh Notation


Factors in Cost Estimation

• Does the program’s execution depend on the input?
  – Math.max(a, b);
    • always processes two numbers, so it runs in constant time
  – maxValue(anArray);
    • processes n numbers, so its running time varies with array size
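
The slides do not show the body of maxValue, so the version below is an assumed implementation, given only to make the contrast with Math.max concrete. The class name MaxValueSketch and the empty-array check are additions for this sketch.

```java
// An assumed implementation of the slide's maxValue (illustrative).
class MaxValueSketch {

    // Returns the largest value in the array; inspects every element once.
    static int maxValue(int[] anArray) {
        if (anArray.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int max = anArray[0];
        // One Math.max call (constant time) per remaining element: n - 1 calls.
        for (int i = 1; i < anArray.length; i++) {
            max = Math.max(max, anArray[i]);
        }
        return max;
    }
}
```

Each Math.max call does a fixed amount of work, but the loop repeats it once per element, so the total work grows with the array length.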


Value of Cost Estimation

• Constant time programs
  – run once, always the same…
  – estimation not really required
• Variable time programs
  – run once
  – future runs depend on relative size of input
    • based on what function?


Cost Analysis

• Consider the following code:
    sum = 0;
    for (i=1; i<=n; i++)
        for (j=1; j<=n; j++)
            sum++;
• It takes longer when n is larger.
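
To see how much longer, the fragment can be wrapped in a method that returns the final value of sum. The class name NestedLoopCost and the method name countIncrements are assumptions made for this sketch.

```java
// Wraps the slide's nested loop so the amount of work can be inspected.
class NestedLoopCost {

    // Returns how many times sum++ executes for a given n.
    static long countIncrements(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                sum++;   // executed once per (i, j) pair, n * n times in total
            }
        }
        return sum;
    }
}
```

Here countIncrements(10) returns 100 and countIncrements(20) returns 400: doubling n makes the loop do four times the work.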


Asymptotic Analysis

• “What is the ultimate growth rate of an algorithm as a function of its input size?”

• “If the problem size doubles, approximately how much longer will it take?”

• Quadratic
• Linear (linear search)
• Logarithmic (binary search)
• Exponential
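
The doubling question can be answered numerically for each growth rate. The sketch below compares the cost at size 2n with the cost at size n; the class name GrowthRatio and the method doublingRatio are hypothetical, and the cost functions passed in are idealized formulas rather than measured running times.

```java
import java.util.function.DoubleUnaryOperator;

// Compares cost(2n) with cost(n) for an idealized cost function (illustrative).
class GrowthRatio {

    // Returns cost(2n) / cost(n): how many times longer a doubled problem takes.
    static double doublingRatio(DoubleUnaryOperator cost, double n) {
        return cost.applyAsDouble(2 * n) / cost.applyAsDouble(n);
    }

    public static void main(String[] args) {
        double n = 1024;
        System.out.println("logarithmic: " + doublingRatio(x -> Math.log(x) / Math.log(2), n));
        System.out.println("linear:      " + doublingRatio(x -> x, n));
        System.out.println("quadratic:   " + doublingRatio(x -> x * x, n));
        // Exponential growth is shown at n = 10 to keep the numbers readable.
        System.out.println("exponential: " + doublingRatio(x -> Math.pow(2, x), 10));
    }
}
```

Doubling the input barely changes a logarithmic cost, doubles a linear cost, quadruples a quadratic cost, and multiplies an exponential cost by 2^n.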


Big-Oh Notation

• Big-Oh represents the “order of” the cost function
  – ignoring all constants, find the largest function of n in the cost function
• Selection Sort
  – n * n/2 compares + n swaps
    • O(n^2)
• Linear Search
  – n compares + n increments + 1 initialization
    • O(n)
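
The two results above can be justified with simple bounds. This is a worked sketch using the usual definition of Big-Oh (cost at most a constant times the named function for all large enough n), which the slides do not state explicitly:

```latex
\text{Selection sort: } \frac{n^2}{2} + n \;\le\; n^2 \quad \text{for } n \ge 2 \;\Rightarrow\; O(n^2)

\text{Linear search: } n + n + 1 \;=\; 2n + 1 \;\le\; 3n \quad \text{for } n \ge 1 \;\Rightarrow\; O(n)
```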


Simplifying Conventions

• Only focus on the largest function of n

• Ignore smaller terms
• Ignore constants


Examples

• Example 1:
  – Matrix multiplication
    A (n×m) * B (m×n) = C (n×n)
• Example 2:
    selectionSort(a);
    for (int i = 0; i < n; i++)
        binarySearch(i, a);
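
Example 1 can be analyzed by writing out the standard triple loop. The code below is a sketch, not taken from the slides; the class name MatrixMultiply and the use of double[][] arrays are assumptions.

```java
// Straightforward matrix multiplication, C = A * B (illustrative).
class MatrixMultiply {

    // Multiplies an n-by-m matrix a by an m-by-n matrix b, giving an n-by-n result.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;   // rows of A, and columns of B
        int m = b.length;   // columns of A, and rows of B
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0;
                // m multiplications for each of the n * n entries of C
                for (int k = 0; k < m; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }
}
```

The innermost statement runs n * n * m times, so the cost is O(n^2 m), or O(n^3) when m equals n. In Example 2, the selection sort costs O(n^2) and the n binary searches cost O(n log n) in total, so the sort dominates and the whole fragment is O(n^2).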


Trade-Offs and Limitations

• “What is the dominant term in the cost function?”
  – What if the constant term is larger than n?
• What happens if both algorithms have the same complexity?
  – Selection sort and Insertion sort are both O(n^2)
• Constants can matter
  – when the complexity is the same (obviously) and when the complexity is different (depending on problem size)