
  • Data Structures and Programming

    Theory of Computation

    Dept. of Electrical Engineering
    National Taiwan University

    ( Introduction) Data Structures and Programming Spring 2019 1 / 36

  • Course Organization

    E-mail: [email protected]
    http://www.ee.ntu.edu.tw/~yen
    Time: 9:10-12:10, Tuesday
    Place: BL 113
    Office hours: by appointment
    Class web page: http://ccf.ee.ntu.edu.tw/~yen/courses/ds19.html


  • Instructor

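    顏嗣鈞

    • Education
      Ph.D., Univ. of Texas at Austin (CS), 1986
      M.S., Institute of Computer Engineering, National Chiao Tung University, 1982
      B.S., Dept. of Electrical Engineering, National Taiwan University, 1980

    • Experience
      Professor, Dept. of Electrical Engineering, National Taiwan University, 1991 – present
      Director, Computer and Information Networking Center, National Taiwan University, 2014 – present
      Chair, Dept. of Electrical Engineering, National Taiwan University, 2010 – 2013
      Associate Professor, Dept. of Electrical Engineering, National Taiwan University, 1990 – 1991
      Assistant Professor, Dept. of Computer Science, Iowa State Univ., USA, 1986 – 1990

    • Expertise: design and analysis of algorithms, information visualization, theory of computation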


  • Prerequisites

    Familiarity with C, C++, Java, or Python

    Textbook: Data Structures & Algorithm Analysis in C++ (3rd or 4th Ed.), Mark Weiss, Addison Wesley.


  • Topics

    Preliminaries: Introduction. Algorithm analysis.
    Abstract data types: Stacks. Queues. Lists. List operations. List representations. List traversals. Doubly linked lists.
    Trees: Tree operations. Tree representations. Tree traversals. Threaded trees. Binary trees. AVL trees. 2-3 trees. B-trees. Red-black trees. Binomial trees. Splay trees, and more.
    Hashing: Chaining. Open addressing. Collision handling.
    Priority queues: Binary heaps. Binomial heaps. Fibonacci heaps. Min-max heaps. Leftist heaps. Skew heaps.
    Sorting: Insertion sort. Selection sort. Quicksort. Heapsort. Mergesort. Shellsort. Lower bound of sorting.
    Disjoint sets: Set representations. Union-find. Path compression.
    Graphs: Graph representations. Basic graph algorithms.
    Amortized analysis. Binomial heaps. Skew heaps. Fibonacci heaps.
    Adv. data structures: Tries. Top-down splay trees, and more.


  • Grading

    Homework + Programming Assignments: 25-30%
    Midterm Exam: 35-40%
    Final Exam: 35-40%

    Academic Integrity: With the exception of group assignments, the work (including homework, programming assignments, tests) must be the result of your individual effort. This implies that one student should never have in his/her possession a copy of all or part of another student's homework. It is your responsibility to protect your work from unauthorized access. Academic dishonesty has no place in a university, in particular, in NTUEE. It wastes our time and yours, and it is unfair to the majority of students. Any form of cheating will automatically result in a failing grade in the course.


  • Data Structures vs. Programming


  • Data Structures, Algorithms, and Programs

    Data structure
      • Organization of data to solve the problem at hand

    Algorithm
      • Outline, the essence of a computational procedure, step-by-step instructions

    Program
      • Implementation of an algorithm in some programming language


  • Overall Picture

    Using a computer to help solve problems:

    Precisely specifying the problem
    Designing programs
      • architecture
      • algorithms
    Writing programs
    Verifying (testing) programs


  • C++ ≠ Data Structures

    One of the all time great books in computer science:

    The Art of Computer Programming (1968-1973) by Donald Knuth

    Examples in assembly language (and English)!


  • Abstract Data Types


  • Advanced Data Structures

    "Why not just use a big array?"

    Example problem
      • Search for a number k in a set of N numbers

    Solution #1: Linear Search
      • Store numbers in an array of size N
      • Iterate through the array until k is found
      • Number of checks
          • Best case: 1 (k = 15)
          • Worst case: N (k = 27)
          • Average case: N/2
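    The linear search described above can be sketched as follows. This is a minimal illustration rather than the course's reference code: the function name `linear_search` and the returned check counter are assumptions introduced here so that the best, worst, and average cases can be observed directly.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Linear search over an unsorted array.
// Returns {index of k, number of checks}; the index is -1 when k is absent.
// The name and the check counter are illustrative, not part of the slides.
std::pair<long, std::size_t> linear_search(const std::vector<int>& a, int k) {
    std::size_t checks = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++checks;                          // one comparison per visited element
        if (a[i] == k) {
            return {static_cast<long>(i), checks};
        }
    }
    return {-1, checks};                   // k not found: all N elements were checked
}
```

    With a hypothetical array that starts with 15 and ends with 27, searching for 15 takes one check and searching for 27 takes N checks, matching the best and worst cases listed on the slide.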


  • Advanced Data Structures

    Solution #2: Binary Search Tree (BST)
      • Store numbers in a binary search tree
          • Requires: elements that can be ordered (any two keys can be compared)
      • Properties:
          • The left (resp., right) subtree of a node contains only nodes with keys less than (resp., greater than) the node's key
          • Both the left and right subtrees must also be binary search trees
      • Search the tree until k is found
      • Number of checks (assuming a balanced tree; a degenerate, list-like tree can need up to N)
          • Best case: 1 (k = 15)
          • Worst case: about log2 N (k = 27)
          • Average case: about log2 N − 1, since roughly half of the keys sit on the bottom level
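    A minimal sketch of the binary search tree idea, assuming integer keys and no rebalancing. The names `BstNode`, `bst_insert`, and `bst_search` are hypothetical. The search returns the number of nodes it checked, so the effect of the tree's shape is visible: a balanced tree needs about log2 N checks, while keys inserted in sorted order produce a chain that needs N.

```cpp
#include <cstddef>
#include <memory>

// Minimal binary search tree of ints (illustrative: no balancing, no deletion).
struct BstNode {
    int key;
    std::unique_ptr<BstNode> left, right;
    explicit BstNode(int k) : key(k) {}
};

// Insert k, keeping smaller keys in the left subtree and larger keys in the right.
// Duplicate keys are ignored.
void bst_insert(std::unique_ptr<BstNode>& root, int k) {
    if (!root) {
        root.reset(new BstNode(k));
    } else if (k < root->key) {
        bst_insert(root->left, k);
    } else if (k > root->key) {
        bst_insert(root->right, k);
    }
}

// Search for k and return the number of nodes checked.
// `found` reports whether k is present in the tree.
std::size_t bst_search(const BstNode* node, int k, bool& found) {
    std::size_t checks = 0;
    found = false;
    while (node != nullptr) {
        ++checks;
        if (k == node->key) {
            found = true;
            break;
        }
        node = (k < node->key) ? node->left.get() : node->right.get();
    }
    return checks;
}
```

    Inserting 15, 8, 23, 4, 12, 16, 27 in that order (made-up keys) gives a balanced tree of height 3: 15 is found in one check and 27 in three. Inserting 1 through 7 in increasing order gives a chain in which finding 7 takes seven checks.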


  • Should things be Ordered?


  • Example

    Problem Artifacts
      • N = 1,000,000,000
          • 1 billion (Walmart transactions in 100 days)
      • 1 GHz processor = 10^9 cycles per second

    Solution #1
      • Worst case: 1 billion checks = 10 seconds (at roughly 10 cycles per check)

    Solution #2
      • Worst case: 30 checks = 0.0000003 seconds (at the same cost per check)

    Does it matter? N vs. log2 N
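    The figures in this example can be reproduced with a few lines of arithmetic. The sketch below makes one assumption that the slide does not state explicitly: each check costs about 10 clock cycles, which is what the quoted 10 seconds and 0.0000003 seconds imply on a 10^9 cycles-per-second processor. The function names are hypothetical.

```cpp
#include <cstdint>

// Number of halvings needed to shrink n candidates down to one.
// For n >= 1 this equals ceil(log2 n).
int halving_steps(std::uint64_t n) {
    int steps = 0;
    while (n > 1) {
        n = (n + 1) / 2;   // round up so sizes that are not powers of two are covered
        ++steps;
    }
    return steps;
}

// Seconds needed for `checks` checks at `hz` cycles per second,
// given an assumed cost of `cycles_per_check` cycles for each check.
double seconds_for(double checks, double cycles_per_check, double hz) {
    return checks * cycles_per_check / hz;
}
```

    For N = 1,000,000,000, `halving_steps` gives 30, since 2^29 < 10^9 < 2^30. At the assumed 10 cycles per check, `seconds_for(1e9, 10, 1e9)` is 10 seconds and `seconds_for(30, 10, 1e9)` is 3 * 10^-7 seconds.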


  • Computational Complexity

    Computational complexity: an abstract measure of the time and space necessary to execute an algorithm as functions of its input size.
    Input size: size of encoded "binary" strings.
      • sort n words of bounded length - input size: n
      • the input is the integer n - input size: log2 n
      • the input is the graph G = (V, E) - input size: |V| and |E|

    Runtime comparison: assume 1 BIPS (10^9 instructions per second), 1 instruction/op

    Time    Big-Oh     n = 10         n = 100        n = 10^4       n = 10^6       n = 10^8
    500     O(1)       5*10^-7 sec    5*10^-7 sec    5*10^-7 sec    5*10^-7 sec    5*10^-7 sec
    3n      O(n)       3*10^-8 sec    3*10^-7 sec    3*10^-5 sec    0.003 sec      0.3 sec
    n lg n  O(n lg n)  3*10^-8 sec    6*10^-7 sec    1*10^-4 sec    0.018 sec      2.5 sec
    n^2     O(n^2)     1*10^-7 sec    1*10^-5 sec    0.1 sec        16.7 min       116 days
    n^3     O(n^3)     1*10^-6 sec    0.001 sec      16.7 min       31.7 yr        ∞
    2^n     O(2^n)     1*10^-6 sec    4*10^11 cent.  ∞              ∞              ∞
    n!      O(n!)      0.003 sec      ∞              ∞              ∞              ∞
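    As a worked check of a few table entries: each value is the operation count divided by 10^9 operations per second. The symbol T_f(n) for "time to perform f(n) operations" is notation introduced here for illustration only.

```latex
\begin{align*}
T_{n^2}(10^6) &= \frac{(10^6)^2}{10^9\ \text{ops/sec}} = 10^{3}\ \text{sec} \approx 16.7\ \text{min} \\
T_{n^2}(10^8) &= \frac{(10^8)^2}{10^9\ \text{ops/sec}} = 10^{7}\ \text{sec} \approx 116\ \text{days} \\
T_{n^3}(10^6) &= \frac{(10^6)^3}{10^9\ \text{ops/sec}} = 10^{9}\ \text{sec} \approx 31.7\ \text{yr} \\
T_{2^n}(100) &= \frac{2^{100}}{10^9\ \text{ops/sec}} \approx \frac{1.27 \times 10^{30}}{10^{9}}\ \text{sec}
  \approx 1.27 \times 10^{21}\ \text{sec} \approx 4 \times 10^{13}\ \text{yr} = 4 \times 10^{11}\ \text{centuries}
\end{align*}
```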


  • Can’t Finish the Assigned Task

    I can’t find an efficient algorithm, I guess I’m just too dumb.


  • Mission Impossible

    I can’t find an efficient algorithm, because no such algorithm is possible.


  • Difficult because ...

    I can’t find an efficient algorithm, but neither can all these famous people.


  • Life Cycle of Software Development


  • Data "Structures"

    Array: an insertion can require that you copy all the elements in the array over
    Linked List: allows you to make the insertion very quickly

    General areas include:
      • Sequential storage
      • Hierarchical storage
      • Adjacency storage
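    The array versus linked-list contrast can be illustrated with the C++ standard containers. This is a sketch, using `std::vector` as the array and `std::list` as the linked list; the two wrapper function names are hypothetical.

```cpp
#include <list>
#include <vector>

// Insert x at the front of a contiguous array (std::vector).
// Every existing element is shifted one slot to the right: O(N) moves.
void push_front_array(std::vector<int>& a, int x) {
    a.insert(a.begin(), x);
}

// Insert x at the front of a doubly linked list (std::list).
// Only a constant number of links change: O(1), independent of N.
void push_front_list(std::list<int>& l, int x) {
    l.push_front(x);
}
```

    Both calls leave the containers with the same logical contents. The difference is in how much work is done internally, which is the trade-off this slide points at.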


  • Goal

    Learn to write efficient and elegant software
    How to choose between two algorithms
      • Which to use? bubble-sort, insertion-sort, merge-sort
    How to choose appropriate data structures
      • Which to use? array, vector, linked list, binary tree
    In this course, we will look at:
      • different techniques for storing, accessing, and modifying information on a computer
      • algorithms which can efficiently solve problems

    We will see that all data structures have trade-offs - there is no ultimate data structure...
    The choice depends on our requirements


  • Why should you care?

    Complex data structures and algorithms are used in every real program
      • Data compression uses trees: MP3, GIF, etc.
      • Networking uses graphs: Routers and telephone networks
      • Security uses complex math algorithms: GCD and large decimals
      • Operating systems use queues and stacks: Scheduling and recursion

    Many problems can only be solved using complex data structures and algorithms


  • What this course is NOT about

    This course is not about C++
      • Although we will use C++ to implement some of the concepts
    This course is not about MATH
      • Although we will use math to formalize many of the concepts

    Competency in both math and C++ (or other programming languages) is therefore welcomed.
      • C++: inheritance, overloading, overriding, files, linked-lists, multi-dimensional arrays
      • Math: polynomials, logarithms, inductive proofs, logic


  • The Big Idea

    Definition of Abstract Data Type
      • A collection of data along with specific operations that manipulate that data
      • Has nothing to do with a programming language!

    Two fundamental goals of algorithm analysis
      • Correctness: Prove that a program works as expected
      • Efficiency: Characterize the run-time of an algorithm

    Alternative goals of algorithm analysis
      • Characterize the amount of memory required
      • Characterize the size of a program's code
      • Characterize the readability of a program
      • Characterize the robustness of a program
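    To make the abstract data type definition on this slide concrete, here is one possible rendering in C++: an interface that names the operations, two interchangeable representations, and client code that depends only on the interface. All class and function names (`IntStack`, `ArrayStack`, `ListStack`, `reversed`) are hypothetical, and a stack is used only as a convenient example.

```cpp
#include <list>
#include <vector>

// The ADT: what a stack of ints can do, with no commitment to a representation.
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int x) = 0;
    virtual int pop() = 0;              // precondition: !empty()
    virtual bool empty() const = 0;
};

// Representation 1: contiguous array.
class ArrayStack : public IntStack {
    std::vector<int> data_;
public:
    void push(int x) override { data_.push_back(x); }
    int pop() override { int x = data_.back(); data_.pop_back(); return x; }
    bool empty() const override { return data_.empty(); }
};

// Representation 2: linked list.
class ListStack : public IntStack {
    std::list<int> data_;
public:
    void push(int x) override { data_.push_front(x); }
    int pop() override { int x = data_.front(); data_.pop_front(); return x; }
    bool empty() const override { return data_.empty(); }
};

// Client code written against the ADT only: reverses v using any IntStack.
std::vector<int> reversed(const std::vector<int>& v, IntStack& s) {
    for (int x : v) s.push(x);
    std::vector<int> out;
    while (!s.empty()) out.push_back(s.pop());
    return out;
}
```

    `reversed` gives the same result with either representation, which is the sense in which the ADT is independent of how, and in which language, it is implemented.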


  • Clever? Efficient?


  • Why study data structures?

    Clever ways to organize information in order to enable efficient computation


  • Why so many data structures?

    Ideal data structure: fast, elegant, memory efficient
    Generates tensions:
      • time vs. space
      • performance vs. elegance
      • generality vs. simplicity
      • one operation's performance vs. another's

    Dictionary ADT
      • list
      • binary search tree
      • AVL tree
      • Splay tree
      • Red-Black tree
      • hash table
      • ...
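    As an illustration of one Dictionary ADT with several implementations, the sketch below runs the same client code on two C++ standard containers. `std::map` keeps keys ordered with logarithmic-time operations (it is commonly implemented as a red-black tree, though the standard does not require that), while `std::unordered_map` is a hash table with constant average-time operations and no ordering. The names `count_words`, `TreeDict`, and `HashDict` are hypothetical.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Count word occurrences using any dictionary type that offers operator[].
template <typename Dict>
Dict count_words(const std::vector<std::string>& words) {
    Dict counts;
    for (const auto& w : words) {
        ++counts[w];    // insert-or-update: the core dictionary operation
    }
    return counts;
}

using TreeDict = std::map<std::string, int>;            // ordered keys, O(log N) per operation
using HashDict = std::unordered_map<std::string, int>;  // no ordering, O(1) average per operation
```

    Both dictionaries report the same counts. Only the tree-based one can also list the words in sorted order, which is an example of the "one operation's performance vs. another's" tension.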


  • An Example: Shortest Path

    Given a weighted graph and two vertices u and v, we want to find a path of minimum total weight between u and v.
      • Length of a path is the sum of the weights of its edges.

    Applications: Internet packet routing, Flight reservations, Driving directions
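    One standard way to solve this problem, when all edge weights are non-negative, is Dijkstra's algorithm driven by a priority queue. The sketch below is a minimal version under stated assumptions: vertices are numbered from 0, the graph is stored as an adjacency list, and the names `Graph` and `shortest_path` are hypothetical. It is not taken from the slides' own figures.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Adjacency list: graph[x] holds (neighbor, edge weight) pairs for vertex x.
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Returns the minimum total weight of a path from u to v,
// or infinity if v is unreachable. Assumes all weights are non-negative.
double shortest_path(const Graph& graph, int u, int v) {
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(graph.size(), inf);
    using Entry = std::pair<double, int>;   // (tentative distance, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;

    dist[u] = 0.0;
    pq.push({0.0, u});
    while (!pq.empty()) {
        const Entry top = pq.top();         // closest vertex not yet processed
        pq.pop();
        const double d = top.first;
        const int x = top.second;
        if (d > dist[x]) continue;          // stale entry: a shorter distance to x is already known
        if (x == v) return d;               // the first time v is removed, its distance is final
        for (const auto& edge : graph[x]) {
            const int y = edge.first;
            const double w = edge.second;
            if (dist[x] + w < dist[y]) {    // found a shorter route to y through x
                dist[y] = dist[x] + w;
                pq.push({dist[y], y});
            }
        }
    }
    return dist[v];                         // infinity when v was never reached
}
```

    The operation performed most often is "remove the entry with the smallest tentative distance", which is why the choice of priority queue implementation dominates the running time.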


  • Dijkstra’s algorithm


  • Example (cont.)


  • Questions

    Operations performed?


  • Key steps


  • Straightforward approach

    Questions:
      • Is the above efficient?
      • Can we do better?


  • Priority Queues


  • Questions?
