lecture 1 introduction to data structures and algorithms
TRANSCRIPT
9/2/21
Lecture 1: Introduction to Data Structures and Algorithms
ECE 241 – Advanced Programming I, Fall 2021
Mike Zink
ECE 241 – Data Structures Fall 2020 © 2020 Mike Zink
Overview
• Introduction to data structures and algorithms
• Information about the course
What is an algorithm?
• Well-defined computational procedure
• Takes a set of values as input and produces a set of values as output
• Tool to solve a well-defined computational problem
Algorithm example: Sorting numbers
• Input: a sequence of n numbers <a1, a2, …, an>
• Output: a permutation (reordering) <a’1, a’2, …, a’n> of the input sequence such that a’1 <= a’2 <= … <= a’n
• Example input: <31, 41, 59, 26, 41, 58>
  Example output: <26, 31, 41, 41, 58, 59>
• Since sorting is essential, a large number of sorting algorithms exist
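As a minimal sketch in Python (the course language), the example input above can be sorted with insertion sort, which is only one of the many sorting algorithms. The function name `insertion_sort` is an illustrative choice and does not come from the slides.

```python
def insertion_sort(values):
    """Return a new list with the items of `values` in non-decreasing order."""
    result = list(values)  # copy so the input sequence is left unchanged
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger items one position to the right to make room.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


print(insertion_sort([31, 41, 59, 26, 41, 58]))  # [26, 31, 41, 41, 58, 59]
```

The output is a permutation of the input in which every element is less than or equal to its successor, matching the definition on the slide.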
A Data-centric View of the World
• Data are important in many disciplines
  • DNA sequences, images, network traffic, sensor readings, maps, customer profiles, etc.
  • COVID-19 related data across all disciplines
• Past ECE focus: processing
  • Computation at core: processors, computers, etc.
• New ECE focus: data
  • Data at core: data centers, data security, data science, etc.
• Need a structured approach to handling data
What kind of problems are solved by algorithms?
• Human Genome Project
• Internet: routing, searches, security
• Electronic commerce
• Commercial enterprises
• …
Human Genome Project
• Complete mapping and understanding of all the genes of human beings
• ~20,500 human genes
• ~3 billion chemical base pairs that make up human DNA
• Requires algorithms to store and analyze
Infectious Disease Modeling
• Flu forecast
• COVID-19: forecast of deaths and hospitalizations
• Generate computer models (in the form of algorithms) to make predictions
• Statistical modeling
Internet
• Routing algorithms that find the shortest path and adapt to changes
• Search engines that find pages with particular information
• Encrypted communication
• Queue management
Electronic Commerce
• Ad placements
• Google search
• Recommendations: Netflix, Amazon, etc.
• Hashing for authentication
• Electronic trading
Manufacturing and Commercial Enterprises
• Allocation of scarce resources:
  • Assigning crews to flights
  • Routes for package deliveries
  • Storage and distribution of goods
Why Data Structures?
• Way to store and organize data
• Facilitate access and modification of data
• Different data structures have strengths and weaknesses
• Some are better suited for a specific algorithm than others
Data Structures and Algorithms
• Data structures
  • Representation and organization of data
• Algorithms
  • Methods for implementing operations on data structures
• Data structures and algorithms are closely related
• Data structures are often tuned for certain algorithms
Big Data
• Extremely large volumes of data
• Can be computationally analyzed
  • Internet of Things
  • Social media content
• Requires algorithms and data structures to analyze
Artificial Intelligence
• Often “trained” on big data sets
  • Image recognition
  • Voice recognition
• Relies on a large set of algorithms, varying in complexity
  • Linear regression
  • Statistics, probability
Complexity Problems
• Data sets can be very large
• Operations may take a lot of time or memory
• Efficient implementations can make a big difference
• Examples of large data sets and sizes?
Example: Million Song Dataset
• http://labrosa.ee.columbia.edu/millionsong/
• Data set of songs: title, artist, recording years, etc.
• How would you figure out
  • Which artist has recorded the most songs?
  • Which song has been covered the most times?
  • What are the most common words in a title?
• What are potential problems?
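One possible way to approach two of these questions is a single counting pass over the records. The sketch below assumes each song is available as a (title, artist) pair; the four records are made-up placeholders, not entries from the Million Song Dataset, and the real data set has its own file format that is not handled here.

```python
from collections import Counter

# Hypothetical (title, artist) records; NOT real Million Song Dataset entries.
songs = [
    ("Song of the Sea", "Artist A"),
    ("Sea and Sky", "Artist B"),
    ("Night Song", "Artist A"),
    ("Sky at Night", "Artist A"),
]

# Which artist has recorded the most songs?
artist_counts = Counter(artist for _, artist in songs)
top_artist, top_count = artist_counts.most_common(1)[0]

# What are the most common words in a title?
word_counts = Counter(
    word.lower() for title, _ in songs for word in title.split()
)

print(top_artist, top_count)
print(word_counts.most_common(3))
```

Each record costs one dictionary update here. At the scale of the full data set, the time and memory needed for such a pass are exactly the kind of potential problem the slide asks about.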
What will you learn in the course?
• How to think about data and operations on data
• How to design data structures for efficient use
• How to determine the efficiency of algorithms
• A set of common data structures and algorithms for typical operations on data (e.g., searching, sorting, etc.)
• Advanced programming techniques (e.g., recursion, etc.)
• How to implement a larger software project with a professional IDE
Python
• We will use Python as the programming language
  • Data structures and algorithms are independent of programming language
• This is not a programming course
  • We focus on higher-level issues
  • You should already have experience with a programming language (Python, Java, C/C++)
• IDE used in this course: PyCharm
  • Professional environment
PyCharm
• IDE for Python
• https://www.jetbrains.com/pycharm/
• Very important to become familiar with the tool!
• See intro slides for PyCharm (more during this week’s discussion)
Course Organization
• Course web site: http://www.ecs.umass.edu/ece241/
Course Events
• Events to attend
  • Lectures
  • Discussions
• Assignments
  • Homework
  • Projects
• Exams
  • 2 exams during semester
  • 1 final exam
Course Policy
• Read the syllabus in its entirety!
  • Understand expectations for this course
• Deadlines are posted on the web site
  • No extensions for homework or projects
• Assignments and grades managed through Moodle and Gradescope
  • No sharing of code
  • Automated checking of submissions
Office Hours
• TAs:
  • Tuesdays: 5 PM (Guoyi)
  • Wednesdays: 7 PM (Zibin)
• Prof. Zink:
  • Fridays: 8 – 9 AM
Properties of Data Structures and Algorithms
Quality Measures for Data Structures
• What do you expect from a good data structure?
Quality Metrics for Algorithms
• What do you expect from a “good” algorithm?
Correctness
• In some cases absolute
  • E.g., what is the sum of all values in an array?
• Some algorithms have multiple solutions
  • Examples?
• Some algorithms may have solutions where correctness is hard to define
  • Examples?
Multiple solutions
• Data set for number of wins by pitcher (2011):
  {(Halladay, 21), (Jiminez, 19), (Lester, 19), (Price, 19), (Sabathia, 21), (Wainwright, 20)}
• Which pitcher has the most wins?
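The data set above can be used to sketch why this question has more than one correct answer. The variable names below are illustrative; the numbers are copied from the slide.

```python
# Wins by pitcher (2011), copied from the slide.
wins = {
    "Halladay": 21,
    "Jiminez": 19,
    "Lester": 19,
    "Price": 19,
    "Sabathia": 21,
    "Wainwright": 20,
}

# max() reports only the first key that reaches the largest value.
single_answer = max(wins, key=wins.get)

# Collecting every pitcher tied for the maximum exposes both valid answers.
most_wins = max(wins.values())
all_answers = [name for name, count in wins.items() if count == most_wins]

print(single_answer)  # Halladay
print(all_answers)    # ['Halladay', 'Sabathia']
```

Both "Halladay" and "Sabathia" are correct, so an algorithm that returns either one is right, and a different iteration order could change which one it reports.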
Unclear “correctness”
• Make image square:
[Figure: three candidate results, each marked “?”]
Running time performance
• “The algorithm finishes in 12 seconds.”
• How to express performance more accurately?
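One hedged way to approach this question: wall-clock seconds depend on the machine and on the input, so a more informative measure is how the number of basic operations grows with the input size n. The sketch below uses linear search purely as an illustration; `count_comparisons` is a hypothetical helper, not something defined in the course material.

```python
def count_comparisons(items, target):
    """Linear search that returns how many comparisons it performed."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            break
    return comparisons


for n in (10, 100, 1000):
    data = list(range(n))
    # Searching for a value that is absent forces a full pass over the data.
    print(n, count_comparisons(data, -1))
```

The count grows in proportion to n, which says more about the algorithm than a single timing on one machine.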
Memory use performance
• “The data structure requires 727 bytes.”
• How to express memory requirements in a more meaningful way?
It all depends…
• Quality of a data structure depends on the operation
• Example: unsorted “pile” of papers
  • Very fast for adding an item
  • Not very fast for finding an item
• Example: papers sorted by last name
  • Slower for adding an item
  • Faster for finding an item (if search is by last name)
  • However: also slow if search is by first name
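The trade-off above can be sketched with two Python lists, assuming each paper is represented only by a last name. The names are made-up sample values, and using the standard `bisect` module for the sorted case is one possible choice among several.

```python
import bisect

names = ["Young", "Adams", "Miller"]  # hypothetical last names

# Unsorted "pile": adding is a quick append, finding scans the whole pile.
pile = []
for name in names:
    pile.append(name)              # fast: no order to maintain
found_in_pile = "Miller" in pile   # slow for large piles: linear scan

# Papers sorted by last name: adding must keep the order, but finding
# can use binary search.
sorted_papers = []
for name in names:
    bisect.insort(sorted_papers, name)  # slower: locate position, shift items

index = bisect.bisect_left(sorted_papers, "Miller")
found_in_sorted = (
    index < len(sorted_papers) and sorted_papers[index] == "Miller"
)

print(pile)
print(sorted_papers)
```

Binary search helps only for the key the list is sorted by. Looking up a first name in `sorted_papers` would again require a scan of every item.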
Data types
• What are typical data types to store information?
Data types
• Common types
  • Numeric types: integer, floating point
  • Boolean
  • Text types: character, string
  • Composite types
  • References (“pointer”)
Algorithms and data types
• Many algorithms can be used with different data types
  • Sorting of integers
  • Sorting of strings
  • Sorting of composite types
• May require small adaptation in code
  • Different comparison functions
  • Different code for “moving data around”
• Why is this point important?
  • I do not want you to leave this course saying: “Useless. We only learned how to sort numbers!”
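A minimal sketch of this point in Python: the same built-in sorting routine handles integers, strings, and composite values, and only the comparison key changes. The word list is a made-up sample; the (pitcher, wins) pairs reuse values from the wins data set shown earlier.

```python
from operator import itemgetter

numbers = [31, 41, 59, 26, 41, 58]
words = ["pear", "Apple", "banana"]  # hypothetical sample strings
# Composite type: (pitcher, wins) pairs.
pitchers = [("Halladay", 21), ("Lester", 19), ("Wainwright", 20)]

sorted_numbers = sorted(numbers)                       # default comparison
sorted_words = sorted(words, key=str.lower)            # case-insensitive
sorted_pitchers = sorted(pitchers, key=itemgetter(1))  # compare by wins

print(sorted_numbers)
print(sorted_words)
print(sorted_pitchers)
```

The algorithm itself is untouched in all three calls; the `key` argument plays the role of the "different comparison function" mentioned above.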
Objects
• For algorithm discussion, we can think of “objects” in a very general way
• Operations on objects
  • Creation (and destruction)
  • Comparison
• Operations on objects in data structures
  • Insertion
  • Deletion
  • Algorithm-specific operations: search, sort, etc.
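The operations listed above can be sketched with a small Python class and a list acting as the data structure. `Paper` and its fields are hypothetical names chosen for illustration; `dataclass(order=True)` is just one convenient way to get comparison operators.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Paper:
    """Hypothetical object type; compared by last name, then first name."""
    last_name: str
    first_name: str


# Creation
a = Paper("Adams", "Ann")
b = Paper("Miller", "Bo")

# Comparison
a_comes_first = a < b

# Operations on objects in a data structure (here: a Python list)
papers = []
papers.append(b)              # insertion
papers.append(a)              # insertion
papers.sort()                 # algorithm-specific operation: sort
position = papers.index(b)    # algorithm-specific operation: search
papers.remove(a)              # deletion

print(a_comes_first, position, papers)
```

Destruction is handled by Python's garbage collector once no references to an object remain, so it does not appear as an explicit step here.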
Next Steps
• Next lecture: “Asymptotic Notation and Merge Sort” on Tuesday
• First discussion TODAY!
• Will post first HW later today