Welcome to IST 210: Organization of Data
IST210 2
Teaching Team
• Zihan Zhou
  – Office hour: 3-4pm Monday or by appointment
  – Office: 320 IST Building
• TA: Jian-Syuan Wong
  – Office hour: 3:45-4:45pm Thursday
  – Location: 321D IST Building
IST210 3
Course Website
• Log into your ANGEL account
• You will see all course information on the wikispace
IST210 4
Why Should You Take This Course?
• It is required!
  – Prerequisite for other advanced courses
• Why did IST make this course required?
  – Importance in organizing data, information, and knowledge
Database
User Interface
Information Privacy
IST210 5
Organization of the data is important
• Have you ever thought about how the data is organized behind these systems? – Class schedule system: http://schedule.psu.edu– Online shopping system: www.amazon.com
• Huge amount of data need to be well organized: we need a database to efficiently and accurately modify and query the data– UPS Technology
IST210 6
Organization of the data is important
• Web 2.0
  – http://www.youtube.com
• Web 1.0 vs. Web 2.0
• How to deliver Web 2.0 services?
  – The central piece: database
    • Manage complicated data
      – Youtube.com: video clips, users, comments, tags, …
      – Facebook.com: profiles, photos, music, comments, …
  – User interface
  – Web service: search and data retrieval
IST210 7
beer
Using Data for Competitive Advantage
• Success of companies relies on
  – Data: Wall Street companies
  – Flow of data: FedEx and UPS
  – Capabilities to extract information from data
• “What Wal-Mart Knows About Customers' Habits”
"And the pre-hurricane top-selling item was _______." -- Linda M. Dillman, Wal-Mart's chief information officer
IST210 8
What is This Course About?
• Database
  – An effective and efficient way to organize data
• Key issues that will be covered in this course
  – How to design a database?
    • Relational database, E-R diagram
  – How to query a database?
    • SQL
  – How to build a website connected to a database?
    • HTML, PHP, web server…
• Examples of course projects in previous classes
  – http://www.youtube.com/channel/UCLTMn5zyFv-AZ3-gxwj-OxA/videos
IST210 9
How Will This Course Be Taught?
• Classes
  – 11:15 AM to 12:05 PM Monday, Wednesday and Friday in IST Room 110
  – Lecture, discussion, in-class exercise, projects
• Lab
  – 2:30 PM to 3:45 PM Wednesday in IST Room 202
  – Programming, projects
• Semester-long project
  – Group project on an idea developed by your group
IST210 10
Readings for Lectures
• Textbook: Database Concepts by Kroenke & Auer, Pearson, 6th edition, ISBN 978-0132742924
• Additional readings will be provided
  – Always check the latest schedule on the course website!
Previous editions: content should be similar, assignments might be different
IST210 11
In Classes
• Attendance
  – Attendance is required for every class!
  – Attendance check: 5% of the final grade
  – Quiz in class (for attendance check only)
  – If you cannot attend a class and have a reasonable excuse, notify the instructor or TA before class
• Class rule: Computers will be taken under control during lecture time
IST210 12
After Classes
• Study groups are strongly encouraged
  – Discuss general approaches, not the way to solve the problems
• Assignments and exams should be finished independently
• The project will be carried out as a team (4 persons)
• Keep track of up-to-date IT news and events
  – See how they can be related to this course
IST210 13
Labs
• Led by TA
  – Programming, projects
• Individual programming exercises
  – Only 5% of the final grade
• Team project
  – A website connected to a database
IST210 14
Project
• A web-based database system to demonstrate what you have learnt
  – Data organization to facilitate
    • Information access
    • Information organization and management
  – Innovation is strongly encouraged!
• Groups will be assigned based on your technical backgrounds
  – Finish the survey on ANGEL before next class!
  – Group size: 4 students
IST210 15
Project (cont.)
• Five stage progress reports
  – Each has a very specific problem related to the project
• One final report
  – Put all progress reports together
    • Make changes based on suggestions from the TA and me
  – Include the final result of your design
    • Screenshots of your service
    • URL pointing to your database
• Final in-class presentation
IST210 16
Project (cont.)
• Database design
  – Coding is an integral part of the class
    • PHP: HyperText Preprocessor
    • HTML: HyperText Markup Language
    • Templates and examples will be provided
  – No coding experience?
    • Work hard
    • Start early
    • Don’t panic, we will help you
      – Come to office hours and come to the lab!
IST210 17
Coding Is Fun
IST210 18
Grading
• Homework: 15%
• Exams: 40% = 15% (Midterm) + 25% (Final)
• Project: 35%
• Lab: 5%
• Class attendance: 5%
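As a quick check of the breakdown above, the weights can be combined into one course grade. This is only a sketch: the function name `final_grade` and the example scores are made up for illustration, and it assumes every component is scored on a 0-100 scale.

```python
# Weights in percent, copied from the grading breakdown above.
WEIGHTS = {"homework": 15, "exams": 40, "project": 35, "lab": 5, "attendance": 5}


def final_grade(scores):
    """Return the weighted course grade (0-100) from per-component scores (0-100)."""
    # The five components must account for the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores, for illustration only.
example = {"homework": 80, "exams": 90, "project": 85, "lab": 100, "attendance": 100}
print(final_grade(example))  # 87.75
```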
IST210 19
Grading: Homework
• Five homework assignments in total: 3% × 5 = 15%
  – Due: one week after an assignment is given
  – Must submit online through ANGEL before the deadline
    • Cut-off time on ANGEL: midnight
    • Strict late submission penalty
      – After the deadline but less than 24 hours late: penalized 10%
      – More than 24 hours but less than 48 hours late: penalized 30%
      – No submissions are accepted more than 48 hours late
  – Independent work!
    • Do not exchange your answers with your classmates!
    • Do not search for solutions online!
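The late submission rule above can be read as a small lookup. The sketch below makes several assumptions: the slide does not say what happens at exactly 24 or 48 hours, so here exactly 24 hours falls in the 30% bracket and exactly 48 hours is rejected. It also assumes "penalized 10%" means 10% of the earned score is deducted. The function names are hypothetical.

```python
from typing import Optional


def late_penalty_percent(hours_late: float) -> Optional[int]:
    """Return the penalty in percent, or None if the submission is not accepted."""
    if hours_late <= 0:
        return 0      # on time
    if hours_late < 24:
        return 10     # less than 24 hours late
    if hours_late < 48:
        return 30     # assumption: exactly 24 hours counts as this bracket
    return None       # assumption: 48 hours or more is not accepted


def penalized_score(raw_score: float, hours_late: float) -> Optional[float]:
    """Apply the penalty as a deduction from the earned score (an assumption)."""
    percent = late_penalty_percent(hours_late)
    if percent is None:
        return None
    return raw_score * (100 - percent) / 100
```

For example, under these assumptions a homework scored 100 and submitted 30 hours late would count as 70.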
IST210 20
Grading: Exam
• Midterm (15%): Chapters 1-3
• Final Exam (25%): Everything we covered in class, with more focus on Chapters 4-5
IST210 21
Grading: Project
• Group grade
  – 3% for each progress report (total 15%)
  – 10% for the final report
  – 10% for the final presentation
• Individual adjustment
  – For each report: 30% is related to individual contribution and involvement.
    • Doing your part of the work: 15%
    • Participation: 15%
IST210 22
Some Challenges You Will Face …
• Programming
  – HTML, PHP, SQL
  – Do not worry about grading if you have zero experience
    • 5% lab
    • Programming in projects is done as a team
    • No programming in assignments, midterm exam, or final
• Individual homework assignments
  – Some are time-consuming
  – Make good use of the office hours
• Team project
  – This project is not something you can finish within 3 days, not even 3 weeks!
  – Teamwork is the key.
IST210 23
Policy
• Academic Integrity
  – Individual assignments must be completed independently. Students are strongly encouraged to form study groups and to learn from their peers. However, discussion of homework questions in study groups should be limited to general approaches to solutions. Specific answers should never be discussed. Penn State's policy regarding Academic Integrity must be followed.
• University policy
  – http://www.psu.edu/dept/oue/aappm/R-6.html
IST210 24
Question?
IST210 25
Chapter 1. Introduction to Database
IST210 26
Purpose of a Database
• The purpose of a database is to keep track of things
• Unlike a list or spreadsheet, a database may store information that is more complicated than a simple list
IST210 27
Mini Case
• You are designing our course selection system
  – What aspects do you need to store in a record?
    • Student ID, Student Name, Student's Department, Email
    • CourseID, Instructor, CourseName, Location
  – What questions (i.e. queries) will users ask?
    • Student: What classes have I registered for this semester?
    • Instructor: How many students are registered and what are their backgrounds?
  – What tool would you use to manage the data?
    • Excel?
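To make the two queries concrete, here is a minimal sketch that keeps the records as a flat, spreadsheet-like list, one row per registration. Everything in the data is hypothetical (students Ann and Ben, the course IST220, the departments, instructors B and C, rooms B and C), except the names that appear elsewhere in these slides. The helper names `classes_for` and `roster_summary` are also made up.

```python
from collections import Counter

# One row per (student, course) pair, as in a spreadsheet. Illustrative data only.
rows = [
    {"StudentID": 1, "StudentName": "Ann", "Department": "Dept A", "Email": "ann@example.edu",
     "CourseID": "IST210", "Instructor": "Zihan Zhou", "CourseName": "Organization of Data",
     "Location": "IST Room 110"},
    {"StudentID": 1, "StudentName": "Ann", "Department": "Dept A", "Email": "ann@example.edu",
     "CourseID": "IST220", "Instructor": "Instructor B", "CourseName": "Course B",
     "Location": "Room B"},
    {"StudentID": 2, "StudentName": "Ben", "Department": "Dept B", "Email": "ben@example.edu",
     "CourseID": "IST210", "Instructor": "Zihan Zhou", "CourseName": "Organization of Data",
     "Location": "IST Room 110"},
    {"StudentID": 3, "StudentName": "Kate", "Department": "Dept A", "Email": "kate@example.edu",
     "CourseID": "IST230", "Instructor": "Instructor C", "CourseName": "Course C",
     "Location": "Room C"},
]


def classes_for(student_id):
    """Student query: which classes have I registered for?"""
    return sorted({row["CourseID"] for row in rows if row["StudentID"] == student_id})


def roster_summary(course_id):
    """Instructor query: how many students are registered, and from which departments?"""
    enrolled = [row for row in rows if row["CourseID"] == course_id]
    return len(enrolled), dict(Counter(row["Department"] for row in enrolled))
```

Both questions can be answered from the flat list, but note how Ann's details and the IST210 details are already typed in more than once.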
IST210 28
An Example List
Can you see any potential problem when using this list?
IST210 29
Problems with Lists: Redundancy
• In a list, each row is intended to stand on its own. As a result, the same information may be entered several times
  – If there are 40 students taking IST210, class information like “CourseID” will be entered 40 times.
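The 40-student case can be counted directly. This is a small sketch with generated, hypothetical student IDs; the course name and room are the ones given earlier in these slides.

```python
# 40 hypothetical student IDs: S01 ... S40.
student_ids = [f"S{i:02d}" for i in range(1, 41)]

# A flat list repeats the full course information on every student's row.
flat = [
    {"StudentID": sid, "CourseID": "IST210",
     "CourseName": "Organization of Data", "Location": "IST Room 110"}
    for sid in student_ids
]

# How many times the course information was typed in ...
course_copies = sum(1 for row in flat if row["CourseID"] == "IST210")
# ... versus how many distinct course facts there really are.
distinct_course_facts = {(row["CourseID"], row["CourseName"], row["Location"]) for row in flat}

print(course_copies, len(distinct_course_facts))  # 40 1
```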
IST210 30
Problems with Lists: Multiple Themes
• In a list, each row may contain information on more than one theme. As a result, needed information may appear in the list only if information on other themes is also present
Themes in the example list: Student info, Course info
IST210 31
List Modification Issues
• Redundancy and multiple themes create modification problems
  – Deletion problems
  – Update problems
  – Insertion problems
IST210 32
List Modification Issues: Delete
Delete: Kate drops IST 230
Problem: Information about Kate AND IST 230 will be lost
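A minimal sketch of this deletion problem, assuming a flat list in which Kate's only row is her IST230 registration and she is the only student in IST230. The other students, the room for IST230, and the helper name `drop` are hypothetical.

```python
# Flat list: one row per registration. Illustrative data only.
flat = [
    {"StudentName": "Ann", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentName": "Ben", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentName": "Kate", "CourseID": "IST230", "Location": "Room B"},
]


def drop(rows, student_name, course_id):
    """Remove the row recording that this student takes this course."""
    return [
        row for row in rows
        if not (row["StudentName"] == student_name and row["CourseID"] == course_id)
    ]


after = drop(flat, "Kate", "IST230")

# Everything the list still knows after the drop.
known_students = {row["StudentName"] for row in after}
known_courses = {row["CourseID"] for row in after}
```

After the drop, neither Kate nor IST230 appears anywhere in the list, although the intent was only to remove the registration.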
IST210 33
List Modification Issues: Update
Update: IST210 location changed
Problem: Need to update multiple rows
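The update problem can be sketched as follows, again on a hypothetical flat list. The helper names `move_course` and `locations_of` and the new room names are made up for illustration.

```python
# Flat list: the IST210 location is stored once per registered student.
flat = [
    {"StudentName": "Ann", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentName": "Ben", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentName": "Kate", "CourseID": "IST230", "Location": "Room B"},
]


def move_course(rows, course_id, new_location):
    """Change a course's location in place and report how many rows were edited."""
    touched = 0
    for row in rows:
        if row["CourseID"] == course_id:
            row["Location"] = new_location
            touched += 1
    return touched


def locations_of(rows, course_id):
    """All locations recorded for a course; more than one means the list disagrees with itself."""
    return {row["Location"] for row in rows if row["CourseID"] == course_id}


# A correct update has to edit every IST210 row (2 here, 40 in a full class).
rows_touched = move_course(flat, "IST210", "Room C")

# If a later change is typed into only one row by hand, the list becomes inconsistent.
partial = [dict(row) for row in flat]
partial[0]["Location"] = "Room D"
conflicting = locations_of(partial, "IST210")
```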
IST210 35
List Modification Issues: Insert
Insert: A new course with no student registered yet
Problem: blank cells for student information
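A short sketch of the insertion problem. The new course IST240, its room, and the helper names are hypothetical. The blank student cells are modeled as `None`.

```python
flat = [
    {"StudentID": 1, "StudentName": "Ann", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentID": 2, "StudentName": "Ben", "CourseID": "IST210", "Location": "IST Room 110"},
    {"StudentID": 3, "StudentName": "Kate", "CourseID": "IST230", "Location": "Room B"},
]

# The only way to record a course nobody has registered for is a row with blank student cells.
flat.append({"StudentID": None, "StudentName": None, "CourseID": "IST240", "Location": "Room C"})


def naive_enrollment(rows, course_id):
    """Count rows for a course, which wrongly counts the placeholder row as a student."""
    return sum(1 for row in rows if row["CourseID"] == course_id)


def real_enrollment(rows, course_id):
    """Count only rows that actually carry a student."""
    return sum(
        1 for row in rows
        if row["CourseID"] == course_id and row["StudentID"] is not None
    )
```

With this data, `naive_enrollment(flat, "IST240")` reports one student while the true count is zero, so every query now has to remember to skip the blank rows.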
IST210 36
A Long List to Several Small Lists
Two themes: Student, Course
INFORMATION LOSS! Registration information is not in Student and Course tables
IST210 37
A Long List to Several Small Lists
PROBLEMS!
One cell does NOT allow multiple values. (IMPORTANT! This rule is strictly enforced in a database.)
Two themes: Student, Course
IST210 38
A Long List to Several Small Lists
Student Entity
Course Entity
Student-Course Relationship
Three themes: two entities and one relationship
IST210 39
A Long List to Several Small Lists
Student
Course
Registration
Key points in splitting:
1. A table must be connected with other table(s) through shared column(s)
   Student (StudentID) Registration
   Course (CourseID) Registration
2. One cell can only have one value
Revisit previous issues:
Insert: A new student not taking any class
Update: IST210 location changed
Delete: Kate drops 230
Use the above criteria to check whether you split the tables correctly!
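The three-table split and the three revisited issues can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions, not the course's own schema: the column set is trimmed to a few fields, and the students Ann, Ben, and Dan, the student IDs, and the room names other than IST Room 110 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the shared columns
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT NOT NULL);
CREATE TABLE Course  (CourseID TEXT PRIMARY KEY, Location TEXT NOT NULL);
-- Registration connects to Student through StudentID and to Course through CourseID.
-- Each cell holds one value: one row per (student, course) pair.
CREATE TABLE Registration (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID  TEXT    NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Ann"), (2, "Ben"), (3, "Kate")])
conn.executemany("INSERT INTO Course VALUES (?, ?)",
                 [("IST210", "IST Room 110"), ("IST230", "Room B")])
conn.executemany("INSERT INTO Registration VALUES (?, ?)",
                 [(1, "IST210"), (2, "IST210"), (3, "IST230")])

# Insert: a new student not taking any class needs no blank course cells.
conn.execute("INSERT INTO Student VALUES (?, ?)", (4, "Dan"))

# Update: the IST210 location is stored once, so exactly one row changes.
rows_updated = conn.execute(
    "UPDATE Course SET Location = ? WHERE CourseID = ?", ("Room C", "IST210")
).rowcount

# Delete: Kate drops IST230; only the registration row goes away.
conn.execute("DELETE FROM Registration WHERE StudentID = ? AND CourseID = ?", (3, "IST230"))
kate_known = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE StudentName = 'Kate'").fetchone()[0]
ist230_known = conn.execute(
    "SELECT COUNT(*) FROM Course WHERE CourseID = 'IST230'").fetchone()[0]

# Registration information is recovered by joining through the shared columns.
ist210_roster = conn.execute("""
    SELECT s.StudentName
    FROM Registration AS r
    JOIN Student AS s ON s.StudentID = r.StudentID
    WHERE r.CourseID = ?
    ORDER BY s.StudentName
""", ("IST210",)).fetchall()
```

In this sketch the update touches a single row, and both Kate and IST230 remain on record after the drop, which is the check the criteria above ask for.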
IST210 40
Exercise
• What are the problems with this table?
• Split it into multiple tables.
• Check whether you split the table correctly.
IST210 41
Next Class on Wednesday
• Introduction to data and database (cont.)
• HTML basics
• Lab exercise after class: IST webspace and a simple personal webpage
IST210 42
Remember
• Complete Programming Skill Survey on ANGEL!