csci 1200 introduction to computing for non majors chapter 7 tami meredith, ph.d....
Post on 18-Jan-2016
215 Views
Preview:
TRANSCRIPT
CSCI 1200Introduction to Computing for
Non Majors
CHAPTER 7
Tami Meredith, Ph.D.tami.meredith@live.com
Chapter 7 Objectives Explain what a database is and
describe its basic structure Identify the kinds of problems that can
be best solved with database software Describe different kinds of database
software Describe database operations for
storing, sorting, updating, querying, and summarizing information
2
The Google Guys Searchfor Tomorrow Handles hundreds of
millions of queries a day
Launched by two Ph.D. students (Larry Page and Sergey Brin)
In 2004, Google offered shares to the public
Google is an example of a "text database"
3
Get to know your search engines
Be specific when you search
Ask questions Use quotes to
narrow your search Know your pluses
and minuses
More is usually better than less
Refine your queries and try again – search is a process, not an action
Explore keywords Dig, don't just look
at the first page of results
4
Web Search
What is a Database?
Database: an organized or structured collection of information/data that can easily be accessed (retrieval/search), managed, and updated
Databases make it easier to:› Store large quantities of information› Retrieve information quickly and flexibly › Organize and reorganize information › Print and distribute information in a variety of
ways
5
Types of DatabasesMedia Databases: music, movie, and photo
libraries – often based on keywords, annotations, and meta information
PIM (personal information manager): electronic organizer
GIS (geographical information systems): include geographic and demographic data in map form
Web databases: large scale textual databases for HTML and related content – also consider the tags as well as the content
Relational Databases: General purpose systems based on SQL
Custom systems for "Big Data"6
Database Anatomy
Database: organized collection of information stored in a computer
Database program: software tool for organizing storage and retrieval of that information
7
Database Anatomy
Table: collection of related information› Like a single page of a spreadsheet
Record: a single row in a table› information related to one person, product, or event› Each discrete chunk of information in a record is a
field Field: A cell in a table (intersection of a row and
a column)› holds different types of data: numeric (height,
income), text (title, name), structured (date, address), computed (age), reference (parent, sibling)
8
Database Operations Import: receive data in some form (often as text or
CSV) Add: New records Update: Change existing records Delete: Unneeded records Browse: navigate through information (not very useful) Query: find records that match specific criteria (search
and retrieve) Sort: rearrange records (alphabetically or numerically) Print: reports, labels, and form letters Export: to another program
9
Database Access
Structured Query Language (SQL): standard language for programming complex queries› Graphical user interfaces allow point-and-click queries
SQL statements are understood by MS Access, MS SQL Server, DB2, Oracle, Sybase, MySQL and others.
SQL is a sublanguage that:› Uses high-level, easy-to-understand statements› Has a relational database orientation› Is portable across a wide range of systems
10
Customers: A Simple Table
ID CustomerName
ContactName
Address City PostCode
Country
1 Alfreds Futterkiste
Maria Anders Obere Str. 57 Berlin 12209 Germany
2 Ana Trujillo Emparedados y helados
Ana Trujillo Avda. de la Constitución 2222
México D.F.
05021 Mexico
3 Antonio Moreno Taquería
Antonio Moreno Mataderos 2312 México D.F.
05023 Mexico
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1 1DP UK
5 Berglunds snabbköp
Christina Berglund
Berguvsvägen 8 Luleå S-958 22 Sweden
11
SQL Queries
Describe desired result – let the system figure out how to get the data for you
You need to know how the tables are structured or you'll be unable to do anything
Format:SELECT (which columns) FROM (which table) WHERE (conditions that apply)
Example:SELECT CustomerName FROM CustomersWHERE Country=`Mexico`;
SELECT: Notes
Conditions use: =, <, <=, >, >=, <> Conditions combined using AND, OR,
NOT Domain-specific conditions also exist: e.g.,
LIKE SQL implements multi-sets, duplicate
records are possible unless DISTINCT is used
SELECT * identifies all columnsE.g., SELECT * FROM STUDENTS;
Simple SQL Queries
SELECT * FROM Customers; SELECT CustomerName, City FROM
Customers; SELECT DISTINCT City FROM
Customers; SELECT * FROM Customers
WHERE Country='Mexico'; SELECT * FROM Customers WHERE
ID=1;14
SQL Queries with AND & OR
SELECT * FROM CustomersWHERE Country='Germany'AND City='Berlin';
SELECT * FROM CustomersWHERE City='Berlin'OR City='London';
SELECT * FROM CustomersWHERE Country='Germany'AND (City='Berlin' OR City='München');
15
Sorting Results in SQL
SELECT * FROM CustomersORDER BY Country;
SELECT * FROM CustomersORDER BY Country, CustomerName;
SELECT * FROM CustomersORDER BY Country ASC, CustomerName DESC;
16
Joins
Getting and merging data from two tables
The reason that Databases can do so much more than a spreadsheet
Done with multiple FROM clausesSELECT EMPLOYEE.Name , DEPARTMENT.LocationFROM EMPLOYEE, DEPARTMENT WHERE EMPLOYEE.Dnumber = DEPARTMENT.Dnumber;
Complex SQL
Basic SQL will do a lot, but to handle industrial tasks, SQL is a lot more complex
Entire "programs" can be written in SQL
Used in conjunction with other programming languages
E.g., SQL + PHP is the basis of most WWW content management systems (CMS) 18
Pattern Matching
SELECT Fname FROM EMPLOYEE WHERE Address LIKE `%Houston%`;
% Matches zero or more characters SELECT Fname FROM EMPLOYEE WHERE
PostalCode LIKE `B_____`; _ matches one character SELECT Fname FROM EMPLOYEE WHERE
(Salary BETWEEN 30000 AND 50000); BETWEEN = ((S >= 30000) AND (S
<=50000))
Aliasing
SELECT E.Name FROM EMPLOYEE AS E WHERE E.Age > 30;
SELECT E.Name FROM EMPLOYEE E WHERE …
INSERT INTO `menu` (`parentid`,`display`,`link`,`module`,`target`,`status`,`menuid`) VALUES((SELECT `id` FROM (SELECT * FROM `menu`) AS
something WHERE `display`='Projects'),'EMWerx',(SELECT `id` from `content_pages` WHERE `name`='EMWerx'),'Content','same','1','1');
Operations
SELECT .15*Salary AS TaxRate FROM EMPLOYEE;
Operations include: +, -, * , / for numbers
| | is string concatenation +,- increment or decrement a date,
time, or timestamp by an interval - of two times, dates, or timestamps
yields an interval
Nested Queries
IN returns TRUE if a value is a member of a multiset
SELECT DISTINCT PnumberFROM PROJECTWHERE Pnumber IN(SELECT Pnumber FROM PROJECT, DEPARTMENT,
EMPLOYEE WHERE Dnum=Dnumber AND Mgr_ssn = SSN AND Lname=`Smith`);
ALL, ANY (also SOME) also work on multisets SELECT Lname FROM EMPLOYEE WHERE
Salary > ALL (SELECT Salary FROM EMPLOYEE WHERE Dno=5);
Evaluation
Unqualified attributes refer to the innermost nested query, use an alias to rename a result and refer to its attributes for outer queries
If a WHERE clause contains a nested query the inner query is evaluated once for each outer query
Exists
Returns TRUE if there is at least one tuple/row in the argument
Can be used with NOT SELECT Lname FROM EMPLOYEE
WHERE NOT EXISTS (SELECT Pnumber FROM PROJECT);
UNIQUE returns TRUE if a multiset is a set --- i.e., has no duplicates
Explicit Joins
Possible to move a join condition to a FROM clause for making queries clearer
SELECT Fname FROM(EMPLOYEE JOIN DEPARTMENT ON
Dno=Dnumber)WHERE Dname=`Research`;
Aggregate Functions
COUNT, SUM, MIN, MAX, AVG SELECT AVG(SALARY), MIN(SALARY),
MAX(SALARY) FROM EMPLOYEE; SELECT COUNT(*) FROM EMPLOYEE,
DEPARTMENT WHERE Dno=Dnumber AND Dname<>‘Research’;
SELECT COUNT (DISTINCT Salary) FROM EMPLOYEE;
Database-Management Systems
Database Management Systems (DBMS) help manage the tables of a database
Conciseness: redundancy wastes space and may cause inaccuracies if duplicates not updated at the same time
Correctness: requires all tables to contain the correct information
Completeness: all tables should contain all available information
27
3 Valued Logic Databases do not use traditional
"Boolean" logic based on True and False
They add in a third value, "Unknown," to handle missing data values
Adds a lot of complexity to database systems but also a lot of power and flexibility to prevent errors due to unknown/missing values
RDBMS What makes a database relational?
› Tables related› Information spans a single table› Complex relationships are possible› Data may have attributes & values
29
Database Trends Downsizing and decentralizing
› Client/server approach: uses requests to access database
› Distributed databases: spread data across networks Data mining: discovery of hidden predictive
information› Helps to predict trends and patterns in data› Not always correct, use with care
Data scrubbing: Repairing incorrect (dirty) data Data warehouses: large, centralized systems
housing enterprise data
30
Fads Object-oriented databases: store objects
› Every object is an instance of a class.› The class includes the data and the operations› Not been as successful as hoped
Multidimensional database technology stores data in more than two dimensions› Data is organized in cubes› More hype and terminology than innovation
(basic theory always supported this idea)
31
Database Trends
Intelligent searches› More artificial
intelligence technology
› Natural language queries
› Contextual search tools will classify results
› Clustering of results by similarity
32
Database Management
Design and creation of a database is a highly specialised task
Database administration is an IT career SQL seems simple, but yet, is very
complex to fully master Companies are highly dependent on
their databases and can fail if errors are made
33
Get a professional, it's easy to screw up and the costs are very high if you do
Choose the right tool for the job
Knowing how you'll do retrieval is the key to successful design
Start with a plan; be prepared to change your plan
Make your data:› consise› complete› correct
Databases are only as good as their data
Query with care Too much data is
better than too little
34
Database Tips
"Identity Theft" The new bogeyman Applies to any case of someone accessing your
accounts as "you" (e.g., boy/girlfriend using your bank card, cell phone, etc.)
Not as common as media makes it out to be Can be triggered by being careless online or by
errors in judgment Usually just a simple case of someone using
your account after you share access information No different to lending your bank card to the
wrong person
35
Privacy Concerns
Record matching uses a unique field to combine information from different databases
Privacy is decreasing› Data can be transmitted almost instantly› Workplace monitoring› Surveillance cameras
36
Don’t give out specifically identifying information: SIN, DoB, Driver's license number
Minimise other information you distribute
Be stingy with your email address – though spam filtering and multiple addresses make this less important now
Say no to sharing your personal information
Think before you post – everything lasts forever
Pay attention to privacy preferences
If you think there’s. incorrect or damaging information about you in a file, find out
Don't believe companies – they can be infiltrated, use dubious techniques
37
Privacy Concerns
Summary Database programs enable users to store,
organize, retrieve, communicate, and manage large amounts of information
There are many specialized databases such as media libraries, GIS, and PIMs
Organizations are moving toward a client/server approach that enables users to have access to data stored in servers throughout the organization’s network
The accumulation of data by government agencies and businesses is a growing threat to our right to privacy
38
top related