database and data warehouse october 8, 2015. 2 learning goals explain basic concepts of data...

33
Database and Data Warehouse October 8, 2015

Upload: aileen-wiggins

Post on 04-Jan-2016

220 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Database and Data Warehouse

October 8, 2015

Page 2: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

2

LEARNING GOALSExplain basic concepts of data management.

Describe traditional file systems and identify their problems.

Define database management systems and describe their various functions.

Explain how the relational database model works.

Explain Object-Oriented databases.

Explain Data Warehouse, Data Mart

Page 3: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

3

Mini case: Bank Accounts’ dataWhat basic data/info are required for opening

bank account?– – – – – – – – – – – – – – –

Page 4: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

4

What is a database?Collection of related files containing records on

entities like people, places, events, things, etc.Databases make data easy to access and

manage.

Customers Info Accounts Info Employees Info

Access and Management tools

Page 5: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

5

Basic Concepts of Data Management

Database: Collection of data organized in different containers

Table 1 Table 2 Table 3

ReportForm 1

Acc #:_______Name:_______

Page 6: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

6

Basic Concepts of Database systems

Table– Two-dimensional structure composed of rows and columns

Field– Like a column in a spreadsheet

Field name– Like a column name in a spreadsheet– Examples: AccountID, Customer, Type, Balance

Field values– Actual data for the field

Record– Set of fields that describe an entity (a person, an account, etc.)

Primary key – A field, or group of fields, that uniquely identifies a record

AccountID Customer Type Balance

660001 John Smith Checking $120.00

660002 Linda Martin Saving $9450.00

660003 Paul Graham Checking $3400.00

Accounts table

Each table has:

Fields Records 1 Primary key

Page 7: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

7

Basic Concepts in Data Management A Primary key could be a single field like in these tables

AccountID Customer Type Balance

660001 John Smith Checking $120.00

660002 Linda Martin Saving $9450.00

660003 Paul Graham Checking $3400.00

Primary key

Primary key could be a composite key, i.e. multiple fields

Page 8: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

8

Traditional File SystemsEarly attempt to computerize manual filling systemSystem of files that store groups of records used by a particular

software applicationSimple but with a cost

– Inability to share data– Inadequate security– Difficulties in maintenance and expansion– Allows data duplication (e.g. redundancy)

Application 1

Program 1

File 1

File 2

File 3

Program 2

File 1

File 2

File 3

Application 2

Program 1

File 1

File 2

File 3

Program 2

File 1

File 2

File 3

Page 9: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

9

Traditional File System AnomaliesInsertion anomaly– Data needs to be entered more than once if located

in multiple file systemsModification anomaly– Redundant data in separate file systems– Inconsistent data in your system

Deletion anomaly– Failure to simultaneously delete all copies of

redundant data– Deletion of critical data

Page 10: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Database AdvantagesDatabase advantages from a business

perspective include:– Ease of data insertion• Example: can insert a new address once; and the

address is updated in all forms, reports, etc.

– Increased flexibility• Handling changes quickly and easily

– Increased scalability and performance• Scalability: how the DB can adapt to increased demand

– Reduced information redundancy & inconsistency– Increased information integrity (quality)

• Can’t delete a record if related info is used in other container

– Increased information security

Page 11: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Common Database software– – – – – – – – – – – –

Page 12: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

12

Types of DBMSsDesktop– Designed to run on desktop computers– Used by individuals or small businesses– Requires little or no formal training– Does not have all the capabilities of larger DBMSs– Examples: Microsoft Access, FileMaker

Desktop

Server / Enterprise

Handheld

Page 13: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

13

Types of DBMSs (Cont.)

Server / Enterprise– Designed for managing larger and complex databases by

large organizations– Typically operate in a client/server setup– Either centralized or distributed

• Centralized – all data on one server– Easy to maintain– Prone to run slowly when many simultaneous users– No access if the one server goes down

• Distributed – each location has part of the database– Very complex database administration– Usually faster than centralized– If one server crashes, others can still continue to operate.

– Examples: Oracle Enterprise, DB2, Microsoft SQL Server

Page 14: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

14

Types of DBMSs (Cont.)

Handheld– Designed to run on handheld devices– Less complex and have less capabilities than

Desktop or Server DBMSs– Example: Oracle Database Lite, IBM’s DB2

Everywhere.

Page 15: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

15

Database Management System (DBMS)Combination of software and data for– Collecting, storing and managing data in a database

environment.A DBMS includes:– Database– Database engine (for accessing and modifying the DB content)– Data Manipulation Language

Application 1Program-1 Program-2

Application 2Program-1 Program-2

DBMS

Page 16: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Software through which users and application programs interact with a database

Database Management System (DBMS)

Page 17: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

17

DBMS FunctionsCreate database structure (tables, relationships, schema, etc.)Transform data into information (reports, ..)Provide user with different logical views of actual database

contentProvide security: password authentication, access control

– DBMSs control who can add, view, change, or delete data in the database

ID Name Amt01 John 23.0002 Linda 3.0003 Paul 53.00

Physical viewID Name02 Linda

Name Amt Paul 53.00

ID Name Amt01 John 23.0002 Linda 3.00

Logical views

Page 18: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

18

DBMS Functions (cont.)

Allowing multi-user access with control– Control concurrency of access to data– Prevent one user from accessing data that has not

been completely updated• When selling tickets online, Ticketmaster allows you to

hold a ticket for only 2 minutes to make your purchase decision, then the ticket is released to sell to someone else – that is concurrency control

Page 19: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

19

Database Models

Database model = a representation of the relationship between structures (e.g. tables) in a database

Common database models– Flat file model– Relational model (the most common, today)– Object-oriented database model– Hierarchical model– Network model

Page 20: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

20

Flat File Database model Stores data in basic table structures No relationship between tables Used on PDAs for address book

Page 21: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

21

Relational Database ModelMultiple two-dimensional tables related by common fieldsUses controlled redundancy to create fields that provide linkage

relationships between tables in the database– These fields are called foreign keys – the secret to a relational

database– A foreign key is a field, or group of fields, in one table that is

the primary key of another table Handles One-to-Many and One-to-One relationships

Page 22: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

22

Object-Oriented Database modelNeeded for multimedia applications that

manage images, voice, videos, graphics, etc.Used in conjunction with Object-oriented

programming languagesSlower compared to relational DBMS for

processing large volume of transactionsHybrid object-relational Databases are

emerging

Page 23: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Hierarchical Database ModelData is organized into a tree-like structure using parent-

child relationships.Created in the 1960s by IBMLimited to storing data in One-to-Many relationships– One parent segment to many child segments

Not very flexibleExamples: IBM’s Information Management System (IMS)

and Windows registry.

Page 24: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Network Database modelDeveloped in 1969Many-to-Many relationships between entitiesAny record may be linked to any other recordHighly flexible but also highly complexHard to maintainRarely used today

Page 25: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Data Warehouse

a logical collection of information gathered from many different data sources

Stores current and historical data supports business analysis activities and decision-

making tasks

The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes

25

Page 26: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

26

Data Warehouse FundamentalsMany organizations need internal, external, current,

and historical dataData Warehouse are designed to, typically, store

and manage data from operational transaction systems, Web site transactions, external sources, etc.

Page 27: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

27

Data Warehouse: A Multi-Tiered ArchitectureData Warehouse: A Multi-Tiered Architecture

DataWarehouse

ExtractTransformLoadRefresh

OLAP Engine

AnalysisQueryReportsData mining

Monitor&

IntegratorMetadata

Data Sources Front-End Tools

Serve

Data Marts

Operational DBs

Othersources

Data Storage

OLAP Server

Page 28: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Multidimensional Analysis Data mining – the process of analyzing data to extract

information not offered by the raw data aloneData-mining tools use a variety of techniques (fuzzy-

logic, neural networks, intelligent agents) in order to find patterns and relationships in large volumes of dataand infer rules that predict future behavior and guide

decision makingOther analytical tools: query tools, statistical tools, etc.

used toAnalyze data, determine relationships, and test hypotheses

about the data

28

Page 29: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Data Warehouse Fundamentals

Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse.

Page 30: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Information Cleansing or Scrubbing Organizations must maintain high-quality data

in the data warehouseInformation cleansing or scrubbing

– a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information

– first, occurs during ETL. Then, when the data is in the Data Warehouse using Information cleansing or scrubbing tools.

30

Page 31: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

31

Data MartSubset of data warehouses that is highly focused

and isolated for a specific population of usersExample: Marketing data mart, Sales data mart, etc.

Page 32: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

Database vs. Data WarehouseDatabases contain information in a series of

two-dimensional tablesIn a Data Warehouse and data mart,

information is multidimensional, it contains layers of columns and rows

32

Date

Produ

ct

Cou

ntr

y

sum

sum TV

VCRPC

1Qtr 2Qtr 3Qtr 4Qtr

U.S.A

Canada

Mexico

sum

Total annual salesof TV in U.S.A.

Page 33: Database and Data Warehouse October 8, 2015. 2 LEARNING GOALS  Explain basic concepts of data management.  Describe traditional file systems and identify

33

Summary QuestionsNotes

1) What is a database, a table, a field, a record, a primary key, a composite key?

2) What are the problems with traditional file systems?

3) What are the major functions of a DBMS?

4) (a) Name some Desktop DBMSs. (b) Name some Enterprise DBMSs. (c) Handheld DBMSs

5) Describe hierarchical database model, network model

6) What are the differences between Flat File, Relational, and Object-oriented database models?

7) What is Data warehouse? Data Mart?

8) What is Extraction, transformation, and loading (ETL)? What is data-mining? What is Information cleansing or scrubbing?