chapter 6: foundations of business intelligence - databases and information management dr. andrew p....

29
Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.

Upload: stewart-weaver

Post on 29-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Chapter 6:Foundations of Business

Intelligence - Databases and Information Management

Dr. Andrew P. Ciganek, Ph.D.

Page 2: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

File Organization Concepts

• Computer system uses hierarchies– Database: Group of related files

– File: Group of records of same type

– Record: Group of related fields• Record: Describes an entity (person, place, thing)

– Field: Group of characters• Attribute: Characteristic describing the entity

– e.g., Date or Grade belong to entity COURSE

Page 3: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

The Data Hierarchy

Page 4: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Problems With Traditional File Environment

• Files maintained by different departments– Data redundancy and inconsistency

• Data redundancy: Duplicate data in multiple files• Data inconsistency: Same attribute, different values

– Program-data dependence• Changes in program requires changes to data

accessed by program

Page 5: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Traditional File Processing

Page 6: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

The Database Approach to Data Management

• Database– Data organized to serve many applications by

centralizing data and controlling redundant data• Database management system (DBMS)

– Separates logical and physical views of data– Solves problems of traditional file environment

• Controls redundancy• Eliminated inconsistency• Enables central management and security

Page 7: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Human Resources Database with Multiple Views

Page 8: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Relational DBMS

• Data as 2-dimension tables called relations or files• Each table contains data on entity and attributes• Table: Grid of columns and rows

– Rows: Records for different entities

– Columns: Represents attribute (field) for entity

– Key field: Field used to uniquely identify each record• Primary key: Field in table used for key fields• Foreign key: Primary key used in second table as

look-up field to identify records from original table

Page 9: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Relational Database Tables

Page 10: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Relational Database Tables

Page 11: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Operations of a Relational DBMS

• Basic operations to develop useful sets of data– SELECT: Creates subset of data of all records that

meet stated criteria

– JOIN: Combines relational tables to provide more information than available in individual tables

Page 12: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Basic Relational DBMS Operations

Select Part_Number = 137 or 150, Join by Supplier_Number

Page 13: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Example of an SQL Query

• Select Statement: Query data for specific info• Conditional Selection: ID which rows of a table are

displayed, based on criteria contained in the WHERE clause

• Joining Two Tables: Used to combine data from two or more tables and display the results

Page 14: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

An Access Query

Page 15: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Designing Databases

• Design process identifies– Relationships among data elements, redundant

database elements

– Most efficient way to group data elements to meet business requirements, needs of app programs

• Normalization– Minimize redundant data elements

Page 16: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Normalization of Order

Page 17: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Using Databases to Improve Performance and Decision Making

• For very large databases and systems, special capabilities and tools are required for analyzing large quantities of data and for accessing data from multiple systems– Data warehousing

– Data mining

Page 18: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Database Warehouses

• Store current and historical data from many core operational transaction systems

• Consolidates and standardizes information for use across enterprise, but data cannot be altered

• Provide query, analysis, and reporting tools

Page 19: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Components of a Data Warehouse

Page 20: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Business Intelligence

• Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions– e.g., Harrah’s Entertainment analyzes customers to

develop gambling profiles and identify most profitable customers

• Principle tools include– Software for database query and reporting

– Online analytical processing (OLAP)

– Data mining

Page 21: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Online Analytical Processing (OLAP)

• Supports multidimensional data analysis– Gives first glimpse of possible relationships

• Enables viewing data using multiple dimensions– Each aspect of information (product, pricing, cost,

region, time period) is different dimension• e.g., How many washers sold in East in June?

• OLAP enables rapid, online answers to ad hoc queries

Page 22: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Multidimensional Data Model

Page 23: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Data Mining

• More discovery driven than OLAP– Finds hidden patterns, relationships in large dbs

– Infers rules to predict future behavior

– The patterns and rules are used to guide decision making and forecast the effect of those decisions

• Popularly used to provide detailed analyses of patterns in customer data for 1:1 marketing campaigns or to identify profitable customers

Page 24: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Using Databases to Improve Performance and Decision Making

• Predictive analysis– Uses data mining techniques, historical data, and

assumptions about future conditions to predict outcomes of events

• e.g., Probability a customer will respond to an offer or purchase a specific product

• Data mining seen as challenge to individual privacy– Combines information from many diverse sources to

create detailed “data image” about each of us • e.g., income, driving habits, hobbies, families, and

political interests

Page 25: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Text Mining:For’ and ‘Against’ Exercise

1. Read the article and the following statement.

2. Summarize the best evidence you can give FOR, or in support of, the statement.

3. Summarize the best evidence you can give AGAINST the statement.

4. Include only accurate evidence

• The benefits of text mining greatly outweigh the costs.

Page 26: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Web Mining

• Discovery and analysis of useful patterns and information from WWW– e.g., to understand customer behavior, evaluate

effectiveness of Web site, etc.

• Web content mining– Knowledge extracted from content of Web pages

• Web structure mining– e.g., links to and from Web page

• Web usage mining– User interaction data recorded by Web server

Page 27: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Managing Data Resources

• Establishing an information policy– Information policy: Specifies firm’s rules, procedures,

roles for sharing, standardizing data

– Data administration: Responsible for specific policies and procedures; data governance

– Database administration: Database design and management group responsible for defining, organizing, implementing, maintaining database

Page 28: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Ensuring Data Quality

• More than 25% critical data in Fortune 1000 company databases is inaccurate or incomplete– Before new database in place, need to identify and

correct faulty data and establish better routines for editing data once database in operation

– Most data quality problems stem from faulty input

Page 29: Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D

Managing Data Resources

• Data quality audit– Structured survey of the accuracy and level of

completeness of the data in an IS

• Data cleansing– Detecting, and correcting data that are incorrect,

incomplete, improperly formatted, or redundant

– Enforces consistency among different sets of data from separate IS