data warehousing, access, analysis, mining, and visualization
-
8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization
CHAPTER 4
Data Warehousing, Access,
Analysis, Mining, and Visualization
-
Data Warehousing, Access, Analysis, Mining, and Visualization
MSS foundation
Many new concepts
Object-oriented databases
Intelligent databases
Data warehouse
Data mining
Online analytical processing
Multidimensionality
Internet / Intranet / Web
-
Data Warehousing, Access, Analysis, and Visualization
What to do with all the data that organizations collect, store, and use? (Information overload!)
Solution
Data warehousing
Data access
Data mining
Online analytical processing (OLAP)
Data visualization
Data sources
-
The Nature and Sources of Data
Data: Raw items, not yet organized to convey meaning
Information: Data organized to convey meaning
Knowledge: Data items organized and processed to
convey understanding, experience, accumulated
learning, and expertise
-
DSS Data Items
Documents
Pictures
Maps
Sound
Animation
Video
Can be hard or soft
-
Data Sources
Internal
External
Personal
-
Data Collection, Problems, and Quality
Problems (Table 4.1)
Quality: determines usefulness of data
Intrinsic data quality
Accessibility data quality
Representation data quality
-
Data Quality Issues in Data Warehousing
Uniformity
Version
Completeness check
Conformity check
Genealogy check (drill down)
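A completeness check and a conformity check of the kind listed here might be sketched as follows. This is only an illustration: the record layout, the field names, and the list of allowed country codes are hypothetical and do not come from the slides.

```python
# Hypothetical customer records sitting in a warehouse staging area.
RECORDS = [
    {"id": 1, "name": "Ada", "country": "US", "balance": 120.0},
    {"id": 2, "name": "", "country": "US", "balance": 75.5},
    {"id": 3, "name": "Lin", "country": "usa", "balance": None},
]

REQUIRED_FIELDS = ("id", "name", "country", "balance")
ALLOWED_COUNTRIES = {"US", "CA", "MX"}  # assumed code list, for illustration


def completeness_errors(record):
    """Return the required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


def conformity_errors(record):
    """Return the fields whose values fall outside the agreed format."""
    errors = []
    if record.get("country") not in ALLOWED_COUNTRIES:
        errors.append("country")
    return errors


def quality_report(records):
    """Map each problematic record id to its quality problems."""
    report = {}
    for record in records:
        problems = {
            "incomplete": completeness_errors(record),
            "nonconforming": conformity_errors(record),
        }
        if problems["incomplete"] or problems["nonconforming"]:
            report[record["id"]] = problems
    return report
```

Calling `quality_report(RECORDS)` flags records 2 and 3 and leaves record 1 out of the report, which is the kind of screening a warehouse load could run before accepting data.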
-
The Internet and Commercial Database Services
For external data
The Internet: major supplier of external data
Commercial Data Banks: sell access to specialized
databases
Can add external data to the MSS in a timely
manner and at a reasonable cost
Decision Support Systems and Intelligent Systems, Efraim Turban and Jay E. Aronson, 6th edition
Copyright 2001, Prentice Hall, Upper Saddle River, NJ
-
The Internet and Commercial Database Servers
Use Web browsers to
Give employees and customers access to vital information
Implement executive information systems
Implement group support systems (GSS)
Database management systems provide data in HTML directly on Web servers
-
Database Management Systems in DSS
DBMS: Software program for entering (or adding)
information into a database; updating, deleting,
manipulating, storing, and retrieving information
A DBMS + modeling language to develop DSS
DBMS to handle LARGE amounts of information
-
Database Organization and Structure
Relational databases
Hierarchical databases
Network databases
Object-oriented databases
Multimedia-based databases
Document-based databases
Intelligent databases
-
Data Warehousing
Physical separation of operational and decision support
environments
Purpose: to establish a data repository making operational data
accessible
Transforms operational data to relational form
Only data needed for decision support come from the TPS
Data are transformed and integrated into a consistent structure
Data warehousing (information warehousing): solves the data
access problem
End users perform ad hoc querying, reporting, analysis, and visualization
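The transformation and integration step can be pictured with a small sketch. The two source systems, their field names, and their data formats are invented for illustration. The point is only that fields not needed for decision support are dropped and the rest are converted to one consistent structure.

```python
from datetime import date

# Two hypothetical operational (TPS) extracts that represent the same
# kind of data in different ways.
ORDERS_SYSTEM = [
    {"cust": "C001", "amount_usd": "150.00", "date": "2001-03-05", "clerk": "jb"},
]
BILLING_SYSTEM = [
    {"customer_id": 1, "cents": 9950, "day": (2001, 3, 7), "terminal": 4},
]


def from_orders(row):
    """Keep only decision-support fields and normalize their types."""
    return {
        "customer_id": int(row["cust"].lstrip("C")),
        "amount": float(row["amount_usd"]),
        "date": date.fromisoformat(row["date"]),
    }


def from_billing(row):
    """Convert cents to currency units and the date tuple to a date."""
    return {
        "customer_id": row["customer_id"],
        "amount": row["cents"] / 100,
        "date": date(*row["day"]),
    }


def load_warehouse():
    """Integrate both sources into one consistent structure."""
    rows = [from_orders(r) for r in ORDERS_SYSTEM]
    rows += [from_billing(r) for r in BILLING_SYSTEM]
    return sorted(rows, key=lambda r: r["date"])
```

After `load_warehouse()` runs, every row has the same three fields with the same types, regardless of which operational system it came from.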
-
Data Warehousing Benefits
Increases knowledge worker productivity
Supports all decision makers' data requirements
Provides ready access to critical data
Insulates operational databases from ad hoc processing
Provides high-level summary information
Provides drill-down capabilities
Yields improved business knowledge
Provides competitive advantage
Enhances customer service and satisfaction
Facilitates decision making
Helps streamline business processes
-
Data Warehouse Architecture and Process
Two-tier architecture
Three-tier architecture
-
Data Warehouse Components
Large physical database
Logical data warehouse
Data mart
Decision support systems (DSS) and executive information system (EIS)
Can feed OLAP
-
Data Marts
-
DW Suitability
For organizations where
Data are in different systems
Information-based approach to management in use
Large, diverse customer base
Same data have different representations in different
systems
Highly technical, messy data formats
-
Characteristics of Data Warehousing
1. Data organized by detailed subject with
information relevant for decision support
2. Integrated data
3. Time-variant data
4. Non-volatile data
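Characteristics 3 and 4 can be illustrated with a small append-only store. This is a sketch under assumptions: the class name `WarehouseTable`, its methods, and the balance data are hypothetical. Rows carry a snapshot date (time-variant) and are only ever added, never updated or deleted (non-volatile).

```python
from datetime import date


class WarehouseTable:
    """Append-only store: rows are loaded with a snapshot date, never changed."""

    def __init__(self):
        self._rows = []  # (snapshot_date, customer_id, balance)

    def load(self, snapshot, customer_id, balance):
        """Add a new snapshot row; existing rows are left untouched."""
        self._rows.append((snapshot, customer_id, balance))

    def as_of(self, when, customer_id):
        """Return the latest balance recorded on or before `when`."""
        matches = [
            r for r in self._rows if r[1] == customer_id and r[0] <= when
        ]
        if not matches:
            return None
        return max(matches, key=lambda r: r[0])[2]

    def history(self, customer_id):
        """Return (snapshot_date, balance) pairs in date order."""
        return sorted((r[0], r[2]) for r in self._rows if r[1] == customer_id)
```

Because old snapshots are kept, a question such as "what was the balance in mid-February?" can still be answered after later loads.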
-
OLAP: Data Access and Mining, Querying, and Analysis
Online analytical processing (OLAP)
DSS and EIS computing done by end-users in online
systems
Versus online transaction processing (OLTP)
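The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. The `sales` table and its rows are made up, and a single in-memory database is used only for brevity. The slides describe physically separating the operational and decision support environments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")


def record_sale(region, product, amount):
    """OLTP-style work: a short transaction that writes one row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (region, product, amount) VALUES (?, ?, ?)",
            (region, product, amount),
        )


def sales_by_region():
    """OLAP-style work: a read-only query that summarizes many rows."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


# Hypothetical transactions captured by the operational system.
for sale in [("East", "A", 100.0), ("East", "B", 50.0), ("West", "A", 70.0)]:
    record_sale(*sale)
```

`record_sale` touches one row at a time, while `sales_by_region` scans the whole table to produce a summary for the decision maker.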
-
OLAP Activities
Generating queries
Requesting ad hoc reports
Conducting statistical and other analyses
Developing multimedia applications
-
OLAP uses the data warehouse and a set of tools, usually with multidimensional capabilities
Query tools
Spreadsheets
Data mining tools
Data visualization tools
-
Using SQL for Querying
SQL (Structured Query Language)
Data language
English-like, nonprocedural, very user friendly
language
Free format
Example:
SELECT Name, Salary
FROM Employees
WHERE Salary > 2000
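To try the query, one option is Python's built-in `sqlite3` module with an in-memory database. The employee rows are made up for illustration, and an `ORDER BY` clause is added only to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees (Name, Salary) VALUES (?, ?)",
    [("Avery", 1800), ("Blake", 2400), ("Casey", 3100)],  # made-up rows
)

# The slide's query, with ORDER BY added for a stable result order.
rows = conn.execute(
    "SELECT Name, Salary FROM Employees WHERE Salary > 2000 ORDER BY Name"
).fetchall()

for name, salary in rows:
    print(name, salary)
```

The query states what is wanted (names and salaries above 2000) without saying how to search the table, which is what "nonprocedural" means here.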
-
Data Mining for
Knowledge discovery in databases
Knowledge extraction
Data archeology
Data exploration
Data pattern processing
Data dredging
Information harvesting
-
Major Data Mining Characteristics and Objectives
Data are often buried deep
Client/server architecture
Sophisticated new tools, including advanced visualization tools, help to remove the information "ore"
End-user miner empowered by data drills and other power query tools with little or no programming skills
Often involves finding unexpected results
Tools are easily combined with spreadsheets, etc.
Parallel processing for data mining
-
Data Mining Application Areas
Marketing
Banking
Retailing and sales
Manufacturing and production
Brokerage and securities trading
Insurance
Computer hardware and software
Government and defense
Airlines
Health care
Broadcasting
Law enforcement
-
Intelligent Data Mining
Use intelligent search to discover information within data warehouses that queries and reports cannot effectively reveal
Find patterns in the data and infer rules from them
Use patterns and rules to guide decision making and forecasting
Five common types of information that can be yielded by data mining: 1) association, 2) sequences, 3) classifications, 4) clusters, and 5) forecasting
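As an illustration of the first type, association, the sketch below counts item pairs that occur together and scores a candidate rule by support and confidence. The baskets are hypothetical, and this brute-force counting is only a teaching sketch, not the method of any particular data mining tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    has_antecedent = sum(1 for b in baskets if antecedent in b)
    support = both / len(baskets)
    confidence = both / has_antecedent if has_antecedent else 0.0
    return support, confidence
```

Here `rule_stats(BASKETS, "bread", "milk")` reports that half of all baskets contain both items and that two of the three bread baskets also contain milk. A rule with high support and confidence is the kind of pattern that could then guide a decision.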
-
Main Tools Used in Intelligent Data Mining
Case-based Reasoning
Neural Computing
Intelligent Agents
Other Tools
Decision trees
Rule induction
Data visualization
-
Data Visualization and Multidimensionality
Data Visualization Technologies
Digital images
Geographic information systems
Graphical user interfaces
Multidimensions
Tables and graphs
Virtual reality
Presentations
Animation
-
Multidimensionality
3-D + Spreadsheets (OLAP has this)
Data can be organized the way managers like to see them, rather than the way that the system analysts do
Different presentations of the same data can be arranged easily and quickly
Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
Measures: money, sales volume, head count, inventory profit, actual versus forecast
Time: daily, weekly, monthly, quarterly, or yearly
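The idea of viewing one measure along different dimensions can be sketched in a few lines. The fact rows and the dimension names are hypothetical. Real OLAP tools precompute and store such summaries, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales).
FACTS = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 80),
    ("Widget", "East", "Q2", 120),
    ("Gadget", "East", "Q1", 60),
    ("Gadget", "West", "Q2", 90),
]
DIMENSIONS = ("product", "region", "quarter")


def summarize(facts, *dims):
    """Total the sales measure over the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)
```

`summarize(FACTS, "product")` gives sales by product, `summarize(FACTS, "region", "quarter")` gives a region-by-time view of the same data, and `summarize(FACTS)` gives the grand total. The data stay the same and only the presentation changes.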
-
Multidimensionality Limitations
Extra storage requirements
Higher cost
Extra system resource and time consumption
More complex interfaces and maintenance
Multidimensionality is especially popular in
executive information and support systems
-
Geographic Information Systems (GIS)
A computer-based system for capturing, storing, checking, integrating, manipulating, and displaying data using digitized maps
Spatially-oriented databases
Useful in marketing, sales, voting estimation, planned
product distribution
Available via the Web
Can use with GPS
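One simple spatial query, finding the store nearest to a GPS position, might look like the sketch below. The store names and coordinates are hypothetical, and the haversine great-circle formula is a common choice for this kind of distance, not something the slides prescribe.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical store locations as (latitude, longitude) in degrees.
STORES = {
    "Downtown": (40.7128, -74.0060),
    "Airport": (40.6413, -73.7781),
    "Uptown": (40.8610, -73.9390),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_store(position):
    """Return the name of the store closest to a GPS position."""
    return min(STORES, key=lambda name: haversine_km(position, STORES[name]))
```

A GIS would typically combine such a distance query with a digitized map for display. The sketch covers only the spatial lookup.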
-
Virtual Reality
An environment and/or technology that provides
artificially generated sensory cues sufficient to
engender in the user some willing suspension of
disbelief
Can share data and interact
Can analyze data by creating a landscape
Useful in marketing, prototyping aircraft designs
VR over the Internet through VRML
-
Business Intelligence on the Web
Can capture and analyze data from the Web
Tools deployed on the Web
-
Summary
Data for decision making come from internal and external sources
The database management system is one of the major components of most management support systems
Familiarity with the latest developments is critical
Data contain a gold mine of information if organizations can dig it out
Organizations are warehousing and mining data
Multidimensional analysis tools and new enterprise-wide system architectures are useful
OLAP tools are also useful
-
Summary (contd.)
New data formats for multimedia DBMS
Internet and intranets via Web browser
interfaces for DBMS access
Built-in artificial intelligence methods in DBMS