Data Mining and Data Warehousing – a connected view
Post on 21-Dec-2015
Introduction
• Data mining describes a collection of techniques that aim to find useful but undiscovered patterns in collected data
• The goal of data mining is to create models for decision-making that predict future behavior based on analysis of past activity
• Data warehousing is a blend of technologies aimed at the effective integration of operational databases into an environment that enables the strategic use of data. These technologies include relational and multidimensional database management systems, client/server architecture, metadata modeling and repositories, graphical user interfaces, and much more.
Operational vs Informational Databases
Table 2-1 Operational Versus Informational Databases
Table 2-2 Comparison of Data Stores and Data Warehouses
Definition and characteristics of a data warehouse
• It is a database designed for analytical tasks
• It supports a relatively small number of users
• Its usage is read-intensive
• Its content is periodically updated (mostly additions)
• It contains current and historical data
• It contains a few large tables
• Each query frequently results in a large result set and involves frequent full table scans and multi-table joins
• A formal definition of the data warehouse is offered by W.H. Inmon:
– A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management decisions
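The read-intensive, join-heavy query pattern described above can be illustrated with a small sketch. It uses Python's standard `sqlite3` module and an in-memory database; the table names (`sales_fact`, `product_dim`, `date_dim`), the columns, and the sample rows are all hypothetical and chosen only for illustration. They do not come from the slides.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory star schema (hypothetical tables and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE sales_fact (product_id INTEGER, date_id INTEGER, amount REAL);
        """
    )
    conn.executemany(
        "INSERT INTO product_dim VALUES (?, ?)",
        [(1, "books"), (2, "music")],
    )
    conn.executemany(
        "INSERT INTO date_dim VALUES (?, ?)",
        [(10, 2014), (11, 2015)],
    )
    # Historical and current rows live side by side; rows are added, not updated.
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [(1, 10, 20.0), (1, 11, 30.0), (2, 10, 5.0), (2, 11, 15.0), (1, 11, 10.0)],
    )
    return conn


def sales_by_category_and_year(conn):
    """Analytical query: scan the fact table and join it to both dimensions."""
    return conn.execute(
        """
        SELECT p.category, d.year, SUM(f.amount)
        FROM sales_fact AS f
        JOIN product_dim AS p ON p.product_id = f.product_id
        JOIN date_dim AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
        """
    ).fetchall()


if __name__ == "__main__":
    for row in sales_by_category_and_year(build_demo_warehouse()):
        print(row)
```

The query reads every row of the fact table and joins two other tables, which is the access pattern that distinguishes a warehouse workload from the short, update-oriented transactions of an operational database.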
Data warehouse architecture
Figure 2-1 Data Warehouse Environment
Figure 2-2 Data Warehouse and Operational Data Store
Figure 2-3 Two-tiered Data Warehouse Architecture
Figure 2-4 Multi-tiered Data Warehouse Architecture
Data mining defined
• Data mining is the process of discovering meaningful new correlations, patterns, and trends by digging into (mining) large amounts of data stored in a warehouse.
• The major attraction of data mining is its capability to build predictive rather than retrospective models
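To make the distinction concrete, here is a minimal sketch contrasting the two kinds of model. The retrospective function only summarizes what already happened, while the predictive one fits a least-squares line to the history and extrapolates one period ahead. The function names and the sample figures are hypothetical, and a straight-line trend is just one simple stand-in for a predictive model; the slides do not prescribe a specific technique.

```python
def retrospective_summary(history):
    """Describe past activity: totals and averages of what was observed."""
    return {"total": sum(history), "mean": sum(history) / len(history)}


def fit_linear_trend(history):
    """Ordinary least-squares fit of value against period index 0, 1, 2, ..."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(history):
    """Predict the value for the next, not yet observed, period."""
    slope, intercept = fit_linear_trend(history)
    return intercept + slope * len(history)


if __name__ == "__main__":
    monthly_sales = [10.0, 12.0, 14.0, 16.0]  # hypothetical figures
    print(retrospective_summary(monthly_sales))
    print(predict_next(monthly_sales))
```

The summary answers "what happened?", whereas the fitted trend answers "what is likely to happen next?", which is the kind of question the slide attributes to data mining.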
Predictive versus Retrospective Models
Table 2-3 Predictive Versus Retrospective Models
Data Mining Application Domains
• Customer retention
• Sales and customer service
• Marketing
• Risk Assessment and Fraud Detection
Data Mining Categories and Research Focus
• Data mining techniques deal with discovery and learning, and as such fall into three major learning modes: supervised, unsupervised, and reinforcement learning
• Data mining techniques can be categorized by:
– Representation of models and results
– The type of data the techniques operate on
– Application type
– Pattern attributes
• Data mining can be categorized by business problem:
– Retrospective analysis
– Predictive analysis
• These two classes of business problems can be further divided into:
– Classification
– Clustering/Segmentation
– Associations
– Sequencing
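As an illustration of one of these problem types, associations, the sketch below computes the two standard measures for an association rule: support (the fraction of transactions containing an itemset) and confidence (how often the consequent appears when the antecedent does). The function names and the market-basket data are hypothetical and serve only as an example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


if __name__ == "__main__":
    # Hypothetical market-basket data.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

A rule such as "bread implies milk" is usually considered interesting only when both measures exceed thresholds chosen by the analyst.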
• Approaches that underlie most contemporary research in data mining:
– The induction approach
– The database querying approach
– The compression approach
– The approach of approximation and searching