Data Mining and Warehousing

A Paper Presentation on  Presented by

Upload: raghavendra-raghav

Post on 06-Apr-2018


Page 2: (5)Data Mining and Ware Housing

8/3/2019 (5)Data Mining and Ware Housing

http://slidepdf.com/reader/full/5data-mining-and-ware-housing 2/16

B.YOGI REDDY K.SREEHARSHA

06471A0508 06471A1235

EMAIL: [email protected] EMAIL:[email protected]

(3/4) B.Tech (3/4)B.Tech

Mobile No:9490847674

 

NARASARAOPETA ENGINEERING COLLEGE  Kotappakonda Road,Yellamanda(P.O), NARASARAOPET.

 


 

Abstract

One may claim that the exponential growth in the amount of data provides great opportunities for data mining. In many real-world applications, the number of sources over which this information is fragmented grows at an even faster rate, resulting in barriers to widespread application of data mining. A data warehouse is designed especially for decision support queries.

Data warehousing is the process of extracting and transforming operational data into informational data and loading it into a central data store or warehouse.

The idea behind data mining, then, is the "non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data."

Data mining is concerned with the analysis of data and the use of software techniques for finding patterns and regularities in sets of data. The potential of data mining can be enhanced if the appropriate data has been collected and stored in a data warehouse.

Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data.

This paper also explains a partition algorithm to discover all requirements sets from the data warehouse using data mining. It also explains the relation between operational data, the data warehouse and data marts.

Content Overview

Introduction
Warehouse with a database
What is Data Warehousing?
Warehousing Functions
Architecture of Data Warehouse
What is Data Mining?
Warehousing and Mining
Data Mining as a Part of Knowledge Discovery
Goals of Data Mining and Knowledge Discovery
Compendium

 

Introduction

“Knowledge [no more Information] is not only power, but also has significant competitive advantage”

Organizations have lately realized that just processing transactions and/or information faster and more efficiently no longer provides them with a competitive advantage vis-à-vis their competitors for achieving business excellence. Information technology (IT) tools that are oriented towards knowledge processing can provide the edge that organizations need to survive and thrive in the current era of fierce competition. The increasing competitive pressures and the desire to leverage information technology techniques have led many organizations to explore the benefits of a new emerging technology, viz. "Data Warehousing and Data Mining". What is needed today is not just the latest information, updated to the nanosecond, but cross-functional information that can support decision making as an "on-line" process.

 

Evolution of Information Technology Tools

The evolution of information systems is characterized by a shift from data maintenance systems to systems that transform the data into "information" for use in the decision-making process. These systems supported the acquisition of information from the database of transactional data. They could not directly reveal the new patterns that emerge in a changing scenario; the planner was expected to infer these from experience.

Figure: The Transformation of Data into Knowledge and associated tools (data, information, knowledge; transaction processing, management information, on-line analytical processing tools, data mining tools).

 

What is Data Warehousing?

The data warehouse makes an attempt to figure out "what we need" before we know we need it.

What is it, actually?

* A data warehouse stores current and historical data.

* This data is taken from various, perhaps incompatible, sources and stored in a uniform format.

* Several tools transform this data into meaningful business information for the purpose of comparisons, trends and forecasting.

* Data in a warehouse is not updated or changed in any way, but is only loaded and accessed later on.

* Data is organized according to subject instead of application.

In general, a database is not a data warehouse unless it has the following two features:

• It collects information from a number of different, disparate sources and is the place where this disparity is reconciled, and

• It allows several different applications to make use of the same information.
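The first feature, reconciling disparate sources into one uniform format, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source layouts (`billing_rows`, `web_rows`), their field names and the `SaleRecord` type are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class SaleRecord:
    """Uniform warehouse format shared by every consuming application."""
    customer_id: str
    sale_date: date
    amount: float  # always stored in whole currency units


def from_billing(row: dict) -> SaleRecord:
    # Hypothetical source 1: ISO dates, amounts already in currency units.
    return SaleRecord(
        customer_id=row["cust"].strip().upper(),
        sale_date=date.fromisoformat(row["date"]),
        amount=float(row["amount"]),
    )


def from_web_shop(row: dict) -> SaleRecord:
    # Hypothetical source 2: day/month/year dates, amounts in cents.
    return SaleRecord(
        customer_id=row["customerId"].strip().upper(),
        sale_date=datetime.strptime(row["when"], "%d/%m/%Y").date(),
        amount=row["cents"] / 100,
    )


billing_rows = [{"cust": "c001", "date": "2008-03-14", "amount": "120.50"}]
web_rows = [{"customerId": " C001 ", "when": "15/03/2008", "cents": 9950}]

# The disparity is reconciled once, at load time.
warehouse = [from_billing(r) for r in billing_rows] + [from_web_shop(r) for r in web_rows]
```

Because every record ends up as a `SaleRecord`, any number of applications can read the same list without knowing which source a row came from, which is the second feature.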

Conceptually, a data warehouse looks like this:

Information Sources always include the core operational systems which form the backbone of day-to-day activities. It is these systems which have traditionally provided management information to support decision making.

Decision Support Tools are used to analyze the information stored in the warehouse, typically to identify trends and new business opportunities.

The Data Warehouse itself is the bridge between the operational systems and the decision support tools. It holds a copy of much of the operational system data in a logical structure which is more conducive to analysis. The Data Warehouse, which will be refreshed in scheduled bursts from operational systems and from relevant external data sources, provides a single, consistent view of corporate data, leaving operational systems unaffected.
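A refresh "in scheduled bursts" can be sketched as follows. This is a minimal illustration, assuming an incrementing order id is available as a high-water mark; the table names `orders` and `order_facts` and the function `refresh_warehouse` are hypothetical, and two in-memory SQLite databases stand in for the real systems.

```python
import sqlite3


def refresh_warehouse(operational: sqlite3.Connection,
                      warehouse: sqlite3.Connection,
                      last_loaded_id: int) -> int:
    """Copy rows newer than last_loaded_id; return the new high-water mark."""
    rows = operational.execute(
        "SELECT id, customer, amount FROM orders WHERE id > ? ORDER BY id",
        (last_loaded_id,),
    ).fetchall()
    # The warehouse is only appended to; existing rows are never updated.
    warehouse.executemany(
        "INSERT INTO order_facts (order_id, customer, amount) VALUES (?, ?, ?)",
        rows,
    )
    warehouse.commit()
    return rows[-1][0] if rows else last_loaded_id


# Two in-memory databases stand in for the operational system and the warehouse.
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE order_facts (order_id INTEGER, customer TEXT, amount REAL)")

operational.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, "A", 10.0), (2, "B", 25.0)])
mark = refresh_warehouse(operational, warehouse, 0)      # first burst loads orders 1 and 2
operational.execute("INSERT INTO orders VALUES (3, 'A', 5.0)")
mark = refresh_warehouse(operational, warehouse, mark)   # second burst loads only order 3
```

The operational database is only read from, so day-to-day processing is left unaffected, while the warehouse accumulates a copy that analysis tools can query freely.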

Data Warehouse Functions

The main function of a data warehouse is to get the enterprise-wide data into a format that is most useful to end users, regardless of their locations. Data warehousing is used for:

* Increasing the speed and flexibility of analysis.

* Providing a foundation for enterprise-wide integration and access.

* Improving or re-inventing business processes.

* Gaining a clear understanding of customer behavior.

Data Warehouse Architecture

Each implementation of a data warehouse is different in its detailed design (a schematic high-level view of the architecture and its components is given in the figure below), but all are characterised by a handful of the following key components:

• A data model to define the warehouse contents.

• A carefully designed warehouse database, whether hierarchical, relational, or multidimensional. While choosing a DBMS, it must be kept in view that the database management system should be powerful enough to handle huge amounts of data running up to terabytes.

• A front end for a Decision Support System (DSS) for reporting and for structured and unstructured analysis.

Schematic view of the Data Warehouse Architecture.

 Data Mining

Database mining or data mining (DM), formally termed Knowledge Discovery in Databases (KDD), is a process that aims to use existing data to discover new facts and to uncover new relationships previously unknown even to experts thoroughly familiar with the data. It is like extracting precious metal (say, gold) and/or gems, hence the term "mining". It is based on filtration and assaying of a mountain of data "ore" in order to get "nuggets" of knowledge.
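As a concrete illustration of sifting data "ore" for "nuggets", the sketch below counts which pairs of items occur together in at least a minimum number of transactions. Counting co-occurring pairs is one simple technique chosen here for illustration, not a method described in this paper; the function name `frequent_pairs` and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so ("bread", "milk") and ("milk", "bread") count as one pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
pairs = frequent_pairs(baskets, min_support=3)
```

With these four baskets only bread and milk appear together three times, so that pair is the single relationship that survives the filter.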

Figure (data warehouse architecture): legacy database, operational database and external data source feed the extract, transform and maintain steps into the data warehouse and its metadata, which serve query and reporting tools, multi-dimensional analysis tools, other OLAP tools and data mining tools.

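The data mining process is shown diagrammatically in the figure below.

Figure: The Data Mining Process. Data sources are assimilated into the data warehouse; data is then selected, transformed and mined to produce extracted information.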

Data Mining and Data Warehousing

· The goal of a data warehouse is to support decision making with data.

· Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.

· Data mining can be applied to operational databases with individual transactions.

· To make data mining more efficient, the data warehouse should have an aggregated or summarized collection of data.

· Data mining helps in extracting meaningful new patterns that cannot necessarily be found by merely querying or processing data or metadata in the data warehouse.
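The point about keeping an aggregated or summarized collection of data can be illustrated with a small roll-up. This is a sketch under assumptions: the grouping by region and month, the field names and the function `summarize` are hypothetical choices, not something the paper specifies.

```python
from collections import defaultdict


def summarize(transactions):
    """Roll individual transactions up to (region, month) totals."""
    totals = defaultdict(lambda: {"count": 0, "revenue": 0.0})
    for t in transactions:
        key = (t["region"], t["date"][:7])  # "YYYY-MM" prefix of an ISO date
        totals[key]["count"] += 1
        totals[key]["revenue"] += t["amount"]
    return dict(totals)


transactions = [
    {"region": "South", "date": "2008-01-05", "amount": 40.0},
    {"region": "South", "date": "2008-01-19", "amount": 60.0},
    {"region": "North", "date": "2008-01-07", "amount": 25.0},
    {"region": "South", "date": "2008-02-02", "amount": 10.0},
]
summary = summarize(transactions)
```

A mining tool that works on `summary` touches three rows instead of four here; on real volumes the same idea reduces millions of transactions to a manageable number of summary rows.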

Data Mining as a Part of the Knowledge Discovery Process

· Knowledge Discovery in Databases, frequently abbreviated as KDD, typically encompasses more than data mining.

· The knowledge discovery process comprises six phases: data selection, data cleansing, enrichment, data transformation or encoding, data mining, and the reporting and display of the discovered information.

Data selection: data about specific items or categories of items, or from stores in a specific region or area of the country, may be selected.

Data cleansing: the cleansing process may then correct invalid zip codes or eliminate records with incorrect phone prefixes.

Enrichment: typically enhances the data with additional sources of information.

Data transformation and encoding: may be done to reduce the amount of data.
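The four preparation phases described above can be sketched as a chain of small functions. This is an illustration only: the record layout, the five-digit zip code rule, the `income_by_zip` lookup and all function names are hypothetical, and the cleansing step here simply drops invalid records rather than correcting them.

```python
import re


def select(records, region):
    # Data selection: keep only records from stores in one region.
    return [r for r in records if r["region"] == region]


def cleanse(records):
    # Data cleansing: drop records whose zip code is not exactly five digits.
    return [r for r in records if re.fullmatch(r"\d{5}", r["zip"])]


def enrich(records, income_by_zip):
    # Enrichment: attach an attribute from an additional (hypothetical) source.
    return [{**r, "income": income_by_zip.get(r["zip"])} for r in records]


def transform(records):
    # Transformation and encoding: replace the zip code by a coarser
    # three-digit prefix to reduce the number of distinct values.
    return [{**r, "zip": r["zip"][:3]} for r in records]


raw = [
    {"item": "tv", "region": "south", "zip": "52260"},
    {"item": "radio", "region": "south", "zip": "5226X"},  # invalid zip
    {"item": "tv", "region": "north", "zip": "11001"},
]
income_by_zip = {"52260": "middle"}

prepared = transform(enrich(cleanse(select(raw, "south")), income_by_zip))
```

Only the prepared records would then be handed to the mining step proper; each phase is a separate function so that it can be replaced or reordered without touching the others.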

Goals of Data Mining and Knowledge Discovery

The goals of data mining fall into the following classes:

Prediction: Data mining can show how certain attributes within the data will behave in the future.

Identification: Data patterns can be used to identify the existence of an item, an event, or an activity.

Classification: Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.

Optimization: One eventual goal of data mining may be to optimize the use of limited resources such as time, space, money, or materials and to maximize output variables such as sales or profits under a given set of constraints.
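The classification goal can be made concrete with a very small example. The sketch below uses nearest-centroid classification, a simple technique picked here for illustration and not one named in this paper; the customer features (visits per month, average spend) and the class labels are hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """samples: list of (features, label). Returns label -> mean feature vector."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    # Assign the class whose centroid is closest in Euclidean distance.
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical customers described by (visits per month, average spend).
training = [
    ((1, 20.0), "occasional"),
    ((2, 30.0), "occasional"),
    ((8, 90.0), "frequent"),
    ((10, 110.0), "frequent"),
]
centroids = train_centroids(training)
label = classify(centroids, (9, 95.0))
```

Here the combination of the two parameters decides the class: a customer with nine visits and an average spend of 95 lies closest to the "frequent" centroid.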

Conclusion

 

Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data. Comprehensive data warehouses that integrate operational data with customer, supplier, and market information have resulted in an explosion of information. Competition requires timely and sophisticated analysis on an integrated view of the data. Data mining tools can enhance the inference process and speed up the design cycle, but cannot be a substitute for statistical and domain expertise. Data mining allows for the creation of a self-learning organization.

So the future of data warehouses lies in their accessibility from the internet. Successful implementation of a data warehouse and data mining requires a high-performance, scalable combination of hardware and software which can integrate easily within existing systems, so that customers can use the data warehouse to improve their decision making and their competitive advantage.

 

Last but not least, the Internet has emerged as the largest data warehouse of unstructured and free-form data. New technologies are geared towards mining this great data warehouse.

A good data warehouse provides the RIGHT data… to the RIGHT PEOPLE… at the RIGHT time… RIGHT now! While data warehousing organizes data for business analysis, the Internet has emerged as the standard for information sharing.