data mining & warehousing- presentation

Upload: ragini-sundarraman

Post on 14-Apr-2018


  • 7/30/2019 Data Mining & Warehousing- Presentation


    NADEEM . A 111A1012

    JAYADURGA .S 111A1009

    RAGINI .S 111A1013


    WHICH ARE OUR LOWEST/HIGHEST MARGIN CUSTOMERS?

    WHO ARE MY CUSTOMERS AND WHAT PRODUCTS ARE THEY BUYING?

    WHICH CUSTOMERS ARE MOST LIKELY TO GO TO THE COMPETITORS?

    WHAT IMPACT WILL NEW PRODUCTS OR SERVICES HAVE ON REVENUE AND MARGINS?

    WHAT PRODUCT PROMOTIONS HAVE THE BIGGEST IMPACT ON REVENUE?

    WHAT IS THE MOST EFFECTIVE DISTRIBUTION CHANNEL?


    I CAN'T GET THE DATA I NEED
        NEED AN EXPERT TO GET THE DATA

    I CAN'T UNDERSTAND THE DATA I FOUND
        AVAILABLE DATA POORLY DOCUMENTED

    I CAN'T USE THE DATA I FOUND
        RESULTS ARE UNEXPECTED
        DATA NEEDS TO BE TRANSFORMED FROM ONE FORM TO OTHER

    I CAN'T FIND THE DATA I NEED
        DATA IS SCATTERED OVER THE NETWORK
        MANY VERSIONS, SUBTLE DIFFERENCES


    1960S: DATA COLLECTION, DATABASE CREATION, IMS AND NETWORK DBMS

    1970S: RELATIONAL DATA MODEL, RELATIONAL DBMS IMPLEMENTATION

    1980S: RDBMS, ADVANCED DATA MODELS (EXTENDED-RELATIONAL, OO, DEDUCTIVE, ETC.) AND APPLICATION-ORIENTED DBMS (SPATIAL, SCIENTIFIC, ENGINEERING, ETC.)

    1990S-2000S: DATA MINING AND DATA WAREHOUSING, MULTIMEDIA DATABASES, AND WEB DATABASES


    The data warehouse is that portion of an overall Architected Data Environment that serves as the single integrated source of data for processing information.


    Data explosion problem
        Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories

    We are drowning in data, but starving for knowledge!

    Solution: Data warehousing and data mining
        Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases


    DATA WAREHOUSE
        Subject Oriented
        Integrated
        Non-Volatile
        Time Variant
        Accessible
        Process Oriented


    DATA MART

    STAGING AREA

    OLAP

    OLAP TOOLS


    [Diagram: data warehouse architecture. Labels: Operational, External & other Databases; Data Acquisition; Warehouse Design; Analytical Data Store; Enterprise Warehouse; Data Marts; Metadata Directory; Metadata Repository; DATA MANAGEMENT; METADATA MANAGEMENT; Data Analysis; Web Information Systems]


    The data warehouse is distinctly different from the operational data used and maintained by day-to-day operational systems. Data warehousing is not simply an access wrapper for operational data, where data is simply dumped into tables for direct access.


    OPERATIONAL DATA
        Application oriented
        Detailed
        Accurate, as of the moment of access
        Serves the clerical community
        Performance sensitive (immediate response required when entering a transaction)
        Flexible structure; variable contents
        Small amount of data used in a process

    DATA WAREHOUSE
        Subject oriented
        Summarized
        Represents values over time
        Serves the managerial community
        Performance relaxed (immediacy not required)
        Static structure
        Large amount of data used in a process
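    The contrast between detailed operational rows and summarized, time-based warehouse values can be sketched in a few lines of Python. This is only an illustration: the transaction records, the field names, and the `summarize_by_month` helper are all invented for the example and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical operational records: one detailed row per transaction.
transactions = [
    {"date": "2018-01-05", "product": "tea", "amount": 40.0},
    {"date": "2018-01-20", "product": "tea", "amount": 60.0},
    {"date": "2018-02-02", "product": "tea", "amount": 50.0},
    {"date": "2018-02-11", "product": "rice", "amount": 120.0},
]


def summarize_by_month(rows):
    """Roll detailed rows up to (month, product) totals.

    The result is summarized and keyed by time period, which is the
    kind of view the DATA WAREHOUSE column describes.
    """
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # keep only "YYYY-MM"
        totals[(month, row["product"])] += row["amount"]
    return dict(totals)


monthly_sales = summarize_by_month(transactions)
for (month, product), total in sorted(monthly_sales.items()):
    print(month, product, total)
```

    A real warehouse would perform this kind of roll-up during loading, typically over far more rows and along several dimensions at once.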


    Data mining (knowledge discovery in databases): Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases

    Process of finding different patterns or correlations among the data in large relational databases.

    Popular and widely used in the INFORMATION INDUSTRY


    Market Analysis And Management

    Corporate Analysis And Risk Management

    Fraud Detection And Management


    Where are the data sources for analysis?
        Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies

    Target marketing
        Find clusters of model customers who share the same characteristics: interest, income level, spending habits, etc.

    Determine customer purchasing patterns over time
        Conversion of single to a joint bank account: marriage, etc.

    Cross-market analysis
        Associations/correlations between product sales
        Prediction based on the association information
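    One common way to quantify associations between product sales is with support and confidence over market baskets. The sketch below assumes that approach; the slides do not name a specific measure, and the basket data and function names are made up for illustration.

```python
# Hypothetical market baskets: each set is one customer's purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) from the baskets."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), baskets)
    return both / base


print(support({"bread", "milk"}, baskets))
print(confidence({"bread"}, {"milk"}, baskets))
```

    A rule such as "bread implies milk" with high confidence is the kind of association information a prediction could be based on.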


    Customer profiling
        Data mining can tell you what types of customers buy what products (clustering or classification)

    Identifying customer requirements
        Identifying the best products for different customers
        Use prediction to find what factors will attract new customers

    Provides summary information
        Various multidimensional summary reports
        Statistical summary information (data central tendency and variation)
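    As a small illustration of statistical summary information, the following sketch computes measures of central tendency (mean, median) and variation (sample standard deviation, range) with Python's standard `statistics` module. The purchase amounts and the `summarize` helper are hypothetical.

```python
import statistics

# Hypothetical purchase amounts for one customer segment.
purchases = [120.0, 80.0, 100.0, 100.0, 150.0]


def summarize(values):
    """Return basic central-tendency and variation measures."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "range": max(values) - min(values),
    }


summary = summarize(purchases)
for name, value in summary.items():
    print(name, round(value, 2))
```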


    Finance planning and asset evaluation
        Cash flow analysis and prediction
        Analysis of trends, financial ratios and market value

    Resource planning
        Summarize and compare the resources and spending

    Competition
        Monitor competitors and market directions
        Group customers into classes and a class-based pricing procedure
        Set pricing strategy in a highly competitive market


    Applications
        Widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.

    Approach
        Use historical data to build models of fraudulent behavior and use data mining to help identify similar instances

    Examples
        Auto insurance: detect a group of people who stage accidents to collect on insurance
        Money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
        Medical insurance: detect professional patients and ring of doctors and ring of references


    Detecting inappropriate medical treatment
        Australian Health Insurance Commission identifies that in many cases blanket screening tests were requested (save Australian $1m/yr.).

    Detecting telephone fraud
        Telephone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm.
        British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.
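    The idea of flagging calls that deviate from an expected norm can be sketched with a simple z-score test on call duration. This is an assumed, deliberately simplified stand-in and not the method used by any of the organizations mentioned; the durations, the threshold, and the `flag_unusual` helper are invented for the example.

```python
import statistics

# Hypothetical call durations in minutes for one subscriber.
durations = [3, 4, 5, 4, 3, 5, 4, 60]


def flag_unusual(values, threshold=2.0):
    """Return indices of values more than `threshold` population
    standard deviations away from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # every value is identical, so nothing deviates
    return [
        i for i, value in enumerate(values)
        if abs(value - mean) / spread > threshold
    ]


suspicious = flag_unusual(durations)
print(suspicious)
```

    A fuller call model would also consider destination and time of day or week, as the slide lists, rather than duration alone.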


    Sports
        IBM Advanced Scout analyzed NBA game statistics (shots blocked, assists, and fouls) to gain competitive advantage for New York Knicks and Miami Heat

    Astronomy
        JPL and the Palomar Observatory discovered 22 quasars with the help of data mining

    Internet Web Surf-Aid
        IBM Surf-Aid applies data mining algorithms to Web access logs for market-related pages to discover customer preference and behavior pages, analyzing effectiveness of Web marketing, improving Web site organization, etc.


    Other Applications

    Text mining (news group, email, documents)

    Stream data mining

    Web mining

    DNA data analysis
