sanjay data warehouse

Upload: sanjay-kaushik

Post on 03-Jun-2018


TRANSCRIPT

  • 8/12/2019 Sanjay Data Warehouse

    1/18

    Data Warehousing and Data Mining Concepts in IT Industry

    By: Sanjay Kaushik 078


    Data Warehousing

    A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process.

    A data warehouse is a centralized repository that stores data from multiple information sources and transforms it into a common, multidimensional data model for efficient querying and analysis.
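
    As a rough illustration of a common multidimensional model, the sketch below loads rows from two differently shaped sources into one small star schema (a fact table plus two dimension tables) using Python's built-in sqlite3 module. All table names, column names and sample values are assumptions made for this example and do not come from the slides.

```python
import sqlite3

# Two hypothetical source systems with different layouts (made-up data).
pos_rows = [("2018-06-01", "Laptop", 2, 1500.0)]  # (date, product, qty, amount)
web_rows = [
    {"day": "2018-06-01", "item": "Laptop", "units": 1, "total": 750.0},
    {"day": "2018-06-02", "item": "Phone", "units": 3, "total": 900.0},
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, day TEXT UNIQUE);
    CREATE TABLE fact_sales  (product_id INTEGER, date_id INTEGER,
                              channel TEXT, units INTEGER, amount REAL);
""")


def dim_key(table, key_col, col, value):
    """Return the surrogate key for a dimension value, inserting it if new."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({col}) VALUES (?)", (value,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {col} = ?", (value,)
    ).fetchone()
    return row[0]


def load(day, product, units, amount, channel):
    """Insert one sale into the fact table in the common target layout."""
    conn.execute(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
        (
            dim_key("dim_product", "product_id", "name", product),
            dim_key("dim_date", "date_id", "day", day),
            channel,
            units,
            amount,
        ),
    )


# Transform each source into the shared model.
for day, product, qty, amount in pos_rows:
    load(day, product, qty, amount, "pos")
for r in web_rows:
    load(r["day"], r["item"], r["units"], r["total"], "web")

# One query now answers a question that spans both sources.
sales_by_product = conn.execute("""
    SELECT p.name, SUM(f.units), SUM(f.amount)
    FROM fact_sales f JOIN dim_product p USING (product_id)
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(sales_by_product)
```

    The point of the design is that once both sources share one fact table, analysis no longer needs to know where a row originally came from.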


    Architecture


    Various ETL tools used in the market are:
    Informatica
    IBM DataStage
    Oracle Warehouse Builder
    Ab Initio
    Data Junction
    Microsoft SQL Server Integration Services
    Transform On Demand
    Transformation Manager


    ETL tools are unified enterprise data integration platforms that allow companies and government organizations of all sizes to access, discover and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise for query and reporting (i.e., business intelligence). ETL tools provide developers with an interface for designing source-to-target mappings, transformations, and job control parameters.
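
    To make the three design elements concrete, here is a minimal sketch in plain Python of a source-to-target mapping, per-column transformations, and one job control parameter. It does not reproduce the interface of any product named above. The column names, the date format and the reject_bad_rows switch are all hypothetical.

```python
from datetime import datetime

# Hypothetical source-to-target mapping:
# target column -> (source column, transformation function).
MAPPING = {
    "customer_name": ("CUST_NM", str.strip),
    "order_date": (
        "ORD_DT",
        lambda s: datetime.strptime(s, "%d-%m-%Y").date().isoformat(),
    ),
    "amount": ("AMT", float),
}

# Hypothetical job control parameter: set aside bad rows instead of aborting.
JOB = {"reject_bad_rows": True}


def run_job(source_rows, mapping, job):
    """Apply the mapping to every source row and collect rejected rows."""
    loaded, rejected = [], []
    for row in source_rows:
        try:
            loaded.append(
                {tgt: fn(row[src]) for tgt, (src, fn) in mapping.items()}
            )
        except (KeyError, ValueError):
            if not job["reject_bad_rows"]:
                raise
            rejected.append(row)
    return loaded, rejected


source = [
    {"CUST_NM": "  Asha Rao ", "ORD_DT": "03-06-2018", "AMT": "199.50"},
    {"CUST_NM": "Bad Row", "ORD_DT": "not a date", "AMT": "10"},
]
loaded, rejected = run_job(source, MAPPING, JOB)
print(loaded)
print(len(rejected))
```

    Keeping the mapping as data rather than code is what lets a graphical tool draw and edit it.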


    OLAP allows business users to slice and dice data at will. Normally, data in an organization is distributed across multiple data sources that are incompatible with each other. A retail example: point-of-sale data and sales made via the call center or the Web are stored in different locations and formats. It would be a time-consuming process for an executive to obtain OLAP reports such as: What are the most popular products purchased by customers between the ages of 15 and 30?
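
    Once the channels have been consolidated into one table, the report in the retail example becomes a single query. The sketch below uses Python's built-in sqlite3 module. The sales table, its columns and all of its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (channel TEXT, product TEXT,
                        customer_age INTEGER, units INTEGER);
    INSERT INTO sales VALUES
        ('pos',         'Headphones', 22, 2),
        ('web',         'Headphones', 17, 1),
        ('web',         'Blender',    45, 4),
        ('call_center', 'Sneakers',   29, 2),
        ('pos',         'Sneakers',   61, 5);
""")

# Slice on the age band, then aggregate units per product across all channels.
popular_15_to_30 = conn.execute("""
    SELECT product, SUM(units) AS total_units
    FROM sales
    WHERE customer_age BETWEEN 15 AND 30
    GROUP BY product
    ORDER BY total_units DESC, product
""").fetchall()
print(popular_15_to_30)
```

    Changing the WHERE clause (another age band, a single channel) or the GROUP BY column gives a different slice of the same data without touching the source systems.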

    OLTP systems are designed for optimal transaction speed. When consumers make a purchase online, they expect the transaction to occur instantaneously. With a database design (called data modeling) optimized for transactions, the record 'Consumer Name, Address, Telephone, Order Number, Order Name, Price, Payment Method' is created quickly in the database, and the results can be recalled by managers equally quickly if needed.
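
    A minimal sketch of that OLTP pattern, again with Python's built-in sqlite3 module: one order record is written inside a transaction and then read back by its key. The table layout follows the fields listed above. The sample values and the place_order helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_number   INTEGER PRIMARY KEY,
        consumer_name  TEXT NOT NULL,
        address        TEXT,
        telephone      TEXT,
        order_name     TEXT,
        price          REAL,
        payment_method TEXT
    )
""")


def place_order(record):
    """Write one order atomically: commit on success, roll back on error."""
    with conn:  # the connection context manager wraps a transaction
        conn.execute(
            "INSERT INTO orders VALUES (:order_number, :consumer_name, "
            ":address, :telephone, :order_name, :price, :payment_method)",
            record,
        )


place_order({
    "order_number": 1001,
    "consumer_name": "Asha Rao",
    "address": "12 Park Road",
    "telephone": "555-0100",
    "order_name": "Headphones",
    "price": 59.0,
    "payment_method": "card",
})

# Point lookup by primary key, the typical fast OLTP read.
row = conn.execute(
    "SELECT consumer_name, price FROM orders WHERE order_number = ?", (1001,)
).fetchone()
print(row)
```

    The contrast with OLAP is in the access pattern: here each statement touches one row by key, whereas an OLAP report scans and aggregates many rows.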


    Different OLAP tools in the market:
    Oracle Enterprise BI Server
    Microsoft BI & OLAP tools
    IBM Cognos Series 10
    QlikView
    Board Management Intelligence Toolkit
    Hyperion System
    SAP NetWeaver BI
    MicroStrategy
    SAP Business Objects Enterprise Xir
    SAS Enterprise BI


    Data Mining

    Data mining is part of the knowledge discovery process that offers a new way to look at data. Data mining consists of the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Data mining is then the process of discovering meaningful new correlations, patterns and trends by sifting through vast amounts of data using statistical and mathematical techniques.


    Different Phases:
    1: Exploration.
    2: Model building and validation.
    3: Deployment.

    Different Data Mining tools in the market:
    AlphaBlox
    Tanagra
    CART
    Darwin
    SPSS
    And many more


    Case Study: Insurance Company

    Specifically, data mining can help insurance firms in business practices such as:
    Optimizing products and pricing.
    Acquiring new customers.
    Retaining existing customers.
    Performing sophisticated campaign management.
    Detecting fraudulent claims.
    Estimating outstanding loss reserve.


    Taking the case of detection of fraudulent claims:

    Fraudulent claims are an ever-present problem for insurance firms, and techniques for identifying and mitigating fraud are critical for their long-term success. Quite often, successful fraud detection analyses, such as those from a data mining project, can provide a very high return on investment.


    Insurance companies around the world lose more and more money through fraudulent claims each year. They need to recoup this lost money so they can continue providing superior services for their customers.

    Fraudulent claims are typically not the biggest claims, because perpetrators are well aware that the big claims are scrutinized more rigorously than average claims. To find fraudulent claims, analysts must look for unusual associations, anomalies or outlying patterns in the data. Specific analytical techniques adept at finding such subtleties are social network link analysis, market basket analysis, cluster analysis and predictive modeling. For example, Informatica uses SPSS to segment customer data by uncovering certain relationships between data sets, which are red flags for fraud-related losses.
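
    The idea of looking for outlying patterns rather than simply large amounts can be sketched with a basic robust statistic. The example below is not one of the techniques named above. It is a simpler stand-in that flags claims whose amount is unusual for their own claim type, using a modified z-score based on the median absolute deviation. The claim data, the grouping by claim type and the 3.5 cut-off are all assumptions.

```python
from statistics import median

# Hypothetical claims: (claim_id, claim_type, amount). Values are made up.
claims = [
    (1, "windshield", 300), (2, "windshield", 320), (3, "windshield", 310),
    (4, "windshield", 290), (5, "windshield", 900),
    (6, "theft", 5000), (7, "theft", 5200), (8, "theft", 4900),
    (9, "theft", 5100),
]


def outliers(claims, threshold=3.5):
    """Return ids of claims that are unusual within their own claim type."""
    by_type = {}
    for cid, ctype, amount in claims:
        by_type.setdefault(ctype, []).append((cid, amount))

    flagged = []
    for rows in by_type.values():
        amounts = [a for _, a in rows]
        med = median(amounts)
        mad = median(abs(a - med) for a in amounts)
        if mad == 0:
            continue  # no spread in this group, nothing to compare against
        # Modified z-score: 0.6745 * |x - median| / MAD
        flagged += [
            cid for cid, a in rows
            if 0.6745 * abs(a - med) / mad > threshold
        ]
    return flagged


flagged_ids = outliers(claims)
print(flagged_ids)
```

    In this made-up data the flagged claim (900) is far smaller than every theft claim, so a review rule based on absolute size alone would never look at it.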


    Discover small subsets of claims with a high percentage of recoverable fraud?
    Isolate the factors that indicate a claim or payment request has a high probability of fraudulence?
    Develop rules and use them to flag only those claims or requests most likely to be fraudulent?
    Ensure your adjusters could review claims or requests that are not only likely to be fraudulent but also have the greatest adjustment potential?
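
    One way to picture the last two questions is a small rule engine that flags claims and orders the review queue. The sketch below is only an illustration. The claim fields, the two rules and their thresholds are hypothetical, and the claim amount is used as a crude proxy for adjustment potential.

```python
# Hypothetical claim records; field names and values are made up.
claims = [
    {"id": "C1", "amount": 1200, "days_since_policy_start": 5, "prior_claims": 3},
    {"id": "C2", "amount": 800, "days_since_policy_start": 400, "prior_claims": 0},
    {"id": "C3", "amount": 4000, "days_since_policy_start": 20, "prior_claims": 1},
    {"id": "C4", "amount": 300, "days_since_policy_start": 10, "prior_claims": 4},
]

# Each rule is (name, predicate). Thresholds are illustrative assumptions.
RULES = [
    ("new_policy", lambda c: c["days_since_policy_start"] < 30),
    ("frequent_claims", lambda c: c["prior_claims"] >= 3),
]


def review_queue(claims, rules, min_hits=1):
    """Flag claims that trigger at least min_hits rules and rank them."""
    queue = []
    for c in claims:
        hits = [name for name, test in rules if test(c)]
        if len(hits) >= min_hits:
            queue.append((c["id"], hits, c["amount"]))
    # Most rule hits first, then the largest amount.
    return sorted(queue, key=lambda q: (-len(q[1]), -q[2]))


queue = review_queue(claims, RULES)
for item in queue:
    print(item)
```

    Claims that trigger no rule never reach an adjuster, which is how the rules keep the review workload small.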

    Capitalize on existing data

    The previously audited claims hold the key to recouping money in the future. By creating models from historical information, we can accurately pinpoint fraudulent claims.


    Steps for data mining: Building models to find fraudulent claims
    1: Understand your data.
    2: Determine your population makeup.
    3: Discover relationships in your data.
    4: Build a model.
    5: Use the model against actual records.
    6: Compare your subset to the entire population.
    7: Strategically deploy your data mining results for optimum success.
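
    The middle steps (build a model, use it against records, compare the subset to the population) can be sketched in a few lines. The model here is deliberately trivial: it is only the observed fraud rate per repair shop in previously audited claims. The repair_shop factor and all data are invented. For brevity the model is scored on the same records it was built from, which overstates its quality, so a real validation step would use held-out claims.

```python
from collections import defaultdict

# Hypothetical audited claims: (repair_shop, was_fraud). Data is made up.
audited = [
    ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0), ("B", 0),
    ("C", 0), ("C", 1), ("C", 0), ("C", 0),
]


def build_model(history):
    """Learn the observed fraud rate for each repair shop."""
    counts = defaultdict(lambda: [0, 0])  # shop -> [fraud count, total]
    for shop, fraud in history:
        counts[shop][0] += fraud
        counts[shop][1] += 1
    return {shop: f / n for shop, (f, n) in counts.items()}


model = build_model(audited)

# Population makeup: overall share of fraudulent claims.
base_rate = sum(f for _, f in audited) / len(audited)

# Use the model against records: keep shops scoring above the base rate.
subset = [(shop, f) for shop, f in audited if model[shop] > base_rate]

# Compare the subset to the entire population.
subset_rate = sum(f for _, f in subset) / len(subset)
lift = subset_rate / base_rate
print(round(base_rate, 2), round(subset_rate, 2), round(lift, 2))
```

    A lift above 1 means the flagged subset contains a higher share of fraud than the population as a whole, which is the sense in which the subset is worth an adjuster's time.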