data mining
![Page 1: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/1.jpg)
Ahmed Moussa, 30-Feb-2010
![Page 2: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/2.jpg)
What is Data Mining?
![Page 3: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/3.jpg)
The Evolution of Data Analysis
![Page 4: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/4.jpg)
| Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics |
| --- | --- | --- | --- | --- |
| Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery |
| Data Access (1980s) | "What were unit sales last March?" | Relational databases (RDBMS), Structured Query Language (SQL), ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Retrospective, dynamic data delivery at record level |
| Data Warehousing & Decision Support (1990s) | "What were unit sales last March? Drill down to Other." | On-line analytic processing (OLAP), multidimensional databases, data warehouses | SPSS, Comshare, Arbor, Cognos, MicroStrategy, NCR | Retrospective, dynamic data delivery at multiple levels |
| Data Mining (Emerging Today) | "What’s likely to happen to unit sales next month? Why?" | Advanced algorithms, multiprocessor computers, massive databases | SPSS/Clementine, Lockheed, IBM, SGI, SAS, NCR, Oracle, numerous startups | Prospective, proactive information delivery |
- RDBMS: Relational database management system.
- ODBC: Open Database Connectivity, a standard software API for accessing database management systems (DBMS).
- OLAP: Online analytical processing, an approach to answering multi-dimensional analytical queries quickly.
- SPSS: Statistical Package for the Social Sciences, a computer program used for statistical analysis. In 2009 it was re-branded as PASW.
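The Data Access and Decision Support questions above are retrospective lookups, which a relational database answers directly. The sketch below shows one way to ask "What were unit sales last March?" and then drill down to one region, using Python's built-in `sqlite3` module. The `sales` table, its columns, the `units_sold` helper, and every figure are made up for illustration.

```python
import sqlite3

# Hypothetical sales table; names and figures are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2009-02-14", "North", 120),
        ("2009-03-03", "North", 200),
        ("2009-03-17", "South", 150),
        ("2009-03-29", "Other", 50),
        ("2009-04-02", "South", 90),
    ],
)


def units_sold(month, region=None):
    """Total units for a YYYY-MM month, optionally restricted to one region."""
    sql = (
        "SELECT COALESCE(SUM(units), 0) FROM sales "
        "WHERE strftime('%Y-%m', sale_date) = ?"
    )
    params = [month]
    if region is not None:
        sql += " AND region = ?"
        params.append(region)
    return conn.execute(sql, params).fetchone()[0]


march_total = units_sold("2009-03")           # Data Access: one aggregate query
march_other = units_sold("2009-03", "Other")  # Decision Support: drill down
print(march_total, march_other)
```

Both queries only report what already happened. The data mining question ("What's likely to happen next month? Why?") needs a model fitted to this history, which a query alone cannot provide.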
![Page 5: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/5.jpg)
Results of Data Mining Include
• Forecasting what may happen in the future.
• Classifying people or things into groups by recognizing patterns.
• Clustering people or things into groups based on their attributes.
• Associating what events are likely to occur together.
• Sequencing what events are likely to lead to later events.
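As an illustration of the "associating" result, here is a minimal sketch that counts how often pairs of items occur together and derives support and confidence for simple one-to-one rules. The baskets and the `pair_rules` helper are assumptions for this example; production tools use more scalable algorithms such as Apriori or FP-Growth.

```python
from collections import Counter
from itertools import combinations

# Toy market baskets, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # confidence(a -> b) = P(b in basket | a in basket)
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; names break ties deterministically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


rules = pair_rules(transactions)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule butter -> bread has confidence 1.0.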
![Page 11: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/11.jpg)
Data Mining Is Not
• Crunching of bulk data
• “Blind” application of algorithms
• Going to find relationships where none exist
• Presenting data in different ways
• A database-intensive task
• A difficult-to-understand technology requiring an advanced degree in computer science
![Page 12: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/12.jpg)
Data Mining Is
• A class of techniques that find patterns in data.
• A user-centric, interactive process which leverages analysis technologies and computing power.
• A group of techniques that find relationships that have not previously been discovered.
• Not reliant on an existing database.
• A relatively easy task that requires knowledge of the business problem/subject matter expertise.
![Page 17: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/17.jpg)
Examples of What People are Doing with Data Mining:
• Fraud/Non-Compliance Anomaly detection
• Isolate the factors that lead to fraud, waste and abuse
• Target auditing and investigative efforts more effectively
• Credit/Risk Scoring
• Intrusion detection
• Parts failure prediction
• Recruiting/Attracting customers
• Maximizing profitability (cross selling, identifying profitable customers)
• Service Delivery and Customer Retention
• Build profiles of customers likely to use which services
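To make the anomaly detection example concrete, the following sketch flags unusually large amounts with a modified z-score based on the median and the median absolute deviation. The claim amounts, the `flag_anomalies` helper, and the 3.5 cut-off are assumptions chosen for illustration; real fraud screening combines many signals and is tuned to the data.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        return []  # no spread at all, so nothing can be called unusual
    return [
        i
        for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical expense claims; one is far larger than the rest.
claims = [120, 135, 110, 128, 142, 131, 980, 125, 118, 139]
suspicious = flag_anomalies(claims)
print(suspicious)
```

A flagged index is a candidate for audit, not proof of fraud, which fits the "target auditing and investigative efforts" item above.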
![Page 25: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/25.jpg)
Examples of What People are Doing with Data Mining:
The right offer for the right customer, through the right channel, at the right time.
![Page 26: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/26.jpg)
How Can We Do Data Mining?
• A standard process
• Existing data
• Software technologies
• Situational expertise
The data mining process must be reliable and repeatable by people with little data mining background.
![Page 28: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/28.jpg)
Phases and Tasks
Business Understanding
• Determine Business Objectives: Background; Business Objectives; Business Success Criteria
• Situation Assessment: Inventory of Resources; Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
• Determine Data Mining Goals: Data Mining Goals; Data Mining Success Criteria
• Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques
Data Understanding
• Collect Initial Data: Initial Data Collection Report
• Describe Data: Data Description Report
• Explore Data: Data Exploration Report
• Verify Data Quality: Data Quality Report
Data Preparation
• Data Set: Data Set Description
• Select Data: Rationale for Inclusion/Exclusion
• Clean Data: Data Cleaning Report
• Construct Data: Derived Attributes; Generated Records
• Integrate Data: Merged Data
• Format Data: Reformatted Data
Modeling
• Select Modeling Technique: Modeling Technique; Modeling Assumptions
• Generate Test Design: Test Design
• Build Model: Parameter Settings; Models; Model Description
• Assess Model: Model Assessment; Revised Parameter Settings
Evaluation
• Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria; Approved Models
• Review Process: Review of Process
• Determine Next Steps: List of Possible Actions; Decision
Deployment
• Plan Deployment: Deployment Plan
• Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
• Produce Final Report: Final Report; Final Presentation
• Review Project: Experience Documentation
![Page 31: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/31.jpg)
Phases and Tasks
A) Business Understanding
• Determine Business Objectives: Background; Business Objectives; Business Success Criteria
• Situation Assessment: Inventory of Resources; Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
• Determine Data Mining Goals: Data Mining Goals; Data Mining Success Criteria
• Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques
![Page 33: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/33.jpg)
Phases and Tasks
B) Data Understanding
• Collect Initial Data: Initial Data Collection Report
• Describe Data: Data Description Report
• Explore Data: Data Exploration Report
• Verify Data Quality: Data Quality Report
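The Verify Data Quality task can start with something as small as counting missing and distinct values per field. The sketch below is one possible shape for such a check; the `data_quality_report` helper, the field names, and the records are hypothetical.

```python
def data_quality_report(records, fields):
    """Count missing and distinct values for each field."""
    report = {}
    for field in fields:
        values = [record.get(field) for record in records]
        # Treat None and empty strings as missing; zero stays a real value.
        present = [v for v in values if v not in (None, "")]
        report[field] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


# Hypothetical customer extract with two gaps in it.
customers = [
    {"id": 1, "age": 34, "segment": "retail"},
    {"id": 2, "age": None, "segment": "retail"},
    {"id": 3, "age": 51, "segment": ""},
    {"id": 4, "age": 34, "segment": "corporate"},
]
report = data_quality_report(customers, ["id", "age", "segment"])
for field, stats in report.items():
    print(field, stats)
```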
![Page 35: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/35.jpg)
Phases and Tasks
C) Data Preparation
• Data Set: Data Set Description
• Select Data: Rationale for Inclusion/Exclusion
• Clean Data: Data Cleaning Report
• Construct Data: Derived Attributes; Generated Records
• Integrate Data: Merged Data
• Format Data: Reformatted Data
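A minimal sketch of how the select, clean, construct, and integrate tasks might look on two small tables. The `prepare` helper, the field names, the records, and the reference year are assumptions for illustration only.

```python
# Hypothetical source tables.
customers = [
    {"id": 1, "name": " Alice ", "birth_year": 1975},
    {"id": 2, "name": "Bob", "birth_year": None},
    {"id": 3, "name": "Carol", "birth_year": 1988},
]
orders = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 60.0},
    {"customer_id": 3, "amount": 25.0},
]


def prepare(customers, orders, as_of_year=2010):
    """Return one modeling-ready row per usable customer."""
    # Select + clean: exclude rows without a birth year, trim stray spaces.
    cleaned = [
        {**c, "name": c["name"].strip()}
        for c in customers
        if c["birth_year"] is not None
    ]
    # Integrate: total order amount per customer from the second table.
    totals = {}
    for order in orders:
        key = order["customer_id"]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    # Construct: derive age and attach the merged spend figure.
    return [
        {
            "id": c["id"],
            "name": c["name"],
            "age": as_of_year - c["birth_year"],
            "total_spend": totals.get(c["id"], 0.0),
        }
        for c in cleaned
    ]


dataset = prepare(customers, orders)
for row in dataset:
    print(row)
```

Dropping the customer with no birth year is the kind of decision the Rationale for Inclusion/Exclusion output is meant to record.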
![Page 37: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/37.jpg)
Phases and Tasks
D) Modeling
• Select Modeling Technique: Modeling Technique; Modeling Assumptions
• Generate Test Design: Test Design
• Build Model: Parameter Settings; Models and Model Description
• Assess Model: Model Assessment; Revised Parameter Settings
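The Generate Test Design, Build Model, and Assess Model tasks can be sketched with a deliberately tiny model: a single threshold on one attribute, fitted on training rows and scored on held-out rows. The data (support calls versus whether the customer left), the helper names, and the every-third-row holdout are assumptions for illustration, not a recommended technique.

```python
def split(rows, test_every=3):
    """Test design: hold out every third row, train on the rest."""
    train = [r for i, r in enumerate(rows) if i % test_every != 0]
    test = [r for i, r in enumerate(rows) if i % test_every == 0]
    return train, test


def fit_threshold(train):
    """Build model: choose the cut-off on x with the fewest training errors."""
    best_threshold, best_errors = None, None
    for t in sorted({x for x, _ in train}):
        errors = sum((x >= t) != bool(y) for x, y in train)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def accuracy(threshold, rows):
    """Assess model: share of rows where 'x >= threshold' matches the label."""
    return sum((x >= threshold) == bool(y) for x, y in rows) / len(rows)


# (support_calls, left_company) for nine hypothetical customers.
rows = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1)]
train, test = split(rows)
threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print(threshold, test_accuracy)
```

Scoring on rows the model never saw is the point of the test design; accuracy measured on the training rows alone would overstate how well the model generalizes.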
![Page 39: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/39.jpg)
Phases and Tasks
E) Evaluation
• Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria; Approved Models
• Review Process: Review of Process
• Determine Next Steps: List of Possible Actions; Decision
![Page 41: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/41.jpg)
Phases and Tasks
F) Deployment
• Plan Deployment: Deployment Plan
• Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
• Produce Final Report: Final Report; Final Presentation
• Review Project: Experience Documentation
![Page 42: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/42.jpg)
Data mining success story
The US Internal Revenue Service needed to improve customer service and...
scheduled its workforce to provide faster, more accurate answers to questions.
![Page 43: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/43.jpg)
Data mining success story
The US Drug Enforcement Administration needed to be more effective in its drug “busts” and...
analyzed suspects’ cell phone usage to focus investigations.
![Page 44: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/44.jpg)
Data mining success story
HSBC needed to cross-sell more effectively by identifying profiles that would be interested in higher-yielding investments and...
reduced direct mail costs by 30% while garnering 95% of the campaign’s revenue.
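A result of this kind is usually read off a cumulative gains calculation: rank customers by model score, mail only the top share, and measure how much of the revenue that share captures. The sketch below uses ten invented customers whose figures were chosen to mirror the 30%/95% proportions. They are not HSBC data, and the `revenue_share` helper is hypothetical.

```python
def revenue_share(scored, mail_percent):
    """Share of total revenue captured by mailing the top-scored percent."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_mailed = len(ranked) * mail_percent // 100
    captured = sum(revenue for _, revenue in ranked[:n_mailed])
    total = sum(revenue for _, revenue in ranked)
    return captured / total


# (model_score, revenue_if_mailed) for ten hypothetical customers.
scored = [
    (0.90, 500), (0.80, 300), (0.70, 0), (0.60, 100), (0.50, 50),
    (0.40, 0), (0.30, 0), (0.20, 50), (0.10, 0), (0.05, 0),
]
share = revenue_share(scored, 70)  # mail 70% of the list, i.e. 30% fewer pieces
print(f"{share:.0%} of revenue from 70% of the mailing")
```

The better the model ranks likely buyers, the smaller the mailing can be before revenue starts to drop.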
![Page 45: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/45.jpg)
Final Comments
![Page 46: Data mining](https://reader034.vdocuments.us/reader034/viewer/2022051816/5457cdffaf79597c108b73ef/html5/thumbnails/46.jpg)
Data Mining can be utilized in any organization that needs to find patterns or relationships in their data.
By using the DM methodology, analysts can have a reasonable level of assurance that their Data Mining efforts will render useful, repeatable, and valid results.