Data Mining by Jason Baltazar, Phil Cademas, Jillian Latham, Rachel Peeler & Kamila Singh
TRANSCRIPT
What is Data Mining?
Data mining is data processing that uses sophisticated data search capabilities and statistical algorithms to discover patterns and correlations in large preexisting databases.
2 Broad Categories: Supervised & Unsupervised
Unsupervised Data Mining “Descriptive Modeling”
Uncover patterns and relationships among data
No predetermined parameters
Observations after analysis
Used to assist in making business decisions
Cluster Analysis “Automated Data Mining”
Used to discover the segments or groups within a customer data set
Determine classes of similar customers that naturally fit together
Demographics
Segmented Markets Marketing and Advertising
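To make the idea of customers that "naturally fit together" concrete, here is a minimal sketch of one common clustering technique, k-means. The slides do not name a specific algorithm, so the choice of k-means is an assumption, and the customer figures (age, monthly spend) are made up for illustration.

```python
from math import dist

def k_means(points, k, iterations=20):
    """Group points into k clusters with plain k-means (Lloyd's algorithm)."""
    # Assumption: start from the first k points so the result is repeatable.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assign every customer to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the average of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers: (age, monthly spend in dollars).
customers = [(22, 40), (58, 210), (25, 55), (61, 190), (24, 45), (55, 205)]
centroids, segments = k_means(customers, k=2)
for number, segment in enumerate(segments, start=1):
    print(f"Segment {number}: {segment}")
```

No segment labels are supplied in advance; the two groups emerge from the data, which is what makes this unsupervised. In practice the features would usually be rescaled first so that one column does not dominate the distance.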
Supervised Data Mining “Predictive Modeling”
Set goals and parameters prior to data mining
Concentration: only relevant patterns
Predict outcomes
Anomaly Detection, Classification & Prediction, Regression Analysis
Anomaly Detection
Models built to specify “normal” ranges of results
Fraud detection: tax, insurance, and credit card industries
Prevent Identity Theft
Detect breaches in computer security
PayPal: 15% of all e-commerce in the U.S.
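One simple way to build a "normal" range is to take the mean of past results plus or minus a few standard deviations and flag anything outside it. The sketch below assumes that rule (a three-standard-deviation band); the slides do not say which model is used, and the charge amounts are invented for illustration.

```python
from statistics import mean, stdev

def normal_range(history, width=3.0):
    """Return (low, high) bounds: mean +/- width * sample standard deviation."""
    center = mean(history)
    spread = stdev(history)
    return center - width * spread, center + width * spread

def flag_anomalies(values, low, high):
    """Return the values that fall outside the normal range."""
    return [v for v in values if v < low or v > high]

# Hypothetical past card charges for one customer, in dollars.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5, 44.0, 39.75]
low, high = normal_range(history)

# New charges to screen; the last one is far outside the usual pattern.
new_charges = [43.0, 48.0, 950.0]
suspicious = flag_anomalies(new_charges, low, high)
print(f"Normal range: {low:.2f} to {high:.2f}; flagged: {suspicious}")
```

A flagged charge is only a candidate for review, not proof of fraud. Real fraud systems typically combine many signals rather than a single amount.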
Classification & Prediction
Most common data analysis tool
“Who will buy what, and how much will they buy?”
Credit analysis / Credit Scoring – Who are my “good credit risks?”
Based on spending habits, income, and/or demographics
Can be used in customer segmentation, business modeling, credit analysis, etc.
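As a rough illustration of the credit-scoring question, the sketch below labels a new applicant by looking at the most similar past customers (a k-nearest-neighbors classifier). The technique, the two features, and every number here are assumptions chosen for the example, not something taken from the slides.

```python
from collections import Counter
from math import dist

def classify(applicant, labeled_history, k=3):
    """Predict a label by majority vote among the k most similar past customers."""
    nearest = sorted(labeled_history, key=lambda row: dist(row[0], applicant))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical history: (income in $1000s, debt as % of income) -> known outcome.
history = [
    ((85, 10), "good"),
    ((72, 15), "good"),
    ((90, 20), "good"),
    ((30, 55), "bad"),
    ((25, 60), "bad"),
    ((40, 45), "bad"),
]

print(classify((80, 12), history))  # resembles the "good" group
print(classify((28, 58), history))  # resembles the "bad" group
```

Unlike clustering, the outcomes ("good" or "bad") are known ahead of time for the historical records, which is what makes this a supervised, predictive model.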
Classification & Prediction
Human Resources
Turnover analysis, employee development, recruiting, training, and employee retention
Determine the “value” of employees
Fill leadership/management positions from within the organization
Groom and promote based on a set of predetermined skills, attitudes, and competencies
Regression Analysis
Statistics applied to data to make predictions
e.g., how product price and promotions affect sales
Marketing, pricing, product positioning, sales forecasting, advertising, human relations, customer service
Objectives: market response modeling and sales forecasting
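The sales forecasting objective can be sketched with the simplest form of regression: fitting a straight line of units sold against price by ordinary least squares. This is a minimal example under assumed, made-up figures; it uses price only, and adding promotions as a second factor would call for multiple regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: product price in dollars and units sold at that price.
prices = [2.0, 3.0, 4.0, 5.0]
units_sold = [100.0, 90.0, 80.0, 70.0]

slope, intercept = fit_line(prices, units_sold)

# Forecast sales at a price that has not been tried yet.
forecast = intercept + slope * 6.0
print(f"Each $1 increase changes sales by {slope:.1f} units; forecast at $6: {forecast:.1f}")
```

The slope is the "market response": how much sales move when price moves by one unit. Real sales data would not fall on a perfect line as these illustrative numbers do.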
Text Mining
Text mining is the process of automatically processing text and extracting information from it
Presidential election
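A very small example of "extracting information" from text is counting which terms appear most often, for instance in campaign coverage. The sketch below assumes a simple word-frequency approach with a short hand-picked stop-word list; the sample sentence is invented, and real text mining usually goes well beyond word counts.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list, just enough for this example.
STOP_WORDS = {"the", "a", "and", "of", "to", "on", "in", "is"}

def top_terms(text, n=3):
    """Return the n most frequent words in text, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Hypothetical snippet of election coverage.
speech = "The economy is the top issue. Voters talk about jobs and the economy, and jobs again."
print(top_terms(speech, 2))
```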
Text Mining Applications
Security Applications
Biomedical Applications
Online Media Applications
Academic Applications
Data Mining Advantages
Helps to reduce costs
Provides improved and more detail-oriented service
Increases market effectiveness
Beneficial to all industries
Data Mining Disadvantages
Privacy Issues
Access to personal information
Security Issues
Insufficient security systems
Misuse of information & inaccurate information
Data Mining Privacy
Who has access to consumers' personal information?
CVS Pharmacy & marketing companies
Data Mining Ethics: Consumers
How far is too far? Trustworthy?
Data is being collected & used
Opt-out boxes
What are some solutions that give consumers control?
Access to databases that have their information
The right to change what information is available
Data Mining Ethics: Businesses
Help enhance overall customer satisfaction
Profit enhancer?
Violation of privacy
Sometimes partnered with marketing companies
They also have access to private information