TRANSCRIPT
Forecast Anything! The Seven Data Mining Models
Andy Cheung, ISV Developer Evangelist, Microsoft Hong Kong
Agenda
Announcement
Overview
Microsoft Mining Model Algorithms
Lucky Draw!!!
Announcement
Learn Microsoft Technologies and Win a Prize!
To make it easier for you to learn Microsoft technologies, we have changed the way we deliver seminar content by offering you Offline Webcast CDs.
• 3 CDs in 6 months: 3 topics and assessment
• If you pass the assessment criteria, you will receive a $150 Park’n Shop cash coupon!
Since this is a trial offer, the number of participants is limited to 50 (on a first-come, first-served basis). Register now by sending email to the Microsoft Macau Team at [email protected]!
Data Mining Overview
Microsoft Data Mining Algorithms
Explores Your Data
Finds Patterns
Performs Predictions
Microsoft Mining Model Algorithms
Decision Trees
Naive Bayes
Cluster Analysis
Sequence Clustering
Association Rules
Time Series
Neural Networks
Decision Trees
Classify each case into one of a few discrete, broad categories of the selected attribute
Tree building is recursive partitioning: the data is split into partitions, and each partition is then split further
Initially all cases are in one big box
Decision Trees
The algorithm tries all possible breaks in classes using all possible values of each input attribute; it then selects the split that partitions the data into the purest classes of the searched variable
Several measures of purity exist
Then it repeats the splitting for each new class, again testing all possible breaks
Branches of the tree that are not useful can be pre-pruned or post-pruned
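The split search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Gini impurity as the purity measure (one of several possible measures; the slides do not say which one the Microsoft algorithm uses), it tests only "attribute equals value" splits, and the function names and the toy customer rows are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 means every label in the partition is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, target):
    """Try every (attribute, value) equality test and return the purest one.

    rows is a list of dicts; target is the name of the predictable column.
    Returns (weighted_impurity, attribute, value), or None if no split exists.
    """
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        for value in sorted({r[attr] for r in rows}):
            left = [r[target] for r in rows if r[attr] == value]
            right = [r[target] for r in rows if r[attr] != value]
            if not left or not right:
                continue  # a split that leaves one side empty is useless
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, value)
    return best

# Hypothetical toy data, invented for illustration only.
customers = [
    {"age": "young", "gender": "F", "buys": "yes"},
    {"age": "young", "gender": "M", "buys": "yes"},
    {"age": "old", "gender": "F", "buys": "no"},
    {"age": "old", "gender": "M", "buys": "no"},
]
split = best_split(customers, "buys")
```

Here the split on age separates buyers from non-buyers perfectly, while a split on gender leaves both partitions mixed. A full tree builder would call best_split again on each resulting partition.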
Decision Trees
Decision trees are used for classification and prediction
Typical questions:
Predict which customers will leave
Help in mailing and promotion campaigns
Explain reasons for a decision
What are the movies young female customers like to buy?
Microsoft Mining Models
Naïve Bayes
Classification and Prediction Model
Calculates probabilities for each possible state of the input attribute given each state of the predictable attribute
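As a rough illustration of that calculation, the sketch below estimates the probability of each input state given each state of the predictable attribute by simple counting. The function name, the column names and the loan rows are hypothetical, and no smoothing is applied, so it is a teaching sketch and not the product's implementation.

```python
from collections import Counter, defaultdict

def conditional_probabilities(rows, input_attr, target):
    """Return P(input state | target state), estimated by counting rows."""
    target_counts = Counter(r[target] for r in rows)
    pair_counts = Counter((r[target], r[input_attr]) for r in rows)
    table = defaultdict(dict)
    for (t_state, i_state), n in pair_counts.items():
        table[t_state][i_state] = n / target_counts[t_state]
    return dict(table)

# Hypothetical loan applications, invented for illustration only.
loans = [
    {"employed": "yes", "approved": "yes"},
    {"employed": "yes", "approved": "yes"},
    {"employed": "no", "approved": "yes"},
    {"employed": "no", "approved": "no"},
]
probs = conditional_probabilities(loans, "employed", "approved")
# probs["yes"]["yes"] is P(employed = yes | approved = yes)
```

To classify a new case, a Naive Bayes model multiplies these per-attribute probabilities with the prior probability of each class and picks the class with the largest product.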
Naïve Bayes
Used for classification
Assign new cases to predefined classes
Some typical questions:
Categorize bank loan applications
Determine which home telephone lines are used for Internet access
Assign customers to predefined segments
Quickly gain a basic understanding
Cluster Analysis
Grouping data into clusters
Objects within a cluster have high similarity based on the attribute values
The class label of each object is not known
Several techniques:
Partitioning methods
Hierarchical methods
Density-based methods
Model-based methods, and more
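To make the idea of a partitioning method concrete, here is a minimal k-means sketch on one-dimensional data. k-means is chosen only as a familiar example of the partitioning family; the slides do not state which technique the Microsoft clustering algorithm uses, and the function name, the starting centers and the ages are assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on numbers: assign to nearest center, then re-average."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical customer ages; no class labels are given.
ages = [21, 23, 25, 61, 63, 65]
centers, clusters = kmeans_1d(ages, centers=[20, 30])
```

No labels are supplied: the two groups emerge only from the similarity of the values, which is what distinguishes clustering from classification.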
Cluster Analysis
Segments a heterogeneous population into a number of more homogeneous subgroups or clusters
Some typical questions:
Discover distinct groups of customers
Identify groups of houses in a city
In biology, derive animal and plant taxonomies
Sequence Clustering
Analyzes sequence-oriented data that contains discrete-valued series
The sequence attribute in the series holds a set of events in a specific order that can be considered as a model
Typically used for Web customer analysis
Can be used for any other sequential data
Sequence Clustering
User  Sequence
1     frontpage news travel travel
2     news news news news news
3     frontpage news frontpage news frontpage
4     news news
5     frontpage news news travel travel travel
6     news weather weather weather weather
7     news health health business business business
8     frontpage sports sports sports weather
9     weather
Click-Stream Analysis
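One simple way to look at ordered data like the click-stream sequences above is to count how often one page follows another. The sketch below does only that first step; it is not the Sequence Clustering algorithm itself, and the function name is hypothetical. The sequences are copied from the nine users listed above.

```python
from collections import Counter

# Click-stream sequences for users 1 to 9, as listed above.
sequences = [
    "frontpage news travel travel",
    "news news news news news",
    "frontpage news frontpage news frontpage",
    "news news",
    "frontpage news news travel travel travel",
    "news weather weather weather weather",
    "news health health business business business",
    "frontpage sports sports sports weather",
    "weather",
]

def transition_counts(seqs):
    """Count each (page, next_page) pair across all sequences."""
    counts = Counter()
    for seq in seqs:
        pages = seq.split()
        counts.update(zip(pages, pages[1:]))
    return counts

counts = transition_counts(sequences)
most_common_pair = counts.most_common(1)[0]
```

Users whose sequences produce similar transition patterns (for example, staying on news, or moving from news to travel) are natural candidates for the same cluster.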
Microsoft Mining Models
Association Rules
For market basket analyses
Identify cross-selling opportunities
Arrange attractive packages
Considers each attribute/value pair as an item
An item set is a combination of items in a single transaction
The algorithm scans through the dataset trying to find item sets that tend to appear in many transactions
Association Rules – Support
Support is the percentage of rows containing the item combination compared to the total number of rows:
Transaction 1: Frozen pizza, cola, milk
Transaction 2: Milk, potato chips
Transaction 3: Cola, frozen pizza
Transaction 4: Milk, pretzels
Transaction 5: Cola, pretzels
The support for the rule “If a customer purchases Cola, then they will purchase Frozen Pizza” is 40%
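The 40% figure can be checked with a short sketch. The five transactions are the ones listed above; the function name and the use of Python sets are choices made for illustration.

```python
# The five transactions from the slide, one set of items per row.
transactions = [
    {"frozen pizza", "cola", "milk"},
    {"milk", "potato chips"},
    {"cola", "frozen pizza"},
    {"milk", "pretzels"},
    {"cola", "pretzels"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

cola_pizza = support({"cola", "frozen pizza"}, transactions)
```

Cola and frozen pizza appear together in transactions 1 and 3, so the support is 2 / 5 = 0.4.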
Association Rules – Confidence
What if 100% of customers buy milk and only 20% of those buy potato chips?
The confidence of an association rule is the support for the combination divided by the support for the condition
For the five transactions listed under Support, this gives the rule “If a customer purchases Milk, they will purchase Potato Chips” a confidence of 20% / 60% = 33%
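The 33% figure follows from the five transactions on the Support slide: milk appears in three of them (60%), and milk with potato chips in one (20%). A minimal sketch, with a hypothetical function name, that computes confidence directly from counts (dividing counts is equivalent to dividing supports, because the total number of rows cancels):

```python
# The five transactions from the Support slide.
transactions = [
    {"frozen pizza", "cola", "milk"},
    {"milk", "potato chips"},
    {"cola", "frozen pizza"},
    {"milk", "pretzels"},
    {"cola", "pretzels"},
]

def confidence(condition, result, transactions):
    """Share of transactions containing `condition` that also contain `result`."""
    with_condition = [t for t in transactions if condition <= t]
    with_both = [t for t in with_condition if result <= t]
    return len(with_both) / len(with_condition)

milk_to_chips = confidence({"milk"}, {"potato chips"}, transactions)
cola_to_pizza = confidence({"cola"}, {"frozen pizza"}, transactions)
```

By the same calculation the Cola to Frozen Pizza rule has a confidence of 2 / 3, twice that of the Milk to Potato Chips rule, even though milk and cola are equally frequent in this data.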
Time Series
Predicts continuous columns, such as product sales or stock performance, in a forecasting scenario
Builds a model in two stages:
The first stage creates a list of optimal candidate input columns
The second stage investigates each candidate input column and determines whether it improves the model
Microsoft Mining Models
Neural Network
Data modeling tool that is able to capture and represent complex input/output relationships
Neural networks resemble the human brain in the following two ways:
A neural network acquires knowledge through learning
A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights
It explores all possible data relationships
It is slow
Back-Propagation
Training a neural network means setting the best weights on the inputs of each of the units
The back-propagation process:
Get a training example and calculate outputs
Calculate the error: the difference between the calculated and the expected (known) result
Adjust the weights to minimize the error
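The three steps above can be sketched for the simplest possible case, a single sigmoid unit. With only one unit there are no hidden layers to propagate the error back through, so this shows the weight-update step and not a full multi-layer back-propagation. The function names, the learning rate, the epoch count and the logical-OR training set are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    """Output of one sigmoid unit for the given inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_unit(examples, epochs=2000, rate=0.5):
    """Train one sigmoid unit; examples is a list of (inputs, expected)."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, expected in examples:
            # 1. Calculate the output for this training example.
            output = predict(weights, bias, inputs)
            # 2. Calculate the error against the known result.
            error = output - expected
            # 3. Adjust the weights along the gradient of the squared error.
            delta = error * output * (1.0 - output)
            weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
            bias -= rate * delta
    return weights, bias

# Logical OR as a hypothetical training set.
or_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_unit(or_examples)
```

After training, predict(weights, bias, [0, 0]) falls below 0.5 and the other three input pairs rise above it. The many passes over the data hint at why the slides describe neural networks as slow.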
Conclusion: When To Use What
Analytical problem | Examples | Algorithms
Classification: Assign cases to predefined classes | Credit risk analysis, Churn analysis, Customer retention | Decision Trees, Naive Bayes, Neural Nets
Segmentation: Taxonomy for grouping similar cases | Customer profile analysis, Mailing campaign | Clustering, Sequence Clustering
Association: Advanced counting for correlations | Market basket analysis, Advanced data exploration | Decision Trees, Association
Time Series Forecasting: Predict the future | Forecast sales, Predict stock prices | Time Series
Prediction: Predict a value for a new case based on values for similar cases | Quote insurance rates, Predict customer income | All
Deviation analysis: Discover how a case or segment differs from others | Credit card fraud detection, Network intrusion analysis | All
© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.