Upload: aleah-wommack

Post on 16-Dec-2015

Page Number:

1

Datamining in e-Business: Veni, Vidi, Vici!

by

Prof. Dr. Veljko Milutinovic

Page Number:

2

THIS IS A DEMO VERSION OF THE TUTORIAL IN DATAMINING FOR E-BUSINESS

ONLY A FEW SLIDES OF THE ORIGINAL TUTORIAL ARE PRESENTED HERE

Page Number:

3

Focus of this Presentation

Data Mining problem types

Data Mining models and algorithms

Efficient Data Mining

Available software

Page Number:

4

Decision Trees

Balance>10 Balance<=10

Age<=32 Age>32

Married=NO Married=YES
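
As a rough illustration of how a tree with these splits classifies a case, here is a minimal Python sketch. The split tests (Balance at 10, Age at 32, Married) come from the slide; the nesting order of the splits and the leaf labels ("approve", "reject") are assumptions made up for the example, since the transcript does not preserve the shape of the tree.

```python
# Hypothetical tree: the split tests come from the slide, but the nesting
# order and the leaf labels are assumptions for illustration only.
TREE = {
    "test": lambda case: case["balance"] > 10,
    "yes": "approve",                      # Balance > 10
    "no": {                                # Balance <= 10
        "test": lambda case: case["age"] <= 32,
        "yes": {                           # Age <= 32
            "test": lambda case: case["married"],
            "yes": "approve",              # Married = YES
            "no": "reject",                # Married = NO
        },
        "no": "approve",                   # Age > 32
    },
}


def classify(node, case):
    """Walk from the root to a leaf, following one branch per test."""
    while isinstance(node, dict):
        node = node["yes"] if node["test"](case) else node["no"]
    return node


if __name__ == "__main__":
    print(classify(TREE, {"balance": 5, "age": 25, "married": False}))
```

Each case follows exactly one path from the root to a leaf, so a tree always produces a single prediction for every case.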

Page Number:

5

Decision Trees

Page Number:

6

Rule Induction

Method of deriving a set of rules to classify cases

Creates independent rules that are unlikely to form a tree

Rules may not cover all possible situations

Rules may sometimes conflict in a prediction
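
The last two points can be made concrete with a small Python sketch. The two rules, the attribute names, and the class labels ("good", "bad") below are hypothetical and chosen only to show an uncovered case and a conflicting prediction; they are not taken from the tutorial.

```python
# Hypothetical rules for illustration: (name, condition, predicted class).
RULES = [
    ("R1", lambda c: c["balance"] > 10, "good"),
    ("R2", lambda c: c["age"] <= 32 and not c["married"], "bad"),
]


def apply_rules(case):
    """Return the set of classes predicted by every rule that fires."""
    return {label for _, cond, label in RULES if cond(case)}


def describe(case):
    """Summarise the outcome of applying all independent rules to one case."""
    predictions = apply_rules(case)
    if not predictions:
        return "uncovered"      # no rule applies to this case
    if len(predictions) > 1:
        return "conflict"       # rules disagree on the class
    return predictions.pop()


if __name__ == "__main__":
    print(describe({"balance": 20, "age": 25, "married": False}))
    print(describe({"balance": 5, "age": 40, "married": True}))
```

Because the rules are evaluated independently rather than along a single path, a case can match no rule or several rules at once; a practical tool needs some policy (a default class, rule priorities, or voting) for both situations.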

Page Number:

7

Comparison of fourteen DM tools

Evaluated by four undergraduates inexperienced at data mining, a relatively experienced graduate student, and a professional data mining consultant

Run under MS Windows 95, MS Windows NT, or Macintosh System 7.5

Use one of the four technologies: Decision Trees, Rule Induction, Neural Networks, or Polynomial Networks

Solve two binary classification problems, a multi-class classification problem, and a noiseless estimation problem

Price from $75 to $25,000

Page Number:

8

Comparison of fourteen DM tools

The Decision Tree products were

- CART
- Scenario
- See5
- S-Plus

The Rule Induction tools were

- WizWhy
- DataMind
- DMSK

Neural Networks were built from three programs

- NeuroShell2
- PcOLPARS
- PRW

The Polynomial Network tools were

- ModelQuest Expert
- Gnosis
- a module of NeuroShell2
- KnowledgeMiner

Page Number:

9

Criteria for evaluating DM tools

A list of 20 criteria for evaluating DM tools, put into 4 categories:

Capability measures what a desktop tool can do, and how well it does it

- Handles missing data
- Considers misclassification costs
- Allows data transformations
- Includes quality of testing options
- Has a programming language
- Provides useful output reports
- Provides visualisation
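
To show why "considers misclassification costs" is a separate capability from plain accuracy, here is a minimal Python sketch. The cost matrix, class labels, and predictions are invented for illustration and do not come from the tutorial or the evaluated tools.

```python
# Hypothetical cost matrix, indexed as COST[actual][predicted].
# Assumption: calling a "bad" case "good" is five times worse than the reverse.
COST = {
    "good": {"good": 0, "bad": 1},
    "bad": {"good": 5, "bad": 0},
}


def error_rate(actual, predicted):
    """Fraction of cases where the prediction differs from the actual class."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def total_cost(actual, predicted):
    """Sum of the cost-matrix entries over all cases."""
    return sum(COST[a][p] for a, p in zip(actual, predicted))


if __name__ == "__main__":
    actual = ["good", "good", "bad", "bad"]
    model_a = ["good", "bad", "bad", "bad"]    # one cheap mistake
    model_b = ["good", "good", "good", "bad"]  # one expensive mistake
    for name, pred in (("A", model_a), ("B", model_b)):
        print(name, error_rate(actual, pred), total_cost(actual, pred))
```

Both hypothetical models make one mistake in four cases, yet their total costs differ, which is the distinction a cost-aware tool is able to exploit when building or selecting a model.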

Page Number:

10

Criteria for evaluating DM tools

Interoperability shows a tool’s ability to interface with other computer applications

- Importing data
- Exporting data
- Links to other applications

Flexibility

- Model adjustment flexibility
- Customizable work environment
- Ability to write or change code