Corporate Affairs and Marketing (CA&M) Brand and Event Management
STRATEGIC INFORMATION SYSTEMS IV
STV401T / B
BTIP05 / BTIX05 - BTECH
DEPARTMENT OF INFORMATICS
LECTURE: 03 – (A)
DATA MINING
By: Dr. Tendani J. Lavhengwa
Inspirational Quotes
• My personal quote:
“Always be a thought ahead. Do not fear the blank page, everything started somewhere”
• Quotes to consider as inspiration:
"We're entering a new world in which data may be more important than software" ~ Tim O'Reilly
• Your quotes?
???
LECTURE: 05(A) - DATA MINING (DM) / KNOWLEDGE DISCOVERY FROM DATA (KDD)
Start-up items to discuss
1. Literature in context and evolution
2. Data Mining - multiple definitions
3. Key terms and definitions
4. Goal of DM and Strategic Competitive Advantage
5. Blending DM with Confluence from multiple disciplines
6. Data in Data Mining
7. How Data Mining works?
8. Data Mining applications and derived value
9. Databases vs Data Mining processing and Query examples
1. Literature in context and evolution
In 1999, Dr Arno Penzias (Nobel laureate and former chief scientist, Bell Labs) said:
• “Data mining will become much more important and companies will throw away nothing about their customers because it will be so valuable. If you’re not doing this, you’re out of business”
In 2006, Thomas Davenport (Harvard Business Review) argued:
• “the latest strategic weapon for companies is analytical decision making”
2. Data Mining - multiple definitions
Def. 1:
• Discovering or ‘mining’ knowledge from large amounts of data (Turban, 2011)
Def. 2:
• Knowledge discovery in databases: extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
Def. 3:
• Advanced methods for exploring and modelling relationships in large amounts of data
Def. 4:
• The non-trivial extraction of implicit, previously unknown, and potentially useful information from large databases
Def. 5:
• The use of automated data analysis techniques to uncover relationships among data items
Def. 6:
• Finding hidden information in a database
3. Key terms and definitions
• Data: a set of facts (items), usually stored in a database
• Attribute: a field in an item
• Pattern: an expression that describes a subset of facts
• Interestingness: a function that maps an expression into a measure space
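The four terms can be made concrete with a small sketch. The records, the `under_40_buys_milk` pattern, and the choice of support as the interestingness measure are all assumptions made for illustration; they do not come from the lecture.

```python
from typing import Any, Callable, Dict, List

# Data: a set of facts (items). Each item is a dict of attribute -> value.
# The records below are made-up illustrative values.
Item = Dict[str, Any]
data: List[Item] = [
    {"customer": "A", "age": 25, "bought_milk": True},
    {"customer": "B", "age": 41, "bought_milk": False},
    {"customer": "C", "age": 33, "bought_milk": True},
    {"customer": "D", "age": 58, "bought_milk": True},
]

# Attribute: a field in an item.
attributes = sorted(data[0].keys())

# Pattern: an expression that describes a subset of the facts.
Pattern = Callable[[Item], bool]


def under_40_buys_milk(item: Item) -> bool:
    return item["age"] < 40 and item["bought_milk"]


# Interestingness: a function that maps a pattern to a measure.
# Here the measure is support: the fraction of items the pattern matches.
def support(pattern: Pattern, items: List[Item]) -> float:
    return sum(1 for item in items if pattern(item)) / len(items)


matching = [item["customer"] for item in data if under_40_buys_milk(item)]
print(attributes)
print(matching, support(under_40_buys_milk, data))
```

Support is only one possible measure; any function that scores a pattern would fit the definition above.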
4. Goal of DM and Strategic Competitive Advantage
Decision Making Chain (diagram): Pre-Processing → Data Mining → [info] → Decision Making
Goal of Data Mining
5. Blending DM with Confluence from multiple disciplines
Data Mining is a confluence of:
• Database Technology
• Statistics
• Machine Learning
• Visualisation
• Information Science
• Other disciplines…
6. Data in Data Mining
7. How Data Mining works?
DM… seeks to identify four major patterns
DM… builds models to identify patterns among the attributes presented in the dataset
• Associations
• Predictions
• Clusters
• Sequential patterns / relationships
Note: discussed later in: Mining Algorithms and technologies
Major types of patterns, with examples
Association:
• establishing relationships among items that occur together
• example: beer and diapers going together in a market-basket analysis
Prediction / forecasting:
• the act of telling about a future occurrence
• example: predicting the winner of a soccer match or the temperature for a particular day
Clustering:
• finding groups of entities with similar characteristics
• example: assigning customers to groups based on income bracket and past purchase behaviour
Sequence relationships / discovery:
• finding time-based associations
• discovering time-ordered events
• example: predicting that a customer who opens a vehicle finance account will subsequently require an insurance policy (account), and later cover for minor scratches and dents
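The association pattern can be illustrated with the standard support, confidence and lift measures. This is a minimal sketch under stated assumptions: the five baskets are invented data, and the function names are hypothetical helpers, not part of the lecture or of any library.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical market baskets (made-up data for illustration only).
baskets: List[FrozenSet[str]] = [
    frozenset({"beer", "diapers", "chips"}),
    frozenset({"beer", "diapers"}),
    frozenset({"milk", "bread"}),
    frozenset({"diapers", "milk"}),
    frozenset({"beer", "chips"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Fraction of baskets that contain every item in itemset."""
    wanted = frozenset(itemset)
    return sum(1 for b in baskets if wanted <= b) / len(baskets)


def confidence(antecedent, consequent, baskets) -> float:
    """Estimate of P(consequent | antecedent).

    Assumes the antecedent occurs in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets) -> float:
    """Confidence relative to how common the consequent is overall."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


rule = ({"diapers"}, {"beer"})
print(support(rule[0] | rule[1], baskets))
print(round(confidence(*rule, baskets), 2))
print(round(lift(*rule, baskets), 2))
```

With these invented baskets the rule {diapers} → {beer} has support 0.4, confidence of about 0.67 and lift of about 1.11. A lift above 1 suggests the items occur together more often than they would if they were independent.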
8. Data Mining applications and derived value
• Auto insurance
– Detect groups of people who stage accidents to collect on insurance.
• Money laundering
– Detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network).
• Medical insurance
– Detect professional patients, rings of doctors and rings of references.
• Detecting inappropriate medical treatment
– The Australian Health Insurance Commission identified that in many cases blanket screening tests were requested (saving A$1m per year).
• Detecting telephone fraud
– Telephone call model: destination of the call, duration, time of day or week. Analyse patterns that deviate from an expected norm.
– British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.
• Retail
– Analysts estimate that 38% of retail shrink is due to dishonest employees.
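The idea of analysing call patterns that deviate from an expected norm can be sketched with a simple z-score check on call duration. This is only one possible approach and not necessarily what any telecom operator uses; the call records, field names, and the threshold of 3 standard deviations are assumptions for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List

# Hypothetical baseline of past call durations in minutes (made-up data).
baseline_durations = [3.0, 4.5, 2.0, 5.0, 3.5, 4.0, 2.5, 3.0]

# Hypothetical new calls to screen (made-up data).
new_calls: List[Dict] = [
    {"destination": "local", "duration": 4.0, "hour": 10},
    {"destination": "international", "duration": 95.0, "hour": 3},
    {"destination": "mobile", "duration": 6.0, "hour": 18},
]


def flag_deviations(baseline: List[float], calls: List[Dict],
                    threshold: float = 3.0) -> List[Dict]:
    """Return calls whose duration lies more than `threshold` sample
    standard deviations away from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # No variation in the baseline, so a z-score is undefined.
        return []
    return [c for c in calls if abs((c["duration"] - mu) / sigma) > threshold]


flagged = flag_deviations(baseline_durations, new_calls)
for call in flagged:
    print(call)
```

A fuller call model would also use destination and time of day or week, as the slide lists; duration alone is used here to keep the sketch short.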
9. Databases vs Data Mining processing and Query examples
DB (Database)
• Query
– Well defined
– SQL
• Data
– Operational data
• Output
– Precise
– Subset of database
DM (Data Mining)
• Query
– Poorly defined
– No precise query language
• Data
– Not operational data
• Output
– Fuzzy
– Not a subset of database
Database query examples
– Find all credit applicants with first name of Sane.
– Identify customers who have purchased more than Rs.10,000 in the last month.
– Find all customers who have purchased milk.
Data Mining query examples
– Find all credit applicants who are poor credit risks. (classification)
– Identify customers with similar buying habits. (clustering)
– Find all items which are frequently purchased with milk. (association rules)
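The contrast between the two milk questions can be sketched in code. The table layout, customer names, purchases, and the `min_share` threshold are all hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical purchase records: (customer, basket_id, item). Made-up data.
rows = [
    ("Ann", 1, "milk"), ("Ann", 1, "bread"),
    ("Bob", 2, "milk"), ("Bob", 2, "bread"), ("Bob", 2, "eggs"),
    ("Cat", 3, "tea"),
    ("Dan", 4, "milk"), ("Dan", 4, "eggs"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, basket_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)

# Database query: well defined, written in SQL, and the output is a
# precise subset of what is stored.
milk_customers = [r[0] for r in conn.execute(
    "SELECT DISTINCT customer FROM purchases WHERE item = 'milk' ORDER BY customer"
)]

# Data mining question: which items are frequently purchased with milk?
# The answer depends on a threshold and is derived, not stored.
baskets = {}
for _customer, basket_id, item in rows:
    baskets.setdefault(basket_id, set()).add(item)

milk_baskets = [b for b in baskets.values() if "milk" in b]
co_counts = Counter(item for b in milk_baskets for item in b if item != "milk")

min_share = 0.5  # assumed cut-off: item appears in at least half of milk baskets
frequent_with_milk = sorted(
    item for item, n in co_counts.items() if n / len(milk_baskets) >= min_share
)

print(milk_customers)
print(frequent_with_milk)
conn.close()
```

Changing `min_share` changes the mining answer while the SQL result stays the same, which is one way to read "fuzzy" versus "precise" output.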
QUESTIONS & ENQUIRIES