Divisive Clustering - Density Based Clustering - Latest Seminar … · Data Mining and...

„We are drowning in data, but we are starving for knowledge“ Part 2: Clustering - Hierarchical Clustering - Divisive Clustering - Density based Clustering Outline Data Mining and Knowledge Discovery in Large Databases Erik Kropat University of the Bundeswehr Munich, Germany


Page 1: Divisive Clustering - Density based Clustering - Latest Seminar … · Data Mining and Knowledge Discovery in Large Databases . Erik Kropat . University of the Bundeswehr Munich,

„We are drowning in data, but we are starving for knowledge“ Part 2: Clustering - Hierarchical Clustering - Divisive Clustering - Density based Clustering

Outline

Data Mining and Knowledge Discovery in Large Databases

Erik Kropat

University of the Bundeswehr Munich, Germany


Why “Data Mining”?

• Companies are collecting massive amounts of data on customers, operations, and the competitive landscape.

Firms can gain a competitive advantage from these data

• But, there is far too much data

− Online shops record purchase behaviours for millions of customers (sometimes with hundreds of features for each customer)

− Phone companies keep info on hundreds of millions of accounts (each with thousands of transactions)

− Databases can often be hundreds of terabytes in size (this will be peanuts in the future).


Why “Data Mining”?

“We are drowning in data, but we are starving for knowledge”

(John Naisbitt)


Knowledge Discovery in Large Databases

Process of finding valuable and useful patterns in datasets


Analysis of data sets from …

• businesses & investments

• finance & economics

• science & technology

• bioinformatics

• telecommunication

… or more complex data sets

• multimedia & sound

• images & video

• automatic news analysis

• social media analysis


Consumer data

What are the data sources?

− Credit card transactions data

− Supermarket transactions data

− Loyalty cards

− Web server logs

− Social media

Variety of features

− Name and address

− History of shopping and purchases

− Demographics

− Credit rating

− Quality & market share of products


Business Intelligence ‒ Customer Data Analytics & Market Analysis

− customer segmentation

− market basket analysis

− target marketing

− geo-marketing

− cross-selling / up-selling

− customer relationship management


Market Basket Analysis ‒ Cross Selling
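
As a rough illustration of what market basket analysis computes, the sketch below counts how often pairs of items are bought together and derives two common measures: support (the share of all transactions that contain both items) and confidence (the share of transactions with the first item that also contain the second). The transactions, the function name `pair_rules` and the threshold `min_support` are hypothetical and not taken from the slides; this is a minimal sketch, not a full association rule miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions (assumption, not data from the slides).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "chips"},
    {"bread", "butter", "chips"},
]

def pair_rules(transactions, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; ties are broken alphabetically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

rules = pair_rules(transactions)
for a, b, s, c in rules:
    print(f"{a} -> {b}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as "butter -> bread" with high confidence is the kind of pattern that could motivate a cross-selling offer, although in practice such rules are usually also checked against how popular the consequent is on its own.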


Key Tasks

Association Rule Learning

Decision Trees

Automatic Derivation of Ontologies

Neural Networks

Digital Forensics


Retail

• Customer segmentation

Identify purchase patterns of “typical” customers. Targeted advertisement, customized pricing, cost-effective promotions.

• Market basket analysis

Identify the purchase behaviour of groups of customers.

• Sales promotions

Identify likely responders to sales promotions.
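
To make the idea of customer segmentation concrete, here is a minimal sketch that groups customers with k-means, one simple partitional clustering method. The slides do not prescribe this method; it is swapped in purely for illustration. The customer data, the two features and the naive choice of starting centroids are all assumptions, and real data would normally be rescaled so that no single feature dominates the distance.

```python
import math

# Hypothetical customers: (purchases per month, average basket value in euros).
customers = [(1, 20), (2, 25), (1, 30), (9, 200), (10, 220), (11, 210)]

def kmeans(points, k, iterations=10):
    """Plain k-means: returns one cluster label per point and the final centroids."""
    # Naive deterministic start: k points spread across the input order
    # (an assumption for reproducibility; real tools use smarter seeding).
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
    return labels, centroids

labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Each resulting cluster can be read as a segment of customers with similar purchase behaviour, and its centroid as the "typical" customer of that segment.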


Banking

• Credit rating

Given a large number of names, which persons are likely to default on their credit cards?

• Fraud detection

− Credit card fraud detection

− Network intrusion detection


Telecommunications

Companies are facing escalating competition and are forced to aggressively market special pricing programs aimed at retaining existing customers and attracting new ones.

• Call detail record analysis

Identify customer segments with similar use patterns. Offer attractive pricing and feature promotions.

• Customer loyalty / customer churn management

Some customers repeatedly “churn” (switch providers). Identify those who are likely to switch or who are likely to remain loyal. Companies can target their spending on customers who will produce the most profit.

• Set pricing strategies in a highly competitive market.


Business Intelligence

Big Data is Big Business

Companies are using their data sets to aim their services and products with increasing precision.

− SAP AG is a German global software corporation that provides enterprise software applications.

− SAP AG is one of the largest enterprise software companies.

− In October 2007, SAP AG announced a $6.8 billion deal to acquire “Business Objects”.

− Since 2009, “Business Objects” has been a division of SAP AG rather than a separate company.


Outline

Part 1: Introduction - What is “Data Mining”? - Examples

Part 2: Formal Concept Analysis - Contexts and Concepts - Concept Lattices

Part 3: Clustering - Hierarchical Clustering - Partitional Clustering - Fuzzy Clustering - Graph Based Clustering

Part 4: Classification - k-th Nearest Neighbors - Support Vector Machines

Part 5: Spatial Data Mining - DBSCAN - Density & Connectivity

Part 6: Regulatory Networks - Eco-Finance Networks - Gene-Environment Networks


Questions?

For more information after today, email me at [email protected]