data mining - biomisa.orgbiomisa.org/wp-content/uploads/2019/10/lect-1-dm.pdf · • the elements...

57
1 Data Mining Lecture # 1 Introduction & Fundamentals


Post on 20-May-2020


Page 1: Data Mining - biomisa.orgbiomisa.org/wp-content/uploads/2019/10/Lect-1-DM.pdf · • The Elements of Statistical Learning --- Data Mining, Inference, and Prediction, by Hastie, Tibshirani,

Data Mining

Lecture # 1: Introduction & Fundamentals

Intro & Affiliations

Area of research: Analysis of medical images/signals using image/signal processing and machine learning techniques

www.biomisa.org/usman

www.biomisa.org

www.risetech.pk

www.albasr.com

www.ekko.pk


Reference Material

Text Book:

Data Mining --- Concepts and Techniques, by Han and Kamber, Morgan Kaufmann, 3rd Edition. (ISBN:1-55860-489-8)

Ref Books:

• Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley

• Principles of Data Mining, by Hand, Mannila, and Smyth, MIT Press, 2001. (ISBN:0-262-08290-X)

• The Elements of Statistical Learning --- Data Mining, Inference, and Prediction, by Hastie, Tibshirani, and Friedman, Springer, 2001. (ISBN:0-387-95284-5)

• Mining the Web --- Discovering Knowledge from Hypertext Data, by Chakrabarti, Morgan Kaufmann, 2003. (ISBN:1-55860-754-4)

Reference Material II

• Software:

– Weka: Data Mining Software in Java, by University of Waikato, New Zealand

– RapidMiner

– GeNIe & SMILE, developed at the Decision Systems Laboratory, University of Pittsburgh

– bnlearn: an R package for Bayesian network learning and inference

– . . .

• Website:

– http://www.kdnuggets.com/

– …

Topics

• Scope: Data Mining

• Topics:

– Introduction to Data Mining

– Data Understanding

– Data Preprocessing

– Data Warehousing

– Data Cube Technology

– Mining Frequent Patterns

– Advanced Pattern Mining

– Classification

– Advanced Classification Methods

– Clustering

– Outlier Detection

Grading

• Assignments 10%

• Quizzes 10%

• Project 10%

• Mid-Term Exam 30%

• Final Exam 40%


Assignment and Project

• Assignments

– No assignments will be accepted after the due date.

– Programming assignments should be well documented.

– Students are “not” allowed to “copy” each other’s work. Any such work will be marked zero.

– No tolerance for cheating. If you are not able to explain your assignment, it will be considered cheating.

• Projects

– Applying data mining techniques to solve actual problems.

DATA MINING

Definition

“Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data.”

Valid: The patterns hold in general.

Novel: We did not know the pattern beforehand.

Useful: We can devise actions from the patterns.

Understandable: We can interpret and comprehend the patterns.


Alternative names

– Knowledge discovery (mining) in databases (KDD)

– Knowledge extraction

– Knowledge engineering

– Data Science

– Data/pattern analysis

– Data archeology

– Data dredging

– Information harvesting

– Business intelligence

– etc.


We will return to the actual topic in two minutes. In the meantime, we are going to play a quick game.

I am going to show you some problems which were shown to pigeons!

Let us see if you are as smart as a pigeon!

Pigeon Problem 1

Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)

Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)

Pigeon Problem 1

Unlabeled objects (left bar, right bar): (8, 1.5) and (4.5, 7)

What class is this object? What about this one, A or B?

Pigeon Problem 1

The object (8, 1.5): This is a B!

Here is the rule. If the left bar is smaller than the right bar, it is an A, otherwise it is a B.
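The rule for Pigeon Problem 1 can be written down directly as a tiny classifier. The sketch below is only an illustration: the function name `classify_problem_1` is hypothetical, and the (left bar, right bar) pairs are the numbers listed on the slides.

```python
# Hypothetical helper: encodes the rule stated on the slide for Pigeon Problem 1.
def classify_problem_1(left_bar, right_bar):
    """Return "A" if the left bar is smaller than the right bar, else "B"."""
    return "A" if left_bar < right_bar else "B"


# Labeled examples from the slide, as (left bar, right bar) pairs.
class_a = [(3, 4), (1.5, 5), (6, 8), (2.5, 5)]
class_b = [(5, 2.5), (5, 2), (8, 3), (4.5, 3)]

# The rule reproduces every labeled example.
assert all(classify_problem_1(l, r) == "A" for l, r in class_a)
assert all(classify_problem_1(l, r) == "B" for l, r in class_b)

# The two unlabeled objects shown to the class.
print(classify_problem_1(8, 1.5))  # B
print(classify_problem_1(4.5, 7))  # A
```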

Pigeon Problem 2

Examples of class A (left bar, right bar): (4, 4), (5, 5), (6, 6), (3, 3)

Examples of class B (left bar, right bar): (5, 2.5), (2, 5), (5, 3), (2.5, 3)

Unlabeled objects (left bar, right bar): (8, 1.5) and (7, 7)

Even I know this one. Oh! This one’s hard!

Pigeon Problem 2

The object (7, 7): So this one is an A.

The rule is as follows: if the two bars are equal sizes, it is an A. Otherwise it is a B.
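A minimal sketch of the Pigeon Problem 2 rule follows. Comparing measured bar heights for exact equality can be fragile, so this version allows a small tolerance; the function name and the `tol` parameter are assumptions made for illustration and are not part of the slide.

```python
import math


# Hypothetical helper: "equal sizes" is checked up to a small tolerance.
def classify_problem_2(left_bar, right_bar, tol=1e-9):
    """Return "A" if the two bars have (almost) the same height, else "B"."""
    return "A" if math.isclose(left_bar, right_bar, abs_tol=tol) else "B"


# Labeled examples from the slide, as (left bar, right bar) pairs.
class_a = [(4, 4), (5, 5), (6, 6), (3, 3)]
class_b = [(5, 2.5), (2, 5), (5, 3), (2.5, 3)]

assert all(classify_problem_2(l, r) == "A" for l, r in class_a)
assert all(classify_problem_2(l, r) == "B" for l, r in class_b)

print(classify_problem_2(7, 7))    # A
print(classify_problem_2(8, 1.5))  # B
```

Unlike the rule for Problem 1, class A here lies exactly on the line where left equals right, so no single straight line separates the two classes in the (left, right) plane.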

Pigeon Problem 3

Examples of class A (left bar, right bar): (4, 4), (1, 5), (6, 3), (3, 7)

Examples of class B (left bar, right bar): (5, 6), (7, 5), (4, 8), (7, 7)

Unlabeled object (left bar, right bar): (6, 6)

This one is really hard! What is this, A or B?

Pigeon Problem 3

The object (6, 6): It is a B!

The rule is as follows: if the sum of the two bars is less than or equal to 10, it is an A. Otherwise it is a B.
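Rather than hard-coding the cut-off, a program can recover a workable one from the labeled examples. The sketch below assumes the useful feature is the sum of the two bars and simply places the threshold in the middle of the gap between the classes. `learn_sum_threshold` is a hypothetical name, and the learned value (10.5) differs from the 10 stated in the rule while still agreeing with it on every example shown.

```python
# Labeled examples from the slide, as (left bar, right bar) pairs.
class_a = [(4, 4), (1, 5), (6, 3), (3, 7)]
class_b = [(5, 6), (7, 5), (4, 8), (7, 7)]


def learn_sum_threshold(examples_a, examples_b):
    """Pick a cut-off on left + right that separates the classes, if one exists."""
    highest_a = max(l + r for l, r in examples_a)
    lowest_b = min(l + r for l, r in examples_b)
    if highest_a >= lowest_b:
        raise ValueError("classes are not separable by a threshold on the sum")
    return (highest_a + lowest_b) / 2  # midpoint of the gap between the classes


threshold = learn_sum_threshold(class_a, class_b)


def classify_problem_3(left_bar, right_bar):
    """Return "A" when the sum of the bars is at most the learned threshold."""
    return "A" if left_bar + right_bar <= threshold else "B"


print(threshold)                 # 10.5
print(classify_problem_3(6, 6))  # B
```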

Pigeon Problem 1

Here is the rule again. If the left bar is smaller than the right bar, it is an A, otherwise it is a B.

[Figure: plot with axes Left Bar and Right Bar, each running from 1 to 10]

Pigeon Problem 2

Let me look it up… here it is… the rule is, if the two bars are equal sizes, it is an A. Otherwise it is a B.

[Figure: plot with axes Left Bar and Right Bar, each running from 1 to 10]

Pigeon Problem 3

The rule again: if the square of the sum of the two bars is less than or equal to 100, it is an A. Otherwise it is a B. (For non-negative bar heights this is the same condition as the sum being at most 10.)

[Figure: plot with axes Left Bar and Right Bar, each running from 10 to 100]

Why Mine Data? Commercial Viewpoint

• Lots of data is being collected and warehoused

– Web data, e-commerce

– purchases at department/grocery stores

– Bank/Credit Card transactions


A Single View to the Customer

[Figure labels: Customer, Social Media, Gaming, Entertain, Banking, Finance, Our Known History, Purchase]

Variety (Complexity)

• Relational Data (Tables/Transaction/Legacy Data)

• Text Data (Web)

• Semi-structured Data (XML)

• Graph Data

– Social Network, Semantic Web (RDF), …

• Streaming Data

– You can only scan the data once

• A single application can be generating/collecting many types of data

• Big Public Data (online, weather, finance, etc.)

To extract knowledge, all these types of data need to be linked together

Evolution of Sciences

• Before 1600, empirical science

• 1600-1950s, theoretical science

– Each discipline has grown a theoretical component. Theoretical models often motivate experiments and generalize our understanding.

• 1950s-1990s, computational science

– Over the last 50 years, most disciplines have grown a third, computational branch (e.g. empirical, theoretical, and computational ecology, or physics, or linguistics.)

– Computational Science traditionally meant simulation. It grew out of our inability to find closed-form solutions for complex mathematical models.

• 1990-now, data science

– The flood of data from new scientific instruments and simulations

– The ability to economically store and manage petabytes of data online

– The Internet and computing Grid that makes all these archives universally accessible

– Scientific info. management, acquisition, organization, query, and visualization tasks scale almost linearly with data volumes. Data mining is a major new challenge!

Evolution of Database Technology


What is (not) Data Mining?

What is Data Mining?

– Certain names are more prevalent in certain locations (O’Brien, O’Rourke, O’Reilly… in Boston area)

– Identify customers with similar buying habits

– Find all credit applicants who are poor credit risks.

What is not Data Mining?

– Look up phone number in phone directory

– Identify customers who have purchased more than $10,000 in the last month.

– Find all credit applicants with last name of Smith.
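The contrast can be made concrete with a toy example. In the sketch below, the first computation is a plain query with a fixed, known condition, while the second looks for customers with similar buying habits without being told in advance who they are. All data, the Jaccard similarity measure, and the 0.5 cut-off are assumptions chosen for illustration.

```python
# Hypothetical toy data: customer -> (last month's spend in dollars, items bought).
customers = {
    "c1": (12500, {"laptop", "mouse", "monitor"}),
    "c2": (300, {"milk", "bread", "eggs"}),
    "c3": (11000, {"laptop", "monitor", "keyboard"}),
    "c4": (450, {"bread", "eggs", "butter"}),
}

# Not data mining: a lookup with a condition that is known beforehand.
big_spenders = sorted(
    name for name, (spend, _) in customers.items() if spend > 10_000
)


def jaccard(a, b):
    """Share of items two baskets have in common (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b)


# Closer to data mining: discover which customers have similar buying habits.
names = sorted(customers)
similar_pairs = [
    (x, y)
    for i, x in enumerate(names)
    for y in names[i + 1:]
    if jaccard(customers[x][1], customers[y][1]) >= 0.5
]

print(big_spenders)   # ['c1', 'c3']
print(similar_pairs)  # [('c1', 'c3'), ('c2', 'c4')]
```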


Knowledge Discovery (KDD) Process

• This is a view from typical database systems and data warehousing communities

• Data mining plays an essential role in the knowledge discovery process

[Figure: KDD process flow. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]

A Mining Framework

• Mining usually involves

– Data cleaning

– Data integration from multiple sources

– Warehousing the data

– Data cube construction

– Data selection for data mining

– Data mining

– Presentation of the mining results

– Patterns and knowledge to be used or stored into knowledge-base
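The steps listed above can be sketched as a chain of small functions. This is a simplified illustration on made-up shopping baskets, not a full framework: data cube construction is skipped, "warehousing" is just an in-memory list, and every function and variable name is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw sources: each record is one shopping basket.
store_a = [{"id": 1, "items": ["bread", "milk"]}, {"id": 2, "items": None}]
store_b = [{"id": 3, "items": ["bread", "milk", "eggs"]}, {"id": 4, "items": ["eggs"]}]


def clean(records):
    """Data cleaning: drop records whose item list is missing or empty."""
    return [r for r in records if r["items"]]


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]


def select(records, min_items=2):
    """Data selection: keep only baskets large enough to contain a pair."""
    return [r for r in records if len(r["items"]) >= min_items]


def mine(records):
    """Data mining: count how often each pair of items is bought together."""
    counts = Counter()
    for r in records:
        counts.update(combinations(sorted(r["items"]), 2))
    return counts


warehouse = integrate(clean(store_a), clean(store_b))  # stands in for the warehouse
patterns = mine(select(warehouse))

# Presentation of the mining results: report the most frequent pair.
print(patterns.most_common(1))  # [(('bread', 'milk'), 2)]
```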


Data Mining in Business Intelligence

[Figure: pyramid of layers, with increasing potential to support business decisions toward the top. Layers from top to bottom:]

• Decision Making

• Data Presentation: Visualization Techniques

• Data Mining: Information Discovery

• Data Exploration: Statistical Summary, Querying, and Reporting

• Data Preprocessing/Integration, Data Warehouses

• Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems

Roles shown alongside the layers, from top to bottom: End User, Business Analyst, Data Analyst, DBA

Mining vs. Data Exploration

• Business intelligence view

– Warehouse, data cube, reporting but not much mining

• Business objects vs. data mining tools

• Supply chain example: tools

• Data presentation

• Exploration


KDD Process: A Typical View from ML and Statistics

• This is a view from typical machine learning and statistics communities

[Figure: Input Data → Data Pre-Processing → Data Mining → Post-Processing]

• Data Pre-Processing: data integration, normalization, feature selection, dimension reduction

• Data Mining: pattern discovery, association & correlation, classification, clustering, outlier analysis, …

• Post-Processing: pattern evaluation, pattern selection, pattern interpretation, pattern visualization
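Two of the pre-processing steps named in this view, normalization and feature selection, can be illustrated in a few lines. The sketch below uses min-max scaling and a deliberately crude selection rule (drop columns that never vary); both choices, the function names, and the data are assumptions for illustration, and real projects typically use richer criteria.

```python
def drop_constant_features(rows):
    """A very simple feature selection: remove columns that never vary."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if max(col) != min(col)]
    return [[row[i] for i in keep] for row in rows], keep


def min_max_normalize(rows):
    """Rescale each column to the range [0, 1]; constant columns map to 0.0."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical records with three features; the middle one carries no information.
data = [[150.0, 1.0, 50.0], [160.0, 1.0, 60.0], [170.0, 1.0, 80.0]]

selected, kept = drop_constant_features(data)
normalized = min_max_normalize(selected)

print(kept)           # [0, 2]
print(normalized[0])  # [0.0, 0.0]
print(normalized[2])  # [1.0, 1.0]
```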


Example: Medical Data Mining

• Health care & medical data mining often adopts such a view from statistics and machine learning

• Preprocessing of the data (including feature extraction and dimension reduction)

• Classification and/or clustering processes

• Post-processing for presentation

Origins of Data Mining

• Draws ideas from: machine learning/AI, statistics, database systems, etc.

[Figure labels: Data Mining, Database Technology, Statistics, Machine Learning, Pattern Recognition, Algorithm, Visualization, Other Disciplines]

What is Machine Learning?

• Machine Learning: study of algorithms that

– improve their performance

– at some task

– with experience

• Optimize a performance criterion using example data or past experience.

• Role of Statistics: Inference from a sample

• Role of Computer science: Efficient algorithms to

– Solve the optimization problem

– Represent and evaluate the model for inference

Machine Learning

• According to Herbert Simon, learning is, “Any change in a System that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population.” [G. F. Luger and W. A. Stubblefield, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, The Benjamin/Cummings Publishing Company, Inc. 1989.]

Why “Learn”?

• Machine learning is programming computers to optimize a performance criterion using example data or past experience.

• Learning is used when:

– Human expertise does not exist (navigating on Mars),

– Humans are unable to explain their expertise (speech recognition)

– Solution changes in time (routing on a computer network)

– Solution needs to be adapted to particular cases (user biometrics)
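"Optimizing a performance criterion using example data" can be shown with one of the simplest learners. The sketch below fits a straight line to example points by least squares, where the criterion is the mean squared error. The data are made up, and the closed-form fit is just one possible way to do the optimization.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Performance criterion: average squared gap between prediction and target."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical example data, generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)                             # 2.0 1.0
print(mean_squared_error(xs, ys, 1.0, 0.0))         # 13.5 for an arbitrary guess
print(mean_squared_error(xs, ys, slope, intercept)) # 0.0 after learning
```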

The machine learning pipeline

ML Methods

• Supervised Learning

– Classification

– Regression/Prediction

• Unsupervised Learning

• Association Analysis

Predicting house prices

Sentiment analysis

Document retrieval

Product recommendation

Visual product recommender

Model Choice

– What type of classifier shall we use? How shall we select its parameters? Is there a best classifier?

– How do we train? How do we adjust the parameters of the model (classifier) we picked so that the model fits the data?

Features

• Features: a set of variables believed to carry discriminating and characterizing information about the objects under consideration

• Feature vector: A collection of d features, ordered in some meaningful way into a d-dimensional column vector, that represents the signature of the object to be identified.

• Feature space: The d-dimensional space in which the feature vectors lie. A d-dimensional vector in a d-dimensional space constitutes a point in that space.
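To make these definitions concrete, the sketch below represents three objects as feature vectors with d = 3 and measures how far apart they lie in the feature space. The objects, the chosen features, and the use of Euclidean distance are assumptions for illustration only.

```python
import math

# Hypothetical objects described by d = 3 features, always in the same order:
# (height in cm, weight in kg, age in years).
feature_vectors = {
    "object_1": (170.0, 65.0, 30.0),
    "object_2": (173.0, 69.0, 30.0),
    "object_3": (120.0, 25.0, 7.0),
}


def euclidean_distance(u, v):
    """Straight-line distance between two points of the same feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


d12 = euclidean_distance(feature_vectors["object_1"], feature_vectors["object_2"])
d13 = euclidean_distance(feature_vectors["object_1"], feature_vectors["object_3"])

print(d12)        # 5.0
print(d12 < d13)  # True: object_2 lies much closer to object_1 than object_3 does
```

Because each object is a point in the same d-dimensional space, similarity between objects becomes a matter of geometry, which is what many classifiers and clustering methods rely on.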


Features

Feature space (3D)


Features

• Feature Choice

– Good Features

• Ideally, for a given group of patterns coming from the same class, feature values should all be similar

• For patterns coming from different classes, the feature values should be different.

– Bad Features

• irrelevant, noisy, outlier?
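One rough way to quantify "similar within a class, different between classes" is to compare the gap between class means with the spread inside each class. The score below is a simple Fisher-style ratio chosen for illustration; the function name and the measurements are hypothetical, and it is not presented as the criterion used in the course.

```python
from statistics import mean, pvariance


def separation_score(values_a, values_b):
    """Squared distance between class means divided by the summed class variances."""
    spread = pvariance(values_a) + pvariance(values_b)
    return (mean(values_a) - mean(values_b)) ** 2 / spread


# Hypothetical measurements of two candidate features for classes A and B.
good_a, good_b = [1.0, 1.2, 0.8], [5.0, 5.2, 4.8]  # tight clusters, far apart
bad_a, bad_b = [1.0, 5.0, 3.0], [2.0, 4.0, 3.0]    # wide clusters, same mean

print(separation_score(good_a, good_b) > separation_score(bad_a, bad_b))  # True
print(separation_score(bad_a, bad_b))                                     # 0.0
```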


Features

[Figure panels: “Good” features, “Bad” features; Linear separability, Non-linear separability, Highly correlated features, Multi-modal]


Readings from Book (3rd Edn.)

• Chapter – 1


Acknowledgments

• Lecture slides are adapted from Data Mining: Concepts and Techniques by Han, Kamber and Pei https://hanj.cs.illinois.edu/bk3/bk3_slidesindex.htm

• Lecture slides are adapted from lectures of Dr. Aman Ullah, SS CASE IT, Islamabad

• Lecture series https://www.youtube.com/watch?v=h-q582wpb4Q&list=PLYwpaL_SFmcChP0xiW3KK9elNuhfCLVVi

• Lecture series https://www.youtube.com/watch?v=wAbyG4M2gns&t=1751s

• http://www.cs.uoi.gr/~tsap/teaching/2012f-cs059/slides-en.html
