Data Mining
Teaching experience at the FIB
What is Data Mining?
A broad set of techniques and algorithms brought from machine learning and statistics to make decisions based on data … plus a lot of experience and common sense
What subset is taught at the FIB?
Introduction to DM
Multivariate statistics
Clustering
Association rules
Regression and GLMs
Decision trees
Bayesian Gaussian classifiers
Nearest neighbours
Neural networks
Support vector machines
Additional subjects
Feature selection and extraction
Model evaluation, selection and combination
Data mining in the real world …
– Does it need professional software?
– A talk by a professional
Practical side
No exam
Three practical homeworks:
– 1. Multivariate statistics (15%)
– 2. Clustering & association rules (15%)
– 3. Full Data Mining Project (70%)
Involves a 20-minute oral defense!
Lab sessions (2 hours/week)
Language selected: R
Now for the funny part …
Why two teachers?
– One from the Software department (a computer scientist)
– One from the Statistics department (a statistician)
An equivalence table
Some tips for the students
A successful data mining project has four components:
1. Good data
2. Clean goals
3. Good algorithms
4. Human expertise
The students’ replies (and subtext)
R is not a programming language
– (it can’t be programmed as if it were C)
The results are not as good as we expected
– (although we are very clever)
Too much theory
– (but later on, we shall miss it)
All in all, we have enjoyed it
– (?)
In conclusion
The students like the course, especially the practical work
They tend to work autonomously, but not always to good effect
In the end, no two results are identical
This is a course with much room for improvement on the teachers’ part