Weka: Brief Introduction

Post on 12-Sep-2021


Page 1: Weka: Brief Introduction

Features Covered in this Lecture
- Preprocessing – examining datasets and using filters.
- Classification – selecting and running classifiers.
- Visualization tools – brief exposure.

Explorer: Preprocessing the data

Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary.
Data can also be read from a URL or from an SQL database (using JDBC).
Pre-processing tools in WEKA are called “filters”.
WEKA contains filters for:
- Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
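To make two of the filter types above concrete, here is a minimal Python sketch of what normalization and equal-width discretization do to a numeric attribute. It is an illustration under assumptions, not WEKA's own implementation: the function names `normalize` and `discretize` are invented for this example, missing values are represented as `None`, and the input values are the cholesterol entries from the ARFF sample on this slide.

```python
from typing import List, Optional


def normalize(values: List[Optional[float]]) -> List[Optional[float]]:
    """Rescale values to [0, 1]; None (a missing value) is passed through."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    return [
        None if v is None else (0.0 if span == 0 else (v - lo) / span)
        for v in values
    ]


def discretize(values: List[Optional[float]], bins: int = 3) -> List[Optional[int]]:
    """Equal-width binning: map each value to a bin index in 0..bins-1."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins
    result: List[Optional[int]] = []
    for v in values:
        if v is None:
            result.append(None)          # keep missing values missing
        elif width == 0:
            result.append(0)             # all values identical: one bin
        else:
            # The maximum value would land in bin `bins`; clamp it to the last bin.
            result.append(min(int((v - lo) / width), bins - 1))
    return result


# Cholesterol column from the slide's sample rows; "?" is written as None.
cholesterol = [233.0, 286.0, 229.0, None]
print(normalize(cholesterol))
print(discretize(cholesterol, bins=3))
```

Both functions assume at least one non-missing value; a real filter would also need to remember `lo` and `hi` from the training data so the same scaling can be applied to new instances.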

WEKA only deals with “flat” files

@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
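The structure of the flat file above (a relation name, a list of typed attributes, then comma-separated rows with `?` for missing values) can be read with a few lines of Python. This is only a sketch for the subset of ARFF shown on the slide: `parse_arff` is a hypothetical helper, not part of WEKA, and it does not handle quoted values, sparse rows, string or date attributes.

```python
from typing import List, Tuple, Union

Value = Union[float, str, None]
NUMERIC_TYPES = ("numeric", "real", "integer")


def parse_arff(text: str) -> Tuple[str, List[Tuple[str, object]], List[List[Value]]]:
    """Parse a simple ARFF document into (relation, attributes, rows)."""
    relation = ""
    attributes: List[Tuple[str, object]] = []  # (name, "numeric") or (name, [labels])
    rows: List[List[Value]] = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            cells = [c.strip() for c in line.split(",")]
            row: List[Value] = []
            for (_, kind), cell in zip(attributes, cells):
                if cell == "?":
                    row.append(None)           # missing value
                elif kind in NUMERIC_TYPES:
                    row.append(float(cell))
                else:
                    row.append(cell)           # nominal label kept as a string
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind: object = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


SAMPLE = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""

relation, attributes, rows = parse_arff(SAMPLE)
print(relation, len(attributes), len(rows))
```

The sample contains only the four rows visible on the slide; the rows elided there are not reconstructed.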

Page 7: Weka: Brief Introduction

Explorer: building “classifiers”

Classifiers in WEKA are models for predicting nominal or numeric quantities.
Implemented learning schemes include:
- Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …

“Meta”-classifiers include:
- Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
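The difference between a base learning scheme and a “meta”-classifier can be sketched in a few lines of Python. The example below pairs a 1-nearest-neighbour learner (one kind of instance-based classifier) with bagging, which trains several copies of the base learner on bootstrap samples and predicts by majority vote. It is a simplified illustration, not WEKA's code: `train_1nn`, `bagging`, the parameter defaults, and the toy data are all made up for this sketch.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Instance = Tuple[Sequence[float], str]        # (numeric features, nominal class)
Model = Callable[[Sequence[float]], str]


def train_1nn(data: List[Instance]) -> Model:
    """Instance-based learner: store the data, predict the nearest neighbour's class."""
    def predict(x: Sequence[float]) -> str:
        def dist2(inst: Instance) -> float:
            return sum((a - b) ** 2 for a, b in zip(inst[0], x))
        return min(data, key=dist2)[1]
    return predict


def bagging(data: List[Instance],
            train: Callable[[List[Instance]], Model],
            n_models: int = 11,
            seed: int = 1) -> Model:
    """Meta-classifier: train `n_models` base models on bootstrap samples, then vote."""
    rng = random.Random(seed)
    # Each bootstrap sample draws len(data) instances with replacement.
    models = [train([rng.choice(data) for _ in data]) for _ in range(n_models)]

    def predict(x: Sequence[float]) -> str:
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]
    return predict


# Made-up, well-separated toy data: class "a" near x=0..2, class "b" near x=10..12.
toy: List[Instance] = (
    [((i * 0.1, 0.0), "a") for i in range(20)]
    + [((10.0 + i * 0.1, 0.0), "b") for i in range(20)]
)

model = bagging(toy, train_1nn)
print(model((0.5, 0.0)), model((10.5, 0.0)))
```

Because `bagging` only needs a `train` function, any other base learner with the same shape could be plugged in, which is the sense in which it is a classifier built on top of other classifiers.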

Page 13: Weka: Brief Introduction

Homework #1 – Due Feb. 11

Analyze the zoo dataset from the UCI repository using the Weka Explorer.

1. For each of the attributes feathers, predators, tail, and domestic, report on the types and numbers of animals having the attribute true.
2. Remove instances whose “type” attribute is greater than or equal to 4. Use the classifier J48graft to derive the corresponding decision tree. Draw the corresponding tree.
3. Use the rules classifier PART to derive the rules on the zoo dataset. List the rules obtained.
4. Remove the “type” attribute from the dataset and run the default clustering algorithm SimpleKMeans. How many clusters do you obtain? Can you relate these clusters to the initial class values?
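The homework is meant to be done in the Weka Explorer, but the logic of the counting and instance-removal steps can be checked with a short Python sketch. Everything here is an assumption for illustration: the rows are invented toy records (not taken from the UCI file), only two of the boolean attributes are included, and the helper names `count_true_by_type` and `remove_type_at_least` are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Invented toy rows for illustration only; these are NOT rows from the UCI zoo data.
animals: List[dict] = [
    {"name": "toy_mammal", "feathers": False, "tail": True,  "type": 1},
    {"name": "toy_bird_1", "feathers": True,  "tail": True,  "type": 2},
    {"name": "toy_bird_2", "feathers": True,  "tail": True,  "type": 2},
    {"name": "toy_fish",   "feathers": False, "tail": True,  "type": 4},
    {"name": "toy_insect", "feathers": False, "tail": False, "type": 6},
]


def count_true_by_type(rows: List[dict], attribute: str) -> Dict[int, int]:
    """For one boolean attribute, count how many animals of each type have it true."""
    counts = Counter(r["type"] for r in rows if r[attribute])
    return dict(sorted(counts.items()))


def remove_type_at_least(rows: List[dict], threshold: int = 4) -> List[dict]:
    """Drop instances whose 'type' value is greater than or equal to the threshold."""
    return [r for r in rows if r["type"] < threshold]


for attribute in ("feathers", "tail"):
    print(attribute, count_true_by_type(animals, attribute))

kept = remove_type_at_least(animals)
print([r["name"] for r in kept])
```

With the real dataset loaded, the same two helpers would apply to all four attributes named in the assignment; the classifier and clustering steps still need to be run in Weka itself.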