multiclassification system
Porkhun Olena, Taras Shevchenko National University of Kiev
Multiclassification System: Applications and Benefits
Problems requiring the development of a Multiclassification System
• Medical diagnostics
• Technical diagnostics
• Image segmentation
• Pattern recognition: face, speech, handwriting, barcode recognition, etc.
• Predicting deposits of commercial minerals
• Document classification problems
• NLP problems
• Text attribution problems
• …
This system solves classification problems in which the number of classes is ≥ 2
Benefits of Multiclassification System
1. Processing a large number of different data sets, independently of the number of features and the sample size
2. The learning process of the system can be parallelized, which makes it possible to construct a large set of classifiers
3. Errors occurring in the process of classification can be corrected, giving better results than existing approaches
Applications in Medical Diagnostics
• Cardiology
• Dermatology
• Oncology
• Virology
• Microbiology
Examples of features in Heart Disease
• chest pain location
• chest pain type
• resting blood pressure
• serum cholesterol in mg/dl
• resting electrocardiographic results
• beta blocker used during exercise ECG
• nitrates used during exercise ECG
• calcium channel blocker used during exercise ECG
• etc.
Examples of features in Dermatology Diagnostic
Clinical Attributes:
- erythema;
- scaling;
- definite borders;
- itching;
- Koebner phenomenon;
- polygonal papules;
- follicular papules;
- scalp involvement;
- family history;
- age; etc.
Histopathological Attributes:
- melanin incontinence;
- eosinophils in the infiltrate;
- PNL infiltrate;
- fibrosis of the papillary dermis;
- exocytosis;
- acanthosis;
- hyperkeratosis;
- parakeratosis; etc.
Examples of features in Image Segmentation
- the column of the center pixel of the region;
- the row of the center pixel of the region;
- the number of pixels in a region;
- a measure of the contrast of vertically adjacent pixels;
- the average over the region of (R + G + B)/3;
- the average over the region of the R value;
- 3-d nonlinear transformation of RGB;
- the average over the region of the B value;
- the average over the region of the G value;
- measures of the excess red, blue and green; etc.
Features in Handwriting Recognition
• Fourier coefficients of the character shapes;
• profile correlations;
• Karhunen-Loève coefficients;
• pixel averages in 2 x 3 windows;
• Zernike moments;
• morphological features;
• features of segments (lines): the initial and final coordinates; length of segment; length of the diagonal of the smallest rectangle;
• etc.
Data Sets used by System: UCI Machine Learning Repository
Approach and model underlying Multiclassification System
• The basic idea of the approach is to decompose the task into binary classification subtasks and to find an effective combination of binary classifiers using Error-Correcting Output Codes (ECOC) to obtain the best result.
• Methods for constructing effective codes are implemented in this system.
Example of a good code for 5 classes:
Class  f0  f1  f2  f3  f4  f5  f6  f7  f8  f9  f10 f11 f12 f13 f14
1      1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
2      0   0   0   0   0   0   0   0   1   1   1   1   1   1   1
3      0   0   0   0   1   1   1   1   0   0   0   0   1   1   1
4      0   0   1   1   0   0   1   1   0   0   1   1   0   0   1
5      0   1   0   1   0   1   0   1   0   1   0   1   0   1   0
• A neural network perceptron model is applied as the binary classifier
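The bit pattern of the 5-class code appears to match the exhaustive code construction known from the ECOC literature: for k classes there are 2^(k-1) - 1 columns, class 1 is all ones, and class i consists of alternating runs of 2^(k-i) zeros and ones. The sketch below generates a code under that assumption. The function name `exhaustive_code` is hypothetical and is not taken from the system itself.

```python
def exhaustive_code(k):
    """Return k codewords (lists of 0/1 bits), one per class.

    Assumption: the code follows the exhaustive construction with
    2**(k - 1) - 1 columns. Illustration only, not the system's own code.
    """
    if k < 2:
        raise ValueError("at least two classes are required")
    n_cols = 2 ** (k - 1) - 1
    rows = [[1] * n_cols]  # class 1: all ones
    for i in range(2, k + 1):
        run = 2 ** (k - i)  # length of each run of equal bits
        rows.append([(j // run) % 2 for j in range(n_cols)])
    return rows


# Print the code for 5 classes, one row per class, columns f0..f14.
for cls, row in enumerate(exhaustive_code(5), start=1):
    print(cls, "".join(str(bit) for bit in row))
```

Each column f0..f14 defines one binary subtask: the classes marked 1 are separated from the classes marked 0.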
Learning of binary classifiers can be parallelized
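Because every code column is a separate binary task, the perceptrons can be trained independently of one another. The following is a minimal sketch of that idea, assuming a toy 3-class code and invented, linearly separable 2-D points. The names `CODE`, `SAMPLES`, `train_perceptron` and `predict_bit` are hypothetical, and the classic perceptron update rule is used here as a stand-in for whatever training procedure the system actually applies.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy 3-class code (assumption): rows are classes, columns are binary tasks.
CODE = {1: (1, 1, 1), 2: (0, 0, 1), 3: (0, 1, 0)}

# Invented, linearly separable training points: ((x1, x2), class label).
SAMPLES = [((0, 0), 1), ((1, 0), 1), ((0, 1), 1),
           ((4, 0), 2), ((5, 0), 2), ((4, 1), 2),
           ((0, 4), 3), ((1, 4), 3), ((0, 5), 3)]


def train_perceptron(column, max_epochs=1000):
    """Train one perceptron for one code column; returns [w1, w2, bias]."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), cls in SAMPLES:
            target = 1 if CODE[cls][column] == 1 else -1
            x = (x1, x2, 1.0)  # constant 1.0 carries the bias term
            activation = sum(wi * xi for wi, xi in zip(w, x))
            if target * activation <= 0:  # misclassified: perceptron update
                w = [wi + target * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w


def predict_bit(w, point):
    """Return the 0/1 output of one trained perceptron for a 2-D point."""
    x = (point[0], point[1], 1.0)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


# The columns are independent tasks, so they are submitted concurrently.
with ThreadPoolExecutor() as pool:
    classifiers = list(pool.map(train_perceptron, range(3)))

print([predict_bit(w, (4, 0)) for w in classifiers])
```

A thread pool keeps the sketch short; for CPU-bound training in CPython a process pool would be the more usual choice.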
Multiclassification System
Learning Multiclassification System
Classification using developed system
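With ECOC, a common way to classify an object is to collect the 15 binary outputs into a bit string and choose the class whose codeword is nearest in Hamming distance. The sketch below assumes this nearest-codeword rule and uses the 5-class code from the example; `decode` and `flip` are hypothetical helper names, and the slides do not state which decoding rule the system applies.

```python
# Codewords of the 5-class example code, bits f0..f14 from left to right.
CODEWORDS = {
    1: "111111111111111",
    2: "000000001111111",
    3: "000011110000111",
    4: "001100110011001",
    5: "010101010101010",
}


def hamming(u, v):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(u, v))


def decode(bits):
    """Return the class whose codeword is nearest to the classifier outputs."""
    return min(CODEWORDS, key=lambda cls: hamming(CODEWORDS[cls], bits))


def flip(bits, positions):
    """Simulate wrong binary classifiers by inverting the given positions."""
    return "".join(
        ("1" if b == "0" else "0") if i in positions else b
        for i, b in enumerate(bits)
    )


# Three of the fifteen binary classifiers answer incorrectly for class 3.
received = flip(CODEWORDS[3], {0, 5, 14})
print(received, "->", decode(received))
```

In this example the three wrong bits are still decoded as class 3, because every other codeword differs from the received string in more positions.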
Pen-Based Recognition of Handwritten Digits; Artificial Characters Recognition; Image Segmentation; Dermatology Diagnostic
Precision – 97.7%
Precision – 99.96%
Precision – 85.71%
Precision – 98.6%
SOME COMPARISONS FOR UCI DATA SETS
• Precision for the DERMATOLOGY DATA SET:
- using Voting Feature Intervals – 96.2% (Bilkent University, Department of Computer Engineering and Information Science; Gazi University, School of Medicine, Department of Dermatology, Ankara, Turkey)
- using Multiclassification System (with ECOC) – 98.6% (Taras Shevchenko National University of Kiev, Faculty of Cybernetics)
• Precision for the PEN-BASED RECOGNITION DATA SET:
- using MLP – 95.26% (Boğaziçi University, Istanbul, Turkey)
- using Boost-NN – 96.1% (Computer Science Department, Boston University, USA)
- using Multiclassification System (with ECOC) – 97.5%
Precision of classification using One-Against-All and ECOC
[Bar chart: precision of classification (axis 0 to 100) for the data sets ArtCharacters, GlassIdentification, ImageSegmentation, PenDigits, Vehicle, Wine, HeartDisease and Dermatology; series: Precision of ExhaustiveCode/Column Selection and Precision of One_Against_All]
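One property that distinguishes the two coding schemes is the minimum Hamming distance d between codewords: a code can correct up to (d - 1) // 2 wrong binary outputs. The sketch below compares the 5-class example code with a One-Against-All code, assuming the latter is represented in the usual way as one column per class. The helper name `min_hamming_distance` is hypothetical, and the comparison illustrates the codes only, not the measured precision values.

```python
from itertools import combinations


def min_hamming_distance(codewords):
    """Smallest Hamming distance over all pairs of codewords."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for u, v in combinations(codewords, 2)
    )


# One-Against-All for 5 classes (assumed form): one binary task per class.
one_against_all = ["10000", "01000", "00100", "00010", "00001"]

# The 5-class example code, bits f0..f14 from left to right.
exhaustive = [
    "111111111111111",
    "000000001111111",
    "000011110000111",
    "001100110011001",
    "010101010101010",
]

for name, code in [("One_Against_All", one_against_all),
                   ("Exhaustive code", exhaustive)]:
    d = min_hamming_distance(code)
    print(f"{name}: min distance {d}, correctable errors {(d - 1) // 2}")
```

Under these assumptions the One-Against-All code has minimum distance 2 and corrects no errors, while the 15-bit code has minimum distance 8 and can correct up to 3.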
Precision of classification for all data sets using One_Against_All and Exhaustive Code Models
[Pie chart; legend: Exhaustive Code/Column Selection, One_Against_All; values: 61.81781474 (55%), 50.70852087 (45%)]
Thank you for your attention!
Porkhun Olena, PhD, assistant at the Faculty of Cybernetics, Taras Shevchenko National University of Kiev, e-mail: [email protected]