Filtering noisy continuous labeled examples


Page 1: Filtering noisy continuous labeled examples

Filtering noisy continuous labeled examples

José Ramón Quevedo

María Dolores García

Elena Montañés

Artificial Intelligence Centre, Oviedo University (Spain)

IBERAMIA 2002

Page 2: Filtering noisy continuous labeled examples

Index

1. Introduction

2. The Principle

3. The Algorithm

4. Divide and Conquer

5. Experimentation

6. Conclusions

Page 3: Filtering noisy continuous labeled examples

Introduction

[Slide diagram: a plot of f(x) against x containing good examples and noisy examples; the Noisy Continuous Labeled Examples Filter removes the noisy ones before the data set reaches the Machine Learning System.]

Page 4: Filtering noisy continuous labeled examples

The Principle

The examples whose neighbour is a noisy one would improve their k-cnn errors if the noisy example were removed.

If removing an example improves the k-cnn errors of the rest of the examples in the data set, that example is probably a noisy one.


[Slide figure: "Example: Step Function" — a step function taking values 0 and 1, illustrating the 2-cnn errors of examples e3 and e6, and the errors recomputed without e3 and without e6.]
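To make the principle concrete, here is a minimal Python sketch of a leave-one-out k-cnn error. It assumes k-cnnError(e) is the absolute difference between the continuous label of e and the mean label of its k nearest neighbours among the other examples; the paper's exact definition may differ, and the step-function data below is only a toy stand-in for the slide's figure.

import numpy as np

def k_cnn_error(i, X, y, k=2):
    """Leave-one-out k nearest neighbour error of example i (assumed definition)."""
    dists = np.linalg.norm(X - X[i], axis=1)   # distance from example i to every example
    dists[i] = np.inf                          # never pick the example itself
    neighbours = np.argsort(dists)[:k]         # the k closest other examples
    return abs(y[i] - y[neighbours].mean())    # deviation from the neighbours' mean label

# Toy step function with a noisy label at index 2: its 2-cnn error stands out,
# and removing it would improve the 2-cnn errors of its neighbours.
X = np.arange(8, dtype=float).reshape(-1, 1)
y = np.array([0, 0, 1, 0, 1, 1, 1, 1], dtype=float)
print([round(k_cnn_error(i, X, y), 2) for i in range(8)])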

Page 5: Filtering noisy continuous labeled examples

The Algorithm


[Slide diagram: the Original Data Set passes through the Noisy Continuous Labeled Examples Filter to yield the Filtered Data Set.]

for each example e, sorted by decreasing k-cnnError {
    if (k-cnnError(e) <= MinError) break;
    if (prudentNoisy(DS - {e}))
        DS = DS - {e};
    else
        break;
}
return DS;

Here MinError is a threshold computed from the k-cnnError(e_i) values of all the examples e_i in the data set, and prudentNoisy compares the error E_N(e) of each example that remains (the RemovedExamples excluded) with the error E'_N(e) obtained after the candidate removal, accepting the removal only when those errors do not get worse.
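Below is a hedged Python rendering of the loop above, reusing the hypothetical k_cnn_error from the sketch in the Principle section. MinError and prudentNoisy are stand-ins, not the paper's formulas: here MinError is taken as the mean k-cnn error of the original data set, and prudentNoisy accepts a removal only if the mean k-cnn error of the remaining examples does not increase.

import numpy as np

def mean_error(X, y, k=2):
    return float(np.mean([k_cnn_error(i, X, y, k) for i in range(len(y))]))

def nclef(X, y, k=2):
    """NCLEF sketch: greedily drop the worst example while it is prudent to do so."""
    X, y = X.copy(), y.copy()
    min_error = mean_error(X, y, k)            # stand-in for the slide's MinError
    while len(y) > k + 1:
        errors = [k_cnn_error(i, X, y, k) for i in range(len(y))]
        worst = int(np.argmax(errors))         # example with the largest k-cnn error
        if errors[worst] <= min_error:         # no example is suspicious enough
            break
        X2, y2 = np.delete(X, worst, axis=0), np.delete(y, worst)
        if mean_error(X2, y2, k) <= mean_error(X, y, k):   # stand-in for prudentNoisy
            X, y = X2, y2                      # the removal helped: commit it
        else:
            break                              # the removal hurt: stop filtering
    return X, y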

Page 6: Filtering noisy continuous labeled examples

Divide and Conquer


Problem: high computational cost, O(NCLEF) = N · O(LOO(k-cnn)) = O(A²·N³)

Solution: use Divide & Conquer over the data set:
• Split: choose an example, using the ||·||₁ norm, that splits the data set into two subsets with a similar number of examples (see the sketch below)
• Stop: a constant threshold M, the maximum number of examples per subset

Result: O(NCLEFDC) = O(N·log(N) + N·A²)

[Slide diagram: D&C splits the Original Data Set into Data SubSets, NCLEF filters each one, and the Filtered SubSets merge into the Filtered Data Set; the whole pipeline is NCLEFDC.]
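Under the same assumptions, here is a sketch of the divide-and-conquer wrapper: subsets of at most M examples are filtered directly with the nclef sketch above, while larger sets are split at the median of the attribute with the widest range, a simple stand-in for the slide's ||·||₁-based split into two subsets of similar size.

import numpy as np

def nclefdc(X, y, k=2, M=64):
    """NCLEFDC sketch: recursively split, filter each subset, merge the results."""
    if len(y) <= M:                            # Stop: constant threshold M
        return nclef(X, y, k)
    attr = int(np.argmax(X.max(axis=0) - X.min(axis=0)))   # widest attribute (assumption)
    order = np.argsort(X[:, attr])             # split at the median of that attribute
    half = len(order) // 2                     # two subsets of similar size
    Xl, yl = nclefdc(X[order[:half]], y[order[:half]], k, M)
    Xr, yr = nclefdc(X[order[half:]], y[order[half:]], k, M)
    return np.vstack([Xl, Xr]), np.concatenate([yl, yr])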

Page 7: Filtering noisy continuous labeled examples


Experimental Results

• Experimentation data sets: Torgo's Repository
• 29 continuous data sets
• High diversity in the number of examples and attributes
• Experiment: cross validation with 10 folds


Relative error of each learner, without filtering and with NCLEFDC, at 0% and 10% label noise:

                             Cubist 1.10   m5'   RT4.1
without filter ( 0% noise)       46%       42%    48%
NCLEFDC        ( 0% noise)       47%       42%    47%
without filter (10% noise)       54%       54%    60%
NCLEFDC        (10% noise)       51%       49%    55%

Page 8: Filtering noisy continuous labeled examples

Conclusions

• NCLEFDC:
– Filters noisy continuous examples
– O(NCLEFDC) = O(N·log₂(N) + N·A²)

• Use of NCLEFDC:
– Without noisy examples: similar error
– With noise: significant improvement

• Future work:
– Filter noisy discrete examples
– Filter noisy examples and noisy attributes at the same time

Page 9: Filtering noisy continuous labeled examples

Filtering noisy continuous labeled examples

José Ramón Quevedo

María Dolores García

Elena Montañés

Artificial Intelligence Centre, Oviedo University (Spain)

IBERAMIA 2002