Post on 20-Dec-2015
Real-time Computer Vision with Scanning N-Tuple Grids
Simon Lucas
Computer Science Dept
Outline
Background: N-Tuple Classifiers
The scanning n-tuple grid
Isolated Character Recognition
Isolated Face Recognition
Convolutional Mode OCR
Real-time vision demo
Conclusions
N-Tuple Classifiers
Work by randomly sampling the input space
First applied to binary images
Very fast; reasonable accuracy
Scanning N-Tuple (SNT) classifier (Lucas, 1995)
  Applied to sequence recognition
  Fast and accurate
Current work: SNT Grid
  Specially developed for convolutional (sliding window) applications
  Recognise patterns independent of location
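The basic idea on this slide can be sketched in a few lines of Java. This is a minimal illustration, not the implementation from the talk: the class name, the constructor parameters, the add-one smoothing and the log-likelihood scoring are all assumptions chosen to keep the example small.

```java
import java.util.Random;

/** Illustrative n-tuple classifier for binary images (all names are hypothetical). */
final class NTupleClassifier {
    private final int[][] samplePoints; // [tuple][k] -> pixel index sampled by that tuple
    private final int[][][] counts;     // [class][tuple][address] -> training count
    private final int[] classTotals;    // number of training images seen per class
    private final int tupleSize;

    NTupleClassifier(int numPixels, int numTuples, int tupleSize, int numClasses, long seed) {
        this.tupleSize = tupleSize;
        Random rng = new Random(seed);
        samplePoints = new int[numTuples][tupleSize];
        for (int[] tuple : samplePoints)
            for (int k = 0; k < tupleSize; k++)
                tuple[k] = rng.nextInt(numPixels); // random sampling of the input space
        counts = new int[numClasses][numTuples][1 << tupleSize];
        classTotals = new int[numClasses];
    }

    /** Read the sampled pixels as the bits of a binary number. */
    private int address(boolean[] image, int[] tuple) {
        int addr = 0;
        for (int p : tuple) addr = (addr << 1) | (image[p] ? 1 : 0);
        return addr;
    }

    /** Training is one table increment per tuple, which is why it is fast. */
    void train(boolean[] image, int label) {
        for (int t = 0; t < samplePoints.length; t++)
            counts[label][t][address(image, samplePoints[t])]++;
        classTotals[label]++;
    }

    /** Pick the class with the highest summed log-likelihood (add-one smoothing assumed). */
    int classify(boolean[] image) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < counts.length; c++) {
            double score = 0;
            for (int t = 0; t < samplePoints.length; t++) {
                int n = counts[c][t][address(image, samplePoints[t])];
                score += Math.log((n + 1.0) / (classTotals[c] + (1 << tupleSize)));
            }
            if (score > bestScore) { bestScore = score; best = c; }
        }
        return best;
    }
}
```

Both training and classification are table lookups indexed by the sampled bits, with no arithmetic on the pixels themselves.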
SNT-Grid System Architecture
[Diagram stages]
Binarise (e.g. Niblack)
Scanning Index (SNT-Grid)
Likelihood Image
Integrated Likelihoods
Further Processing (e.g. dictionary or language model)
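For the binarisation stage, Niblack's method thresholds each pixel at the local mean plus k times the local standard deviation. The sketch below is one plain way to write that rule; the window radius, the value of k, the border clipping and the "darker than threshold means ink" convention are assumptions, since the slides do not give them.

```java
/** Illustrative Niblack-style binarisation; radius and k are assumed values, not from the talk. */
final class NiblackSketch {
    /** Marks a pixel as ink (true) when it is darker than mean + k * stddev of its neighbourhood. */
    static boolean[][] binarise(int[][] grey, int radius, double k) {
        int h = grey.length, w = grey[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                double sum = 0, sumSq = 0;
                int n = 0;
                // Neighbourhood is clipped at the image border
                for (int yy = Math.max(0, y - radius); yy <= Math.min(h - 1, y + radius); yy++)
                    for (int xx = Math.max(0, x - radius); xx <= Math.min(w - 1, x + radius); xx++) {
                        int v = grey[yy][xx];
                        sum += v;
                        sumSq += (double) v * v;
                        n++;
                    }
                double mean = sum / n;
                double std = Math.sqrt(Math.max(0, sumSq / n - mean * mean));
                out[y][x] = grey[y][x] < mean + k * std;
            }
        return out;
    }
}
```

A negative k (for example -0.2) is a common choice for dark text on a light background. This direct version recomputes every window from scratch, so it is written for clarity rather than speed.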
Simple Operation
Slide grid over image
Interpret each position as a binary number
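The operation above can be sketched directly. The slides do not say how the grid's sample points are laid out, so this example assumes a small dense block read row by row; the class and method names are hypothetical.

```java
/** Illustrative only: slide a small grid over a binary image and read each position as a binary number. */
final class GridScan {
    /**
     * Returns addr[y][x] = the gridH x gridW block with top-left corner (x, y),
     * read row by row as the bits of an integer (first pixel = most significant bit).
     * Assumes the image is at least as large as the grid.
     */
    static int[][] scan(boolean[][] image, int gridW, int gridH) {
        if (gridW * gridH > 30)
            throw new IllegalArgumentException("grid too large for an int address");
        int rows = image.length - gridH + 1;
        int cols = image[0].length - gridW + 1;
        int[][] addr = new int[rows][cols];
        for (int y = 0; y < rows; y++)
            for (int x = 0; x < cols; x++) {
                int a = 0;
                for (int dy = 0; dy < gridH; dy++)
                    for (int dx = 0; dx < gridW; dx++)
                        a = (a << 1) | (image[y + dy][x + dx] ? 1 : 0);
                addr[y][x] = a;
            }
        return addr;
    }
}
```

Each resulting number can then serve as an index into a lookup table of per-class statistics, in the same way as the addresses of a basic n-tuple classifier.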
Efficient Implementation
Very simple idea
Decompose one 2-d scan into two 1-d scans!
Reduces time complexity
  Suppose image is n x n
  Window is m x m
  Reduce from O(n^2 m^2) to O(n^2)
Well worth the effort!
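The slides do not spell out how the SNT-Grid scan is decomposed, so the sketch below uses a stand-in with the same complexity argument: an m x m window sum, computed naively and then as a horizontal 1-d running-sum pass followed by a vertical one. It illustrates the drop from O(n^2 m^2) to O(n^2) and is not the author's algorithm; all names are hypothetical.

```java
/** Illustrative only: an m x m window sum computed as two 1-d running-sum passes. */
final class TwoPassWindowSum {
    /** Naive version: visits every window pixel, O(n^2 m^2) for an n x n image. */
    static long[][] naive(int[][] img, int m) {
        int rows = img.length - m + 1, cols = img[0].length - m + 1;
        long[][] out = new long[rows][cols];
        for (int y = 0; y < rows; y++)
            for (int x = 0; x < cols; x++)
                for (int dy = 0; dy < m; dy++)
                    for (int dx = 0; dx < m; dx++)
                        out[y][x] += img[y + dy][x + dx];
        return out;
    }

    /** Two-pass version: a horizontal 1-d scan, then a vertical 1-d scan; O(n^2). */
    static long[][] twoPass(int[][] img, int m) {
        int h = img.length, w = img[0].length;
        int rows = h - m + 1, cols = w - m + 1;
        // Pass 1: rowSum[y][x] = img[y][x] + ... + img[y][x + m - 1]
        long[][] rowSum = new long[h][cols];
        for (int y = 0; y < h; y++) {
            long run = 0;
            for (int x = 0; x < w; x++) {
                run += img[y][x];
                if (x >= m) run -= img[y][x - m];
                if (x >= m - 1) rowSum[y][x - m + 1] = run;
            }
        }
        // Pass 2: out[y][x] = rowSum[y][x] + ... + rowSum[y + m - 1][x]
        long[][] out = new long[rows][cols];
        for (int x = 0; x < cols; x++) {
            long run = 0;
            for (int y = 0; y < h; y++) {
                run += rowSum[y][x];
                if (y >= m) run -= rowSum[y - m][x];
                if (y >= m - 1) out[y - m + 1][x] = run;
            }
        }
        return out;
    }
}
```

Each pass does a constant amount of work per pixel, so the cost no longer depends on the window size m.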
SNT-Grid Speed on MNIST
Java implementation
Chars are 28 x 28 grey-level images
Training (60,000 chars): 8 s (> 7,000 chars per second, cps)
Testing (10,000 chars): 3.8 s (> 2,600 cps)
ORL Face Data
40 subjects
10 images from each
Using 5 for training, 5 for testing
Average around 97.5% accuracy
Competitive with other methods
Much faster!
Museum Archive Cards
Hard to read with conventional OCR
'2' Detector: Integrated OP
(Uses Integral Array of Viola + Jones)
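The integral array of Viola and Jones lets the sum of any rectangular region be read with four lookups. A minimal sketch follows, assuming the values being accumulated are per-pixel likelihoods stored as doubles; the class and method names are hypothetical, and how the talk's system feeds its likelihood image into the array is not stated in the slides.

```java
/** Illustrative integral array for summing per-pixel values (e.g. likelihoods) over a window. */
final class IntegralImage {
    private final double[][] sum; // sum[y][x] = total of values in rows 0..y-1, columns 0..x-1

    IntegralImage(double[][] values) {
        int h = values.length, w = values[0].length;
        sum = new double[h + 1][w + 1];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                sum[y + 1][x + 1] = values[y][x] + sum[y][x + 1] + sum[y + 1][x] - sum[y][x];
    }

    /** Sum over rows y0..y1-1 and columns x0..x1-1, using four lookups. */
    double windowSum(int x0, int y0, int x1, int y1) {
        return sum[y1][x1] - sum[y0][x1] - sum[y1][x0] + sum[y0][x0];
    }
}
```

Building the array is a single pass over the image; after that, the cost of a window sum is independent of the window size.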
Real-time Demo
Very efficient
Can use it for real-time expression recognition
Or a 'video' joystick!
Bit like EyeToy, but potentially more sophisticated
Conclusions
Basis of simple and efficient computer vision
Trick is the scan decomposition
Also use of integral image to accumulate likelihoods
Currently being applied to reading text in natural scenes
Many other applications also
Further reading: ICDAR 2005 Paper (on my web page)