g. cauwenberghs520.776 learning on silicon silicon kernel learning “machines” gert cauwenberghs...
TRANSCRIPT
![Page 1: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/1.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Silicon Kernel Learning “Machines”
Gert CauwenberghsJohns Hopkins University
520.776 Learning on Siliconhttp://bach.ece.jhu.edu/gert/courses/776
![Page 2: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/2.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Silicon Kernel Learning “Machines”
OUTLINE
• Introduction– Kernel Machines and array processing– Template-based pattern recognition
• Kerneltron– Support vector machines: learning and
generalization– Modular vision systems– CID/DRAM internally analog, externally digital array
processor– On-line SVM learning
• Applications– Example: real-time biosonar target identification
![Page 3: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/3.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Massively Parallel Array Kernel “Machines”
• “Neuromorphic”– distributed
representation– local memory and
adaptation– sensory interface– physical computation– internally analog,
externally digital
• Scalablethroughput scales linearly
with silicon area
• Ultra Low-Powerfactor 100 to 10,000 less
energy than CPU or DSP
Example: VLSI Analog-to-digital vector quantizer(Cauwenberghs and Pedroni, 1997)
![Page 4: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/4.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Time
fro
m c
och
lea
Acoustic Transient Processor (ATP)with Tim Edwards and Fernando Pineda
– Models time-frequency tuning of an auditory cortical cell (S. Shamma)
– Programmable template (matched filter) in time and frequency
– Operational primitives: correlate, shift and accumulate– Algorithmic and architectural simplifications reduce
complexity to one bit per cell, implemented essentially with a DRAM or SRAM at high density...
Fre
qu
ency
![Page 5: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/5.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
– Channel differenced input, and binarized {-1,+1} template values, give essentially the same performance as infinite resolution templates.
– Correlate and shift operations commute, implemented with one shift register only.
Acoustic Transient Processor (ATP) Cont’d...Algorithmic and Architectural Simplifications (1)
![Page 6: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/6.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
– Binary {-1,+1} template values can be replaced with {0,1} because of normalized inputs.
– Correlation operator reduces to a simple one-way (on/off) switching element per cell.
Acoustic Transient Processor (ATP) Cont’d...Algorithmic and Architectural Simplifications (2)
![Page 7: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/7.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
– Channel differencing can be performed in the correlator, rather than at the input. The cost seems like a factor of two in complexity. Not quite:
– Analog input is positive, simplifying correlation to single-quadrant, implemented efficiently with current-mode switching circuitry.
– Shift-and-accumulate is differential.
Acoustic Transient Processor (ATP) Cont’d...Algorithmic and Architectural Simplifications (3)
![Page 8: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/8.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Acoustic Transient Processor (ATP)Memory-Based Circuit Implementation
Shift-and-Accumulate
Correlation
![Page 9: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/9.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
2.2mm
2.25
mm
– 2.2mm X 2.2mm in 1.2m CMOS
– 64 time X 16 frequeny bins– 30 W power at 5V
Acoustic Transient Processor (ATP)with Tim Edwards and Fernando Pineda
“Can” template
“Can” response “Snap” response
calc.
calc. meas.
meas.
correlation64 time X 16 freq
shift-accumulate
![Page 10: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/10.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Generalization and Complexity
– Generalization is the key to supervised learning, for classification or regression.
– Statistical Learning Theory offers a principled approach to understanding and controlling generalization performance.
• The complexity of the hypothesis class of functions determines generalization performance.
• Support vector machines control complexity by maximizing the margin of the classified training data.
![Page 11: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/11.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Kernel Machines
x X
)(
)(
)(
xX
xX
ii
),( K ),()()( xxxx ii K
)),((sign bKyySi
iii
xx
))()((sign byySi
iii
xx
)()( xxXX ii
Mercer’s Condition
)(sign byySi
iii
XX
Mercer, 1909; Aizerman et al., 1964Boser, Guyon and Vapnik, 1992
![Page 12: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/12.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Some Valid Kernels
– Polynomial (Splines etc.)
– Gaussian (Radial Basis Function Networks)
– Sigmoid (Two-Layer Perceptron)
)1(),( xxxx iiK
)exp(),( 2
2
2
xxxx i
iK
)tanh(),( xxxx ii LK only for certain L
k
k
x1x
2x
11y
22 ysign
y
Boser, Guyon and Vapnik, 1992
![Page 13: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/13.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Trainable Modular Vision Systems: The SVM Approach
– Strong mathematical foundations in Statistical Learning Theory (Vapnik, 1995)
– The training process selects a small fraction of prototype support vectors from the data set, located at the margin on both sides of the classification boundary (e.g., barely faces vs. barely non-faces)
SVM classification for pedestrian and face object detection
Papageorgiou, Oren, Osuna and Poggio, 1998
![Page 14: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/14.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Trainable Modular Vision Systems: The SVM Approach
– The number of support vectors and their dimensions, in relation to the available data, determine the generalization performance
– Both training and run-time performance are severely limited by the computational complexity of evaluating kernel functionsROC curve for various
image representations and dimensions
Papageorgiou, Oren, Osuna and Poggio, 1998
![Page 15: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/15.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Scalable Parallel SVM Architecture
– Full parallelism yields very large computational throughput
– Low-rate input and output encoding reduces bandwidth of the interface
![Page 16: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/16.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
The Kerneltron: Support Vector “Machine”
• 512 inputs, 128 support vectors
• 3mm X 3mm in 0.5um CMOS
• Fully parallel operation using “computational memories” in hybrid DRAM/CCD technology
• Internally analog, externally digital
• Low bit-rate, serial I/O interface
• Supports functional extensions on SVM paradigm
Genov and Cauwenberghs, 2001
512 X 128CID/DRAM array
128 ADCs
![Page 17: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/17.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Mixed-Signal Parallel Pipelined Architecture
– Externally digital processing and interfacing• Bit-serial input, and bit-parallel storage of matrix elements
• Digital output is obtained by combining quantized partial products
![Page 18: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/18.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
CID/DRAM Cell and Analog Array Core
– Internally analog computing• Computational memory
integrates DRAM with CID
All “1” stored All “0” stored
Linearity of parallel analog summation
input, shifted serially
![Page 19: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/19.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
1
0
),(,
1
0
)(),()(,
N
n
nmji
N
n
nj
nmi
mji yxwY
1
0
)(),()(, )1(
N
n
nj
nmi
mji xwY
1
0
)(),( )1(N
n
nj
nmi xw
1
0
)(1
0
)(),(N
n
nj
N
n
nj
nmi xxw
Feedthrough and Leakage Compensationin an extendable multi-chip architecture
![Page 20: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/20.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Oversampled Input Coding/QuantizationOversampled Input Coding/Quantization
• Binary support vectors are stored in bit-parallel form• Digital inputs are oversampled (e.g. unary coded) and presented bit-serially
1
0
)()(),(
1
0
),()(
1
0
)(11
0
1
0
)(1
0
)(1
,2
2 ;
N
n
jn
imn
jim
J
j
jim
im
I
i
im
iN
nnmnm
J
j
jnn
I
i
imn
imn
xwY
andYY
whereYXWY
xXwW Data encoding
Digital accumulation
Analog charge-mode accumulation
Analog delta-sigma accumulation
![Page 21: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/21.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Oversampling ArchitectureOversampling Architecture
– Oversampled input coding (e.g. unary)
– Delta-sigma modulated ADCs accumulate and quantize row outputs for all unary bit-planes of the input
eYQJ
j
jim
im
1
0
),()(
![Page 22: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/22.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Kerneltron Kerneltron IIII
• 3mm x 3mm chip in 0.5m CMOS• Contains 256 x 128 cells and 128 8-bit delta-sigma algorithmic ADCs• 6.6 GMACS throughput• 5.9 mW power dissipation• 8 bit full digital precision • Internally analog, externally digital• Modular; expandable • Low bit-rate serial I/O
Genov, Cauwenberghs, Mulliken and Adil, 2002
![Page 23: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/23.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Delta-Sigma Algorithmic ADCDelta-Sigma Algorithmic ADC
resV
resV
shV
shV
osQ
osQ
Residue voltage
S/H voltage
Oversampled digital output
8-bit resolution in 32 cycles
![Page 24: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/24.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Signed Multiply-Accumulate CellSigned Multiply-Accumulate Cell
All “01” pairs stored
input, shifted serially
“01” pairs “10” pairs
Linearity of parallel analog summation in XOR CID/DRAM cell configuration
![Page 25: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/25.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Stochastic ArchitectureStochastic Architecture
),0()(, NNY mji
- ADC resolutionrequirements are relaxed by
N
),( jimY
Range of isreduced by
),( jimY
N
![Page 26: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/26.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Stochastic ArchitectureStochastic Architecture
),0()(, NNY mji
- ADC resolutionrequirements are relaxed by
N
),( jimY
),( jimY
Range of isreduced by
),( jimY
N
- Analog addition nonlinearity does not affect the precision of computation
![Page 27: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/27.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Stochastic Input EncodingStochastic Input Encoding
1
0
)(12K
k
kk uU
;UU)(XX
Bernoulli
- random, uniform )(ku - random, Bernoulli
XWUWU)(XWY mmmm
If (range of U) >> (range of X), binary coefficients of (X+U) are ~ Bernoulli
UWm
)( UXW m
Bernoulli
Normal
Normal
![Page 28: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/28.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Stochastic Encoding: Image DataStochastic Encoding: Image Data
All bit planes
MSB plane
),()()( jim
jim YXW ),(
~)()( )( ji
mji
m Y UXWrandom
![Page 29: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/29.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Stochastic Encoding: Experimental Results Stochastic Encoding: Experimental Results
Worst-case mismatch
),( jimY
![Page 30: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/30.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Sequential On-Line SVM Learningwith Shantanu Chakrabartty and Roman Genov
zc0 = zc
Qcczc0>1
ccccc
cc
Qzz
zCg
0
)('
![Page 31: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/31.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Sequential On-Line SVM Learning
zc0
zc0<1 zc
Qcc
ccccc
cc
Qzz
zCg
0
)('
margin/error support vector
![Page 32: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/32.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Effects of Sequential On-Line Learning and Finite Resolution
• Matched Filter Response
• On-Line Sequential Training
• Kerneltron II
testtrain
-1
+1
• Batch Training
• Floating-Point Resolution
![Page 33: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/33.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Biosonar Signal Processor and Classifier
– Constant-Q filterbank emulates a simplified cochlear model
– Kerneltron VLSI support vector machine (SVM) emulates a general class of neural network topologies for adaptive classification
– Fully programmable, scaleable, expandable architecture
– Efficient parallel implementation with distributed memory
Signal detection &extraction
Welch SFT& decimation
Principal component analysis
2-Layer
Neural Network
Feature/aspect fusion
15-150kHz
16 (freq) X 15 (time)
30
6
30 X 22
22 X 6
sonar
240
15-150kHzsonar
32 (freq)
Continuous-Time Constant-Q Filterbank
32 (freq) X 32 (time)
KerneltronVLSI Classifier
SVM
... 5X128
Digital Postprocessor
5 (chips)
OrinconHopkins
Kerneltron--- Massively Parallel SVMHopkins, 2001
(adjustable)
256 X 128 processors
with Tim Edwards, APL; and Dave Lemonds, Orincon
![Page 34: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/34.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Real-Time Biosonar Signal Processor
– Analog continuous-time input interface at sonar speeds
• 250kHz bandwidth
• 32 frequency channels
– Digital programmable interface
• In-site programmable and reconfigurable analog architecture
![Page 35: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/35.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Frontend Analog Signal Processing
time (continuous)
freq
uen
cy
– Continuous-time filters
• Custom topology
• Individually programmable Q and center frequencies
• Linear or log spacing on frequency axis
– Energy, envelope or phase detection
• Energy, synchronous
• Zero-crossings, asynchronous
![Page 36: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/36.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Real-Time LFM2 Frontend Data
training test
![Page 37: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/37.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
LFM2 Kernel Computation
• Inner-product based kernel is used for SVM classification
• PCA is used to enhance features prior to SVM training
![Page 38: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/38.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Measured vs. Simulated PerformanceLFM2 mines/non-mines classification
– Hardware and simulated biosonar system perform close to the Welch STF/NN classification benchmark.
• Classification performance is mainly data-limited.
– Hardware runs in real time.
• 2msec per ping.• 50 times faster than
simulation on a 1.6GHz Pentium 4.
ROC curve obtained on test set by varying the SVM classification threshold
![Page 39: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/39.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
System Integration
512 X 128CID/DRAM array
128 ADCs
PC Interface• classification output; digital sonar input• programming and configuration
AnalogSonarInput
Frontend• 32 channels• 100Hz-200kHz• Smart A/D
Xilinx FPGA• I/O and dataflow control• reconfigurable computing
Kerneltron SVM• 1010- 1012 MAC/s• internally analog• externally digital
![Page 40: G. Cauwenberghs520.776 Learning on Silicon Silicon Kernel Learning “Machines” Gert Cauwenberghs Johns Hopkins University gert@jhu.edu 520.776 Learning](https://reader036.vdocuments.us/reader036/viewer/2022062423/56649f1b5503460f94c310cc/html5/thumbnails/40.jpg)
G. Cauwenberghs 520.776 Learning on Silicon
Conclusions
• Parallel charge-mode and current-mode VLSI technology offers efficient implementation of high-dimensional kernel machines.– Computational throughput is a factor 100-10,000 higher
than presently available from a high-end workstation or DSP, owing to a fine-grain distributed parallel architecture with bit-level integration of memory and processing functions.
– Ultra-low power dissipation and miniature size support real-time autonomous operation.
• Applications include unattended adaptive recognition systems for vision and audition, such as video/audio surveillance and intelligent aids for the visually and hearing impaired.