![Page 1](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/1.jpg)
CS228: Deep Learning & Unsupervised Feature Learning
Andrew Ng
![Page 2](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/2.jpg)
How is computer perception done?
Object detection: Image → Low-level vision features → Recognition
Computer vision is hard!
![Page 3](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/3.jpg)
How is computer perception done?
Object detection: Image → Vision features → Recognition
Audio classification: Audio → Audio features → Speaker ID
NLP: Text → Text features → Text classification, MT, IR, etc.
![Page 4](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/4.jpg)
Sensor representations
Input → Low-level features → Learning/AI algorithm
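The pipeline on this slide (input, then low-level features, then a learning algorithm) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the lecture: `low_level_features`, `NearestCentroid`, and the toy signals are hypothetical, and the nearest-centroid classifier merely stands in for whatever learning algorithm sits at the end of the pipeline.

```python
from statistics import mean


def low_level_features(signal):
    """Hypothetical hand-designed features: mean level and mean absolute change."""
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return (mean(signal), mean(diffs))


class NearestCentroid:
    """Stand-in learning algorithm: predict the label of the closest class centroid."""

    def fit(self, features, labels):
        self.centroids = {}
        for label in set(labels):
            rows = [f for f, l in zip(features, labels) if l == label]
            # Average each feature dimension over the rows of this class.
            self.centroids[label] = tuple(mean(col) for col in zip(*rows))
        return self

    def predict(self, feature):
        def sq_dist(centroid):
            return sum((x - y) ** 2 for x, y in zip(feature, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


# Toy inputs (made up): "smooth" signals change slowly, "rough" ones jump around.
train_inputs = [[1, 1, 2, 2], [2, 2, 3, 3], [0, 5, 0, 5], [1, 6, 1, 6]]
train_labels = ["smooth", "smooth", "rough", "rough"]

# Input -> low-level features -> learning algorithm
model = NearestCentroid().fit(
    [low_level_features(x) for x in train_inputs], train_labels
)
prediction = model.predict(low_level_features([3, 3, 4, 4]))
```

The classifier only ever sees the output of `low_level_features`, so the quality of that representation bounds what the learning algorithm can do. That dependence is what motivates learning the features instead of designing them by hand.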
![Page 5](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/5.jpg)
A plethora of sensors
Camera array
3D range scan (laser scanner)
3D range scans (flash lidar)
Audio
Visible light image
Thermal infrared
A general-purpose algorithm for good sensor representations?
![Page 6](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/6.jpg)
Sensor representation in the brain
Seeing with your tongue
Human echolocation (sonar)
Auditory cortex learns to see.
[BrainPort; Martinez et al.; Roe et al.]
![Page 7](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/7.jpg)
Learning abstract representations
pixels → edges → object parts (combination of edges) → object models
[Related work: Deep learning, Hinton, Bengio, LeCun, and others.]
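To make the layering concrete, here is a toy Python sketch of the first two steps of that hierarchy: pixels to edges, then edges to a simple "part". It is a hand-coded stand-in for illustration only. The slide is about features that are learned from data, whereas `vertical_edge_map` and `has_vertical_boundary` are hypothetical, fixed detectors written by hand, and the tiny image is made up.

```python
def vertical_edge_map(image):
    """Level 1 (pixels -> edges): mark positions where brightness changes
    between horizontally adjacent pixels."""
    return [
        [int(row[x] != row[x + 1]) for x in range(len(row) - 1)]
        for row in image
    ]


def has_vertical_boundary(edge_map, min_height=3):
    """Level 2 (edges -> part): fire when edge responses stack up in one
    column for at least `min_height` consecutive rows."""
    for column in zip(*edge_map):
        run = 0
        for response in column:
            run = run + 1 if response else 0
            if run >= min_height:
                return True
    return False


# Made-up 3x4 binary image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

edges = vertical_edge_map(image)
part_present = has_vertical_boundary(edges)
```

Each level consumes only the output of the level below it, which is the structural point of the slide. A deep learning system would fit both levels from data rather than fixing them in code.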
![Page 8](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/8.jpg)
Feature learning for audio
Learned features correspond to phonemes and other “basic units” of sound.
Learned features
Algorithm:
![Page 9](https://reader036.vdocuments.us/reader036/viewer/2022062304/56649db95503460f94aa9035/html5/thumbnails/9.jpg)
| Domain | Benchmark and task | Prior art | Prior art accuracy | Stanford feature learning accuracy |
|---|---|---|---|---|
| Audio | TIMIT phone classification | Clarkson et al., 1999 | 79.6% | 80.3% |
| Audio | TIMIT speaker identification | Reynolds, 1995 | 99.7% | 100.0% |
| Images | CIFAR object classification | Yu and Zhang, 2010 | 74.5% | 79.6% |
| Images | NORB object classification | Ranzato et al., 2009 | 94.4% | 97.0% |
| Multimodal (audio/video) | AVLetters lip reading | Zhao et al., 2009 | 58.9% | 65.8% |
| Video | Hollywood2 classification | Laptev et al., 2004 | 48% | 53% |
| Video | KTH | Wang et al., 2010 | 92.1% | 93.9% |
| Video | UCF | Wang et al., 2010 | 85.6% | 86.5% |
| Video | YouTube | Liu et al., 2009 | 71.2% | 75.8% |

Galaxy
Other feature learning records: Different phone recognition task (Hinton), PASCAL VOC object classification (Yu)
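The accuracies on this slide are easier to compare when expressed as a gain over prior art. The Python sketch below copies the numbers from the slide and computes the absolute gain in percentage points for each benchmark. It also computes a relative error reduction, which is a metric added here for illustration and does not appear on the slide. The names `results` and `relative_error_reduction` are hypothetical.

```python
# (benchmark, prior-art accuracy %, feature-learning accuracy %), copied from the slide
results = [
    ("TIMIT phone classification", 79.6, 80.3),
    ("TIMIT speaker identification", 99.7, 100.0),
    ("CIFAR object classification", 74.5, 79.6),
    ("NORB object classification", 94.4, 97.0),
    ("AVLetters lip reading", 58.9, 65.8),
    ("Hollywood2 classification", 48.0, 53.0),
    ("KTH", 92.1, 93.9),
    ("UCF", 85.6, 86.5),
    ("YouTube", 71.2, 75.8),
]


def relative_error_reduction(prior_acc, new_acc):
    """Fraction of the prior method's errors that the new method removes.

    Both arguments are accuracies in percent; this metric is an addition
    for illustration, not something reported on the slide.
    """
    prior_err = 100.0 - prior_acc
    new_err = 100.0 - new_acc
    return (prior_err - new_err) / prior_err


for name, prior, new in results:
    gain = new - prior
    reduction = relative_error_reduction(prior, new)
    print(f"{name}: +{gain:.1f} points, {reduction:.0%} of prior errors removed")
```

Relative error reduction tends to be more informative than the raw gain when the baseline is already close to 100%, as in TIMIT speaker identification, where a gain of 0.3 points removes every remaining error.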