Deep Learning for Educational Innovations
Outline
• From AI to Machine Learning to Deep Learning
• Why we need Deep Learning (DL)
• Different Deep Learning models
• Deep Learning in educational applications
From AI to ML to DL
Artificial Intelligence
Dartmouth Summer Workshop on Artificial Intelligence (proposed 1955, held 1956)
General AI: machines capable of sensing and reasoning… think just like we do
Narrow AI: technologies that are able to perform specific tasks as well as, or better than, we humans can
… even Narrow AI was mostly out of reach with early machine learning approaches
Input signal -> Pre-processing -> Feature selection & extraction -> Inference, prediction, recognition…
Most effort in early ML (esp. before 2010) went into feature selection & extraction:
most critical for accuracy, most time-consuming in the development cycle, often hand-crafted
Limitations of hand-crafted features
• Hand-crafted features
• Low adaptability to various applications: limits performance
• How to hand-engineer for new domains?
• Kinect, video, multispectral
• Feature computation time
• Getting prohibitive for large datasets (several sec/image)
Instead of designing features, can we train an
end to end system (parameterized function) in
which features are extracted and learnt
efficiently and implicitly?
Why not shallow learning?
BAD -- it may require an exponential number of templates!
Shallow models: kernel learning, boosting, …
Why do we need deep structure?
GOOD -- (exponentially) more
efficient:
intermediate computations can
be re-used
distributed representations which are shared across
classes
• Function composition is at the core of deep
learning methods
• The composition makes a highly non-linear system
• Each “simple function” will have parameters subject
to training
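A minimal sketch of function composition, assuming each "simple function" is an affine map followed by tanh. The helper names `make_layer` and `compose`, the layer sizes and the random weights are illustrative choices, not taken from the slides.

```python
import numpy as np

def make_layer(weights, bias):
    """Return one 'simple function': an affine map followed by tanh."""
    def layer(x):
        return np.tanh(weights @ x + bias)
    return layer

def compose(*functions):
    """Compose left to right: compose(f, g)(x) == g(f(x))."""
    def composed(x):
        for fn in functions:
            x = fn(x)
        return x
    return composed

rng = np.random.default_rng(0)
# Three simple functions, each with its own trainable parameters.
f1 = make_layer(rng.normal(size=(4, 3)), np.zeros(4))
f2 = make_layer(rng.normal(size=(4, 4)), np.zeros(4))
f3 = make_layer(rng.normal(size=(2, 4)), np.zeros(2))

deep = compose(f1, f2, f3)   # a highly non-linear function of x
x = np.array([0.5, -1.0, 2.0])
y = deep(x)
print(y)
```

The output of `f1` is computed once and reused by everything downstream, which is the re-use of intermediate computations that the slide refers to.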
Biological inspiration
[Figure: biological neuron, with labels: Dendrites, Axon, Terminal Branches of Axon]
- The visual cortex is hierarchical
- The brain uses billions of slow and unreliable processors (neurons) acting in parallel
- Thousands of incoming connections per neuron
[Figure: artificial neural network, with labels: input nodes, hidden nodes (neurons), output nodes, connections (with weights)]
$$y_i = f(w_{i1} x_1 + w_{i2} x_2 + w_{i3} x_3 + \cdots + w_{im} x_m + b) = f\Big(\sum_{j=1}^{m} w_{ij} x_j + b\Big) = f(X, W, b)$$
f could be: sigmoid, tanh, rectified linear (ReLU)
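As a small worked example of a single neuron, the sketch below evaluates y = f(sum_j w_j x_j + b) with each of the three activation functions. The input, weights and bias are made-up numbers chosen so that the pre-activation is 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def neuron(x, w, b, f):
    """One neuron: weighted sum of the inputs plus bias, passed through f."""
    return f(np.dot(w, x) + b)

x = np.array([1.0, 2.0, 3.0])      # inputs
w = np.array([0.5, -0.25, 0.1])    # weights (illustrative values)
b = 0.2                            # bias
# Pre-activation: 0.5*1 - 0.25*2 + 0.1*3 + 0.2 = 0.5

activations = [("sigmoid", sigmoid), ("tanh", np.tanh), ("relu", relu)]
outputs = {name: float(neuron(x, w, b, f)) for name, f in activations}
print(outputs)
```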
Deep Learning
What is Deep Learning:
Cascade of non-linear transformations
End to end learning
Distributed representations
Compositionality
Deep neural networks contain a large number of neurons, which can be computed in a distributed or parallel fashion. Each neuron is a simple non-linear function.
GPU (Graphics Processing Unit)
Parameters: weights and biases in all layers
Learning consists of minimizing the loss w.r.t. the parameters over the whole training set
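A minimal sketch of learning as loss minimization, assuming the simplest possible model (one linear layer), a mean squared error loss and full-batch gradient descent. The synthetic data, learning rate and step count are assumptions made for illustration; deep networks apply the same idea to many layers via backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: targets come from a known linear rule (no noise).
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
true_b = 0.3
y = X @ true_w + true_b

# Parameters to learn: weights and bias.
w = np.zeros(3)
b = 0.0
learning_rate = 0.1
losses = []

for step in range(500):
    error = X @ w + b - y
    losses.append(np.mean(error ** 2))      # loss over the whole training set
    grad_w = 2.0 * X.T @ error / len(y)     # d(loss)/dw
    grad_b = 2.0 * np.mean(error)           # d(loss)/db
    w -= learning_rate * grad_w             # move against the gradient
    b -= learning_rate * grad_b

print(losses[0], losses[-1], w, b)
```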
The DL Boom around 2010
Family of deep learning: BIG
• Convolutional Neural Networks (CNN): GoogLeNet, residual networks
• Recurrent Neural Networks (RNN): LSTM
• Deep Reinforcement Learning: “AlphaGo”
• Generative Adversarial Networks (GAN)
Convolutions in CNN
Convolution with a kernel
Convolution with multiple kernels
Learn these kernels during training
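To make the convolution step concrete, here is a sketch of a "valid" 2D convolution applied with two kernels to a tiny image. The kernels are fixed by hand for illustration (an edge detector and a blur); in a CNN they would be learned during training. Like most CNN libraries, the sketch computes cross-correlation, i.e. the kernel is not flipped.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# 5x5 image with a vertical edge: left two columns dark, the rest bright.
image = np.zeros((5, 5))
image[:, 2:] = 1.0

kernels = {
    "vertical_edge": np.array([[-1.0, 0.0, 1.0]] * 3),
    "blur": np.full((3, 3), 1.0 / 9.0),
}

# One feature map per kernel.
feature_maps = {name: conv2d(image, k) for name, k in kernels.items()}
for name, fmap in feature_maps.items():
    print(name)
    print(fmap)
```

Each row of the `vertical_edge` map is `[3, 3, 0]`: the kernel responds where its window straddles the edge and gives zero on the flat bright region.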
Applications of CNN
About sensing “what, where, when” from visual/acoustic/text/depth…
Recurrent Networks
Change the RNN architecture: long short-term memory (LSTM) or gated recurrent unit (GRU)
Attention model
Applications of RNN
Applications: word/sentence completion, translation, time series prediction,
image captioning, among others
Takes sequences as input; the output can be a single unit (e.g. predicting the next
movement of a human) or a sequence (e.g. translation: seq. of words -> seq. of words)
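The two output modes can be sketched with a plain (vanilla) recurrent network forward pass. This is an untrained toy with random weights; the dimensions and the name `rnn_forward` are assumptions for illustration, and an LSTM or GRU would replace the single tanh update with gated updates.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, W_hy, h0):
    """Run a vanilla RNN over a sequence; return per-step outputs and final state."""
    h = h0
    outputs = []
    for x_t in inputs:
        h = np.tanh(W_xh @ x_t + W_hh @ h)   # the same weights are reused at every step
        outputs.append(W_hy @ h)
    return np.stack(outputs), h

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, seq_len = 3, 5, 2, 4
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.5, size=(output_dim, hidden_dim))

sequence = rng.normal(size=(seq_len, input_dim))
all_outputs, final_hidden = rnn_forward(sequence, W_xh, W_hh, W_hy, np.zeros(hidden_dim))

sequence_output = all_outputs      # sequence -> sequence (one output per step)
single_output = all_outputs[-1]    # sequence -> single unit (last step only)
print(sequence_output.shape, single_output.shape)
```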
Deep Reinforcement Learning
Reinforcement learning (RL) is about an agent interacting
with the environment and learning, by trial and error,
an optimal policy for sequential decision making
Deep reinforcement learning = the combination of deep neural networks and reinforcement learning
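A minimal sketch of trial-and-error learning, using tabular Q-learning on a hypothetical five-state corridor (the environment, reward and hyperparameters are all invented for illustration). Deep reinforcement learning would replace the table `Q` with a deep neural network that estimates the same values.

```python
import numpy as np

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def step(state, action):
    """Hypothetical toy environment: reward 1 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    best = np.flatnonzero(q_row == q_row.max())
    return int(rng.choice(best))

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))   # action-value estimates
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(300):
    state = 0
    for _ in range(200):                 # cap on episode length
        action = choose_action(Q[state], epsilon, rng)
        next_state, reward, done = step(state, action)
        # Q-learning target: reward plus discounted best value of the next state.
        target = reward + (0.0 if done else gamma * Q[next_state].max())
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state
        if done:
            break

greedy_policy = Q[:-1].argmax(axis=1)    # learned action for each non-goal state
print(greedy_policy)
```

After training, the greedy policy chooses "right" in every non-goal state, which is the optimal policy for this corridor.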
Future of AI
• Board games such as Go and chess (AlphaGo, AlphaZero)
• Robotics
• Self-driving cars
• Computer games
Generative Adversarial Networks (GANs)
G (generator): turns random noise into imitations of the real data, in an attempt to fool D
D (discriminator): distinguishes genuine data from forgeries created by G
The two networks compete with each other!
Conditional information can be added to both G and D
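The competition between the two networks shows up in their loss functions. The sketch below assumes the standard binary cross-entropy discriminator loss and the common non-saturating generator loss; the discriminator outputs are made-up probabilities rather than outputs of a trained model.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """D wants d_real near 1 (genuine) and d_fake near 0 (forgery)."""
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    """G wants D to score its forgeries near 1 (non-saturating form)."""
    return float(-np.mean(np.log(d_fake)))

d_real = np.array([0.9, 0.8])         # D's scores on real samples
d_fake_weak_g = np.array([0.1, 0.2])  # D easily spots a weak generator
d_fake_good_g = np.array([0.6, 0.7])  # a better generator fools D more often

print("G loss, weak vs good:", generator_loss(d_fake_weak_g), generator_loss(d_fake_good_g))
print("D loss, weak vs good:", discriminator_loss(d_real, d_fake_weak_g),
      discriminator_loss(d_real, d_fake_good_g))
```

As the generator improves, its own loss falls while the discriminator's loss rises, so each network's progress is the other's setback.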
Open source libraries
Reference and computational resources
• A full reading list:
• http://deeplearning.net/reading-list/
• Evaluation of deep learning toolkits:
• https://github.com/zer0n/deepframeworks/blob/master/README.md
• Tutorials:
• http://deeplearning.net/tutorial/
• http://deeplearning.stanford.edu/tutorial/
• https://www.tensorflow.org
• High-end computers with decent CPU, RAM, GPU
• Online deep learning platforms:
• AWS deep learning instances
• Google Cloud AI
Limitations
• Main criticism: the lack of theory surrounding many of the methods
  • Most of the learning is just some form of gradient descent
  • Often looked at as a black box, with most confirmations done empirically
• Lack of mechanisms for complex reasoning, search, and inference
  • How to generate structured predictions (a long text, or a label map)?
• Lack of memory
  • Some applications require a way to store isolated facts (natural language understanding)
  • LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNN: far from mature
• Lack of the ability to perform unsupervised learning
  • Animals/humans learn the perceptual world in an unsupervised manner
Deep Learning in educational applications
• Facial expression generation in dyadic interactions: Generative Adversarial Networks
• Facial biometrics for test centers: Convolutional Neural Networks
• Generation of micro multimodal content (videos): Convolutional Neural Networks, Recurrent Networks
• Automatic passage generation: Convolutional Neural Networks
Facial Expression Generation in Dyadic Interactions
Given the facial expressions of humans, generate facial expressions of
agents
Applications: autonomous realistic avatars for interviews
Generate dynamic facial expressions for one agent
[Figure: generated agent expressions for each interviewee emotion: Joy, Anger, Fear, Contempt, Disgust, Surprise, Sad, Neutral]
In International Conference on Multimodal Interaction 2018
Generate facial expressions for multiple agents
In British Machine Vision Conference 2018
Facial biometrics for test centers
Test-taking fraud (cheating) happens in tests at all levels
Solution: identity verification based on deep-learning face and speech recognition.
With an Equal Error Rate of 5.6% on a tester dataset, our algorithm outperforms a third-party face recognition system (which has an EER of 7.4%)
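For readers unfamiliar with the metric: the Equal Error Rate is the error at the decision threshold where the false accept rate (impostors accepted) equals the false reject rate (genuine users rejected). The sketch below shows one simple way to estimate it; the scores are made-up numbers, not data from the system described here, and `equal_error_rate` is an illustrative helper.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    best_gap, eer, best_threshold = np.inf, None, None
    for t in thresholds:
        far = np.mean(impostor >= t)   # impostors wrongly accepted
        frr = np.mean(genuine < t)     # genuine users wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer, best_threshold = gap, (far + frr) / 2.0, float(t)
    return float(eer), best_threshold

# Hypothetical similarity scores (higher means "more likely the same person").
genuine = [0.9, 0.8, 0.7, 0.6, 0.3]
impostor = [0.1, 0.2, 0.3, 0.4, 0.65]

eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)   # 0.2 at threshold 0.6 for these toy scores
```

A lower EER means fewer errors of both kinds at the balanced operating point, which is why 5.6% compares favorably with 7.4%.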
Generation of micro multimodal content (videos)
• Content is key: promote engagement, increase interaction and boost efficacy
• But developing ‘good’ content at scale is difficult
• Utilize AI/Machine Learning to generate effective content from existing material
Scale up generation of educational video content for precision learning
[Figure: source videos (e.g. "Animals of Africa", "Natural scenery of Africa") pass through video segmentation into atomic clips, which are kept in an archive of atomic clips; new videos (e.g. "Wild Africa") are constructed from archived clips by semantic ordering (based on text captions and visual features) plus visual effect alignment]
[Figure: caption sentences S1 to S6 with values p1 to p5]
S: sentences of video caption
Caption text features (e.g. word2Vec embedding)
Extracted Visual features from frames (e.g. CNN based features)
“Text Segmentation as a Supervised Learning Task”, Omri Koshorek et al. 2018
(Semi) Automated Passage Generation (APG/SAPG)
Goals
• Help writers create testing passages in a more efficient way
• Provide adaptively searched and summarized material to learners
APG/SAPG - Framework
• Searching related passages
• Passage summarization (extractive, abstractive)
• Semantic ordering and integration
“TextRank: Bringing Order into Texts”, Rada Mihalcea and Paul Tarau, 2004
Multi-source passages -> Extractive summarization -> Coherence measuring of extracted sentences -> Merge & order -> Paraphrasing
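The extractive summarization step can be sketched in the spirit of TextRank: sentences are nodes, word overlap gives edge weights, and a PageRank-style iteration ranks them. This is a simplified illustration, not the APG/SAPG implementation: tokenization is a plain whitespace split, the scores are normalized to sum to 1, and the example sentences are invented.

```python
import math
import numpy as np

def sentence_similarity(a, b):
    """Word overlap normalized by the log lengths of the two sentences."""
    words_a, words_b = a.lower().split(), b.lower().split()
    denominator = math.log(len(words_a)) + math.log(len(words_b))
    if denominator == 0.0:             # both sentences have a single word
        return 0.0
    return len(set(words_a) & set(words_b)) / denominator

def textrank(sentences, damping=0.85, iterations=50):
    """Rank sentences by a PageRank-style walk over the similarity graph."""
    n = len(sentences)
    sim = np.array([[0.0 if i == j else sentence_similarity(sentences[i], sentences[j])
                     for j in range(n)] for i in range(n)])
    row_sums = sim.sum(axis=1, keepdims=True)
    # Row-normalize; a sentence with no similar neighbors links to all evenly.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    transition = np.where(row_sums > 0, sim / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        scores = (1.0 - damping) / n + damping * (transition.T @ scores)
    return scores

sentences = [
    "Deep learning models learn features from data",
    "Hand-crafted features limit performance on new data",
    "Deep learning models need large data sets",
    "The weather was sunny yesterday",
]
scores = textrank(sentences)
summary = [sentences[i] for i in np.argsort(scores)[::-1][:2]]  # top-2 sentences
print(scores)
print(summary)
```

The off-topic weather sentence shares no words with the others, so it receives the lowest score and is left out of the extract.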
“Abstractive Paraphrasing”
Seq2seq model (LSTM + Attention)
Old sentence: Why nobody answer my questions?
New sentence: Why does no one response to my questions?
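The slides do not say which attention variant is used, so the sketch below assumes plain dot-product attention as one common choice. At each decoding step, the decoder state is compared against every encoder state, and the resulting weights form a context vector. The vectors are toy values standing in for LSTM hidden states.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention(decoder_state, encoder_states):
    """Dot-product attention: weights over source positions and the context vector."""
    scores = encoder_states @ decoder_state    # one score per source position
    weights = softmax(scores)                  # non-negative, sums to 1
    context = weights @ encoder_states         # weighted average of encoder states
    return context, weights

# Toy encoder states for a 3-word source sentence (2-dimensional for readability).
encoder_states = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
decoder_state = np.array([2.0, 0.0])

context, weights = attention(decoder_state, encoder_states)
print(weights)
print(context)
```

Source positions 1 and 3 align with the decoder state and receive equal, larger weights, while position 2 is mostly ignored.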
Sum Up
Deep learning is a powerful tool in machine learning, producing the best results in most sub-fields of applied machine learning
Deep learning has not yet been widely used for education
Opportunities: creating new/smart content, personalized/customized learning, support for teachers, virtual
lecturers and learning environments, automation of administrative tasks
Deep learning is far from mature: it works, but lacks theory