LLNL-PRES-671957
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC
Application of Machine Learning
Patterns and Behaviors in Complex Systems
9-8-10-SSCI
James M. Brase
Deputy Associate Director, Computation
Lawrence Livermore National Laboratory
Machine learning is applied to a broad set of applications at LLNL
Document analysis – Is this document relevant to topic Y? Topics are defined as distributions of terms, phrases, phrase graphs ….
Cybersecurity – How many network connections do we expect node A to make in the next minute?
Materials science – Discovery of patterns in component material attributes and critical reaction parameters to produce custom-designed properties
Adaptive mesh simulation – Will this simulation parameter set cause the mesh to tangle?
Image and multimedia analysis – Can we label the objects in this image? Can we find other, similar videos?
Machine learning – statistical inference of patterns in data
[Figure: training data shown as feature vectors with entries of 1 and -1, each paired with a label, together forming the training set]
Supervised learning – Mapping feature vectors to labels
• Discrete labels – classifiers
• Continuous labels – regression
• Function mapping
• Logistic regression
• Random forests
• Neural networks
Unsupervised learning – Finding structure in data
• Association rules
• Clustering
• Density estimation
• Autoencoders
[Figure: two phases. Training uses the training set; Applying takes the feature vector of new data]
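To make the training and applying phases concrete, here is a minimal sketch of supervised learning with logistic regression, one of the methods listed above. The feature vectors, labels, learning rate, and function names are illustrative assumptions and do not come from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this training example
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Apply the trained model to a new feature vector; returns 0 or 1."""
    score = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1 if score >= 0.5 else 0

# Training: hypothetical feature vectors with entries of +1 / -1.
# In this toy set the label is 1 exactly when the first feature is +1.
TRAIN_X = [[1, -1], [-1, -1], [1, 1], [-1, 1]]
TRAIN_Y = [1, 0, 1, 0]
W, B = train_logistic(TRAIN_X, TRAIN_Y)

# Applying: classify the feature vector of a new data point.
NEW_LABEL = predict(W, B, [1, 1])
```

The same two-step shape (fit on a labeled training set, then map new feature vectors to labels) applies to the other listed methods such as random forests and neural networks.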
Learning language models for estimating document relevance
[Diagram components:]
• New documents
• Keyphrase extractor
• Weak filtering
• Entity extractor
• Collocation filter
• New document graph
• Training graph models
• Graph classifier – Relevant graphs vs background graphs
• Relevance score
• Forced migration reference documents
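As a rough illustration of scoring a new document against a reference set, the sketch below treats a topic as a distribution of terms and uses cosine similarity as the relevance score. This is a deliberately simplified stand-in and not the graph classifier in the diagram; the tokenizer, function names, and example sentences are all hypothetical.

```python
import math
from collections import Counter

def term_distribution(text):
    """Normalized term frequencies for a piece of text."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    counts = Counter(w for w in words if w)
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse term distributions."""
    dot = sum(v * q.get(term, 0.0) for term, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)

def relevance_score(document, reference_documents):
    """Score a new document against the pooled reference documents."""
    topic = term_distribution(" ".join(reference_documents))
    return cosine(term_distribution(document), topic)

# Hypothetical reference set and new documents, for illustration only.
REFERENCE = [
    "families displaced by conflict crossed the border",
    "refugees displaced by flooding sought shelter",
]
RELATED = "thousands of refugees were displaced by the conflict"
UNRELATED = "the team won the championship game"
```

A phrase-graph model as described on the slide would capture structure that a bag of terms cannot, so this sketch only shows where a relevance score fits in the flow.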
Document relevance for the NYT corpus
Relevance to forced migration reference document set
Cybersecurity uses machine learning and graph analysis to model network behavior
Applications
• Inferring node and group roles
• Prediction of activity distributions
• Cueing analysts to anomalous behaviors
• Functional network discovery and characterization
Collect packets, flow and process data from the full physical network
Build a dynamic graph representation of activity
Machine learning on the dynamic graph
• Node and group classification algorithms
• Temporal activity models – dynamic Bayesian networks
• Anomaly detection algorithms
Stream processing for feature and signature extraction
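The collect, build, and detect steps above can be sketched as follows. This is a toy version under stated assumptions: flow records are hypothetical `(timestamp, src, dst)` tuples, the dynamic graph is a set of edges per one-minute window, and a simple z-score rule stands in for the anomaly detection algorithms, which the slides do not specify.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_dynamic_graph(flows, window_seconds=60):
    """Group (timestamp, src, dst) records into one edge set per time window."""
    snapshots = defaultdict(set)
    for timestamp, src, dst in flows:
        snapshots[int(timestamp // window_seconds)].add((src, dst))
    return dict(snapshots)

def connection_counts(snapshots, node):
    """Distinct outgoing connections made by node in each window, in time order."""
    first, last = min(snapshots), max(snapshots)
    return [
        sum(1 for src, _ in snapshots.get(window, ()) if src == node)
        for window in range(first, last + 1)
    ]

def is_anomalous(history, observed, threshold=3.0):
    """Flag a count that lies more than `threshold` standard deviations from the mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

For example, given the per-minute counts for node A so far, `is_anomalous(counts[:-1], counts[-1])` asks whether the latest minute departs from what the earlier minutes lead us to expect.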
Cyber mapping and activity models for improved activity prediction and anomaly detection
Ryan Rossi, Brian Gallagher, Jennifer Neville, Keith Henderson. Modeling Dynamic Behavior in Large Evolving Graphs. ACM International Conference on Web Search and Data Mining (WSDM), 2013.
Learning Markov models for behavior forecasting
Host role learning
Anomaly Detection in host role distribution
Dynamic IP-IP graph
Reduced prediction error using host roles
Host roles are local characteristics of the IP-IP graph structure, e.g. “center of star”, end node, …
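One simple way to picture behavior forecasting from host roles is a first-order Markov chain over role labels. The sketch below is a generic illustration and not the method of the cited paper; the role sequences are hypothetical, and only the role names "center of star" and "end node" come from the slide.

```python
from collections import Counter, defaultdict

def learn_transitions(role_sequences):
    """Estimate role-to-role transition probabilities from observed sequences."""
    counts = defaultdict(Counter)
    for sequence in role_sequences:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    return {
        role: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for role, followers in counts.items()
    }

def forecast(transitions, current_role):
    """Most likely next role, or None if the current role was never observed."""
    options = transitions.get(current_role)
    if not options:
        return None
    return max(options, key=options.get)

# Hypothetical per-host role histories, one entry per time step.
HISTORIES = [
    ["end node", "end node", "center of star", "center of star"],
    ["end node", "end node", "end node", "center of star"],
]
TRANSITIONS = learn_transitions(HISTORIES)
```

A host whose observed role differs sharply from the forecast distribution would then be a candidate for the anomaly detection step.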
Some R&D directions in machine learning
[Figure: training data shown as feature vectors with entries of 1 and -1, each paired with a label, forming the training set used for training]
Features have traditionally been hand engineered. Is there a principled approach to finding a good set of features?
Deep learning
We usually deal with N>>D. In emerging applications (e.g. genomics, ...) we can have N<<D. Can we regularize (constrain the solutions) with mechanistic models?
[Figure labels: N and D]
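To show what constraining the solutions means when N<<D, the sketch below fits a linear model with a plain L2 (ridge) penalty. It assumes N is the number of training examples and D the number of features, which the slide does not state. The ridge penalty is a generic regularizer standing in for the mechanistic-model constraints the slide asks about, and the data are random placeholders.

```python
import math
import random

def ridge_fit(X, y, lam, lr=0.01, steps=5000):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 by gradient descent from w = 0."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - yi
            for row, yi in zip(X, y)
        ]
        grad = [
            2.0 * sum(r * row[j] for r, row in zip(residuals, X)) + 2.0 * lam * w[j]
            for j in range(d)
        ]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def norm(w):
    return math.sqrt(sum(v * v for v in w))

# Placeholder data with far fewer examples than features (N << D).
rng = random.Random(0)
N, D = 3, 10
X = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(N)]
y = [rng.uniform(-1, 1) for _ in range(N)]

w_weak = ridge_fit(X, y, lam=0.01)    # weak constraint on the weights
w_strong = ridge_fit(X, y, lam=10.0)  # strong constraint shrinks the weights
```

With N<<D many weight vectors fit the training data equally well, so the penalty is what selects one of them; a mechanistic model would play the same selecting role with domain knowledge in place of a generic preference for small weights.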
Deep learning provides an unsupervised approach to learning feature sets from data
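As a small-scale illustration of learning features without labels, the sketch below trains a linear autoencoder with a single hidden layer. It is far smaller than the deep networks discussed in this talk, and the data, layer sizes, and learning rate are assumptions chosen only to keep the example short.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def reconstruction_loss(encoder, decoder, data):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = matvec(decoder, matvec(encoder, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

def train_autoencoder(data, hidden=2, lr=0.05, steps=2000, seed=0):
    """Full-batch gradient descent on the reconstruction error; no labels used."""
    rng = random.Random(seed)
    d, n = len(data[0]), len(data)
    encoder = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(hidden)]
    decoder = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(d)]
    for _ in range(steps):
        grad_enc = [[0.0] * d for _ in range(hidden)]
        grad_dec = [[0.0] * hidden for _ in range(d)]
        for x in data:
            h = matvec(encoder, x)                       # learned features
            r = [a - b for a, b in zip(matvec(decoder, h), x)]  # residual
            back = [sum(decoder[i][k] * r[i] for i in range(d)) for k in range(hidden)]
            for i in range(d):
                for k in range(hidden):
                    grad_dec[i][k] += 2.0 * r[i] * h[k] / n
            for k in range(hidden):
                for j in range(d):
                    grad_enc[k][j] += 2.0 * back[k] * x[j] / n
        for i in range(d):
            for k in range(hidden):
                decoder[i][k] -= lr * grad_dec[i][k]
        for k in range(hidden):
            for j in range(d):
                encoder[k][j] -= lr * grad_enc[k][j]
    return encoder, decoder

# Placeholder data: 4-D points that really vary along only two directions.
data_rng = random.Random(1)
DATA = []
for _ in range(20):
    a, b = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    DATA.append([a, b, a, -b])
ENCODER, DECODER = train_autoencoder(DATA)
```

After training, `matvec(ENCODER, x)` gives a two-value feature vector for any input `x`, and those learned features could feed a supervised model in place of hand-engineered ones.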
Deep machine learning research is extending pattern recognition and discovery beyond human capabilities
Learning patterns in 100M random images from Flickr
[Figure panels: Airplanes neuron; “Fireworks” neuron; Images with text neuron]
• Discovering complex patterns in massive multisource intelligence data sets guided by science-based models – not exact keywords
• Image recognition performance now surpasses human accuracy
• Partnership with Stanford and UC Berkeley on algorithms, NVIDIA on large GPU implementations, and IBM on neurosynaptic architectures
100B synapse deep learning networks
Data movement is the limiting factor for analytics – supplementing the memory hierarchy
Partnership with Intel and Cray to develop a 150 TF/s data analytics computer
Technical focus on NVRAM layers in the memory hierarchy supporting a 24-core node – prototyping analytics in the new environment
Initial applications will focus on
• Prototyping exascale simulation analysis architectures
• Bioinformatics algorithms
• Graph analytics
Over 5 GB DRAM and 36 GB NVRAM per core