Interactive Learning using Manifold Geometry
Eric Eaton, Gary Holness, and Daniel McFarlane
Lockheed Martin Advanced Technology Laboratories
Artificial Intelligence Research Group
This work was supported by internal funding from Lockheed Martin and the National Science Foundation under NSF ITR #0325329.
2Eric Eaton, Gary Holness, & Daniel McFarlane - Interactive Learning using Manifold Geometry
Introduction: Motivation
Information monitoring systems use a scoring function f to focus user attention
– f is customized to the current situation
– Often, no data are available to learn f
– Users require fine control over the scoring function
Example domains: Maritime Situational Awareness, Network Security Monitoring
We propose an interactive learning method that enables the user to iteratively refine f
Introduction: Interactive Refinement
Uses a combination of manual input and machine learning:
1. The user manually selects and repositions a data point
2. The system relearns the model f, and updates the scatterplot
Key idea: each adjustment should generalize naturally to the model
We use least squares with Laplacian regularization to learn ff, based on the manifold underlying the data
[Figure: User View and Model View; axes: 1D Projection of Data, Relevancy]
Related Work: Interactive Learning
Crayons tool for interactive object classification (Fails & Olsen, 2003)
Interactive decision tree construction (Ware et al., 2001)
Interactive visual clustering (desJardins et al., 2008)
Feature selection (Dy & Brodley, 2000)
Hierarchical clustering (Wills, 1998)
[Figure: Crayons by Fails & Olsen (figure used with permission)]
[Figure: Interactive Visual Clustering by desJardins et al. (figure used with permission); panels: Initial view, After 2 adjustments, After 14 adjustments]
Related Work: Interactive vs Active Learning
Active learning – selects instances for labeling by an oracle (Cohn et al., 1996; McCallum & Nigam, 1998; Tong, 2001)
                        Interactive ML                          Active Learning
Starts with…            Unlabeled data, incorrect model         Unlabeled data, no model
Selection of instances  User determines adjustments             System selects instances for labeling
Goal                    Collaborate with the user to define     Minimize number of labels needed
                        or adjust a model                       to learn a model
Mechanisms for User Interaction
Data set X = {x_1, …, x_n}, where each x_i is a data instance
The user supplies the initial scoring function
– We used a linear function for the initial scoring function
The current scoring function is given by f (initially the user-supplied function)
The user adjusts the score of individual data points to change f until it matches the true (hidden) function F
– Details of each instance are available in a side panel
– User selects and drags an instance up or down to change its score
Future work: similarity metric updates, qualitative feedback
[Figure: User View; axes: 1D Projection of Data, Relevancy]
Example side panel for one instance:
Score: 55
Id: dmaskes2
Event: ACL-Monitor
System: Julius-laptop
-------------------------------
Freq: 8 (1hr)
      8 (24hr)
-------------------------------
DETAILS:
UID: dmaskes2
Role: App_Update
Policy: finCloseLock
StartTime: 0 17 * * 5
EndTime: 0 8 * * 1
Res_type: triggerOverride
View_type: AcctClerk
DS_name: tbl_wklyTotals
Error: unauth_update
Value: (2 3 -2334 conf)
Approach: Learning the Scoring Function
Key Idea: each adjustment should generalize naturally to the model
– Adjustments should affect similar instances
– Generalizations should be based on the geometry underlying the data
Our approach:
– Construct the manifold underlying the data
– Learn/update f using the manifold’s basis
[Figure: example graph over the data, with vertices labeled v1 through v15]
Approach: Constructing the Manifold
Represent data set X as an undirected graph G = (V, A), with vertex v_i representing instance x_i
Adjacency matrix A is given by:
– Weighting each edge (v_i, v_j) by a radial basis function of the distance between x_i and x_j
– Connecting each instance to its k nearest neighbors
G is a discrete approximation of the continuous manifold
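The two construction rules above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `knn_rbf_adjacency`, the Gaussian form of the radial basis function, the bandwidth `sigma`, and the rule of keeping an edge when either endpoint selects the other are all assumptions, since the slide's formula for A did not survive in the transcript.

```python
import numpy as np

def knn_rbf_adjacency(X, k=2, sigma=1.0):
    """Symmetric k-nearest-neighbor adjacency matrix with RBF edge weights.

    Assumed form (not from the slides): a_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    when x_j is among the k nearest neighbors of x_i (or vice versa), else 0.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all instances.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Mask the diagonal so that a point is never its own neighbor.
    masked = sq_dist.copy()
    np.fill_diagonal(masked, np.inf)
    A = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(masked[i])[:k]
        A[i, neighbors] = weights[i, neighbors]
    # Keep an edge if either endpoint chose the other, so G is undirected.
    return np.maximum(A, A.T)

# Hypothetical 1D data: three nearby points and one distant point.
X_demo = np.array([[0.0], [1.0], [2.5], [10.0]])
A_demo = knn_rbf_adjacency(X_demo, k=1, sigma=1.0)
print(np.round(A_demo, 3))
```

With k = 1 the demo produces a chain of edges (v1, v2), (v2, v3), (v3, v4); the last edge has a weight close to zero because the fourth point is far away, which is how the RBF weighting limits how much influence passes between distant instances.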
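[Figure: graph in which a few vertices carry scores (0.4, 0.9, 0.8) from the initial scoring function and the remaining vertices are marked "?"; diagram of Q, Q^T, and the diagonal eigenvalue matrix Λ = diag(λ_1, λ_2, λ_3, …, λ_n)]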
Approach: Learning the Function on the Manifold
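Form the graph Laplacian of G (Chung, 1994)
Take the eigendecomposition of the graph Laplacian, with eigenvectors Q and eigenvalues Λ = diag(λ_1, …, λ_n)
Q = [q_1 … q_n] forms a complete orthonormal basis for G
[Figure: eigenvectors q_1, q_2, q_5, q_10, q_20, and q_50 visualized on a mesh. Meshes provided by Gabriel Peyré]
The first eigenvector is constant, with λ_1 = 0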
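A small numerical check of these properties, as a sketch only. It assumes the unnormalized Laplacian L = D - A (the transcript does not say which form of the Laplacian the slides use), and the helper name `laplacian_eigenbasis` and the 4-vertex path graph are illustrative choices.

```python
import numpy as np

def laplacian_eigenbasis(A):
    """Eigenvalues (ascending) and orthonormal eigenvectors of L = D - A.

    Assumption: the unnormalized Laplacian is used here; the slides may
    use a different normalization.
    """
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # eigh is intended for symmetric matrices and returns ascending eigenvalues.
    eigvals, Q = np.linalg.eigh(L)
    return eigvals, Q

# Hypothetical example: an unweighted path graph on 4 vertices.
A_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
eigvals, Q = laplacian_eigenbasis(A_path)

print(np.round(eigvals, 6))   # the smallest eigenvalue is 0 up to round-off
print(np.round(Q[:, 0], 6))   # the first eigenvector has equal entries
```

For a connected graph the zero eigenvalue is simple, so the first column of Q is constant up to sign; later columns vary more and more rapidly across the graph, which is why penalizing them by their eigenvalues favors smooth scoring functions.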
Approach: Learning the Function on the Manifold
The scoring function f : V → [0,1] is given by f = QW
Fit W by least squares with Laplacian regularization:
– This is a special case of Belkin et al.’s (2006) Manifold Regularization
– Eigenvalues Λ increasingly regularize the higher-order components
A slider bar controls the weight of adjustments
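The fitting equation itself is missing from the transcript, so the following is a hedged sketch of one standard way to do it rather than the slide's formula. It assumes the objective sum_i w_i (y_i - (QW)_i)^2 + reg * W^T Λ W, where y holds the target scores, w_i is a per-instance weight (larger for user-adjusted instances, playing the role of the slider weight), and `reg` is a hypothetical regularization strength. Because f = QW and Q diagonalizes the Laplacian, W^T Λ W equals f^T L f, the usual Laplacian smoothness penalty.

```python
import numpy as np

def fit_scores(Q, eigvals, targets, weights, reg=1.0):
    """Weighted least squares for W with a Laplacian (eigenvalue) penalty.

    Solves the normal equations (Q^T Omega Q + reg * Lambda) W = Q^T Omega y,
    with Omega = diag(weights). All parameter names are assumptions.
    """
    Omega = np.diag(weights)
    lhs = Q.T @ Omega @ Q + reg * np.diag(eigvals)
    rhs = Q.T @ Omega @ targets
    W = np.linalg.solve(lhs, rhs)
    return Q @ W, W

# Hypothetical setup: eigenbasis of the unnormalized Laplacian of a 4-vertex path.
A_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
L_path = np.diag(A_path.sum(axis=1)) - A_path
eigvals, Q = np.linalg.eigh(L_path)

targets = np.array([0.1, 0.2, 0.9, 0.4])
uniform = np.ones(4)

f_exact, _ = fit_scores(Q, eigvals, targets, uniform, reg=0.0)    # no smoothing
f_smooth, _ = fit_scores(Q, eigvals, targets, uniform, reg=1e8)   # heavy smoothing

# Up-weighting vertex 2 (as a user adjustment would) pulls the fit toward its target.
emphasis = np.array([1.0, 1.0, 10.0, 1.0])
f_adjusted, _ = fit_scores(Q, eigvals, targets, emphasis, reg=1.0)
print(np.round(f_exact, 3), np.round(f_smooth, 3), np.round(f_adjusted, 3))
```

With no penalty the fit reproduces the targets exactly, since Q is a complete basis; with a very large penalty only the constant eigenvector (eigenvalue 0) survives and every score collapses to the mean target. Intermediate settings trade fidelity to each adjustment against smoothness over the graph.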
Complete Algorithm for Interactive Refinement
Given: the data X, the user’s initial scoring function
Set f to the initial scoring function
Construct the manifold underlying X, represented by G = (V,A)
Compute the graph Laplacian of G
Compute the eigenvectors Q and eigenvalues Λ of the graph Laplacian
Repeat
– Display the scatterplot of X using the scores given by f
– (Optional) The user adjusts the score of data instance x_i
– (Optional) The user updates the adjustment weight ω via a slider bar
– If there were changes, update the scoring function as f = QW, where W is refit by least squares with Laplacian regularization
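The loop above can be sketched end to end as follows. This is an illustration under several assumptions that are not in the slides: a Gaussian RBF with bandwidth `sigma`, the unnormalized Laplacian D - A, unit weight on unadjusted instances and weight `omega` on adjusted ones, clipping of scores to [0, 1], and a scripted list of `(index, new_score)` pairs standing in for the user. The `history` list stands in for redrawing the scatterplot.

```python
import numpy as np

def build_graph(X, k, sigma):
    """k-nearest-neighbor graph with (assumed) Gaussian RBF edge weights."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(sq_dist, np.inf)        # a point is not its own neighbor
    A = np.zeros_like(weights)
    for i in range(len(X)):
        neighbors = np.argsort(sq_dist[i])[:k]
        A[i, neighbors] = weights[i, neighbors]
    return np.maximum(A, A.T)                # symmetrize: undirected graph

def refine(X, initial_scores, adjustments, omega=10.0, k=2, sigma=1.0):
    """Apply scripted user adjustments one at a time and refit f = QW."""
    A = build_graph(X, k, sigma)
    L = np.diag(A.sum(axis=1)) - A           # assumed: unnormalized Laplacian
    eigvals, Q = np.linalg.eigh(L)
    eigvals = np.clip(eigvals, 0.0, None)    # remove tiny negative round-off
    targets = np.array(initial_scores, dtype=float)
    weights = np.ones(len(X))
    f = targets.copy()
    history = [f.copy()]                     # stand-in for the scatterplot
    for index, new_score in adjustments:
        targets[index] = new_score           # the user drags instance `index`
        weights[index] = omega               # adjusted instances count more
        lhs = Q.T @ (weights[:, None] * Q) + np.diag(eigvals)
        W = np.linalg.solve(lhs, Q.T @ (weights * targets))
        f = np.clip(Q @ W, 0.0, 1.0)         # keep scores inside [0, 1]
        history.append(f.copy())
    return f, history

# Hypothetical data: two well-separated clusters on a line, all scored 0.5.
X_demo = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
f_demo, history = refine(X_demo, np.full(6, 0.5), adjustments=[(0, 1.0)])
print(np.round(f_demo, 3))
```

In the demo, raising the score of the first instance also raises the scores of its two neighbors by a smaller amount, while the second cluster (which shares no edges with the first) stays at 0.5. That is the intended behavior of the key idea: an adjustment generalizes to similar instances along the graph and nowhere else.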
Scaling to Large Volumes of Data
A can be stored efficiently as a symmetric banded matrix
The graph Laplacian is also a symmetric banded matrix
– Use sparse eigensolvers (e.g., Lanczos methods) for efficiency
Nyström method (Baker, 1977) extends the eigenvectors to new vertices for inductive learning
– Learn on a sample of the data, using the Laplacian of the sample's graph
– Extend the eigenvectors to new instances
– The score for a new instance x (represented by vertex v) is then computed from its extended eigenvector values
Evaluation
Simulate user by adjusting the current “most incorrect” instance to the correct score
– Users are adept at identifying outliers, motivating our approach
– The initial scoring function is a linear model fit to X using ridge regression
Compared against interactive learning using:
– SMO support vector regression with an RBF kernel
– Least squares regularized with a ridge parameter of 10E-8
Data Sets
Name                     #Inst  #Dim  Source
CPU                      209    6     UCI repository
Heart Disease            303    13    UCI repository
Pharynx                  195    10    Kalbfleisch & Prentice (1980)
Pyrimidines              74     27    King et al. (1992)
Sleep                    62     7     StatLib archive
Wisconsin Breast Cancer  194    32    UCI repository
Evaluation: Adjusting the “most incorrect” instance
Evaluation: Adjusting a random instance (100 trials)
Related Work: Manifold Learning
Belkin et al.’s (2006) Manifold Regularization
– We use a special case regularizing only the solution’s smoothness
Semi-supervised learning using Gaussian random fields (Zhu et al., 2003; Cai et al., 2007)
Zhou et al.’s (2004; 2005) “Distribution Regularization”
– Uses a regularized form of the graph Laplacian as the basis
– Learns a function
Spectral Graph Transduction (Joachims, 2003)
Conclusion and Future Work
We presented a method for interactive learning based on least squares with Laplacian regularization
Manifold-based interactive learning continuously improves with each correction
In practice, the technique shows an interactive response time for hundreds of data instances
Future Work:
– User adjustment of the similarity metric between data instances
– Incorporate passive observation of the user
– Handling drifting or recurring concepts
Thank You!
Questions?
Eric Eaton
References
Baker, C. T. H. 1977. The Numerical Treatment of Integral Equations. Oxford: Clarendon Press.
Belkin, M.; Niyogi, P.; and Sindhwani, V. 2006. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research 7:2399-2434.
Cai, D.; He, X.; and Han, J. 2007. Spectral regression: A unified subspace learning framework for content-based image retrieval. In Proceedings of the 15th International Conference on Multimedia, 403-412. ACM Press.
Chung, F. R. K. 1994. Spectral Graph Theory. Number 92 in CBMS Regional Conference Series in Mathematics. Providence, RI: American Mathematical Society.
Cohn, D. A.; Ghahramani, Z.; and Jordan, M. I. 1996. Active learning with statistical models. Journal of Artificial Intelligence Research 4:129-145.
desJardins, M.; MacGlashan, J.; and Ferraioli, J. 2008. Interactive visual clustering for relational data. In Constrained Clustering: Advances in Algorithms, Theory, and Applications. Chapman & Hall. 329-356.
Dy, J. G., and Brodley, C. E. 2000. Visualization and interactive feature selection for unsupervised data. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 360-364. ACM Press.
Fails, J. A., and Olsen, Jr., D. R. 2003. Interactive machine learning. In Proceedings of the Eighth International Conference on Intelligent User Interfaces, 39-45. Miami, FL: ACM Press.
Joachims, T. 2003. Transductive learning via spectral graph partitioning. In Proceedings of the International Conference on Machine Learning, 290-297.
McCallum, A., and Nigam, K. 1998. Employing EM in pool-based active learning for text classification. In Proceedings of the Fifteenth International Conference on Machine Learning, 359-367. San Francisco, CA: Morgan Kaufmann.
Tong, S. 2001. Active Learning: Theory and Applications. Ph.D. Dissertation, Stanford University.
Ware, M.; Frank, E.; Holmes, G.; Hall, M.; and Witten, I. H. 2001. Interactive machine learning: Letting users build classifiers. International Journal of Human Computer Studies 55(3):281-292.
Wills, G. J. 1998. An interactive view for hierarchical clustering. In Proceedings of the 1998 IEEE Symposium on Information Visualization (INFOVIS), Washington, DC, USA: IEEE Computer Society.
Zhou, D.; Huang, J.; and Schölkopf, B. 2005. Learning from labeled and unlabeled data on a directed graph. In Proceedings of the International Conference on Machine Learning, 1036-1043. Bonn, Germany: ACM Press.