
Intelligent Database Systems Lab
National Yunlin University of Science and Technology

Nonlinear Adaptive Distance Metric Learning for Clustering

Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu

KDD, 2007

Reported by Wen-Chung Liao, 2010/01/19

Outline
• Motivation
• Objective
• Adaptive Distance Metric Learning: The Linear Case
• Adaptive Distance Metric Learning: The Nonlinear Case
• NAML
• Experiments
• Conclusions
• Comments

In distance metric learning, the goal is to achieve better compactness (reduced dimensionality) and better separability (larger inter-cluster distance) on the data, in comparison with standard distance metrics such as the Euclidean distance.

Motivation

• Traditionally, dimensionality reduction and clustering are applied in two separate steps.
─ If distance metric learning (via dimensionality reduction) and clustering are performed together, the cluster separability of the data can be better maximized in the dimensionality-reduced space.
• Many real-world applications involve data with nonlinear and complex patterns.
─ Kernel methods can capture such nonlinear structure.

Objectives
• Propose NAML (Nonlinear Adaptive Metric Learning) for simultaneous distance metric learning and clustering.
• NAML
─ first maps the data to a high-dimensional feature space through a kernel function;
─ next applies a linear projection to find a low-dimensional manifold;
─ and then performs clustering in the low-dimensional space (see the sketch below).
• The key idea of NAML is to integrate
─ kernel learning,
─ dimensionality reduction, and
─ clustering
in a joint framework, so that the separability of the data is maximized in the low-dimensional space.
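A minimal sketch of this three-stage pipeline, assuming a single fixed RBF kernel and a kernel-PCA-style projection standing in for the projection that NAML learns jointly; the function name and parameters are illustrative, not from the paper:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def naml_pipeline_sketch(X, k, m, gamma=1.0):
    """Illustrative pipeline: kernel map -> linear projection -> clustering.

    X: (n, d) data; k: number of clusters; m: projection dimension.
    The eigenprojection below is a fixed kernel-PCA-style stand-in for the
    projection Q that NAML optimizes together with the clustering.
    """
    G = rbf_kernel(X, gamma=gamma)           # implicit map to feature space
    evals, evecs = np.linalg.eigh(G)         # eigenvalues in ascending order
    top = np.clip(evals[-m:], 0.0, None)     # guard tiny negative eigenvalues
    Z = evecs[:, -m:] * np.sqrt(top)         # (n, m) low-dimensional embedding
    return KMeans(n_clusters=k, n_init=10).fit_predict(Z)
```

NAML's contribution is precisely to learn the kernel and the projection jointly with the clustering, rather than fixing them in advance as this sketch does.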

ADAPTIVE DISTANCE METRIC LEARNING: THE LINEAR CASE

• Data set: $\{x_1, \dots, x_n\} \subset \mathbb{R}^d$. Data matrix: $X = [x_1, \dots, x_n] \in \mathbb{R}^{d \times n}$.
• Mahalanobis distance measure: $d_A(x_i, x_j) = \sqrt{(x_i - x_j)^\top A \, (x_i - x_j)}$ with $A \succeq 0$.
• Linear transformation $W \in \mathbb{R}^{d \times m}$: factoring $A = W W^\top$, the Mahalanobis distance under $A$ equals the Euclidean distance after projecting the data onto $W$.
• Cluster indicator matrix $F \in \{0,1\}^{n \times k}$ ($F_{ij} = 1$ iff $x_i$ is assigned to cluster $j$); weighted cluster indicator matrix $L = F (F^\top F)^{-1/2}$, which satisfies $L^\top L = I_k$.
• Minimizing the k-means sum-of-squared errors under the learned metric is equivalent to a trace maximization (Min ≡ Max):
$\max_{W, L} \ \operatorname{trace}\Big( \big(W^\top (X X^\top + \lambda I) W\big)^{-1} W^\top X L L^\top X^\top W \Big)$,
where $\lambda > 0$ is a regularization parameter. (The distance and indicator constructions are sketched in code below.)
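A small sketch of these two constructions; the function names are illustrative:

```python
import numpy as np

def mahalanobis(xi, xj, W):
    """d_A(xi, xj) with A = W W^T: Euclidean distance after projecting onto W."""
    diff = W.T @ (xi - xj)
    return np.sqrt(diff @ diff)

def weighted_indicator(labels, k):
    """L = F (F^T F)^{-1/2}: since F^T F = diag(cluster sizes), this just
    scales column j of F by 1/sqrt(n_j), so that L^T L = I_k.
    Assumes labels are in {0, ..., k-1} and every cluster is non-empty."""
    n = len(labels)
    F = np.zeros((n, k))
    F[np.arange(n), labels] = 1.0
    return F / np.sqrt(F.sum(axis=0))
```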

ADAPTIVE DISTANCE METRIC LEARNING: THE NONLINEAR CASE

• Kernel method: a nonlinear mapping $\psi$ embeds the data into a Hilbert space $\mathcal{H}$ (the feature space).
• Symmetric kernel function $K$: $K(x_i, x_j) = \langle \psi(x_i), \psi(x_j) \rangle$ (inner product). Kernel Gram matrix $G \in \mathbb{R}^{n \times n}$: $G_{ij} = K(x_i, x_j)$.
• $\psi_K(X)$: the data matrix in the feature space. Expressing the projection as $W = \psi_K(X)\, Q$ with $Q \in \mathbb{R}^{n \times m}$, for a given kernel function $K$ the nonlinear adaptive metric learning problem can be formulated as
$\max_{Q, L} \ \operatorname{trace}\Big( \big(Q^\top (G G + \lambda G) Q\big)^{-1} Q^\top G L L^\top G Q \Big)$.
• To learn the kernel itself, $G$ is restricted to a convex combination of $p$ given kernel matrices: $G = \sum_{i=1}^{p} \theta_i G_i$ with $\theta_i \ge 0$, $\sum_{i=1}^{p} \theta_i = 1$ (see the sketch below).
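A sketch of the convex kernel combination, assuming RBF base kernels as in the experiments; the width values would be data-dependent and are illustrative here:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def combined_kernel(X, widths, theta):
    """G = sum_i theta_i G_i over p base RBF kernels, theta on the simplex.

    rbf_kernel computes exp(-gamma * ||x - y||^2), so gamma = 1 / (2 sigma^2)
    recovers the usual bandwidth parameterization.
    """
    theta = np.asarray(theta, dtype=float)
    assert np.all(theta >= 0) and np.isclose(theta.sum(), 1.0)
    kernels = [rbf_kernel(X, gamma=1.0 / (2.0 * s**2)) for s in widths]
    return sum(t * Gi for t, Gi in zip(theta, kernels))
```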

3.1 The computation of L for given Q and G
For fixed Q and G, the objective reduces to $\max_{L} \operatorname{trace}(L^\top M L)$ with $M = G Q \big(Q^\top (G G + \lambda G) Q\big)^{-1} Q^\top G$. Relaxing $L$ to any matrix with orthonormal columns, the optimum is spanned by the top $k$ eigenvectors of $M$; the continuous solution is then discretized into cluster assignments (a sketch follows below).
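A sketch of this update, discretizing the relaxed eigenvector solution with k-means on its rows; the function name is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def update_L(G, Q, lam, k):
    """Relaxed update of the weighted cluster indicator for fixed Q and G.

    Builds M = G Q (Q^T (G G + lam G) Q)^{-1} Q^T G, takes its top-k
    eigenvectors as the relaxed L, then rounds to hard cluster labels.
    """
    A = Q.T @ (G @ G + lam * G) @ Q
    GQ = G @ Q
    M = GQ @ np.linalg.solve(A, GQ.T)       # n x n, symmetric PSD
    _, evecs = np.linalg.eigh(M)            # ascending eigenvalue order
    L_relaxed = evecs[:, -k:]               # top-k eigenvectors
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(L_relaxed)
    return L_relaxed, labels
```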

3.2 The computation of Q for given L and G
For fixed L and G, the trace maximization over Q is equivalent (Max ≡) to a generalized eigenvalue problem: $G L L^\top G \, q = \mu \, (G G + \lambda G) \, q$. Q is formed from the top $m$ generalized eigenvectors (sketched below).
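A sketch of this update using SciPy's generalized symmetric eigensolver; the function name is illustrative:

```python
import numpy as np
from scipy.linalg import eigh

def update_Q(G, L, lam, m):
    """Update the projection Q for fixed L and G by solving
    (G L L^T G) q = mu (G G + lam G) q for the top-m eigenvectors.

    Assumes G G + lam G is positive definite (lam > 0 and G full rank);
    add a small ridge to the right-hand matrix otherwise.
    """
    S1 = G @ L @ L.T @ G            # between-cluster term in kernel form
    S2 = G @ G + lam * G            # regularized total-scatter term
    evals, evecs = eigh(S1, S2)     # ascending generalized eigenvalues
    return evecs[:, -m:]            # top-m generalized eigenvectors
```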

3.3 The computation of G for given Q and L
For fixed Q and L, learning the kernel weights $\theta$ over the simplex $\{\theta : \theta_i \ge 0, \sum_i \theta_i = 1\}$ is recast (Min ≡) as an equivalent convex minimization problem, which is solved with the MOSEK optimization package.
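The transcript preserves only the fact that this step is a convex problem solved with MOSEK; the exact formulation is not shown. As a rough, admittedly non-equivalent stand-in, the sketch below directly searches the trace objective over the simplex; naml_objective and update_theta are illustrative names, not the paper's method:

```python
import numpy as np

def naml_objective(theta, kernels, Q, L, lam):
    """Evaluate trace((Q^T (GG + lam G) Q)^{-1} Q^T G L L^T G Q)
    for G = sum_i theta_i G_i."""
    G = sum(t * Gi for t, Gi in zip(theta, kernels))
    A = Q.T @ (G @ G + lam * G) @ Q
    B = Q.T @ G @ L
    return np.trace(np.linalg.solve(A, B @ B.T))

def update_theta(kernels, Q, L, lam, n_samples=2000, rng=None):
    """Crude stand-in for the paper's convex MOSEK step: random search
    over Dirichlet samples of the simplex, keeping the best theta."""
    rng = np.random.default_rng(rng)
    p = len(kernels)
    best_theta, best_val = np.ones(p) / p, -np.inf
    for theta in rng.dirichlet(np.ones(p), size=n_samples):
        val = naml_objective(theta, kernels, Q, L, lam)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta
```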

NAML
• NAML alternates the three updates above (L: clustering, Q: projection, G: kernel weights) until convergence, giving an EM-style iterative algorithm (a sketch follows below).
• Time complexity: $O(p k^3 n^3)$.
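A sketch of the overall alternation, composing the illustrative helpers from the previous sections (weighted_indicator, update_L, update_Q, update_theta); the initialization choices here are assumptions, not taken from the paper:

```python
import numpy as np

def naml(kernels, k, m, lam, n_iter=20, seed=0):
    """EM-style alternation of the three NAML updates (illustrative)."""
    p = len(kernels)
    n = kernels[0].shape[0]
    theta = np.ones(p) / p                          # uniform kernel combination
    rng = np.random.default_rng(seed)
    Q = np.linalg.qr(rng.standard_normal((n, m)))[0]  # random orthonormal init
    labels = None
    for _ in range(n_iter):
        G = sum(t * Gi for t, Gi in zip(theta, kernels))
        _, labels = update_L(G, Q, lam, k)          # 3.1: clusters for fixed Q, G
        Lw = weighted_indicator(labels, k)          # discretized weighted indicator
        Q = update_Q(G, Lw, lam, m)                 # 3.2: projection for fixed L, G
        theta = update_theta(kernels, Q, Lw, lam)   # 3.3: kernel weights for fixed Q, L
    return labels, theta, Q
```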

Experiments

• K-means algorithm as the baseline for comparison.
• Three representative unsupervised distance metric learning algorithms: Principal Component Analysis (PCA), Locally Linear Embedding (LLE), and Laplacian Eigenmap (Leigs).

Performance Measures
• $c_i$: the obtained cluster indicator of the $i$-th point; $y_i$: its true class label; $C$: the set of cluster indicators; $Y$: the set of class labels.
• Clustering accuracy: $\mathrm{ACC} = \frac{1}{n} \sum_{i=1}^{n} \delta\big(y_i, \mathrm{map}(c_i)\big)$, where $\delta(a, b) = 1$ if $a = b$ and 0 otherwise, and $\mathrm{map}(\cdot)$ is the best one-to-one mapping from cluster indicators to class labels, computed with the Kuhn–Munkres algorithm (see the sketch below).
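A sketch of this accuracy measure, using SciPy's assignment solver in place of a hand-rolled Kuhn–Munkres implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """ACC = sum_i delta(y_i, map(c_i)) / n, where map(.) is the best
    cluster-to-class assignment found by the Kuhn-Munkres algorithm."""
    classes = np.unique(y_true)
    clusters = np.unique(y_pred)
    # contingency[i, j] = number of points in cluster i with class j
    contingency = np.zeros((len(clusters), len(classes)), dtype=int)
    for i, c in enumerate(clusters):
        for j, y in enumerate(classes):
            contingency[i, j] = np.sum((y_pred == c) & (y_true == y))
    rows, cols = linear_sum_assignment(contingency, maximize=True)
    return contingency[rows, cols].sum() / len(y_true)
```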

Experimental Results (10 RBF kernels for NAML)

Sensitivity Study: the effect of the input kernels and the regularization parameter λ
• NAML provides a way to learn from multiple input kernels and generate a metric with which an unsupervised learning algorithm, like K-means, is more likely to perform as well as with the best input kernel.
[Figure: K-means using 10 kernels for NAML; annotation: "the quality of the initial kernel is low".]

• A series of different λ values ranging from $10^{-8}$ to $10^{5}$ was tested.
• A λ value in the range $[10^{-4}, 10^{2}]$ is helpful in most cases.

Conclusion
• NAML: the joint kernel learning, metric learning, and clustering can be formulated as a trace maximization problem, which can be solved iteratively in an EM framework.
• NAML is effective in learning a good distance metric and improving the clustering performance.
• The multiple-kernel formulation can integrate multiple types of biological data, e.g., amino acid sequences, hydropathy profiles, and gene expression data.
• Future work: study how to combine a set of pre-specified Laplacian matrices to achieve better performance in spectral clustering.

Comments
• Advantage
─ A joint framework for kernel learning, distance metric learning, and clustering.
• Shortcomings
• Applications
─ Clustering
