Qiang Guan, Ziming Zhang and Song Fu, University of North Texas

Post on 23-Feb-2016

Efficient and Accurate Anomaly Identification Using Reduced Metric Space in Cloud Computing Systems

Qiang Guan, Ziming Zhang and Song Fu
University of North Texas

Introduction

Anomaly detection is a vital element of operations in large-scale datacenters.
◦ Detecting patterns in a given data set that do not conform to an established normal behavior.

Challenges

Continuous monitoring and large system scale lead to an overwhelming volume of data collected by health monitoring tools.

The large number of measured metrics makes the data model extremely complex.
◦ High metric dimensionality causes low detection accuracy and high computational complexity.

This paper

Presents a metric selection framework for online anomaly detection in utility clouds.
◦ Select the most essential metrics by applying metric selection and extraction methods.
◦ Identify anomalies using an incremental clustering approach.
◦ Implement a prototype and evaluate its performance.

Dimensionality Reduction

Transforms the collected health-related performance data to a new metric space with only the most important metrics preserved.

In this paper:
◦ Metric selection using mutual information.
◦ Metric extraction by metric space combination and separation.

Metric Selection

Select the best subset of the original metric set based on mutual information.
◦ The mutual information of two random variables is a quantity that measures the mutual dependence of the two random variables.
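For reference, the standard definition of mutual information for two discrete random variables X and Y is given here. The slides do not spell it out, so the symbols X, Y and p are notation assumed for this note:

$$I(X; Y) = \sum_{x} \sum_{y} p(x, y)\, \log \frac{p(x, y)}{p(x)\, p(y)}$$

I(X; Y) is zero exactly when X and Y are independent, and it grows as knowing one variable tells more about the other.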

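Metric Selection (Cont.)

$$\max_{S}\ \mathrm{relevance}(S, c), \qquad \mathrm{relevance} = \frac{1}{|S|} \sum_{m_i \in S} I(m_i; c)$$

$$\min_{S}\ \mathrm{redundancy}(S), \qquad \mathrm{redundancy} = \frac{1}{|S|^2} \sum_{m_i, m_j \in S} I(m_i; m_j)$$

$$\max_{S}\ \mathrm{dependency}(S, c), \qquad \mathrm{dependency} = \mathrm{relevance}(S) - \mathrm{redundancy}(S)$$

However, finding the optimal metric subset is NP-hard, so an incremental search is used instead.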

Incremental Search Method

Given S_{k-1}, select the k-th metric that maximizes dependency() from the remaining metrics in (M - S_{k-1}).

→ S_1 ⊂ S_2 ⊂ ... ⊂ S_n

$$\max_{m_i \in M - S_{k-1}} \left[ I(m_i; c) - \frac{1}{k-1} \sum_{m_j \in S_{k-1}} I(m_i; m_j) \right]$$
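The incremental search can be sketched as a greedy loop. This is a minimal illustration and not the authors' implementation: it assumes the metrics have already been discretized into integer bins, estimates mutual information from simple co-occurrence counts, and the function names `mutual_information` and `incremental_search` are hypothetical.

```python
import numpy as np

def mutual_information(x, y):
    """Estimate I(x; y) in nats for two discrete 1-D arrays (assumed pre-binned)."""
    _, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1)        # co-occurrence counts
    joint /= joint.sum()                       # joint distribution p(x, y)
    px = joint.sum(axis=1, keepdims=True)      # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)      # marginal p(y)
    nz = joint > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def incremental_search(metrics, labels, n_select):
    """Greedily build S_1 ⊂ S_2 ⊂ ... by maximizing relevance minus redundancy.

    metrics: (n_samples, n_metrics) array of discretized metric values.
    labels:  (n_samples,) array holding the class variable c.
    Returns the selected metric indices in the order they were added.
    """
    n_metrics = metrics.shape[1]
    relevance = [mutual_information(metrics[:, i], labels) for i in range(n_metrics)]
    selected, remaining = [], list(range(n_metrics))
    while remaining and len(selected) < n_select:
        best, best_score = None, -np.inf
        for i in remaining:
            # Mean mutual information with the already selected metrics.
            redundancy = (np.mean([mutual_information(metrics[:, i], metrics[:, j])
                                   for j in selected]) if selected else 0.0)
            score = relevance[i] - redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each step costs one pass over the remaining metrics, so the search avoids enumerating all subsets at the price of possibly missing the global optimum.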

Incremental Search Method (Cont.)

Choosing S_{n*}:
◦ Find the range of i where the cross-validation error err_i has a small mean and a small variance.
◦ err* = min(err_i)
◦ n* equals the smallest i for which S_i has err*.
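One way to read the selection of n* is sketched below. It is an assumption-laden illustration: the per-fold error layout, the optional variance cutoff `max_var`, and the name `choose_n_star` are all hypothetical, since the slides do not say how "small variance" is quantified.

```python
import numpy as np

def choose_n_star(cv_errors, max_var=None):
    """Pick n* from cross-validation errors.

    cv_errors[i - 1] holds the per-fold errors obtained with subset S_i.
    max_var is a hypothetical cutoff that drops subsets whose error
    variance is too large; None keeps every subset as a candidate.
    """
    errs = np.asarray(cv_errors, dtype=float)
    mean_err = errs.mean(axis=1)
    var_err = errs.var(axis=1)
    candidates = np.ones(len(errs), dtype=bool) if max_var is None else var_err <= max_var
    if not candidates.any():
        raise ValueError("no subset satisfies the variance cutoff")
    err_star = mean_err[candidates].min()                      # err* = min(err_i)
    hits = np.flatnonzero(candidates & np.isclose(mean_err, err_star))
    return int(hits[0]) + 1, float(err_star)                   # smallest i reaching err*
```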

Metric Extraction

Creates new metrics by transformation or combination of the original metrics.

Two methods:
◦ Metric space combination
◦ Metric space separation

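Metric Space Combination

Dataset D = [x_1, x_2, …, x_L]
Record x_i = [x_{1,i}, x_{2,i}, …, x_{n,i}]^T

Covariance matrix of D: V = DD^T

Calculate the eigenvalues {λ_i} of V and sort them in descending order.

Choose n' by a threshold on the cumulative eigenvalue ratio (the threshold is written τ here and lies in (0, 1)):

$$\frac{\sum_{i=1}^{n'} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \ge \tau, \qquad \tau \in (0, 1)$$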

Metric Space Combination (Cont.)

The corresponding n' eigenvectors are the new metrics.

Apply the Gram-Schmidt orthogonalization process to compute the eigenvectors {e_j}.
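The combination step can be sketched with NumPy as follows. This is an illustrative reading, not the authors' code: it assumes D is centered before V = DD^T is formed, uses `numpy.linalg.eigh` (which already returns orthonormal eigenvectors) in place of an explicit Gram-Schmidt pass, and the function name `combine_metric_space` and the default threshold of 0.95 are hypothetical.

```python
import numpy as np

def combine_metric_space(D, tau=0.95):
    """Project records onto the top eigenvectors of V = D D^T.

    D:   (n_metrics, L) array whose columns are the records x_i.
    tau: cumulative eigenvalue ratio to retain, assumed to lie in (0, 1).
    Returns (E, eigvals, n_prime), where the columns of E are the new metrics.
    """
    D = np.asarray(D, dtype=float)
    Dc = D - D.mean(axis=1, keepdims=True)      # center each metric (assumption)
    V = Dc @ Dc.T                               # covariance up to a constant factor
    eigvals, eigvecs = np.linalg.eigh(V)        # ascending order for symmetric V
    order = np.argsort(eigvals)[::-1]           # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # Smallest n' whose cumulative ratio reaches tau.
    n_prime = min(int(np.searchsorted(ratio, tau)) + 1, len(eigvals))
    return eigvecs[:, :n_prime], eigvals, n_prime
```

A record x is then mapped into the reduced space with `E.T @ (x - mean)`, where `mean` is the per-metric mean used for centering.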

Metric Space Separation

Separate desired data from mixed data.

Record x = [x_1, x_2, …, x_L]^T

Component e = [e_1, e_2, …, e_{n'}]^T

x = Ae → e = Wx

Find an optimal transformation matrix W so that {e_j} are maximally independent.

Metric Space Separation (Cont.)

Independent component analysis (ICA)
◦ A computational method for separating a multivariate signal into additive subcomponents.
◦ A special case of blind source separation.
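To make the model x = Ae → e = Wx concrete, here is a compact sketch of one common ICA algorithm, symmetric FastICA with a tanh nonlinearity. The slides do not say which ICA variant the authors used, so this choice is an assumption, and the function names `whiten`, `sym_decorrelate` and `fast_ica` are hypothetical.

```python
import numpy as np

def whiten(X):
    """Center X (n_signals, n_samples) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    vals, vecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    K = vecs @ np.diag(vals ** -0.5) @ vecs.T   # whitening matrix
    return K @ Xc, K

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W so that the rows of W are orthonormal."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T @ W

def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate an unmixing matrix W so that W @ (X - mean) has independent rows."""
    Z, K = whiten(X)
    n, T = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                      # nonlinearity g(y)
        G_prime = 1.0 - G ** 2                  # its derivative g'(y)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for all rows at once.
        W_new = sym_decorrelate(G @ Z.T / T - np.diag(G_prime.mean(axis=1)) @ W)
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ K                                # unmixing matrix for the centered data
```

The recovered components match the sources only up to order, sign and scale, which is inherent to ICA rather than a limitation of this sketch.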

Incremental Clustering

Data points are considered one at a time and assigned to existing groups without affecting the existing groups significantly.
◦ "A data point goes into the nearest group if the Euclidean distance between this point and the centroid of the group is smaller than δ; else create a new group."
◦ Update the centroid after a new point comes in.
◦ Adjust δ if cloud operators find false positives (normal points assigned to an anomaly group).
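The assignment rule quoted above can be sketched in a few lines. This is a minimal illustration under assumptions: the class name `IncrementalClusterer` is hypothetical, the centroid is kept as a running mean, and the operator-driven adjustment of δ is left as a plain attribute that can be changed between calls.

```python
import numpy as np

class IncrementalClusterer:
    """One-pass clustering: join the nearest group if it is closer than delta."""

    def __init__(self, delta):
        self.delta = float(delta)   # distance threshold δ, adjustable by operators
        self.centroids = []         # one centroid per group
        self.counts = []            # number of points absorbed by each group

    def add(self, point):
        """Assign one data point and return the index of its group."""
        p = np.asarray(point, dtype=float)
        if self.centroids:
            dists = [np.linalg.norm(p - c) for c in self.centroids]
            k = int(np.argmin(dists))
            if dists[k] < self.delta:
                # Update the centroid as the running mean of the group's points.
                self.counts[k] += 1
                self.centroids[k] += (p - self.centroids[k]) / self.counts[k]
                return k
        # No group is close enough: start a new one.
        self.centroids.append(p.copy())
        self.counts.append(1)
        return len(self.centroids) - 1
```

Because each point is processed once and only the centroids are stored, the memory cost depends on the number of groups rather than the number of monitored records.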

Experiment Setting

362 servers.
Each server hosts up to ten VMs.
Benchmarks:
◦ RUBiS distributed online service benchmark
◦ MapReduce jobs
Fault injection:
◦ CPU, memory, disk, and network faults.

Experiment Setting (Cont.)

Monitoring tools:
◦ sysstat: runtime performance data in Dom0
◦ Modified perf: performance counters from the hypervisor.

518 metrics in total.
◦ 182 + 336
◦ However, only 406 are non-constant.

Monitored every minute from 2011/01/20 to 2011/08/11.

Metric Selection Result

406 → 14
◦ Metric space reduced by 96.6%
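The reduction figure follows directly from the two metric counts:

$$\frac{406 - 14}{406} = \frac{392}{406} \approx 0.9655 \approx 96.6\%$$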

Metric Extraction Results

Metric extraction combined with metric selection vs. metric extraction only.

Detection Precision

Conclusion

Anomaly detection is important.
◦ Self-managing cloud resources and enhancing system dependability.

They present a metric selection framework with metric selection and extraction mechanisms.

The selected and extracted metric set contributes to highly efficient and accurate anomaly detection.
