SFScon16 - Susanne Greiner: "Machine Learning and Advanced Statistics for Performance...

Susanne Greiner

Machine Learning and Advanced Statistics for Performance Monitoring

SFScon South Tyrol Free Software Conference

© Würth Phoenix 2016 … more than software

Upload: south-tyrol-free-software-conference

Post on 16-Apr-2017


TRANSCRIPT

Page 2: SFScon16 - Susanne Greiner: "Machine Learning and Advanced Statistics for Performance Monitoring"

About Würth Phoenix

IT and Consulting Company of the Würth-Group

Headquarters in Italy, Europe-wide presence, more than 130 highly skilled employees

International experience in Business Software and IT Management

Core competencies in trading processes, wholesale distribution and logistics

Microsoft Gold Certified Partner, ITIL certified, OTRS Preferred Partner

Facts & Figures

More than 1,000 customers worldwide

Over 500,000 service checks with NetEye

25,000 monitored hosts

4 offices in 3 countries

HQ in Italy

We create the right balance between technology and services for our customers to support their IT operations and thereby deliver a better business result.

Page 3

PERFORMANCE MONITORING

COMMON PRACTICE

Page 4

Performance and User Experience

[Image: testmyspeed.com]

YOU want

• your applications to run smoothly

• highest speed possible

• no unexpected behavior or errors

Quality of Service (QoS): Network/Application Performance refers to measures of service quality of a network as seen by the customer.

Quality of Experience (QoE): User Experience (UX) is a person’s entire experience using a particular product, system or service.

Page 5

Performance and User Experience

Optimal Performance: what is the vision?

STATIC: where are we now?

DYNAMIC changes: where do we want to be? Did we get there?

“It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change…” Charles Darwin

YOUR SERVICE PROVIDER wants

• your applications to run as smoothly as needed

• lowest speed necessary to “keep you happy”

• no unexpected behavior or errors

• solve problems and prevent outages proactively

• increase employee productivity

• avoid the ‘blame game’ (bottleneck detection)

• determine SLA levels

• optimize user experience to ensure user satisfaction

Let us MONITOR PERFORMANCE

Page 6

Performance Monitoring

How to Monitor Performance?

Data Collection → Problem → Solution

The right decision at each step is not trivial!

Example: thresholds

How to characterize standard behavior?

The quality of a monitoring approach strongly depends on the choice of your threshold.

Historical data and domain knowledge are an advantage!

Hits, misses, and false alarms all depend on the threshold.
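To make the dependence on the threshold concrete, here is a minimal sketch that counts hits, misses and false alarms for a few candidate thresholds. The latency values, the incident labels and the rule "alarm if value > threshold" are hypothetical illustrations, not data or logic from the talk.

```python
from dataclasses import dataclass


@dataclass
class AlarmCounts:
    hits: int          # incident present and alarm raised
    misses: int        # incident present but no alarm
    false_alarms: int  # alarm raised although there was no incident


def evaluate_threshold(values, incidents, threshold):
    """Count hits, misses and false alarms for the rule 'alarm if value > threshold'."""
    hits = misses = false_alarms = 0
    for value, is_incident in zip(values, incidents):
        alarm = value > threshold
        if alarm and is_incident:
            hits += 1
        elif is_incident:
            misses += 1
        elif alarm:
            false_alarms += 1
    return AlarmCounts(hits, misses, false_alarms)


# Hypothetical response times in ms, with hand-labelled incidents (assumed data).
latencies = [120, 135, 128, 410, 140, 650, 132, 300, 125, 138]
incidents = [False, False, False, True, False, True, False, False, False, False]

for threshold in (130, 250, 500):
    print(threshold, evaluate_threshold(latencies, incidents, threshold))
```

On this toy data the low threshold catches both incidents but raises several false alarms, while the high threshold raises none but misses an incident. Labelled historical data is what makes this kind of comparison possible.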

Page 7

Metrics and KPIs

metric + threshold = Key Performance Indicator (KPI)

Many metrics - not all relevant

Few meaningful KPIs better than many inadequate KPIs

Selection process time consuming - experience & expertise can help
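The relation "metric + threshold = KPI" can be sketched as a small data structure. The class name `KPI`, its fields and the example thresholds are assumptions made for illustration; the talk does not prescribe a representation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class KPI:
    """A metric name paired with a threshold (hypothetical representation)."""
    name: str
    threshold: float
    higher_is_worse: bool = True  # e.g. latency: True, throughput: False

    def is_violated(self, value: float) -> bool:
        if self.higher_is_worse:
            return value > self.threshold
        return value < self.threshold


# Assumed example KPIs; the thresholds are illustrative only.
page_load = KPI("mean page load time (s)", threshold=2.0)
throughput = KPI("throughput (Mbit/s)", threshold=50.0, higher_is_worse=False)

load_samples = [1.2, 1.8, 2.9, 2.6]
print(page_load.name, "violated:", page_load.is_violated(mean(load_samples)))
print(throughput.name, "violated:", throughput.is_violated(80.0))
```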

Page 8

(Big) Data

Performance Monitoring

Virtual Machines: several counters every 2-5 seconds

Small Company Network: multiple requests each second

Application: task-dependent request frequency

Page 9

Common Practice & Related Problems

MONITORING, not STORING

Visualization

Number: too many

Alarms

Quality: misses and false alarms

Insights

Interpretation is still very manual → counteraction(s) take time

Page 10

ALTERNATIVES TO COMMON PRACTICE

Page 11

Alternatives: More (Advanced) Statistics

Density vs. Mean

Not all traffic changes affect the average.

Page 12

Alternatives: Anomaly Detection

Alarm quality improvement

Mathematical characterization of standard traffic
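One possible way to characterize standard traffic mathematically is a robust baseline built from the median and the median absolute deviation (MAD). This is a sketch under assumptions: the talk does not say which method was used, the history values are invented, and the constants 0.6745 and 3.5 are a common convention for the modified z-score rather than anything taken from the slides.

```python
from statistics import median


def fit_baseline(history):
    """Describe standard behavior by the median and the median absolute deviation."""
    center = median(history)
    mad = median(abs(x - center) for x in history)
    return center, mad


def is_anomaly(value, center, mad, cutoff=3.5):
    """Flag values whose modified z-score exceeds the cutoff (assumed convention)."""
    if mad == 0:
        return value != center
    modified_z = 0.6745 * abs(value - center) / mad
    return modified_z > cutoff


# Hypothetical history of a metric during normal operation.
history = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99]
center, mad = fit_baseline(history)

for observed in (104, 110, 92):
    print(observed, is_anomaly(observed, center, mad))
```

Unlike a single upper threshold, this check also flags the unusually low value 92, because it measures distance from standard behavior in both directions.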

Page 13

Alternatives: Machine Learning

Supervised learning: Is our data predictable? Multidimensional data analysis

Unsupervised learning: Cluster traffic. Multidimensional data analysis

Page 14

EXAMPLES

Page 15

Example I: Density vs. Mean

Same mean does NOT mean same distribution

Even same mean & same std does NOT mean same distribution

Distribution and changes to it over time can contain important information

Advanced stats are an optimal addition to common practice
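The statement that equal mean and equal standard deviation do not imply equal distributions can be checked with a small constructed example. The two samples below are invented for illustration (think of them as latencies in ms); they are not measurements from the talk.

```python
from collections import Counter
from statistics import mean, pstdev

# Two hypothetical samples constructed to share mean and standard deviation.
bimodal = [90, 110] * 5          # half the requests fast, half slow
skewed = [95] * 8 + [120] * 2    # most requests fast, a few much slower


def density(sample):
    """Relative frequency of each observed value (a crude empirical density)."""
    counts = Counter(sample)
    return {value: count / len(sample) for value, count in sorted(counts.items())}


for name, sample in (("bimodal", bimodal), ("skewed", skewed)):
    print(name, "mean:", mean(sample), "std:", pstdev(sample), "density:", density(sample))
```

Both samples report the same mean and standard deviation, yet the densities differ: in one case half of the requests are slow, in the other only a fifth are, but those are slower still.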

Page 16

Example I: Density vs. Mean

[Figure panels: Application Latency, Server Latency, Client Latency, Throughput]

Page 17

Example II: Anomaly vs. Threshold

Page 18

Example II: Anomaly vs. Threshold

Automatic detection of relevant changes

Page 19

Example III: Machine Learning

• is close to the max and mean of metric 1

• is close to the max and mean of metric 2

• is not a very probable request in a multidimensional view

• a 1D view is not the perfect option for certain types of networks or applications

Multidimensionality within data can only be respected by multidimensional methods.

Ignoring multidimensionality means more false alarms and misses.

[Figure axes: metric 1, metric 2]
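The point about one-dimensional views can be illustrated with the Mahalanobis distance, which takes the correlation between two metrics into account. This is a sketch with invented data and one possible multidimensional method; the talk does not state which technique produced its plot.

```python
from math import sqrt
from statistics import mean


def fit_2d(points):
    """Estimate the mean and the (population) covariance of two metrics."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 matrix inverse."""
    (mx, my), (sxx, sxy, syy) = center, cov
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def z_scores(point, center, cov):
    """Per-metric (1D) z-scores, which ignore the correlation between the metrics."""
    (mx, my), (sxx, _, syy) = center, cov
    return abs(point[0] - mx) / sqrt(sxx), abs(point[1] - my) / sqrt(syy)


# Hypothetical requests: metric 1 and metric 2 normally rise together.
normal_traffic = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1),
                  (6, 5.9), (7, 7.2), (8, 7.8), (9, 9.1), (10, 9.9)]
center, cov = fit_2d(normal_traffic)

typical = (9, 9.0)     # follows the usual relation between the metrics
suspicious = (9, 2.0)  # each value is ordinary on its own, the combination is not

for request in (typical, suspicious):
    print(request, z_scores(request, center, cov), mahalanobis_sq(request, center, cov))
```

For the suspicious request both 1D z-scores stay below 2, so per-metric thresholds would likely stay silent, while its multidimensional distance is orders of magnitude larger than that of the typical request.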

Page 20

Example III: Machine Learning

Supervised learning:

Is our data predictable?

Multidimensional data analysis

• Use maths to improve alarm quality

• Use maths to detect what is most probably responsible for the problem that is currently experienced
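A simple way to ask "is our data predictable?" is to fit a model on historical data and measure how well it predicts held-out observations. The sketch below uses ordinary least squares with one input; the load and latency numbers are invented, and the talk does not specify which supervised method was applied.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical history: requests per second vs. latency in ms.
train_load = [10, 20, 30, 40, 50, 60]
train_latency = [52, 61, 69, 82, 90, 101]
# Held-out observations that were not used for fitting.
test_load = [70, 80, 90]
test_latency = [110, 121, 129]

slope, intercept = fit_line(train_load, train_latency)
score = r_squared(test_load, test_latency, slope, intercept)
print("R^2 on held-out data:", round(score, 3))

# If the data is predictable, a large residual is a candidate for an alarm.
observed_latency = 200
residual = observed_latency - (slope * 50 + intercept)
print("residual at load 50:", round(residual, 1))
```

A high score on held-out data suggests the metric is predictable from the input, so alarms can be based on the deviation from the prediction rather than on a fixed threshold.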

Unsupervised learning:

Cluster traffic into dense and sparse activity

Multidimensional data analysis

• Know which part of the traffic to suspect first for causing a problem

• Know what percentage of your users is potentially experiencing a problem
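Splitting traffic into dense and sparse activity can be sketched with a neighbor count, similar in spirit to the core-point test of density-based clustering. The function name, the radius, the neighbor minimum and the request coordinates are all assumptions for illustration; the talk does not name the clustering algorithm it used.

```python
from math import dist


def split_dense_sparse(points, radius, min_neighbors):
    """Label a point dense if at least min_neighbors other points lie within radius."""
    dense, sparse = [], []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if i != j and dist(p, q) <= radius
        )
        (dense if neighbors >= min_neighbors else sparse).append(p)
    return dense, sparse


# Hypothetical requests described by two (already scaled) metrics.
requests = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0), (1.0, 1.2), (0.8, 0.9),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (3.0, 8.0), (9.0, 1.0),  # isolated requests
]

dense, sparse = split_dense_sparse(requests, radius=0.5, min_neighbors=2)
sparse_share = 100 * len(sparse) / len(requests)
print("sparse requests:", sparse)
print("share of traffic in sparse regions: %.1f%%" % sparse_share)
```

The sparse group is a natural place to look first when a problem occurs, and its share of the traffic gives a rough estimate of how many users might be affected.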

NetCla Challenge

Page 21

OPEN SOURCE TOOLS FOR PERFORMANCE MONITORING

Page 22

Available Open Source Tools

• Storage/database

• General data manipulation

• Machine learning

• Presentation/visualization

Result:

Page 23

Available Open Source Tools + X

The best tools do NOT solve problems without guidance

Page 24

Conclusion

Domain knowledge and experience help to combine data + advanced methods + open source tools to improve performance.

Notable reduction of operational monitoring costs

Proactive prevention of outages

YOU?

Page 25

THANK YOU FOR YOUR ATTENTION!

www.wuerth-phoenix.com

© Würth Phoenix S.r.l.

All rights reserved. The text, images and graphics as well as their arrangement on these slides are all subject to copyright and other intellectual property protection. These objects may not be copied for commercial use or distribution, nor may these objects be modified or reposted on any platform. Some slides also contain images that are subject to the copyright rights of their providers.

Continuously looking for talented programmers

www.neteye-blog.com