
Introduction On-line Monitoring Framework Evaluation Summary References

Online Monitoring of I/O

Eugen Betke and Julian Kunkel

Research Group, German Climate Computing Center

23-03-2017


Table of Contents

1 Introduction

2 On-line Monitoring Framework
   Components: FUSE File System; SIOX + SIOX On-line Monitoring Plug-in; Elasticsearch; Grafana
   Architecture

3 Evaluation
   Scalability
   Overhead

4 Summary


Introduction

Why monitoring?

Monitoring is important to find inefficient applications

What I/O levels to monitor?

node I/O: overview of total I/O traffic on a node; available in user space

file I/O: filtered I/O traffic for a specific file; available in user space

mmap I/O: I/O traffic done by virtual memory in the background; hidden in kernel space

How do monitoring tools get data?

Capturing statistics from proc files

Instrumentation code injection: static approach; injects newly compiled C code into a binary executable or dynamic library file; re-compilation necessary

Interception with LD_PRELOAD: dynamic approach; works with dynamic libraries only; statically linked functions cannot be manipulated
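The first approach, capturing statistics from proc files, can be illustrated with a minimal sketch. It assumes the Linux per-process file /proc/<pid>/io, whose lines have the form "key: value"; the function names and the sample values are made up for this example and are not taken from the talk.

```python
from pathlib import Path


def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def read_own_io_counters() -> dict[str, int]:
    """Read the I/O counters of the current process (Linux only)."""
    return parse_proc_io(Path("/proc/self/io").read_text())


# Sample content in the layout Linux uses for /proc/<pid>/io (values made up).
SAMPLE = """rchar: 4096
wchar: 1024
syscr: 8
syscw: 2
read_bytes: 0
write_bytes: 4096
cancelled_write_bytes: 0
"""

sample_counters = parse_proc_io(SAMPLE)
```

Polling such counters periodically and taking differences gives node-level or process-level traffic, but it cannot attribute the traffic to individual files.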


Virtual Memory

mmap()

mmap() is a system call that maps the contents of a file into memory.

void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off);

fildes: file descriptor
off: offset within the file at which the mapping starts
len: length of the data to be mapped, starting at the offset

Problem

Virtual memory runs in kernel space. Non-privileged applications have no access to it.

Figure: Virtual Memory [4]
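To make the problem concrete, the following sketch modifies a file through a memory mapping using Python's standard mmap module. The store into the mapping is an ordinary memory access, so no write() call is issued that a library-level interceptor could observe. The helper name and the temporary demo file are assumptions made for this illustration.

```python
import mmap
import os
import tempfile


def write_through_mapping(path: str, offset: int, data: bytes) -> None:
    """Modify a file by storing into a memory mapping instead of calling write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file at its current size.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mapped:
            mapped[offset:offset + len(data)] = data  # plain memory store
            mapped.flush()  # ask the kernel to write the dirty pages back


# Hypothetical demo file, created only for this example.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)

write_through_mapping(demo_path, 2, b"AB")
with open(demo_path, "rb") as f:
    demo_content = f.read()
os.remove(demo_path)
```

The resulting page write-back happens inside the kernel, which is why this traffic stays invisible to tools that only wrap user-space I/O functions.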


FUSE File System - Overview

File System in User Space

Software interface for Unix-like computer operating systems

Non-privileged file systems run without editing kernel code

File system code runs in user space

FUSE module provides only a "bridge" to the actual kernel interfaces

Figure: FUSE architecture. User space: Application; User-Level File System (linked against libfuse). Kernel space: Virtual File System; Virtual Memory (reached via mmap()); FUSE Kernel Module; Built-In File System (e.g. BTRFS, EXT4). Below the kernel: Storage. The arrows between the components are numbered 1 to 6 and 1'.


FUSE File System - IOFS

Auxiliary tool: mounts a directory to a mount point

Features: no cache; no mmap() operations; no root privileges required

Figure: FUSE architecture (same diagram as on the slide "FUSE File System - Overview")


SIOX - Scalable I/O for Extreme Performance [3]

Performance Analysis Framework

Open-source framework published under the LGPL (footnote 1)

Supports the POSIX, MPI, HDF5 and NetCDF4 layers

Modular design

Online Analysis

Analyse activities during program execution

Offline Analysis

Analyse activities after program termination

Footnote 1: https://github.com/JulianKunkel/siox.git

SIOX - On-line Monitoring Plug-in

Plug-in

Aggregates I/O traffic to statistics

Uses I/O categories

write: pwrite(), write(), ...
read: get(), read(), ...

Sends I/O statistics at specified time intervals

I/O Statistics

Typed for visualization

Types

metrics: y-axis
timestamp: x-axis
tags: filtering

I/O Statistics

Name                  Type              Value
write_duration        metric (basic)    time spent for writing
write_bytes           metric (basic)    bytes written
write_calls           metric (basic)    number of I/O operations
write_bytes_per_call  metric (derived)  write_bytes, write_calls
read_duration         metric (basic)    time spent for reading
read_bytes            metric (basic)    bytes read
read_calls            metric (basic)    number of I/O operations
read_bytes_per_call   metric (derived)  read_bytes, read_calls
filename              tag               I/O operations
access                tag               I/O operations
username              tag               SLURM_USER
hostname              tag               HOSTNAME
procid                tag               SLURM_PROCID
jobid                 tag               SLURM_JOBID
layer                 tag               user defined
timestamp             date              system clock
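How the basic and derived metrics in the table relate can be sketched as follows. This is not the SIOX plug-in's code: the IOEvent record, the function name and the sample values are hypothetical, and the sketch assumes that the derived metric is the quotient of bytes and calls within one reporting interval.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class IOEvent:
    """One intercepted I/O call (hypothetical record layout)."""
    category: str    # "write" or "read"
    nbytes: int      # bytes transferred by the call
    duration: float  # seconds spent in the call


def aggregate(events: list[IOEvent]) -> dict[str, float]:
    """Fold the events of one reporting interval into basic and derived metrics."""
    stats: dict[str, float] = defaultdict(float)
    for ev in events:
        stats[f"{ev.category}_duration"] += ev.duration
        stats[f"{ev.category}_bytes"] += ev.nbytes
        stats[f"{ev.category}_calls"] += 1
    for category in ("write", "read"):
        calls = stats[f"{category}_calls"]
        # Derived metric; set to 0 when no call of this category occurred.
        stats[f"{category}_bytes_per_call"] = (
            stats[f"{category}_bytes"] / calls if calls else 0.0
        )
    return dict(stats)


interval_stats = aggregate([
    IOEvent("write", 4096, 0.002),
    IOEvent("write", 8192, 0.004),
    IOEvent("read", 1024, 0.001),
])
```

Sending only such per-interval aggregates, rather than every single call, keeps the volume of monitoring data independent of the number of I/O operations.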


Elasticsearch

Scalable, real-time search and analytics engine

Apache 2 license

Indexing of all fields allows fast look-ups

Highly scalable

Runs on laptops as well as on large-scale supercomputers


Grafana

Rich graphing: interactive, editable graphs; multiple Y-axes, logarithmic scales and options

Mixed styling: mix lines, points and bars; mix stacked with isolated series

Template variables: variables are automatically filled with values from the DB

Repeating panels: automatically repeat rows or panels for each selected variable value

Annotations: show events from data sources in the graphs

Figure: Grafana [1]


On-line Monitoring Framework

High scalability

Almost real-time on-line monitoring

Non-intrusive framework

No changes to applications are required

Components

Interception of mmap I/O: FUSE
I/O statistics: SIOX + OnlineMonitoringPlugin
DB back-end: Elasticsearch
Visualization: Grafana


On-line Monitoring Architecture (1/4 to 4/4)

Figure: On-line monitoring architecture, shown in four steps. User space: SIOX (Application) + Online-Monitoring-Plugin (optional); SIOX (IOFS) + Online-Monitoring-Plugin (optional). Kernel space: Virtual File System; Virtual Memory; FUSE Kernel Module; Built-in File System. Below the kernel: Storage. Both SIOX instances send I/O statistics to Elasticsearch, and Grafana receives I/O statistics from Elasticsearch.

Labels highlighted in each step:

1/4: mmap(), file I/O
2/4: mmap(), mmap I/O (here still hidden)
3/4: mmap(), mmap I/O
4/4: mmap(), file I/O


Grafana Web-Interface (User Perspective)

Interactive web interface: zoom, time shift, filtering, ...

Elaborate filtering: based on templates; auto update of templates

Flaws: no auto range finder; template update functionality is not user friendly


On-line monitoring: live demo


Elasticsearch performance

Elasticsearch was deployed on an office PC

Test setup

Nodes: 10
Processes per node: 20
Metrics were generated on our HPC "Mistral" [2] with a Python script and sent in packages of 100 metrics

Result

100 x 7500 = 750,000 metrics per second

Package

{
  'metric1': '1',
  'metric2': '2',
  'metric3': '3',
  ...
  'metric100': '100'
}
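A package of this shape could be generated and serialized roughly as below. The talk does not show the authors' script, so this is only a sketch: the function names and the index name "io-stats" are made up, and it assumes the data is submitted through Elasticsearch's bulk API, which takes newline-delimited JSON with an action line before each document. The exact metadata fields in the action line depend on the Elasticsearch version.

```python
import json


def make_package(n_metrics: int = 100) -> dict[str, str]:
    """Build one package shaped like the one on the slide: metric1..metricN."""
    return {f"metric{i}": str(i) for i in range(1, n_metrics + 1)}


def to_bulk_body(packages: list[dict], index: str) -> str:
    """Serialize packages as newline-delimited JSON for a bulk request."""
    lines = []
    for package in packages:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(package))                       # document line
    return "\n".join(lines) + "\n"  # the bulk format ends with a newline


package = make_package()
# "io-stats" is a hypothetical index name chosen for this example.
bulk_body = to_bulk_body([package, package], index="io-stats")
```

Batching several packages into one request reduces the per-request overhead, which matters when 200 processes report at the same time.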


Overhead - Test Setup

Computer 1 (test system): IOR, IOZone, SIOX, IOFS; Intel Core i5-660, 4M cache, 3.33 GHz; 12 GB DDR3 RAM; 2 TB HDD (test); 500 GB HDD (system)

Computer 2 (DB and visualization): Elasticsearch, Grafana

Computer 1 sends I/O statistics to Computer 2 over a 1 GB/s network

Experiment configuration

Block sizes 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB

1 node and 1 process per node (in SLURM)

4 GiB test file

10 test runs for each block size

IOR for file I/O
IOZone for mmap I/O

Scenarios without monitoring and with monitoring (application, mount point, both)


Overhead [1/4] - Write

Figure: Relative write performance per scenario. Two panels, FILE I/O and MMAP I/O, each with one group per block size (1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB). x-axis: scenario (0 to 3); y-axis: relative write performance (ticks from 1.00 to 1.12 in both panels).

$$P_{\mathrm{rel}} = \frac{\mathrm{mean}(P_{\mathrm{no\_monitoring}})}{P_{\langle \mathrm{scenario} \rangle}}$$

Scenarios
0: no monitoring
1: monitoring of application
2: monitoring of mount point
3: both (1) and (2)

Exp. configuration
nodes/processes per node: 1/1
test file: 4 GiB
test runs: 10
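The relative performance defined on this slide can be computed as in the short sketch below. It assumes that P is a throughput-like measure, so that values above 1 indicate a slowdown caused by monitoring; the function name and the numbers are made up and are not measurements from the talk.

```python
from statistics import mean


def relative_performance(no_monitoring_runs: list[float], scenario_run: float) -> float:
    """P_rel = mean(P_no_monitoring) / P_scenario."""
    return mean(no_monitoring_runs) / scenario_run


# Made-up throughput values in MiB/s, not measurements from the talk.
baseline_runs = [100.0, 102.0, 98.0]
p_rel = relative_performance(baseline_runs, 96.0)

# Under the throughput assumption, the overhead in percent is (P_rel - 1) * 100.
overhead_percent = (p_rel - 1.0) * 100.0
```

Normalizing against the mean of the unmonitored runs puts all block sizes on a common scale, so scenario 0 scatters around 1.00 in every panel.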


Overhead [2/4] - Write (zoomed)

Figure: Relative write performance per scenario, zoomed view. Two panels, FILE I/O and MMAP I/O, each with one group per block size (1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB). x-axis: scenario (0 to 3); y-axis: relative write performance (ticks from 1 to 5 for FILE I/O, from 1.00 to 1.12 for MMAP I/O).

Definition of P_rel, scenarios and experiment configuration as on slide Overhead [1/4].


Overhead [3/4] - Read

Figure: Relative read performance per scenario. Two panels, FILE I/O and MMAP I/O, each with one group per block size (1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB). x-axis: scenario (0 to 3); y-axis: relative read performance (ticks from 0.98 to 1.02 in both panels).

Definition of P_rel, scenarios and experiment configuration as on slide Overhead [1/4].


Overhead [4/4] - Read (zoomed)

Figure: Relative read performance per scenario, zoomed view. Two panels, FILE I/O and MMAP I/O, each with one group per block size (1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB). x-axis: scenario (0 to 3); y-axis: relative read performance (ticks from 0.98 to 1.02 for FILE I/O, from 1.0 to 1.3 for MMAP I/O).

Definition of P_rel, scenarios and experiment configuration as on slide Overhead [1/4].


Summary

Non-intrusive On-line Monitoring Framework

Built on top of open source software: FUSE, SIOX, Elasticsearch, Grafana
Provides near real-time on-line monitoring
Collects I/O statistics from applications and mount points
Provides support for file I/O and mmap I/O

file I/O: detailed information about file accesses
mmap I/O: non-intrusive way of instrumenting virtual memory (novelty)

Scalability (office PC)

100 x 7500 metrics/second

Overhead (office PC)

Write: file I/O < 1%/12% (+outlier) and mmap I/O < 1%/6%
Read: file I/O < 1%/1% and mmap I/O < 1%/30% (+outlier)

Results for our HPC “Mistral” [2] are coming soon


References

[1] Grafana. https://grafana.com/. Accessed: 2017-03-22.

[2] HLRE-3 "Mistral". https://www.dkrz.de/Klimarechner/hpc. Accessed: 2017-03-22.

[3] SIOX. https://wr.informatik.uni-hamburg.de/research/projects/siox. Accessed: 2017-03-22.

[4] Virtual Memory. https://en.wikipedia.org/wiki/Virtual_memory. Accessed: 2017-03-22.
