Document Oriented Database Infrastructure for Monitoring HEP Data Systems Applications
Carlos Fernando Gamboa, RACF, BNL
HEPiX Brookhaven National Laboratory, NY, USA
October 2015
2
Overview
1. Brief ELK framework review
2. ELK test deployment to monitor storage related applications
3
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Logstash
data collection
formatting
Elasticsearch
data storage
Kibana
Visualization and data analysis
4
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Logstash
An event is shipped by the logstash-forwarder client, then collected and processed sequentially at the Logstash server, i.e.
Client:
- File
- Logstash-forwarder (lumberjack): compression, encryption
Server (Logstash):
- Input: Logstash-forwarder (lumberjack)
- Filter: Grok(), Date(), GeoIP()
- Output: elasticsearch
Visualization:
- Kibana
6
A horizontally scalable, document-oriented database:
- Built on Apache Lucene (Java).
- A mapping is comparable to a schema definition in SQL databases.
- If no mapping has been created, the server infers the field types from the field values of the document.
- The query language, called Query DSL, is based on JSON; queries can also be issued through the URL API, i.e.:
[user@racprodb07 ~]# curl -XGET 'http://localhost:9200/aws*/secure_bestman/_search?q=sec_target:"/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1"&pretty=true'
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 44,
    "successful" : 44,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 47.9029,
    "hits" : [ {
      "_index" : "aws-se-2015.09.23",
      "_type" : "secure_bestman",
      "_id" : "AU_6UcW9_b_e2-r1bS0Y",
      "_score" : 47.9029,
      "_source":{"message":"Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm /mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1","@version":"1","@timestamp":"2015-09-23T13:08:27.193Z","type":"secure_bestman","file":"/var/log/secure","host":"aws01.racf.bnl.gov","offset":"3116045","sec_timestamp":"Sep 23 08:57:54","sec_host":"aws01","sec_oper":"sudo","sec_sudo_user":"bestman","sec_path":"/tmp","sec_user":"usatlas3","sec_command":"/bin/rm","sec_target":"/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1","syslog_received_at":"2015-09-23T13:08:27.193Z","received_from":"aws01.racf.bnl.gov"}
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Elasticsearch
Index (database)
Document Type (table)
File
Field
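The URI search shown in the curl example can also be composed programmatically. The following is a minimal sketch in Python, assuming only the host, index pattern, document type, and field name that appear in that example. The helper names `build_uri_search` and `build_query_dsl` are hypothetical and are not part of any Elasticsearch client library. The second helper builds a Query DSL request body using a `query_string` query, which is the request-body counterpart of the `q=` URL parameter.

```python
import json
from urllib.parse import quote, urlencode

# Target path copied from the curl example on the slide.
TARGET = "/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1"


def build_uri_search(host, index, doc_type, field, value, pretty=True):
    """Compose a URI search URL (hypothetical helper, not a client API)."""
    # Wrap the value in double quotes so it is searched as one phrase.
    params = {"q": '{}:"{}"'.format(field, value)}
    if pretty:
        params["pretty"] = "true"
    # Keep '*' unescaped so index wildcards such as aws* survive.
    path = "/{}/{}/_search".format(quote(index, safe="*"), quote(doc_type))
    return "http://{}{}?{}".format(host, path, urlencode(params))


def build_query_dsl(field, value):
    """Build the JSON request body equivalent to the q= parameter."""
    return {"query": {"query_string": {"query": '{}:"{}"'.format(field, value)}}}


if __name__ == "__main__":
    print(build_uri_search("localhost:9200", "aws*", "secure_bestman",
                           "sec_target", TARGET))
    print(json.dumps(build_query_dsl("sec_target", TARGET), indent=2))
```

The printed URL percent-encodes the quotes and slashes of the query value, so it can be passed to an HTTP client without shell quoting concerns; the JSON body would be sent to the same `_search` endpoint instead of the `q=` parameter.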
8
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Kibana
An analytics and visualization platform designed to work with Elasticsearch.
The input field allows interactive queries to be issued.
Discover page:
DASHBOARD
Visualization 1
Visualization 2
Visualization 3
Visualization N
Index
Fields
Results
Input field
9
Dashboard
Visualization 1: pie charts
Visualization 2: histograms
Visualization: tile maps
Provides dynamic creation of individual visualizations:
- Based on individual searches (interactive or saved) or on another visualization.
- Pie charts, histograms, bar charts, and tile maps are available to create the visualization.
Dashboard: displays a group of stored visualizations. A search field and a time filter are enabled by default in the dashboard.
Visualization 3: bar chart
Search field, Time filter
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
Kibana
10
The Elasticsearch, Logstash, Kibana (ELK) Ecosystem
The event
[root@aws01 ~]# tail -1 /var/log/secure
Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; USER=usatlas3 ; COMMAND=/bin/rm /mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1
The Filter
filter {
  if [type] == "secure_bestman" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{SECURE}" }
      add_field => [ "syslog_received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}
The output (Kibana)
Visualized on Kibana (events aggregated)
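The `%{SECURE}` pattern referenced by the filter lives in the patterns directory and is not shown on the slide. As an illustration of what such a grok pattern does, the sketch below uses a Python regular expression to split the sample event into the `sec_*` fields that appear in the Elasticsearch query result earlier in the talk. The regular expression is an assumption written for this one sample line, not the author's pattern, and `parse_secure` is a hypothetical helper name.

```python
import re

# Sample event copied from the slide.
EVENT = ("Sep 23 08:57:54 aws01 sudo: bestman : TTY=unknown ; PWD=/tmp ; "
         "USER=usatlas3 ; COMMAND=/bin/rm "
         "/mnt/atlasproddisk/rucio/mc15_13TeV/4a/32/EVNT.05192704._003739.pool.root.1")

# Assumed stand-in for the custom SECURE grok pattern: one named group
# per sec_* field seen in the indexed document.
SECURE_RE = re.compile(
    r"^(?P<sec_timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<sec_host>\S+)\s+"
    r"(?P<sec_oper>\w+):\s+"
    r"(?P<sec_sudo_user>\S+)\s+:\s+"
    r"TTY=\S+\s+;\s+"
    r"PWD=(?P<sec_path>\S+)\s+;\s+"
    r"USER=(?P<sec_user>\S+)\s+;\s+"
    r"COMMAND=(?P<sec_command>\S+)\s+(?P<sec_target>.+)$"
)


def parse_secure(line):
    """Return a dict of sec_* fields, or None if the line does not match."""
    match = SECURE_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name, value in parse_secure(EVENT).items():
        print(name, "=", value)
```

Lines that do not match return `None`, which loosely mirrors how Logstash tags events that a grok filter fails to parse instead of dropping them silently.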
11
ELK test deployment to monitor storage related applications.
12
Monitoring selected storage services
Simple Storage Service
(S3)
Amazon Web Services
BNL ELK monitoring
AWS SE Bestman Bestman
Gridftp 2
Gridftp 1
SRM
BNL dCache SE
Consolidated into the BILLING logs
WAN
LAN
Application logfiles monitored using the Elasticsearch, Logstash and Kibana (ELK) framework.
No central collection of information
13
BNL ELK test server
Server 1
BNL Test ELK layout
3 AWS VMs and 3 Physical Servers Monitored
Logstash-forwarder
logfile
Server 2
Logstash-forwarder
logfile
Logstash filters
Logstash input (lumberjack)
Logstash output (elasticsearch)
Server N
Logstash-forwarder
logfile
KIBANA
WAN
LAN
14
BNL Test ELK layout
Test Dashboards
Intended to be used by the site admin. Nginx is used to serve/proxy access to the dashboards.
Link to interactive query dashboard
15
dCache Billing Monitoring Dashboard
Dashboard ported to Kibana 4.1, using previous work done for Kibana 3 as a reference [2]
Data collected using the published grok filter patterns [2]
Integrated tile maps, error charts, and statistics, among other improvements.
Read/Writes per Sunit
type
15 Top Pools
Event Dist. per Transfer Protocol
Top Errors per Transfer Protocol
Detail record
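Panels such as "15 Top Pools" correspond to bucket aggregations over the indexed billing documents. The following is a minimal sketch, in Python, of a terms aggregation request body of the kind such a panel could rely on, together with a local function that mimics the counting on a handful of records. The field name `pool_name` and the sample records are made up for illustration; the real billing field names are defined by the grok patterns in [2], which are not reproduced here.

```python
import json
from collections import Counter


def top_terms_request(field, size):
    """Request body for a terms aggregation; size 0 suppresses the hit list."""
    return {
        "size": 0,
        "aggs": {"top_terms": {"terms": {"field": field, "size": size}}},
    }


def top_terms_local(records, field, size):
    """Local stand-in for the aggregation: most frequent values of a field."""
    counts = Counter(r[field] for r in records if field in r)
    return counts.most_common(size)


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    sample = [
        {"pool_name": "pool_a"}, {"pool_name": "pool_a"}, {"pool_name": "pool_a"},
        {"pool_name": "pool_b"}, {"pool_name": "pool_b"},
        {"pool_name": "pool_c"},
    ]
    print(json.dumps(top_terms_request("pool_name", 15), indent=2))
    print(top_terms_local(sample, "pool_name", 2))
```

The request body would be posted to the `_search` endpoint of the relevant indices; the local function only shows what the returned buckets represent, namely the most frequent values of one field and their document counts.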
16
AWS SE Bestman Monitoring Dashboard
Visualization created using grok filter patterns [1]
Total size buckets
Gridftp transfers
SRM File Deletion
17
dCache Billing Dashboard performance with a 5-minute refresh period
Current stable configuration
No major client overhead on the monitored hosts.
Concentrating tuning effort on Elasticsearch and Kibana, working with different parameters, such as:
- Thread pool search memory
- Kibana timeouts
ELK Test server
18
dCache Billing dashboard aggregated report performance
Last 7 days
Last 30 days
Last 60 days
Last 90 days
dCache Billing document size is ~400M
Total size 320 GB
19
BNL Test ELK Software/Hardware
1 ELK node deployed
ELK Software:
- Logstash 1.5.4
- Elasticsearch 1.5.2-1 1.7
- Kibana 4.1.1
- Logstash-forwarder 0.40
OS: RHEL 6.6
Legacy hardware used:
- Head node: IBM x3650 M3 node, CPUs: 16 x 2.53GHz, 49GB Memory, 10Gbps Network interconnectivity
- External storage: IBM DS3500
ELK Node 1
Node 2
DS3500
DS3500 Expansion
DS3500 Expansion
DS3500 Expansion
12 SAS 15krpm 600 GB/disk
20
Sources/References
1. Peter Love, https://github.com/ptrlv/logstash
2. dCache Development Team, https://github.com/dCache/logstash4dcache
3. General reference information, https://www.elastic.co
4. Johan Guldmyr, rich presentation about ELK, https://indico.desy.de/contributionDisplay.py?contribId=4&confId=11773
5. Ilija Vukotic, example of Elasticsearch and Kibana with a different data collector infrastructure, https://docs.google.com/presentation/d/1oFWLLCP7XxUxrccEH45JYDORsFEQQxtyrsZ9fs677bM/edit#slide=id.p
21
Thank you
22
Backup slide
23
Logstash
Software functionality distributed as a modular, pluggable pipeline infrastructure
INPUT
- stdin(): testing, troubleshooting
- Logstash-forwarder(): compression, transmission
- Redis(), RabbitMQ(): large clusters, queuing
- file(), Syslog(), Rsyslog()
FILTER
- Grok(): extract data using pattern matching
- Date(): parse timestamps from fields and assign the parsed time to the processed event
- Mutate(): manipulate and modify event field data
- Geoip(): find the geolocation of an IP address using the MaxMind database
OUTPUT
- Storage: File, S3, MongoDB, Elasticsearch, …
- Relay: RabbitMQ, TCP
- Notifications: email, Nagios, …
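The input, filter, output arrangement can be pictured as a chain of small functions applied to each event in turn. The following is a minimal sketch in Python, not Logstash code: a list stands in for an input plugin, two functions stand in for the Date and Mutate filters, and a list's `append` stands in for an output plugin. All function names are hypothetical, and the year passed to the date step is an assumption, since syslog timestamps such as "Sep 23 08:57:54" do not carry one.

```python
from datetime import datetime


def date_filter(event, field="sec_timestamp", year=2015):
    """Parse a syslog-style timestamp field and store it as @timestamp."""
    # Prepend the assumed year so the parsed date is complete.
    parsed = datetime.strptime("{} {}".format(year, event[field]),
                               "%Y %b %d %H:%M:%S")
    event["@timestamp"] = parsed.isoformat()
    return event


def mutate_filter(event):
    """Modify a field in place, here by lower-casing the host name."""
    event["sec_host"] = event["sec_host"].lower()
    return event


def run_pipeline(events, filters, output):
    """Apply every filter to every event, in order, then hand it to output."""
    for event in events:
        for apply_filter in filters:
            event = apply_filter(event)
        output(event)


if __name__ == "__main__":
    collected = []  # stand-in for an output such as elasticsearch
    source = [{"sec_timestamp": "Sep 23 08:57:54", "sec_host": "AWS01"}]
    run_pipeline(source, [date_filter, mutate_filter], collected.append)
    print(collected)
```

Because each stage only receives and returns an event, stages can be added, removed, or reordered without touching the others, which is the property the pluggable pipeline design provides.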