Night Owl by Boyd Meier of PROS
Post on 10-May-2015
© COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
Night Owl
Log Monitoring using Elasticsearch and Hadoop
Boyd Meier (bmeier@pros.com)
Hadoop Meetup – October 16, 2013
Problem
Application Performance Monitoring
● Many servers
● Many applications
● Many log formats
● Many places to go look for information
● What if we could just look in one place and see everything?
Advanced Analysis
● The logs are too low-level
● The servers need the existing capacity
● The amount of data to be analyzed is huge
● Some analysis needs to be across multiple servers
● What if we want to change the analysis algorithms?
● How can we do analysis in the most flexible way possible?
Proactive Support
● See problems coming before they become crises
● Watch for errors and exceptions
● Track performance of the application
● Track usage of the application
● Enable checks we haven’t thought of yet
Some Analysis Questions
● What errors happen, and how often?
● Who did what, when?
● How long did it take to do a task?
● What else was happening on the server?
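The four questions above can be sketched as small functions over parsed log events. This is only an illustration: the event layout and field names (`ts`, `host`, `user`, `level`, `error`, `action`, `task`) are assumptions made for the example, not the format used by Night Owl.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, already-parsed log events. All field names are assumptions
# made for this sketch and do not come from the slides.
EVENTS = [
    {"ts": "2013-09-01T10:00:00", "host": "app01", "user": "alice",
     "level": "INFO", "action": "task_start", "task": "t1"},
    {"ts": "2013-09-01T10:00:05", "host": "app01", "user": "bob",
     "level": "ERROR", "error": "TimeoutException"},
    {"ts": "2013-09-01T10:00:12", "host": "app01", "user": "alice",
     "level": "INFO", "action": "task_end", "task": "t1"},
    {"ts": "2013-09-01T10:01:00", "host": "app02", "user": "bob",
     "level": "ERROR", "error": "TimeoutException"},
]

TS_FORMAT = "%Y-%m-%dT%H:%M:%S"


def count_errors(events):
    """What errors happen, and how often?"""
    return Counter(e["error"] for e in events if e["level"] == "ERROR")


def who_did_what(events):
    """Who did what, when? Returns (timestamp, user, action) tuples."""
    return [(e["ts"], e["user"], e["action"]) for e in events if "action" in e]


def task_duration_seconds(events, task):
    """How long did a task take, measured from its start to its end event?"""
    times = {e["action"]: datetime.strptime(e["ts"], TS_FORMAT)
             for e in events if e.get("task") == task}
    return (times["task_end"] - times["task_start"]).total_seconds()


def events_on_host(events, host, start, end):
    """What else was happening on the server in the window [start, end]?

    Timestamps share one ISO-style format, so string comparison is enough.
    """
    return [e for e in events if e["host"] == host and start <= e["ts"] <= end]


if __name__ == "__main__":
    print(count_errors(EVENTS))
    print(who_did_what(EVENTS))
    print(task_duration_seconds(EVENTS, "t1"))
    print(len(events_on_host(EVENTS, "app01",
                             "2013-09-01T10:00:00", "2013-09-01T10:00:12")))
```

On real data the same questions would run over far more events than fit in one process, which is why the deck goes on to look at distributed processing and a query engine.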
Constraints
● Very little budget – as much free stuff as possible
● Can’t use client machines
● Communications need to be secure
● Large amounts of data (GB/day/client)
● Minimize support’s dependence on client IT
Approach
Hadoop
● We have a lot of data (~2 GB/day with 3 clients)
● We need to process it in reasonable time
● We can’t afford a big machine for this
● We have lots of old machines lying around
● Sounds like a job for the elephant! But what about query?
Elasticsearch
● Query performance on base Hadoop is painful
● Ad-hoc queries are required
● Hadoop integration
● Cluster deployment
● Looks promising! How do we get the data into the server?
Logstash
● Handle many sources, not just logs
● Fan-in architecture to server
● Compressed, SSL encrypted data
● Can offload some logic on the client if desired
● Massively configurable
● Output to Elasticsearch
● Great! Now how about visualization?
Kibana
● Backed by Elasticsearch
● Supports dynamic queries
● View information over time
● Built-in support for Logstash
● Configurable, shareable dashboards
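To make "dynamic queries" and "view information over time" more concrete, here is a minimal sketch of a request body that asks Elasticsearch for event counts per time bucket. It assumes the facet-style search body of the Elasticsearch 0.90 series and a Logstash-style `@timestamp` field; the function name, facet name, and query text are hypothetical, and the body should be checked against the documentation for the version in use. It is not claimed to be the exact request Kibana sends.

```python
import json


def events_over_time_query(query_text, interval="hour",
                           timestamp_field="@timestamp"):
    """Build a search body that buckets matching events by time.

    Assumption: facet-style request body as used by Elasticsearch 0.90.x.
    Later Elasticsearch versions express the same idea with aggregations.
    """
    return {
        "size": 0,  # only the bucket counts are wanted, not the hits
        "query": {"query_string": {"query": query_text}},
        "facets": {
            "events_over_time": {  # hypothetical facet name
                "date_histogram": {
                    "field": timestamp_field,
                    "interval": interval,
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical ad-hoc query: hourly counts of events mentioning "Exception".
    print(json.dumps(events_over_time_query("Exception"), indent=2))
```

The printed JSON could be sent to a search endpoint with any HTTP client; changing `query_text` or `interval` is all it takes to ask a different question of the same data.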
High Level Software Architecture
[Architecture diagram; components shown: Kibana, Elasticsearch, Logstash Client, Logstash Server]
Hadoop Processing
● Pig scripts process the data
● Wonderdog from InfoChimps to integrate Pig and Elasticsearch
– There are issues:
• Cluster stability using Wonderdog
• Wonderdog Pig interface has not been updated in a while
• Currently evaluating elasticsearch-hadoop project from Elasticsearch.org
● Analysis results are stored in Elasticsearch for ease of access
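As an illustration of what "stored in Elasticsearch" can look like at the data level, the sketch below formats a few analysis results as a bulk-indexing payload: one action line followed by one document line per result, newline-delimited. This is not how Wonderdog or elasticsearch-hadoop work internally, and the index name, type name, and result fields are all hypothetical.

```python
import json


def to_bulk_payload(index, doc_type, results):
    """Format documents as an Elasticsearch bulk-API body.

    Each document becomes two lines: an "index" action line naming the
    target index and type, then the document itself. Every line, including
    the last, ends with a newline.
    """
    lines = []
    for doc in results:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "".join(line + "\n" for line in lines)


# Hypothetical output of an analysis job: error counts per client per day.
RESULTS = [
    {"client": "client-a", "day": "2013-09-01", "error": "TimeoutException", "count": 17},
    {"client": "client-b", "day": "2013-09-01", "error": "TimeoutException", "count": 3},
]

if __name__ == "__main__":
    print(to_bulk_payload("nightowl-analysis", "error_count", RESULTS), end="")
```

Keeping results in the same store as the raw events means the same dashboards and ad-hoc queries can be pointed at either.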
Demo
Configuration Details
Software
● Ubuntu 12.04.2 LTS (Precise)
● Cloudera CDH 4.3.1
– Hadoop 2.0.0
– HBase 0.94
– Hive 0.10
– Pig 0.11
● Elasticsearch 0.90.3
● Logstash 1.1.12
● Kibana 3 M3
Hardware Architecture
● 27-node cluster of commodity machines
● 42 TB of disk space
● Connected via 10 gigabit switch
● Each machine has:
– 8 GB RAM
– 2 TB SATA HDD
– Gigabit Ethernet
Performance
● Over the month of September:
– 188 million events ingested from 3 clients
– 57.5 GB storage used (1.92 GB / day)
● At that rate, 42 TB is enough space for:
– 142 billion events
– 60 years of data from these clients
– 1 year of data from 180 clients at the same volume per client
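The capacity figures above follow from simple proportional scaling of the September numbers. The sketch below reproduces that arithmetic under two assumptions that the slides do not state: 1 TB is taken as 1000 GB, and a year is taken as 12 months at the September rate. With those assumptions it gives roughly 1.92 GB/day, about 61 years, and about 137 billion events, so the slide's figures are consistent as rounded values but may rest on slightly different unit conventions.

```python
# Figures taken from the slide.
EVENTS_PER_MONTH = 188_000_000
GB_PER_MONTH = 57.5
DAYS_IN_SEPTEMBER = 30
CLIENTS = 3
CAPACITY_TB = 42

# Assumption (not stated in the slides): decimal terabytes.
GB_PER_TB = 1000


def capacity_estimate(gb_per_tb=GB_PER_TB):
    """Scale one month of observed ingest up to the full cluster capacity."""
    gb_per_day = GB_PER_MONTH / DAYS_IN_SEPTEMBER
    months = CAPACITY_TB * gb_per_tb / GB_PER_MONTH  # months until disks fill
    years = months / 12
    return {
        "gb_per_day": gb_per_day,
        "years_for_current_clients": years,
        "total_events": EVENTS_PER_MONTH * months,
        # Same total volume spread over one year instead of `years` years.
        "clients_for_one_year": years * CLIENTS,
    }


if __name__ == "__main__":
    for name, value in capacity_estimate().items():
        print(f"{name}: {value:,.2f}")
```

Passing `gb_per_tb=1024` shows how much the estimates move under binary units.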
Resources
● Elasticsearch - http://www.elasticsearch.org/overview/
• http://github.com/elasticsearch/elasticsearch
● Logstash - http://www.elasticsearch.org/overview/logstash/
• https://github.com/logstash/logstash
● Kibana - http://www.elasticsearch.org/overview/kibana/
• https://github.com/elasticsearch/kibana
● ES-Hadoop - http://www.elasticsearch.org/overview/hadoop/
• http://github.com/elasticsearch/elasticsearch-hadoop
● Cloudera - http://www.cloudera.com/