logging for production systems in the container era
![Page 1: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/1.jpg)
Logging for Production Systems
in The Container Era
Sadayuki Furuhashi Founder & Software Architect
DOCKER MOUNTAIN VIEW
![Page 2: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/2.jpg)
A little about me…
Sadayuki Furuhashi
github: @frsyuki
A founder of Treasure Data, Inc. located in Silicon Valley.
An open-source hacker.
OSS projects I founded:
Fluentd - Unified log collection infrastructure
Embulk - Plugin-based ETL tool
![Page 3: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/3.jpg)
It's like JSON. but fast and small.
A little about me…
![Page 4: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/4.jpg)
The Container Era
| | Server Era | Container Era |
|---|---|---|
| Service Architecture | Monolithic | Microservices |
| System Image | Mutable | Immutable |
| Managed By | Ops Team | DevOps Team |
| Local Data | Persistent | Ephemeral |
| Log Collection | syslogd / rsync | ? |
| Metrics Collection | Nagios / Zabbix | ? |
![Page 5: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/5.jpg)
How should log & metrics collection be done in The Container Era?
![Page 6: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/6.jpg)
Problems
![Page 7: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/7.jpg)
The traditional logrotate + rsync on containers
[Diagram: the Application in each of Containers A, B, and C writes log files, which are copied to a Log Server]
Hard to analyze!! Complex text parsers
High latency!! Must wait for a day
Ephemeral!! Could be lost at any time
![Page 8: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/8.jpg)
Small & many containers make storages overloaded
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2), plus further containers, each connect directly to Kafka, Elasticsearch, and HDFS]
Too many connections from micro containers!
![Page 9: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/9.jpg)
System images are immutable
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2), plus further containers, each connect directly to Kafka, Elasticsearch, and HDFS]
Too many connections from micro containers!
Embedding destination IPs in ALL Docker images makes management hard
![Page 10: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/10.jpg)
Combination explosion with microservices requires too many scripts for data integration
[Diagram: LOG sources and destinations wired together by ad hoc glue: scripts to parse data, a cron job for loading, a filtering script, a syslog script, a Tweet-fetching script, aggregation scripts, and an rsync server]
![Page 11: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/11.jpg)
A solution: centralized log collection service
LOG
Log Service
![Page 12: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/12.jpg)
The centralized log collection service
LOG
![Page 13: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/13.jpg)
The centralized log collection service
LOG
We released! (Apache License)
![Page 14: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/14.jpg)
What’s Fluentd?
AN EXTENSIBLE & RELIABLE DATA COLLECTION TOOL
Simple core + Variety of plugins
Buffering, HA (failover), Secondary output, etc.
Like syslogd
![Page 15: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/15.jpg)
How to collect logs from Docker containers
![Page 16: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/16.jpg)
Text logging with --log-driver=fluentd
[Diagram: on a Server, the App in a Container writes to STDOUT / STDERR, which is forwarded to Fluentd]
docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224
{
  "container_id": "ad6d5d32576a",
  "container_name": "myapp",
  "source": "stdout"
}
![Page 17: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/17.jpg)
Metrics collection with fluent-logger
[Diagram: on a Server, the App in a Container sends events to Fluentd through the fluent-logger library]
from fluent import sender
from fluent import event

# Events are tagged with the prefix 'app.events' and sent to the local Fluentd.
sender.setup('app.events', host='localhost')
event.Event('purchase', {
  'user_id': 21,
  'item_id': 321,
  'value': 1
})

tag = app.events.purchase
{
  "user_id": 21,
  "item_id": 321,
  "value": 1
}
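The tag shown above is the setup prefix joined to the event label. A minimal sketch of that composition, without the fluent-logger dependency; `make_event` and the fixed timestamp are hypothetical names and values chosen for illustration, not part of the library:

```python
import time


def make_event(prefix, label, record, now=None):
    """Build a (tag, time, record) triple.

    Assumption for illustration: the tag is the setup prefix joined to the
    event label with a dot, as in the slide's app.events.purchase example.
    """
    tag = f"{prefix}.{label}" if label else prefix
    timestamp = int(now if now is not None else time.time())
    return tag, timestamp, dict(record)


tag, ts, record = make_event(
    "app.events",
    "purchase",
    {"user_id": 21, "item_id": 321, "value": 1},
    now=1454284800,  # fixed timestamp so the example is repeatable
)
print(tag, ts, record)
```

Because the tag carries the event type, the receiving side can route purchase events separately from other application events.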
![Page 18: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/18.jpg)
Logging methods for each purpose
• Collecting log messages
  > --log-driver=fluentd
• Application metrics
  > fluent-logger
• Access logs, logs from middleware
  > Shared data volume
• System metrics (CPU usage, Disk capacity, etc.)
  > Fluentd’s input plugins (Fluentd pulls those data periodically)
![Page 19: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/19.jpg)
Deployment Patterns
![Page 20: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/20.jpg)
Primitive deployment…
[Diagram: applications in Containers A and B (Server 1) and Containers C and D (Server 2), plus further containers, each connect directly to Kafka, Elasticsearch, and HDFS]
Too many connections from many containers!
Embedding destination IPs in ALL Docker images makes management hard
![Page 21: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/21.jpg)
Source aggregation decouples config from apps
[Diagram: a Fluentd instance on each of Server 1 and Server 2 receives logs from that server's containers and forwards them to Kafka, Elasticsearch, and HDFS]
destination is always localhost from app’s point of view
![Page 22: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/22.jpg)
Destination aggregation makes storages scalable for high traffic
[Diagram: the Fluentd instance on each of Server 1 and Server 2 forwards logs to aggregation server(s), set up as active / standby / load balancing]
![Page 23: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/23.jpg)
Aggregation servers
• Logging directly from microservices makes log storages overloaded.
  > Too many RX connections
  > Too frequent import API calls
• Aggregation servers make the logging infrastructure more reliable and scalable.
  > Connection aggregation
  > Buffering for less frequent import API calls
  > Data persistency during downtime
  > Automatic retry at recovery from downtime
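The effect of buffering on import API calls can be sketched in a few lines. This is an illustrative model only, not Fluentd's implementation; the `Aggregator` class, its `batch_size` parameter, and the record fields are hypothetical:

```python
class Aggregator:
    """Hypothetical aggregation node: buffers records, imports them in batches."""

    def __init__(self, import_batch, batch_size=3):
        self.import_batch = import_batch  # callable standing in for a storage import API
        self.batch_size = batch_size
        self.pending = []

    def receive(self, record):
        """Accept one record from any container; flush when the batch is full."""
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send everything buffered so far as a single import call."""
        if self.pending:
            self.import_batch(list(self.pending))
            self.pending.clear()


api_calls = []  # each element is one batch handed to the storage
agg = Aggregator(api_calls.append, batch_size=3)
for i in range(7):
    agg.receive({"container": f"c{i % 4}", "seq": i})
agg.flush()  # flush the remainder
print(len(api_calls), "import calls for 7 records")
```

Seven records from four containers reach the storage as three calls over one connection, instead of seven calls over four connections.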
![Page 24: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/24.jpg)
Fluentd Internal Architecture
![Page 25: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/25.jpg)
Internal Architecture (simplified)
Input → Filter → Buffer → Output (each stage is a plugin)
Time: 2012-02-04 01:33:51
Tag: myapp.buylog
Record: { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
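A rough sketch of how an event made of time, tag, and record might pass through the four stages. This models the idea only and is not Fluentd's code; `run_pipeline`, `drop_free_items`, and the second sample event are hypothetical:

```python
from collections import namedtuple

# An event is the (time, tag, record) triple shown on the slide.
Event = namedtuple("Event", ["time", "tag", "record"])


def run_pipeline(events, filters, chunk_size, output):
    """Pass events through filters, buffer them into chunks, hand chunks to output."""
    chunk = []
    for event in events:
        for f in filters:
            event = f(event)
            if event is None:  # a filter may drop the event
                break
        if event is None:
            continue
        chunk.append(event)
        if len(chunk) >= chunk_size:
            output(chunk)
            chunk = []
    if chunk:  # flush the last, partially filled chunk
        output(chunk)


def drop_free_items(event):
    """Hypothetical filter: keep only events with a positive price."""
    return event if event.record.get("price", 0) > 0 else None


events = [
    Event("2012-02-04 01:33:51", "myapp.buylog",
          {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}),
    Event("2012-02-04 01:33:52", "myapp.buylog",
          {"user": "you", "path": "/buyItem", "price": 0, "referer": "/landing"}),
]
written = []  # stands in for an output plugin's destination
run_pipeline(events, [drop_free_items], chunk_size=10, output=written.append)
print(written)
```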
![Page 26: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/26.jpg)
Architecture: Input Plugins
Receive logs
Or pull logs from data sources
In non-blocking manner
HTTP+JSON (in_http)
File tail (in_tail)
Syslog (in_syslog)
…
![Page 27: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/27.jpg)
Architecture: Filter Plugins
Transform logs
Filter out unnecessary logs
Enrich logs
Encrypt personal data
Convert IP to countries
Parse User-Agent
…
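Two of these filter roles, sketched as plain functions on a record. Both are simplified stand-ins: hashing with SHA-256 replaces real encryption here, and the User-Agent check is a naive substring test; `mask_field`, `parse_user_agent`, and the sample record are hypothetical:

```python
import hashlib


def mask_field(record, field):
    """Replace a personal field with its SHA-256 digest.

    Hashing is used here as a stand-in for encryption; it is one-way.
    """
    out = dict(record)
    if field in out:
        out[field] = hashlib.sha256(str(out[field]).encode("utf-8")).hexdigest()
    return out


def parse_user_agent(record):
    """Enrich the record with a coarse browser family (naive substring check)."""
    out = dict(record)
    ua = out.get("user_agent", "")
    if "Firefox" in ua:
        out["browser"] = "firefox"
    elif "Chrome" in ua:
        out["browser"] = "chrome"
    else:
        out["browser"] = "other"
    return out


record = {"user": "me", "user_agent": "Mozilla/5.0 Firefox/44.0", "path": "/buyItem"}
result = parse_user_agent(mask_field(record, "user"))
print(result)
```

Each function returns a new record, so filters can be chained without changing the original event.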
![Page 28: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/28.jpg)
Architecture: Buffer Plugins
Improve performance
Provide reliability
Provide thread-safety
Memory (buf_memory)
File (buf_file)
![Page 29: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/29.jpg)
Architecture: Output Plugins
Write or send event logs
File (out_file)
Amazon S3 (out_s3)
MongoDB (out_mongo)
…
![Page 30: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/30.jpg)
Architecture: Buffer Plugins
Improve performance
Provide reliability
Provide thread-safety
[Diagram: Input → Buffer holding several Chunks → Output]
![Page 31: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/31.jpg)
Divide & Conquer for retry
[Diagram labels: Stream, Batch, Error, Retry]
![Page 32: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/32.jpg)
Divide & Conquer for recovery
Buffer (on-disk or in-memory)
[Diagram labels: Error, Overloaded!!, recovery, recovery + flow control, queued chunks]
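One possible reading of "recovery + flow control" is a bounded queue of chunks that is drained at a limited rate once the destination is back. The sketch below models that reading only; `ChunkQueue`, `limit`, and `max_chunks` are hypothetical names and do not come from Fluentd:

```python
from collections import deque


class ChunkQueue:
    """Hypothetical bounded queue of buffered chunks."""

    def __init__(self, limit):
        self.limit = limit
        self.chunks = deque()

    def enqueue(self, chunk):
        """Queue a chunk; return False when full so the input side can slow down."""
        if len(self.chunks) >= self.limit:
            return False
        self.chunks.append(chunk)
        return True

    def drain(self, output, max_chunks):
        """Send at most max_chunks queued chunks so a recovering destination
        is not flooded. A chunk is removed only after output succeeds."""
        sent = 0
        while self.chunks and sent < max_chunks:
            output(self.chunks[0])  # raises on failure; the chunk stays queued
            self.chunks.popleft()
            sent += 1
        return sent


queue = ChunkQueue(limit=3)
accepted = [queue.enqueue([i]) for i in range(4)]  # the fourth chunk is refused
delivered = []
count = queue.drain(delivered.append, max_chunks=2)
print(accepted, count, len(queue.chunks))
```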
![Page 33: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/33.jpg)
Example Use Cases
![Page 34: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/34.jpg)
Streaming from Apache/Nginx to Elasticsearch
in_tail /var/log/access.log
buf_file /var/log/fluentd/buffer
![Page 35: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/35.jpg)
Error Handling and Recovery
in_tail /var/log/access.log
buf_file /var/log/fluentd/buffer
Buffering for any outputs
Retrying automatically
With exponential wait and persistence on a disk
and secondary output
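Retrying with exponential wait and falling back to a secondary output could look roughly like this. It is a sketch under assumptions, not Fluentd's retry logic; `flush_with_retry`, `max_retries`, and `base_wait` are hypothetical names, and no real sleeping happens unless a `sleep` callable is passed:

```python
def flush_with_retry(chunk, primary, secondary, max_retries=3, base_wait=1.0, sleep=None):
    """Try the primary output up to max_retries times.

    After each failure the wait doubles (base_wait * 2**attempt). When every
    attempt fails, the chunk is handed to the secondary output instead.
    Returns the output that accepted the chunk and the list of waits.
    """
    waits = []
    for attempt in range(max_retries):
        try:
            primary(chunk)
            return "primary", waits
        except Exception:
            wait = base_wait * (2 ** attempt)
            waits.append(wait)
            if sleep is not None:
                sleep(wait)
    secondary(chunk)
    return "secondary", waits


delivered, fallback = [], []
attempts = {"count": 0}


def flaky_primary(chunk):
    """Fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise IOError("destination unavailable")
    delivered.append(chunk)


def always_down(chunk):
    raise IOError("destination unavailable")


r1 = flush_with_retry(["a", "b"], flaky_primary, fallback.append)
r2 = flush_with_retry(["c"], always_down, fallback.append)
print(r1, r2)
```

Passing `sleep=time.sleep` would make the waits real; leaving it out keeps the example instantaneous.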
![Page 36: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/36.jpg)
Tailing & parsing files
Read a log file
Custom regexp
Custom parser in Ruby
Supported built-in formats:
• apache • apache_error • apache2 • nginx
• json • csv • tsv • ltsv
• syslog • multiline • none
[Diagram: your app writes events.log; Fluentd tails it and keeps its read position in a pos file]
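The role of the pos file can be sketched as follows: store the byte offset already read, and on the next pass read only what was appended after it. This is an illustrative model, not in_tail's implementation; `tail_once`, the `LINE` regexp, and the sample log format are hypothetical:

```python
import os
import re
import tempfile

# Hypothetical custom regexp for lines like "10.0.0.1 GET /index 200".
LINE = re.compile(r"^(?P<host>\S+) (?P<method>\S+) (?P<path>\S+) (?P<code>\d{3})$")


def tail_once(log_path, pos_path):
    """Parse lines appended since the offset saved in pos_path, then save the new offset."""
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            offset = int(f.read().strip() or 0)
    records = []
    with open(log_path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # incomplete line: leave it for the next pass
            match = LINE.match(raw.decode("utf-8").rstrip("\r\n"))
            if match:
                records.append(match.groupdict())
            offset += len(raw)
    with open(pos_path, "w") as f:
        f.write(str(offset))
    return records


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "events.log")
pos_path = os.path.join(workdir, "events.log.pos")

with open(log_path, "w") as f:
    f.write("10.0.0.1 GET /index 200\n")
first = tail_once(log_path, pos_path)

with open(log_path, "a") as f:
    f.write("10.0.0.2 POST /buyItem 302\n")
second = tail_once(log_path, pos_path)
print(first, second)
```

Because the offset survives on disk, a restarted collector resumes where it stopped instead of re-reading the whole file.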
![Page 37: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/37.jpg)
Out to Multiple Locations
Routing based on tags
Copy to multiple storages
[Diagram: access.log → in_tail → buffer → multiple storages]
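Tag-based routing with copying can be sketched as an ordered rule list where the first matching pattern wins and each rule may name several destinations. The matcher below is deliberately simplified (exact parts, `*` for one tag part, and a trailing `.**`), so it is an assumption-laden model rather than Fluentd's full pattern syntax; `tag_matches`, `route`, and the store names are hypothetical:

```python
def tag_matches(pattern, tag):
    """Simplified tag matching.

    Supports exact parts, '*' for exactly one part, '**' alone for any tag,
    and a trailing '.**' for the prefix itself or anything below it.
    """
    if pattern == "**":
        return True
    if pattern.endswith(".**"):
        prefix = pattern[:-3]
        return tag == prefix or tag.startswith(prefix + ".")
    p_parts, t_parts = pattern.split("."), tag.split(".")
    return len(p_parts) == len(t_parts) and all(
        p == "*" or p == t for p, t in zip(p_parts, t_parts)
    )


def route(tag, record, rules):
    """Deliver the record to every output of the first rule whose pattern matches."""
    for pattern, outputs in rules:
        if tag_matches(pattern, tag):
            for out in outputs:
                out.append(dict(record))  # copy to each storage
            return pattern
    return None


search_store, archive_store, metrics_store = [], [], []
rules = [
    ("access.**", [search_store, archive_store]),  # copy to two storages
    ("app.*", [metrics_store]),
]
matched = [
    route("access.nginx", {"path": "/index"}, rules),
    route("app.events", {"user_id": 21}, rules),
    route("other.tag", {"x": 1}, rules),
]
print(matched)
```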
![Page 38: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/38.jpg)
Example configuration for real time batch combo
![Page 39: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/39.jpg)
Data partitioning by time on HDFS / S3
Custom file formatter
Slice files based on time
2016-01-01/01/access.log.gz
2016-01-01/02/access.log.gz
2016-01-01/03/access.log.gz
…
[Diagram: access.log → in_tail → buffer → time-sliced files]
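Slicing by time amounts to deriving the object path from each event's timestamp. A minimal sketch that reproduces the path layout above; `partition_path`, `group_by_partition`, and the sample events are hypothetical:

```python
from collections import defaultdict
from datetime import datetime


def partition_path(event_time, name="access.log.gz"):
    """Return an hourly path such as 2016-01-01/01/access.log.gz."""
    return event_time.strftime("%Y-%m-%d/%H/") + name


def group_by_partition(events):
    """Group (time, record) pairs by the hourly path they belong to."""
    groups = defaultdict(list)
    for event_time, record in events:
        groups[partition_path(event_time)].append(record)
    return dict(groups)


events = [
    (datetime(2016, 1, 1, 1, 5), {"path": "/index"}),
    (datetime(2016, 1, 1, 1, 50), {"path": "/buyItem"}),
    (datetime(2016, 1, 1, 2, 10), {"path": "/landing"}),
]
groups = group_by_partition(events)
print(sorted(groups))
```

Queries over a time range then only need to read the directories for the hours involved.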
![Page 40: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/40.jpg)
3rd party input plugins
dstat
df
AMQP
munin
jvmwatcher
SQL
![Page 41: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/41.jpg)
3rd party output plugins
AMQP
Graphite
![Page 42: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/42.jpg)
Real World Use Cases
![Page 43: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/43.jpg)
Microsoft
Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources."
[Diagram: a Linux Computer running omsagent, omsconfig (DSC), PS DSC Providers, and an OMI Server (CIM Server), with Syslog, Operating System, Apache, MySQL, and Containers as data sources; it uploads data (HTTPS) and pulls configuration (HTTPS) through a firewall / proxy to the OMS Service]
![Page 44: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/44.jpg)
Atlassian
"At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline."
[Diagram labels: Ingestion service, Kinesis, Elasticsearch cluster]
![Page 45: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/45.jpg)
Amazon Web Services
The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook’s Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe.
Types of Data
• Transactional: Database reads & writes (OLTP), Cache
• Search: Logs, Streams
• File: Log files (/var/log), Log collectors & frameworks
• Stream: Log records, Sensors & IoT data
[Diagram: Collect (Web Apps, Mobile Apps, IoT, Applications, Logging) → Store (Database, Search, File Storage, Stream Storage)]
![Page 46: Logging for Production Systems in The Container Era](https://reader034.vdocuments.us/reader034/viewer/2022052406/58788dc41a28ab375f8b524d/html5/thumbnails/46.jpg)
Thank you!