fluentd - set up once, collect more

Sadayuki Furuhashi Founder & Software Architect Set Up Once, Collect More. Treasure Data, inc.

Upload: sadayuki-furuhashi

Post on 08-May-2015


DESCRIPTION

Fluentd meetup @ Rackspace San Francisco 2014-02-19

TRANSCRIPT

Page 1: Fluentd - Set Up Once, Collect More

Sadayuki Furuhashi

Founder & Software Architect

Set Up Once, Collect More.

Treasure Data, inc.

Page 2: Fluentd - Set Up Once, Collect More

Self-introduction

> Sadayuki Furuhashi
  github/twitter: @frsyuki

> Treasure Data, Inc.
  Founder & Software Architect

> Open source projects
  MessagePack - efficient object serializer
  Fluentd - data collection tool
  ServerEngine - Ruby framework to build multiprocess servers
  LS4 - distributed object storage system (suspended)
  kumofs - distributed key-value data store (suspended)

Page 3: Fluentd - Set Up Once, Collect More

What’s Fluentd?

An extensible & reliable data collection tool

Page 4: Fluentd - Set Up Once, Collect More

What’s Fluentd?

An extensible & reliable data collection tool

simple core + plugins

buffering, HA (failover), load balancing, etc.

like syslogd

Page 5: Fluentd - Set Up Once, Collect More

[Diagram]

Sources: Apache (access logs), Frontend and Backend (app logs), syslogd (system logs)

In between: your system (filter / buffer / routing)

Destinations: Blueflood, MongoDB, Hadoop, MySQL, Amazon S3 (metrics, analysis, archiving)

Page 9: Fluentd - Set Up Once, Collect More

Input Plugins Output Plugins

Buffer Plugins (Filter Plugins)

Page 10: Fluentd - Set Up Once, Collect More

# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to MongoDB and S3
<match **>
  type copy

  <store>
    type mongo
    host mongo.example.com
    capped
    capped_size 200m
  </store>

  <store>
    type s3
    path archive/
  </store>
</match>

Fluentd
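The copy idea in the configuration above can be sketched in plain Ruby. This is only an illustration under stated assumptions: MemoryStore and CopyOutput are hypothetical classes, not Fluentd's own, and the timestamp and record are made-up sample values.

```ruby
# Hypothetical stand-in for a real output such as mongo or s3:
# it only remembers what it was given.
class MemoryStore
  attr_reader :events

  def initialize
    @events = []
  end

  def emit(tag, time, record)
    @events << [tag, time, record]
  end
end

# Sketch of the copy idea: every event goes to every store.
class CopyOutput
  def initialize(stores)
    @stores = stores
  end

  def emit(tag, time, record)
    # Each store gets its own (shallow) copy of the record.
    @stores.each { |store| store.emit(tag, time, record.dup) }
  end
end

mongo_like = MemoryStore.new
s3_like = MemoryStore.new
copy = CopyOutput.new([mongo_like, s3_like])

# Sample event; the tag matches the one used in the config above.
copy.emit('web.access', 1392768000, { 'path' => '/index.html', 'code' => 200 })
```

Both stores end up holding the same event, which is the behavior the two nested sections of the match block ask for.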

Page 11: Fluentd - Set Up Once, Collect More
Page 12: Fluentd - Set Up Once, Collect More

Fluentd in Treasure Data

[Diagram]

API servers: Rails apps, each with a local Fluentd

Queue: PerfectQueue

worker servers: Ruby apps, each with a local Fluentd

Apps send events with fluent-logger-ruby + in_forward

watch server: script, in_exec, out_forward

Page 13: Fluentd - Set Up Once, Collect More

watch server

Librato Metrics for realtime analysis

Treasure Data for historical analysis

out_tdlog out_metricsense

✓ streaming aggregation

Fluentd in Treasure Data

Fluentd

Page 14: Fluentd - Set Up Once, Collect More

Internal Architecture

Input    Buffer   Output
Plugin   Plugin   Plugin

time:   2012-02-04 01:33:51
tag:    myapp.buylog
record: { "user": "me", "path": "/buyItem", "price": 150, "referer": "/landing" }
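As a minimal sketch of the event shape shown on this slide, the three parts can be modeled in plain Ruby. The Event struct is a hypothetical container used only for illustration, and treating the timestamp as UTC is an assumption; the slide does not state a time zone.

```ruby
require 'json'

# Hypothetical container for illustration only.
Event = Struct.new(:time, :tag, :record)

event = Event.new(
  Time.utc(2012, 2, 4, 1, 33, 51), # assumed UTC; the slide gives no zone
  'myapp.buylog',
  { 'user' => 'me', 'path' => '/buyItem', 'price' => 150, 'referer' => '/landing' }
)

# Tags are dot-separated, which is what makes routing by tag practical.
tag_parts = event.tag.split('.')

# One possible wire form: a JSON array of tag, epoch seconds, and record.
line = [event.tag, event.time.to_i, event.record].to_json
puts line
```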

Page 15: Fluentd - Set Up Once, Collect More

Architecture :: Input plugins

Input

HTTP+JSON (in_http)
File tail (in_tail)
Syslog (in_syslog)
...

Plugin

✓ Receive logs

✓ Or pull logs from data sources

✓ in non-blocking manner

Page 16: Fluentd - Set Up Once, Collect More

Architecture :: Output plugins

Plugin

✓ Write or send event logs

Output

File (out_file)
Amazon S3 (out_s3)
MongoDB (out_mongo)
...

Page 17: Fluentd - Set Up Once, Collect More

Architecture :: Buffer plugins

Plugin

✓ Improve performance

✓ Provide reliability

✓ Provide thread-safety

Buffer

Memory (buf_memory)
File (buf_file)

Page 18: Fluentd - Set Up Once, Collect More

Architecture :: Buffer plugins

Plugin

✓ Improve performance

✓ Provide reliability

✓ Provide thread-safety

[Diagram: Input -> chunk, chunk, chunk -> output]
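The chunk idea on this slide can be sketched as follows. ChunkedBuffer is a hypothetical class written for illustration, not Fluentd's buffer implementation, and the byte limit is an assumed parameter. Input appends formatted events to the current chunk, full chunks go onto a queue, and the output side drains the queue one chunk at a time.

```ruby
# Hypothetical class for illustration; not Fluentd's buffer code.
class ChunkedBuffer
  attr_reader :queue

  def initialize(chunk_limit)
    @chunk_limit = chunk_limit # assumed unit: bytes per chunk
    @current = +''
    @queue = []
  end

  # Input side: append data, starting a new chunk when the limit would be exceeded.
  def emit(data)
    if !@current.empty? && @current.bytesize + data.bytesize > @chunk_limit
      @queue << @current
      @current = +''
    end
    @current << data
  end

  # Output side: close the current chunk and hand every queued chunk to the block.
  def flush
    unless @current.empty?
      @queue << @current
      @current = +''
    end
    yield(@queue.shift) until @queue.empty?
  end
end

buffer = ChunkedBuffer.new(20)
%w[aaaaaaaa bbbbbbbb cccccccc].each { |d| buffer.emit(d + "\n") }

written = []
buffer.flush { |chunk| written << chunk }
```

Writing whole chunks instead of single events is what lets an output batch its work, which is one way to read the "improve performance" point above.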

Page 19: Fluentd - Set Up Once, Collect More

in_tail

Apache -> /var/log/access.log -> Fluentd (in_tail, buf_file) -> /var/log/fluentd/buffer

Page 20: Fluentd - Set Up Once, Collect More

in_tail

Apache -> /var/log/access.log -> Fluentd (in_tail, buf_file) -> /var/log/fluentd/buffer

✓ retrying automatically,
✓ with exponential wait,
✓ and persistence on a disk.
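Retrying with exponential wait can be sketched like this. The helper names, the doubling factor, and the default limits are assumptions made for illustration; they are not Fluentd configuration keys or its actual retry code. The sleeper is injectable so the example runs without really waiting.

```ruby
# Hypothetical helper: wait times that double after each failed attempt.
def retry_waits(initial_wait, max_retries)
  (0...max_retries).map { |n| initial_wait * (2**n) }
end

# Call the block until it succeeds, waiting longer after each failure.
# Re-raises the last error once max_retries is exhausted.
def write_with_retry(chunk, initial_wait: 1.0, max_retries: 5, sleeper: ->(s) { sleep(s) })
  waits = retry_waits(initial_wait, max_retries)
  attempt = 0
  begin
    yield chunk
  rescue StandardError
    raise if attempt >= max_retries
    sleeper.call(waits[attempt])
    attempt += 1
    retry
  end
end

slept = []
calls = 0
result = write_with_retry('chunk-1', sleeper: ->(s) { slept << s }) do |c|
  calls += 1
  # Simulate a destination that is down for the first three attempts.
  raise IOError, 'destination down' if calls < 4
  c
end
```

In this sketch the chunk lives only in memory; the slide's third point, persistence on a disk, is what allows the same retry to resume after a restart.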

Page 21: Fluentd - Set Up Once, Collect More

in_tail

Apache -> /var/log/access.log -> Fluentd (in_tail, buf_file) -> /var/log/fluentd/buffer

✓ buffering for any outputs,
✓ with exponential wait,
✓ and persistence on a disk.

Outputs: Amazon S3, Hadoop

Page 22: Fluentd - Set Up Once, Collect More

Fluentd

Fluentd Fluentd

fluentd

applications, log files, HTTP, etc.

Fluentd Fluentd Fluentd Fluentd

Heartbeat

Page 23: Fluentd - Set Up Once, Collect More

Fluentd

Fluentd Fluentd

fluentd

applications, log files, HTTP, etc.

Fluentd Fluentd Fluentd Fluentd

Heartbeat

✓ load balancing or active-backup
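One way to picture load balancing with an active-backup fallback is the sketch below. Node and Forwarder are hypothetical names, and the round-robin policy is an assumption chosen for illustration; this is not the out_forward implementation. Heartbeat results mark nodes alive or dead, and standby nodes are used only when no primary is alive.

```ruby
# Hypothetical model for illustration only.
Node = Struct.new(:name, :standby, :alive)

class Forwarder
  def initialize(nodes)
    @nodes = nodes
    @counter = 0
  end

  # Record the latest heartbeat result for a node.
  def heartbeat(name, alive)
    @nodes.find { |n| n.name == name }.alive = alive
  end

  # Round-robin over live primaries; fall back to live standbys.
  def pick
    primaries = @nodes.select { |n| n.alive && !n.standby }
    pool = primaries.empty? ? @nodes.select { |n| n.alive && n.standby } : primaries
    raise 'no live nodes' if pool.empty?

    node = pool[@counter % pool.size]
    @counter += 1
    node.name
  end
end

fwd = Forwarder.new([
  Node.new('a', false, true),
  Node.new('b', false, true),
  Node.new('backup', true, true)
])

balanced = Array.new(4) { fwd.pick } # alternates between the two primaries

# Both primaries miss their heartbeat, so traffic moves to the standby.
fwd.heartbeat('a', false)
fwd.heartbeat('b', false)
failover = fwd.pick
```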

Page 24: Fluentd - Set Up Once, Collect More

class CassandraOutput < Fluent::BufferedOutput
  Fluent::Plugin.register_output('cassandra', self)

  require 'cassandra'

  config_param :keyspace, :string
  config_param :columnfamily, :string
  config_param :host, :string, :default => 'localhost'
  config_param :port, :integer, :default => 9160

  # Open the connection once, when the plugin starts.
  def start
    super
    @connection = Cassandra.new(@keyspace, "#{@host}:#{@port}")
  end

  # Serialize one event into the buffer chunk.
  def format(tag, time, record)
    record['tag'] = tag
    record['time'] = time
    record.to_msgpack
  end

  # Write every record of a flushed chunk to Cassandra.
  def write(chunk)
    chunk.msgpack_each do |record|
      @connection.insert(@columnfamily, "#{record["tag"]}_#{record["time"]}", record)
    end
  end
end

out_cassandra

Page 25: Fluentd - Set Up Once, Collect More

Use cases

http://www.slideshare.net/tagomoris/rubykaigi-2013-111130

“Complex Event Processing on Ruby, Fluentd and Norikra” TAGOMORI Satoshi, RubyKaigi 2013

http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013

“Log analysis system with Hadoop” NHN Japan Corp., Hadoop Conference Japan 2013

http://www.slideshare.net/sylvainkalache/fluentd-at-slideshare

“fluentd at slideshare” @SylvainKalache, Fluentd meetup

Page 26: Fluentd - Set Up Once, Collect More

Use cases

http://www.slideshare.net/frsyuki/how-24042353

“How we use Fluentd in Treasure Data” Sadayuki Furuhashi, Fluentd meetup at slideshare

http://www.slideshare.net/sematext/solr-for-indexing-and-searching-logs

“Using Solr to Search and Analyze Logs” Radu Gheorghe

http://docs.fluentd.org/articles/free-alternative-to-splunk-by-fluentd

“Free Alternative to Splunk Using Fluentd”

Page 27: Fluentd - Set Up Once, Collect More

Expected discussions...

> Who is using Fluentd?

> What are the differences compared to XYZ?

> Is there a plugin to send/recv data to/from XYZ?

> How can my system XYZ send data to Fluentd?

> Does Fluentd really work in case of XYZ?

Page 28: Fluentd - Set Up Once, Collect More

Links

http://fluentd.org/plugin/

Page 29: Fluentd - Set Up Once, Collect More

class SomeInput < Fluent::Input
  Fluent::Plugin.register_input('myin', self)

  config_param :tag, :string

  # Emit a fixed record in a background thread.
  def start
    Thread.new {
      while true
        time = Fluent::Engine.now
        record = {"user"=>1, "size"=>1}
        Fluent::Engine.emit(@tag, time, record)
      end
    }
  end

  def shutdown
    ...
  end
end

<source>
  type myin
  tag myapp.api.heartbeat
</source>

Page 30: Fluentd - Set Up Once, Collect More

class SomeOutput < Fluent::BufferedOutput
  Fluent::Plugin.register_output('myout', self)

  config_param :myparam, :string

  def format(tag, time, record)
    [tag, time, record].to_json + "\n"
  end

  def write(chunk)
    puts chunk.read
  end
end

<match **>
  type myout
  myparam foobar
</match>
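The relationship between format and write in the plugin above can be tried outside Fluentd with a small sketch. Here format_event mirrors the slide's format method, a StringIO stands in for the buffer chunk (an assumption: it only mimics read), and the tag, timestamps, and records are made-up sample values.

```ruby
require 'json'
require 'stringio'

# Same rule as the slide's format method: one JSON array per line.
def format_event(tag, time, record)
  [tag, time, record].to_json + "\n"
end

# Hypothetical stand-in for a buffer chunk.
chunk = StringIO.new
chunk << format_event('myapp.api', 1328319231, { 'user' => 1 })
chunk << format_event('myapp.api', 1328319232, { 'user' => 2 })
chunk.rewind

# What a write method could do with chunk.read: decode it line by line.
decoded = chunk.read.each_line.map { |line| JSON.parse(line) }
```

Because format ends each event with a newline, write can split the chunk back into individual events without any extra framing.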

Page 31: Fluentd - Set Up Once, Collect More

class MyTailInput < Fluent::TailInput
  Fluent::Plugin.register_input('mytail', self)

  def configure_parser(conf)
    ...
  end

  # Turn one tab-separated line into a time and a record.
  def parse_line(line)
    array = line.split("\t")
    record = {"user"=>array[0], "item"=>array[1]}

    time = Fluent::Engine.now
    return time, record
  end
end

<source>
  type mytail
</source>