apache flume ng

APACHE FLUME NG: Kai Voigt, Cloudera Inc. London, Hadoop User Group, 10 Oct 2012. Thursday, 11 October 2012.

Upload: huguk

Post on 04-Dec-2014


DESCRIPTION

Talk given by Kai Voigt, Cloudera Inc, at the Hadoop User Group UK meetup on 10 Oct 2012 in London

TRANSCRIPT

Page 1: Apache Flume NG

APACHE FLUME NG
Kai Voigt, Cloudera Inc
London, Hadoop User Group, 10 Oct 2012

Thursday, 11 October 2012

Page 2: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 3: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 4: Apache Flume NG

[Diagram: httpd → /var/log/htaccess → Flume → HDFS]

Page 5: Apache Flume NG


Page 6: Apache Flume NG

[Diagram: mysource → mychannel → mysink]

myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel

Page 7: Apache Flume NG

myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel

[Diagram: mysource → mychannel → mysink]

Page 8: Apache Flume NG

myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel

[Diagram: mysource → mychannel → mysink]

Page 9: Apache Flume NG

myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100

[Diagram: mysource → mychannel → mysink]
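For reference, the fragments from pages 6 to 9 can be collected into one file. The sketch below assumes they all belong to the same configuration file, which the next slide calls simple.conf. It uses only the property names and values shown on the slides, with the key spelled transactionCapacity as Flume expects. The comments are added for explanation.

```properties
# simple.conf: one agent named "myagent" with one source, one channel, one sink

# Name the components of the agent
myagent.sources = mysource
myagent.sinks = mysink
myagent.channels = mychannel

# Source: follow the web server log by running tail
myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel

# Sink: write events to HDFS as an uncompressed data stream
myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel

# Channel: in-memory buffer between source and sink
myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100
```

Note that a source takes `channels` (plural, since it may feed several channels), while a sink takes `channel` (singular, since it drains exactly one).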

Page 10: Apache Flume NG

$ flume-ng agent --conf-file simple.conf --name myagent
$ hadoop fs -ls htaccess
-rw-r--r-- 1 cloudera cloudera 1001 2012-09-30 05:58 htaccess/FlumeData.1348999108529
-rw-r--r-- 1 cloudera cloudera 993 2012-09-30 05:58 htaccess/FlumeData.1348999108530
-rw-r--r-- 1 cloudera cloudera 997 2012-09-30 05:59 htaccess/FlumeData.1348999108531
-rw-r--r-- 1 cloudera cloudera 1009 2012-09-30 05:59 htaccess/FlumeData.1348999108532
...

Page 11: Apache Flume NG

“FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”

Page 12: Apache Flume NG


MULTI HOP


Page 13: Apache Flume NG

# Agent 1: the Avro sink connects to the hostname and port of the next hop
myagent1.sinks = mysink
myagent1.sinks.mysink.type = avro
myagent1.sinks.mysink.hostname = 10.10.10.20
myagent1.sinks.mysink.port = 4141

# Agent 2: the Avro source listens on the address and port that the upstream sink targets
myagent2.sources = mysource
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.20
myagent2.sources.mysource.port = 4141
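The slide shows only the two ends of the Avro link. As a minimal sketch of a complete two-hop flow, the configuration below adds the pieces each agent would still need. The exec source on the first agent, the memory channels, and the HDFS sink on the second agent are assumptions reused from the single-agent example, and the second agent is assumed to run on 10.10.10.20. The Avro sink takes the remote address in `hostname`, while the Avro source takes its listen address in `bind`.

```properties
# Agent 1 (on the web server): exec source -> memory channel -> Avro sink
myagent1.sources = mysource
myagent1.channels = mychannel
myagent1.sinks = mysink

myagent1.sources.mysource.type = exec
myagent1.sources.mysource.command = tail -F /var/log/htaccess
myagent1.sources.mysource.channels = mychannel

myagent1.channels.mychannel.type = memory

# The Avro sink connects to the next hop at hostname:port
myagent1.sinks.mysink.type = avro
myagent1.sinks.mysink.hostname = 10.10.10.20
myagent1.sinks.mysink.port = 4141
myagent1.sinks.mysink.channel = mychannel

# Agent 2 (assumed to run on 10.10.10.20): Avro source -> memory channel -> HDFS sink
myagent2.sources = mysource
myagent2.channels = mychannel
myagent2.sinks = mysink

# The Avro source listens on bind:port for events from upstream agents
myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.20
myagent2.sources.mysource.port = 4141
myagent2.sources.mysource.channels = mychannel

myagent2.channels.mychannel.type = memory

myagent2.sinks.mysink.type = hdfs
myagent2.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent2.sinks.mysink.hdfs.fileType = DataStream
myagent2.sinks.mysink.channel = mychannel
```

Both agents can be defined in the same file, because every property is prefixed with the agent name. Each host would then start its own agent by passing the matching `--name` (myagent1 or myagent2) to `flume-ng agent`.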

Page 14: Apache Flume NG


CONSOLIDATION
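A consolidation flow typically has several first-tier agents sending to one collecting agent. The sketch below assumes two web servers and one collector listening on 10.10.10.20. The agent names webagent1, webagent2, and collector, the component names, and the address are hypothetical. The source and sink types are reused from the earlier examples.

```properties
# First-tier agent on web server 1: tail the log and forward it over Avro
webagent1.sources = logsource
webagent1.channels = memchannel
webagent1.sinks = avrosink
webagent1.sources.logsource.type = exec
webagent1.sources.logsource.command = tail -F /var/log/htaccess
webagent1.sources.logsource.channels = memchannel
webagent1.channels.memchannel.type = memory
webagent1.sinks.avrosink.type = avro
webagent1.sinks.avrosink.hostname = 10.10.10.20
webagent1.sinks.avrosink.port = 4141
webagent1.sinks.avrosink.channel = memchannel

# First-tier agent on web server 2: same layout, same destination
webagent2.sources = logsource
webagent2.channels = memchannel
webagent2.sinks = avrosink
webagent2.sources.logsource.type = exec
webagent2.sources.logsource.command = tail -F /var/log/htaccess
webagent2.sources.logsource.channels = memchannel
webagent2.channels.memchannel.type = memory
webagent2.sinks.avrosink.type = avro
webagent2.sinks.avrosink.hostname = 10.10.10.20
webagent2.sinks.avrosink.port = 4141
webagent2.sinks.avrosink.channel = memchannel

# Collector: one Avro source receives from all first-tier agents and writes to HDFS
collector.sources = avrosource
collector.channels = memchannel
collector.sinks = hdfssink
collector.sources.avrosource.type = avro
collector.sources.avrosource.bind = 10.10.10.20
collector.sources.avrosource.port = 4141
collector.sources.avrosource.channels = memchannel
collector.channels.memchannel.type = memory
collector.sinks.hdfssink.type = hdfs
collector.sinks.hdfssink.hdfs.path = /user/cloudera/htaccess
collector.sinks.hdfssink.hdfs.fileType = DataStream
collector.sinks.hdfssink.channel = memchannel
```

With this layout only the collector writes to HDFS, so adding a web server means adding one more first-tier agent that points at the same Avro source.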


Page 15: Apache Flume NG


MULTIPLEXING
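In a multiplexing flow, one source feeds several channels and a channel selector decides which channel each event goes to. The sketch below assumes an Avro source whose events carry a header named logtype, set by some upstream component. The header name, its values access and debug, and all component names are hypothetical. The `selector.*` property names are the ones Flume documents for its multiplexing channel selector.

```properties
myagent.sources = mysource
myagent.channels = hdfschannel logchannel
myagent.sinks = hdfssink logsink

# The source lists every channel it may write to
myagent.sources.mysource.type = avro
myagent.sources.mysource.bind = 0.0.0.0
myagent.sources.mysource.port = 4141
myagent.sources.mysource.channels = hdfschannel logchannel

# Route each event by the value of its "logtype" header (hypothetical header)
myagent.sources.mysource.selector.type = multiplexing
myagent.sources.mysource.selector.header = logtype
myagent.sources.mysource.selector.mapping.access = hdfschannel
myagent.sources.mysource.selector.mapping.debug = logchannel
# Events with any other value, or without the header, go here
myagent.sources.mysource.selector.default = hdfschannel

myagent.channels.hdfschannel.type = memory
myagent.channels.logchannel.type = memory

# Each sink drains exactly one channel
myagent.sinks.hdfssink.type = hdfs
myagent.sinks.hdfssink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.hdfssink.hdfs.fileType = DataStream
myagent.sinks.hdfssink.channel = hdfschannel

myagent.sinks.logsink.type = logger
myagent.sinks.logsink.channel = logchannel
```

If no selector is configured, Flume uses the replicating selector, which copies every event to all channels listed for the source.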


Page 16: Apache Flume NG

Sources             Sinks     Channels
Avro                Avro      Memory
Exec                Logger    JDBC
NetCat              IRC       File
Sequence Generator  File
Syslog              HBase
Scribe
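Switching to another component from the table is mostly a matter of changing the `type` property and supplying that type's own settings. As a sketch, the agent below combines a NetCat source, a File channel, and a Logger sink. The agent and component names, the port, and the two directories are hypothetical. The property names are the ones documented for these component types in Flume 1.x.

```properties
testagent.sources = ncsource
testagent.channels = filechannel
testagent.sinks = logsink

# NetCat source: turns each line of text received on the TCP port into an event
testagent.sources.ncsource.type = netcat
testagent.sources.ncsource.bind = localhost
testagent.sources.ncsource.port = 44444
testagent.sources.ncsource.channels = filechannel

# File channel: buffers events on local disk so they survive an agent restart
testagent.channels.filechannel.type = file
testagent.channels.filechannel.checkpointDir = /var/lib/flume/checkpoint
testagent.channels.filechannel.dataDirs = /var/lib/flume/data

# Logger sink: writes events to the agent's log at INFO level, useful for testing
testagent.sinks.logsink.type = logger
testagent.sinks.logsink.channel = filechannel
```

Compared with the memory channel used earlier, the File channel trades some throughput for durability, which matters when losing buffered events is not acceptable.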

Page 17: Apache Flume NG

DEMO

Page 18: Apache Flume NG

Thank you
[email protected]
http://www.cloudera.com/