Storm - Altamira University Presentation

Post on 26-Jan-2015

Category:

Technology

Objectives
1. Their Motivation
2. Our Motivation
3. Storm Basics
4. Demo

Their Motivation: How Storm Came To Be

What They Wanted
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher level abstraction than message passing
• “Just works”

Our Motivation: Why We (Eventually) Chose Storm

Lumify Ingest
• Raw Data
• Text Extraction
• Entity Extraction
• Text Highlighting
• Location Extraction
• Full Text Indexing

Issues

• No Reducers
• High DB Read/Writes
• Batch-style processing
• M/R Overhead
• Zero Fault Tolerance

What We Really Wanted

• Distributed, Stream-type Processing
• Simple Logical DAG
• Better Fault Tolerance

Storm Ingest Workflow

Inputs
• Text
• Documents
• Video
• Images

Stages
• Raw Data
• Content Sorter
• Text Extraction
• Video Frame Splitting
• Video Frame Text Extraction
• Image Text Extraction

Storm Basics: What the heck’s a Topology?

Storm Cluster
• Nimbus (1 node)
• Zookeeper (3 nodes)
• Supervisor (5 nodes)

Storm Data Concepts
• Tuples
• Streams
• Spouts
• Bolts
• Topologies

Tuples

• Single unit of data in Storm
• Examples
  – Tweet
  – User Activity Log Entry
  – File Info

Streams

An unbounded sequence of Tuples
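To make the Tuple and Stream ideas concrete, here is a minimal sketch in plain Python. It does not use the Storm API; the `LogEntry` fields and the `activity_stream` generator are hypothetical names chosen only for illustration.

```python
from itertools import count, islice
from typing import Iterator, NamedTuple


class LogEntry(NamedTuple):
    """One tuple: a single user-activity log entry (hypothetical fields)."""
    user: str
    action: str


def activity_stream() -> Iterator[LogEntry]:
    """An unbounded stream: yields one tuple at a time and never finishes."""
    for i in count():
        yield LogEntry(user=f"user{i % 3}", action="click")


# A consumer can only ever take a finite slice of an unbounded stream.
first_four = list(islice(activity_stream(), 4))
```

The generator never returns on its own, which is the point: downstream code has to process tuples as they arrive rather than wait for a complete batch.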

Spouts

Producers of Streams
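A spout can be pictured as an object that is polled repeatedly for its next tuple. The sketch below is plain Python, not the Storm API; `QueueSpout` and its in-memory queue are assumptions standing in for a real source such as a Kafka queue.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


class QueueSpout:
    """Toy spout: pulls raw messages off an in-memory queue and emits tuples."""

    def __init__(self, messages: Iterable[str]) -> None:
        self._pending = deque(messages)

    def next_tuple(self) -> Optional[Tuple[str]]:
        """Return the next tuple, or None if the source is currently empty."""
        if not self._pending:
            return None
        return (self._pending.popleft(),)


spout = QueueSpout(["first tweet", "second tweet"])
# Poll three times: two tuples are available, the third poll finds nothing.
emitted = [spout.next_tuple() for _ in range(3)]
```

Returning `None` on an empty source, rather than blocking, lets the caller keep polling without stalling the rest of the pipeline.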

Bolts

Process input streams to create new streams
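The "stream in, new stream out" idea can be sketched with chained generators. This is plain Python rather than the Storm API, and both function names are hypothetical; the first step splits sentences into words and the second filters them.

```python
from typing import Iterable, Iterator


def split_sentences(sentences: Iterable[str]) -> Iterator[str]:
    """Bolt-like step: consume a stream of sentences, emit a stream of words."""
    for sentence in sentences:
        for word in sentence.split():
            yield word.lower()


def drop_short_words(words: Iterable[str], min_len: int = 3) -> Iterator[str]:
    """Bolt-like filter: pass along only words with at least min_len characters."""
    for word in words:
        if len(word) >= min_len:
            yield word


incoming = iter(["Storm is fast", "Storm scales out"])
# Chaining the two steps mirrors wiring one bolt's output into the next bolt.
words = list(drop_short_words(split_sentences(incoming)))
```

Each step sees one tuple at a time and knows nothing about the steps around it, which is what allows such steps to be rearranged into a DAG.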

Examples

Spout Examples
• HDFS Filesystem Spout
• Kafka Queue Spout

Bolt Examples
• Filtering
• Aggregation
• DB Operations

Topologies

(Diagram: a topology with three Spouts)

Demo

Demo Topology
• Twitter (source)
• Twitter Hosebird Spout
• SentenceSplitter
• WordCount
• Accumulo
• Groupings shown: ShuffleGrouping, Field Grouping
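The two groupings named in the demo differ in how tuples are routed to parallel tasks: a shuffle grouping spreads tuples across tasks at random, while a fields grouping sends tuples with the same field value to the same task. The sketch below illustrates that difference in plain Python. It is not Storm's implementation; `NUM_TASKS`, the CRC32 hash, and the function names are assumptions made for the example.

```python
import random
import zlib
from collections import Counter

NUM_TASKS = 3  # hypothetical number of parallel word-count tasks


def shuffle_grouping(_word: str, rng: random.Random) -> int:
    """Pick any task at random; fine for stateless work such as splitting."""
    return rng.randrange(NUM_TASKS)


def fields_grouping(word: str) -> int:
    """Hash the grouping field so equal words always reach the same task."""
    return zlib.crc32(word.encode("utf-8")) % NUM_TASKS


words = "the cat saw the other cat".split()

# Shuffle routing: the target depends only on the random generator.
rng = random.Random(0)
shuffle_targets = [shuffle_grouping(w, rng) for w in words]

# Fields routing: each task keeps its own counter for the words it owns.
counters = [Counter() for _ in range(NUM_TASKS)]
for word in words:
    counters[fields_grouping(word)][word] += 1

# Each word lives on exactly one task, so merging is a simple sum.
totals = sum(counters, Counter())
```

Counting needs the fields-style routing: if occurrences of the same word were scattered across tasks, no single task would hold the full count for that word.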
