Apache Hadoop Goes Realtime at Facebook ~ Borthakur, Sarma, Gray, Muthukkaruppan, Spiegelberg, Kuang, Ranganathan, Molkov, Menon, Rash, Schmidt and Aiyer

Post on 29-May-2020

Page 1: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma, …courses.cs.vt.edu/cs5204/fall14-butt/lectures/fbrealtime.pdf · 2014-09-15 · Facebook's unique requirements Facebook

Apache Hadoop Goes Realtime at Facebook

~ Borthakur, Sarma, Gray, Muthukkaruppan, Spiegelberg, Kuang, Ranganathan, Molkov, Menon, Rash, Schmidt and Aiyer

Problem and Context
● Ever-increasing data at Facebook
● Launch of Facebook Messages
● Other Young Turks at Facebook
● Leaving MySQL and its sharding
● Migration challenges
● Problem in words: unpredictable growth, write throughput and latency requirements

Problem and Context (contd.)
● The Usual Suspects
    ● Cassandra
    ● Other NoSQL
● Other considerations
● Solution: A near-realtime Hadoop/HBase, modified from the vanilla versions to provide scalability, consistency, availability and a compatible data model.

Key contributions
● Making Hadoop and HBase more real-time
● Adapting Hadoop and HBase to Facebook's unique requirements
● Implementation of Realtime HDFS
● Implementation of Production HBase
● Operational Optimizations

Overview
● Problem and Context
● Facebook stands alone
● (Small) Introduction to Hadoop and HBase
● Enter the Hs
● Realtime HDFS
● Production HBase
● Operational Optimization
● The present future

Facebook's unique requirements

● Facebook and the Hadoop ecosystem
    ● Offline and sequential
● Requirement Type 1 – Realtime concurrent read access to a large stream of realtime data
    ● Example: Scribe
● Requirement Type 2 – Dynamically index a rapidly growing data set for fast random lookups
    ● Example: Facebook Messages

Facebook's unique requirements
● Facebook Messaging:
    ● Unwieldy tables
    ● High Write Throughput
    ● Data Migration
● Facebook Insights
    ● Realtime Analytics
    ● Aggregators
● Facebook Metrics System
    ● Quick reads
    ● Automatic Sharding

Introduction to Hadoop

Introduction to HBase
● HBase: A NoSQL database that uses an on-disk column storage format.
● HBase USP: Provides fast key-based access to a specific cell of data or a range of cells.
● Based on Google's BigTable but extends it
● Has row atomicity and read-modify-write consistency
● Simplifies a lot of tasks related to distributed databases.
● Tagline: Random access to web-scale data
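
To make the "specific cell or a range of cells" point concrete, here is a minimal in-memory sketch. It is not the HBase client API: `SortedCellStore`, `put`, `get` and `scan` are hypothetical names, and the only idea carried over from the slide is that keeping keys sorted makes both point lookups and range scans cheap.

```python
import bisect

class SortedCellStore:
    """Toy stand-in for a sorted key/value store (hypothetical, not the HBase API)."""

    def __init__(self):
        self._keys = []    # (row, column) keys kept in sorted order
        self._cells = {}   # (row, column) -> value

    def put(self, row, column, value):
        key = (row, column)
        if key not in self._cells:
            bisect.insort(self._keys, key)
        self._cells[key] = value

    def get(self, row, column):
        # Key-based access to one specific cell.
        return self._cells.get((row, column))

    def scan(self, start_row, stop_row):
        # Range scan over rows in [start_row, stop_row), in key order.
        lo = bisect.bisect_left(self._keys, (start_row, ""))
        for key in self._keys[lo:]:
            if key[0] >= stop_row:
                break
            yield key, self._cells[key]

store = SortedCellStore()
store.put("user2", "msg:subject", "hello")
store.put("user1", "msg:subject", "hi")
store.put("user3", "msg:subject", "hey")

# Rows come back in sorted order regardless of insertion order.
rows = [key[0] for key, _ in store.scan("user1", "user3")]
```

The row and column names are made up for illustration; a real table would spread the sorted key space across many regions.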

Introduction to ZooKeeper
● ZooKeeper: A software service for a distributed environment that coordinates and configures different machines in a centralized way.
● A change is not considered successful until it has been written to a quorum
● A leader is elected within the ensemble to resolve conflicts
● In HBase, ZooKeeper coordinates and shares state between the Masters and RegionServers.
● Tagline: Enables highly reliable distributed coordination
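
The quorum rule above can be written down in a few lines. This is only a sketch of the arithmetic, assuming a simple majority quorum; the function names are hypothetical and none of this is ZooKeeper's actual code.

```python
def quorum_size(ensemble_size):
    # A majority quorum: strictly more than half of the servers.
    return ensemble_size // 2 + 1

def write_committed(acks, ensemble_size):
    # A change counts as successful only once a quorum has acknowledged it.
    return acks >= quorum_size(ensemble_size)

committed_2_of_5 = write_committed(2, 5)   # not enough acknowledgements
committed_3_of_5 = write_committed(3, 5)   # majority reached
```

One consequence of the majority rule is that an ensemble of 5 servers can keep accepting writes with 2 servers down, while an ensemble of 4 tolerates only 1 failure.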

HBase + ZooKeeper

The Why Hadoop/HBase question
● Scalability
● Range Scans
● Efficient low-latency strong consistency
● Atomic Read-Modify-Write
● Random reads
● Fault Isolation
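
Atomic read-modify-write is the requirement that is easiest to get wrong, so here is a small sketch of what it buys. It assumes a per-row lock as the mechanism, which is a simplification chosen for illustration; `RowStore`, `increment` and `check_and_put` are hypothetical names, not the HBase API.

```python
import threading

class RowStore:
    """Toy row store with per-row locking (hypothetical, not the HBase API)."""

    def __init__(self):
        self._rows = {}
        self._row_locks = {}
        self._guard = threading.Lock()  # protects the two dicts above

    def _lock_for(self, row):
        with self._guard:
            self._rows.setdefault(row, {})
            return self._row_locks.setdefault(row, threading.Lock())

    def increment(self, row, column, delta=1):
        # Read, modify and write back while holding the row lock, so
        # concurrent increments on the same row cannot lose updates.
        with self._lock_for(row):
            value = self._rows[row].get(column, 0) + delta
            self._rows[row][column] = value
            return value

    def check_and_put(self, row, column, expected, value):
        # Write only if the cell still holds the expected value.
        with self._lock_for(row):
            if self._rows[row].get(column) != expected:
                return False
            self._rows[row][column] = value
            return True

store = RowStore()

def worker():
    for _ in range(1000):
        store.increment("user1", "stats:unread")

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Four workers times 1000 increments each; no update is lost.
total = store.increment("user1", "stats:unread", 0)
```

A counter such as an unread-message count is the typical use: without the atomic primitive, the client would have to read, add and write back itself and could overwrite a concurrent update.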

The Why Hadoop/HBase question
● High write throughput
● Data model
● High Availability
● Non-requirements
    ● Tolerance of network partitions
    ● Zero downtime on individual data centre failure
    ● Federation comfort

Realtime HDFS - AvatarNode

Realtime HDFS - AvatarNode

Realtime HDFS – AvatarNode view

Realtime HDFS – Logging

● Enhancements to transaction logging:
    ● Conventional HDFS
    ● Change: Let the StandbyNode always know about block ids.
    ● Avoidance of partial reads between Active and Standby node
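
One way a Standby can avoid acting on a half-written transaction is to frame every log record with a length, a transaction id and a checksum, and to stop reading at the first record that is incomplete. The sketch below assumes that framing for illustration; the record layout and the function names are hypothetical and are not the actual HDFS edits-log format.

```python
import struct
import zlib

# Hypothetical record header: payload length, transaction id, CRC32 of payload.
HEADER = struct.Struct(">IQI")

def encode_txn(txn_id, payload):
    return HEADER.pack(len(payload), txn_id, zlib.crc32(payload)) + payload

def read_complete_txns(buf):
    """Return (transactions, bytes_consumed), stopping at the first partial record."""
    txns, offset = [], 0
    while offset + HEADER.size <= len(buf):
        length, txn_id, crc = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + length
        if end > len(buf):
            break  # payload not fully written yet; retry later
        payload = buf[offset + HEADER.size:end]
        if zlib.crc32(payload) != crc:
            break  # torn or corrupt record; do not apply it
        txns.append((txn_id, payload))
        offset = end
    return txns, offset

log = encode_txn(1, b"mkdir /a") + encode_txn(2, b"create /a/f")
partial = log[:-3]  # simulate the Active node being mid-write

# Only the first, fully written transaction is returned.
txns, consumed = read_complete_txns(partial)
```

The reader remembers `consumed` and resumes from that offset on its next pass, so the second transaction is picked up once the writer has finished it.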

Improved block availability

● Challenge: Placement of non-local blocks is not optimal; can be on any rack or within any node therein.

● Solution: A new block placement policy that reduces the probability of data loss by orders of magnitude.

● Define a 'window' of logical racks and logical machines around the original block.
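
A rough sketch of the window idea follows, assuming racks form a logical ring and each rack is itself a logical ring of machines. The window sizes, the node names and the way the two remote replicas are drawn from the window are all illustrative assumptions, not the policy's real parameters.

```python
import random

def candidate_nodes(racks, rack_idx, machine_idx, rack_window, machine_window):
    """Nodes eligible for non-local replicas of a block written at
    racks[rack_idx][machine_idx]. Indices wrap around both rings."""
    candidates = []
    for dr in range(1, rack_window + 1):
        rack = racks[(rack_idx + dr) % len(racks)]
        for dm in range(machine_window):
            candidates.append(rack[(machine_idx + dm) % len(rack)])
    return candidates

# 8 racks of 10 machines each, named r<rack>m<machine>.
racks = [[f"r{r}m{m}" for m in range(10)] for r in range(8)]

# Block written on rack 6, machine 9: the window wraps to racks 7 and 0.
window = candidate_nodes(racks, rack_idx=6, machine_idx=9,
                         rack_window=2, machine_window=3)

# Pick two remote replicas from the window instead of from the whole cluster.
replicas = random.Random(0).sample(window, 2)
```

Restricting replicas to a small window shrinks the number of distinct node combinations that hold all copies of some block, which is why fewer simultaneous-failure patterns lead to data loss.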

Hadoop performance improvements

● RPC Timeout
    ● Live free or fail fast
● File Lease recovery
● Local replica awareness
● New tricks:
    ● HDFS sync
    ● Concurrent readers

HBase – ACID compliance

● Requirement: Row-level atomicity and consistency (the A and C of ACID)
    ● RegionServer failure during log write for row transactions
    ● Consistency of replicas
● Solution:
    ● WAL edits ~ Write Ahead Log policy
    ● Immediate rollback
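
The write-ahead idea behind row-level atomicity can be sketched as follows. The assumption here is that every column written to a row travels in one log record, appended before the in-memory state changes, so recovery either sees the whole row update or none of it. `TinyRegion` and its methods are hypothetical names, and a Python list stands in for the durable log file.

```python
import json

class TinyRegion:
    """Toy region with a write-ahead log (hypothetical, not HBase code)."""

    def __init__(self):
        self.wal = []        # stand-in for a durable, append-only log
        self.memstore = {}   # row -> {column: value}

    def put_row(self, row, columns):
        # All columns of the row go into ONE log record, written before the
        # in-memory state is touched, so the row update is all-or-nothing.
        self.wal.append(json.dumps({"row": row, "columns": columns}))
        self.memstore.setdefault(row, {}).update(columns)

    @classmethod
    def recover(cls, wal):
        # Rebuild the in-memory state by replaying complete log records.
        region = cls()
        for record in wal:
            edit = json.loads(record)
            region.wal.append(record)
            region.memstore.setdefault(edit["row"], {}).update(edit["columns"])
        return region

region = TinyRegion()
region.put_row("user1", {"msg:subject": "hi", "msg:body": "lunch?"})

# Simulate a RegionServer restart: state is rebuilt from the log alone.
recovered = TinyRegion.recover(region.wal)
```

If the columns were logged as separate records instead, a crash between the two appends would leave a row with a subject but no body after replay.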

HBase – Availability Improvements

● Master Rewrite
    ● Store transient state in ZooKeeper
● Rolling upgrades
    ● Handled by reassigning of regions
● Distributed log splitting
    ● Outsource to ZooKeeper

HBase – Performance Improvement

● Compaction Improvement
    ● Put latency dropped from 25 ms to 3 ms!
● Read Optimization – Skipping unnecessary files for certain queries, reducing I/O
    ● Using Bloom filters
    ● A new special timestamp file selection algorithm
● Ensuring that Regions are local to their data
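
The two file-skipping ideas can be combined in one small sketch: drop files whose newest entry is older than the query's time range, then drop files whose Bloom filter says the row is definitely absent. The filter sizing, the file metadata layout and the names `BloomFilter` and `files_to_read` are assumptions made for illustration, not HBase internals.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

def make_file(name, row_keys, max_ts):
    bloom = BloomFilter()
    for key in row_keys:
        bloom.add(key)
    return {"name": name, "bloom": bloom, "max_ts": max_ts}

def files_to_read(store_files, row_key, min_ts):
    selected = []
    for f in store_files:
        if f["max_ts"] < min_ts:
            continue  # every entry in this file is older than the query range
        if not f["bloom"].might_contain(row_key):
            continue  # the row is definitely not in this file
        selected.append(f["name"])
    return selected

store_files = [
    make_file("a", ["user1"], max_ts=100),
    make_file("b", ["user2"], max_ts=200),
    make_file("c", ["user1"], max_ts=50),
]
selected = files_to_read(store_files, "user1", min_ts=80)
```

File `c` is skipped on its timestamp alone and file `b` is almost always skipped by its Bloom filter; a false positive there would cost one unnecessary read but never a wrong answer.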

Operational Optimizations
● Facebook's HBase testing program
● HBase Verify
● HBCK
● Added metrics for long-running operations too!
● Manual split instead of automatic

Operational Optimizations

● Dark Launch
● Dashboard/ODS integration
    ● Cross-cluster dashboards for higher-level analysis
    ● Visualize version differences
● Backups
    ● Do it using Scribe as an alternate application log
    ● Piggyback on the data sent for Hive analytics

Operational Optimizations

● Importing the data
    ● Challenge: Importing legacy data into HBase from a Hadoop job saturates the production network
    ● Solution: Use Bulk Import with compression
        – Enhanced by GZIP of the intermediate map output
● Reducing Network I/O:
    ● Reduced the frequency of major compactions
    ● Certain column families excluded from logging
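
To give a feel for why compressing intermediate output relieves the network, the sketch below gzips a batch of repetitive key/value records with Python's standard library. The record contents are invented for illustration, and the achieved ratio depends entirely on the data; this says nothing about the actual job configuration used.

```python
import gzip

# Hypothetical intermediate records: row key, column, value, tab-separated.
records = "".join(
    f"user{i % 50}\tmsg:subject\thello world\n" for i in range(2000)
).encode()

compressed = gzip.compress(records)
restored = gzip.decompress(compressed)

# Fraction of the original bytes that would cross the network.
ratio = len(compressed) / len(records)
```

The trade-off is extra CPU time on both the map and reduce sides in exchange for fewer bytes on the wire, which pays off when the network is the saturated resource.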

The present future
● Apache Hadoop 2.0 was released in 2012
● One addition was YARN
    ● A powerful cluster resource manager
● Added the High Availability feature to the NameNode by introducing the hot Standby NameNode.
● Greater integration with ZooKeeper, especially for the ZKFC (implementation of failover in HDFS)

Thank you and GG!