facebook[the nuts and bolts technology]

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING MADANAPALLE INSTITUTE OF TECHNOLOGY AND SCIENCE (UGC-AUTONOMOUS) A Seminar Presentation On FACEBOOK [The Nuts and Bolts – Technology] By M.Koushik reddy 12691A0546 Under the guidance N.Sudhakar Yadav M.Tech Asst.professor

Upload: koushik-reddy

Post on 27-Jan-2017


TRANSCRIPT

Page 1: Facebook[The Nuts and Bolts Technology]

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
MADANAPALLE INSTITUTE OF TECHNOLOGY AND SCIENCE
(UGC-AUTONOMOUS)

A Seminar Presentation On

FACEBOOK [The Nuts and Bolts – Technology]

By
M. Koushik Reddy
12691A0546

Under the guidance of
N. Sudhakar Yadav, M.Tech
Asst. Professor

Page 2: Facebook[The Nuts and Bolts Technology]

Contents

• Introduction
• Languages
• Databases
• Software and technology

Page 3: Facebook[The Nuts and Bolts Technology]

So what's all the hype?
What exactly is Facebook®?

• Facebook® is a “social networking website”

• Facebook® is a free service that allows you to create an online page to connect with friends and family, or to make new friends anywhere.

• On your Facebook® page you can share pictures, personal information, messages, and videos, join groups, and add applications.

Page 4: Facebook[The Nuts and Bolts Technology]

Introduction

Here are a few figures that give an idea of the scaling challenge Facebook has to deal with:

• Facebook serves 570 billion page views per month (according to Google Ad Planner).
• There are more photos on Facebook than on all other photo sites combined (including sites like Flickr).
• More than 3 billion photos are uploaded every month.
• Facebook’s systems serve 1.2 million photos per second.
• More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
• Facebook has more than 30,000 servers (and this number is from last year!).

Page 5: Facebook[The Nuts and Bolts Technology]

Languages:

Front End (client side):
- JavaScript

Back End (server side):
- Hack, PHP (HHVM)
- C++, Java
- Python, Erlang
- D, XHP
- Haskell

Page 6: Facebook[The Nuts and Bolts Technology]

• JavaScript: (Front End)

- It is a high-level, dynamic, untyped, and interpreted programming language.
- It is supported by all modern web browsers without plug-ins.

Sample code:

    FB.getLoginStatus(function(response) {
      if (response.status === 'connected') {
        console.log('Logged in.');
      } else {
        FB.login();
      }
    });

Page 7: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Hack: Hack is a programming language for the HipHop Virtual Machine (HHVM), created by Facebook as a dialect of PHP.

• It is open source, licensed under the BSD License.
• Hack allows programmers to use both dynamic typing and static typing.
• It was introduced on March 20, 2014.

Sample code:

    <?hh
    echo 'Hello World';

• An important point: unlike PHP, Hack and HTML code do not mix. In PHP you can normally mix PHP and HTML code together in the same file.

Page 8: Facebook[The Nuts and Bolts Technology]

PHP VS HACK

• The two are closely related: Hack is a dialect of PHP, and both can run on HHVM behind a web server such as Apache.
• Hack adds functionality and features to PHP, such as static typing, and helps to clean up some of PHP's inconsistencies.

Page 9: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Erlang: It is a general-purpose, concurrent, garbage-collected programming language and runtime system.

• It was originally designed by Ericsson.
• It supports hot swapping, so code can be changed without stopping a system.
• It provides language-level features for creating and managing processes, with the aim of simplifying concurrent programming.
• All concurrency is explicit in Erlang: processes communicate using message passing instead of shared variables, which removes the need for explicit locks.
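The message-passing idea can be illustrated outside Erlang as well. The following is only a rough Python analogue, not Erlang code: a thread and two `queue.Queue` objects stand in for an Erlang process and its mailbox, and the names `worker` and `square_all` are made up for this example. The two sides share no variables and take no explicit locks; they only exchange messages.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive numbers from the inbox and reply with their squares."""
    while True:
        message = inbox.get()       # blocks until a message arrives
        if message is None:         # sentinel message: stop the worker
            break
        outbox.put(message * message)


def square_all(values: list) -> list:
    """Send each value to the worker as a message and collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)                 # ask the worker to shut down
    thread.join()
    # A single worker handles messages in FIFO order, so replies keep their order.
    return [outbox.get() for _ in values]
```

Erlang processes are far lighter than operating-system threads, so this sketch shows the communication style only, not Erlang's scalability.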

Page 10: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Continue…

Sample code: an Erlang function that uses recursion to count to ten

    -module(count_to_ten).
    -export([count_to_ten/0]).

    count_to_ten() -> do_count(0).

    do_count(10) -> 10;
    do_count(Value) -> do_count(Value + 1).

Page 11: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Erlang in Facebook: it is used mainly in Facebook chat.

Page 12: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Continue…

System overview:
User Interface - Chat in the browser

Page 13: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Continue…

System overview:

User Interface - Chat in the browser:
• Channel (Erlang): message queuing and delivery
  o Queues messages in each user’s “channel”
  o Delivers messages as responses to long-polling HTTP requests
• Presence (C++): aggregates online info in memory (pull-based presence)
• Chatlogger (C++): stores conversations between page loads
• Web tier (PHP): serves our vanilla web requests
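The channel behaviour described above (queue messages per user, hand them over when a long-polling request arrives) can be sketched in a few lines. This is a toy Python model under stated assumptions, not Facebook's Erlang implementation: `ChannelServer`, `send`, and `long_poll` are hypothetical names, and an in-process `queue.Queue` replaces the real HTTP transport.

```python
import queue
from collections import defaultdict


class ChannelServer:
    """Toy stand-in for the channel tier: one message queue per user."""

    def __init__(self):
        self._channels = defaultdict(queue.Queue)

    def send(self, user_id, message):
        """Queue a message in the recipient's channel."""
        self._channels[user_id].put(message)

    def long_poll(self, user_id, timeout=0.1):
        """Wait up to `timeout` seconds for a message.

        Returns every message queued so far, or an empty list if the
        wait times out (the client would then issue a new request).
        """
        channel = self._channels[user_id]
        try:
            messages = [channel.get(timeout=timeout)]
        except queue.Empty:
            return []
        while True:                 # drain anything else already queued
            try:
                messages.append(channel.get_nowait())
            except queue.Empty:
                return messages
```

The point of long polling is that the request stays open until there is something to deliver, so the browser receives messages almost immediately without polling in a tight loop.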

Page 14: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Haskell: Haskell is a standardized, general-purpose, purely functional programming language, with non-strict semantics and strong static typing.

Sample code: "Hello world" program

    module Main where

    main :: IO ()
    main = putStrLn "Hello, World!"

Page 15: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Haskell in Facebook: fighting spam with Haskell

Sigma:

One of the weapons in the fight against spam, malware, and other abuse on Facebook is a system called Sigma.

• Its job is to proactively identify malicious actions on Facebook, such as spam, phishing attacks, posting links to malware, etc.
• Bad content detected by Sigma is removed automatically so that it doesn't show up in your News Feed.
• Sigma is a rule engine, which means it runs a set of rules, called policies.
• These policies make it possible for Facebook to identify and block malicious interactions before they affect people on Facebook.
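What "a rule engine that runs a set of policies" means can be shown with a minimal sketch. It is written in Python rather than Haskell, and everything in it is invented for illustration: the `Policy` type, the `evaluate` function, the two example policies, and the blocked URL are assumptions, since the slides do not describe Sigma's actual policies or API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical blocklist; the URL is a placeholder, not real data.
BLOCKED_URLS = {"http://malware.example"}


@dataclass
class Policy:
    """A named rule: a predicate over an action (here a plain dict)."""
    name: str
    matches: Callable[[dict], bool]


def evaluate(action: dict, policies: List[Policy]) -> List[str]:
    """Run every policy against the action; return the names that fire."""
    return [policy.name for policy in policies if policy.matches(action)]


# Two made-up policies, only to show the shape of a rule set.
POLICIES = [
    Policy("known_bad_link",
           lambda a: any(url in BLOCKED_URLS for url in a.get("links", []))),
    Policy("too_many_links",
           lambda a: len(a.get("links", [])) > 5),
]
```

An action that triggers at least one policy would be blocked; keeping each policy a side-effect-free predicate is what makes a purely functional language a natural fit for this kind of engine.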

Page 16: Facebook[The Nuts and Bolts Technology]

Back End:(Server side)

Continue…

Why Haskell in Sigma?

• Sigma's policies were previously written in FXL (Feature eXtraction Language); Haskell replaced FXL.

Reasons for the replacement:
1. Purely functional and strongly typed.
2. Push code changes to production in minutes.
3. Performance.
4. Support for interactive development.

Page 17: Facebook[The Nuts and Bolts Technology]

Database

Which databases does Facebook actually use?

• About a billion people use Facebook; the site stores every transaction for 800 million users and handles more than 60 million queries per second.
• Users interact with their peers and friends through wall posts, photo uploads, and by sharing information about events and other meaningful information.
• Facebook uses several database technologies.

Databases used in Facebook:
• MySQL
• HBase
• Cassandra

Page 18: Facebook[The Nuts and Bolts Technology]

Databases

MySQL: Facebook primarily uses MySQL for structured data storage such as wall posts, user information, timeline, etc.

• This data is replicated between their various data centers.

Page 19: Facebook[The Nuts and Bolts Technology]

Facebook Database Design:

Page 20: Facebook[The Nuts and Bolts Technology]

Database

HBase: an open-source, non-relational, distributed database modeled after Google's BigTable and written in Java.

• It is developed as part of the Apache Software Foundation's Apache Hadoop project.
• It runs on top of HDFS (Hadoop Distributed File System), providing BigTable-like capabilities for Hadoop.
• HBase is now serving several data-driven websites, including Facebook's Messaging Platform.

Page 21: Facebook[The Nuts and Bolts Technology]

Hbase Architecture:

• In HBase, tables are split into regions and are served by the region servers.
• Regions are vertically divided by column families into “Stores”.
• Stores are saved as files in HDFS.

Page 22: Facebook[The Nuts and Bolts Technology]

Continue…

The Master Server:
• Assigns regions to the region servers, with the help of Apache ZooKeeper.
• Handles load balancing of the regions across region servers: it unloads busy servers and shifts regions to less occupied servers.
• Is responsible for operations such as the creation of tables and column families.

Regions:
• Regions are tables that have been split up and spread across the region servers.

ZooKeeper:
• ZooKeeper is an open-source project that provides services such as maintaining configuration information, naming, and distributed synchronization.
• Clients use ZooKeeper to locate region servers before communicating with them.
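How a table split into regions maps a row to a region server can be sketched as a sorted-range lookup. This Python model is an illustration only, assuming each region covers a contiguous range of sorted row keys; `RegionMap`, `server_for`, and the server names are hypothetical and are not part of the HBase client API.

```python
import bisect


class RegionMap:
    """Toy model: maps contiguous row-key ranges to region servers."""

    def __init__(self, split_keys, servers):
        # N sorted split keys divide the key space into N + 1 regions.
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server per region")
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split keys must be sorted")
        self._split_keys = list(split_keys)
        self._servers = list(servers)

    def server_for(self, row_key):
        """Return the server whose region's key range contains row_key."""
        # bisect_right counts the split keys <= row_key, which is exactly
        # the index of the region holding the key.
        index = bisect.bisect_right(self._split_keys, row_key)
        return self._servers[index]
```

With split keys "g" and "p", keys below "g" fall in the first region, keys from "g" up to (but not including) "p" in the second, and the rest in the third. Load balancing then amounts to changing which server a region is assigned to, without touching the key ranges.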

Page 23: Facebook[The Nuts and Bolts Technology]

HBase in Facebook Messaging

Messaging Data:

• Small/medium-sized data: HBase
  o Search index
  o Small message bodies
• Attachments and large messages: Haystack
  o Used for our existing photo/video store

Page 24: Facebook[The Nuts and Bolts Technology]

Continue….

Page 25: Facebook[The Nuts and Bolts Technology]

Continue…

Write path in HBase:

• In HBase, messages are stored in files (HFiles) in HDFS. A write is appended to a commit log in HDFS and buffered in memory before being flushed to an HFile.

Read path:

• Similarly, messages are read back from the in-memory buffer and the HFiles.

Page 26: Facebook[The Nuts and Bolts Technology]

Software And Techniques

The Front End:

• Linux & Apache
• Memcache
• Haystack
• BigPipe

The Back End:

• Thrift (protocol)
• Scribe (log server)
• HipHop for PHP

Page 27: Facebook[The Nuts and Bolts Technology]

Software And Techniques

The Front End:

Linux & Apache:

Linux is a Unix-like computer operating system kernel.

• It’s open source, very customizable, and good for security.
• Facebook runs Apache HTTP Server on the Linux operating system.
• Apache is also free and is the most popular open source web server in use.

Page 28: Facebook[The Nuts and Bolts Technology]

Software And Techniques

Memcache:

• Facebook makes heavy use of Memcached, a memory caching system that speeds up dynamic, database-driven websites by caching data and objects in RAM to reduce read time.
• Having a caching system allows Facebook to be as fast as it is at recalling your data.
• When the data is already cached, Facebook doesn't have to go to the database; it fetches your data from the cache based on your user ID.
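The lookup pattern described here is often called cache-aside. A minimal sketch, assuming plain Python dictionaries as stand-ins for both Memcached and the database (no real Memcached client is used, and `UserStore` is a made-up name):

```python
class UserStore:
    """Cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, database):
        self._database = database   # dict standing in for the MySQL tier
        self._cache = {}            # dict standing in for Memcached (RAM)
        self.database_reads = 0     # counter, only to make the effect visible

    def get_user(self, user_id):
        if user_id in self._cache:              # cache hit: no database access
            return self._cache[user_id]
        self.database_reads += 1                # cache miss: read the database
        record = self._database[user_id]
        self._cache[user_id] = record           # populate the cache for next time
        return record
```

Repeated reads of the same user ID touch the database only once. A real deployment also has to expire or invalidate cached entries when the underlying data changes, which this sketch leaves out.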

Page 29: Facebook[The Nuts and Bolts Technology]

Software And Techniques

Facebook Photos and Haystack:

• The Photos application is one of Facebook’s most popular features.
• Users have uploaded over 15 billion photos, which makes Facebook the biggest photo-sharing website.
• For each uploaded photo, Facebook generates and stores four images of different sizes, which translates to a total of 60 billion images and 1.5 PB of storage.
• The current growth rate is 220 million new photos per week, which translates to 25 TB of additional storage consumed weekly.
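The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted above and assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes); the average image sizes are derived values, not figures from the slides.

```python
# Figures quoted on the slide.
PHOTOS_UPLOADED = 15_000_000_000               # 15 billion photos
IMAGES_PER_PHOTO = 4                           # four sizes stored per photo
TOTAL_STORAGE_BYTES = 1_500_000_000_000_000    # 1.5 PB (decimal units assumed)
WEEKLY_PHOTOS = 220_000_000                    # 220 million new photos per week
WEEKLY_STORAGE_BYTES = 25_000_000_000_000      # 25 TB per week

# Derived values.
total_images = PHOTOS_UPLOADED * IMAGES_PER_PHOTO         # 60 billion images
avg_image_bytes = TOTAL_STORAGE_BYTES / total_images      # about 25 KB per image
weekly_images = WEEKLY_PHOTOS * IMAGES_PER_PHOTO          # 880 million images
avg_new_image_bytes = WEEKLY_STORAGE_BYTES / weekly_images  # about 28 KB per image
```

The totals are consistent with each other: both the overall store and the weekly growth work out to a few tens of kilobytes per stored image on average.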

Page 30: Facebook[The Nuts and Bolts Technology]

Haystack in Facebook

• Haystack is Facebook’s high-performance photo storage and retrieval system.
• It is a highly scalable object store used to serve Facebook’s immense number of photos.
• It implements an HTTP-based photo server that stores photos in a generic object store called Haystack.

Page 31: Facebook[The Nuts and Bolts Technology]

Software And Techniques

BigPipe: a dynamic web page serving system developed by Facebook.

• BigPipe is a fundamental redesign of the dynamic web page serving system.
• BigPipe breaks the page generation process into several stages.
• The first three stages are executed by the web server, and the last four stages are executed by the browser.
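The staged, pipelined idea can be sketched as a generator that sends the page in chunks. This is an illustration under assumptions, not Facebook's implementation: the slides do not say how a page is divided, so the sketch assumes it is built from independent sections (here called pagelets), and `render_page` and the client-side `onPageletArrive` function are hypothetical names.

```python
import json


def render_page(pagelets):
    """Yield HTML chunks: the page skeleton first, then one chunk per pagelet.

    `pagelets` is a list of (name, render_function) pairs. Each chunk can be
    flushed to the browser as soon as it is ready, so the browser can start
    working on early sections while the server still renders later ones.
    """
    placeholders = "".join(f'<div id="{name}"></div>' for name, _ in pagelets)
    yield f"<html><body>{placeholders}"
    for name, render in pagelets:
        payload = json.dumps({"id": name, "html": render()})
        # onPageletArrive is a hypothetical client-side function that would
        # insert the HTML into the placeholder with the matching id.
        yield f"<script>onPageletArrive({payload});</script>"
    yield "</body></html>"
```

Because the server yields chunks instead of building the whole page first, server-side generation and browser-side rendering overlap in time, which is the benefit of splitting the work into stages.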

Page 32: Facebook[The Nuts and Bolts Technology]

Questions?

Hope you enjoyed this presentation…