NetflixOSS Meetup Season 3 Episode 1

Season 3 Episode 1 Feb 11, 2015

Upload: ruslan-meshenberg

Post on 15-Jul-2015


Page 1: NetflixOSS Meetup season 3 episode 1

Season 3 Episode 1

Feb 11, 2015

Page 2: NetflixOSS Meetup season 3 episode 1

Ruslan Meshenberg - @rusmeshenberg

Introduction

Page 3: NetflixOSS Meetup season 3 episode 1

Agenda - Lightning Talks

● Eight new projects
  ○ Atlas
  ○ Prana
  ○ Raigad
  ○ Genie 2
  ○ Inviso
  ○ Dynomite
  ○ Nicobar
  ○ MSL
● One new way to eval
  ○ Zero To Docker
● Three community users
  ○ IBM Watson
  ○ Nike Digital
  ○ Pivotal

Page 4: NetflixOSS Meetup season 3 episode 1

Atlas

Roy Rapoport - @royrapoport

Page 5: NetflixOSS Meetup season 3 episode 1

In-House Telemetry? Inconceivable!

● Crowded OSS field!
  ○ Cacti, InfluxDB, OpenTSDB, Nagios, Icinga, NeDi, Zabbix, Observium, Sensu, Zenoss, OpenNMS, Bosun, Prometheus, etc.
● Not to mention commercial products
● Some shortcomings ...

Page 6: NetflixOSS Meetup season 3 episode 1

In-House Telemetry? Inconceivable!

● Agility Mismatch
● Cloud (and Netflix Ecosystem) Integration
● Multiple Data Sources
● Scale
● No, seriously. Scale
  ○ 2011: 10M/minute
  ○ ~2x Increase per quarter
  ○ Now up to 1.3B/minute

Page 7: NetflixOSS Meetup season 3 episode 1

If You Build It …

Page 8: NetflixOSS Meetup season 3 episode 1

Also …

● Decent UIs
● Alerting
  ○ And alert threshold analysis and recommendations
● Real-Time Analytics
● Integration with Hive and EMR
● Dashboard frameworks

Page 9: NetflixOSS Meetup season 3 episode 1

Also …

● Composable
● So we can change our minds later …

Page 10: NetflixOSS Meetup season 3 episode 1

For now …

● Query layer
● Back end

Page 11: NetflixOSS Meetup season 3 episode 1

Soon …

● Improved deployment
● Publish client
● Alerting
● Better UI

Page 12: NetflixOSS Meetup season 3 episode 1

Prana

Diptanu Choudhury - @diptanu

Page 13: NetflixOSS Meetup season 3 episode 1

Motivations

● The Netflix Platform stack is JVM-based
● Platform features are provided to developers via client libraries
  ○ Service Discovery
  ○ Client Side Load Balancing
  ○ Monitoring and alerting client libraries

Page 14: NetflixOSS Meetup season 3 episode 1

The Netflix Ecosystem

Page 15: NetflixOSS Meetup season 3 episode 1

Meet Prana

● Prana provides the same set of features to software that is not JVM-based or not built on the Netflix platform
● It allows applications to gel with the Netflix Ecosystem

Page 16: NetflixOSS Meetup season 3 episode 1

Prana Features

● Easy to use HTTP-based API
  ○ Load Balancing via Ribbon
  ○ Service discovery via Eureka client
  ○ Monitoring via Atlas Client
● Extensible via a plugin framework
● Highly Configurable

Page 17: NetflixOSS Meetup season 3 episode 1

Raigad

Sagar Loke - @sagar_loke

Page 18: NetflixOSS Meetup season 3 episode 1

Raigad - Motivation

● Elasticsearch sidecar: a co-process that runs alongside the ES process
● Helps to automate ES deployment
  ○ ~50 Clusters in test -- ~180TB data
  ○ ~45 Clusters in prod -- ~780TB data
● Node Discovery and Tracking
● Automatic Index Management
● Scheduled Backup and Restore
● Geared towards running in AWS Environment

Page 19: NetflixOSS Meetup season 3 episode 1

Auto ES Deployments

● Tunes the elasticsearch.yml file based on configuration parameters
● Multi-region support
● Currently follows dedicated Master-Data-Search deployment based on ASG names

Page 20: NetflixOSS Meetup season 3 episode 1

Node Discovery and Tracking

● Sample implementation using Cassandra
● C* keeps track of metadata information of ES clusters
● ES instance reads C* to discover other nodes during bootstrap
● Storing metadata in C* helps in multi-region deployments

Page 21: NetflixOSS Meetup season 3 episode 1

Auto Index Management

● Provides configuration properties for Auto Index Management
● Based on specific index date suffix (YYYYMMDD), old indices are cleaned and new indices are created
● Before running Index Manager

Page 22: NetflixOSS Meetup season 3 episode 1

Auto Index Management … continued

● After running Index Manager

● Index Manager job can be scheduled
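
The date-suffix rule from the Auto Index Management slides can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Raigad's implementation: the class and method names, the `logs` prefix, and the single retention-in-days setting are all hypothetical. It only shows how a YYYYMMDD suffix lets a scheduled job decide which indices are old enough to clean and what the next index should be called.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Raigad's API.
final class IndexRetentionSketch {

    // BASIC_ISO_DATE reads and writes dates as yyyyMMdd, e.g. 20150211.
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.BASIC_ISO_DATE;

    /** Name of the index for a given day: prefix followed by the date suffix. */
    static String indexNameFor(String prefix, LocalDate day) {
        return prefix + day.format(SUFFIX);
    }

    /** Indices named prefix + yyyyMMdd whose date is more than retentionDays before today. */
    static List<String> expiredIndices(List<String> names, String prefix,
                                       int retentionDays, LocalDate today) {
        LocalDate cutoff = today.minusDays(retentionDays);
        List<String> expired = new ArrayList<>();
        for (String name : names) {
            if (!name.startsWith(prefix)) {
                continue; // belongs to a different index family
            }
            try {
                LocalDate day = LocalDate.parse(name.substring(prefix.length()), SUFFIX);
                if (day.isBefore(cutoff)) {
                    expired.add(name);
                }
            } catch (DateTimeParseException e) {
                // No valid date suffix: leave the index alone.
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2015, 2, 11);
        List<String> names = Arrays.asList("logs20150101", "logs20150210", "logs-current");
        System.out.println("delete: " + expiredIndices(names, "logs", 7, today));
        System.out.println("create: " + indexNameFor("logs", today.plusDays(1)));
    }
}
```

Indices without a parseable date suffix are deliberately skipped, so a cleanup run never touches names it does not understand.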

Page 23: NetflixOSS Meetup season 3 episode 1

Running in AWS

● Automatic updates to Security Groups when new nodes are added or removed
● Supports IAM Credentials
● Scheduled Snapshot Backup to S3 -- uses elasticsearch-cloud-aws plugin
● Publish ES Metrics to Servo - Centralized Monitoring System

Page 24: NetflixOSS Meetup season 3 episode 1

Genie 2

Tom Gianos

Page 25: NetflixOSS Meetup season 3 episode 1

Our Current Architecture

[Architecture diagram. Recoverable labels: Data Warehouse; Prod VPC, Bonus, Query, Prod Test; Processing Clusters; Clients; Service; Tools; CLIs]

Page 26: NetflixOSS Meetup season 3 episode 1

Goals For Genie 2

● Develop a generic data model, which would let jobs run on any multi-tenant distributed processing cluster.
● Implement a flexible cluster and command selection algorithm for running a job.
● Provide richer API support.
● Implement a more flexible, extensible and robust codebase.

Page 27: NetflixOSS Meetup season 3 episode 1

{
  "user": "tgianos",
  "name": "PrestoJob.1421807841069",
  "commandArgs": "-f script.presto ",
  "clusterCriterias": [
    { "tags": [ "presto", "prod" ] },
    { "tags": [ "adhoc" ] }
  ],
  "commandCriteria": [ "presto" ],
  "tags": [ "headers", "presto", "BigDataPortal" ]
}
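
One way to read the `clusterCriterias` list in the job request above is as an ordered set of tag requirements. The sketch below assumes that the criteria are tried in order and that a cluster matches when it carries every tag of a criterion. That reading, the class name, and the cluster names are assumptions made for illustration; the slides do not spell out Genie 2's selection algorithm.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of tag-based cluster selection; not Genie's actual code.
final class ClusterSelectionSketch {

    /**
     * Walks the criteria in order and returns the first cluster whose tags
     * contain every tag of the criterion being tried.
     */
    static Optional<String> selectCluster(List<Set<String>> clusterCriterias,
                                          Map<String, Set<String>> tagsByCluster) {
        for (Set<String> required : clusterCriterias) {
            for (Map.Entry<String, Set<String>> cluster : tagsByCluster.entrySet()) {
                if (cluster.getValue().containsAll(required)) {
                    return Optional.of(cluster.getKey());
                }
            }
        }
        return Optional.empty();
    }

    /** Convenience factory for a tag set. */
    static Set<String> tags(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        // Cluster names and tags are made up for the example.
        Map<String, Set<String>> clusters = new LinkedHashMap<>();
        clusters.put("adhoc-cluster", tags("adhoc", "hadoop"));
        clusters.put("presto-prod-cluster", tags("presto", "prod"));

        List<Set<String>> criteria = Arrays.asList(tags("presto", "prod"), tags("adhoc"));
        System.out.println(selectCluster(criteria, clusters).orElse("no matching cluster"));
    }
}
```

Under this reading, the second criterion (`adhoc`) acts as a fallback that is only consulted when no cluster carries both `presto` and `prod`.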

Page 28: NetflixOSS Meetup season 3 episode 1

Our Current Deployment

● 19 i2.2xlarge nodes in prod cluster
  ○ Configured to allow room to scale up as needed
● 34 max jobs per node
● ~17,000 Jobs Per Day

Page 29: NetflixOSS Meetup season 3 episode 1

Inviso

Daniel Weeks

Page 30: NetflixOSS Meetup season 3 episode 1
Page 31: NetflixOSS Meetup season 3 episode 1
Page 32: NetflixOSS Meetup season 3 episode 1
Page 33: NetflixOSS Meetup season 3 episode 1

Dynomite

Minh Do - @timiblossom

Page 34: NetflixOSS Meetup season 3 episode 1

What is Dynomite?

● Dynamo layer on top of a non-distributed system (Redis/Memcached)
  ○ Peer-to-peer
  ○ Replication
  ○ Sharding
  ○ Gossiping
  ○ Multi-datacenter and rack awareness
  ○ Encryption
  ○ Linear scale

Page 35: NetflixOSS Meetup season 3 episode 1

Dynomite Node

Page 36: NetflixOSS Meetup season 3 episode 1

Network Topology

Page 37: NetflixOSS Meetup season 3 episode 1

Operation features

● Florida - sidecar application to manage Dynomite clusters (like Priam for Cassandra)
● Data backup (Redis only)
● Node replacement
● Data warm-up (Redis only)
● Client failover strategy - Dyno (our Java client)
● Atlas/Servo integration for operation metrics

Page 38: NetflixOSS Meetup season 3 episode 1

Incoming features

● Higher read/write consistencies
● Data reconciliation or data repair
● Other data stores besides Redis and Memcached
● Better/more generic warm-up method
● Spark driver integration
● and others

Page 39: NetflixOSS Meetup season 3 episode 1

Performance

● AWS:
  ○ 126 nodes total in us-east-1, us-west-2, eu-west-1
  ○ r3.xlarge
  ○ 1 KB data payload
● 250K Write RPS, 250K Read RPS
● Client observed latencies
  ○ average less than 1ms
  ○ 99th at ~1.5ms
  ○ 99.5th at ~2.5ms

Page 40: NetflixOSS Meetup season 3 episode 1

Nicobar

Dynamic Scripting Library for Java

Vasanth Asokan

Page 41: NetflixOSS Meetup season 3 episode 1

What is Nicobar?

Mainly, two things:

1. A Pluggable, Dynamic Scripting Framework for Java
   (powered by)
2. A Modular Classloading System

Page 42: NetflixOSS Meetup season 3 episode 1

Traditional Java Classloader Hierarchy

Page 43: NetflixOSS Meetup season 3 episode 1

Powered by JBoss Modules

Nicobar Module Classloader Hierarchy

Page 44: NetflixOSS Meetup season 3 episode 1

Putting it all together

Page 45: NetflixOSS Meetup season 3 episode 1

MSL

Mitch Zollinger

Page 46: NetflixOSS Meetup season 3 episode 1

What is MSL?

MSL = Message Security Layer

MSL is a modern security protocol which enables arbitrary application protocols to be secured over arbitrary transport protocols.

Page 47: NetflixOSS Meetup season 3 episode 1

Performance

● sub-second playback start
● MSL messaging stacked with app protocol
  ○ request can have: device authentication, user authentication, key exchange & application message
  ○ response can have: key exchange, authentication renewal, application message
● Netflix streaming should start faster than changing channels on your cable box!

Page 48: NetflixOSS Meetup season 3 episode 1

Reliability

● We need 4-5 “9s” of reliability
● MSL has automatic error recovery
● We had to remove reliance on third-party PKI
● Client time: not needed by MSL

Page 49: NetflixOSS Meetup season 3 episode 1

Modern Protocol Design

● Human readable JSON vs. complex binary format
  ○ ASN.1 security issues go away
● Multiple implementations: Java, JS, C#, …
  ○ JS: updatable in the field

Page 50: NetflixOSS Meetup season 3 episode 1

Flexibility

● Pluggable
  ○ authentication
  ○ crypto algorithms
● Standard porting API
  ○ Can use W3C WebCrypto, for example

Page 51: NetflixOSS Meetup season 3 episode 1

Deployment Models

● Trusted Services Network
  ○ All servers share a common master key, allowing the same level of trust across the network
● Peer-to-Peer
  ○ Every pair of entities shares connection-specific keys & credentials

Page 52: NetflixOSS Meetup season 3 episode 1

Security / Feature List

● encrypt / decrypt
● device authentication
● user authentication
● integrity protection
● key exchange
● anti-replay protection
● compression
● chunked messaging

Page 53: NetflixOSS Meetup season 3 episode 1

Zero To Docker

Andrew Spyker - @aspyker

Page 54: NetflixOSS Meetup season 3 episode 1

● Up and running in minutes

● Before - Documented technology that we expected you to assemble

● Now - Running technology that we assembled and validated

● Not - Production Ready. Examples only, not run this way at Netflix (security, HA, monitoring, etc.)

Netflix OSS on your laptop

[Diagram: a Docker host (e.g. VirtualBox on OS X) running Ubuntu 14.04 with a single kernel, hosting Container #1 (filesystem + process), a Eureka container, a Zuul container, and further containers]

Page 55: NetflixOSS Meetup season 3 episode 1

Trusted and Transparent Builds

● Start with Docker Hub registry
  ○ Pull images that you know were built securely
  ○ All you need if you just want to run them
● Inspect the linked GitHub Dockerfile
  ○ Want to know how NetflixOSS was configured?
  ○ Want to know how NetflixOSS code was built?
  ○ All code and configuration explicitly documented

Page 56: NetflixOSS Meetup season 3 episode 1

What is available?

From https://hub.docker.com/u/netflixoss/

● asgard
● eureka
● edda
● sketchy
● security monkey
● exhibitor
● sample karyon application
● zuul
● atlas

Page 57: NetflixOSS Meetup season 3 episode 1

Nike Digital

Alan Scherger - @flyinprogrammer

Page 58: NetflixOSS Meetup season 3 episode 1

Where we started...

> Datacenter 2 Cloud (AWS)
> Cloud native architecture using microservices
> Defined a Cloud Blueprint
> Pioneered a REST application bootstrap
> Maintain a boilerplate to define transitive dependencies.

Page 59: NetflixOSS Meetup season 3 episode 1

How do we do metrics across billions (or 100s) of microservices?

Page 60: NetflixOSS Meetup season 3 episode 1

● Instrumentation to JMX
● Graphite Observer to capture metrics

Page 61: NetflixOSS Meetup season 3 episode 1

Observer Modifications

● Use Eureka to find a Graphite node

● Use a health check to time out the TCP socket

Page 62: NetflixOSS Meetup season 3 episode 1

How are we going to store all of these metrics?

Page 63: NetflixOSS Meetup season 3 episode 1

Cyanite
https://github.com/pyr/cyanite

● Graphite Carbon compliant
● Cassandra metric storage
● Elasticsearch metric search
● C* and ES cross-region replication enable a global view of the metrics

Page 64: NetflixOSS Meetup season 3 episode 1

How do we make these tools Blueprint compliant?

Page 65: NetflixOSS Meetup season 3 episode 1

Sidecars to the rescue!

Page 66: NetflixOSS Meetup season 3 episode 1

Priam + Cassandra = Done

Page 67: NetflixOSS Meetup season 3 episode 1

Raigad + Elasticsearch ; Prana + Cyanite

Page 68: NetflixOSS Meetup season 3 episode 1

So those didn’t exist - sour.

Page 69: NetflixOSS Meetup season 3 episode 1

Generic Sidecar

● Application daemon

● Convention over configuration groovy scripts.

Page 70: NetflixOSS Meetup season 3 episode 1

configure.groovy

Generate 3 config files off eureka data.

Page 71: NetflixOSS Meetup season 3 episode 1

Add the ingredients that produce code.

Page 72: NetflixOSS Meetup season 3 episode 1

Atlas Jr.

Page 73: NetflixOSS Meetup season 3 episode 1
Page 74: NetflixOSS Meetup season 3 episode 1

IBM Watson Developer Cloud

Kevin Haverlock
Aroop Pandya
Kelly Abuelsaad
Diamond

Page 75: NetflixOSS Meetup season 3 episode 1
Page 77: NetflixOSS Meetup season 3 episode 1

Visual Recognition: Image/video recognition and classification service to provide assessment of a user from their images

Extract information from text: People, Organizations, Locations, Events, and the relationships between them

User Modeling: Improved understanding of people's preferences to help engage users on their own terms

Language Identification

Machine Translation

Concept Expansion

Message Resonance

Question and Answer

Relationship Extraction

Text to Speech: The conversion of text to an outputted audio stream

Speech to Text: Converts speech into text

Tradeoff Analytics: Helps people make better choices while taking into account multiple, often conflicting, goals that matter

Concept Analytics: Links documents that you provide with a pre-existing graph of concepts

Page 78: NetflixOSS Meetup season 3 episode 1


Page 79: NetflixOSS Meetup season 3 episode 1


Page 80: NetflixOSS Meetup season 3 episode 1
Page 81: NetflixOSS Meetup season 3 episode 1

Pivotal

“Bootiful” microservices with Spring Cloud & Netflix OSS

Joshua Long - @starbuxman

Page 82: NetflixOSS Meetup season 3 episode 1

@Grab("spring-boot-starter-actuator")
@RestController
class GreetingsController {

    // Groovy returns the last expression: a map rendered as the JSON response.
    @RequestMapping("/hi/{name}")
    def hi(@PathVariable String name) {
        [greeting: "Hello, " + name + "!"]
    }
}

> spring run greeting.groovy
> spring jar greeting.jar greeting.groovy

Page 83: NetflixOSS Meetup season 3 episode 1
Page 84: NetflixOSS Meetup season 3 episode 1

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.config.server.EnableConfigServer;

@SpringBootApplication
@EnableConfigServer
public class ConfigurationServerApplication {

    public static void main(String[] args) throws Exception {
        SpringApplication.run(ConfigurationServerApplication.class, args);
    }
}

spring:
  cloud:
    config:
      server:
        uri: ${MY_CONF:https://github.com/some/git-repository}

@Value("${some.property}")
private String someProperty;

Page 85: NetflixOSS Meetup season 3 episode 1

@SpringBootApplication
@EnableEurekaClient
public class DogeApplication {
    // …
}

# src/main/resources/bootstrap.yml
spring:
  application:
    name: doge-service

Page 86: NetflixOSS Meetup season 3 episode 1

@Component
class ReliableClient {

    // The fallback named here must be a method of this class with a compatible signature.
    @HystrixCommand(fallbackMethod = "defaultDogeLink")
    public Link buildDogeLink() {
        // insert volatile service-to-service call here
    }
}

@SpringBootApplication
@EnableHystrixDashboard
public class HystrixApplication {

    public static void main(String[] args) {
        SpringApplication.run(HystrixApplication.class, args);
    }
}
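
The `fallbackMethod` attribute on the `@HystrixCommand` slide above names a method to call when the primary call fails. The plain-Java sketch below illustrates only that fallback idea. It does not use Hystrix, and the names `FallbackSketch` and `callWithFallback` are hypothetical. Hystrix itself adds timeouts, isolation and circuit breaking on top, none of which is modelled here.

```java
import java.util.function.Supplier;

// Plain-Java illustration of the fallback idea only; it does not use Hystrix.
final class FallbackSketch {

    /** Runs the primary call and returns the fallback's value if the call throws. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // The failing supplier stands in for a volatile service-to-service call.
        String link = callWithFallback(
                () -> { throw new IllegalStateException("doge-service unavailable"); },
                () -> "/doges/default");
        System.out.println(link);
    }
}
```

The caller receives a usable default instead of an exception, which is the behaviour the annotation is meant to provide for `buildDogeLink()`.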

Page 87: NetflixOSS Meetup season 3 episode 1

zuul:
  proxy:
    mapping: /api
    addProxyHeaders: true
    route:
      account-service: /accounts
      doge-service: /doges

Resulting proxied paths:

/api/accounts
/api/doges

Page 88: NetflixOSS Meetup season 3 episode 1

spring:
  oauth2:
    client:
      clientId: acme
      clientSecret: acmesecret
    resource:
      tokenInfoUri: http://localhost:8002/auth/oauth/check_token
      id: openid
      serviceId: resource

@SpringBootApplication
@RestController
@EnableOAuth2Resource
public class SsoResourceApplication {

    public static void main(String[] args) {
        SpringApplication.run(SsoResourceApplication.class, args);
    }

    @RequestMapping("/hi")
    String hi(@RequestParam Optional<String> name) {
        return "Hello" + name.map(n -> ", " + n).orElse("") + "! ";
    }
}

Page 89: NetflixOSS Meetup season 3 episode 1

Josh Long (龙之春)
@starbuxman
@springcentral
github.com/joshlong

References
spring.io/guides
github.com/spring-cloud/
github.com/spring-cloud-samples/
github.com/joshlong/spring-doge
github.com/joshlong/spring-doge-microservice
docs.spring.io/spring-boot/

Questions?

Page 90: NetflixOSS Meetup season 3 episode 1

Please join us next door for mingling, drinks and food!

@netflixoss

Thank you!