slides: a tour through the tensorflow codebase

92
Kevin Robinson @krob Teaching Systems Lab, MIT A tour through the TensorFlow codebase @ODSC OPEN DATA SCIENCE CONFERENCE Boston May 20-22nd 2016

Upload: vokhanh

Post on 14-Feb-2017

241 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Slides: A tour through the TensorFlow codebase

Kevin Robinson@krobTeaching Systems Lab, MIT

A tour through the TensorFlow codebase

@ODSC

OPEN DATA SCIENCE CONFERENCE

Boston May 20-22nd 2016

Page 2: Slides: A tour through the TensorFlow codebase

hello!

Kevin Robinson@krobTeaching Systems Lab, MIT

Page 3: Slides: A tour through the TensorFlow codebase
Page 4: Slides: A tour through the TensorFlow codebase
Page 7: Slides: A tour through the TensorFlow codebase

A tour of the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

Page 8: Slides: A tour through the TensorFlow codebase

A tour of the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing computation

2. Distributing computation

3. Executing computation

Page 9: Slides: A tour through the TensorFlow codebase

A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing graphs

Page 10: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing graphs core:graph, ops, protobuf

python:variables,optimizer

A tour through the TensorFlow codebase

Page 11: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

2. Distributing graphs

A tour through the TensorFlow codebase

Page 12: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

2. Distributing graphs

core: distributed_runtimecommon_runtime

A tour through the TensorFlow codebase

Page 13: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

3. Executing graphs

A tour through the TensorFlow codebase

Page 14: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

3. Executing graphs

A tour through the TensorFlow codebase

stream_executor: cuda

core: op_kernels

core: executor

Page 15: Slides: A tour through the TensorFlow codebase

4. And my favorite TODO

?

A tour through the TensorFlow codebase

Page 16: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing graphs core:graph, ops, protobuf

python:variables,optimizer

A tour through the TensorFlow codebase

Page 17: Slides: A tour through the TensorFlow codebase

Expressing: Graphs and Ops

Graph

Page 18: Slides: A tour through the TensorFlow codebase

OpsGraph

Expressing: Graphs and Ops

Page 19: Slides: A tour through the TensorFlow codebase

Expressing: Graphs and Ops

Page 20: Slides: A tour through the TensorFlow codebase

Expressing: Ops

Page 21: Slides: A tour through the TensorFlow codebase

Expressing: Ops

Page 23: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Page 24: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Page 25: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Graph is built implicitly session.py#L896

Page 27: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Graph is built implicitly session.py#L896

Variables add implicit ops variables.py#L146

In TensorBoard:

Page 28: Slides: A tour through the TensorFlow codebase

Optimizer fns extend the graph optimizer.py:minimize#L155

Expressing: Optimizers

Page 29: Slides: A tour through the TensorFlow codebase

Optimizer fns extend the graph optimizer.py:minimize#L155

Trainable variables collected variables.py#L258

Expressing: Optimizers

Page 30: Slides: A tour through the TensorFlow codebase

Optimizer fns extend the graph optimizer.py:minimize#L155

Trainable variables collected variables.py#L258

Graph is extended with gradients gradients.py#L307

Expressing: Optimizers

Page 31: Slides: A tour through the TensorFlow codebase

Serialized as GraphDef graph.proto

Expressing: Graph

Page 32: Slides: A tour through the TensorFlow codebase

Serialized as GraphDef graph.proto

Expressing: Graph

Page 33: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Serialized as GraphDef graph.proto

Page 34: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Serialized as GraphDef graph.proto

Page 35: Slides: A tour through the TensorFlow codebase

Expressing: Graph

Serialized as GraphDef graph.proto

Page 36: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing graphs core:graph, ops, protobuf

python:variables,optimizer

A tour through the TensorFlow codebase

Page 37: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

2. Distributing graphs

core: distributed_runtimecommon_runtime

A tour through the TensorFlow codebase

Page 38: Slides: A tour through the TensorFlow codebase

Distributing

- Sessions in distributed runtime

- Pruning

- Placing and Partitioning

Page 39: Slides: A tour through the TensorFlow codebase

Distributing: Creating a session

Page 52: Slides: A tour through the TensorFlow codebase

gRPC call to Session::Run in master_session.cc#L835

Distributing: Pruning

Page 53: Slides: A tour through the TensorFlow codebase

gRPC call to Session::Run in master_session.cc#L835

Distributing: Pruning

Page 56: Slides: A tour through the TensorFlow codebase

Constraints from model DeviceSpec in device.py#L24

Distributing: Placing

Page 57: Slides: A tour through the TensorFlow codebase

Constraints from model DeviceSpec in device.py#L24

By device or colocation NodeDef in graph.proto

Distributing: Placing

Page 58: Slides: A tour through the TensorFlow codebase

WorkerService/job:worker/task:0

WorkerService/job:worker/task:1

WorkerService/job:worker/task:2

WorkerService/job:worker/task:0

WorkerService/job:worker/task:2

WorkerService/job:worker/task:1

Placing based on constraints SimplePlacer::Run in simple_placer.cc#L558 described in simple_placer.h#L31

CPU GPU GPU CPU GPU GPU CPU GPU

Distributing: Placing

Page 65: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

2. Distributing the graph

core: distributed_runtimecommon_runtime

A tour through the TensorFlow codebase

Page 66: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

3. Executing the graph

stream_executor: cuda

core: op_kernels

A tour through the TensorFlow codebase

core: executor

Page 67: Slides: A tour through the TensorFlow codebase

Executing: Executor

Parallelism on each worker

WorkerService/job:worker/task:0 RunGraph(graph,feed,fetches) RecvTensor(rendezvous_key)

CPU GPU GPU GPU

Page 68: Slides: A tour through the TensorFlow codebase

Executing: Executor

Parallelism on each worker

WorkerService/job:worker/task:0 RunGraph(graph,feed,fetches) RecvTensor(rendezvous_key)

CPU GPU GPU GPU

Page 74: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

Conditional build for OpKernels

Page 76: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

OpKernels are specialized by device

adapted from matmul_op.cc#L116

Page 77: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

OpKernels are specialized by device

adapted from matmul_op.cc#L116

Page 78: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

OpKernels call into Stream functions

adapted from matmul_op.cc#L71

Page 79: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

OpKernels call into Stream functions

adapted from matmul_op.cc#L71

Page 80: Slides: A tour through the TensorFlow codebase

Executing: OpKernels

OpKernels call into Stream functions

adapted from matmul_op.cc#L71

Page 81: Slides: A tour through the TensorFlow codebase

Executing: Stream functions

OpKernels call into Stream functions

in conv_ops.cc#L292

Page 83: Slides: A tour through the TensorFlow codebase

Platforms provide GPU-specific implementations

cuBLAS BlasSupport in stream_executor/blas.h#L88 DoBlasInternal in cuda_blas.cc#L429

Executing: Stream functions

Page 85: Slides: A tour through the TensorFlow codebase

A tour through the TensorFlow codebase

Page 86: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

1. Expressing the graph core:graph, ops, protobuf

python:variables,optimizer

A tour through the TensorFlow codebase

Page 87: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

2. Distributing the graph

core: distributed_runtimecommon_runtime

A tour through the TensorFlow codebase

Page 88: Slides: A tour through the TensorFlow codebase

http://public.kevinrobinsonblog.com/tensorflow-codebase/

3. Executing the graph

A tour through the TensorFlow codebase

stream_executor: cuda

core: op_kernels

core: executor

Page 89: Slides: A tour through the TensorFlow codebase

4. And my favorite TODO

?

A tour through the TensorFlow codebase

Page 90: Slides: A tour through the TensorFlow codebase

4. And my favorite TODO

in tensor_c_api_test.cc

A tour through the TensorFlow codebase

Page 91: Slides: A tour through the TensorFlow codebase

4. And my favorite TODO

in tensor_c_api_test.cc

https://github.com/tensorflow/tensorflow

A tour through the TensorFlow codebase

Page 92: Slides: A tour through the TensorFlow codebase

thanks!

Kevin Robinson@krobTeaching Systems Lab, MIT