Post on 27-Dec-2015

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Bart J.F. De Smet bartde@microsoft.com  Software Development Engineer Microsoft Corporation Session Code: DTL206 Wishful
Page 2:

The Manycore Shift: Making Parallel Computing Mainstream

Bart J.F. De Smet
bartde@microsoft.com
http://blogs.bartdesmet.net/bart
Software Development Engineer
Microsoft Corporation
Session Code: DTL206

Wishful thinking?

Page 3:

Agenda

The concurrency landscape
Language headaches
.NET 4.0 facilities
  Task Parallel Library
  PLINQ
  Coordination Data Structures
  Asynchronous programming
Incubation projects
Summary

Page 4:

Moore’s law

The number of transistors incorporated in a chip will approximately double every 24 months.

Gordon Moore – Intel – 1965

Let’s sell processors

Page 5:

Moore’s law today

It can't continue forever. The nature of exponentials is that you push them out and eventually disaster happens.

Gordon Moore – Intel – 2005

Let’s sell even more processors

Page 6:

Problem statement

Shared mutable state
  Needs synchronization primitives
  Locks are problematic
    Risk of contention
    Poor discoverability (SyncRoot anyone?)
    Not composable
    Difficult to get right (deadlocks, etc.)
Coarse-grained concurrency
  Threads well-suited for large units of work
  Expensive context switching
Asynchronous programming

Page 7:

Microsoft Parallel Computing Initiative

[Diagram: layered stack, top to bottom]
• Applications
• Domain libraries
• Programming models & languages (VB, C#, F#)
• Developer Tooling
• Runtime, platform, OS, HyperVisor
• Hardware

[Diagram annotations]
• Constructing Parallel Applications
• Executing fine-grain Parallel Applications
• Coordinating system resources/services

Page 8:

Agenda

The concurrency landscape
Language headaches
.NET 4.0 facilities
  Task Parallel Library
  PLINQ
  Coordination Data Structures
  Asynchronous programming
Incubation projects
Summary

Page 9:

Languages: two extremes

No mutable state: LISP heritage (Haskell, ML), fundamentalist functional programming
Mutable state: Fortran heritage (C, C++, C#, VB)
F#

Page 10:

Mutability

Mutable by default (C# et al): synchronization required

    int x = 5;
    // Share out x
    x++;

Immutable by default (F# et al): no locking required

    let x = 5
    // Share out x
    // Can’t mutate x

Explicit opt-in:

    let mutable x = 5
    // Share out x
    x <- x + 1

Page 11:

Side-effects will kill you

Elimination of common sub-expressions?

    let now = DateTime.Now in (now, now)
    versus
    (DateTime.Now, DateTime.Now)

    static DateTime Now { get; }

Runtime out of control
  Can’t optimize code
  Types don’t reveal side-effects
  Haskell concept of IO monad

Did you know? LINQ is a monad!
Source: www.cse.chalmers.se
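The point about common sub-expressions can be made concrete with a small C# sketch. The `Tick` property below is a hypothetical stand-in for `DateTime.Now` (an assumption for illustration, chosen so the result is deterministic): its type says `int`, but every read has a hidden side effect, so sharing one read is not equivalent to reading twice.

```csharp
using System;

public static class SideEffectDemo
{
    private static int _ticks;

    // Impure property (hypothetical): each read advances a hidden counter,
    // much like DateTime.Now observes a moving clock.
    public static int Tick
    {
        get { return ++_ticks; }
    }

    // Reads the property twice: the two components differ.
    public static Tuple<int, int> ReadTwice()
    {
        int first = Tick;
        int second = Tick;
        return Tuple.Create(first, second);
    }

    // "Optimized" variant that shares a single read: both components are equal.
    public static Tuple<int, int> ReadOnce()
    {
        int now = Tick;
        return Tuple.Create(now, now);
    }
}
```

Because nothing in the signature `int Tick { get; }` reveals the side effect, a compiler cannot safely rewrite `ReadTwice` into `ReadOnce`.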

Page 12:

Languages: two roadmaps?

Making C# better
  Add safety nets?
    Immutability
    Purity constructs
    Linear types
  Software Transactional Memory
    Kamikaze-style of concurrency
  Simplify common patterns
Making Haskell mainstream
  Just right? Too academic?
  Not a smooth upgrade path?

[Diagram labels: C#, Haskell, Nirvana]

Page 13:

Agenda

The concurrency landscape
Language headaches
.NET 4.0 facilities
  Task Parallel Library
  PLINQ
  Coordination Data Structures
  Asynchronous programming
Incubation projects
Summary

Page 14:

Parallel Extensions Architecture

[Architecture diagram; recoverable labels follow]

.NET Program
• Declarative Queries (PLINQ)
• Parallel Algorithms (TPL or CDS)

Compilers (producing IL)
• C# Compiler
• VB Compiler
• C++ Compiler
• F# Compiler
• Other .NET Compiler

PLINQ Execution Engine
• Query Analysis
• Data Partitioning: Chunk, Range, Hash, Striped, Repartitioning
• Operator Types: Map, Scan, Build, Search, Reduction
• Merging: Async (pipeline), Synch, Order Preserving, Sorting, ForAll

Task Parallel Library (TPL)
• Task APIs
• Task Parallelism
• Futures
• Scheduling

Coordination Data Structures
• Thread-safe Collections
• Synchronization Types
• Coordination Types

OS Scheduling Primitives (also UMS in Windows 7 and up)

Proc 1 … Proc p

Page 15:

Task Parallel Library – Tasks

System.Threading.Tasks
Task
  Parent-child relationships
  Explicit grouping
  Waiting and cancellation
Task<T>
  Tasks that produce values
  Also known as futures

[Diagram labels: Parallel; Task 1, Task 2, … Task N]
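As a minimal sketch of the task concepts listed on this slide, the following C# uses `Task<int>` as a future, waits on several tasks, and attaches a child task to its parent. The class and method names are illustrative assumptions; the API calls (`Task.Factory.StartNew`, `Task.WaitAll`, `TaskCreationOptions.AttachedToParent`) are those of the released .NET 4.0 library, which may differ in detail from the beta the talk was based on.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskDemo
{
    // Two futures (Task<int>) computed concurrently, then combined.
    public static int SumOfSquaresAndCubes(int n)
    {
        Task<int> squares = Task.Factory.StartNew(() =>
        {
            int sum = 0;
            for (int i = 1; i <= n; i++) sum += i * i;
            return sum;
        });
        Task<int> cubes = Task.Factory.StartNew(() =>
        {
            int sum = 0;
            for (int i = 1; i <= n; i++) sum += i * i * i;
            return sum;
        });

        // Waiting: block until both tasks have finished.
        Task.WaitAll(squares, cubes);
        return squares.Result + cubes.Result;
    }

    // Parent-child relationship: waiting on the parent also waits
    // for the child that was attached to it.
    public static bool ParentWaitsForChild()
    {
        bool childRan = false;
        Task parent = Task.Factory.StartNew(() =>
        {
            Task.Factory.StartNew(
                () => { childRan = true; },
                TaskCreationOptions.AttachedToParent);
        });
        parent.Wait();
        return childRan;
    }
}
```

Reading `Result` on an unfinished `Task<T>` would block as well; the explicit `WaitAll` is there only to show the waiting API.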

Page 16:

Work Stealing

Internally, the runtime uses
  Work stealing techniques
  Lock-free concurrent task queues
Work stealing has provably
  Good locality
  Work distribution properties

[Diagram: processors p1, p2, p3 with queued tasks 1 to 4]

Page 17:

Example code to parallelize

    void MultiplyMatrices(int size, double[,] m1, double[,] m2, double[,] result)
    {
        for (int i = 0; i < size; i++)
        {
            for (int j = 0; j < size; j++)
            {
                result[i, j] = 0;
                for (int k = 0; k < size; k++)
                {
                    result[i, j] += m1[i, k] * m2[k, j];
                }
            }
        }
    }

Page 18:

Solution today

    int N = size;
    int P = 2 * Environment.ProcessorCount;
    int Chunk = N / P;                                   // size of a work chunk
    ManualResetEvent signal = new ManualResetEvent(false);
    int counter = P;                                     // counter limits kernel transitions
    for (int c = 0; c < P; c++)                          // for each chunk
    {
        ThreadPool.QueueUserWorkItem(o =>
        {
            int lc = (int)o;
            for (int i = lc * Chunk;                     // process one chunk
                 i < (lc + 1 == P ? N : (lc + 1) * Chunk);  // respect upper bound
                 i++)
            {
                // original loop body
                for (int j = 0; j < size; j++)
                {
                    result[i, j] = 0;
                    for (int k = 0; k < size; k++)
                    {
                        result[i, j] += m1[i, k] * m2[k, j];
                    }
                }
            }
            if (Interlocked.Decrement(ref counter) == 0)  // efficient interlocked ops
            {
                signal.Set();                            // and kernel transition only when done
            }
        }, c);
    }
    signal.WaitOne();

Callouts:
  Error Prone
  High Overhead
  Tricks
  Static Work Distribution
  Knowledge of Synchronization Primitives
  Heavy Synchronization
  Lack of Thread Reuse

Page 19:

Solution with Parallel Extensions

    void MultiplyMatrices(int size, double[,] m1, double[,] m2, double[,] result)
    {
        Parallel.For(0, size, i =>
        {
            for (int j = 0; j < size; j++)
            {
                result[i, j] = 0;
                for (int k = 0; k < size; k++)
                {
                    result[i, j] += m1[i, k] * m2[k, j];
                }
            }
        });
    }

Structured parallelism

Page 20:

Task Parallel Library – Loops

Common source of work in programs
System.Threading.Parallel class (System.Threading.Tasks.Parallel in the released .NET 4.0)
Parallelism when iterations are independent
  Body doesn’t depend on mutable state
  E.g. static variables, writing to local variables used in subsequent iterations
Synchronous
  All iterations finish, regularly or exceptionally

Sequential:

    for (int i = 0; i < n; i++) work(i);
    …
    foreach (T e in data) work(e);

Parallel:

    Parallel.For(0, n, i => work(i));
    …
    Parallel.ForEach(data, e => work(e));

Callout: Why immutability gains attention
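The independence requirement can be illustrated with a short C# sketch. The class and method names are assumptions for illustration; the loop body writes only to the slot owned by its own iteration, so no synchronization is needed, and `Parallel.For` returns only after all iterations have finished.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Sequential baseline.
    public static long[] SquaresSequential(int n)
    {
        var result = new long[n];
        for (int i = 0; i < n; i++)
        {
            result[i] = (long)i * i;
        }
        return result;
    }

    // Parallel version: iteration i touches only result[i],
    // so iterations do not depend on shared mutable state.
    public static long[] SquaresParallel(int n)
    {
        var result = new long[n];
        Parallel.For(0, n, i =>
        {
            result[i] = (long)i * i;
        });
        // Parallel.For is synchronous: all iterations are done here.
        return result;
    }
}
```

A body such as `sum += i` would break the independence rule, because every iteration would write to the same variable.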

Page 21:

Demo: Task Parallel Library

Page 22:

Amdahl’s law by example

[Chart: total execution time (axis 0 to 120) versus number of processors (1, 2, 4, 8, 16); series: Non-linear, Linear]

Theoretical maximum speedup is determined by the amount of linear (sequential) code
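For reference, the usual statement of Amdahl's law is given below. The symbols are not on the slide and are introduced here as assumptions: p is the fraction of the running time that can be parallelized and N is the number of processors. The numbers in the example are illustrative, not taken from the chart.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}

% Illustrative example: p = 0.9, N = 16
S(16) = \frac{1}{0.1 + 0.9/16} = \frac{1}{0.15625} = 6.4,
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{0.1} = 10
```

Even with 90% of the work parallelizable, 16 processors give a speedup of 6.4, and no number of processors can exceed 10.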

Page 23:

Performance Tips

Compute intensive and/or large data sets
  Work done should be at least 1,000s of cycles
Do not be gratuitous in task creation
  Lightweight, but still requires object allocation, etc.
Parallelize only outer loops where possible
  Unless N is insufficiently large to offer enough parallelism
Prefer isolation & immutability over synchronization
  Synchronization == !Scalable
  Try to avoid shared data
Have realistic expectations
  Amdahl’s Law
    Speedup will be fundamentally limited by the amount of sequential computation
  Gustafson’s Law
    But what if you add more data, thus increasing the parallelizable percentage of the application?
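Gustafson's question has a standard formulation as scaled speedup. The symbols are assumptions introduced here, not taken from the slide: p is the fraction of the running time spent in parallel code when the enlarged problem runs on N processors.

```latex
S(N) = (1 - p) + p\,N = N - (1 - p)(N - 1)

% Illustrative example: p = 0.9, N = 16
S(16) = 0.1 + 0.9 \cdot 16 = 14.5
```

Where a fixed-size problem saturates once the sequential part dominates, the scaled speedup keeps growing roughly linearly in N, because the extra data enlarges only the parallel part of the work.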

Page 24:

Parallel LINQ (PLINQ)

Enable LINQ developers to leverage parallel hardware
  Fully supports all .NET Standard Query Operators
  Abstracts away the hard work of using parallelism
    Partitions and merges data intelligently (classic data parallelism)
Minimal impact to existing LINQ programming model
  AsParallel extension method
  Optional preservation of input ordering (AsOrdered)
Query syntax enables runtime to auto-parallelize
  Automatic way to generate more Tasks, like Parallel
  Graph analysis determines how to do it
  Very little synchronization internally: highly efficient

    var q = from p in people
            where p.Name == queryInfo.Name &&
                  p.State == queryInfo.State &&
                  p.Year >= yearStart &&
                  p.Year <= yearEnd
            orderby p.Year ascending
            select p;

[Callout: appending .AsParallel() to the data source (people.AsParallel()) lets the query run as Task 1 … Task N]
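A minimal, self-contained sketch of the two extension methods named above follows. The `Person` type, the method names, and the sample filter are assumptions for illustration and are simpler than the query on the slide; `AsParallel`, `AsOrdered`, and the standard query operators are the released .NET 4.0 PLINQ API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqDemo
{
    // Hypothetical record type for the example.
    public sealed class Person
    {
        public string Name { get; set; }
        public int Year { get; set; }
    }

    // Same query shape as a LINQ to Objects query; only the
    // AsParallel() call on the source opts in to PLINQ.
    public static List<string> NamesBornBetween(
        IEnumerable<Person> people, int yearStart, int yearEnd)
    {
        var q = from p in people.AsParallel()
                where p.Year >= yearStart && p.Year <= yearEnd
                orderby p.Year ascending
                select p.Name;
        return q.ToList();
    }

    // AsOrdered() asks PLINQ to preserve the input order in the output.
    public static int[] EvenNumbersInInputOrder(int[] numbers)
    {
        return numbers.AsParallel()
                      .AsOrdered()
                      .Where(x => x % 2 == 0)
                      .ToArray();
    }
}
```

Without `AsOrdered()` (or an `orderby`), PLINQ is free to return matching elements in any order, which is usually cheaper.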

Page 25:

Demo: PLINQ

Page 26:

Coordination Data Structures

New synchronization primitives (System.Threading)
Barrier
  Multi-phased algorithm
  Tasks signal and wait for phases
CountdownEvent
  Has an initial counter value
  Gets signaled when count reaches zero
LazyInitializer
  Lazy initialization routines
  Reference type variable gets initialized lazily
SemaphoreSlim
  Slim brother to Semaphore (which goes to kernel mode)
SpinLock, SpinWait
  Loop-based wait (“spinning”)
  Avoids context switch or kernel mode transition
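As a small sketch of one of these primitives, `CountdownEvent` can replace the hand-written pairing of a counter and a `ManualResetEvent` for waiting on a known number of work items. The class and method names are assumptions for illustration.

```csharp
using System;
using System.Threading;

public static class CountdownDemo
{
    // Queues the given number of work items and waits for all of them.
    // Returns how many work items actually ran.
    public static int RunWorkers(int workers)
    {
        int completed = 0;
        using (var done = new CountdownEvent(workers))   // initial counter value
        {
            for (int w = 0; w < workers; w++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed);
                    done.Signal();                       // decrement the counter
                });
            }
            done.Wait();                                 // signaled when the count reaches zero
        }
        return completed;
    }
}
```

The counting and signaling logic lives inside the primitive, so the caller only states how many signals to expect.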

Page 27:

Coordination Data Structures

Concurrent collections (System.Collections.Concurrent)
BlockingCollection<T>
  Producer/consumer scenarios
  Blocks when no data is available (consumer)
  Blocks when no space is available (producer)
ConcurrentBag<T>
ConcurrentDictionary<TKey, TValue>
ConcurrentQueue<T>, ConcurrentStack<T>
  Thread-safe and scalable collections
  As lock-free as possible
Partitioner<T>
  Facilities to partition data in chunks
  E.g. PLINQ partitioning problems
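The producer/consumer behaviour of `BlockingCollection<T>` can be sketched as follows. The names and the capacity of 4 are arbitrary assumptions for illustration: the producer blocks whenever the bounded collection is full, and the consumer blocks whenever it is empty, until the producer calls `CompleteAdding`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    // Sends the numbers 1..count through a bounded queue and sums them
    // on the consuming side.
    public static int SumThroughQueue(int count)
    {
        using (var queue = new BlockingCollection<int>(boundedCapacity: 4))
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 1; i <= count; i++)
                {
                    queue.Add(i);            // blocks while the queue is full
                }
                queue.CompleteAdding();      // tells the consumer no more items follow
            });

            int sum = 0;
            // Blocks while the queue is empty; ends after CompleteAdding.
            foreach (int item in queue.GetConsumingEnumerable())
            {
                sum += item;
            }

            producer.Wait();
            return sum;
        }
    }
}
```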

Page 28:

Demo: Coordination Data Structures

Page 29:

Asynchronous workflows in F#

Language feature unique to F#
Based on theory of monads
  But much more exhaustive compared to LINQ…
  Overloadable meaning for specific keywords
Continuation passing style
  Not: 'a -> 'b
  But: 'a -> ('b -> unit) -> unit
  In C# style: Action<T, Action<R>>
    Callout: Function takes computation result
Core concept: async { /* code */ }
  Syntactic sugar for keywords inside block
  E.g. let!, do!, use!
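The continuation passing shape `Action<T, Action<R>>` can be sketched in plain C#. All names below are assumptions for illustration: instead of returning a value, each function hands its result to a callback that represents the rest of the computation.

```csharp
using System;

public static class CpsDemo
{
    // Direct style: 'a -> 'b
    public static int Square(int x)
    {
        return x * x;
    }

    // Continuation passing style: 'a -> ('b -> unit) -> unit
    public static void SquareCps(int x, Action<int> continuation)
    {
        continuation(x * x);
    }

    // Composing CPS functions: the lambda is "what to do next"
    // once the squared value is available.
    public static void SquareThenAddCps(int x, int y, Action<int> continuation)
    {
        SquareCps(x, squared => continuation(squared + y));
    }

    // The same function as a delegate value of the shape shown on the slide.
    public static readonly Action<int, Action<int>> SquareAsAction = SquareCps;
}
```

Here the callback is invoked immediately, but nothing in the shape requires that: an asynchronous implementation could invoke it later, which is what the async block automates.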

Page 30:

Asynchronous workflows in F#

    let processAsync i =
        async {
            use stream = File.OpenRead(sprintf "Image%d.tmp" i)
            let! pixels = stream.AsyncRead(numPixels)
            let pixels' = transform pixels i
            use out = File.OpenWrite(sprintf "Image%d.done" i)
            do! out.AsyncWrite(pixels')
        }

    let processAsyncDemo =
        printfn "async demo..."
        let tasks = [ for i in 1 .. numImages -> processAsync i ]
        Async.RunSynchronously (Async.Parallel tasks) |> ignore
        printfn "Done!"

Callout (on Async.Parallel): Run tasks in parallel

Callout (on let!), pseudocode for the continuation it introduces:

    stream.Read(numPixels, pixels ->
        let pixels' = transform pixels i
        use out = File.OpenWrite(sprintf "Image%d.done" i)
        do! out.AsyncWrite(pixels'))

Page 31:

Demo: Asynchronous workflows in F#

Page 32:

Reactive Fx

First-class events in .NET
Dualism of IEnumerable<T> interface
  IObservable<T>
Pull versus push
  Pull (active): IEnumerable<T> and foreach
  Push (passive): raise events and event handlers
Events based on functions
  Composition at its best
  Definition of operators: LINQ to Events
Realization of the continuation monad

Page 33:

IObservable<T> and IObserver<T>

    // Dual of IEnumerable<out T>
    public interface IObservable<out T>
    {
        IDisposable Subscribe(IObserver<T> observer);
    }

    // Dual of IEnumerator<out T>
    public interface IObserver<in T>
    {
        // IEnumerator<T>.MoveNext return value
        void OnCompleted();

        // IEnumerator<T>.MoveNext exceptional return
        void OnError(Exception error);

        // IEnumerator<T>.Current property
        void OnNext(T value);
    }

Callouts:
  Way to unsubscribe (the IDisposable returned by Subscribe)
  Signaling the last event (OnCompleted)
  Virtually two return types
  Contra-variance (in T)
  Co-variance (out T)
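To show how the two interfaces fit together, here is a minimal hand-written source and observer. `RangeObservable`, `CollectingObserver`, and `Collect` are hypothetical types written for this sketch, not part of Reactive Fx; only `IObservable<T>` and `IObserver<T>` come from the .NET 4.0 base class library.

```csharp
using System;
using System.Collections.Generic;

public static class ObservableDemo
{
    // Push-based counterpart of Enumerable.Range: the source calls
    // the observer instead of the consumer pulling values.
    public sealed class RangeObservable : IObservable<int>
    {
        private readonly int _start;
        private readonly int _count;

        public RangeObservable(int start, int count)
        {
            _start = start;
            _count = count;
        }

        public IDisposable Subscribe(IObserver<int> observer)
        {
            for (int i = 0; i < _count; i++)
            {
                observer.OnNext(_start + i);   // dual of reading Current
            }
            observer.OnCompleted();            // dual of MoveNext returning false
            return new NoopDisposable();       // nothing left to unsubscribe from
        }
    }

    private sealed class NoopDisposable : IDisposable
    {
        public void Dispose() { }
    }

    // Observer that records everything it is told.
    public sealed class CollectingObserver : IObserver<int>
    {
        public readonly List<int> Values = new List<int>();
        public bool Completed { get; private set; }
        public Exception Error { get; private set; }

        public void OnNext(int value) { Values.Add(value); }
        public void OnError(Exception error) { Error = error; }
        public void OnCompleted() { Completed = true; }
    }

    public static CollectingObserver Collect(IObservable<int> source)
    {
        var observer = new CollectingObserver();
        using (source.Subscribe(observer))
        {
            // This source pushes synchronously inside Subscribe.
        }
        return observer;
    }
}
```

A real event source would typically return from `Subscribe` immediately and push values later, which is where the returned `IDisposable` becomes the way to unsubscribe.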

Page 34:

Demo: ReactiveFx

Visit channel9.msdn.com for info

Page 35:

Agenda

The concurrency landscape
Language headaches
.NET 4.0 facilities
  Task Parallel Library
  PLINQ
  Coordination Data Structures
  Asynchronous programming
Incubation projects
Summary

Page 36:

Axum

DevLabs project (previously “Maestro”)
Coordination between components
  “Disciplined sharing”
Actor model
  Agents communicate via messages
  Channels to exchange data via ports
Language features (based on C#)
  Declarative data pipelines and protocols
  Side-effect-free functions
  Asynchronous methods
  Isolated methods
Also suitable in distributed setting

Page 37:

Channels for message exchange

    agent Program : channel Microsoft.Axum.Application
    {
        public Program()
        {
            string[] args = receive(PrimaryChannel::CommandLine);
            PrimaryChannel::ExitCode <-- 0;
        }
    }

Page 38:

Agents and channels

    channel Adder
    {
        input int Num1;
        input int Num2;
        output int Sum;
    }

    agent AdderAgent : channel Adder
    {
        public AdderAgent()
        {
            int result = receive(PrimaryChannel::Num1) +
                         receive(PrimaryChannel::Num2);
            PrimaryChannel::Sum <-- result;
        }
    }

Callout: Send / receive primitives

Page 39:

Protocols

    channel Adder
    {
        input int Num1;
        input int Num2;
        output int Sum;

        Start:   { Num1 -> GotNum1; }
        GotNum1: { Num2 -> GotNum2; }
        GotNum2: { Sum -> End; }
    }

Callout: State transition diagram

Page 40:

Use of pipelines

    agent MainAgent : channel Microsoft.Axum.Application
    {
        function int Fibonacci(int n)
        {
            if (n <= 1) return n;
            return Fibonacci(n - 1) + Fibonacci(n - 2);
        }

        int c = 10;

        void ProcessResult(int n)
        {
            Console.WriteLine(n);
            if (--c == 0)
                PrimaryChannel::ExitCode <-- 0;
        }

        public MainAgent()
        {
            var nums = new OrderedInteractionPoint<int>();

            nums ==> Fibonacci ==> ProcessResult;
            for (int i = 0; i < c; i++)
                nums <-- 42 - i;
        }
    }

Callouts:
  Description of data flow (the ==> pipeline)
  Mathematical function (the function keyword)

Page 41:

Domains

    domain Chatroom
    {
        private string m_Topic;
        private int m_UserCount;

        reader agent User : channel UserCommunication
        {
            // ...
        }

        writer agent Administrator : channel AdminCommunication
        {
            // ...
        }
    }

Callout: Unit of sharing between agents

Page 42:

Demo: Axum in a nutshell

Page 43:

STM.NET

Another DevLabs project
  Cutting edge, released 7/28
  Specialized fork from .NET 4.0 Beta 1
    CLR modifications required
First-class transactions on memory
  As an alternative to locking
  “Optimistic” concurrency methodology
    Make modifications
    Rollback changes on conflict
Core concept: atomic { /* code */ }

Page 44:

Transactional memory

Problems with locks:
  Potential for deadlocks…
  …and more ugliness
  Granularity matters a lot
  Don’t compose well

Subtle difference:

    atomic { m_x++; m_y--; throw new MyException(); }

    lock (GlobalStmLock) { m_x++; m_y--; throw new MyException(); }

Page 45:

Bank account sample

    public static void Transfer(BankAccount from, BankAccount backup,
                                BankAccount to, int amount)
    {
        Atomic.Do(() =>
        {
            // Be optimistic, credit the beneficiary first
            to.ModifyBalance(amount);

            // Find the appropriate funds in source accounts
            try
            {
                from.ModifyBalance(-amount);
            }
            catch (OverdraftException)
            {
                backup.ModifyBalance(-amount);
            }
        });
    }

Page 46:

The hard truth about STM

Great features
  ACID
  Optimistic concurrency
  Transparent rollback and re-execute
  System.Transactions (LTM) and DTC support
Implementation
  Instrumentation of shared state access
  JIT compiler modification
  No hardware support currently
Result:
  2x to 7x serial slowdown (in alpha prototype)
  But improved parallel scalability

Page 47:

Demo: STM.NET

Visit msdn.microsoft.com/devlabs

Page 48:

DryadLINQ

Dryad
  Infrastructure for cluster computation
  Concept of job
DryadLINQ
  LINQ over Dryad
    Decomposition of query
    Distribution over computation nodes
    Roughly similar to PLINQ
    A la “map-reduce”
Declarative approach works

Page 49:

DryadLINQ = LINQ + Dryad

    Collection<T> collection;
    bool IsLegal(Key k);
    string Hash(Key k);

    var results = from c in collection
                  where IsLegal(c.key)
                  select new { Hash = Hash(c.key), c.value };

[Diagram labels: C# (four instances), Vertex code, Query plan (Dryad job), Data, collection, results]

Page 50:

Demo: DryadLINQ

Visit research.microsoft.com/dryad

Page 51:

Agenda

The concurrency landscape
Language headaches
.NET 4.0 facilities
  Task Parallel Library
  PLINQ
  Coordination Data Structures
  Asynchronous programming
Incubation projects
Summary

Page 52:

Summary

Parallel programming requires thinking
  Avoid side-effects
  Prefer immutability
Act 1 = Library approach in .NET 4.0
  Task Parallel Library
  Parallel LINQ
  Coordination Data Structures
  Asynchronous patterns (+ a bit of language sugar)
Act 2 = Different approaches are lurking
  Software Transactional Memory
  Purification of languages

Page 53:

question & answer

Page 54:

Resources

www.microsoft.com/teched
  Sessions On-Demand & Community
http://microsoft.com/technet
  Resources for IT Professionals
http://microsoft.com/msdn
  Resources for Developers
www.microsoft.com/learning
  Microsoft Certification & Training Resources

Page 55:

Complete an evaluation on CommNet and enter to win!

Page 56:

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.