
Page 1: Dan Reed reed@microsoft.com Corporate Vice President ...simonm/conferences/isc09/Reed.pdfScholarly Domain-specific communications services Instant messaging ... management Reference

Dan Reed
reed@microsoft.com

Corporate Vice President
Extreme Computing Group (XCG)

Page 2

Commodity clusters
  Proliferation of inexpensive hardware
  “Attack of the Killer Micros”
  Race for MachoFLOPS
  Low level programming challenges

Rise of data
  Scientific instruments and surveys
  Storage, management and provenance
  Data fusion and analysis

Distributed services
  Multidisciplinary collaborations
  Interoperability and scalability
  Multi-organizational social engineering

Page 3

[Diagram labels]
Domain Independent Infrastructure
Complex Applications
Domain Specific Infrastructure
Scientific Instrument Pipelines
Diverse Disciplinary Data Archives
Laboratory Computing
High-end Computing Infrastructure
Community Building and Outreach

Page 4

Insatiable demand
  Cycles, storage, software, support

Distributed acquisition/deployment
  Sometimes duplicative, non-shared infrastructure

Distributed cost structures
  Power, space, staff, staff, hardware

Long-term sustainability
  Decades rather than months/years

The shape of the triangle
  Apex versus mainstream users

Page 5

Petascale/Exascale/…

Mobile/Desktop computing

Laboratory clusters

University infrastructure

National infrastructure

Data, data, data


Page 6

Bulk computing is almost free
  … but applications and power are not

Inexpensive sensors are ubiquitous
  … but data fusion remains difficult

Moving lots of data is {still} hard
  … because we’re missing trans-terabit/second networks

People are really expensive!
  … and robust software remains extremely labor intensive

Application challenges are increasingly complex
  … and social engineering is not our forte

Our political/technical approaches must change
  … or we risk solving irrelevant problems

Page 7

Moore’s “Law” favored consumer commodities
  Economics drove enormous improvements
  Specialized processors and mainframes faltered
  The commodity software industry was born

Today’s economics
  Manycore processors/accelerators
  Software as a service/cloud computing
  Multidisciplinary data analysis and fusion

They will drive change in technical computing
  Just as did “killer micros” and inexpensive clusters

Page 8

Manycore

HPC

Clouds

Page 9

Very few users love technology itself
  Clusters and parallel programming
  Distributed services, grids or clouds
  Data models and databases

Successful technologies are invisible
  They enable but are unobtrusive

Desktop/mobile acceleration
  Seamlessly accessible
  Standard metaphors/tools

Manycore

HPC

Clouds

Page 10

Increasing Abstraction and Invisibility

Page 11

[Diagram labels]
Scholarly communications
Domain-specific services
Instant messaging
Identity
Document store
Blogs & social networking
Mail
Notification
Search: books, citations
Visualization and analysis services
Storage/data services
Compute services and virtualization
Project management
Reference management
Knowledge management
Knowledge discovery

Source: Tony Hey, Microsoft

Page 12

Scale On-Demand and Cost Effectively
  Scale data throughput and store capacity
  Fuse and analyze multidisciplinary data

Ensure Service Continuity in the Cloud
  Protect against data loss and unauthorized access
  Address failure and disaster scenarios

Support Emerging Applications Rapidly
  Enable rapid development of new applications/services
  Easy access to consume multiple data sources

Reduce Infrastructure and Management Costs
  Hardware and software independence
  Lower operational cost of managing data

Page 13

Infrastructure as a Service

Applications as a Service

Software as a Service

Enables

Page 14

www.azure.com

Page 15

[Diagram labels]
Public Internet
Load Balancer
Azure Services
Front-end Web Role
Worker Role(s)
Host Partition (VM)
Guest Partition (VM) (shown twice)
Host OS (Server Core)
Guest OS (Server Enterprise) (shown twice)
RD OS
Applications
Virtualization Stack (VSP)
Virtualization Stack (VSC) (shown twice)
Drivers
VMBUS
Hypervisor
Hardware: NIC, Disk1, Disk2, CPU

Page 16

[Diagram labels]
Highly-available Fabric Controller
Switches
Load-balancers
Out-of-band communication – hardware control
In-band communication – software control
WS08 Hypervisor
Control VM
VM (shown three times)
Service Roles
Control Agent
WS08
Node can be a VM or a physical machine

Page 17

C#

Page 18

Experiments, Archives, Literature, Simulations

Many petabytes, doubling every 2 years

Page 19

Hypothesis-driven
  “I have an idea, let me verify it.”

Exploratory
  “What correlations can I glean from everyone’s data?”

Different tools and techniques
  Exploratory analysis relies on deep data mining
  Supervised and unsupervised learning
  “grep” is not a data mining tool
  … but an RDBMS really isn't either

Massive, multidisciplinary data
  Rising rapidly and at unprecedented scale

Page 20

Map reduce-style parallel BLAST DNA matching

Basic MapReduce
  - 2 GB database in each worker role
  - 500 MB input file

Metagenomics sample
  50 roles, speedup 45
  100 roles, speedup 94

[Diagram labels]
BLAST user selects DBs and input sequence
Blast Web Role
Input Splitter Worker Role
BLAST Execution Worker Role #1 … #n
Combiner Worker Role
Azure Blob Storage
Genome DB 1 … Genome DB K
BLAST DB Configuration
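The splitter, execution and combiner roles named on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the tiny in-memory "database" and the exact-substring matcher are all hypothetical stand-ins, threads stand in for Azure worker roles, and nothing here is the real BLAST algorithm or the implementation behind the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def split_input(queries, n_workers):
    """Splitter: deal the query sequences into n_workers chunks."""
    return [queries[i::n_workers] for i in range(n_workers)]


def toy_match(chunk, database):
    """Execution worker. A stand-in for BLAST (NOT the real algorithm):
    reports each database entry containing the query as an exact substring."""
    hits = []
    for query in chunk:
        for name, sequence in database.items():
            if query in sequence:
                hits.append((query, name))
    return hits


def combine(partial_results):
    """Combiner: merge the per-worker hit lists into one sorted list."""
    return sorted(hit for part in partial_results for hit in part)


def parallel_match(queries, database, n_workers=4):
    """Split, run every chunk against the full database, then combine."""
    chunks = split_input(queries, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: toy_match(c, database), chunks))
    return combine(partials)


def parallel_efficiency(speedup, n_workers):
    """Speedup divided by worker count; 1.0 would be ideal linear scaling."""
    return speedup / n_workers


# Made-up sample data, for illustration only.
database = {"genome_1": "ACGTACGTGA", "genome_2": "TTGACCAGTA"}
queries = ["ACGT", "CCAG", "GGGG", "GTA"]
hits = parallel_match(queries, database, n_workers=2)
```

Each worker holds the whole database and only the input is partitioned, which mirrors the slide's "2 GB database in each worker role" arrangement. Applying `parallel_efficiency` to the slide's own figures gives 45/50 = 0.90 at 50 roles and 94/100 = 0.94 at 100 roles.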

Page 21

LINQ: .NET Language Integrated Query
  Declarative SQL-like programming with C# and Visual Studio
  Easy expression of data parallelism
  Elegant and unified data model

Automatic query plan generation

Distributed query execution by Dryad

[Diagram labels: LINQ query, Query plan (logs, where, select), Dryad]

var logentries =
    from line in logs
    where !line.StartsWith("#")
    select new LogEntry(line);

Source: Yuan Yu et al
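For readers less familiar with C#, the same filter-and-project query can be written as a Python generator expression. This is a loose local analogue only: the `LogEntry` class and the sample log lines are hypothetical (the slide does not show them), and nothing here is planned or distributed by Dryad.

```python
class LogEntry:
    """Hypothetical record type; the slide does not show LogEntry's fields."""

    def __init__(self, line):
        self.raw = line


# Made-up sample input: lines starting with "#" are comments.
logs = [
    "#Fields: method uri status",
    "GET /index.html 200",
    "# another comment",
    "GET /about.html 404",
]

# where !line.StartsWith("#")  ->  if not line.startswith("#")
# select new LogEntry(line)     ->  LogEntry(line)
logentries = (LogEntry(line) for line in logs if not line.startswith("#"))

# Like a LINQ query, the generator does no work until it is consumed.
entries = list(logentries)
```

The declarative shape is the point of the comparison: the query states what to keep and what to build, and leaves the execution strategy to the runtime.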

Page 22

Each data center is ~10X the size of a soccer/football field

Page 23

Cooling technologies
  Operating points, heat dissipation, …

New packaging technologies
  Optoelectronics, memory stacking, …

New storage models/algorithms
  Solid state storage

Locality-aware algorithms
  The speed of light is pretty slow

Programming models
  Effective scale-invariant abstractions

Intelligent power management
  Adaptation and power down

System adaptation and integration
  Reliability and power as first class objects

Page 24

Manycore

HPC

Clouds

Page 25

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.