cloud com foster december 2010

What the cloud really means for science
Ian Foster, Computation Institute, University of Chicago & Argonne National Laboratory

Upload: ian-foster

Post on 10-May-2015


DESCRIPTION

We've all heard about how on-demand computing and storage will transform scientific practice. But by focusing on resources alone, we're missing the real benefit of the large-scale outsourcing and consequent economies of scale that cloud is about. The biggest IT challenge facing science today is not volume but complexity. Sure, terabytes demand new storage and computing solutions. But they're cheap. It is establishing and operating the processes required to collect, manage, analyze, share, and archive that data that is taking all of our time and killing creativity. And that's where outsourcing can be transformative. An entrepreneur can run a small business from a coffee shop, outsourcing essentially every business function to a software-as-a-service provider: accounting, payroll, customer relationship management, the works. Why can't a young researcher run a research lab from a coffee shop? For that to happen, we need to make it easy for providers to develop "apps" that encapsulate useful capabilities and for researchers to discover, customize, and apply these "apps" in their work. The effect, I will argue, will be a dramatic acceleration of discovery.

TRANSCRIPT

Page 1: Cloud com foster december 2010

What the cloud really means for science

Ian Foster

Computation Institute

University of Chicago & Argonne National Laboratory

Page 2: Cloud com foster december 2010

Science is merely an extremely powerful method of winnowing what’s true from what feels good

— Carl Sagan

Page 3: Cloud com foster december 2010

J.C.R. Licklider on thinking (1960)

About 85% of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know

Page 4: Cloud com foster december 2010

“At one point, it was necessary to compare six experimental determinations of a function relating speech-intelligibility to speech-to-noise ratio. No two experimenters had used the same definition or measure of speech-to-noise ratio. Several hours of calculating were required to get the data into comparable form. When they were in comparable form, it took only a few seconds to determine what I needed to know.”

Page 5: Cloud com foster december 2010
Page 6: Cloud com foster december 2010

42%!!

Page 7: Cloud com foster december 2010

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Page 8: Cloud com foster december 2010

Software-as-a-Service (SaaS)

Platform-as-a-Service (PaaS)

Infrastructure-as-a-Service (IaaS)

Page 9: Cloud com foster december 2010

Software-as-a-Service (SaaS)

Platform-as-a-Service (PaaS)

Infrastructure-as-a-Service (IaaS)

Page 10: Cloud com foster december 2010

Software-as-a-Service (SaaS)

Platform-as-a-Service (PaaS)

Infrastructure-as-a-Service (IaaS)

Page 11: Cloud com foster december 2010

Time-consuming tasks in business

• Web presence
• Email (hosted Exchange)
• Calendar
• Telephony (hosted VOIP)
• Human resources and payroll
• Accounting
• Customer relationship mgmt
• Data analytics
• Content distribution
• …

SaaS

Page 12: Cloud com foster december 2010

Time-consuming tasks in business

• Web presence
• Email (hosted Exchange)
• Calendar
• Telephony (hosted VOIP)
• Human resources and payroll
• Accounting
• Customer relationship mgmt
• Data analytics
• Content distribution
• …

SaaS

IaaS

Page 13: Cloud com foster december 2010

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Page 14: Cloud com foster december 2010

Software-as-a-Service (SaaS)

Platform-as-a-Service (PaaS)

Infrastructure-as-a-Service (IaaS)

Page 15: Cloud com foster december 2010

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Page 16: Cloud com foster december 2010

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Page 17: Cloud com foster december 2010

A B

Page 18: Cloud com foster december 2010

SaaS defined (Gartner)

1. The application is owned, delivered, and managed remotely by one or more providers
2. The application is based on a single code base that is consumed in a one-to-many model by all contracted customers at any time
3. The application is licensed on pay-per-use or subscription basis

----------------------------------------

4. The application behind the service is properly web architected, not an existing application web enabled [D. Terrar]

Page 19: Cloud com foster december 2010

Globus Toolkit: Build the Grid

Components for building custom grid solutions

globustoolkit.org

Globus Online: Use the Grid

Cloud-hosted file transfer service

globusonline.org

Page 20: Cloud com foster december 2010
Page 21: Cloud com foster december 2010
Page 22: Cloud com foster december 2010
Page 23: Cloud com foster december 2010
Page 24: Cloud com foster december 2010
Page 25: Cloud com foster december 2010
Page 26: Cloud com foster december 2010

“CLI 2.0”

Page 27: Cloud com foster december 2010

scp go#ep1:/share/godata/file1.txt \
    go#ep2:~/myfile.txt

Command: scp
Endpoints: go#ep1, go#ep2
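The source and destination arguments above follow an scp-like `name#endpoint:path` pattern. As a minimal sketch of how such an argument could be taken apart, the Python below splits it into three fields. The function `parse_endpoint_path`, the `EndpointPath` type, and the reading of the text before `#` as the endpoint's owner are assumptions made for illustration; the slide itself labels only the command and the endpoints, and this is not Globus Online code.

```python
from typing import NamedTuple


class EndpointPath(NamedTuple):
    """Hypothetical container for one 'owner#endpoint:path' argument."""
    owner: str     # text before '#', assumed here to be the endpoint's owner
    endpoint: str  # endpoint name, e.g. "ep1"
    path: str      # file path on that endpoint


def parse_endpoint_path(spec: str) -> EndpointPath:
    """Split 'owner#endpoint:path' into its three parts."""
    # Everything before the first ':' names the endpoint; the rest is the path.
    location, colon, path = spec.partition(":")
    if not colon or not path:
        raise ValueError(f"missing ':path' in {spec!r}")
    owner, hash_mark, endpoint = location.partition("#")
    if not hash_mark or not owner or not endpoint:
        raise ValueError(f"expected 'owner#endpoint' before ':' in {spec!r}")
    return EndpointPath(owner, endpoint, path)


# The two arguments from the slide's scp example.
source = parse_endpoint_path("go#ep1:/share/godata/file1.txt")
destination = parse_endpoint_path("go#ep2:~/myfile.txt")
print(source)
print(destination)
```

Splitting on the first `:` only means that any later colons stay inside the path.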

Page 28: Cloud com foster december 2010

cancel
details
endpoint-activate
endpoint-add
endpoint-deactivate
endpoint-list
endpoint-modify
endpoint-remove
endpoint-rename

events
ls
profile
scp
status
transfer
versions
wait

Page 29: Cloud com foster december 2010

28.6 Terabytes
31,000 files
56h 44m
No human intervention

Astrophysics simulation data generated in Tennessee, moved to Illinois for visualization (Enzo, UCSD; Futures Lab, Argonne)
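For a sense of scale, the figures on this slide imply an average transfer rate and an average file size. The short Python calculation below is a back-of-the-envelope sketch, not a measurement from the talk. It assumes decimal units (1 terabyte = 10^12 bytes) and treats the 56h 44m as continuous wall-clock time, neither of which the slide states.

```python
TERABYTE = 10**12  # assumption: decimal terabytes; the slide does not say TB vs TiB

total_bytes = 28.6 * TERABYTE
file_count = 31_000
elapsed_seconds = 56 * 3600 + 44 * 60  # 56h 44m as wall-clock seconds

# Averages over the whole transfer.
mean_rate_mb_s = total_bytes / elapsed_seconds / 10**6
mean_rate_gbit_s = total_bytes * 8 / elapsed_seconds / 10**9
mean_file_mb = total_bytes / file_count / 10**6

print(f"{mean_rate_mb_s:.0f} MB/s, "
      f"{mean_rate_gbit_s:.2f} Gbit/s, "
      f"{mean_file_mb:.0f} MB per file")
# prints: 140 MB/s, 1.12 Gbit/s, 923 MB per file
```

Under these assumptions the transfer averaged a little over 1 Gbit/s for more than two days.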

Page 30: Cloud com foster december 2010

A peek inside Globus Online

[Architecture diagram. Labeled components: Request collector, Consumers, Workers, Notification target, Datastore, Profiles + state, GridFTP]

Page 31: Cloud com foster december 2010

A B

Page 32: Cloud com foster december 2010

11 x 125 files, 200 MB each

11 users, 12 sites

Page 33: Cloud com foster december 2010

Coming soon

• Lightweight transfer agent
  For firewalls, sites without GridFTP installed

• Higher-level data management capabilities
  Group management
  Data publication, replication, etc.
  Workflow

• Additional protocol support
  HTTP, SRM, …

• Condor integration (version 7.6.0)
  Stage in and stage out

Page 34: Cloud com foster december 2010

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Page 35: Cloud com foster december 2010

[Chart. Axis labels: # Users, Application]

Page 36: Cloud com foster december 2010

Research App store

[Diagram. Labels: Distributed big data; Third-party Web apps; Students; Partners; Researcher; Resources]

Page 37: Cloud com foster december 2010

Acknowledgements

Numerous people have contributed to the Globus Online work, including:

Bryce Allen, Joshua Boverhof, John Bresnahan, Lisa Childers, Paul Davé, Fred Dech, Ian Foster, Dan Gunter, Gopi Kandaswamy, Nick Karonis, Raj Kettimuthu, Jack Kordas, Lee Liming, Mike Link, Stu Martin, JP Navarro, Karl Pickett, Mei Hui Su, Steve Tuecke, Vas Vasiliadis

Many thanks to our funders: DOE, NSF, and the University of Chicago

Page 38: Cloud com foster december 2010

Thank you!

Ian Foster

[email protected]

Computation Institute

University of Chicago & Argonne National Laboratory