cloud com foster december 2010

Post on 10-May-2015


DESCRIPTION

We've all heard about how on-demand computing and storage will transform scientific practice. But by focusing on resources alone, we're missing the real benefit of the large-scale outsourcing and consequent economies of scale that cloud is about. The biggest IT challenge facing science today is not volume but complexity. Sure, terabytes demand new storage and computing solutions. But those are cheap. It is establishing and operating the processes required to collect, manage, analyze, share, and archive that data that takes all of our time and kills creativity. And that's where outsourcing can be transformative. An entrepreneur can run a small business from a coffee shop, outsourcing essentially every business function (accounting, payroll, customer relationship management, the works) to a software-as-a-service provider. Why can't a young researcher run a research lab from a coffee shop? For that to happen, we need to make it easy for providers to develop "apps" that encapsulate useful capabilities and for researchers to discover, customize, and apply these "apps" in their work. The effect, I will argue, will be a dramatic acceleration of discovery.

TRANSCRIPT

What the cloud really means for science

Ian Foster

Computation Institute

University of Chicago & Argonne National Laboratory

Science is merely an extremely powerful method of winnowing what’s true from what feels good

— Carl Sagan

J.C.R. Licklider on thinking (1960)

About 85% of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know

“At one point, it was necessary to compare six experimental determinations of a function relating speech-intelligibility to speech-to-noise ratio. No two experimenters had used the same definition or measure of speech-to-noise ratio. Several hours of calculating were required to get the data into comparable form. When they were in comparable form, it took only a few seconds to determine what I needed to know.”

42%!!

Time-consuming tasks in science

• Run experiments
• Collect data
• Manage data
• Move data
• Acquire computers
• Analyze data
• Run simulations
• Compare experiment with simulation
• Search the literature
• Communicate with colleagues
• Publish papers
• Find, configure, install relevant software
• Find, access, analyze relevant data
• Order supplies
• Write proposals
• Write reports
• …

Software-as-a-Service (SaaS)

Platform-as-a-Service (PaaS)

Infrastructure-as-a-Service (IaaS)


Time-consuming tasks in business

• Web presence
• Email (hosted Exchange)
• Calendar
• Telephony (hosted VOIP)
• Human resources and payroll
• Accounting
• Customer relationship mgmt
• Data analytics
• Content distribution
• …

SaaS


IaaS


SaaS defined (Gartner)

1. The application is owned, delivered, and managed remotely by one or more providers
2. The application is based on a single code base that is consumed in a one-to-many model by all contracted customers at any time
3. The application is licensed on pay-per-use or subscription basis

4. The application behind the service is properly web architected, not an existing application web enabled [D. Terrar]

Globus Toolkit: Build the Grid

Components for building custom grid solutions

globustoolkit.org

Globus Online: Use the Grid

Cloud-hosted file transfer service

globusonline.org

“CLI 2.0”

scp go#ep1:/share/godata/file1.txt \
    go#ep2:~/myfile.txt

Command: scp

Endpoints: go#ep1, go#ep2

cancel, details, endpoint-activate, endpoint-add, endpoint-deactivate, endpoint-list, endpoint-modify, endpoint-remove, endpoint-rename, events, ls, profile, scp, status, transfer, versions, wait
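To make the anatomy of the scp line above concrete, here is a minimal Python sketch that splits such a command into its command name and two endpoint/path pairs. It is only an illustration of the ENDPOINT:PATH notation shown on the slide; the names `Location`, `parse_location`, and `parse_scp` are hypothetical and are not part of Globus Online.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """One side of a transfer, as written on the slide (hypothetical type)."""
    endpoint: str  # e.g. "go#ep1"
    path: str      # e.g. "/share/godata/file1.txt"


def parse_location(spec: str) -> Location:
    """Split 'ENDPOINT:PATH' at the first colon."""
    endpoint, sep, path = spec.partition(":")
    if not sep or not endpoint or not path:
        raise ValueError(f"expected ENDPOINT:PATH, got {spec!r}")
    return Location(endpoint, path)


def parse_scp(argv: list[str]) -> tuple[str, Location, Location]:
    """Parse ['scp', SOURCE, DESTINATION] into its three parts."""
    if len(argv) != 3 or argv[0] != "scp":
        raise ValueError("expected: scp SOURCE DESTINATION")
    command, source, destination = argv
    return command, parse_location(source), parse_location(destination)


if __name__ == "__main__":
    line = "scp go#ep1:/share/godata/file1.txt go#ep2:~/myfile.txt"
    command, src, dst = parse_scp(line.split())
    print(command, src, dst)
```

The sketch only handles the two-argument form shown on the slide; any options the real command accepts are outside its scope.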

28.6 Terabytes

31,000 files

56h 44m

No human intervention

Astrophysics simulation data generated in Tennessee, moved to Illinois for visualization (Enzo, UCSD; Futures Lab, Argonne)
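As a rough sanity check on the figures above, the short Python sketch below derives the average sustained rate and mean file size. It assumes decimal terabytes (10^12 bytes), which the slide does not specify, and treats 31,000 as an exact file count, so the results are approximations. The function name `transfer_stats` is made up for this example.

```python
def transfer_stats(total_bytes: float, n_files: int, hours: int, minutes: int) -> dict:
    """Derive average rates from a transfer's total size, file count, and duration."""
    seconds = hours * 3600 + minutes * 60
    return {
        "seconds": seconds,
        "mb_per_s": total_bytes / seconds / 1e6,        # megabytes per second
        "gbit_per_s": total_bytes * 8 / seconds / 1e9,  # gigabits per second
        "mean_file_mb": total_bytes / n_files / 1e6,    # average file size in MB
    }


if __name__ == "__main__":
    # Assumption: 1 terabyte = 10**12 bytes.
    stats = transfer_stats(28.6e12, 31_000, hours=56, minutes=44)
    print(f"{stats['seconds']} s elapsed")
    print(f"{stats['mb_per_s']:.0f} MB/s average")         # about 140 MB/s
    print(f"{stats['gbit_per_s']:.2f} Gbit/s average")     # about 1.12 Gbit/s
    print(f"{stats['mean_file_mb']:.0f} MB mean file size")  # about 923 MB
```

Under these assumptions the transfer averaged roughly 140 MB/s (about 1.1 Gbit/s) sustained over more than two days.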

A peek inside Globus Online

[Architecture diagram. Labels: request collector, consumers, workers, notification target, datastore (profiles + state), GridFTP (two instances), A, B]

11 x 125 files

200 MB each

11 users

12 sites

Coming soon

• Lightweight transfer agent: for firewalls, sites without GridFTP installed
• Higher-level data management capabilities: group management; data publication, replication, etc.; workflow
• Additional protocol support: HTTP, SRM, …
• Condor integration (version 7.6.0): stage in and stage out


[Figure: axes labeled "# Users" and "Application"]

Research App store

[Diagram labels: researcher, students, partners, third-party Web apps, distributed big data, resources]

Acknowledgements

Numerous people have contributed to the Globus Online work, including:

Bryce Allen, Joshua Boverhof, John Bresnahan, Lisa Childers, Paul Davé, Fred Dech, Ian Foster, Dan Gunter, Gopi Kandaswamy, Nick Karonis, Raj Kettimuthu, Jack Kordas, Lee Liming, Mike Link, Stu Martin, JP Navarro, Karl Pickett, Mei Hui Su, Steve Tuecke, Vas Vasiliadis

Many thanks to our funders: DOE, NSF, and the University of Chicago

Thank you!

Ian Foster

foster@anl.gov

Computation Institute

University of Chicago & Argonne National Laboratory
