A new collaborative scientific initiative at Harvard.

Post on 19-Dec-2015

TRANSCRIPT

Page 1: A new collaborative scientific initiative at Harvard

A new collaborative scientific initiative at Harvard.

Page 2: A new collaborative scientific initiative at Harvard

One-Slide IIC

Proposal-driven, from within Harvard

“Projects” focus on areas where computers are key to new science; widely applicable results

Technical focus “Branches”: Instrumentation; Databases & Provenance; Analysis & Simulations; Visualization; Distributed Computing (e.g. GRID, Semantic Web)

Matrix organization: “Projects” by “Branches”

Education: Train Future Consumers & Producers of Computational Science

Goal: Fill the void in, highly value, and learn from, the emerging field of “computational science.”

Page 3: A new collaborative scientific initiative at Harvard

“Astronomical Medicine”

A joint venture of FAS-Astronomy & HMS/BWH-Surgical Planning Lab; Work shown here is from the 2005 Junior Thesis of Michelle Borkin, Harvard College.

Page 4: A new collaborative scientific initiative at Harvard

Filling the “Gap” between Science and Computer Science

Scientific disciplines

• Increasingly, core problems in science require computational solution

• Typically hire/“home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need

Computer Science departments

• Focused on finding elegant solutions to basic computer science challenges

• Often see specific, “applied” problems as outside their interests

Page 5: A new collaborative scientific initiative at Harvard

“Workflow” & “Continuum”

Page 6: A new collaborative scientific initiative at Harvard

Workflow Examples | Astronomy | Public Health
“Collect” | Telescope | Microscope, Stethoscope, Survey
COLLECT | “National Virtual Observatory”/COMPLETE | CDC Wonder
“Analyze” | Study the density structure of a star-forming glob of gas | Find a link between one factory’s chlorine runoff & disease
ANALYZE | Study the density structure of all star-forming gas in… | Study the toxic effects of chlorine runoff in the U.S.
“Collaborate” | Work with your student
COLLABORATE | Work with 20 people in 5 countries, in real-time
“Respond” | Write a paper for a Journal.
RESPOND | Write a paper, the quantitative results of which are shared globally, digitally.

Page 7: A new collaborative scientific initiative at Harvard

IIC branches address shared “workflow” challenges

Challenges common to data-intensive science

• Data acquisition

• Data processing, storage, and access

• Deriving meaningful insight from large datasets

• Maximizing understanding through visual representation

• Sharing knowledge and computing resources across geographically dispersed researchers

IIC branches: Instrumentation; Analysis & Simulations; Databases/Provenance; Distributed Computing; Visualization

Page 8: A new collaborative scientific initiative at Harvard

Continuum

“Pure” Discipline Science (e.g. Einstein)

“Pure” Computer Science (e.g. Turing)

“Computational Science”: Missing at Most Universities

Page 9: A new collaborative scientific initiative at Harvard

IIC Organization: Research and Education

Assoc Dir, Instrumentation

Assoc Dir, Visualization

Assoc Dir, Analysis & Simulation

Provost

IIC Director

Assoc Provost

Dir of Admin & Operations

Project 1 (Proj Mgr 1)

Project 2 (Proj Mgr 2)

Project 3 (Proj Mgr 3)

Dir of Education & Outreach

Etc.

CIO (systems)

Knowledge mgmt

Education & Outreach staff

Dean, Physical Sciences

Dir of Research

Assoc Dir, Databases/Data Provenance

Assoc Dir, Distributed Computing

Page 11: A new collaborative scientific initiative at Harvard

Barnard’s Perseus

COMPLETE/IRAS Ndust

Page 12: A new collaborative scientific initiative at Harvard

IRAS Ndust

Hα emission, WHAM/SHASSA Surveys (see Finkbeiner 2003)

2MASS/NICER Extinction

Page 13: A new collaborative scientific initiative at Harvard

Numerical Simulation of Star Formation

Bate, Bonnell & Bromm 2002 (UKAFF)

• MHD turbulence gives “t=0” conditions; Jeans mass = 1 Msun

• 50 Msun, 0.38 pc, n_avg = 3 x 10^5 ptcls/cc

• forms ~50 objects

• T = 10 K

• SPH, no B or

• movie = 1.4 free-fall times
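The "free-fall times" unit in the last bullet can be made concrete with a rough calculation. The sketch below assumes a uniform sphere and assumes that the quoted 0.38 pc is the cloud diameter (the slide does not say whether it is a radius or a diameter); the function names and constants are illustrative, not taken from the simulation code.

```python
import math

G_CGS = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
MSUN_G = 1.989e33     # solar mass in grams
PC_CM = 3.0857e18     # parsec in centimetres
YEAR_S = 3.15576e7    # Julian year in seconds


def mean_density(mass_msun, diameter_pc):
    """Mean mass density (g/cm^3) of a uniform sphere."""
    radius_cm = 0.5 * diameter_pc * PC_CM
    volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return mass_msun * MSUN_G / volume_cm3


def free_fall_time_yr(rho):
    """Free-fall time sqrt(3*pi / (32*G*rho)), returned in years."""
    return math.sqrt(3.0 * math.pi / (32.0 * G_CGS * rho)) / YEAR_S


# Assumption: 50 Msun inside a sphere of diameter 0.38 pc.
rho = mean_density(50.0, 0.38)
t_ff = free_fall_time_yr(rho)
print(f"t_ff ~ {t_ff:.2e} yr; 1.4 t_ff ~ {1.4 * t_ff:.2e} yr")
```

Under these assumptions the free-fall time comes out at roughly 2 x 10^5 years, so a movie spanning 1.4 free-fall times would cover a few hundred thousand years of cloud evolution.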

Page 14: A new collaborative scientific initiative at Harvard

Goal: Statistical Comparison of “Real” and “Synthesized” Star Formation

Figure based on work of Padoan, Nordlund, Juvela, et al. Excerpt from realization used in Padoan & Goodman 2002.

Page 15: A new collaborative scientific initiative at Harvard

Spectral Line Observations

Measuring Motions: Molecular Line Maps

Page 16: A new collaborative scientific initiative at Harvard

Alves, Lada & Lada 1999

Radio Spectral-Line Survey

Radio Spectral-line Observations of Interstellar Clouds

Page 17: A new collaborative scientific initiative at Harvard

Velocity from Spectroscopy

[Figure: Observed Spectrum; Intensity (-0.5 to 1.5) vs. "Velocity" (100 to 400)]

All thanks to Doppler

Telescope Spectrometer
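How a spectrometer channel becomes a "velocity" can be sketched with the non-relativistic Doppler relation in the radio convention, v = c (f_rest - f_obs) / f_rest. The function names below are hypothetical, and the rest frequency (about 110.201 GHz, the 13CO J=1-0 line) is used only as an illustrative value; the slides do not state which convention or line was used for this plot.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def radio_velocity(obs_freq_hz, rest_freq_hz):
    """Line-of-sight velocity in km/s (radio convention; positive = receding)."""
    return C_KM_S * (rest_freq_hz - obs_freq_hz) / rest_freq_hz


def channel_velocities(first_freq_hz, channel_width_hz, n_channels, rest_freq_hz):
    """Velocity of each spectrometer channel, given evenly spaced frequencies."""
    return [
        radio_velocity(first_freq_hz + i * channel_width_hz, rest_freq_hz)
        for i in range(n_channels)
    ]


# Illustrative values only: a line shifted as if the gas recedes at 5 km/s.
rest = 110.201e9
observed = rest * (1.0 - 5.0 / C_KM_S)
print(f"{radio_velocity(observed, rest):.3f} km/s")
```

Applying `channel_velocities` to every channel of a spectrum turns the frequency axis into the velocity axis of a plot like the one on this slide.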

Page 18: A new collaborative scientific initiative at Harvard

Barnard’s Perseus

COMPLETE/FCRAO W(13CO)

Page 19: A new collaborative scientific initiative at Harvard

“Astronomical Medicine”

Excerpts from Junior Thesis of Michelle Borkin (Harvard College); IIC Contacts: AG (FAS) & Michael Halle (HMS/BWH/SPL)

Page 20: A new collaborative scientific initiative at Harvard

IC 348

Page 21: A new collaborative scientific initiative at Harvard

“Astronomical Medicine”

Page 22: A new collaborative scientific initiative at Harvard

“Astronomical Medicine”

Page 23: A new collaborative scientific initiative at Harvard

“Astronomical Medicine”

Before “Medical Treatment”

After “Medical Treatment”

Page 24: A new collaborative scientific initiative at Harvard

3D Slicer Demo

IIC contacts: Michael Halle & Ron Kikinis

Page 25: A new collaborative scientific initiative at Harvard

IIC Research Branches

Instrumentation: Improved data acquisition. Novel hardware approaches (e.g. GPUs, sensors).

Analysis & Simulations: Development of efficient algorithms. Cross-disciplinary comparative tools (e.g. statistical).

Databases/Provenance: Management, and rapid retrieval, of data. “Research reproducibility” …where did the data come from? How?

Distributed Computing: e-Science aspects of large collaborations. Sharing of data and computational resources and tools in real-time.

Visualization: Physically meaningful combination of diverse data types.

IIC projects will bring together IIC experts from the relevant branches with discipline scientists to address a pressing computing challenge that faces the discipline and has broad application.

Page 26: A new collaborative scientific initiative at Harvard

3D Slicer

Page 27: A new collaborative scientific initiative at Harvard

Distributed Computing & Large Databases: Large Synoptic Survey Telescope

Optimized for time domain

scan mode

deep mode

7 square degree field

6.5m effective aperture

24th mag in 20 sec

> 5 Tbyte/night

Real-time analysis

Simultaneous multiple science goals

IIC contact: Christopher Stubbs (FAS)

Page 28: A new collaborative scientific initiative at Harvard

Relative optical survey power

[Figure: Figure of Merit (0 to 160) for LSST, SNAP, Pan-STARRS, Subaru, CFHT, SDSS, MMT; series: Time (x10), Stellar, Galactic (x2); based on A= 270 LSST design]

Page 29: A new collaborative scientific initiative at Harvard

Astronomy: LSST, SDSS, 2MASS, MACHO, DLS. High Energy Physics: BaBar, Atlas, RHIC.

Quantity | LSST | SDSS | 2MASS | MACHO | DLS | BaBar | Atlas | RHIC
First year of operation | 2011 | 1998 | 2001 | 1992 | 1999 | 1998 | 2007 | 1999
Run-time data rate to storage (MB/sec) | 5000 Peak, 500 Avg | 8.3 | 1 | 1 | 2.7 | 60 (zero-suppressed), 6* | 540* | 120* (’03), 250* (’04)
Daily average data rate (TB/day) | 20 | 0.02 | 0.016 | 0.008 | 0.012 | 0.6 | 60.0 | 3 (’03), 10 (’04)
Annual data store (TB) | 2000 | 3.6 | 6 | 1 | 0.25 | 300 | 7000 | 200 (’03), 500 (’04)
Total data store capacity (TB) | 20,000 (10 yrs) | 200 | 24.5 | 8 | 2 | 10,000 | 100,000 (10 yrs) | 10,000 (10 yrs)
Peak computational load (GFLOPS) | 140,000 | 100 | 11 | 1.00 | 0.600 | 2,000 | 100,000 | 3,000
Average computational load (GFLOPS) | 140,000 | 10 | 2 | 0.700 | 0.030 | 2,000 | 100,000 | 3,000
Data release delay acceptable | 1 day moving, 3 months static | 2 months | 6 months | 1 year | 6 hrs (trans), 1 yr (static) | 1 day (max), <1 hr (typ) | Few days | 100 days
Real-time alert of event | 30 sec | none | none | <1 hour | 1 hr | none | none | none
Type/number of processors | TBD | 1GHz Xeon / 18 | 450MHz Sparc / 28 | 60-70MHz Sparc / 10 | 500MHz Pentium / 5 | Mixed / 5000 | 20GHz / 10,000 | Pentium / 2500
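The run-time rate and the daily volume in the table can be related with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^6 MB) and uses hypothetical helper names; it only cross-checks the tabulated LSST figures (500 MB/sec average, 20 TB/day) and is not part of the original slide.

```python
SECONDS_PER_HOUR = 3600.0
MB_PER_TB = 1.0e6  # assumption: decimal units, 1 TB = 10^6 MB


def tb_per_day(rate_mb_s, hours_per_day):
    """Daily volume (TB) for a sustained rate over a number of hours per day."""
    return rate_mb_s * hours_per_day * SECONDS_PER_HOUR / MB_PER_TB


def implied_hours(rate_mb_s, daily_tb):
    """Hours of data taking per day implied by a rate and a daily volume."""
    return daily_tb * MB_PER_TB / rate_mb_s / SECONDS_PER_HOUR


# LSST column: 500 MB/sec average, 20 TB/day.
print(f"implied data taking: {implied_hours(500.0, 20.0):.1f} h/day")
print(f"volume if run around the clock: {tb_per_day(500.0, 24.0):.1f} TB/day")
```

Read this way, the LSST average rate and daily volume imply about 11 hours of data taking per day, which would be consistent with night-time observing; that reading is an inference from the numbers, not a statement on the slide.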

Page 30: A new collaborative scientific initiative at Harvard

Challenges at the LHC

For each experiment (4 total):

10’s of Petabytes/year of data logged

2000 + Collaborators

40 Countries

160 Institutions (Universities, National Laboratories)

CPU intensive

Global distribution of data

Test with « Data Challenges »

Page 31: A new collaborative scientific initiative at Harvard

CPU vs. Collaboration Size

[Figure: CPU (log scale, 10 to 100,000) vs. Collaboration Size (0 to 2500); points: Earth Simulator, Atmospheric Chemistry Group, LHC Exp., Astronomy, Grav. Wave, Nuclear Exp., Current accelerator Exp.]

Page 32: A new collaborative scientific initiative at Harvard

Data Handling and Computation for Physics Analysis

[Diagram: detector; event filter (selection & reconstruction); raw data; event reprocessing; event simulation; event summary data; processed data; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis]

CERN, les.robertson@cern.ch

Page 33: A new collaborative scientific initiative at Harvard

Workflow

a.k.a. The Scientific Method (in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices)

IIC contact: Matt Welsh, FAS