Astrophysics on the OSG (LIGO, SDSS, DES)
Kent Blackburn, LIGO Laboratory, California Institute of Technology
Open Science Grid Consortium Meeting, University of Florida, January 23, 2006


Page 1: Astrophysics on the OSG (LIGO, SDSS, DES)

Kent Blackburn
LIGO Laboratory
California Institute of Technology

Open Science Grid Consortium Meeting

University of Florida

January 23, 2006

Page 2: Outline and Contributors

- LIGO on the OSG
  - Kent Blackburn, Duncan Brown, Albert Lazzarini, David Meyers
- SDSS, NEO & DES on the OSG
  - Nickolai Kuropatkin, Neha Sharma, Chris Stoughton, James Annis, Steve Kent

Page 3: Gravitational Wave Physics on the OSG

Laser Interferometer Gravitational-Wave Observatory (LIGO)

LIGO Scientific Collaboration (LSC)

Page 4: LIGO on the Open Science Grid

- Search for Gravitational Waves
  - Hanford, WA
  - Livingston, LA
  - Plus GEO, TAMA and VIRGO
- LIGO Scientific Collaboration
  - ~40 institutions worldwide
  - ~400 individuals contributing
- LIGO Data Grid (LDG)
  - Nine Grid sites
  - Over 2000 CPUs
  - Multi-petabyte data archive at Caltech
- Scientific data collection grouped into temporal "Science Runs"
  - Currently in Science Run 5
  - Goal to collect one year "plus" of design sensitivity data
  - One terabyte of data each day
- Analysis carried out primarily on the LIGO Data Grid (LDG)
  - Stepping out onto the OSG

http://www.ligo.caltech.edu

Page 5: LIGO Data Analysis Classifications

- Principal Classifications of Searches
  - Binary Inspiral (Neutron Stars & Black Holes)
    - Consumes bulk of LIGO Data Grid resources
  - Burst (Supernovae and other Unmodeled Events)
    - Coincidence between different data streams necessary
  - Stochastic Background (Similar to the CMB)
    - Computationally least demanding but requires cross correlation
  - Periodic (Pulsars, Rotating Neutron Stars)
    - Signal sinusoidal in reference frame of source
    - All Sky Survey could promote Global Warming (order 10^20 FLOPS)
- Binary Inspiral Search selected for initial adoption onto the OSG
  - Workflow well suited to Open Science Grid
    - Already using a similar set of Grid technologies within LIGO Data Grid
  - Simple parametric parallelization of algorithms
    - Optimal filtering of data against tens of thousands of waveforms
  - Computationally demanding but interesting on the scale of the OSG
- Expect other searches to follow once OSG trailblazing work is done
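The "optimal filtering of data against tens of thousands of waveforms" item can be illustrated with a minimal Python/NumPy sketch. It assumes white, unit-variance noise, so that the optimal filter reduces to a plain cross-correlation with a unit-norm template. The waveform in `make_template` is a toy linear chirp and the five-entry bank is hypothetical; neither is a LIGO inspiral waveform or the LSC search code.

```python
import numpy as np


def make_template(f0, n=1024, fs=1024.0, chirp_rate=40.0):
    """Toy unit-norm chirp starting at f0 Hz (hypothetical waveform, illustration only)."""
    t = np.arange(n) / fs
    h = np.sin(2.0 * np.pi * (f0 * t + 0.5 * chirp_rate * t**2))
    return h / np.linalg.norm(h)


def matched_filter_peak(data, template):
    """Return (peak statistic, sample offset) of one template against the data.

    Assumes white, unit-variance noise, in which case the optimal filter is
    the cross-correlation of the data with the unit-norm template.
    """
    corr = np.correlate(data, template, mode="valid")
    k = int(np.argmax(np.abs(corr)))
    return float(np.abs(corr[k])), k


def search(data, bank):
    """Filter the data against every template; each call is independent of the others."""
    results = {name: matched_filter_peak(data, h) for name, h in bank.items()}
    best = max(results, key=lambda name: results[name][0])
    return best, results[best]


def demo():
    rng = np.random.default_rng(0)
    bank = {f"f0={f0}": make_template(f0) for f0 in (20, 40, 60, 80, 100)}
    data = rng.normal(size=4096)                      # simulated white noise
    data[1500:1500 + 1024] += 20.0 * bank["f0=60"]    # injected toy signal
    best, (stat, offset) = search(data, bank)
    print(best, round(stat, 1), offset)


if __name__ == "__main__":
    demo()
```

Because every template is filtered independently of the others, the loop in `search` is the kind of simple parametric parallelization the slide refers to: each template (or block of templates) could run as a separate grid job.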

Page 6: Binary Inspiral Search Experiences on the Open Science Grid

- First attempt at the July 2005 OSG Consortium Meeting in Milwaukee, Wisconsin
  - Unsuccessful at submitting a binary inspiral workflow at any OSG site
  - Authentication was the primary reason for failures (LIGO VO not part of 0.2.1)
  - Other issues discovered with the version of VDS distributed in 0.2.1
- First successful completion of a binary inspiral workflow on October 1, 2005, on LIGO's OSG Integration Testbed cluster at Caltech
  - Eight-node dual-CPU cluster with two terabytes of disk space
  - Running a "patched" version of VDS on top of OSG 0.2.1
  - Used a test workflow involving ~38 GB of LIGO data and about 700 DAG nodes
- Followed up by running at LIGO's OSG production sites at PSU (PBS) and UWM (Condor), once the VDS patch was applied at each
- Collaborated with several CMS resources to further test outside LIGO's VO
  - Worked with clusters at San Diego, Nebraska and Caltech
  - All clusters added LIGO's VOMS to allow authentication
  - Updated OSG 0.2.1 with VDS patches
  - Mixed results due to the size of LIGO data sets transferred for this test workflow
- Worked with Deployment and Integration Teams to assure LIGO's functional requirements appeared in the OSG 0.4 software stack (just announced!)

Page 7: Greatly Simplified LIGO DAG

Page 8: LIGO's Next Move on the OSG

- The OSG 0.4.0 release should greatly improve the OSG for LIGO's Binary Inspiral Workflow
- A workflow geared toward actually conducting a scientific study would involve at least 16000 DAG nodes and close to two terabytes of data
  - Recent OSG-motivated activities in LIGO have produced a nearly 10:1 reduction in data through improved data selection and compression
- Need to develop more flexible workflows that don't challenge the limited data storage resources typical of a present-day OSG site
  - Pegasus is used to construct concrete DAGs from abstract DAX workflows
  - Flexibility here to recognize and adapt to OSG site specifics could facilitate greater utilization of the OSG as an abstract "Grid"
- Develop ability to benefit from Storage Resource Management
  - Typical LIGO data analyses benefit from being able to repeat the analysis on the same data set with improved calibration and selection criteria
  - LIGO is currently bringing up an SE on our local ITB cluster at Caltech to experiment with SRM

Page 9: Astronomy on the OSG

Sloan Digital Sky Survey (SDSS)

Experimental Astronomy Group (EAG)

Fermi National Accelerator Laboratory

Page 10: Near Earth Objects

- Near Earth Objects (NEOs)
  - Comets and asteroids nudged by the gravitational attraction of planets into orbits that pass by the Earth's neighborhood
  - Composed of water ice and dust, formed early in the history of the Solar System
  - The scientific interest in comets and asteroids is due to their being remnants of the early Solar System; the interest in NEOs is their potential for hitting the Earth…
- 37 Near Earth Object candidates are identified in the SDSS imaging data
  - Apparent magnitudes r = 19 to 21 and proper motions of 1.3 to 18 degrees per day
  - The Earth collision rate for this population (size greater than 20 m) is estimated to be one per century

Page 11: How to find Near Earth Objects

[Figure: a portion of an SDSS imaging run, 5 fields long, showing camcols 1 to 6]

SDSS imaging data consist of 6 strips called "camcols". The above image is a small portion of a "run" that extends for 800 fields.

Near Earth Objects show up as streaks in 3 colors.

Page 12: NEO Workflow

[Diagram: NEO workflow]

Resources shown: OSG-Site (four of them), SDSS Cluster, TAM Cluster, GFS Disk, Local Disk, RLS Server on TAM, Compute Node Local Disk

Workflow generation steps shown: VDL Generation, VDL2XML, Abstract DAX Creation, Concrete DAG Creation, Condor Submit DAG

Data handling steps shown: Register Input Files, Query RLS to Create Condor Submit File, Copy Input to Local Disk, Copy Output Back, Register Output Files, Transfer Output Back to TAM
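The per-job steps in this workflow can be written down as a Condor DAGMan input file. The sketch below is an assumption about how the stages chain together (copy input, run the NEO executable, copy output, register output), not the actual NEO workflow. The stage names, the `<stage>.sub` submit-file names, and the run and rerun numbers are all hypothetical. It relies only on the standard DAGMan `JOB`, `VARS`, and `PARENT ... CHILD` statements.

```python
STAGES = ("copy_in", "neo", "copy_out", "register_out")  # assumed stage order


def neo_dag(run, rerun, camcols=range(1, 7)):
    """Build DAGMan text with one linear chain of stages per camcol."""
    lines = []
    for camcol in camcols:
        names = [f"{stage}_{run}_{camcol}" for stage in STAGES]
        for stage, name in zip(STAGES, names):
            # "<stage>.sub" is a hypothetical Condor submit file for that stage.
            lines.append(f"JOB {name} {stage}.sub")
            lines.append(f'VARS {name} run="{run}" camcol="{camcol}" rerun="{rerun}"')
        # Each stage depends only on the previous stage of the same camcol.
        for parent, child in zip(names, names[1:]):
            lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(neo_dag(run=1234, rerun=40), end="")
```

Since the six camcol chains share no dependencies, DAGMan is free to run them in parallel.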

Page 13: NEO Job Statistics

Total Jobs          180
Total Input Data    9 * 180 = 1620 GB
Total Output Data   12 * 180 = 2160 KB

Per run-rerun (avg. 150 fields; camcols 1 to 6):
- 6 tar balls, avg. 1.5 * 6 = 9 GB (Neo-$run-$camcol-Input.tar)
- Neo-Executable
- 6 Neo par files, avg. 2 * 6 = 12 KB (neo-00$run-$camcol-$rerun.par)
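The totals on this slide follow from the per-job averages, as the short Python sketch below shows. The numbers are the slide's own; the helper names and the example run, camcol, and rerun values are hypothetical, and the file-name helpers simply substitute into the two patterns quoted on the slide.

```python
JOBS = 180          # run-reruns processed
CAMCOLS = 6         # one input tar ball and one output par file per camcol
TARBALL_GB = 1.5    # average size of one input tar ball
PAR_FILE_KB = 2     # average size of one output par file

input_per_job_gb = CAMCOLS * TARBALL_GB      # 9 GB
output_per_job_kb = CAMCOLS * PAR_FILE_KB    # 12 KB
total_input_gb = JOBS * input_per_job_gb     # 1620 GB
total_output_kb = JOBS * output_per_job_kb   # 2160 KB


def tarball_name(run, camcol):
    """Input tar ball name, following the pattern Neo-$run-$camcol-Input.tar."""
    return f"Neo-{run}-{camcol}-Input.tar"


def par_name(run, camcol, rerun):
    """Output par file name, following the pattern neo-00$run-$camcol-$rerun.par."""
    return f"neo-00{run}-{camcol}-{rerun}.par"


if __name__ == "__main__":
    print(total_input_gb, "GB in,", total_output_kb, "KB out")
    print(tarball_name(1234, 3), par_name(1234, 3, 40))
```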

Page 14: Quasar Spectra Fitting using SDSS

- Quasars are supermassive black holes: swirling clouds of gas and plasma falling into a black hole, glowing at many different wavelengths. We measure the spectrum of the light to measure the properties of each quasar.
- The SDSS provides us with 50,000 quasar spectra. We make fits to these spectra that include the following components:
  - Power-law continuum, decreasing as e-
  - A Balmer continuum due to ionized Hydrogen, with a characteristic bump from 2000 to 4000 Angstroms
  - Strong emission lines from ionized gas, such as Hydrogen, Nitrogen, Oxygen, and Magnesium
  - Many faint emission lines from Iron
  - Starlight from the galaxy that surrounds the quasar
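To make the first component concrete, the following is a minimal Python/NumPy sketch of fitting a power-law continuum alone. The slide does not give a usable functional form, so flux = amplitude * wavelength^(-alpha) is an assumption, as are the synthetic data and parameter values. It uses ordinary least squares in log-log space, which is a swapped-in technique for illustration and not the group's fitting method; a full fit would also need the Balmer continuum, emission lines, and host-galaxy starlight.

```python
import numpy as np


def fit_power_law(wavelength, flux):
    """Least-squares fit of flux = amplitude * wavelength**(-alpha) in log-log space."""
    slope, intercept = np.polyfit(np.log(wavelength), np.log(flux), 1)
    return float(np.exp(intercept)), float(-slope)


def demo():
    rng = np.random.default_rng(0)
    wavelength = np.linspace(1000.0, 8000.0, 200)   # Angstroms
    true_amplitude, true_alpha = 5.0e6, 1.5          # made-up values
    flux = true_amplitude * wavelength ** (-true_alpha)
    # Multiply by roughly 2% log-normal scatter to mimic measurement noise.
    flux *= np.exp(rng.normal(scale=0.02, size=wavelength.size))
    amplitude, alpha = fit_power_law(wavelength, flux)
    print(f"alpha = {alpha:.3f}, amplitude = {amplitude:.3g}")


if __name__ == "__main__":
    demo()
```

Each spectrum is fit independently of the others, so a set of 50,000 spectra splits naturally into many small jobs.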

Page 15: Example Quasar Spectrum with Fit

Page 16: Quasar Fit Production: Science using the Generic Grid Gofer (GGG)

- All jobs are stored in the "jobs" table
- Available grid sites are stored in the "pool" table
- Job Manager takes jobs from the database, creates Condor DAG files and submits them to sites from the pool in an automatic mode
- Two main parts: Job Manager and DAG Creator
- All completed stages of a job are recorded in the database together with submission time and execution time
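The division of labor described above can be sketched with Python's standard `sqlite3` module. This is not the GGG code. The table names "jobs" and "pool" come from the slide, but the column names, the round-robin site assignment, and the `<name>.sub` submit files are assumptions made for illustration. Nothing is actually submitted here: a real manager would write each DAG to disk and hand it to `condor_submit_dag`.

```python
import itertools
import sqlite3


def make_db(job_names, sites):
    """Create an in-memory database with a 'jobs' table and a 'pool' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, status TEXT, "
        "site TEXT, submitted_at REAL)"
    )
    conn.execute("CREATE TABLE pool (site TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO jobs (name, status) VALUES (?, 'queued')",
        [(name,) for name in job_names],
    )
    conn.executemany("INSERT INTO pool (site) VALUES (?)", [(s,) for s in sites])
    return conn


def dispatch(conn, now):
    """Assign queued jobs to pool sites round-robin; return {site: DAG text}."""
    sites = [row[0] for row in conn.execute("SELECT site FROM pool ORDER BY site")]
    queued = conn.execute(
        "SELECT id, name FROM jobs WHERE status = 'queued' ORDER BY id"
    ).fetchall()
    dags = {site: [] for site in sites}
    for (job_id, name), site in zip(queued, itertools.cycle(sites)):
        dags[site].append(f"JOB {name} {name}.sub")  # hypothetical submit file
        # Record where and when the job was sent, as the slide describes.
        conn.execute(
            "UPDATE jobs SET status = 'submitted', site = ?, submitted_at = ? "
            "WHERE id = ?",
            (site, now, job_id),
        )
    conn.commit()
    return {site: "\n".join(lines) + "\n" for site, lines in dags.items() if lines}


if __name__ == "__main__":
    conn = make_db([f"fit_{i}" for i in range(5)], ["site_a", "site_b"])
    for site, dag in dispatch(conn, now=0.0).items():
        print(f"# {site}\n{dag}")
```

Keeping job state in the database rather than in the manager process means a second call to `dispatch` only picks up jobs that are still queued.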

Page 17: Workflow in Generic Grid Gofer

Nickolai Kuropatkin

Page 18: Astronomy Experiences on the Grid

- Experience tells us that the Grid is more suitable for CPU-intensive jobs
  - … achieve parallelism … more jobs … finish sooner
- Running locally would limit the number of jobs run simultaneously
- On OSG, can run several run-reruns, and camcols within a run-rerun, in parallel
- Current workflow will also facilitate further analysis

                               Spectra             NEO
                               CPU Intensive       Data & CPU Intensive
Grid Match                     Ideal for Grid      Grid not very happy
Total No. of Jobs              ~50000              180
Data Input/Job                 1 Megabyte          9 Gigabytes
Data Output/Job                2 Megabytes         12 Kilobytes
Avg. Rate of Job Completion    800-1200 per day    10-15 per day ?
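The table's figures imply very different campaign lengths and data movement for the two workloads. The Python sketch below works that out from the table values alone; the helper names are hypothetical, 1 MB is taken as 0.001 GB, and the NEO completion rate is marked with a question mark on the slide, so the NEO results are correspondingly uncertain.

```python
WORKLOADS = {
    # name: total jobs, input per job in GB, (low, high) jobs completed per day
    "Spectra": {"jobs": 50_000, "input_gb": 0.001, "rate": (800, 1200)},
    "NEO": {"jobs": 180, "input_gb": 9.0, "rate": (10, 15)},
}


def days_to_finish(n_jobs, jobs_per_day):
    """Days needed to complete n_jobs at a steady completion rate."""
    return n_jobs / jobs_per_day


def daily_input_gb(input_gb_per_job, jobs_per_day):
    """Input data that must be staged per day to sustain the completion rate."""
    return input_gb_per_job * jobs_per_day


def summarize(name):
    """Return (best, worst) days to finish and (low, high) daily input volume in GB."""
    w = WORKLOADS[name]
    low, high = w["rate"]
    return {
        "days": (days_to_finish(w["jobs"], high), days_to_finish(w["jobs"], low)),
        "input_gb_per_day": (
            daily_input_gb(w["input_gb"], low),
            daily_input_gb(w["input_gb"], high),
        ),
    }


if __name__ == "__main__":
    for name in WORKLOADS:
        print(name, summarize(name))
```

At these rates the NEO jobs need on the order of 100 GB of input staged per day against roughly 1 GB per day for the spectra fits, which is consistent with the table's "Grid not very happy" entry for a data-intensive workload.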

Page 19: Future Grid Projects in Astronomy

- In the coming year (2005-2006) the Experimental Astrophysics Group (EAG) has 4 projects planned for the Open Science Grid:
  - The simulation effort for the Dark Energy Survey (DES)
  - Genetic algorithm fitting of Sloan Digital Sky Survey (SDSS) Quasar Spectra
  - Search for Near Earth Asteroids (NEOs) in the SDSS Imaging data
  - The Co-addition of the SDSS Southern Stripe