Post on 11-Jun-2020

LSST SCIENCE ADVISORY COMMITTEE MEETING | SEPTEMBER 25TH, 2017

What to Expect of the LSST Archive: The LSST Science Platform

Mario Juric, University of Washington
LSST Data Management Subsystem Scientist

for the LSST Data Management Team.

LSST Data Products: Level 1, 2, and 3

Level 1 (nightly):

− A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.

− A catalog of orbits for ~6 million bodies in the Solar System.

Level 2 (annual):

− A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion observations (“sources”), and ~30 trillion measurements (“forced sources”), produced annually, accessible through online databases.

− Reduced single-epoch, deep co-added images.

Level 3 (user generated):

− User-produced added-value data products (deep KBO/NEO catalogs, variable star classifications, shear maps, …)

For more details, see the “Data Products Definition Document”, http://ls.st/lse-163
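As a rough sanity check on the scale quoted above, the per-object averages implied by these round numbers can be worked out directly. This is only a sketch: the inputs are the approximate figures from the list, and the variable and function names are illustrative, not part of any LSST software.

```python
# Approximate catalog sizes quoted on the slide.
N_GALAXIES = 20e9
N_STARS = 17e9
N_OBJECTS = 37e9
N_SOURCES = 7e12          # single-epoch detections ("sources")
N_FORCED_SOURCES = 30e12  # forced-photometry measurements ("forced sources")


def per_object(total, n_objects=N_OBJECTS):
    """Average number of rows per catalog object."""
    return total / n_objects


sources_per_object = per_object(N_SOURCES)
forced_per_object = per_object(N_FORCED_SOURCES)

print(f"galaxies + stars      : {N_GALAXIES + N_STARS:.3g} objects")
print(f"sources per object    : {sources_per_object:.0f}")
print(f"forced src per object : {forced_per_object:.0f}")
```

With these inputs each object carries, on average, a couple of hundred detections and several hundred forced measurements, which is why the catalogs are served from databases rather than as flat files.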

How Do We Make Those Data Available?

[Diagram labels: Internet; LSST Users]

Dumping FITS tables onto an FTP site will not suffice in the 2020s…

Modeling LSST User Needs

− A large majority of users are likely to begin by accessing the dataset through a web portal interface. They wish to become familiar with the LSST dataset. They may query smaller subsets of data for “at home” analysis.

− Some users will wish to use the tools they’re accustomed to (e.g., TOPCAT, Aladin, AstroPy, etc.) to grab the data from the LSST archive.

− Some fraction of our users may choose to continue their analysis by utilizing resources available to them at the DAC. This avoids the latency (and the necessary local resources) associated with downloading a (large) subset of the LSST dataset. Their science cases may not require much computing, but are limited by storage, latency, or even just having the right software prerequisites. They would benefit from a prepared, next-to-the-data analysis environment utilizing the 10% Level 3 allocation.

− Use cases demanding larger resources may be able to acquire them at adjacent computing facilities (e.g., XSEDE). These users will benefit from connectivity to such resources.

− Finally, for the most demanding use cases, the rights-holders may utilize their own computing facilities to support larger-scale processing or even put up their own Data Access Centers. They need the ability to move, re-process, and/or re-serve large datasets.

The LSST Science Platform: Accessing LSST Data and Enabling LSST Science

[Diagram: LSST Users reach the LSST Science Platform over the Internet. Labels: Portal, JupyterLab, Web APIs; Data Releases, Alert Streams, User Databases, User Files, User Computing, Software Tools.]

The LSST Science Platform is a set of integrated web applications and services deployed at the LSST Data Access Centers (DACs) through which the scientific community will access, visualize, subset, and perform next-to-the-data analysis of the data.

The LSST Science Platform Aspects: Portal, JupyterLab, Web APIs

The Science Platform exposes the underlying DAC services through three user-facing aspects: the Portal (novice), the JupyterLab (intermediate), and the Web APIs (expert and remote tools).

Through these, we enable access to the Data Releases and Alert Streams, and support next-to-the-data analysis and Level 3 product creation using the computing resources available at the DAC.

LSST Portal: The Web Window into the LSST Archive

The Web Portal to the archive will enable browsing and visualization of the available datasets in ways the users are accustomed to at archives such as IRSA, MAST, or the SDSS archive, with an added level of interactivity.

Through the Portal, the users will be able to view the LSST images, request subsets of data (via simple forms or SQL queries), construct simple plots, and generally explore the LSST dataset in a way that allows them to identify and access (subsets of) data required by their science case.

This will all be backed by a petascale-capable RDBMS.
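To make “request subsets of data via SQL queries” concrete, here is a minimal sketch of the kind of box-plus-magnitude selection a user might submit. It runs against a toy in-memory SQLite table, not the LSST database; the table name `object`, its columns, and the sample rows are all hypothetical stand-ins chosen for illustration.

```python
import sqlite3

# Toy stand-in for an object catalog (hypothetical schema and values).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object ("
    "object_id INTEGER PRIMARY KEY, ra REAL, dec REAL, r_mag REAL)"
)
conn.executemany(
    "INSERT INTO object VALUES (?, ?, ?, ?)",
    [
        (1, 150.10, 2.20, 21.5),
        (2, 150.30, 2.25, 24.8),  # inside the box, but too faint
        (3, 151.90, 2.10, 20.1),  # outside the RA range
        (4, 150.20, 3.90, 19.7),  # outside the Dec range
    ],
)


def select_box(conn, ra_min, ra_max, dec_min, dec_max, max_r_mag):
    """Return objects inside an RA/Dec box that are brighter than max_r_mag."""
    query = """
        SELECT object_id, ra, dec, r_mag
        FROM object
        WHERE ra BETWEEN ? AND ?
          AND dec BETWEEN ? AND ?
          AND r_mag < ?
        ORDER BY object_id
    """
    params = (ra_min, ra_max, dec_min, dec_max, max_r_mag)
    return conn.execute(query, params).fetchall()


subset = select_box(conn, 150.0, 150.5, 2.0, 2.5, 23.0)
print(subset)
```

Only the first row satisfies all three constraints. A form-based query in a portal would presumably generate something equivalent behind the scenes, though that is an assumption about the implementation.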

LSST Portal: The Web Window into the LSST Archive

The Firefly Web Science User Interface (Wu et al., 2016; ADASS)

We currently have an initial version of the Portal running at NCSA.

Datasets:
• SDSS Stripe 82
• NEOWISE

Soon:
• HSC (LSST-reprocessed)

JupyterLab: Next-to-the-data Analysis

The tools exposed through the Web Portal will permit simple exploration, subsetting, and visualization of LSST data. They may not, however, be suitable for more complex data selection or analysis tasks.

To enable that next level of next-to-the-data work, we plan to let users launch their own Jupyter notebooks on our computing resources at the DAC. These will have fast access to the LSST database and files. They will come with commonly used and useful tools preinstalled (e.g., AstroPy, the LSST data processing software stack).

This service is similar in nature to efforts such as SciServer at JHU, or the JupyterHub deployment for DES at NCSA.

JupyterLab: Next-to-the-data Analysis

YouTube demo of the LSST JupyterLab Aspect: http://ls.st/bgt

Web APIs: Integrating With Existing Tools

Backend Platform services (such as access to databases, images, and other files) will be exposed through machine-accessible web APIs.

We have a preference for industry-standard and/or VO APIs (e.g., WebDAV, TAP, SIA, etc.); the goal is to support what’s broadly accepted within the community. This will allow the discoverability of LSST data products from within the Virtual Observatory, federation of the LSST dataset with other archives, and the use of widely utilized tools (e.g., TOPCAT or others).
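As an illustration of what machine access through a VO API could look like, the sketch below assembles a synchronous TAP query URL using only the Python standard library. The base URL, table name, and column names are hypothetical; the `REQUEST`, `LANG`, `FORMAT`, and `QUERY` parameters follow the IVOA TAP 1.0 convention for the `/sync` endpoint, and a given service may expect a different variant. No network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical service location; not a real LSST endpoint.
TAP_BASE_URL = "https://dac.example.org/tap"


def build_tap_sync_url(base_url, adql, fmt="votable"):
    """Build the URL for a synchronous TAP query (TAP 1.0 style parameters)."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": fmt,
        "QUERY": adql,
    }
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"


# ADQL cone search; table and column names are illustrative assumptions.
adql = (
    "SELECT TOP 10 object_id, ra, dec FROM object "
    "WHERE CONTAINS(POINT('ICRS', ra, dec), "
    "CIRCLE('ICRS', 150.1, 2.2, 0.05)) = 1"
)

url = build_tap_sync_url(TAP_BASE_URL, adql)
print(url)

# Decoding the query string recovers the ADQL text unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["QUERY"][0] == adql)
```

Because the request is just a standard URL, the same query could in principle be issued from TOPCAT, a notebook, or a shell script, which is the practical benefit of choosing community-standard interfaces.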

Computing, Storage, and Database Resources

Computing, file storage, and personal databases (the “user workspace”) will be made available to support the work via the Portal and within the Notebooks.

An important feature is that no matter how the user accesses the DAC (Portal, Notebook, or VO APIs) they always “see” the same workspace.

How big is the “LSST Science Cloud” (@DR2)?

− Computing:
  • ~2,400 cores
  • ~18 TFLOPs

− File storage:
  • ~4 PB

− Database storage:
  • ~3 PB

This is shared by all users. We’re estimating the number of potential DAC users not to exceed 7,500 (relevant for file and database storage).

Not all users will be accessing the computing cluster concurrently. We are estimating on the order of 100 concurrent users.

Though this is a relatively small cluster by 2020-era standards, it will be sufficient to enable preliminary end-user science analyses (working on catalogs, a smaller number of images) and creation of some added-value (Level 3) data products.

Think of this as having your own server with a few TB of disk and database storage, right next to the LSST data, with a chance to use tens to hundreds of cores for analysis.

For larger endeavors (e.g., pixel-level reprocessing of the entire LSST dataset), the users will want to use resources beyond the LSST DAC (more later).

These resources will be made available to the users of the U.S. Data Access Center.

All DAC users will begin with some default (small) allocation, with more likely to be made available via a (TBD) proposal process.
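The sizing figures above can be turned into rough per-user numbers. The sketch below assumes a purely equal split, which the slide does not state: storage is divided among the 7,500-user ceiling and cores among the ~100 concurrent users. Decimal units (1 PB = 1,000 TB) are also an assumption, and the names are illustrative.

```python
# Approximate DR2-era figures quoted on the slide.
CORES = 2400
TFLOPS = 18
FILE_STORAGE_PB = 4
DB_STORAGE_PB = 3
MAX_USERS = 7500         # upper bound on potential DAC users
CONCURRENT_USERS = 100   # order-of-magnitude estimate of simultaneous users

TB_PER_PB = 1000  # decimal units (assumption)

# Equal-split estimates; real allocations would follow the (TBD) policy.
file_tb_per_user = FILE_STORAGE_PB * TB_PER_PB / MAX_USERS
db_tb_per_user = DB_STORAGE_PB * TB_PER_PB / MAX_USERS
cores_per_concurrent_user = CORES / CONCURRENT_USERS
gflops_per_core = TFLOPS * 1000 / CORES

print(f"file storage per user : {file_tb_per_user:.2f} TB")
print(f"database per user     : {db_tb_per_user:.2f} TB")
print(f"cores per active user : {cores_per_concurrent_user:.0f}")
print(f"GFLOPs per core       : {gflops_per_core:.1f}")
```

An equal split at the 7,500-user ceiling gives well under 1 TB each, so the “few TB” picture presumably reflects fewer active users or uneven allocations; that reading is an inference, not something the slide states. The core count per concurrent user does land in the “tens of cores” range.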

Level 3: Added-value Data Products

− Level 3 Data Products: Added-value products created by the community

− These may enable science use cases not fully covered by what we’ll generate in Level 1 and 2:
  • Custom processing of deep drilling fields
  • SNe photometry (e.g., CFHT-LS type forward modeling)
  • Extremely crowded field photometry (e.g., globular clusters)
  • Characterization of diffuse structures (e.g., ISM)
  • Custom measurement algorithms
  • Catalogs of SNe light echoes

− The user computing and storage present in the LSST Science Platform are meant to enable next-to-the-data realization of use cases like the ones above.

− Level 3 software/data products may be migrated to Level 2 (with owners’ permission); this is one of the ways Level 2 products will evolve.

How we (think) we will work with LSST data?

[Diagram: the LSST Science Platform, with the Portal, JupyterLab, and Web APIs aspects over Data Releases, Alert Streams, User Databases, User Files, User Computing, and Software Tools, reached by LSST Users over the Internet.]

Mapping the Platform to Model Users

− Most users are likely to begin with the Web Portal, to become familiar with the LSST dataset and query smaller subsets of data for “at home” analysis. Some may use the tools they’re accustomed to (e.g., TOPCAT, Aladin, AstroPy, etc.) to grab the data using LSST’s VO-compatible APIs.

− Some users may choose to continue their analysis by utilizing resources available to them at the DAC. They’ll access these through Jupyter notebook-type remote interfaces, with access to a mid-sized computing cluster. It’s quite possible that a large fraction of end-user (“single PI”) science may be achievable this way.

− Users who need larger resources may be able to apply for them at adjacent computing facilities. For example, U.S. computing is located in the National Petascale Computing Facility (NPCF) at the National Center for Supercomputing Applications (NCSA). Significant additional supercomputing is expected to be available at the same site (e.g., NPCF currently hosts the Blue Waters supercomputer).

− Finally, rights-holders may utilize their own computing facilities to support larger-scale processing or even put up their own Data Access Centers. Since our software (pipelines, middleware, databases) is open source, they may re-use it to the extent possible.

Putting it all together: the LSST Science Platform

[Diagram: the LSST Science Platform, with the Portal, JupyterLab, and Web APIs aspects over Data Releases, Alert Streams, User Databases, User Files, User Computing, and Software Tools, reached by LSST Users over the Internet.]

For more details, see the “LSST Science Platform Vision Document”, http://ls.st/lsp
