desy cloud, the scientific data cloud - dcache · june 1, 2016 , frankfurt, patrick fuhrmann et al....

Post on 06-Jul-2018

INDIGO DataCloud

DESY Cloud, The Scientific Data Cloud
Managed Shared Storage
At the “ownCloud Connects Business” workshop

Dr. Patrick Fuhrmann, Quirin Buchholz, Tigran Mkrtchyan, Peter van der Reest, Lusine Yakovleva

June 1, 2016, Frankfurt, Patrick Fuhrmann et al. The Scientific Data Cloud @ ownCloud Connects Business

Content

• Storage @ DESY?
• Sync’n Share at DESY
  • Motivation
  • Requirements
  • Implementation
  • Setup
• Requirements from science communities.
• dCache for Dummies.
• The ownCloud – dCache hybrid system
• Summary and outlook.

Storage@DESY

• Petra III [Tier 0] (2012 …)
  • Synchrotron radiation
  • 14 beamlines
  • Beamline guest scientists
  • 1 PB/year – 5 PB/year
• European XFEL [Tier 0] (2017 …)
  • 3.4 km (linear)
  • 2017 (first beamline)
  • Beamline guest scientists
  • 10 – 100 PB/year
• HERA [Tier 0] (1992 – 2007)
  • Particle accelerator (proton – electron)
  • 6.3 km (ring)
  • Some hundred scientists
  • 5 PB in total
• LCG [WLCG Tier 2] (2008, 2009 …)
  • Particle accelerator (proton – proton)
  • 26.7 km (ring)
  • About 10,000 scientists
  • 15 PB/year

(Figure: timeline from 1992 to 2020, reaching 100 PBytes.)

More storage at DESY

• The DESY data management team has quite some experience in managing huge amounts of data.
• In collaboration with other ‘big data’ sites, we are providing a data management system, ‘dCache’, deployed at 70 sites around the world.
  • See later.
• So, why are we running ownCloud?

Motivation

• DESY has no experience in sophisticated data sharing.
  • Data sharing was done in the traditional way, with ACLs and ‘group’ directories.
• However: young scientists start their careers at universities and labs with Sync’n Share in their blood (Dropbox generation).
• Public IT departments, for a very long time, didn’t regard Sync’n Share as being their problem, as many commercial solutions were around.
• It essentially became an issue after Snowden.
  • Legal requirement: data had to be stored ‘on site’ or at least in Germany.
• Consequence: CC needed to provide Sync’n Share-like mechanisms.

Requirements

• Fine-grained sharing of files and directories with individuals and groups.
• Sharing via intuitive Web 2.0 mechanisms (apps or browser)
• Sharing with ‘the public’, with or without password protection
• Sharing of space to upload data (protected)
• Expiration of shares
• Automatic bidirectional synchronization of data between mobile devices and the central repository.
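
Several of these requirements (public sharing, password protection, upload space, expiration) map onto fields of a single share request. The sketch below only builds such a request body and sends nothing. The field names (`path`, `shareType`, `password`, `publicUpload`, `expireDate`) and the share-type numbers follow the ownCloud OCS Share API as commonly documented; treat them as assumptions and check them against the ownCloud version in use. The path `/beamtime/run42` is hypothetical.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Share types as assumed for the OCS Share API: 0 = user, 1 = group, 3 = public link.
SHARE_USER, SHARE_GROUP, SHARE_PUBLIC_LINK = 0, 1, 3


def public_share_form(path, password=None, allow_upload=False,
                      valid_days=None, today=None):
    """Build the form fields for a public-link share (nothing is sent)."""
    fields = {"path": path, "shareType": SHARE_PUBLIC_LINK}
    if password is not None:
        fields["password"] = password          # password-protected link
    if allow_upload:
        fields["publicUpload"] = "true"        # shared space for uploads
    if valid_days is not None:
        today = today or date.today()
        # Expiration of the share, as an ISO date.
        fields["expireDate"] = (today + timedelta(days=valid_days)).isoformat()
    return fields


form = public_share_form("/beamtime/run42", password="s3cret",
                         allow_upload=True, valid_days=14,
                         today=date(2016, 6, 1))
encoded = urlencode(form)  # body for an HTTP POST to the share endpoint
```

Keeping the request construction separate from the HTTP call makes the share policy (who, how long, with or without upload) easy to inspect before anything reaches the server.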

Typical Application

(Figure: devices sync with “Your Cloud Space”; files can also be uploaded and downloaded.)

Steps taken by DESY

• Evaluated possible solutions in 2013.
• Decided to go for ownCloud
  • Provides most of the features needed.
  • Open source
  • Was in use by many institutes and universities in Germany
  • Used by colleagues at SURFsara (Amsterdam) and CERN
• Evaluation showed:
  • Very good Sync’n Share feature set
  • Very good in planning ahead (roadmap)
  • Plans for cross-site federated access (now in place).
  • A bit weak in data management
• Started prototype installation at DESY at the beginning of 2014

What should the DESY setup look like?

(What it will actually look like in July)

The Infrastructure

Authentication (Kerberos)

User management (Registry, LDAP)

Monitoring

Local and wide area network (load balancing, firewalls)

Virtualization

Accounting

Unlimited persistent storage

Infrastructure Integration

(Figure labels: Postgres DB; four OwnCloud instances; F5 load balancer; automatic failover.)

More Integration

(Figure labels: DESY Kerberos; DESY LDAP; OwnCloud; unlimited central storage; data life cycle engine.)

Horizontally Scaling Backend

(Figure labels: web load balancer (F5); four OwnCloud instances; NFS 4.1/pNFS; six pool nodes; 200 TBytes RAID 6 storage units.)

Some Statistics

Users total        490
Users active       277
Space available    567 TBytes
Space used         2 * 30 TBytes
Files              10 million

(Chart: files in/out per hour over 7 days; axis from 10,000 to 70,000.)

Current default quality: two replicas on different storage nodes.
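
A quick back-of-the-envelope reading of these numbers, as a sketch only. It assumes decimal terabytes, that “2 * 30 TBytes” means 30 TBytes of user data stored twice, that the 10 million files are counted once (not per replica), and that the 567 TBytes refer to raw capacity; the slide states none of this explicitly.

```python
TB = 10**12  # decimal terabyte (assumption: the slide does not say TB vs TiB)

logical_used = 30 * TB      # user data, counted once
replicas = 2                # current default: two replicas
files = 10_000_000          # assumed to be logical files
available = 567 * TB        # assumed to be raw capacity

physical_used = logical_used * replicas          # bytes actually occupied
avg_file_size_mb = logical_used / files / 10**6  # mean file size in MB
fill_fraction = physical_used / available        # share of raw capacity in use
```

Under these assumptions the average file is a few megabytes and roughly a tenth of the raw capacity is occupied, which fits a Sync’n Share workload of documents rather than raw detector data.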

Is that sufficient for scientists?

Typical Workflow

(Figure labels: Raw; Derived; Publication; Sharing.)

Data Categories

Amount         Category       Typical application
1 – 100 PB     Raw            LHC detector data, raw X-ray images, brain scans
10 – 100 TB    Derived        Reconstructed (Ntuples), purified images, brain maps
1 TB           Publication    Papers, presentations, histograms

What do we need to support ‘science workflows’?

More Requirements

• Storage must be manageable: defined QoS and data lifecycle
• Different types of data must have different QoS attached, regarding access latency (performance) and data durability (how safe is my data?)
  • Spinning disk for streaming
  • SSD for fast random access
  • Tape for archive
  • Multiple copies in different locations on different media for long-term data preservation
• Moving data between different QoS types has to be performed
  • w/o service interruption
  • transparently to the user
  • w/o changes in the namespace

Quality of Service

Raw: long-term preservation (legal requirement)

Derived: SSD, low latency (HPC, analysis)

Publication: SSD, fast multi-stream access
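
One way to picture this mapping is as a small policy table from data category to media and goal, plus a helper that works out which media a QoS transition has to add and drop. This is an illustrative sketch, not dCache code. The class and function names are hypothetical, and placing raw data on tape is an assumption, since the slide names a medium only for derived and publication data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSClass:
    media: tuple  # media that must hold a copy of the file
    goal: str     # what this class is meant to guarantee


QOS_BY_CATEGORY = {
    # "tape" for raw data is an assumption; the slide only states the goal.
    "raw": QoSClass(("tape",), "long-term preservation (legal requirement)"),
    "derived": QoSClass(("ssd",), "low latency (HPC, analysis)"),
    "publication": QoSClass(("ssd",), "fast multi-stream access"),
}


def transition(current, target):
    """Return (media to add, media to drop) for a change of QoS class.

    The file path is not an argument: a transition must not change
    the namespace, only where the data lives.
    """
    add = [m for m in target.media if m not in current.media]
    drop = [m for m in current.media if m not in target.media]
    return add, drop


to_add, to_drop = transition(QOS_BY_CATEGORY["raw"], QOS_BY_CATEGORY["derived"])
```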

Even more Requirements

• Different access protocols for different applications
  • POSIX mounted FS (NFS 4.1/pNFS) for fast analysis
  • FTP dialects (GridFTP) for wide area transfers with Globus, WLCG-FTS
  • http/WebDAV, mostly for browser-based applications, visualization, ...
• Different authentication mechanisms must be available.
  • Username/password for web applications
  • SAML to support traditional IdPs
  • OpenID Connect for Google/Facebook-like IdPs
  • Certificates for https or Grid applications
• Different credentials must be mappable to the same identity.
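
The last requirement can be sketched as a lookup from (mechanism, principal) pairs to one internal identity, so that a file written over NFS with a Kerberos ticket is owned by the same user who later shares it through the web with a password. Everything below is a hypothetical illustration: the table, the principals and the function name are invented for the example and do not describe how dCache or ownCloud store these mappings.

```python
# Hypothetical mapping table: several credentials, one identity.
IDENTITY_MAP = {
    ("password", "patrick"): "patrick",
    ("kerberos", "patrick@EXAMPLE.ORG"): "patrick",
    ("oidc", "https://idp.example.org#12345"): "patrick",
    ("x509", "/C=DE/O=Example/CN=Patrick Example"): "patrick",
}


def resolve_identity(mechanism, principal):
    """Map an authenticated credential to the internal user name."""
    try:
        return IDENTITY_MAP[(mechanism, principal)]
    except KeyError:
        raise PermissionError(f"unknown {mechanism} credential") from None


# All four credentials resolve to the same identity.
identities = {resolve_identity(m, p) for (m, p) in IDENTITY_MAP}
```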

Scientific Data Cloud

High-speed data ingest

Fast analysis: NFS 4.1/pNFS

Wide area transfers (Globus Online, FTS) by GridFTP

Sync’ing and sharing with OwnCloud

What would that look like from the user’s perspective?

My DESY XXL Home: QoS support

Patrick’s home

My DESY XXL Home: Protocol Support

Multi-protocol: NFS 4.1/pNFS, GridFTP, WebDAV, SRM

My ownCloud Home: Sync, Share

Web 2.0: ownCloud

How do we achieve those goals?

OR

Choosing dCache as the storage backend for ownCloud!

The scientific data cloud

Side Track

What’s dCache?

dCache in a nutshell (cont.)

• Started 2000
• International collaboration (DESY, Fermilab, NDGF)
• About 10 members: developers, deployment, support, management
• Software deployed at about 70 sites in Europe, US, Asia, Russia
• Largest deployments in the order of 20 PBytes on tape and disk.
• Total storage close to 200 PBytes.
• Geographically largest installation spans 4 countries.
• Largely funded by INDIGO-DataCloud, DESY, Fermilab and NDGF

INDIGO DataCloud

dCache Design

(Figure labels: protocol and authentication engines (GridFTP, NFS/pNFS, http, WebDAV); virtual file-system namespace layer; media transfer engine and pool management with automatic and manual media transition; media: SSDs, spinning disks, tape, Blu-ray …)

Namespace Design

(Figure labels: namespace; location manager; a name pointing to Disk 1, Disk 2 and Tape 1; physical storage: disk, tape, external system.)

Design Consequence

• Files are stored as objects on various data back-ends
  • Random-access devices: hard disk, SSD
  • Removable media: tape
  • Object stores: Ceph
• Back-ends can be highly distributed (even beyond countries).
• The file namespace engine is independent of the data storage itself.
• Internal and external services can move data around w/o service interruption.
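
The separation of namespace and storage can be illustrated with a toy model: the namespace keeps, per path, the set of locations holding the data, and a migration changes only that set. This is a minimal sketch under simplifying assumptions (in-memory, single process, no real data movement); the class and method names are hypothetical and are not dCache interfaces.

```python
class Namespace:
    """Maps a file path to the set of storage locations that hold its data."""

    def __init__(self):
        self._locations = {}

    def store(self, path, location):
        self._locations.setdefault(path, set()).add(location)

    def migrate(self, path, source, target):
        """Move one copy; the path itself never changes."""
        locations = self._locations[path]
        if source not in locations:
            raise ValueError(f"{path} has no copy on {source}")
        locations.add(target)      # the new copy becomes visible first
        locations.discard(source)  # the old copy is dropped afterwards

    def locations(self, path):
        return sorted(self._locations[path])


ns = Namespace()
ns.store("/home/patrick/run42.dat", "disk1")
ns.store("/home/patrick/run42.dat", "disk2")
ns.migrate("/home/patrick/run42.dat", "disk1", "tape1")
```

Because the target is registered before the source is removed, a reader always finds at least one valid location, which is the property that allows data to move without service interruption.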

dCache features supporting our idea of a scientific data cloud

• Multi-protocol support (transfer and authentication)
  • Transfer protocols: NFS/pNFS, http, WebDAV
  • Multi authentication credential support (OpenID Connect, Kerberos, passwd)
• Sophisticated data management
  • Multi-media support (tape, spinning disk, SSD, …)
  • Automatic and manual media transitions
  • Adding and removing data nodes w/o service interruption
  • Automatic replica management
    • Enforces n < x < m copies of data files.
  • External storage support (e.g. tape systems: TSM, HPSS, OSM, DMF)
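
The replica rule can be read as a simple control decision: compare the current number of copies with the allowed range and create or remove copies accordingly. The function below is a sketch of that decision only. Its name is hypothetical, and it treats the bounds as inclusive, which is an assumption; the slide writes the condition as n < x < m.

```python
def replica_adjustment(current, minimum, maximum):
    """Return how many copies to create (> 0) or remove (< 0).

    Assumes inclusive bounds: minimum <= copies <= maximum.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if current < minimum:
        return minimum - current   # too few copies: replicate
    if current > maximum:
        return maximum - current   # too many copies: remove the surplus
    return 0                       # within range: nothing to do


# A pool node failed and one of two required replicas is gone.
needed = replica_adjustment(current=1, minimum=2, maximum=3)
```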

In particular: the QoS interface

dCache QoS Interfaces

(Figure labels: web service; CDMI service; cloud; RESTful interface; dCache QoS module.)

The QoS Web Interface

DISK TAPE

Click to get the file back from tape.
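
Behind such a click sits a REST call that asks the QoS module to change where a file lives. The sketch below builds such a request without sending it. The URL layout (`/api/v1/namespace/<path>`), the JSON body (`action`, `target`), the target name `disk+tape`, the host and the port are modeled on later dCache frontend documentation and are assumptions here; the interface available at the time of the talk may have differed.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def qos_transition_request(base_url, path, target):
    """Build (but do not send) a request asking for a QoS transition."""
    url = f"{base_url.rstrip('/')}/api/v1/namespace/{quote(path.lstrip('/'))}"
    body = json.dumps({"action": "qos", "target": target}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
    )


# Hypothetical host and file; "disk+tape" would bring a tape-only file
# back to disk while keeping the tape copy.
req = qos_transition_request("https://dcache.example.org:3880",
                             "/home/patrick/run42.dat", "disk+tape")
```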

Putting the pieces together

The Data Path

(Figure labels: web load balancer (F5); four OwnCloud instances; NFS 4.1/pNFS; dCache; spinning disks, SSDs, tape.)

Future Work: The Namespace Path

(Figure labels: ownCloud namespace; dCache namespace; sharing DB; share API; namespace proxy.)

dCache – OwnCloud hybrid

• Data path is the easiest part. Works nicely.
• Namespace synchronization is/was very difficult
  • Important to let all protocols see a synchronized namespace.
  • ownCloud didn’t expect the underlying storage system to change the namespace tree.
  • Manually triggered synchronization took too long.
  • OwnCloud 9 provides a first attempt at an API for external namespaces.
• Exposing ‘shares’ to external components is not yet in ownCloud.
  • Important to allow all protocols to use ownCloud-defined shares.
  • Prerequisites:
    • ownCloud: needs an API to expose ‘shares’
    • dCache: needs to have a ‘share’ object implemented.
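
The namespace problem can be made concrete with a small comparison: ownCloud keeps its own index of files, while other protocols (NFS, GridFTP) add and remove files directly in the storage. The sketch below computes which paths the index is missing and which it still lists although they are gone. It is an illustration of the synchronization task only; the function name and the example paths are hypothetical. ownCloud’s `occ files:scan` command performs a manual rescan of this kind, and walking the whole tree is what made manually triggered synchronization slow.

```python
def namespace_diff(indexed, actual):
    """Compare ownCloud's file index with the paths present in storage."""
    indexed, actual = set(indexed), set(actual)
    return {
        "added": sorted(actual - indexed),    # written via another protocol
        "removed": sorted(indexed - actual),  # deleted behind ownCloud's back
    }


diff = namespace_diff(
    indexed=["/patrick/paper.pdf", "/patrick/old.dat"],
    actual=["/patrick/paper.pdf", "/patrick/run42.dat"],
)
```

An API for external namespace changes avoids the full comparison: the storage system reports only the paths that changed.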

ownCloud and QoS

(Figure labels: ownCloud web GUI; QoS plugin (server-side app); share API; I/O (NFS); dCache namespace API; QoS module; REST services.)

Summary

• An OwnCloud – dCache hybrid is a perfect system for providing managed shared storage to scientists.
  • Sync’n Share is provided by ownCloud.
  • Access protocols and authentication mechanisms used in science are provided by dCache.
  • Unlimited storage spaces (via removable media, e.g. tape)
  • Quality of Service support
    • Automatic and manual media transitions
    • Automatic replica management, resulting in high availability and data durability.
  • Reduced downtimes due to transparent data migration.

Outlook

• The current version of the ownCloud-dCache hybrid satisfies the need for
  • Sync’n Share
  • Highly scalable and manageable back-end storage
• For a full integration
  • The namespaces of the two systems need to be synchronized (OC9)
  • The ownCloud ‘shares’ need to be exposed to make them visible in all protocols (NFS, GridFTP, …)
  • We need to provide an ownCloud plugin (server-side app) to make the dCache QoS storage types visible in ownCloud.

The END

Further reading: www.dCache.org
