Transcript
Page 1: ARTEMISA Experience

ARTEMISA Experience

Roberto Ruiz, Judita Mamužić

IFIC / CSIC - University of Valencia 20 December 2019

Page 2: ARTEMISA Experience

ARTEMISA Experience, 20 December 2019, Valencia

ARTEMISA: CPU


TWiki: https://twiki.ific.uv.es/twiki/bin/view/Artemisa/UsageGuide
user area ~100 GB: /lhome/ific/<initial_letter>/<username>
group area ~5 TB: /lustre/ific.uv.es/ml/<group_name>

• Successfully followed instructions from TWiki
• Initial setup worked without problems from any location
• Jobs which have large output while running have to be run in the group area (even increased, on demand)
• Jobs with smaller run-time output can be run in the user area
• 100 jobs in 4 days complete without errors (!), multiple times (!)
• Jobs using pre-compiled libraries from cvmfs had an interruption for a few jobs while running for 10 days (not a large fraction, but need to develop scripts to locate them and re-submit)
• ARTEMISA setup really really fast and reliable (running, copy, cvmfs great, small UFO delays locally)
• Good log and error output, very important for debugging of pilot jobs
• Very good experience with running the jobs, and support from Javier (!) for tailoring the job submission
• Condor queue system good for a small number of users, for a higher number of users an upgrade is probably needed
• Looking forward to ARTEMISA upgrade, for a higher number of users higher capacities are needed
• Generated 𝒪(100 TB) of MC samples for machine learning tasks, will be used for GPU jobs
• Require higher storage space (perhaps this job is extreme)
• Many further applications in the pipeline
• Super super happy with ARTEMISA, managed calculations like never before!
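The slide notes that scripts are still needed to locate interrupted jobs and re-submit them. A minimal sketch of one possible approach follows, assuming each job writes a single output file named after its index. The file pattern `job_{index}.root`, the function names, and the resubmission list file are hypothetical and not taken from the slides.

```python
from pathlib import Path


def find_incomplete_jobs(output_dir, n_jobs, pattern="job_{index}.root", min_bytes=1):
    """Return the indices of jobs whose output file is missing or too small.

    The naming pattern is an assumption for illustration only.
    """
    output_dir = Path(output_dir)
    incomplete = []
    for index in range(n_jobs):
        out_file = output_dir / pattern.format(index=index)
        # A job counts as incomplete if its output is absent or below min_bytes.
        if not out_file.is_file() or out_file.stat().st_size < min_bytes:
            incomplete.append(index)
    return incomplete


def write_resubmit_list(indices, list_path):
    """Write one job index per line, to be consumed by a resubmission step."""
    Path(list_path).write_text("".join(f"{i}\n" for i in indices))
```

The resulting list could then drive a second submission, for instance through the `queue ... from <file>` form of an HTCondor submit file. Checking for output files is only a proxy for job success; inspecting the job logs would be a stricter test.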

Page 3: ARTEMISA Experience


ARTEMISA: GPU


TWiki: https://twiki.ific.uv.es/twiki/bin/view/Artemisa/UsageGuide
user area ~100 GB: /lhome/ific/<initial_letter>/<username>
group area ~5 TB: /lustre/ific.uv.es/ml/<group_name>

• Successfully followed instructions from TWiki
• Initial setup worked without problems from any location
• Usage of standard machine learning tools (Scikit-learn, Keras, TensorFlow)
• Installed standard libraries locally, export location in condor job
• Run single GPU per condor job
• Hardware is state of the art, very fast!
• No issues with GPU usage!
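The two points about exporting the location of locally installed libraries and running a single GPU per condor job could look roughly like the sketch below. It generates an HTCondor submit description as text. The function name, the job name, and the choice of `PYTHONPATH` as the exported variable are assumptions for illustration; the slides do not say which variable was exported or whether it was set in the submit file or in the job script.

```python
def build_submit_description(executable, lib_dir, job_name="train"):
    """Return HTCondor submit-description text for a job requesting one GPU.

    lib_dir is the (hypothetical) directory holding the locally installed
    libraries; it is passed to the job through the environment.
    """
    lines = [
        f"executable   = {executable}",
        # Assumption: the local libraries are Python packages found via PYTHONPATH.
        f'environment  = "PYTHONPATH={lib_dir}"',
        # One GPU per job, as described on the slide.
        "request_gpus = 1",
        f"output       = {job_name}.$(ClusterId).$(ProcId).out",
        f"error        = {job_name}.$(ClusterId).$(ProcId).err",
        f"log          = {job_name}.$(ClusterId).log",
        "queue",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing that file to `condor_submit` would submit one such job. Any site-specific settings should be taken from the ARTEMISA TWiki rather than from this sketch.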

