The Virtual Data Toolkit
Todd Tannenbaum (Alain Roy)
What is the VDT?
• A packaging of software
  – Grid software (Globus, Condor-G, …)
  – Virtual data software (Chimera)
  – Utilities
• An easy installation mechanism
• Testing and hardening
• Support
Who makes the VDT?
• Grid Physics Network (GriPhyN)
  – Constructs the VDT
• International Virtual Data Grid Laboratory (iVDGL)
  – Testing and hardening
Very tight collaboration between GriPhyN and iVDGL
Who makes the VDT? (2)
• Core VDT Team:
  – Miron Livny: The boss
  – Alain Roy
  – Carey Kireyev
• VDT Testing
  – Xin Zhao
  – Brian Moe
• Pacman
  – Saul Youssef
Who uses the VDT?
• GriPhyN collaborators
  – USCMS: In use today
  – USAtlas: In use today
  – LIGO: Will use soon
  – SDSS: Will use soon
• European Data Grid
  – Uses subset of software
  – Uses just RPMs
• LCG
What exactly is in VDT?
• VDT 1.1.8:
  – Globus 2.2.4 + advisories + patches
  – Condor & Condor-G 6.5.1
  – Chimera/Pegasus
  – RLS
  – GLUE Schema
  – CA Certificates
  – Fault Tolerant Shell
  – EDG’s Make Gridmap
  – EDG’s CRL Update
  – ClassAds
  – Netlogger
Grid Software Installation
Typical Grid Software Installation Experience…
VDT Installation Experience!
VDT Installation
• 2 Methods
  – Pacman
  – RPM
Pacman Installation
• Goal:
  – Type a single command
  – Everything downloads
  – Everything installs
  – Everything is configured
  – No questions asked
• We’re close:
  – A few questions if you’re root
  – Basic configuration, may need changing
Pacman Installation (2)
• Download Pacman
  – http://physics.bu.edu/~youssef/pacman/
• Install VDT
  – cd <install-directory>
  – pacman -get VDT-Server
  – pacman -get VDT-Client
  – ls
      condor/  globus/  post-install/  setup.sh
      edg/     gpt/     replica/       vdt/
      ftsh/    perl/    setup.csh      vdt-install.log
• Use
Pacman post-installation
• Post-install directory:
  – Notes on configuration choices made
  – Instructions for editing configuration
• Configuration scripts:
  – Globus configuration
  – Condor configuration
RPM Installation
• Subset of whole VDT
  – Globus
  – Condor-G
• Nice RPMs:
  – We repackage Globus
  – A dozen Globus RPMs, not hundreds
• No configuration
• No post-installation help
Testing
• VDT team is building test suite
• Interaction with LCG testing group
• Working with NMI* to leverage:
  – NMI test suite
    • Stress testing
    • Application testing (CMS pipeline)
  – NMI test infrastructure
* NMI = NSF Middleware Initiative
  – http://www.nsf-middleware.org
Support
• Send us questions or problems
  – We will solve them if we can
  – We will interact with the developers, if necessary
Interaction with EDG
• EDG gets Globus and Condor-G RPMs from VDT
• We do what we can to solve problems and get changes to Globus and Condor
• We want to make a great package for you
Chimera Virtual Data System
• Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures
• The Chimera catalog can be used by application environments to describe a set of application programs ("transformations"), and then track all the data files produced by executing those applications ("derivations").
• Chimera contains the mechanism to locate the "recipe" to produce a given logical file, in the form of an abstract program execution graph. These abstract graphs are then turned into an executable DAG for the Condor-G DAGMan meta-scheduler by the bundled Pegasus planner.
• Enables on-demand execution of computation schedules constructed from database queries.
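The step from recorded derivations to an execution order can be illustrated with the standard `tsort` utility. This is only a sketch under stated assumptions: the file names and the `derivations` and `plan_order` helpers are hypothetical, and it uses neither the Chimera catalog nor the Pegasus planner.

```shell
#!/bin/sh
# Hypothetical derivation records: each line reads "input output", meaning
# that some transformation derives the output file from the input file.
derivations() {
    cat <<'EOF'
raw.dat calibrated.dat
calibrated.dat histogram.dat
EOF
}

# plan_order: print the files in an order where every input is listed
# before the outputs derived from it (a topological sort of the graph).
plan_order() {
    derivations | tsort
}

plan_order
```

A real planner would also bind each abstract step to a site and to physical file locations; the sketch covers only the ordering.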
NetLogger
• “Networked Application Logger”
• API w/ calls you add to existing source code to generate time-stamped monitoring events (sent to a file, network server, syslogd, or RAM)
• Visualization Tools
• Storage and Retrieval Tools
  – Store all events into a database
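The idea of instrumenting existing code with time-stamped events can be sketched in a few lines of shell. This is not the NetLogger API: the `log_event` helper, its arguments, and the line layout are all assumptions made for illustration.

```shell
#!/bin/sh
# log_event FILE EVENT [KEY=VALUE...]
# Hypothetical helper (not the NetLogger API): append one time-stamped
# event line to FILE. The line layout is an illustrative assumption.
log_event() {
    file=$1
    event=$2
    shift 2
    # UTC timestamp in ISO 8601 form.
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    echo "DATE=${stamp} EVNT=${event} $*" >> "$file"
}

# Usage: bracket an operation with start and end events
# (commented out so that sourcing this file has no side effects).
# log_event /tmp/transfer.log transfer.start FILE=data.tar
# cp data.tar /scratch/
# log_event /tmp/transfer.log transfer.end FILE=data.tar STATUS=$?
```

Pairing a start event with an end event is what lets a later tool compute how long each step took.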
Fault Tolerant Shell (FTSH)
• The Grid is a hard environment.
• FTSH
  – The ease of scripting with very precise error semantics.
  – Exception-like structure allows scripts to be both succinct and safe.
  – A focus on timed repetition simplifies the most common form of recovery in a distributed system.
  – A carefully-vetted set of language features limits the "surprises" that haunt system programmers.
Simple Bourne script…
#!/bin/sh
cd /work/foo
rm -rf data
cp -r /fresh/data .
What if ‘/work/foo’ is unavailable??
Getting Grid Ready…
#!/bin/sh
# Try the cd up to three times, pausing after each failed attempt.
ok=no
for attempt in 1 2 3
do
    if cd /work/foo
    then
        ok=yes
        break
    fi
    echo "cd failed, trying again..."
    sleep 5
done
if [ "$ok" != yes ]
then
    echo "couldn't cd, giving up..."
    exit 1
fi
Or with FTSH
#!/usr/bin/ftsh
try 5 times
cd /work/foo
rm -rf bar
cp -r /fresh/data .
end
Or with FTSH
#!/usr/bin/ftsh
try for 3 days or 100 times
cd /work/foo
rm -rf bar
cp -r /fresh/data .
end
Or with FTSH
#!/usr/bin/ftsh
try for 3 days every 1 hour
cd /work/foo
rm -rf bar
cp -r /fresh/data .
end
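For readers without FTSH, the count-limited form of `try` can be approximated in plain POSIX sh. This is a minimal sketch, not FTSH itself: the `retry` and `refresh_data` names and their arguments are hypothetical, and the sketch omits FTSH's overall time limits and process cancellation.

```shell
#!/bin/sh
# retry MAX DELAY CMD [ARGS...]
# Hypothetical helper (not part of FTSH): run CMD up to MAX times,
# sleeping DELAY seconds after each failed attempt except the last.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :
    do
        # Stop at the first success.
        if "$@"
        then
            return 0
        fi
        # Give up once the attempt budget is spent.
        if [ "$attempt" -ge "$max" ]
        then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# The three steps from the slide, chained so they succeed or fail together.
refresh_data() {
    cd /work/foo && rm -rf bar && cp -r /fresh/data .
}

# Usage (commented out so that sourcing this file has no side effects):
# retry 5 60 refresh_data || echo "giving up after 5 attempts"
```

Chaining the steps with `&&` mirrors the way a failure anywhere inside a `try` block causes the whole block to be attempted again.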
Or with FTSH
hosts="mirror1.wisc.edu mirror2.wisc.edu mirror3.wisc.edu"
forany h in ${hosts}
echo "Attempting host ${h}"
wget http://${h}/some-file
end
echo "Got file from ${h}"
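The same "first alternative that works" pattern can be approximated in plain POSIX sh. This is a sketch under stated assumptions: the `first_success` and `fetch_from` helpers are hypothetical names, not FTSH features, and `wget` is assumed to be installed.

```shell
#!/bin/sh
# first_success CMD ITEM...
# Hypothetical helper (not part of FTSH): run CMD once per ITEM, in order,
# and stop at the first ITEM for which CMD succeeds. The winning item is
# left in $chosen, much as the slide reads ${h} after the loop ends.
first_success() {
    cmd=$1
    shift
    chosen=
    for item in "$@"
    do
        echo "Attempting host ${item}"
        if "$cmd" "$item"
        then
            chosen=$item
            return 0
        fi
    done
    return 1
}

# Fetch step from the slide; wget is assumed to be installed.
fetch_from() {
    wget "http://$1/some-file"
}

# Usage (commented out so that sourcing this file does no network access):
# hosts="mirror1.wisc.edu mirror2.wisc.edu mirror3.wisc.edu"
# first_success fetch_from $hosts && echo "Got file from ${chosen}"
```

Unlike the FTSH version, this sketch reports failure through its exit status and leaves `$chosen` empty when no host works.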
FTSH
• All the usual constructs
  – Redirection, loops, conditionals, functions, expressions, nesting, …
• And more
  – Logging
  – Timeouts
  – Process Cancellation
  – Complete parsing at startup
  – File cleanup
• Used on Linux, Solaris, Irix, Cygwin, …
• Simplify your life!
VDT’s Future
• Additional Software
  – MyProxy, Java ClassAds
• Access to new versions
  – Globus 3.0
    • Extra VDT to help early adopters
    • Condor-G will submit to GT2 or GT3
• Helping You
  – What can we do to make life easier for you?
Where do you learn more?
• http://www.griphyn.org/vdt
• Support:
  – [email protected]
  – Alain Roy: [email protected]
  – Miron Livny: [email protected]