Ganga: an interface to the LHC computing grid
DESCRIPTION
Ganga is a tool designed and used by the large particle physics experiments at CERN. Written in pure Python, it delivers a clean, usable interface that lets thousands of physicists interact with the huge computing resources available to them. Video at https://www.youtube.com/watch?v=SSdluuVNU3Y
TRANSCRIPT
Ganga: An interface to the LHC computing grid
Matt Williams, University of Birmingham
CERN and the LHC
● Largest particle physics experiment in the world
● 27 km in circumference
● Over 100 m underground
● Thousands of physicists
● 100s of petabytes of data
The Grid
GANGA
● ~2001: LHCb started GANGA as an in-house tool
– Specific to our needs
● By 2010, when the LHC turned on, it was used by many more
– ATLAS, NA62, T2K and many smaller experiments
● Python had always been the obvious choice
– Used everywhere in particle physics (along with C++)
– Easy to create new plugins for experiments
● Can be scripted or used through an IPython-based interactive console
● Open source, released under the GPL (like most CERN software)
How is it used
j = Job(name = 'Example job')
j.application = Executable()
j.application.exe = File('test.sh')
j.outputfiles = [LocalFile('out.txt')]
j.backend = Local()
j.submit()
Retrieving results
In [1]: j.peek()
total 200
-rw-r--r-- 1 phrfbi lhcb 0 Jun 22 2013 __syslog__
-rw-r--r-- 1 phrfbi lhcb 141999 Jun 22 2013 stdout
-rw-r--r-- 1 phrfbi lhcb 53671 Jun 22 2013 stderr
-rw-r--r-- 1 phrfbi lhcb 2463 Jun 22 2013 out.txt
-rw-r--r-- 1 phrfbi lhcb 135 Jun 22 2013 __jobstatus__
In [2]: j.peek('out.txt')
Using the Grid
Just change the backend from Local() to LCG()
Other backends include Interactive, PBS, LSF, SGE, Panda, Jedi, Dirac, Condor, ARC, CREAM...
Input data and splitting
j = Job(name = 'Input splitter', backend = LCG())
j.application = Executable()
j.application.exe = File('analyse_data')
j.inputfiles = [LocalFile(f.strip()) for f in open('inputs.txt')]
j.splitter = SplitByFiles(filesPerJob = 10)
j.outputfiles = [LocalFile('histogram.root')]
j.submit()
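As a rough illustration of what splitting by files means, the plain-Python sketch below groups a list of input file names into chunks of at most ten, one chunk per subjob. It is not Ganga's implementation of SplitByFiles; the helper name chunk_files and the file names are hypothetical.

```python
from typing import Iterator, List, Sequence


def chunk_files(files: Sequence[str], files_per_job: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most files_per_job file names."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    for start in range(0, len(files), files_per_job):
        yield list(files[start:start + files_per_job])


# Hypothetical input list: 25 file names, as might be read from inputs.txt.
input_files = [f"data_{i:03d}.root" for i in range(25)]
chunks = list(chunk_files(input_files, files_per_job=10))
print([len(c) for c in chunks])  # [10, 10, 5]
```

Each chunk would correspond to the input files of one subjob; the last chunk is simply shorter when the total is not a multiple of ten.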
Mergers
j = Job(name = 'Merger', backend = LCG())
j.application = Executable()
j.application.exe = File('analyse_data')
j.inputfiles = [LocalFile(f.strip()) for f in open('inputs.txt')]
j.splitter = SplitByFiles(filesPerJob = 10)
j.outputfiles = [LocalFile('histogram.root')]
j.merger = RootMerger(files = ['histogram.root'])
j.submit()
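Conceptually, merging histograms from many subjobs means adding up the contents of matching bins. The sketch below shows that idea in plain Python with lists of bin counts; it is a stand-in for illustration only, not how RootMerger handles ROOT files, and the function name and numbers are hypothetical.

```python
from typing import Iterable, List


def merge_histograms(histograms: Iterable[List[int]]) -> List[int]:
    """Add bin contents across histograms that share the same binning."""
    histograms = list(histograms)
    if not histograms:
        return []
    n_bins = len(histograms[0])
    if any(len(h) != n_bins for h in histograms):
        raise ValueError("all histograms must have the same number of bins")
    # zip(*histograms) groups the i-th bin of every histogram together.
    return [sum(bins) for bins in zip(*histograms)]


# Hypothetical bin counts from three subjobs.
subjob_histograms = [[1, 4, 2], [0, 3, 5], [2, 2, 2]]
merged = merge_histograms(subjob_histograms)
print(merged)  # [3, 9, 9]
```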
Job catalogue
In [1]: jobs
Out [1]:
fqid | status    | name           | subjobs | application | backend
-------------------------------------------------------------------
0    | completed | Example job    |         | Executable  | Local
1    | running   | Input splitter | 324     | Executable  | LCG
2    | running   | Merger         | 324     | Executable  | LCG
Full API access
In [2]: jobs(2).status
Out [2]: running
In [3]: len([j for j in jobs(2).subjobs if j.status == 'completed'])
Out [3]: 24
In [4]: for subjob in jobs(2).subjobs:
   ...:     if subjob.status == 'failed':
   ...:         subjob.resubmit()
Custom functions defined in ~/.ganga.py are available at runtime
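For instance, the resubmission loop on this slide could be wrapped in a helper and kept in ~/.ganga.py. The sketch below is only one way this might be written: the function names are hypothetical, and the code relies solely on the subjobs, status and resubmit() members shown on this slide.

```python
from collections import Counter


def status_summary(job):
    """Count the subjobs of a job by status, e.g. {'completed': 24, ...}."""
    return Counter(subjob.status for subjob in job.subjobs)


def resubmit_failed(job):
    """Resubmit every failed subjob of a job and return how many were resubmitted."""
    # Collect first, then resubmit, so status changes do not affect the selection.
    failed = [subjob for subjob in job.subjobs if subjob.status == 'failed']
    for subjob in failed:
        subjob.resubmit()
    return len(failed)
```

In a Ganga session these would then be called as status_summary(jobs(2)) or resubmit_failed(jobs(2)).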
Dealing with large files
j = Job(name = 'Large output', backend = Dirac())
j.application = Executable()
j.application.exe = File('analyse_data')
j.inputfiles = [DiracFile('input.root')]
j.outputfiles = [DiracFile('histogram.root')]
j.submit()
Find more at cern.ch/ganga
Download code from cern.ch/ganga/download/
Thank you