condor pools on 02 apr 2004 - university of notre damedthain/papers/condor-map-handout.pdf ·...
TRANSCRIPT
Mapping CondorDouglas Thain, University of Wisconsin
Paradyn/Condor Week, April 2004
Mapping Condor. How can we monitor the impact of asoftware package on the world at large? To answer thisquestion, we have collected a large amount of data on thedeployment of Condor over five years. This handout is afirst attempt to present some of this data. Expect to seemore in the future!
Data Sources. Each deployed Condor pool in the worldperiodically reports back to the University of Wisconsin bysending one email per week and one UDP packet every fif-teen minutes. These messages give high level details suchas the number of machines in each pool, along with theiroperating system and CPU type. Of course, we don’t re-ceive all messages: some are blocked by firewalls, someare lost due to mis-configured email systems, and someare disabled by users. Email messages have been archivedsince January 2000, while UDP messages have only beenarchived since November 2003. The first two pages usedcombined email and UDP data for a current view, while thelast two pages used only email for a historical view.
Mapping Technique. Each point on the map is locatedby way of this public WHOIS database. The domain nameof each pool is used to retrieve a WHOIS record, extract apostal address, and then use a public U.S. postal database toobtain the coordinates of the nearest city. Outside the U.S.,the top-level country domain name is used to identify thehost country, and the point is plotted at the national capital.In both cases, a small random factor is added to providedvisual distinction as well as anonymity.
Tidbits of Data- Before 2003, Condor grew at 48 CPUs/week.- Since 2003, Condor grew at over 250 CPUSs/week.- The largest pools have grown: 500-2000 CPUs.- Five to ten percent of pools exceed 100 CPUs.- One third of known pools notified by email only, onethird by UDP only, and one third both ways.- WinNT machines are far less likely to use email.- 86% of Linux machines gave a valid DNS name.- 56% of WinNT machines gave a valid DNS name.
Trends. We believe the extraordinary growth in Condorstarting in January 2003 was the result of several factors:the long-awaited release of the 6.4 stable series, the intro-duction of production support for Windows machines, andthe increase in size of large pools worldwide.
Privacy. Users are free to decline participation in thisdata collection technique. Both UDP and email updatescan be turned off in the standard Condor configuration file.No information identifying individual users or jobs is sub-mitted to Wisconsin, nor will we reveal details of individualsystems, except in anonymized ways such as the figures inthis handout. It should be noted that the purpose and roleof WHOIS in the Internet and the legal system is a matterof continuing debate. Anyone with concerns about privacyis encouraged to contact us at [email protected] would be happy to work with you.
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 02 Apr 2004998 pools / 37960 hosts
1
1
10
100
1000
10000
100000M
achinesP
ools
COLOMBIA1ARGENTINA1CZECH REPUBLIC1SLOVAK REPUBLIC1
TURKEY21PAKISTAN33BRAZIL43SOUTH AFRICA51CHINA63MEXICO71THAILAND93
GREECE215SINGAPORE2211SWEDEN221EGYPT241NETHERLANDS249IRELAND252HONG KONG261ICELAND271OMAN336MALAYSIA341INDIA366CHAD415SOUTH KOREA499FINLAND513FRANCE599RUSSIA829POLAND8312PORTUGAL885BELGIUM1244AUSTRALIA13411AUSTRIA1408HUNGARY16910SPAIN19829CANADA19813
ITALY41816ISRAEL4369
NORWAY62811SWITZERLAND74116
GERMANY109836JAPAN236932U.K.269581UNKNOWN2833219
U.S.24992388
Fig
ure
1.Cu
rrent
Distribu
tion
of
Co
nd
or
Po
ols
and
Mach
ines
byC
ou
ntry
1
10
100
1000
10000
100000M
achinesP
ools
DE1RI1
CO21AL22
CT73XX94
NM172GA242LA315OH334UT342IA404KS483HI684NC7812AZ872TN1078MI1188OK1487MN1635WA1957MA21414
PA31815VA3228NY33912NJ3526FL47010MD4857CA60543
IL90530TX91214
ID428814IN46819DC479645WI509274
Fig
ure
2.Cu
rrent
Distribu
tion
of
Co
nd
or
Po
ols
and
Mach
ines
byS
tate
2
0
100
200
300
400
500
600
700
Jan1999
Jan2000
Jan2001
Jan2002
Jan2003
Jan2004
Num
ber
of D
ownl
oads
per
Mon
th x86/windowsx86/sunos
x86/linuxsparc/sunospowerpc/osxparisc/hpux
mips/irixlinux/0
ia64/linuxalpha/linuxalpha/dux
Figure 3. Downloads of Condor Software Over Time.The number of successful downloads of Condor software each month over time. The three dominant platforms are,from the top, x86/windows, x86/linux, and sparc/sunos. Notice the significant increase in downloads in 2002. Down-loads are very bursty due to periodic releases of the software.
0
5000
10000
15000
20000
25000
30000
Jan1999
Jan2000
Jan2001
Jan2002
Jan2003
Jan2004N
umbe
r of
Mac
hine
s D
isco
vere
d by
Em
ail
x86/windowsx86/sunos
x86/linuxsparc/sunos
sparc/linuxpowerpc/osxpowerpc/aixparisc/hpux
mips/irixia64/linux
alpha/linuxalpha/dux
Figure 4. Condor Machines Reported by Email Over Time.The number of Condor machines reporting to Wisconsin since data collection began in February 2000. The three dom-inant platforms are, from the top, x86/windows, x86/linux, and sparc/sunos. Notice the rapid expansion beginning inJanuary 2003. Unlike the first page, this figure only includes email reports, not UDP reports, and thus underestimatesthe total number.
2000 1000
100
10
1Jan
1999Jan
2000Jan
2001Jan
2002Jan
2003Jan
2004
Num
ber
of M
achi
nes
in P
ool
10%25%
50%75%90%
100%
Figure 5. Size of Condor Pools Reported by Email Over Time.Each line represents a percentile of all Condor pools. For example, 100% line shows the largest pool at any given timewhile the 50% line shows the median pool size. Notice that about half of all Condor pools are small, less than tenhosts, while the number of large pools has grown steadily since January 2002.
3
Figure 6. Spread of Condor Over Time.The left column shows the location of downloads. One dot represents ten downloads. The right column shows eachCondor pool reported by e-mail. In both columns, each dot is resolved to the national capital, or, in the United States,to the nearest city. Unlike the first page, the right column only includes email reports, not UDP reports, and thusunderestimates the total number.
(unmappable)
Downloads in 20002919 total downloads
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 01 Jan 2001352 pools / 5582 hosts
(unmappable)
Downloads in 20014300 total downloads
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 01 Jan 2002329 pools / 7392 hosts
(unmappable)
Downloads in 20026665 total downloads
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 01 Jan 2003397 pools / 10037 hosts
(unmappable)
Downloads in 20038300 total downloads
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 01 Jan 2004548 pools / 24619 hosts
(unmappable)
Downloads in 20042531 total downloads
(unmappable)
Pool Size:
10
100
500
1000
Condor Pools on 01 Apr 2004522 pools / 26671 hosts
4