

Page 1: Condor Pools on 02 Apr 2004 - University of Notre Damedthain/papers/condor-map-handout.pdf · Mapping Condor Douglas Thain, University of Wisconsin Paradyn/Condor Week, April 2004

Mapping Condor

Douglas Thain, University of Wisconsin

Paradyn/Condor Week, April 2004

Mapping Condor. How can we monitor the impact of a software package on the world at large? To answer this question, we have collected a large amount of data on the deployment of Condor over five years. This handout is a first attempt to present some of this data. Expect to see more in the future!

Data Sources. Each deployed Condor pool in the world periodically reports back to the University of Wisconsin by sending one email per week and one UDP packet every fifteen minutes. These messages give high-level details such as the number of machines in each pool, along with their operating system and CPU type. Of course, we don’t receive all messages: some are blocked by firewalls, some are lost due to misconfigured email systems, and some are disabled by users. Email messages have been archived since January 2000, while UDP messages have only been archived since November 2003. The first two pages use combined email and UDP data for a current view, while the last two pages use only email for a historical view.

Mapping Technique. Each point on the map is located by way of the public WHOIS database. The domain name of each pool is used to retrieve a WHOIS record, a postal address is extracted from that record, and a public U.S. postal database is then used to obtain the coordinates of the nearest city. Outside the U.S., the top-level country domain name is used to identify the host country, and the point is plotted at the national capital. In both cases, a small random factor is added to provide visual distinction as well as anonymity.
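The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three tables are tiny hypothetical stand-ins for the WHOIS and postal databases, the function name `locate_pool` and the `jitter` parameter are invented here, and checking the country domain before the WHOIS table is a simplification rather than the handout's documented order.

```python
import random

# Hypothetical stand-ins for the WHOIS and postal databases (assumptions).
WHOIS_CITY = {"example.edu": ("Madison", "WI")}      # domain -> (city, state)
CITY_COORDS = {("Madison", "WI"): (43.07, -89.40)}   # (city, state) -> (lat, lon)
CAPITAL_COORDS = {"de": (52.52, 13.40)}              # country TLD -> capital (lat, lon)


def locate_pool(domain, rng, jitter=0.5):
    """Return an approximate (lat, lon) for a pool, or None if unmappable."""
    domain = domain.lower()
    tld = domain.rsplit(".", 1)[-1]
    if tld in CAPITAL_COORDS:
        # Country domain: plot the pool at the national capital.
        base = CAPITAL_COORDS[tld]
    else:
        # Otherwise: WHOIS record -> postal address -> nearest city.
        city = WHOIS_CITY.get(domain)
        base = CITY_COORDS.get(city) if city else None
    if base is None:
        return None  # would be drawn in the "(unmappable)" group
    lat, lon = base
    # Small random offset for visual distinction and anonymity.
    return (lat + rng.uniform(-jitter, jitter),
            lon + rng.uniform(-jitter, jitter))


rng = random.Random(0)
print(locate_pool("example.edu", rng))
print(locate_pool("cluster.example.de", rng))
print(locate_pool("unknown.example.com", rng))
```

Passing the random generator in explicitly keeps the jitter reproducible, which is convenient when the same map has to be redrawn.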

Tidbits of Data
- Before 2003, Condor grew at 48 CPUs/week.
- Since 2003, Condor grew at over 250 CPUs/week.
- The largest pools have grown: 500-2000 CPUs.
- Five to ten percent of pools exceed 100 CPUs.
- One third of known pools notified by email only, one third by UDP only, and one third both ways.
- WinNT machines are far less likely to use email.
- 86% of Linux machines gave a valid DNS name.
- 56% of WinNT machines gave a valid DNS name.

Trends. We believe the extraordinary growth in Condor starting in January 2003 was the result of several factors: the long-awaited release of the 6.4 stable series, the introduction of production support for Windows machines, and the increase in size of large pools worldwide.

Privacy. Users are free to decline participation in this data collection technique. Both UDP and email updates can be turned off in the standard Condor configuration file. No information identifying individual users or jobs is submitted to Wisconsin, nor will we reveal details of individual systems, except in anonymized ways such as the figures in this handout. It should be noted that the purpose and role of WHOIS in the Internet and the legal system are a matter of continuing debate. Anyone with concerns about privacy is encouraged to contact us at [email protected]. We would be happy to work with you.

[Map] Condor Pools on 02 Apr 2004: 998 pools / 37960 hosts. Legend: Pool Size 10, 100, 500, 1000; (unmappable).


Page 2

Figure 1. Current Distribution of Condor Pools and Machines by Country
(Bar chart, logarithmic axis from 1 to 100000.)

Country            Machines   Pools
COLOMBIA                  1       1
ARGENTINA                 1       1
CZECH REPUBLIC            1       1
SLOVAK REPUBLIC           1       1
TURKEY                    2       1
PAKISTAN                  3       3
BRAZIL                    4       3
SOUTH AFRICA              5       1
CHINA                     6       3
MEXICO                    7       1
THAILAND                  9       3
GREECE                   21       5
SINGAPORE                22      11
SWEDEN                   22       1
EGYPT                    24       1
NETHERLANDS              24       9
IRELAND                  25       2
HONG KONG                26       1
ICELAND                  27       1
OMAN                     33       6
MALAYSIA                 34       1
INDIA                    36       6
CHAD                     41       5
SOUTH KOREA              49       9
FINLAND                  51       3
FRANCE                   59       9
RUSSIA                   82       9
POLAND                   83      12
PORTUGAL                 88       5
BELGIUM                 124       4
AUSTRALIA               134      11
AUSTRIA                 140       8
HUNGARY                 169      10
SPAIN                   198      29
CANADA                  198      13
ITALY                   418      16
ISRAEL                  436       9
NORWAY                  628      11
SWITZERLAND             741      16
GERMANY                1098      36
JAPAN                  2369      32
U.K.                   2695      81
UNKNOWN                2833     219
U.S.                  24992     388
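As a consistency check, the per-country values in Figure 1 can be summed and compared with the totals printed on the first page (998 pools / 37960 hosts). The sketch below does this; the dictionary name `FIGURE_1` is chosen here for illustration, and the four smallest entries, which carry a single label "1" in the figure, are assumed to be one machine in one pool each.

```python
# (machines, pools) per country, as read from Figure 1.
FIGURE_1 = {
    "COLOMBIA": (1, 1), "ARGENTINA": (1, 1), "CZECH REPUBLIC": (1, 1),
    "SLOVAK REPUBLIC": (1, 1), "TURKEY": (2, 1), "PAKISTAN": (3, 3),
    "BRAZIL": (4, 3), "SOUTH AFRICA": (5, 1), "CHINA": (6, 3),
    "MEXICO": (7, 1), "THAILAND": (9, 3), "GREECE": (21, 5),
    "SINGAPORE": (22, 11), "SWEDEN": (22, 1), "EGYPT": (24, 1),
    "NETHERLANDS": (24, 9), "IRELAND": (25, 2), "HONG KONG": (26, 1),
    "ICELAND": (27, 1), "OMAN": (33, 6), "MALAYSIA": (34, 1),
    "INDIA": (36, 6), "CHAD": (41, 5), "SOUTH KOREA": (49, 9),
    "FINLAND": (51, 3), "FRANCE": (59, 9), "RUSSIA": (82, 9),
    "POLAND": (83, 12), "PORTUGAL": (88, 5), "BELGIUM": (124, 4),
    "AUSTRALIA": (134, 11), "AUSTRIA": (140, 8), "HUNGARY": (169, 10),
    "SPAIN": (198, 29), "CANADA": (198, 13), "ITALY": (418, 16),
    "ISRAEL": (436, 9), "NORWAY": (628, 11), "SWITZERLAND": (741, 16),
    "GERMANY": (1098, 36), "JAPAN": (2369, 32), "U.K.": (2695, 81),
    "UNKNOWN": (2833, 219), "U.S.": (24992, 388),
}

total_machines = sum(machines for machines, _ in FIGURE_1.values())
total_pools = sum(pools for _, pools in FIGURE_1.values())
print(f"{total_pools} pools / {total_machines} hosts")
# prints "998 pools / 37960 hosts"
```

Both sums agree with the first-page totals, which also confirms how the fused machine and pool labels in the chart should be split.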

Figure 2. Current Distribution of Condor Pools and Machines by State
(Bar chart, logarithmic axis from 1 to 100000.)

State   Machines   Pools
DE             1       1
RI             1       1
CO             2       1
AL             2       2
CT             7       3
XX             9       4
NM            17       2
GA            24       2
LA            31       5
OH            33       4
UT            34       2
IA            40       4
KS            48       3
HI            68       4
NC            78      12
AZ            87       2
TN           107       8
MI           118       8
OK           148       7
MN           163       5
WA           195       7
MA           214      14
PA           318      15
VA           322       8
NY           339      12
NJ           352       6
FL           470      10
MD           485       7
CA           605      43
IL           905      30
TX           912      14
ID          4288      14
IN          4681       9
DC          4796      45
WI          5092      74


Page 3

Figure 3. Downloads of Condor Software Over Time.
The number of successful downloads of Condor software each month over time. The three dominant platforms are, from the top, x86/windows, x86/linux, and sparc/sunos. Notice the significant increase in downloads in 2002. Downloads are very bursty due to periodic releases of the software.
[Plot: Number of Downloads per Month (0 to 700), Jan 1999 to Jan 2004. Series: x86/windows, x86/sunos, x86/linux, sparc/sunos, powerpc/osx, parisc/hpux, mips/irix, linux/0, ia64/linux, alpha/linux, alpha/dux.]

Figure 4. Condor Machines Reported by Email Over Time.
The number of Condor machines reporting to Wisconsin since data collection began in February 2000. The three dominant platforms are, from the top, x86/windows, x86/linux, and sparc/sunos. Notice the rapid expansion beginning in January 2003. Unlike the first page, this figure only includes email reports, not UDP reports, and thus underestimates the total number.
[Plot: Number of Machines Discovered by Email (0 to 30000), Jan 1999 to Jan 2004. Series: x86/windows, x86/sunos, x86/linux, sparc/sunos, sparc/linux, powerpc/osx, powerpc/aix, parisc/hpux, mips/irix, ia64/linux, alpha/linux, alpha/dux.]

Figure 5. Size of Condor Pools Reported by Email Over Time.
Each line represents a percentile of all Condor pools. For example, the 100% line shows the largest pool at any given time, while the 50% line shows the median pool size. Notice that about half of all Condor pools are small, fewer than ten hosts, while the number of large pools has grown steadily since January 2002.
[Plot: Number of Machines in Pool (logarithmic axis, 1 to 2000), Jan 1999 to Jan 2004. Lines: 10%, 25%, 50%, 75%, 90%, 100%.]
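To make the percentile lines concrete, here is a minimal sketch of how one point on each line could be computed from a single snapshot of pool sizes. The handout does not say which percentile definition was used, so the nearest-rank rule below is an assumption, and the pool sizes are made-up illustrative values, not data from the handout.

```python
def percentile(sorted_sizes, pct):
    """Nearest-rank percentile of an ascending list, for integer pct in 1..100."""
    n = len(sorted_sizes)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_sizes[rank - 1]


# Hypothetical pool sizes for one snapshot (illustrative only).
pool_sizes = sorted([1, 2, 3, 4, 6, 8, 12, 40, 150, 900])

for pct in (10, 25, 50, 75, 90, 100):
    print(f"{pct:3d}%: {percentile(pool_sizes, pct)} machines")
```

With these sample values the 50% line sits at 6 machines and the 100% line at 900, the same shape the caption describes: a small median with a few very large pools at the top.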


Page 4

Figure 6. Spread of Condor Over Time.
The left column shows the location of downloads. One dot represents ten downloads. The right column shows each Condor pool reported by e-mail. In both columns, each dot is resolved to the national capital, or, in the United States, to the nearest city. Unlike the first page, the right column only includes email reports, not UDP reports, and thus underestimates the total number.

[Maps, five rows. Each map includes an (unmappable) group; the right-column maps carry the legend Pool Size: 10, 100, 500, 1000.]

Left column (downloads)                     Right column (pools reported by email)
Downloads in 2000: 2919 total downloads     Condor Pools on 01 Jan 2001: 352 pools / 5582 hosts
Downloads in 2001: 4300 total downloads     Condor Pools on 01 Jan 2002: 329 pools / 7392 hosts
Downloads in 2002: 6665 total downloads     Condor Pools on 01 Jan 2003: 397 pools / 10037 hosts
Downloads in 2003: 8300 total downloads     Condor Pools on 01 Jan 2004: 548 pools / 24619 hosts
Downloads in 2004: 2531 total downloads     Condor Pools on 01 Apr 2004: 522 pools / 26671 hosts
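The host counts on the Figure 6 maps give a rough way to reproduce the weekly growth rates quoted on the first page. The sketch below is a back-of-the-envelope calculation, not the author's method: it assumes growth is averaged linearly between snapshots, it treats hosts as a proxy for CPUs, and the names `SNAPSHOTS` and `hosts_per_week` are introduced here for illustration.

```python
from datetime import date

# (snapshot date, hosts reported by email), taken from the Figure 6 maps.
SNAPSHOTS = [
    (date(2001, 1, 1), 5582),
    (date(2002, 1, 1), 7392),
    (date(2003, 1, 1), 10037),
    (date(2004, 1, 1), 24619),
    (date(2004, 4, 1), 26671),
]


def hosts_per_week(start, end):
    """Average number of hosts added per week between two snapshots."""
    (d0, h0), (d1, h1) = start, end
    weeks = (d1 - d0).days / 7
    return (h1 - h0) / weeks


before_2003 = hosts_per_week(SNAPSHOTS[0], SNAPSHOTS[2])  # Jan 2001 to Jan 2003
during_2003 = hosts_per_week(SNAPSHOTS[2], SNAPSHOTS[3])  # Jan 2003 to Jan 2004
print(f"2001-2002: {before_2003:.0f} hosts/week")
print(f"2003:      {during_2003:.0f} hosts/week")
```

This yields about 43 hosts per week for 2001 through 2002 and about 280 for 2003, in line with the "48 CPUs/week" and "over 250 CPUs/week" figures on the first page. Exact agreement is not expected, since these snapshots count email reports only and the time windows behind the quoted rates are not stated.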
