two days in the life of datcat overview and demonstration...overview and demonstration two days in...

25
overview and demonstration Two days in the life of three DNS root servers cooperative association for internet data analysis 16 November 2006 Bradley Huffaker <[email protected] > DatCat SM +

Upload: others

Post on 27-Jan-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

  • overview and demonstrationTwo days in the life of

    three DNS root servers

    cooperative association for internet data analysis

    16 November 2006

    Bradley Huffaker

    DatCatSM+

  • DNS anycast analysis

    2

    Using tcpdump from three roots

    we examined the geographic and toplogical

    clustering of DNS clients.

    Data collected by OARC ISC with

    COGENT and RIPE NCC.

  • Outline

    3

    DNS anycast analysis

    - data source- analysis✦ diurnal patterns

    ✦ num. of requests/addresses per instance

    ✦ geographic reltionships / distances

    ✦ coverage topological / geographic

    - visualization ✦ Influence Map

  • Data Source

    4

    • dates- January 10th-11th 2006 (47 hours)

    • DNS sources- c-root (Cogent) 4 instances- f-root (ISC) 32 instances- k-root (RIPE) 11 instances

    • geographic- Netacuity database used for geographic mapping

    • topological - Route Views used for ASs and prefixes (January 10th)

  • Diurnal Patterns

    5

    The local instances mad1 and mty1 show clear diurnal patterns

    0 100 200 300 400 500 600

    00:0001/10

    04:0001/10

    08:0001/10

    12:0001/10

    16:0001/10

    20:0001/10

    00:0001/11

    04:0001/11

    08:0001/11

    12:0001/11

    16:0001/11

    20:0001/11

    00:0001/12

    04:0001/12

    # Re

    ques

    ts/s

    ec

    UTC time

    Madrid (noon)

    Monterrey (noon)

    LA (noon)

    mad1mty1lax1

  • Num. of Requests/Addresses

    6

    Instances sorted by the number of requests per second.

    0 10 20 30 40 50 600

    2

    4

    6

    8

    10x 105

    Instance

    # IP

    add

    ress

    es

    C!rootF!rootK!root

    jfk1*

    lax1*

    ord1*

    iad1*

    sfo2*

    pao1* linx*

    delhi* tokyo*denicnap*

    ams!ix*

    muc1

    ams1

    gru1

    lga1

    0 10 20 30 40 50 600

    500

    1000

    1500

    2000

    2500

    Instance

    # R

    eque

    sts

    / sec

    C!rootF!rootK!root

    iad1*

    ord1* lax1*

    jfk1*

    pao1*

    sfo2*

    linx*

    nap*

    denic tokyo*

    delhi*

    ams!ix*

    muc1 ams1 gru1 lga1

    *global instance

  • Client Geographic locations

    7

    Instances sorted by the number of requests per second.

    0

    25

    50

    75

    100

    %

    N.Amer S.Amer Europe Africa Asia Oceania

    OceaniaAsiaAfricaEuropeS. AmerN. Amer

    F: San Francisco (US) F: New York (US)

    F: Sao Paulo (BR) F: Santiago (CL)

    K: Reykjavik (IS) K: Helsinki (FI) F:Johannesburg(SA)

    F: Tel Aviv (IL) K: Tokyo (JP) F: Brisbane (AU)

    F: Auckland (NZ)

    sfo2* pao1* delhi*ams!ix*

    jfk1* lax1* ord1* iad1* nap* linx* tokyo*

    0

    25

    50

    75

    100

    %

    N.Amer S.Amer Europe Africa Asia Oceania

    OceaniaAsiaAfricaEuropeS. AmerN. Amer

    F: San Francisco (US) F: New York (US)

    F: Sao Paulo (BR) F: Santiago (CL)

    K: Reykjavik (IS) K: Helsinki (FI) F:Johannesburg(SA)

    F: Tel Aviv (IL) K: Tokyo (JP) F: Brisbane (AU)

    F: Auckland (NZ)

    sfo2* pao1* delhi*ams!ix*

    jfk1* lax1* ord1* iad1* nap* linx* tokyo*

    * global instance

  • CDF of distance from instance to client

    8

    Inflection points between 5000 and 7000 km are the result of clients clustering on continents.

    0 5000 10000 15000 200000

    20

    40

    60

    80

    100

    Distance (km)

    %

    All

    0 5000 10000 15000 200000

    20

    40

    60

    80

    100

    Distance (km)

    %

    LocalAllGlobal

    0 5000 10000 15000 200000

    20

    40

    60

    80

    100

    Distance (km)

    %

    LocalAllGlobal

    c-root f-root k-root

  • CDF of additional distance from optimal

    9

    The additional distance a client’s requests were routed from its geographically closest router.

    0 5000 10000 15000 2000020

    30

    40

    50

    60

    70

    80

    90

    100

    Additional Distance (km)

    %

    C!rootK!rootF!root

  • Topological Coverage

    10

    Instances sorted by AS coverage.

    0 10 20 30 40 50 600

    10

    20

    30

    40

    50

    60

    70

    Instance

    %

    C!rootF!rootK!root

    iad1*

    lax*ord*jfk*

    pao1*

    sfo2*

    linx*

    tokyo*denic

    delhi*

    nap*

    ams!ix*

    ams1 muc1

    lga1

    0 10 20 30 40 50 600

    10

    20

    30

    40

    50

    60

    70

    Instance%

    C!rootF!rootK!root

    iad1*

    lax1*

    ord1*

    jfk1*

    pao1*

    sfo2*

    linx*

    tokyo*denicdelhi*

    nap* ams!ix*

    ams1 muc1 lga1

    prefix coverageAS coverage

    *global instance

  • 1 2 3 4 51

    10

    100

    1K

    10K

    100K

    1M

    10M

    # instances requested by clients#

    clie

    nts

    C!rootF!rootK!root

    Client Stability

    11

    the number of instances queried by clients

    the percentage of clients seen by multiple servers

    C!root F!root K!root0

    1

    2

    3

    4

    5

    %

  • Influence Map Breakdown

    12

    k-apni c

    nap

    k-denic

    ams-ix

    delhi

    linx

    k-cern

    tokyo

    k-qtel

    k-emix

    k-mix

    k-isnic

    k-ficix

    k-grnet

    k-apnic

    napk-denic

    ams-ix

    delhi

    linx

    tokyo

    k-mix

    k-apni

    delhi

    average distance to clients (size)

    number of clients (size)

    number of clients (color)wedge

    nodegeographic location

    displacement location

    displacement mapDistance from client’s centriod and DNS server’s geographic location.

    location mapNumber of DNS

    clients in a given direction and average distance in that direction.

  • k-root

    5 global instances6 local instances

    c-root

    4 global instances

    f-root

    2 global instances30 local instances

    Influence Maps

    13

    c-jfk

    c-iad

    c-lax

    c-ord

    c-jfk

    c-iad

    c-laxc-ord

    k-apni c

    nap

    k-denic

    ams-ix

    delhi

    linx

    k-cern

    tokyo

    k-qtel

    k-emix

    k-mix

    k-isnic

    k-ficix

    k-grnet

    k-apnic

    napk-denic

    ams-ix

    delhi

    linx

    tokyo

    k-mix

    f-pao1

    f-mty

    f-tpe

    f-mad

    f-kix

    f-sel1y

    f-lax

    f-prg

    f-hkg

    f-gru

    f-lga

    sfo2

    f-bne

    f-yyz

    f-lis

    f-yowf-scl

    f-bcn

    f-jnb

    f-akl

    f-tlv

    f-dac

    f-dxb

    f-cgk

    f-maa

    f-khi

    f-mty

    f-laxf-hkg

    f-ams

    f-lga

    sfo2

    f-bne

    f-yyz

    f-sin

    f-bcnf-jnb

    f-maa

    f-khi

    Server Keynodenumber of clients10010,0001,000,000

    halonumber of clients1 2-8 9-7273-619 620-5289 5290-4511745118-384790

  • conclusion

    • Geographic clustering of clients to local instances is high.- 60% of clients experience small distance penalties between selected

    and optimal instances

    - local instances have strong diurnal patterns of use

    • Small minority of clients experience a change in instance. • ASes/IP addresses unevenly spread across instances,

    especially for f-root

    • Propose second data colletion 9th-10th, January 2007

    14

  • http://imdc.datcat.org

    is an NSF funded Internet measurement

    data catalog.

    DatCat

    15

  • http://imdc.datcat.orgTarget Problems• data everywhere and lots of it- caida alone has over 50 terabytes of data

    • pcap (tcpdump) packet traces• skitter topology traces• routing tables• etc

    • data is hard to find- no central indexing- many one-off data collections

    16

  • http://imdc.datcat.orgGoalsMake the following processes easy for users.

    • finding data sets of interest- Many researchers lack access and/or expertise to collect data

    needed for their research.

    • adding new data sets to the catalog- Contributors (who are generally underfunded and providing

    data out of dedication to the general good) want to minimize time lost to their own research.

    • annotating data sets in the catalog- Provides a flexible way for contributors and users to mark up

    interesting facts about data sets. Such as the number of packets or that a given file is corrupted.

    17

  • http://imdc.datcat.orgCurrent Statuscurrent implementation

    • user accounts• data sets- only CAIDA data so far- 57,088 files indexed - 4.8 TB of data indexed- 11 separate collections- 100+ accounts registered

    • browse simple/advanced search• help/tutorial• API for bulk contributions

    18

  • http://imdc.datcat.orgOverview

    19

  • http://imdc.datcat.orgBrowse Catalog

    • Browse page is a list of data file collections.- Collections are a high level group of related files.- A file may belong to more than one collection.

    • User clicks on collection of interest to view the collection details.

    select collection of interest

    20

  • http://imdc.datcat.orgCollection Details

    • Collection details displays specifics of the collection.

    - description of collection and its contents- motivation behind the collection

    • If the collection matches the researchers’ interest they can then list the data objects.

    go to a list of collection’s data objects

    21

  • http://imdc.datcat.orgData Files

    • A listing of the data files contained within the selected collection. Also shown: general statistics about the files such as format and size.

    • After the user has selected the set of files they are interested in, they then find packages which contain them.

    select desired data

    find the data’spackages.

    22

  • http://imdc.datcat.orgPackages

    • Packages are the downloadable groupings of one or more data files.

    • Once the user has selected a set of packages, he then finds download locations.

    select desired packages

    find download locations

    23

  • http://imdc.datcat.orgLocations

    • Locations are the place or process by which the package can be obtained.

    - Provides a simple URL or instructions which must be followed to get the data.

    • Get your data!

    select a location forthe package

    24

  • http://imdc.datcat.orgFuture Work

    • small number of invited third party contributors• public contributors• tools• studies- collections specialized for papers / experiments / etc

    25