
  • 1. Introduction

    2. The case for a distributed data-collection architecture

    3. Architecture

    4. Validation

    5. Conclusion

  • From the operator's perspective, reliable data gathered at strategic network locations makes all the difference for a number of applications, such as:

    • quality and SLA monitoring

    • network planning

    • security purposes (threat identification and profiling, violation of terms of use, etc.)

  • However...

    Operators still follow classic approaches, based on a limited number of standalone probes and filters positioned at strategic points in their own core networks, which gather information about specific trends and patterns.

  • Dial-up

    • slow, temporary dial-up connections
    • traditional applications
    • point-to-point connections
    • single-sourced attack patterns
    • ISP concerns ended at the POP

    Triple-Play + xDSL/Cable

    • widespread broadband (capacity, number of connections)
    • clients frequently have their own LANs
    • P2P, IPTV, VoD, VoIP, multimedia content, messaging…
    • ISP has devices on the customers' LANs (set-top boxes, IP phones, alarm devices, etc.)
    • distributed attacks (botnets, DDoS…)

    [Figure: evolution of access networks, from dial-up (kbps) through xDSL/cable (Mbps) to triple-play/PON (Gbps)]

  • The classic model does not scale properly with the sheer increase in traffic volume and diversity, a consequence of access technologies such as DSL and optical fiber, and of applications such as IPTV, video-on-demand, voice and P2P.

  • Idea: recruit the customers' gateways to participate in the data-collection effort.

    Each router has the necessary means and resources to support an embedded data-collection mechanism.

    Data and events detected by the users’ gateways are transmitted to the ISP level, where they are subjected to correlation and further processing.
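The reporting path described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the collector address and the event fields are assumptions, and the real system transports events in IDMEF rather than JSON.

```python
import json
import socket

# Hypothetical sketch: the customer's gateway packages a locally detected
# event and pushes it to an ISP-side collector for correlation.
# "collector.isp.example", port 5000 and the event fields are illustrative.

def encode_event(source_ip, classification):
    """Serialize a locally detected event for upstream reporting."""
    return json.dumps({"src": source_ip, "class": classification}).encode()

def report(event_bytes, collector=("collector.isp.example", 5000)):
    """Push one encoded event from the gateway to the ISP-level collector."""
    with socket.create_connection(collector, timeout=5) as conn:
        conn.sendall(event_bytes)
```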

  • Traditional vs. Distributed

    • Traditional: ISP restricted to its own network.
      Distributed: users define the ISP's scope of influence; the minimum ISP reach is extended to the CPE boundaries, and the customer LAN can be monitored by the CPE, if the user allows it.

    • Traditional: frequently requires dedicated equipment.
      Distributed: it is possible to use already available network equipment: the broadband routers the subscribers already installed and paid for.

    • Traditional: each probe deals with a sheer amount of traffic.
      Distributed: each probe deals with a much smaller traffic flow, making it possible to apply fine-grained processing techniques.

    • Traditional: captured data and event correlation scopes are limited to a global perspective.
      Distributed: the monitoring system can access and infer information at two distinct infrastructure levels: the microscopic (subscriber) level and the macroscopic (operator) level.

    • Traditional: traffic monitoring is possible at the ISP infrastructure, with scalability limitations.
      Distributed: traffic monitoring at the CPE level.

  • This coordinated operation model allows the monitoring system to access and infer information at two distinct levels:

    • microscopic (subscriber) level

    • macroscopic (operator) level

    making it capable of detecting trends otherwise impossible for a device operating autonomously (like standalone probes, in the classic model).

    Events are encoded and transported using the Intrusion Detection Message Exchange Format (IDMEF, [RFC4765]).
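As an illustration of the encoding, a bare-bones IDMEF alert can be built as below. Only a small subset of the RFC 4765 schema is shown, and the message id, timestamp and classification values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Minimal sketch of an IDMEF (RFC 4765) message such as a gateway could emit.
# The namespace is the one defined by RFC 4765; all field values here are
# illustrative placeholders.
IDMEF_NS = "http://iana.org/idmef"

def build_alert(messageid, classification, create_time):
    """Build a bare-bones IDMEF-Message carrying a single Alert."""
    msg = ET.Element(f"{{{IDMEF_NS}}}IDMEF-Message", version="1.0")
    alert = ET.SubElement(msg, f"{{{IDMEF_NS}}}Alert", messageid=messageid)
    created = ET.SubElement(alert, f"{{{IDMEF_NS}}}CreateTime")
    created.text = create_time
    ET.SubElement(alert, f"{{{IDMEF_NS}}}Classification", text=classification)
    return ET.tostring(msg, encoding="unicode")
```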

  • It is not feasible to implement a testbed with the adequate size and scale to evaluate the solution.

    Solutions: simulation or analytical validation.

    The main concern is the scalability of the solution.

    Approach: analyze and characterize/profile traffic in the access network; evaluate the performance of the solution; extrapolate the results.

  • Several alternatives
    ◦ Simulation
      Problem: adequate data is not available to perform a reliable simulation study.
    ◦ Use third-party traffic traces
      Some traces are anonymized, usage patterns are unknown, and some traces are obsolete.

    Chosen alternative:
    ◦ Laboratory trace collection based on the usage patterns of regular network users.

  • Healthy Home User
    • 3 clients
    • Web and P2P traffic
    • network traffic is always initiated from the LAN

  • Vulnerable Home User
    • 3 clients
    • 1 honeypot (binds ports 1...1023)
    • Web and P2P traffic
    • network traffic may be initiated from both networks: LAN and WAN

  • Trace results: Regular vs. Honey Pot

                                                         Regular   Honey Pot
    Number of security events (per hour)                      29       1 686
    Percentage of external addresses involved in
    security events (potential attack sources)             0.19%       0.32%
    Number of security events per 100 MB of traffic            7         102
    Average traffic rate (kbps)                              150         447
    Record time (hours)                                     7.33           8
    Total number of external addresses                    50 960     278 433

  • Performance measurements with 10, 100, 1000 and 10000 messages

                        Home Gateway                     ISP Server
    Processor           Intel Pentium 4 CPU 3.00 GHz     Intel Pentium 4 CPU 3.00 GHz
    Memory              494 MB                           2 GB
    Caches L1/L2        16 KB / 2 MB                     16 KB / 2 MB
    Operating system    Ubuntu Linux                     Ubuntu Linux
    Network             Intel 82541GI Gigabit            3Com 3c905C-TX/TX-M
                        Ethernet Controller

  • The system maintains the same event-processing capacity, regardless of the number of events, across all message sets.
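A micro-benchmark in the spirit of the measurement above could look like this. It is a sketch under assumptions: the sample message and the processing step (parsing a tiny XML event) stand in for the real event handling, and the batch sizes mirror the 10/100/1000/10000 message sets.

```python
import time
import xml.etree.ElementTree as ET

# Illustrative micro-benchmark: process batches of N messages and report
# per-message throughput, to check that the rate stays roughly constant
# across batch sizes. SAMPLE and process() are stand-ins, not the real system.
SAMPLE = "<Alert><Classification text='port-scan'/></Alert>"

def process(msg):
    # Stand-in for event handling: parse the event and read its class.
    return ET.fromstring(msg).find("Classification").get("text")

def benchmark(batch_sizes=(10, 100, 1000, 10000)):
    """Return messages-per-second throughput for each batch size."""
    rates = {}
    for n in batch_sizes:
        start = time.perf_counter()
        for _ in range(n):
            process(SAMPLE)
        elapsed = time.perf_counter() - start
        rates[n] = n / elapsed
    return rates
```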

  • The IDMEF traffic produced by the solution is very low, even when reporting 100% of the events to the ISP.

  • Crossing the trace results with the scalability results
    ◦ 3 scenarios:
      every client is healthy
      80% healthy and 20% vulnerable (the most probable scenario)
      every client is vulnerable
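As a back-of-the-envelope illustration of the extrapolation, the per-trace event densities reported earlier (7 security events per 100 MB for a healthy user, 102 for a vulnerable one) can be mixed according to each scenario. The weighting function below is an illustration of the idea, not the paper's exact model.

```python
# Illustrative extrapolation across the three scenarios, mixing the
# security-event densities measured in the traces (events per 100 MB).
HEALTHY = 7      # healthy home user, from the trace results
VULNERABLE = 102  # vulnerable (honeypot) user, from the trace results

def events_per_100mb(vulnerable_fraction):
    """Expected event density for a given mix of client populations."""
    return (1 - vulnerable_fraction) * HEALTHY + vulnerable_fraction * VULNERABLE

for name, frac in [("all healthy", 0.0), ("80/20 mix", 0.2), ("all vulnerable", 1.0)]:
    print(f"{name}: {events_per_100mb(frac):.1f} events per 100 MB")
# The 80/20 mix yields 0.8 * 7 + 0.2 * 102 = 26.0 events per 100 MB.
```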

  • The concept allows for the creation of a distributed data-collection infrastructure leveraging already existing (and paid-for) resources.

    It can be used to support real-time monitoring and reaction mechanisms.

    It can be extended for other purposes, but some questions remain: what about user privacy?