
  • New Packet Sampling Technique for Robust Flow Measurements

    Shigeo Shioda
    Department of Architecture and Urban Science, Graduate School of Engineering, Chiba University

    Chiba University

    Objectives of traffic measurements

    Short-term monitoring: detecting high-volume traffic patterns (such as denial-of-service attacks), detecting unexpected or illegal packets, and investigating their origins.

    Long-term traffic engineering: rerouting traffic and upgrading selected links.

    Per-flow traffic measurement (1)

    Simply counting packets or bytes is not sufficient; traffic has to be measured on a per-flow basis.

    What is a flow? Informally, a flow is the set of packets that make up a logical communication between application processes running on different hosts.

    Flow-level information tells us who is currently using the Internet.

    Per-flow traffic measurement (2)

    Meaning of a flow.

    [Figure: two example flows, Flow 1 and Flow 2.]

    Per-flow traffic measurement (3)

    How flows can be distinguished: by inspecting packet headers and classifying packets by IP addresses, port numbers, and protocol ID.

    [Figure: header layout. IP header: Version, HL, TOS, Total Length, Identification, Flags, Fragment Offset, TTL, Protocol ID, Header Checksum, Source Address, Destination Address. TCP header: Source Port, Destination Port, Sequence Number, Acknowledgement Number.]
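
The classification step above can be sketched in Python. This is a minimal illustration, not code from the slides: it assumes a raw IPv4 packet carrying TCP or UDP, and the names `FlowKey` and `flow_key` are hypothetical. The header offsets follow the standard IPv4 layout shown in the figure.

```python
import socket
import struct
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The 5-tuple used to tell flows apart (illustrative name)."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int


def flow_key(packet: bytes) -> FlowKey:
    """Extract the 5-tuple from a raw IPv4 packet carrying TCP or UDP."""
    # Low nibble of byte 0 is the header length in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    protocol = packet[9]                      # Protocol ID field
    src_addr = socket.inet_ntoa(packet[12:16])
    dst_addr = socket.inet_ntoa(packet[16:20])
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack_from("!HH", packet, header_len)
    return FlowKey(src_addr, dst_addr, src_port, dst_port, protocol)
```

Two packets belong to the same flow exactly when they map to the same key, so the key can be used directly as a dictionary index.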

    Per-flow traffic measurement (4)

    Flow-measurement procedure: a router maintains a flow cache that holds one record per flow. When a packet is seen, the router updates the counters of the corresponding entry in the flow cache.

    [Figure: packets of Flow 1, Flow 2, and Flow 3 arriving at a router and updating a flow cache that records the number of packets and the number of bytes for each flow.]
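
The update step can be sketched as follows. This is a minimal Python illustration under assumed names (`FlowRecord`, `FlowCache`); a real router would implement this in hardware or a fast path, which is exactly where the scalability problem arises.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Per-flow counters kept in the flow cache."""
    packets: int = 0
    byte_count: int = 0


class FlowCache:
    """Maps a flow identifier to its counters (illustrative sketch)."""

    def __init__(self) -> None:
        self.records = defaultdict(FlowRecord)

    def update(self, flow_id, packet_size: int) -> None:
        # Look up (or create) the entry for this flow and bump both counters.
        record = self.records[flow_id]
        record.packets += 1
        record.byte_count += packet_size
```

Without sampling, `update` runs once per packet, so its cost is multiplied by the full packet rate of the link.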

    Problems of flow measurements

    Lack of scalability: as line speeds increase rapidly, the number of concurrent flows grows every year.

    Updating per-flow counters on a per-packet basis is already impossible at today's line speeds.

    The gap between DRAM speeds and link speeds is widening.

    Packet sampling

    The flow cache is updated only for sampled packets. Elephant flows are still detected under packet sampling. Many tiny (and unimportant) flows are missed, but this does not matter for network management.

    Example: Cisco's Sampled NetFlow.

    How should packets be sampled?

    Fixed rate sampling

    Definition: sampled packets are chosen at a fixed rate, for example one in every N packets.

    Cisco's Sampled NetFlow uses fixed rate sampling.

    [Figure: fixed rate sampling with N = 5.]
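
A deterministic one-in-N sampler can be sketched in a few lines of Python. The class name `FixedRateSampler` is illustrative, and this is only one way to realize a fixed rate (a random choice with probability 1/N is another).

```python
class FixedRateSampler:
    """Samples exactly one packet out of every n (illustrative sketch)."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.count = 0

    def should_sample(self) -> bool:
        # Count packets and fire on every n-th one.
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False
```

The number of samples grows in proportion to the packet rate, which is the property the next slide criticizes.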

    Shortcomings of fixed rate sampling

    The memory needed to hold the flow cache depends strongly on the traffic load. While a DoS attack is in progress, memory is consumed rapidly even if the sampling rate is low. Under normal load, however, a low sampling rate yields large errors in the traffic measurement.

    Choosing a static sampling rate is a hard decision for network operators.

    Fixed period sampling

    Definition: at most one packet is sampled in each fixed-length period (called a sampling window), for example one packet every tw seconds.

    This is our solution.

    [Figure: packet stream divided into sampling windows.]
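
The definition can be sketched in Python as below. The class name `FixedPeriodSampler` and the `pick` parameter are assumptions made for illustration: `pick=1` takes the first packet of each window and `pick=2` the second. Timestamps and the window length must use the same unit.

```python
class FixedPeriodSampler:
    """Samples at most one packet per sampling window (illustrative sketch)."""

    def __init__(self, window, pick: int = 1) -> None:
        if window <= 0 or pick < 1:
            raise ValueError("window must be positive and pick at least 1")
        self.window = window
        self.pick = pick            # which packet of the window to take
        self.current_window = None
        self.seen = 0               # packets seen in the current window

    def should_sample(self, timestamp) -> bool:
        index = int(timestamp // self.window)
        if index != self.current_window:
            # A new window has started: reset the per-window packet count.
            self.current_window = index
            self.seen = 0
        self.seen += 1
        return self.seen == self.pick
```

However many packets arrive, the sampler fires at most once per window, so the work done downstream is independent of the traffic load.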

    Properties of fixed period sampling

    The number of samples taken per second is bounded by 1/tw.

    The number of entries in the flow cache is also bounded.

    The sampling window tw is easily chosen from the memory or CPU available for flow measurement.
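
As a worked example of the bound, take the window length tw = 10 ms used in the experiments (written t_w below). The second line turns the bound around to size the window for a processing budget; the value R_max = 500 updates per second is a hypothetical figure chosen only for the arithmetic.

```latex
\[
  \text{samples per second} \;\le\; \frac{1}{t_w}
  = \frac{1}{0.01\,\mathrm{s}} = 100\ \mathrm{s^{-1}}
  \qquad (t_w = 10\,\mathrm{ms})
\]
\[
  t_w \;\ge\; \frac{1}{R_{\max}}
  = \frac{1}{500\ \mathrm{s^{-1}}} = 2\,\mathrm{ms}
  \qquad (R_{\max} = 500\ \text{updates per second, hypothetical})
\]
```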

    Number of flow entries

    [Figure: number of flow cache entries versus time [s] for the Indianapolis-Kansas City link and the U.S.-Japan link; N = 1000, tw = 10 ms.]

    Number of sampled packets

    [Figure: number of sampled packets versus time [s] for Trace 1 and Trace 2; N = 1000, tw = 10 ms.]

    Second packet sampling (1)

    Any packet arriving within a sampling window may be chosen as the sample. Which packet should be sampled? The simplest (and most natural) rule is first packet sampling: take the first packet of each window.

    Intuitively, the first packet sampling rule seems to work well, but it does not.

    We apply second packet sampling instead.

    First packet sampling and second packet sampling

    [Figure: two timelines marked 0, tw, 2 tw, 3 tw, 4 tw, showing which packet is taken in each window under first packet sampling and under second packet sampling.]

    Second packet sampling (2)

    Example: packets of flow 1 arrive periodically, and packets of flow 2 arrive according to a Poisson process.

    We found theoretically that:

    Under the first packet sampling rule, 63.2% of sampled packets belong to flow 1 (strongly biased).

    Under the second packet sampling rule, 49.7% of sampled packets belong to flow 1 (almost unbiased).
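
The slide does not state the parameters of the two flows. One setting that reproduces both percentages, offered here as an assumption rather than as the author's stated setup, is: flow 1 sends one packet every tw/2 with a uniformly random phase, and flow 2 is a Poisson process with the same mean rate 2/tw. Under that assumption the share of flow 1 works out to 1 - e^-1 (about 63.2%) for the first packet rule and 1 - e^-1 - e^-2 (about 49.7%) for the second packet rule. The Python sketch below checks these values by Monte Carlo simulation, with the window length normalized to 1.

```python
import math
import random


def simulate(num_windows: int = 100_000, seed: int = 1):
    """Return the fraction of first and second packets that belong to flow 1."""
    rng = random.Random(seed)
    first_hits = 0
    second_hits = 0
    for _ in range(num_windows):
        # Flow 1 (assumed): period 0.5, random phase, so two packets per window.
        phase = rng.uniform(0.0, 0.5)
        arrivals = [(phase, 1), (phase + 0.5, 1)]
        # Flow 2 (assumed): Poisson arrivals with rate 2 per window.
        t = rng.expovariate(2.0)
        while t < 1.0:
            arrivals.append((t, 2))
            t += rng.expovariate(2.0)
        arrivals.sort()
        first_hits += arrivals[0][1] == 1
        second_hits += arrivals[1][1] == 1
    return first_hits / num_windows, second_hits / num_windows


# Closed forms under the same assumptions.
FIRST_SHARE = 1 - math.exp(-1)
SECOND_SHARE = 1 - math.exp(-1) - math.exp(-2)
```

Since both flows send at the same mean rate in this setting, an unbiased rule would attribute half of the samples to each flow. The first packet rule favors the periodic flow; the second packet rule nearly removes that bias.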

    Flow-level traffic estimation

    Sampling inevitably loses some information.

    Inference techniques are required to recover flow-level traffic statistics from the sampled packets.

    Here, we focus on flow rate estimation.

    Flow rate estimation (1)

    Flow rate: informally, the rate at which a flow sends data; formally, the ratio of the total bytes transferred to the flow duration.

    Flow rate is an index for identifying vital flows, which often have a significant impact on network performance.

    Flow rate can be estimated from sampled packet streams.
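
The formal definition translates directly into code. The Python sketch below computes the actual (unsampled) rate of one flow from its packet records; the function name and the choice of first-to-last packet time as the flow duration are assumptions made for illustration.

```python
def flow_rate_mbps(packets):
    """Flow rate in Mbps from a list of (timestamp_seconds, size_bytes) pairs."""
    if len(packets) < 2:
        raise ValueError("need at least two packets to define a duration")
    times = [t for t, _ in packets]
    duration = max(times) - min(times)   # assumed: first to last packet
    if duration <= 0:
        raise ValueError("flow duration must be positive")
    total_bytes = sum(size for _, size in packets)
    # Total bytes over duration, converted from bytes/s to megabits/s.
    return total_bytes * 8 / duration / 1e6
```

For example, 5000 bytes sent over 2 seconds is 40,000 bits in 2 seconds, which is 0.02 Mbps.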

    Flow rate estimation (2)

    Real trace on the Indianapolis-Kansas City link.

    [Figure: two scatter plots of estimated flow rate [Mbps] versus actual flow rate [Mbps], one for tw = 10 ms (0.15% of packets sampled) and one for tw = 1 ms (1.5% of packets sampled).]

    Flow rate estimation (3)

    Real trace on a U.S.-Japan link.

    [Figure: two scatter plots of estimated flow rate [Mbps] versus actual flow rate [Mbps], one for tw = 10 ms (1.5% of packets sampled) and one for tw = 1 ms (13.4% of packets sampled).]

    Conclusion

    Sampling techniques are indispensable to today's traffic measurement in the Internet.

    Fixed period sampling avoids the problems of the existing sampling technique (fixed rate sampling).

    Fixed period sampling should be used together with second packet sampling.

    Flow rate can be estimated well with fixed period sampling.

    Thank you.


    Flow rate estimation under first packet sampling

    [Figure: two scatter plots of estimated flow rate [Mbps] versus actual flow rate [Mbps], one for the Indianapolis-Kansas City link and one for the U.S.-Japan link; N = 1000, tw = 10 ms.]

    Bayesian estimates (2)

    [Figure: two scatter plots of estimated flow rate [Mbps] versus actual flow rate [Mbps], one for the Bayesian estimator and one for the naive estimator; tw = 1 ms.]

    Bayesian estimates (1)

    [Figure: two scatter plots of estimated flow rate [Mbps] versus actual flow rate [Mbps], one for the Bayesian estimator and one for the naive estimator; tw = 10 ms.]

    Objectives of traffic measurements (2)

    QoS monitoring: measuring QoS properties and validating service-level agreements.

    Usage-based accounting: input to charging or billing.

    Shortcomings of fixed rate sampling

    Is there any sampling strategy that works even under massive DoS attacks?

    [Figure: traffic versus time [s].]

    Existing solutions to the problems of fixed rate sampling

    Sampling rate adaptation: first, the sampling rate is initialized to the maximum rate at which the processor can operate. Then the sampling rate is adjusted dynamically based on the amount of memory consumed.

    Example: Adaptive NetFlow.

    We propose another solution.
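
The idea of sampling rate adaptation can be illustrated with a toy Python sketch. It is not the algorithm of Adaptive NetFlow: the halving policy, the 90% high-watermark, and all names are assumptions, and it omits everything else a real scheme needs, such as rescaling counters that were gathered at the old rate.

```python
import random


class AdaptiveRateSampler:
    """Toy rate adaptation: halve the sampling rate when memory runs low."""

    def __init__(self, max_rate, max_entries, high_watermark=0.9, seed=None):
        self.rate = max_rate              # start at the highest sustainable rate
        self.max_entries = max_entries    # capacity of the flow cache
        self.high_watermark = high_watermark
        self.rng = random.Random(seed)

    def adjust(self, entries_in_use):
        # Assumed policy: halve the rate whenever the cache is nearly full.
        if entries_in_use > self.high_watermark * self.max_entries:
            self.rate /= 2
        return self.rate

    def should_sample(self):
        # Sample each packet independently with the current probability.
        return self.rng.random() < self.rate
```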

    Fixed period sampling (2)

    Timeout transaction: under sampled measurement, the beginning and end of a flow cannot be known exactly, because SYN or FIN packets may not be sampled. Flow entries that have not been seen during the last N samplings are therefore deleted from the flow cache.

    As a result of the timeout transaction, the flow cache keeps only flows whose packets have been detected at least once during the last N samplings.
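
The timeout rule can be sketched in Python as follows. The class name `TimeoutFlowCache` and the list layout of each entry are assumptions made for illustration. Entries are kept in order of their most recent sample, so expired flows are always at the front.

```python
from collections import OrderedDict


class TimeoutFlowCache:
    """Flow cache that drops flows unseen during the last n samplings."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.sample_count = 0
        # flow_id -> [packets, bytes, index of the sampling that last saw it]
        self.entries = OrderedDict()

    def record_sample(self, flow_id, size: int) -> None:
        self.sample_count += 1
        entry = self.entries.pop(flow_id, None) or [0, 0, 0]
        entry[0] += 1
        entry[1] += size
        entry[2] = self.sample_count
        self.entries[flow_id] = entry      # re-insert as most recently seen
        # Evict from the front while the oldest entry has timed out.
        while self.entries:
            oldest_id, oldest = next(iter(self.entries.items()))
            if self.sample_count - oldest[2] >= self.n:
                del self.entries[oldest_id]
            else:
                break
```

Because every surviving entry was seen in one of the last N samplings, the cache never holds more than N entries, whatever the traffic load.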

    Simulation experiments

    The accuracy of the flow-rate estimation was investigated using real traffic data.

    Two real traces (traffic data) were used.

    Trace 1: traffic data measured by the PMA Project on a backbone link between Indianapolis and Kansas City.

    Trace 2: traffic data measured and published by the WIDE Project on a U.S.-Japan link.

    Flow rate estimation (2)

    Naive estimation: estimation based on the sampling frequency.

    Bayesian estimation: if the probability density function of the flow rate is known as prior information, a Bayesian estimator can be applied to improve the estimation accuracy.
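
The slides do not define the naive estimator in detail. One plausible reading of "based on the sampling frequency" is a simple scaling estimator, sketched below in Python: each sampled packet is taken to stand for total_packets / number_of_samples packets of its flow. The function name, the need for a total packet counter, and the estimator itself are assumptions, not the author's stated method.

```python
from collections import defaultdict


def naive_flow_rates_mbps(samples, total_packets, duration_s):
    """Estimate per-flow rates in Mbps from sampled (flow_id, size_bytes) pairs.

    total_packets is the number of packets that passed the link during
    duration_s seconds (assumed to come from a cheap global counter).
    """
    if not samples or duration_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    scale = total_packets / len(samples)   # packets represented by one sample
    sampled_bytes = defaultdict(int)
    for flow_id, size in samples:
        sampled_bytes[flow_id] += size
    return {
        flow_id: byte_count * scale * 8 / duration_s / 1e6
        for flow_id, byte_count in sampled_bytes.items()
    }
```

Such an estimate is only as good as the sampling rule is unbiased, which is one reason the choice between first and second packet sampling matters.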