
Supercomputing Conference: Innovating the Network for Data-Intensive Science (INDIS) Workshop 1

Fast Detection of Elephant Flows with Dirichlet-Categorical Inference

Aditya Gudibanda, Jordi Ros-Giralt, Alan Commike, Richard Lethin
{gudibanda, giralt, commike, lethin}@reservoir.com

Supercomputing Conference: Innovating the Network for Data-Intensive Science (INDIS) Workshop
November 11, 2018

Reservoir Labs
632 Broadway, Suite 803, New York, NY 10012


Roadmap


• Detection Under Partial Information: Mathematical Framework

• The Problem of Elephant Flow Detection

• Related Work

• Introduction to Dirichlet-Categorical Inference

• Detection of Elephant-Flows via Dirichlet-Categorical Inference

• Algorithm

• Theory

• Results

• Comparison with Static Sampling Methods

• Conclusion


Detection Under Partial Information: Mathematical Framework


• Identifying heavy-hitter flows by inspecting every packet is not feasible at scale.

• Packet sampling is introduced to make detection algorithms scalable.

• This leads to the problem of flow reconstruction under partial information.

• Two sources of uncertainty: (1) packet sampling and (2) inability to predict future flow performance.


Detection Under Partial Information: Mathematical Framework


• Information-theoretic framework: what is the minimum amount of information (packets) needed to detect heavy-hitter flows at a user-specified level of accuracy?

• Theory of elephant flow detection under partial information: Ros-Giralt et al. (Reservoir Labs), IEEE ISNCC 2018.

[Figure: detection likelihood model comparing a real-time flow cache against an oracle flow cache, and measured flow size against oracle flow size, with the zero quantum error region highlighted]


The Problem of Elephant Flow Detection: Related Work


• To the best of our knowledge, all existing solutions require threshold parameters:

• Zhang 2010: Bayesian single sampling (high-rate flow traffic ratio threshold p*)

• Yi 2007: ElephantTrap algorithm (sampling rate p, top talker threshold L)

• Psounis 2005: SIFT algorithm (sampling rate p).

• Etc.

• Our contribution:

• Threshold-free algorithm.

• User-defined accuracy: first algorithm to accurately compute detection likelihood.

• Mathematically proven convergence: O(1/n)

• Scalable streaming/delta computations.


Introduction to Dirichlet-Categorical Inference


• The goal is to infer the distribution of flow sizes in the network.

• Normalize flow sizes so that they sum to one.

• The flow distribution then becomes a categorical distribution.

• Our new goal is to efficiently compute the posterior over the categorical distribution induced by the flow sizes.

Examples of posterior distributions over the space of categorical distributions with three classes


Introduction to Dirichlet-Categorical Inference


• The Dirichlet distribution is the conjugate prior of the categorical likelihood function.

• The posterior is obtained by an analytic update of the Dirichlet parameters:

• This yields an efficient inference procedure for categorical distributions.

[Figure: analytic update of the Dirichlet parameters: the prior parameters plus the observed flow-size counts yield the posterior parameters]
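The analytic update can be sketched in a few lines. This is a minimal illustration of Dirichlet-categorical conjugacy; the function names and the uniform Dirichlet(1, ..., 1) prior are assumptions for the example, not the authors' implementation.

```python
def dirichlet_posterior(alpha_prior, counts):
    # Conjugacy: posterior alpha_i = prior alpha_i + observed count of flow i
    return [a + c for a, c in zip(alpha_prior, counts)]

def posterior_mean(alpha):
    # Expected proportion of each flow under Dirichlet(alpha)
    total = sum(alpha)
    return [a / total for a in alpha]

# Uniform prior over 3 flows, then observe packet counts 242, 54, 5
alpha = dirichlet_posterior([1.0, 1.0, 1.0], [242, 54, 5])
means = posterior_mean(alpha)
```

Because the update is a vector addition, each sampled packet costs O(1) to absorb, which is what makes the inference procedure streaming-friendly.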


Detection of Elephant Flows by Dirichlet-Categorical Inference


Objective: to efficiently compute these terms

[Equation: quantum error at time t]


Introduction to Dirichlet-Categorical Inference


• The marginals of a Dirichlet distribution are Beta distributions:

• Each flow’s marginal posterior distribution is Beta-distributed

[Figure: marginalizing a Dirichlet distribution yields a Beta distribution]
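Concretely, for Dirichlet(alpha) the marginal of component i is Beta(alpha_i, alpha_0 - alpha_i), where alpha_0 is the sum of all parameters. A minimal sketch (the helper name is illustrative):

```python
def dirichlet_marginal(alpha, i):
    # Marginal of component i of Dirichlet(alpha) is
    # Beta(alpha_i, alpha_0 - alpha_i), with alpha_0 = sum(alpha)
    a0 = sum(alpha)
    return alpha[i], a0 - alpha[i]

# Example: flow 0 under Dirichlet(2, 3, 5) is Beta(2, 8)
a, b = dirichlet_marginal([2.0, 3.0, 5.0], 0)
```

As a consistency check, the Beta mean a / (a + b) equals the Dirichlet mean alpha_i / alpha_0, so each flow's posterior share can be tracked through its own Beta marginal.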


Detection of Elephant Flows by Dirichlet-Categorical Inference


• Each such term is the probability of an inequality between Beta-distributed random variables


Online Updates of Beta Inequalities


• Online updates are given by [Cook2005]:

[Equations: online updates of each Beta inequality probability for the three cases (only the first flow sampled, only the second flow sampled, neither flow sampled), written as update functions of the seen and unseen counts of each flow]
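Cook's single-parameter recurrences can be written down directly. Below, g(a, b, c, d) denotes P(X > Y) for X ~ Beta(a, b), Y ~ Beta(c, d), and h is the coefficient from [Cook2005]; the per-case updates on the slide (only the first flow sampled, only the second, or neither) compose these one-step increments. Function names are illustrative, not the authors' code.

```python
from math import lgamma, exp

def log_beta(x, y):
    # log of the Beta function B(x, y)
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def h(a, b, c, d):
    # h(a,b,c,d) = B(a+c, b+d) / (B(a,b) * B(c,d)), computed in log space
    return exp(log_beta(a + c, b + d) - log_beta(a, b) - log_beta(c, d))

# O(1) updates of g = P(X > Y) when a single parameter increments [Cook2005]:
def inc_a(g, a, b, c, d):
    return g + h(a, b, c, d) / a   # g(a+1, b, c, d)

def inc_b(g, a, b, c, d):
    return g - h(a, b, c, d) / b   # g(a, b+1, c, d)

def inc_c(g, a, b, c, d):
    return g - h(a, b, c, d) / c   # g(a, b, c+1, d)

def inc_d(g, a, b, c, d):
    return g + h(a, b, c, d) / d   # g(a, b, c, d+1)
```

Starting from the symmetric case g(1, 1, 1, 1) = 0.5, one observation of the first flow gives g(2, 1, 1, 1) = 0.5 + h(1, 1, 1, 1) = 2/3, matching the exact integral for Beta(2, 1) versus Uniform(0, 1). Each sampled packet therefore updates the pairwise table without re-evaluating any Beta integrals.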


Dirichlet-Categorical Inference: Algorithm


process_sample()

• Initialize all pairwise probabilities to 0.5

• For every sample received:

• Update the pairwise probabilities

• Calculate the detection likelihoods

• Identify the number of elephant flows and dynamically adjust the sampling rate
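The steps above can be sketched as an outer loop. The class, its field names, and the details of flow registration are assumptions based on this slide, not the authors' implementation; the pairwise-update and likelihood computations are elaborated on the following slides.

```python
class ElephantDetector:
    def __init__(self, rate=0.01):
        self.counts = {}       # per-flow observation counts
        self.pairwise = {}     # P(flow i larger than flow j), keyed by (i, j)
        self.rate = rate       # current packet sampling rate

    def add_flow(self, f):
        # A newly seen flow starts at probability 0.5 against every known flow
        for g in self.counts:
            self.pairwise[(f, g)] = 0.5
            self.pairwise[(g, f)] = 0.5
        self.counts[f] = 0

    def process_sample(self, f):
        if f not in self.counts:
            self.add_flow(f)
        self.counts[f] += 1
        # ...then update the pairwise probabilities, recompute the detection
        # likelihoods, and dynamically adjust self.rate
```

The 0.5 initialization reflects the symmetric prior: before any observations, each pair of flows is equally likely to be ordered either way.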


Dirichlet-Categorical Inference: Algorithm


calculate_detection_likelihoods()

• Compute a vector d which stores in position i the detection likelihood for the top i flows to be the elephant flows

choose_e_and_adjust_sampling_rate()

• Report best_e as the largest e whose detection likelihood is above the target

• If no e is above the target detection likelihood, do not report a best_e and increase the sampling rate

• If such an e is found (we have enough information), decrease the sampling rate
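These two routines can be sketched as follows. Aggregating the pairwise probabilities as a product over (top-e, non-top) pairs is an assumption for illustration; the slides do not spell out the exact combination rule, and all names are illustrative.

```python
def calculate_detection_likelihoods(P, order):
    # P[(i, j)]: posterior probability that flow i is larger than flow j
    # order: flow ids sorted by observed size, descending
    d = []
    n = len(order)
    for e in range(1, n):
        prob = 1.0
        for i in order[:e]:
            for j in order[e:]:
                prob *= P[(i, j)]
        d.append(prob)   # d[e-1]: likelihood that the top-e flows are the elephants
    return d

def choose_e_and_adjust_sampling_rate(d, target, rate, step=2.0):
    # Report the largest e whose detection likelihood clears the target
    best_e = max((e + 1 for e, v in enumerate(d) if v >= target), default=None)
    if best_e is None:
        return None, min(1.0, rate * step)   # not enough information: sample more
    return best_e, rate / step               # enough information: back off
```

Note that no threshold on flow sizes appears anywhere: the only user input is the target detection likelihood, which is what makes the method threshold-free.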


Dirichlet-Categorical Inference: Theory


Dirichlet-Categorical Inference: Synthetic Experiments


• Created synthetic flow distributions following Gaussian, Laplace, Sech-square, Cauchy, and Linear distributions

• Randomly generated flows from these distributions and measured detection likelihoods based on these samples

• Measured entropy of the posterior distribution after each iteration

• Performed the experiment for 10, 20, 50, 100, and 200 flows


Dirichlet-Categorical Inference: Synthetic Results


[Figure: synthetic results for 10 flows: detection likelihood curves for the top 1, 2, 3, 4, and 5 flows, and entropy curves for each distribution. Example Gaussian flow-size vector: [242, 54, 5, 1, 1, 1, 1, 1, 1, 1]]


Dirichlet-Categorical Inference: SDN Experiments


• Created an SDN testbed using Open vSwitch for network engineering and Linux KVM for virtualization of hosts on our network

• Simulated network had two nodes and one virtual switch between them

• Packets were sampled on the switch using sFlow

• Modified sFlow to perform dynamic sampling

• The server hosting the network had two Xeon E5-2670 CPUs (32 cores total) and 64 GB of RAM

• Traffic alternated at 30s intervals between disjoint distributions with 5 elephant flows and 10 elephant flows

• Traffic on link was on average on the order of 100Mb/s

• Comparison between static sampling method with sampling rate of 0.01 [Psounis], and Dirichlet method with dynamic sampling


Dirichlet-Categorical Inference: SDN Results


[Figure: SDN results. When the sampling rate is too low, the detection likelihood falls and the sampling rate is increased. When the traffic distribution changes, the detection likelihood drops and the algorithm refrains from reporting the number of elephant flows]


Comparison with Static Sampling Rate


• Quantum error: the number of flows that an algorithm misclassifies divided by the true number of elephant flows

• Quantum error rate (QER): average quantum error over all time points:

• Static sampling: 0.19844

• Dirichlet method: 0.00893

• Improvement by a factor of 22

• In addition, detection likelihood is a standalone module that can be placed on top of any existing static sampling rate method.

• Detection likelihood is complementary, not competitive.
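The metric above can be sketched directly. Reading "misclassifies" as false positives plus false negatives is an assumption; the slide does not break the count down, and the function names are illustrative.

```python
def quantum_error(reported, true_elephants):
    # Misclassified flows (missed elephants plus falsely reported flows),
    # normalized by the true number of elephant flows
    missed = len(true_elephants - reported)
    spurious = len(reported - true_elephants)
    return (missed + spurious) / len(true_elephants)

def qer(reports, truths):
    # Quantum error rate: average quantum error over all time points
    errors = [quantum_error(r, t) for r, t in zip(reports, truths)]
    return sum(errors) / len(errors)
```

With this definition, the reported figures correspond to a QER of 0.19844 for static sampling versus 0.00893 for the Dirichlet method over the same trace.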


Conclusion


• We have presented a new algorithm which identifies elephant flows within the Bayesian inference framework.

• Our algorithm is threshold-free, uses dynamic sampling, has mathematically proven convergence, and achieves high classification accuracy.

• Dynamic sampling is based on detection likelihood, a quantification of uncertainty in our algorithm.

• Capable of automatically finding the cut-off sampling rate.

• Forthcoming work will integrate this algorithm in real datacenter and wide-area SDN environments.


Thank You

632 Broadway, Suite 803, New York, NY 10012

812 SW Washington St., Suite 1200, Portland, OR 97205


References


J. D. Cook, “Exact Calculation of Beta Inequalities,” Tech. Rep. UTMDABTR-005-05, Department of Biostatistics, UT MD Anderson Cancer Center, Houston, TX, 2005. [Online]. Available: https://www.johndcook.com/UTMDABTR-005-05.pdf

K. Psounis, A. Ghosh, B. Prabhakar, and G. Wang, “SIFT: A simple algorithm for tracking elephant flows, and taking advantage of power laws,” 2005.

L. Yi, W. Mei, B. Prabhakar, and F. Bonomi, “ElephantTrap: A low cost device for identifying large flows,” in Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects (HOT Interconnects), 2007, pp. 99–105.

Y. Zhang, B. Fang, and Y. Zhang, “Identifying high-rate flows based on Bayesian single sampling,” in Proceedings of the 2010 International Conference on Computer Engineering and Technology (ICCET), vol. 1, 2010.


Additional Backup Slides for Q&A


Dirichlet-Categorical Inference: Implementation


• Flow counts are stored in a cache of size gamma

• Flow housekeeping

• Flows which have not been seen within a certain period of time are removed from the cache

• Unique count optimization

• Pairwise probability updates are the same for flows with the same observation counts

• Flow counts follow a power law distribution [Zhang2010]

• Drastically reduces the amount of redundant computation, as flow counts are concentrated in a few small numbers

• Ghost Flow Protocol

• Always keep a ghost flow in the pairwise probability table with a count size of 1

• When a new flow enters, copy the ghost flow probabilities

• Amortizes the cost of adding a new flow
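The ghost flow protocol can be sketched as follows; the sentinel name and the probability-table layout are illustrative assumptions. A newcomer with a count of 1 is statistically identical to the ghost, so its pairwise entries are copied rather than recomputed.

```python
GHOST = "__ghost__"   # sentinel flow kept permanently with count 1

def add_flow_via_ghost(P, flows, new_flow):
    # Copy the ghost's pairwise probabilities for the newcomer
    for f in flows:
        P[(new_flow, f)] = P[(GHOST, f)]
        P[(f, new_flow)] = P[(f, GHOST)]
    # The newcomer and the ghost now have identical counts, so they tie
    P[(new_flow, GHOST)] = 0.5
    P[(GHOST, new_flow)] = 0.5
    flows.add(new_flow)
```

This turns flow insertion into a table copy, avoiding a fresh round of Beta inequality evaluations every time a new flow appears.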


Dirichlet-Categorical Inference: QER Supplementary Material


• Measured the QER of the Dirichlet method over all time points, including time points when Dirichlet refrained from producing output

• Static sampling: 0.19844

• Dirichlet: 0.03072625698

• Improvement by a factor of 6.5