
Lessons Learned from 10k Experiments to Compare Virtual and Physical Testbeds

Jonathan Crussell, Thomas M Kroeger, David Kavaler, Aaron Brown, Cynthia Phillips

August 12th, 2019

Supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525.


Goals

Discover where and how virtual and physical testbeds differ
  "Virtualization artifacts"

Methodology:
  Run representative workloads on physical and virtual testbeds
  Collect, compare, and contrast metrics
    Application-, OS-, and network-level


Lessons Learned

Methodology and experimental results presented in previous work:
  Jonathan Crussell, Thomas M. Kroeger, Aaron Brown, and Cynthia Phillips. Virtually the same: Comparing physical and virtual testbeds. In 2019 International Conference on Computing, Networking and Communications (ICNC), pages 847–853, Feb 2019.

This work focuses on lessons learned in four areas:
  Tool Validation and Development
  Instrumentation
  Data Collection and Aggregation
  Data Analysis


Methodology & Results

ApacheBench fetching fixed pages from an HTTP server
Over 10,000 experiments across three clusters
Over 500TB of data (without full packet captures)
Varied payload size, network drivers, network bandwidth
Large variations in network driver offloading behavior
Near-identical sequences of system calls

Leverage minimega toolset (see http://minimega.org for details)


Tool Validation and Development

Lesson: Using a testbed toolset for experimentation requires substantial effort and consideration to put tools together in a workable and validated fashion.


bash$ bash run.bash
USAGE: run.bash DIR ITER DURATION CONCURRENT VMTYPE DRIVER NCPUS OFFLOAD RATE NWORKERS URL NREQUESTS INSTRUMENT

bash$ bash sweep.bash /scratch/output params.bash > sweep-params.bash

bash$ head -n 1 sweep-params.bash
bash /root/run.bash /scratch/output/ 1 360 1 kvm e1000 1 on 1000 1 http://10.0.0.1/ 100000 true

bash$ sort -R sweep-params.bash | parallel -j1 --eta -S $(python igor.py --heads jc[0-9])
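The sweep step expands a set of parameter values into one run.bash command line per experiment, and the commands are then run in random order. The contents of sweep.bash and params.bash are not shown on the slides, so the Python sketch below only illustrates the idea. The values in PARAMS (including the "container" VM type and the "virtio-net-pci" driver) and the helper sweep_commands are hypothetical; the argument order follows the USAGE line.

```python
import itertools
import random

# Argument order after DIR, taken from the run.bash USAGE line.
FIELDS = ["ITER", "DURATION", "CONCURRENT", "VMTYPE", "DRIVER", "NCPUS",
          "OFFLOAD", "RATE", "NWORKERS", "URL", "NREQUESTS", "INSTRUMENT"]

# Hypothetical parameter values; the real sweep is defined in params.bash.
PARAMS = {
    "ITER": [1, 2],
    "DURATION": [360],
    "CONCURRENT": [1],
    "VMTYPE": ["kvm", "container"],
    "DRIVER": ["e1000", "virtio-net-pci"],
    "NCPUS": [1],
    "OFFLOAD": ["on", "off"],
    "RATE": [1000],
    "NWORKERS": [1],
    "URL": ["http://10.0.0.1/"],
    "NREQUESTS": [100000],
    "INSTRUMENT": ["true"],
}


def sweep_commands(out_dir, params, seed=None):
    """Return one run.bash command line per parameter combination, shuffled."""
    combos = itertools.product(*(params[name] for name in FIELDS))
    commands = [
        "bash /root/run.bash {} {}".format(out_dir, " ".join(str(v) for v in combo))
        for combo in combos
    ]
    # Shuffle so that run order is not correlated with any one parameter,
    # similar in spirit to piping the file through `sort -R`.
    random.Random(seed).shuffle(commands)
    return commands
```

Calling sweep_commands("/scratch/output/", PARAMS) with these placeholder values yields 16 command lines, one of which matches the example line printed by head above.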


Running thousands of repetitions:
  Handle all edge cases (rare bug in minimega's capture API)
  Clean up all state (failed to unmount container filesystem)

bash$ mount | grep megamount | head -n 5
megamount_5562 on /tmp/minimega/5562/fs type overlay
megamount_5566 on /tmp/minimega/5566/fs type overlay
megamount_5611 on /tmp/minimega/5611/fs type overlay
megamount_5752 on /tmp/minimega/5752/fs type overlay
megamount_5774 on /tmp/minimega/5774/fs type overlay
bash$ mount | grep megamount | wc -l
96
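One way to catch leaked state like this early is to check for leftover mounts between iterations rather than discovering them after thousands of runs. The following Python sketch is an assumption about how such a check could look, not part of the authors' scripts: the helper stale_megamounts is hypothetical and simply parses text in the format shown by mount above.

```python
def stale_megamounts(mount_output):
    """Return mount points whose source name starts with 'megamount_'."""
    stale = []
    for line in mount_output.splitlines():
        fields = line.split()
        # Expected shape: <source> on <mountpoint> type <fstype> ...
        if len(fields) >= 3 and fields[0].startswith("megamount_") and fields[1] == "on":
            stale.append(fields[2])
    return stale


# Sample input in the same format as the mount output on the slide.
SAMPLE = """\
megamount_5562 on /tmp/minimega/5562/fs type overlay
megamount_5566 on /tmp/minimega/5566/fs type overlay
tmpfs on /run type tmpfs (rw,nosuid)
"""
```

A run script could feed the real output of mount into this function before starting an iteration and abort, or unmount the listed paths, if the result is not empty.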


Toolset improvements:
  Add snaplen option to capture API
  Add standalone C2 server

# On VMs
minimega -e cc exec mkdir /que
minimega -e cc background protonuke -serve -http
minimega -e cc recv /miniccc.log

# On physical nodes
rond -e exec mkdir /que
rond -e bg protonuke -serve -http
rond -e recv /miniccc.log


Instrumentation

Lesson: Instrumentation is invaluable but it is often manually added, expensive, and experiment-specific. Integrating more forms of instrumentation into the toolset could help researchers to more rapidly develop experiments.


Two forms of instrumentation:
  Verify functionality of environment
  Understand and evaluate experiments

[Figure: diagram with nodes labeled A, B, C, D, SRC, and DST]

Integrating the former into the toolset could simplify experiments


Mismatch between capture locations
Dropped events for containers

bash$ tcpdump -i eth0 -w foo.pcap
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
9001 packets captured
9000 packets received by filter
1 packets dropped by kernel
bash$ sysdig -w foo.scap
<no indication of dropped events>
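Because tcpdump reports kernel drops in its exit summary, an experiment script can flag affected iterations automatically. The Python sketch below assumes the summary text has already been captured into a string. The helper packets_dropped is hypothetical, and nothing comparable is shown for sysdig because, as noted above, it gave no indication of dropped events.

```python
import re

# Matches the "<N> packets dropped by kernel" line of the tcpdump summary.
_DROPPED = re.compile(r"^(\d+) packets? dropped by kernel", re.MULTILINE)


def packets_dropped(summary):
    """Return the kernel drop count from a tcpdump summary, or None if absent."""
    match = _DROPPED.search(summary)
    return int(match.group(1)) if match else None


# Summary text in the same format as the tcpdump output on the slide.
SAMPLE = """\
9001 packets captured
9000 packets received by filter
1 packets dropped by kernel
"""
```

An iteration whose capture reports a non-zero count can then be marked as suspect or rerun instead of silently entering the analysis.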


Many ways to instrument experiments at many levels, mostly by hand
  Application-level: RPS, Jitter, ...
  OS-level: System calls, utilization, perf stats, ...
  Network-level: Flow statistics, bandwidth, ...

Use to understand anomalies
  e1000 stalls
  Performance increase


Data Collection and Aggregation

Lesson: Testbeds can provide a wealth of data to researchers but should do more to streamline the process of collecting and aggregating it into a usable form.


How to extract instrumentation data from VMs?
  C2 has limits on file size
  VMs write to qcow2, host extracts
  Future: mount guest filesystem?

How to aggregate data?
  Dump per-iteration data into SQLite database
  Combine SQLite databases after all iterations complete
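The combine step can be sketched with Python's standard sqlite3 module and SQLite's ATTACH DATABASE statement. This is a minimal illustration under stated assumptions, not the authors' aggregation code: the table name "metrics", the helper merge_databases, and the assumption that every per-iteration database has the same schema are all hypothetical.

```python
import sqlite3


def merge_databases(dest_path, source_paths, table="metrics"):
    """Append all rows of `table` from each source database into dest_path.

    Assumes every source database contains `table` with an identical schema
    and that `table` is a trusted identifier (it is interpolated into SQL).
    """
    dest = sqlite3.connect(dest_path)
    try:
        for index, src in enumerate(source_paths):
            dest.execute("ATTACH DATABASE ? AS src", (src,))
            if index == 0:
                # Create an empty table with the same columns as the first source.
                dest.execute(
                    f"CREATE TABLE IF NOT EXISTS {table} AS "
                    f"SELECT * FROM src.{table} WHERE 0"
                )
            dest.execute(f"INSERT INTO {table} SELECT * FROM src.{table}")
            # Commit before detaching; DETACH fails inside an open transaction.
            dest.commit()
            dest.execute("DETACH DATABASE src")
    finally:
        dest.close()
```

Writing one database per iteration keeps iterations independent of each other, and merging only after all iterations complete avoids concurrent writers on a single file.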


Data Analysis

How to reduce storage?

total packets:         5              total packets:         5
ack pkts sent:         4              ack pkts sent:         5
pure acks sent:        2              pure acks sent:        2
sack pkts sent:        0              sack pkts sent:        0
dsack pkts sent:       0              dsack pkts sent:       0
max sack blks/ack:     0              max sack blks/ack:     0
unique bytes sent:     72             unique bytes sent:     486
actual data pkts:      1              actual data pkts:      1
actual data bytes:     72             actual data bytes:     486
rexmt data pkts:       0              rexmt data pkts:       0
rexmt data bytes:      0              rexmt data bytes:      0
zwnd probe pkts:       0              zwnd probe pkts:       0
zwnd probe bytes:      0              zwnd probe bytes:      0
outoforder pkts:       0              outoforder pkts:       0
pushed data pkts:      1              pushed data pkts:      1
SYN/FIN pkts sent:     1/1            SYN/FIN pkts sent:     1/1
req 1323 ws/ts:        Y/Y            req 1323 ws/ts:        Y/Y
adv wind scale:        7              adv wind scale:        7
========================== <15 lines omitted> ==========================
missed data:           0 bytes        missed data:           0 bytes
truncated data:        0 bytes        truncated data:        352 bytes
truncated packets:     0 pkts         truncated packets:     1 pkts
data xmit time:        0.000 secs     data xmit time:        0.000 secs
idletime max:          1.0 ms         idletime max:          0.7 ms
throughput:            38482 Bps      throughput:            259754 Bps


How to reduce storage?
  tcptrace produces 78 statistics per flow
  Compute summary statistics over all flows for iteration
  Compare mean of means across iterations and parameters
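The reduction described here can be sketched in a few lines of Python: each iteration's per-flow values for one statistic are collapsed into a small summary, and the per-iteration means are then averaged. The helper names and the sample numbers are hypothetical, and which summary statistics the authors stored is not stated on the slides.

```python
from statistics import mean


def summarize_iteration(flow_values):
    """Collapse one statistic across all flows of an iteration."""
    return {
        "n": len(flow_values),
        "mean": mean(flow_values),
        "min": min(flow_values),
        "max": max(flow_values),
    }


def mean_of_means(iterations):
    """Average the per-iteration means, weighting every iteration equally."""
    return mean(summarize_iteration(flows)["mean"] for flows in iterations)


# Hypothetical per-flow throughput values (Bps) for two iterations.
ITERATIONS = [[10.0, 20.0, 30.0], [40.0, 60.0]]
```

For ITERATIONS the per-iteration means are 20.0 and 50.0, so the mean of means is 35.0, while the mean over all five flows pooled together is 32.0. The two differ whenever iterations contain different numbers of flows, which is worth keeping in mind when only the summaries are stored.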


Lesson: Testbeds allow for many repetitions, but care should be taken when analyzing the data, especially to avoid conflating statistical significance with practical importance.


Hypothesis testing
  Everything seems significant with many iterations
  But practically important? (e.g. 0.1% decrease in latency)

Multiple comparisons
  Comparing many metrics can result in significance by chance
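Both concerns can be made concrete with a little arithmetic. The Python sketch below is an illustration only: the slides do not say which correction or threshold the authors applied, so the Bonferroni adjustment, the 1% practical-importance threshold, and all function names here are assumptions, and the false-positive formula assumes independent tests.

```python
def relative_change(baseline_mean, treatment_mean):
    """Relative difference between two means, as a fraction of the baseline."""
    return (treatment_mean - baseline_mean) / baseline_mean


def practically_important(baseline_mean, treatment_mean, min_relative_change=0.01):
    """True if the change is at least min_relative_change (hypothetical 1%)."""
    return abs(relative_change(baseline_mean, treatment_mean)) >= min_relative_change


def chance_of_false_positive(num_tests, alpha=0.05):
    """P(at least one false positive) over num_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / num_tests
```

A 0.1% decrease in latency (for example 100.0 ms to 99.9 ms) falls below the 1% threshold however small its p-value is. Testing all 78 tcptrace statistics at a significance level of 0.05 gives roughly a 98% chance of at least one spurious result if the tests are independent, and a Bonferroni correction would lower the per-test level to about 0.00064.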


What’s next?

Experiments with contention
  Run N client/server pairs on the same hardware
  Generates Nx the data
  Surprising performance improvement when N is small (<12)


Questions/Comments?

Lessons:
  Using a testbed toolset for experimentation requires substantial effort and consideration to put tools together in a workable and validated fashion.

  Instrumentation is invaluable but it is often manually added, expensive, and experiment-specific. Integrating more forms of instrumentation into the toolset could help researchers to more rapidly develop experiments.

  Testbeds can provide a wealth of data to researchers but should do more to streamline the process of collecting and aggregating it into a usable form.

  Testbeds allow for many repetitions, but care should be taken when analyzing the data, especially to avoid conflating statistical significance with practical importance.

Presenter: Jonathan Crussell