TRANSCRIPT
-
Learning Networking by Reproducing Results
Lisa Yan, Lecturer in Computer Science
Stanford University
June 24, 2020
1
Slides http://stanford.edu/~yanlisa/publications/precs20_yanlisa-slides.pdf
-
Lisa Yan, 2020 2
Suppose you need to teach…
1. Introduction to Networking
2. Graduate Networking
-
Lisa Yan, 2020
Introduction to networking
3
Application
Transport
Network
Link
-
Lisa Yan, 2020
Graduate networking
4
-
Lisa Yan, 2020
Graduate networking
5
Train and build experience in order to become a future networking researcher or networking engineer.
-
Lisa Yan, 2020
What kinds of systems should advanced students build?
6
Give them all the same project (a bit boring)
Have them create their own project (too risky)
-
Lisa Yan, 2020
What kinds of systems should advanced students build?
7
Assignment goals
• build a system
• think critically about a system
?
circa 2012: the beginning of Mininet, a realistic network emulator
-
Lisa Yan, 2020
What kinds of systems should advanced students build?
8
Assignment goals
• build a system
• think critically about a system
Reproduce someone else’s research.
circa 2012: the beginning of Mininet, a realistic network emulator
Lisa Yan and Nick McKeown. Learning Networking by Reproducing Research Results. CCR April 2017. Best of CCR award at SIGCOMM 2017.
-
Lisa Yan, 2020 9
How was our experience?
-
Lisa Yan, 2020 10
Really, really cool.
-
Lisa Yan, 2020 11
These projects…
1. Spark discussions between researchers and students.
2. Give students more tools to use in their own research.
3. Jumpstart careers in networking.
Provide a fully reproducible project in the public domain.
-
Lisa Yan, 2020
Today
Reproducing research project: Graduate computer networks
• Project overview
• Student stories
Greater impacts
• A stronger research community
• A framework for education
12
-
Lisa Yan, 2020
CS 244 Reproducibility Project
14
Timeline (days): 1, 7, 14, 21, 28
1. Project proposal
• Pick a paper and a key result to reproduce.
• Contact the original researchers.
2. Intermediate report
• Preliminary work
• TA-student meeting for next steps
3. Final report
• Blog post
• Public source code
• Steps for reproducing
4. Peer discussion
• In-class presentations
https://reproducingnetworkresearch.wordpress.com/
-
Lisa Yan, 2020
Research venues
15
top networking conferences
internet standards
systems
security
-
Lisa Yan, 2020
Experiment details: Original research
16
B4 Wide Area Network (WAN)
Facebook 2000-node cluster
NetFPGA, programmable solutions
Theoretical models
World wide web
Network traffic simulators and emulators:
• ns-2
• Mininet - http://mininet.org/
• Mahimahi - http://mahimahi.mit.edu/
-
Lisa Yan, 2020 17
How can we reproduce research with limited resources?
-
Lisa Yan, 2020 18
1. Use simulators and emulators where necessary.
-
Lisa Yan, 2020
1. Use simulators and emulators where necessary.
19
B4 Wide Area Network (WAN)
Facebook 2000-node cluster
NetFPGA, programmable solutions
Theoretical models
World wide web
Network traffic simulators and emulators:
• ns-2
• Mininet - http://mininet.org/
• Mahimahi - http://mahimahi.mit.edu/
-
Lisa Yan, 2020 20
2. Use cloud computing resources.
-
Lisa Yan, 2020
2. Use cloud computing resources.
21
-
Lisa Yan, 2020 22
3. Ask the original authors!
-
Lisa Yan, 2020
3. Ask the original authors!
23
System source code
• Open-source: 33%
• Open-source but out-of-date/inconsistent: 18%
• Part of Linux Kernel: 10%
• Contacted author: 7%
• Binary available: 1%
• Student-created: 19%
• Not-needed: 12%
Workload generation
• Open-source: 19%
• Sufficient details in paper: 40%
• Student-created: 41%
-
Lisa Yan, 2020 24
What have our students achieved?
-
Lisa Yan, 2020
Research topics
25
Spark, TCP, video streaming
73 unique published papers, 1993–2018
-
Lisa Yan, 2020
9 years of student projects
26
[Figure: share of successful vs. unsuccessful student groups (% of student groups, 0% to 100%) for each course offering, 2012 to 2019]
73 unique papers reproduced
300+ students since 2012 (150+ projects)
2016: introduced Mahimahi emulator
2018: require new research reproductions
-
Lisa Yan, 2020
Reproduced work, by popularity
27
[Figure: number of student reproductions (0 to 10) for each unique paper (73 total), pre-2018 vs. post-2018]
Labeled papers:
1. An Argument for Increasing TCP’s Initial Congestion Window (2010)
2. Jellyfish: Networking Data Centers Randomly (2012)
3. TCP Fast Open (2011)
4. Confused, timid, and unstable: picking a video streaming rate is hard (2012)
Post-2018: require new unique research reproductions
-
Lisa Yan, 2020 28
In these projects, our students learn a lot about engineering networked systems.
-
Lisa Yan, 2020
Quick refresher: Congestion control
29
Network congestion: overloading a network link, preventing useful communication
TCP congestion avoidance:
1. Increase sending window slowly (additively) with receiver acknowledgments (ACKs)
2. If data loss, decrease sending window quickly (multiplicatively)
[Figure: # packets sent over time, showing additive increase (1.) and multiplicative decrease (2.); the sender transmits data and the receiver returns acks]
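The two congestion avoidance rules above can be sketched as a small round-by-round model. This is only an illustrative sketch, not code from the talk or from any TCP stack: the function name `aimd_trace`, the one-segment increase per round, and the halving factor are assumptions chosen for the example.

```python
def aimd_trace(rounds, loss_rounds, cwnd=1.0, increase=1.0, decrease=0.5):
    """Return the sending window at the start of each round.

    Hypothetical model: one "round" is one round trip. In a round without
    loss the window grows additively; in a round with loss it shrinks
    multiplicatively (never below one segment).
    """
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            cwnd = max(1.0, cwnd * decrease)  # rule 2: multiplicative decrease
        else:
            cwnd += increase                  # rule 1: additive increase
    return trace


if __name__ == "__main__":
    # A single loss in round 4 produces the familiar sawtooth shape.
    print(aimd_trace(rounds=8, loss_rounds={4}))
```

Plotting the returned list against the round number gives the sawtooth that the slide's figure depicts.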
-
Lisa Yan, 2020
TCP opt-ack attack
30
Original result from paper
R. Sherwood et al. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005.
Optimistic acknowledgments (opt-acks) encourage victim senders to send more
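A toy model can show why optimistic ACKs push a sender to send more. It is a sketch under strong assumptions, not the attack from the paper and not the students' Mininet code: the function `sender_window`, the fixed `link_capacity`, and the simple halving rule are all hypothetical.

```python
def sender_window(rounds, link_capacity, optimistic):
    """Return the victim sender's window per round (toy model).

    Assumptions: the link drops anything beyond `link_capacity` segments
    per round. An honest receiver's ACKs reveal that loss, so the sender
    halves its window. An optimistic receiver ACKs data it never received,
    so the sender sees no loss and keeps growing.
    """
    cwnd = 1
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        loss_occurred = cwnd > link_capacity
        if loss_occurred and not optimistic:
            cwnd = max(1, cwnd // 2)  # sender reacts to the reported loss
        else:
            cwnd += 1                 # every ACK looks like successful delivery
    return trace


if __name__ == "__main__":
    print("honest receiver:    ", sender_window(10, link_capacity=4, optimistic=False))
    print("optimistic receiver:", sender_window(10, link_capacity=4, optimistic=True))
```

In this model the honest case oscillates around the link capacity, while the optimistic case grows without bound, which is the behavior the slide summarizes.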
-
Lisa Yan, 2020
Reproduced: TCP opt-ack attack
31
Original result from paper: ns-2 (simulator)
Students’ reproduced result (2016, blog post): Mininet (emulator)
R. Sherwood et al. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005.
https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs-244-16-misbehaving-tcp-receivers-can-cause-internet-wide-congestion-collapse/
-
Lisa Yan, 2020
Choosing a video streaming rate
32
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Overly conservative video streaming rates lead to dismally low throughput
-
Lisa Yan, 2020
Reproduced: Choosing a video streaming rate (2013)
33
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Students’ reproduced result (2013, blog post)
“Our experiments use the real backend servers of [Netflix]. We do not use Mininet or any other form of network emulation.”
https://reproducingnetworkresearch.wordpress.com/2013/03/13/cs244-13-video-rate-selection-for-streaming-services/
-
Lisa Yan, 2020
Reproduced: Choosing a video streaming rate (2013)
34
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Students’ reproduced result (2013, blog post)
“We can easily observe that the problem reported in the paper has since been fixed in [Netflix]…We have contacted the paper authors and they confirm that this is the case.”
https://reproducingnetworkresearch.wordpress.com/2013/03/13/cs244-13-video-rate-selection-for-streaming-services/
-
Lisa Yan, 2020
Reproduced: Choosing a video streaming rate (2017)
35
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Students’ reproduced result (2017, blog post)
“We chose to start with examining [Vimeo and YouTube], since they are freely accessible without a subscription and there exist third party tools … for manipulating video downloads…”
https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/
-
Lisa Yan, 2020
Replicated: Choosing a video streaming rate (2017)
36
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Students’ reproduced result (2017, blog post)
“Experimental results show that YouTube’s player does not exhibit the downward spiral effect…”
https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/
-
Lisa Yan, 2020
Replicated: Choosing a video streaming rate (2017)
37
Original result from paper (2012)
T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.
Students’ reproduced result (2017, blog post)
“…while Vimeo’s player does.”
https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/
-
Lisa Yan, 2020
AWStream
38
Original result from paper GitHub open-source code
Adaptive streaming in wide-area networks (geo-distributed sites, scarce/variable bandwidth)
B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.
-
Lisa Yan, 2020
Reproduced: AWStream
39
Original result from paper
Students’ reproduced result (2019, blog post)
B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.
-
Lisa Yan, 2020
Reproduced: AWStream
40
Original result from paper
B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.
Students’ reproduced result (2019, blog post)
“…overall we found [the documentation] to be outdated and at times misleading. We relied mostly on close reading of the code and email correspondence with the original paper’s authors for guidance.”
-
Lisa Yan, 2020 41
What about unsuccessful research reproductions?
-
Lisa Yan, 2020
Common scenarios
42
Overambitious engineering
“We spent our last week trying to find a mixed LP optimizer.” (reproduction of FastMPC, SIGCOMM 2015, blog post)
Differences in workloads
“Average QoE measurements were much higher than those reported…our Wifi/International Links more than capable of delivering high quality video streams” (reproduction of Pensieve, SIGCOMM 2017, blog post)
Emulator performance restrictions
“We scaled down all load generation parameters, but we still couldn’t achieve target latencies when emulating on a single machine.” (reproduction of QJump, NSDI 2015, blog post)
https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs244-16-failed-experiments-with-fastmpc-integrating-rate-based-adaptive-streaming-into-vlc/
https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs244-16-qjump-controlling-network-interference/
-
Lisa Yan, 2020 43
How does time affect research artifacts?
(at least) two examples:
1. Linux kernel versions
2. Web traffic
-
Lisa Yan, 2020
Linux kernel versions: PRR for TCP
44
PRR (Proportional Rate Reduction) paces out retransmissions across received ACKs.
N. Dukkipati, et al. Proportional Rate Reduction for TCP. IMC 2011.
Year         Ubuntu version  Linux kernel  PRR
2011         11.10           3.0           No PRR
2014         12.04           3.2           With PRR
Late 2016    17.04+          4.10+         Option to turn off PRR
Present-day  20.04+          5.4+
365 commits to /net/ipv4/tcp_input.c
12 commits reference PRR
“The two (additional) discrepancies in /net/ipv4/tcp_input.c do not have a big impact on the experiment results.” (2015, blog post)
https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs24415-proportional-rate-reduction-of-tcp/
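To make "paces out retransmissions across received ACKs" concrete, here is a simplified sketch of the proportional part of PRR as described in RFC 6937. It is not the Linux `tcp_input.c` code and not the students' experiment: it assumes the amount of data in flight stays above `ssthresh` for the whole recovery (so the RFC's reduction-bound branch is omitted), and the function name `prr_send_counts` is hypothetical. The variable names follow the RFC.

```python
def prr_send_counts(recover_fs, ssthresh, delivered_per_ack):
    """Segments the sender may transmit on each ACK during loss recovery.

    recover_fs: segments in flight when recovery started (RecoverFS).
    ssthresh: target window after the reduction.
    delivered_per_ack: newly delivered segments reported by each ACK.
    Simplification (assumption): only the proportional branch is modeled.
    """
    prr_delivered = 0  # total segments delivered during recovery
    prr_out = 0        # total segments sent during recovery
    sent_per_ack = []
    for delivered in delivered_per_ack:
        prr_delivered += delivered
        # Integer ceiling of prr_delivered * ssthresh / recover_fs.
        target = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
        sndcnt = max(target - prr_out, 0)
        prr_out += sndcnt
        sent_per_ack.append(sndcnt)
    return sent_per_ack


if __name__ == "__main__":
    # Halving a 20-segment flight: roughly one segment sent per two ACKs.
    print(prr_send_counts(recover_fs=20, ssthresh=10, delivered_per_ack=[1] * 20))
```

With these inputs the sender releases one segment on every other ACK instead of going silent and then bursting, which is the pacing behavior the slide refers to.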
-
Lisa Yan, 2020
HTTPS and web traffic
45
In 2014, HTTPS page load times were slower than HTTP.
Original result from paper (Alexa top 500 websites)
D. Naylor et al. The Cost of the “S” in HTTPS. CoNEXT 2014.
• 40% of sites: HTTPS noticeably slower
• 55% of sites: no difference•
-
Lisa Yan, 2020
HTTPS and web traffic
46
In 2014, HTTPS page load times were slower than HTTP.
Original result from paper (Alexa top 500 websites)
D. Naylor et al. The Cost of the “S” in HTTPS. CoNEXT 2014.
Students’ reproduced result (2017, blog post)
(over 4G) (over Fiber)
• 15% of sites: HTTPS noticeably slower (vs 40%)
• 80%: no difference (vs 55%)
• 30%: HTTPS faster (vs
-
Lisa Yan, 2020
TCP Fast Open
47
Page Load Time (PLT) much higher in recent years
Emulator: Dummynet
Source         Page     RTT (ms)  PLT: non-TFO (s)  PLT: TFO (s)  Improvement
CoNEXT 2011    Amazon   100       2.60              2.34          10%
CoNEXT 2011    NYTimes  100       4.59              4.30          6%
Students 2015  Amazon   100       15.92             12.55         21%
Students 2015  NYTimes  100       5.37              4.03          25%
S. Radhakrishnan et al. TCP Fast Open. CoNEXT 2011.
Students’ reproduction (2015, blog post)
https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs244-15-tcp-fast-open/
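The improvement column in the table above is consistent with the relative reduction in page load time, (non-TFO minus TFO) divided by non-TFO. The sketch below recomputes it from the slide's numbers. The formula is an assumption inferred from the figures rather than something stated in the talk, and the helper name `improvement_pct` is hypothetical.

```python
# (source, page, PLT without TFO in seconds, PLT with TFO in seconds),
# copied from the table on the slide.
ROWS = [
    ("CoNEXT 2011", "Amazon", 2.60, 2.34),
    ("CoNEXT 2011", "NYTimes", 4.59, 4.30),
    ("Students 2015", "Amazon", 15.92, 12.55),
    ("Students 2015", "NYTimes", 5.37, 4.03),
]


def improvement_pct(plt_without_tfo, plt_with_tfo):
    """Relative page load time reduction, in percent (assumed definition)."""
    return 100.0 * (plt_without_tfo - plt_with_tfo) / plt_without_tfo


if __name__ == "__main__":
    for source, page, without_tfo, with_tfo in ROWS:
        pct = improvement_pct(without_tfo, with_tfo)
        print(f"{source:14s} {page:8s} {pct:5.1f}%")
```

Rounded to whole percentages, the recomputed values match the four entries in the table.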
-
Lisa Yan, 2020 48
(pause)
-
Lisa Yan, 2020 49
So what?
-
Lisa Yan, 2020
Today
Reproducing research project: Graduate computer networks
• Project overview
• Student stories
Greater impacts
• A stronger research community
• A framework for education
50
-
Lisa Yan, 2020
A stronger research community
51
Student
Original researcher
New researcher
Simulator/emulator developer
-
Lisa Yan, 2020
AWStream (SIGCOMM 2018, Students 2019)
52
Original result from paper
Students’ reproduced result (2019, blog post)
“…overall we found [the documentation] to be outdated and at times misleading. We relied mostly on close reading of the code and email correspondence with the original paper’s authors for guidance.”
B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.
-
Lisa Yan, 2020
QJump (NSDI 2015, Students 2015)
“Their assumption was that [people] would reproduce the results in an actual datacenter, whereas we did the emulation in Mininet.”
“In the end, we did not use their scripts directly, but it was nice to see that the authors were enthusiastic to have their work reproduced.”
53
M. Grosvenor et al. Queues don’t matter when you can JUMP them! NSDI 2015.
(2015, blog post)
https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs-244-15-qjump-delay-guarantees-in-datacenter-networks
-
Lisa Yan, 2020
A stronger research community
54
Student
Original researcher
New researcher
Simulator/emulator developer
“I learned how to implement a scheduler for my graduate research!”
“I can confirm and improve my current and past research!”
-
Lisa Yan, 2020
A stronger research community
55
Student
Original researcher
New researcher
Simulator/emulator developer
“I just started a career in networks, and this prepared me for the real world.”
Tool feedback/development
Mininet: http://mininet.org/
Mahimahi: http://mahimahi.mit.edu/
-
Lisa Yan, 2020
A stronger research community
56
Student
Original researcher
New researcher
Simulator/emulator developer
A fully reproducible project in the public domain.
• Other researchers can build upon it
• Eases technology transfer
“We were contacted by both the original authors and a student working on his own research!”
-
Lisa Yan, 2020 57
How can we go beyond networks?
-
Lisa Yan, 2020
Example assignment schedule (10-week)
58
Timeline: week 1, mid-quarter, week 10
Assignment 1: Core topic, practice emulation environment
Assignment 2: Core topic, practice emulation environment
Final project: Reproduce research
-
Lisa Yan, 2020
Example assignment schedule (10-week)
59
Timeline: week 1, mid-quarter, week 10
Assignment 1: Core topic, practice emulation environment
Assignment 2: Reproduce the same project
Final project: Reproduce research, or original work
-
Lisa Yan, 2020
For platform developers
60
Timeline: week 1, mid-quarter, week 10
Assignment 1: Core topic, practice emulation environment
Assignment 2: Core topic, practice emulation environment
Final project: Reproduce research
• Provide a list of papers/suggested projects to get students started
• Be accessible and responsive to a multitude of applications
-
Lisa Yan, 2020
Encouraging community-led reproducible research
61
This talk
Reproducible research for all
-
Thank you!
62
http://cs244.stanford.edu/reproducibility
L. Yan and N. McKeown. Learning Networking by Reproducing Research Results. CCR April 2017.
https://ccronline.sigcomm.org/2017/learning-networking-by-reproducing-research-results/
CS 244: Advanced Topics in Networking
Nick McKeown, Keith Winstein, Sachin Katti, Bruce Spang