Learning Networking by Reproducing Results. Lisa Yan, Lecturer in Computer Science, Stanford University. June 24, 2020. Slides: http://stanford.edu/~yanlisa/publications/precs20_yanlisa-slides.pdf

TRANSCRIPT

  • Learning Networking by Reproducing Results
    Lisa Yan, Lecturer in Computer Science, Stanford University
    June 24, 2020

    1

    Slides: http://stanford.edu/~yanlisa/publications/precs20_yanlisa-slides.pdf

  • Lisa Yan, 2020 2

    Suppose you need to teach…
    1. Introduction to Networking
    2. Graduate Networking

  • Lisa Yan, 2020

    Introduction to networking

    3

    Application

    Transport

    Network

    Link

  • Lisa Yan, 2020

    Graduate networking

    4

  • Lisa Yan, 2020

    Graduate networking

    5

    Train and build experience in order to become a future networking researcher or networking engineer.

  • Lisa Yan, 2020

    What kinds of systems should advanced students build?

    6

    Give them all the same project (a bit boring)

    Have them create their own project (too risky)

  • Lisa Yan, 2020

    What kinds of systems should advanced students build?

    7

    Assignment goals
    • build a system
    • think critically about a system

    ?

    circa 2012: the beginning of Mininet, a realistic network emulator

  • Lisa Yan, 2020

    What kinds of systems should advanced students build?

    8

    Assignment goals
    • build a system
    • think critically about a system

    Lisa Yan and Nick McKeown. Learning Networking by Reproducing Research Results. CCR April 2017. Best of CCR award at SIGCOMM 2017.

    Reproduce someone else’s research.

    circa 2012: the beginning of Mininet, a realistic network emulator

  • Lisa Yan, 2020 9

    How was our experience?


  • Lisa Yan, 2020 10

    Really, really cool.


  • Lisa Yan, 2020 11

    These projects…
    1. Spark discussions between researchers and students.
    2. Give students more tools to use in their own research.
    3. Jumpstart careers in networking.

    Provide a fully reproducible project in the public domain.


  • Lisa Yan, 2020

    Today

    Reproducing research project: Graduate computer networks
    • Project overview
    • Student stories

    Greater impacts
    • A stronger research community
    • A framework for education

    12


  • Lisa Yan, 2020

    CS 244 Reproducibility Project

    14

    Timeline: Day 1, 7, 14, 21, 28

    1. Project proposal
       • Pick a paper and a key result to reproduce.
       • Contact the original researchers.

    2. Intermediate report
       • Preliminary work
       • TA-student meeting for next steps

    3. Final report
       • Blog post
       • Public source code
       • Steps for reproducing

    4. Peer discussion
       • In-class presentations

    https://reproducingnetworkresearch.wordpress.com/

  • Lisa Yan, 2020

    Research venues

    15

    top networking conferences

    internet standards

    systems

    security

  • Lisa Yan, 2020

    Experiment details: Original research

    16

    B4 Wide Area Network (WAN)

    Facebook 2000-node cluster

    NetFPGA, programmable solutions

    Theoretical models

    ns-2

    Network traffic simulators and emulators

    Mininet - http://mininet.org/
    Mahimahi - http://mahimahi.mit.edu/

    World wide web

  • Lisa Yan, 2020 17

    How can we reproduce research with limited resources?

  • Lisa Yan, 2020 18

    1. Use simulators and emulators where necessary.

  • Lisa Yan, 2020

    1. Use simulators and emulators where necessary.

    19

    B4 Wide Area Network (WAN)

    Facebook 2000-node cluster

    NetFPGA, programmable solutions

    Theoretical models

    ns-2

    Network traffic simulators and emulators

    Mininet - http://mininet.org/
    Mahimahi - http://mahimahi.mit.edu/

    World wide web

  • Lisa Yan, 2020 20

    2. Use cloud computing resources.

  • Lisa Yan, 2020

    2. Use cloud computing resources.

    21

  • Lisa Yan, 2020 22

    3. Ask the original authors!

  • Lisa Yan, 2020

    3. Ask the original authors!

    23

    System source code
    • Open-source: 33%
    • Open-source but out-of-date/inconsistent: 18%
    • Part of Linux kernel: 10%
    • Contacted author: 7%
    • Binary available: 1%
    • Student-created: 19%
    • Not needed: 12%

    Workload generation
    • Open-source: 19%
    • Sufficient details in paper: 40%
    • Student-created: 41%

  • Lisa Yan, 2020 24

    What have our students achieved?

  • Lisa Yan, 2020

    Research topics

    25

    Spark

    TCP

    video streaming

    73 unique published papers, 1993–2018

  • Lisa Yan, 2020

    9 years of student projects

    26

    [Figure: % of student groups with unsuccessful vs. successful reproductions, by course offering, 2012–2019]

    73 unique papers reproduced
    300+ students since 2012 (150+ projects)

    2016: introduced Mahimahi emulator
    2018: require new research reproductions

  • Lisa Yan, 2020

    Reproduced work, by popularity

    27

    [Figure: number of student reproductions (0–10) for each unique paper (73 total), split into pre-2018 and post-2018 (require new unique research reproductions)]

    1. An Argument for Increasing TCP’s Initial Congestion Window (2010)
    2. Jellyfish: Networking Data Centers Randomly (2012)
    3. TCP Fast Open (2011)
    4. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard (2012)

  • Lisa Yan, 2020 28

    In these projects, our students learn a lot about engineering networked systems.

  • Lisa Yan, 2020

    Quick refresher: Congestion control

    Network congestion: overloading a network link, preventing useful communication

    TCP congestion avoidance:
    1. Increase the sending window slowly (additively) with receiver acknowledgments (ACKs).
    2. If data is lost, decrease the sending window quickly (multiplicatively).

    29

    [Figure: number of packets sent over time, with phases 1 and 2 marked; the sender sends data and the receiver returns ACKs]
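
    The two congestion avoidance rules above can be sketched as a toy simulation. This is only an illustration, not real TCP: the function name `aimd_trace`, the `capacity` threshold that stands in for packet loss, and the default increase/decrease values are all assumptions made for this sketch.

```python
def aimd_trace(rounds, capacity, increase=1.0, decrease=0.5, start=1.0):
    """Return the sending window (in packets) at the start of each round trip.

    Toy model (assumption): a loss happens whenever the window exceeds
    `capacity`; real TCP detects loss through timeouts or duplicate ACKs.
    """
    window = start
    trace = []
    for _ in range(rounds):
        trace.append(window)
        if window > capacity:
            # Rule 2: data was lost, so shrink the window multiplicatively.
            window = max(1.0, window * decrease)
        else:
            # Rule 1: ACKs arrived, so grow the window additively.
            window += increase
    return trace


if __name__ == "__main__":
    for rtt, window in enumerate(aimd_trace(rounds=12, capacity=8)):
        print(f"RTT {rtt:2d}: window = {window:4.1f} " + "#" * int(window))
```

    Plotting the returned list against the round-trip index gives the sawtooth shape usually drawn for this slide: a slow linear climb followed by a sharp drop.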

  • Lisa Yan, 2020

    TCP opt-ack attack

    30

    Original result from paper
    R. Sherwood et al. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005.

    Optimistic acknowledgments (opt-acks) encourage victim senders to send more

  • Lisa Yan, 2020

    Reproduced: TCP opt-ack attack

    31

    Original result from paper: ns-2 (simulator)
    Students’ reproduced result (2016, blog post): Mininet (emulator)

    R. Sherwood et al. Misbehaving TCP receivers can cause internet-wide congestion collapse. CCS 2005.

    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs-244-16-misbehaving-tcp-receivers-can-cause-internet-wide-congestion-collapse/

  • Lisa Yan, 2020

    Choosing a video streaming rate

    32

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Overly conservative video streaming rates lead to dismally low throughput

  • Lisa Yan, 2020

    Reproduced: Choosing a video streaming rate (2013)

    33

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Students’ reproduced result (2013, blog post)

    “Our experiments use the real backend servers of [Netflix]. We do not use Mininet or any other form of network emulation.”

    https://reproducingnetworkresearch.wordpress.com/2013/03/13/cs244-13-video-rate-selection-for-streaming-services/

  • Lisa Yan, 2020

    Reproduced: Choosing a video streaming rate (2013)

    34

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Students’ reproduced result (2013, blog post)

    “We can easily observe that the problem reported in the paper has since been fixed in [Netflix]…We have contacted the paper authors and they confirm that this is the case.”

    https://reproducingnetworkresearch.wordpress.com/2013/03/13/cs244-13-video-rate-selection-for-streaming-services/

  • Lisa Yan, 2020

    Reproduced: Choosing a video streaming rate (2017)

    35

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Students’ reproduced result (2017, blog post)

    “We chose to start with examining [Vimeo and YouTube], since they are freely accessible without a subscription and there exist third party tools … for manipulating video downloads…”

    https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/

  • Lisa Yan, 2020

    Replicated: Choosing a video streaming rate (2017)

    36

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Students’ reproduced result (2017, blog post)

    “Experimental results show that YouTube’s player does not exhibit the downward spiral effect…

    https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/

  • Lisa Yan, 2020

    Replicated: Choosing a video streaming rate (2017)

    37

    Original result from paper (2012)
    T.-Y. Huang et al. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. IMC 2012.

    Students’ reproduced result (2017, blog post)

    …while Vimeo’s player does.

    https://reproducingnetworkresearch.wordpress.com/2017/06/05/cs244-17-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard-2/

  • Lisa Yan, 2020

    AWStream

    38

    Original result from paper GitHub open-source code

    Adaptive streaming in wide-area networks (geo-distributed sites, scarce/variable bandwidth)

    B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.

  • Lisa Yan, 2020

    Reproduced: AWStream

    39

    Original result from paper Students’ reproduced result (2019, blog post)

    B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.

    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs-244-16-misbehaving-tcp-receivers-can-cause-internet-wide-congestion-collapse/

  • Lisa Yan, 2020

    Reproduced: AWStream

    40

    Original result from paper

    B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.

    Students’ reproduced result (2019, blog post)

    “…overall we found [the documentation] to be outdated and at times misleading. We relied mostly on close reading of the code and email correspondence with the original paper’s authors for guidance.”

    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs-244-16-misbehaving-tcp-receivers-can-cause-internet-wide-congestion-collapse/

  • Lisa Yan, 2020 41

    What about unsuccessful research reproductions?

  • Lisa Yan, 2020

    Common scenarios

    42

    Overambitious engineering
    “We spent our last week trying to find a mixed LP optimizer.” (reproduction of FastMPC, SIGCOMM 2015, blog post)

    Differences in workloads
    “Average QoE measurements were much higher than those reported…our Wifi/International Links more than capable of delivering high quality video streams” (reproduction of Pensieve, SIGCOMM 2017, blog post)

    Emulator performance restrictions
    “We scaled down all load generation parameters, but we still couldn’t achieve target latencies when emulating on a single machine.” (reproduction of QJump, NSDI 2015, blog post)

    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs244-16-failed-experiments-with-fastmpc-integrating-rate-based-adaptive-streaming-into-vlc/
    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs244-16-qjump-controlling-network-interference/

  • Lisa Yan, 2020 43

    How does time affect research artifacts?

    (at least) two examples:
    1. Linux kernel versions
    2. Web traffic

  • Lisa Yan, 2020

    Linux kernel versions: PRR for TCP

    PRR (Proportional Rate Reduction) paces out retransmissions across received ACKs.

    44

    N. Dukkipati, et al. Proportional Rate Reduction for TCP. IMC 2011.

    “The two (additional) discrepancies in /net/ipv4/tcp_input.c do not have a big impact on the experiment results.” (2015, blog post)

    Year         Ubuntu version  Linux kernel  PRR status
    2011         11.10           3.0           No PRR
    2014         12.04           3.2           With PRR
    Late 2016    17.04+          4.10+         Option to turn off PRR
    Present-day  20.04+          5.4+

    365 commits to /net/ipv4/tcp_input.c
    12 commits reference PRR

    https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs24415-proportional-rate-reduction-of-tcp/
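
    To make "paces out retransmissions across received ACKs" concrete, here is a rough sketch of the proportional phase of PRR. It follows the sndcnt rule as specified in RFC 6937, which is an outside reference (the slides cite only the IMC 2011 paper). The function names are hypothetical, each ACK is assumed to report exactly one newly delivered segment, and the pipe estimate and the slow-start branch of the full algorithm are left out.

```python
def prr_sndcnt(prr_delivered, prr_out, ssthresh, recover_fs):
    """Segments the sender may transmit in response to the current ACK.

    Proportional phase only: keep the total sent during recovery (prr_out)
    at ssthresh / recover_fs of the total delivered (prr_delivered).
    """
    # Integer ceiling of prr_delivered * ssthresh / recover_fs.
    allowed_so_far = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
    return max(0, allowed_so_far - prr_out)


def simulate_recovery(recover_fs, ssthresh, acks):
    """Return how many segments are sent on each ACK during recovery."""
    prr_delivered = 0
    prr_out = 0
    sent_per_ack = []
    for _ in range(acks):
        prr_delivered += 1  # assumption: one segment delivered per ACK
        sndcnt = prr_sndcnt(prr_delivered, prr_out, ssthresh, recover_fs)
        prr_out += sndcnt
        sent_per_ack.append(sndcnt)
    return sent_per_ack


if __name__ == "__main__":
    # 10 segments in flight at the start of recovery, target window of 5.
    print(simulate_recovery(recover_fs=10, ssthresh=5, acks=10))
```

    With these illustrative numbers the sender transmits on every other ACK, so the window shrinks to the target gradually instead of stalling and then sending a burst.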

  • Lisa Yan, 2020

    HTTPS and web traffic

    In 2014, HTTPS page load times were slower than HTTP.

    45

    Original result from paper (Alexa top 500 websites)

    D. Naylor et al. The Cost of the “S” in HTTPS. CoNEXT 2014.

    • 40% of sites: HTTPS noticeably slower
    • 55% of sites: no difference
    •

  • Lisa Yan, 2020

    HTTPS and web traffic

    In 2014, HTTPS page load times were slower than HTTP.

    46

    Original result from paper (Alexa top 500 websites)

    D. Naylor et al. The Cost of the “S” in HTTPS. CoNEXT 2014.

    Students’ reproduced result (2017, blog post)

    (over 4G) (over Fiber)

    • 15% of sites: HTTPS noticeably slower (vs 40%)
    • 80%: no difference (vs 55%)
    • 30%: HTTPS faster (vs

  • Lisa Yan, 2020

    TCP Fast Open

    Page Load Time (PLT) much higher in recent years
    Emulator: Dummynet

    47

    Source         Page     RTT (ms)  PLT: non-TFO (s)  PLT: TFO (s)  Improvement
    CoNEXT 2011    Amazon   100       2.60              2.34          10%
    CoNEXT 2011    NYTimes  100       4.59              4.30          6%
    Students 2015  Amazon   100       15.92             12.55         21%
    Students 2015  NYTimes  100       5.37              4.03          25%

    S. Radhakrishnan et al. TCP Fast Open. CoNEXT 2011.
    Students’ reproduced result (2015, blog post)

    https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs244-15-tcp-fast-open/
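
    The improvement column is consistent with the relative reduction in page load time, (non-TFO − TFO) / non-TFO. The short check below recomputes it from the four rows on the slide; the formula is inferred from the numbers rather than stated in the talk, and the helper name `improvement_pct` is made up for this sketch.

```python
def improvement_pct(plt_without_tfo, plt_with_tfo):
    """Relative page-load-time reduction from TCP Fast Open, in percent."""
    return 100.0 * (plt_without_tfo - plt_with_tfo) / plt_without_tfo


# (source, page, PLT without TFO in seconds, PLT with TFO in seconds)
ROWS = [
    ("CoNEXT 2011", "Amazon", 2.60, 2.34),
    ("CoNEXT 2011", "NYTimes", 4.59, 4.30),
    ("Students 2015", "Amazon", 15.92, 12.55),
    ("Students 2015", "NYTimes", 5.37, 4.03),
]

if __name__ == "__main__":
    for source, page, without_tfo, with_tfo in ROWS:
        pct = improvement_pct(without_tfo, with_tfo)
        print(f"{source:14} {page:8} {pct:.0f}%")
```

    Rounded to whole percentages, the results match the 10%, 6%, 21%, and 25% reported in the table.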

  • Lisa Yan, 2020 48

    (pause)

  • Lisa Yan, 2020 49

    So what?

  • Lisa Yan, 2020

    Today

    Reproducing research project: Graduate computer networks
    • Project overview
    • Student stories

    Greater impacts
    • A stronger research community
    • A framework for education

    50

  • Lisa Yan, 2020

    A stronger research community

    51

    Student

    Original researcher

    New researcher

    Simulator/emulator developer

  • Lisa Yan, 2020

    “…overall we found [the documentation] to be outdated and at times misleading. We relied mostly on close reading of the code and email correspondence with the original paper’s authors for guidance.”

    B. Zhang et al. AWStream: Adaptive Wide-Area Streaming Analytics. SIGCOMM 2018.

    AWStream (SIGCOMM 2018, Students 2019)

    52

    Original result from paper Students’ reproduced result (2019, blog post)

    https://reproducingnetworkresearch.wordpress.com/2016/05/30/cs-244-16-misbehaving-tcp-receivers-can-cause-internet-wide-congestion-collapse/

  • Lisa Yan, 2020

    QJump (NSDI 2015, Students 2015)

    “Their assumption was that [people] would reproduce the results in an actual datacenter, whereas we did the emulation in Mininet.”

    “In the end, we did not use their scripts directly, but it was nice to see that the authors were enthusiastic to have their work reproduced.”

    53

    M. Grosvenor et al. Queues don’t matter when you can JUMP them! NSDI 2015.

    (2015, blog post)

    https://reproducingnetworkresearch.wordpress.com/2015/05/31/cs-244-15-qjump-delay-guarantees-in-datacenter-networks

  • Lisa Yan, 2020

    A stronger research community

    54

    Student

    Original researcher

    New researcher

    Simulator/emulator developer

    I learned how to implement a scheduler for my graduate research!

    I can confirm and improve my current and past research!

  • Lisa Yan, 2020

    A stronger research community

    55

    Student

    Original researcher

    New researcher

    Simulator/emulator developer

    I just started a career in networks, and this prepared me for the real world.

    Tool feedback/development
    • Mininet: http://mininet.org/
    • Mahimahi: http://mahimahi.mit.edu/

  • Lisa Yan, 2020

    A stronger research community

    56

    Student

    Original researcher

    New researcher

    Simulator/emulator developer

    A fully reproducible project in the public domain.
    • Other researchers can build upon it
    • Eases technology transfer

    We were contacted by both the original authors and a student working on his own research!

  • Lisa Yan, 2020 57

    How can we go beyond networks?

  • Lisa Yan, 2020

    Example assignment schedule (10-week)

    58

    Timeline: Week 1, mid-quarter, Week 10

    Assignment 1: Core topic; practice emulation environment
    Assignment 2: Core topic; practice emulation environment
    Final project: Reproduce research

  • Lisa Yan, 2020

    Example assignment schedule (10-week)

    59

    Timeline: Week 1, mid-quarter, Week 10

    Assignment 1: Core topic; practice emulation environment
    Assignment 2: Reproduce the same project
    Final project: Reproduce research, or original work

  • Lisa Yan, 2020

    For platform developers

    60

    Timeline: Week 1, mid-quarter, Week 10

    Assignment 1: Core topic; practice emulation environment
    Assignment 2: Core topic; practice emulation environment
    Final project: Reproduce research

    • Provide a list of papers/suggested projects to get students started
    • Be accessible and responsive to a multitude of applications

  • Lisa Yan, 2020

    Encouraging community-led reproducible research

    61

    This talk

    Reproducible research for all

  • Thank you!
    cs244.stanford.edu/reproducibility

    62

    L. Yan and N. McKeown. Learning Networking by Reproducing Research Results. CCR April 2017.
    https://ccronline.sigcomm.org/2017/learning-networking-by-reproducing-research-results/

    Nick McKeown, Keith Winstein, Sachin Katti, Bruce Spang
    CS 244: Advanced Topics in Networking

    http://cs244.stanford.edu/reproducibility