perfsonar · measurement measuring network performance and monitoring network components are a...
TRANSCRIPT
![Page 2: perfSONAR · Measurement Measuring network performance and monitoring network components are a critical part of any high-performance network deployed today](https://reader033.vdocuments.us/reader033/viewer/2022042000/5e6d56bba1878f1cca302576/html5/thumbnails/2.jpg)
Agenda • Motivation • What is perfSONAR? • Suggested Deployment for Campus/Regional
Motivation
• Networks are an essential part of data-intensive science
– Connect data sources to data analysis
– Connect collaborators to each other
– Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, scale
• Performance is critical
– Exponential data growth
– Constant human factors
– Technology changes/improvements/paradigm shifts
– Data movement and data analysis must keep up
• Effective use of wide area (long-haul) networks by scientists has historically been difficult
Measurement
Measuring network performance and monitoring network components are a critical part of any high-performance network deployed today. In-depth network measurement and monitoring services give researchers and engineers views into application performance and help them troubleshoot network problems.
Network Monitoring
• All networks do some form of monitoring.
• It addresses the needs of local staff for understanding the state of the network
o Would this information be useful to external users?
o Can these tools function on a multi-domain basis?
• Beyond passive methods, there are active tools.
o E.g. often we want a ‘throughput’ number. Can we automate that idea?
o Wouldn’t it be nice to get some sort of plot of performance over the course of a day? Week? Year? Multiple endpoints?
• Where is the “Measurement Middleware”? Something to allow for the easy exchange of metrics that are collected locally, on a global scale?
Soft Failures
• Soft failures are where basic connectivity functions, but high performance is not possible.
• TCP was intentionally designed to hide all transmission errors from the user:
– “As long as the TCPs continue to function properly and the internet system does not become completely partitioned, no transmission errors will affect the users.” (From IEN 129, RFC 761)
• Some soft failures only affect high-bandwidth, long-RTT flows.
• Hard failures are easy to detect and fix
• Soft failures can lie hidden for years!
• Soft failures can be present on the host, protocol, application, or network
• One network problem can often mask others – this is common
Where Are The Problems?
[Diagram: end-to-end path from the Source (S) campus through Regional and Backbone/NREN networks to the Destination (D) campus. Callouts: congested or faulty links between domains; congested intra-campus links; latency-dependent problems inside domains with small RTT.]
Local Testing Will Not Find Everything
[Diagram: end-to-end path from the Source (S) campus through Regional and R&E Backbone networks to the Destination (D) campus, with a switch with small buffers on the path. Callouts: performance is good when RTT is < ~10 ms; performance is poor when RTT exceeds ~10 ms.]
Agenda • Motivation • What is perfSONAR? • Suggested Deployment for Campus/Regional
What is perfSONAR?
• perfSONAR is a tool to:
– Set network performance expectations for a variety of use cases
– Find network problems (“soft failures”) and help fix these problems
– Mitigate the risks that are associated with the R&E environment (e.g. get out in front of problems before it’s too late)
• All in multi-domain environments
– These problems are all harder when multiple networks are involved – we need a mechanism to stop ‘finger pointing’ and get real work done
• perfSONAR provides a standard way to publish active and passive monitoring data
– This data is interesting to network researchers as well as network operators
– This is the measurement middleware – a way to tie together local and end-to-end measurements
– A way to separate a network problem from that of an application or host
• 10 – ESnet Science Engagement ([email protected]) - 5/6/14
What is perfSONAR (cont.)
• perfSONAR is an infrastructure for network performance monitoring.
• It is a services oriented architecture delivering performance measurements in a federated environment.
• It is an intermediate layer between the performance measurement tools and the diagnostic or visualization applications.
• A methodology for monitoring network connections that span multiple administrative domains.
• Partners include: GEANT2, ESnet, Internet2 (I2), RNP
• http://www.perfsonar.net/
perfSONAR Present
Lookup Service Directory Search: http://stats.es.net/ServicesDirectory/
Lookup Service
• Services register their existence and capabilities with an LS.
• Clients discover services by querying the LS.
• LSs are found by multicast, well-known servers, local configuration, or other LSs.
• An LS is queried on attributes (service type, authentication) and more complex constructs (network location), not simply by name.
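The register-then-query pattern described for the Lookup Service can be sketched in a few lines. This is a toy in-memory model, not the perfSONAR LS protocol or API: the class names, the `domain` field standing in for "network location", and the example hostnames are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One registered service. All field names are hypothetical."""
    name: str
    service_type: str                      # e.g. "MA" or "MP"
    domain: str                            # stand-in for network location
    attributes: dict = field(default_factory=dict)


class LookupService:
    """Toy LS: services register themselves, clients query by attribute."""

    def __init__(self):
        self._records = []

    def register(self, record):
        """Record that a service exists and what it can do."""
        self._records.append(record)

    def query(self, **criteria):
        """Return records matching every criterion.

        A criterion is checked against the record's own fields first,
        then against its free-form attributes dictionary.
        """
        def matches(rec):
            for key, wanted in criteria.items():
                actual = getattr(rec, key, rec.attributes.get(key))
                if actual != wanted:
                    return False
            return True

        return [rec for rec in self._records if matches(rec)]


ls = LookupService()
ls.register(ServiceRecord("ma-1", "MA", "campus-a.example", {"auth": "none"}))
ls.register(ServiceRecord("mp-1", "MP", "campus-a.example", {"auth": "x509"}))
ls.register(ServiceRecord("ma-2", "MA", "regional-b.example", {"auth": "x509"}))

# Query on attributes rather than on a service name.
open_archives = ls.query(service_type="MA", auth="none")
print([rec.name for rec in open_archives])
```

The point of the sketch is that a client never needs to know a service's name in advance; it describes what it wants and the LS returns whatever matches.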
Measurement Archive Service
• Measurement Archives store data in databases and publish data produced by MPS (or TSs).
• They also provide a historical record of analysis.
• An archive reduces queries to the MPS by publishing to multiple clients.
• As a server, it accepts and stores setup and publication requests.
• As a client, it registers with an LS, subscribes to an MPS or to other MAs, and publishes data to subscribers.
Topology Service
• The ToS is a specific example of a TS used to make topological information available to the framework.
• Understanding topology is necessary for the measurement system to optimize its operations (closest nodes).
• ToS may also be used by overview/map clients to present measurement data.
Measurement Point Service
• MPS creates and publishes data by initiating active measurements or querying passive devices.
• A setup protocol allows users to request measurements and publish the results.
• As a server, the MPS accepts requests and publishes the data (client subscriber handle must be known in advance).
• As a client, the MPS registers with the LS and publishes to subscribers.
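The publish/subscribe relationship between a measurement point and an archive can be illustrated with a minimal sketch. It assumes a subscriber handle is simply a Python callable, and every class and method name here is hypothetical; the real services exchange messages over the network rather than calling each other in-process.

```python
class MeasurementPoint:
    """Toy MP: runs a probe and pushes each result to known subscribers."""

    def __init__(self, name, probe):
        self.name = name
        self._probe = probe          # callable returning one measurement value
        self._subscribers = []

    def subscribe(self, handle):
        """Register a subscriber handle before any measurement runs."""
        self._subscribers.append(handle)

    def run_once(self):
        """Take one measurement and publish it to every subscriber."""
        result = {"source": self.name, "value": self._probe()}
        for handle in self._subscribers:
            handle(result)
        return result


class MeasurementArchive:
    """Toy MA: stores what it receives and serves it to many clients."""

    def __init__(self):
        self.history = []

    def store(self, result):
        self.history.append(result)

    def latest(self):
        """Answer a client from the archive instead of re-running the probe."""
        return self.history[-1] if self.history else None


archive = MeasurementArchive()
mp = MeasurementPoint("mp-1", probe=lambda: 9.4)   # fixed value for the demo
mp.subscribe(archive.store)
mp.run_once()
print(archive.latest())
```

Because clients read from the archive, the measurement point is probed once per scheduled test no matter how many clients ask for the result.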
perfSONAR-PS Services
• Focus on development of major perfSONAR components
– SNMP Based MP/MA
– Lookup Service
– Topology
– Link Status
• New additions
– OWAMP/BWCTL
– Traceroute
– Pinger (SLAC+Fermilab)
– Visualization (perfSONAR UI plugins + meter)
SNMP Based MP/MA
• Deployed
– Internet2 Network
– ESnet
– Georgia Tech/SLAC/University of Delaware
– All over
• Compatible with perfSONAR-UI
• CPAN package in development
Pinger Based MP/MA
• Joint effort between Fermilab and SLAC
• Present views of historic Pinger data
• Expose interface to schedule live tests
• Development and integration into perfSONAR-PS based on LHC-OPN requirements
Visualization
• Utilizing the plugin architecture of perfSONAR-UI
• Data visualization beyond network utilization
• Google Maps
• Utilization by physical location
• 'Weather Map' of Internet2 Network
• Web based speedometer to interact directly with MA code
• MaDDash
Other services in development
• Topology/LS service
– UNIS development (Indiana University)
• MaDDash/mesh
– Ease full mesh deployment
• OWAMP MA
– Coordinate regular scheduled tests with BWCTL
• BWCTL MA
– Coordinate regular scheduled tests with OWAMP
Agenda • Motivation • What is perfSONAR? • Suggested Deployment for Campus/Regional
perfSONAR Toolkit
• The “perfSONAR Toolkit” is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols – everything you (or your scientists) need to get a baseline and start addressing true problems
• http://psps.perfsonar.net/toolkit
• All components are available as RPMs, and bundled into a CentOS 6-based “netinstall” and a “Live CD”
• perfSONAR tools are much more accurate if run on a dedicated perfSONAR host, not on the DTN.
• Very easy to install and configure
• Usually takes less than 30 minutes
Importance of Regular Testing
• We can’t wait for users to report problems and then fix them (soft failures can go unreported for years!)
• Things just break sometimes
– Failing optics
– Somebody messed around in a patch panel and kinked a fiber
– Hardware goes bad
• Problems that get fixed have a way of coming back
– System defaults come back after hardware/software upgrades
– New employees may not know why the previous employee set things up a certain way and back out fixes
• Important to continually collect, archive, and alert on active throughput test results
Regular perfSONAR Tests
• We run regular tests to check for two things
– TCP throughput
– One way delay and packet loss
• perfSONAR has mechanisms for managing regular testing between perfSONAR hosts
– Statistics collection and archiving
– Graphs
– Dashboard display
– Integrate with Nagios
• This infrastructure is deployed now – perfSONAR hosts at facilities can take advantage of it
• At-a-glance health check for data infrastructure
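As a rough illustration of alerting on regular throughput results, the sketch below maps recent samples to a status using the conventional Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name, the use of the median, and the threshold values are assumptions chosen for the example; this is not the check shipped with perfSONAR.

```python
from statistics import median

# Conventional Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_throughput(samples_gbps, warn_below, crit_below):
    """Classify recent throughput samples against two thresholds.

    samples_gbps: recent test results in Gbit/s (hypothetical input format).
    warn_below / crit_below: thresholds in Gbit/s, with crit_below <= warn_below.
    Returns a (status_code, message) pair.
    """
    if not samples_gbps:
        return UNKNOWN, "UNKNOWN - no recent throughput results"

    # The median resists a single unusually good or bad run.
    typical = median(samples_gbps)

    if typical < crit_below:
        return CRITICAL, f"CRITICAL - median {typical:.2f} Gbps"
    if typical < warn_below:
        return WARNING, f"WARNING - median {typical:.2f} Gbps"
    return OK, f"OK - median {typical:.2f} Gbps"


status, message = check_throughput([9.1, 8.7, 9.3], warn_below=5.0, crit_below=1.0)
print(status, message)
```

An empty sample list is reported as UNKNOWN rather than OK, since a test that has silently stopped running is itself a problem worth surfacing.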
• perfSONAR Dashboard: http://ps-dashboard.es.net
Develop a Test Plan
• What are you going to measure?
– Achievable bandwidth
o 2-3 regional destinations
o 4-8 important collaborators
o 4-8 (more if you are willing, especially to start) times per day to each destination
o 20-30 second tests within a region, longer across oceans and continents
– Loss/Availability/Latency
o OWAMP: ~10-20 collaborators over diverse paths
– Interface Utilization & Errors (via SNMP)
• Guidance on servers to buy:
– http://psps.perfsonar.net/toolkit/hardware.html
– Virtualization is tricky; dedicated hardware is recommended.
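To get a feel for how much test traffic such a plan implies, the arithmetic can be worked through directly. The sketch below uses only the ranges given in the plan (destinations, runs per day, seconds per test); the function name is hypothetical, and it ignores the longer intercontinental tests and any scheduling overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_test_seconds(destinations, runs_per_day, seconds_per_test):
    """Total seconds per day one host spends running throughput tests."""
    return destinations * runs_per_day * seconds_per_test


# Low end: 2 regional + 4 collaborators, 4 runs per day, 20 s tests.
low = daily_test_seconds(2 + 4, 4, 20)
# High end: 3 regional + 8 collaborators, 8 runs per day, 30 s tests.
high = daily_test_seconds(3 + 8, 8, 30)

for label, seconds in (("low", low), ("high", high)):
    share = seconds / SECONDS_PER_DAY
    print(f"{label}: {seconds} s/day ({seconds / 60:.0f} min, {share:.1%} of the day)")
```

Under these assumptions the plan costs between 8 and 44 minutes of active throughput testing per day, which is a small share of link time for a continuous health signal.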
perfSONAR Deployment Locations
• Critical to deploy such that you can test with useful semantics
• perfSONAR hosts allow parts of the path to be tested separately
– Reduced visibility for devices between perfSONAR hosts
– Must rely on counters or other means where perfSONAR can’t go
• Effective test methodology derived from protocol behavior
– TCP suffers much more from packet loss as latency increases
– TCP is more likely to cause loss as latency increases
– Testing should leverage this in two ways
o Design tests so that they are likely to fail if there is a problem
o Mimic the behavior of production traffic as much as possible
– Note: don’t design your tests to succeed
o The point is not to “be green” even if there are problems
o The point is to find problems when they come up so that the problems are fixed quickly
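One common way to put numbers on the claim that TCP suffers more from loss as latency grows is the Mathis et al. approximation, which bounds steady-state throughput by roughly MSS / (RTT × √p) for loss rate p. That model is not part of these slides and is brought in here only as an illustration; the constant factor is taken as 1 and the MSS, RTT, and loss values below are assumed example inputs.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.0):
    """Approximate upper bound on TCP throughput in bits per second.

    Uses the Mathis et al. model: rate <= (MSS / RTT) * (C / sqrt(p)).
    mss_bytes: maximum segment size in bytes.
    rtt_s: round-trip time in seconds.
    loss_rate: packet loss probability, strictly between 0 and 1.
    c: model constant, taken as 1.0 in this sketch.
    """
    if not 0 < loss_rate < 1:
        raise ValueError("loss_rate must be strictly between 0 and 1")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Same loss rate (0.01%), increasing RTT: local, regional, long-haul.
for rtt_ms in (1, 10, 100):
    bps = mathis_throughput_bps(1460, rtt_ms / 1000, 1e-4)
    print(f"RTT {rtt_ms:>3} ms -> about {bps / 1e6:.1f} Mbit/s")
```

With the loss rate held fixed, the bound falls in direct proportion to RTT, so a loss level that goes unnoticed in a local test can cap a long-haul transfer at a small fraction of line rate. This is why tests across longer paths are more likely to expose a problem.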
Sample Site Deployment