next-gen network telemetry is within your packets: in-band oam

66
Next-gen Network Telemetry is Within Your Packets: In-band OAM Frank Brockners, Shwetha Bhandari

Upload: frank-brockners

Post on 13-Apr-2017

285 views

Category:

Internet


0 download

TRANSCRIPT

PowerPoint Presentation

Next-gen Network Telemetry is Within Your Packets: In-band OAMFrank Brockners, Shwetha Bhandari

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

19/11/2016Cisco Live 2014

Continous & Always-OnOn DemandChecking Health and Compliance2

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

2Cisco Live 20149/11/2016

Continous & Always-OnOn DemandChecking Health and Compliance

3

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

3Cisco Live 20149/11/2016

In-Band OAM Solution EcosystemA Peek at the ImplementationIn-Band OAM The TechnologyWhy In-Band OAM? Use-Case ExamplesPrologue: Does your traffic comply?

4

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

Cisco Live 20169/11/20164

Prolog

Consider TE, Service Chaining, Policy Based Routing, etc...:How do you prove that traffic follows the suggested path?5

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

5Cisco Live 20149/11/2016

Willis says the first issue is connecting VNFs to the infrastructure. OpenStack does this in a sequential manner, with the sequence serially numbered in the VNF, but the difficulty comes when trying to verify that the LAN has been connected to the correct LAN port, the WAN has been connected to the correct WAN port and so on. "If we get this wrong for a firewall function it could be the end of a CIO's career," says Willis.

http://www.lightreading.com/nfv/nfv-specs-open-source/bt-threatens-to-ditch-openstack/d/d-id/718735October 14, 2015Light reading citing Peter Willis, Chief researcher for data networks, British Telecom

6

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicEnsuring Path and/or Service Chain Integrity Approach Meta-data added to all user trafficBased on Share of a secretProvisioned by controller over secure channelUpdated at every service hopVerifier checks whether collected meta-data allows retrieval of secretPath verified

ControllerSecret

X

BCA

Verifier7

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicInitial Idea: Combine multiple secretsCompose the OnionApproachA service is described by a set of secrets, where each secret is associated with a service function. Service functions encrypt portions of the meta-data as part of their packet processing.Only the verifying node has access to all secrets. The verifying nodes re-encrypts the meta-data to validate whether the packet correctly traversed the service chain. Notes To be used only when hardware assisted encryption is available. i.e. AES-NI instructions or equivalent. Otherwise this could be very costly operation to verify at line speed.

S1S2S3Service-Secrets are nested like layers of an onion8

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicAdvanced Encryption Standard(AES) NI (New Instructions)8

Solution Approach: Leveraging Shamirs Secret Sharing Polynomials 101- Line: Min 2 points- Parabola: Min 3 pointsGeneral: It takes k+1points to defines apolynomialofdegree k.

9- Cubic function: Min 4 points

Adi Shamir

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicSolution Approach: Leverage Shamirs Secret SharingA polynomial as secretEach service is given a point on the curve When the packet travels through each service it collects these pointsA verifier can reconstruct the curve using the collected pointsOperations done over a finite field (mod prime) to protect against differential analysis

(3,46)(2,28)(1,16)

X

BCASecret:

Verifier10

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicOperationalizing the SolutionLeverage two polynomials: POLY-1 secret, constant: Each hop gets a point on POLY-1Only the verifier knows POLY-1POLY-2 public, random and per packet.Each hop generates a point on POLY-2 each time a packet crosses it.Each service function calculates (Point on POLY-1 + Point on POLY-2) to get (Point on POLY-3) and passes it to verifier by adding it to each packet. The verifier constructs POLY-3 from the points given by all the services and cross checks whether POLY-3 = POLY-1 + POLY-2Computationally efficient: 2 additons, 1 multiplication, mod prime per hopPOLY-1Secret ConstantPOLY-2Public Per Packet+=POLY-3Secret Per Packet11

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicMeta Data for Service/Path VerificationVerification secret is the independent coefficient of POLY-1Computation/retrieval through a cumulative computation at every hop (cumulative)For POLY-2 the independent coefficient is carried within the packet (typically a combination of timestamp and random number)n bits can service a maximum of 2n packetsVerification secret and POLY-2 coefficient (random) are of the same sizeSecret size is bound by prime numberTransferRateRND/SecretSizeMax # of packets(assuming 64 byte packets)Time that random lasts at maximum1 Gbps6410 Gbps64100 Gbps6410 Gbps5610 Gbps4810 Gbps401 Gbps322200 seconds, 36 minutes10 Gbps32220 seconds, 3.5 minutes100 GBps3222 seconds

12

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicMeta-Data ProvisioningMeta-Data for Service Chain Verification (POT) provisioned through a controller (OpenDaylight App)Netconf/YANG based protocolProvisioned information from Controller to Service Function / VerifierService-Chain-Identifier (to be mapped to service chaining technology specific identifier by network element)Service count (number of services in the chain)2 x POT-key-set (even and odd set)Secret (in case of communication to the verifier)Share of a secret, service index2nd polynomial coefficientsPrime number

Service Chain Verification App

S3Verifier

S2

S1POT-key-sets:Primesecret sharepoly-2 POT-key-sets:Primesecret sharepoly-2 POT-key-sets:Primesecret sharepoly-2 POT-key-sets:SecretPrimesecret sharepoly-2 Verification request for a particular service chain13

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

16* Bytes of Meta-Data for POTRandom Unique random number (e.g. Timestamp or combination of Timestamp and Sequence number)Cumulative (algorithm dependent)Transport options for different protocolsSegment Routing: New TLV in SRH headerNetwork Service Header: Type-2 Meta-DataIn-band OAM for IPv6: Proof-of-transit extension headerVXLAN-GPEProof-of-transit embedded-telemetry header... more to be added (incl. IPv4)

Proof of Transit: Meta-Data Transport Options

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Random | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Random (contd) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Cumulative | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Cumulative (contd) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

*Note: Smaller numbers are feasible, but require a more frequent renewal of the polynomials/secrets. 2nd PolynomialInterative computation of secret

14

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

15enter: In-Band OAM

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

15Cisco Live 20149/11/2016

In-band OAMOAM traffic embedded in the data traffic but not part of the payload of the packetOAM effected by data trafficExample: IPv4 route recordingOut-of-band OAMOAM traffic is sent as dedicated traffic, independent from the data traffic (probe traffic)OAM not effected by data trafficExamples: Ethernet CFM (802.1ag), Ping, TracerouteHow to send OAM information in packet networks?On-Board Unit

Speed control by police car

16

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicNote the difference to transport OAM (SDH/SONET): In SONET/SDH there is a constant flow of framesOAM is a bit-stream/data between the data encoding sublayer and the physical media and present in every frame, but not part of the data payloadThis mode of OAM transport is often referred to as out of band, because OAM has its own channel but still resides side by side with the payload data (if present)

16

Remember RFC 791?

17

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicMore recently: In-band Network Telemetry for P4

http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf 18

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

In-Band/Passive OAM - MotivationMultipath Forwarding debug ECMP networksService/Path Verification prove that traffic follows a pre-defined pathService/Quality Assurance Prove traffic SLAs, as opposed to probe-traffic SLAs; Overlay/UnderlayDerive Traffic MatrixCustom/Service Level TelemetryMost large ISP's prioritize Speedtest traffic and I would even go as far to say they probably route it faster as well to keep ping times low.Source: https://www.reddit.com/r/AskTechnology/comments/2i1nxc/can_i_trust_my_speedtestnet_results_when_my_isp/

19

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

Example use-cases...Path Tracing for ECMP networksService/Path VerificationDerive Traffic MatrixSLA proof: Delay, Jitter, LossCustom data: Geo-Location,..

Meta-data required...Node-ID, ingress i/f, egress i/fProof of Transit (random, cumulative)Node-IDSequence numbers, TimestampsCustom meta-data

What if you could collect operational meta-data within your traffic?

20

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicExample: Flow Tracing in EMCP NetworksProbe packet (ping) tests the wrong pathTrace all pathsand detect the one with issues

21

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicExample: Derive the Network Traffic Matrix

22

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public23Further Examples

Loss/Re-ordering Detection

Add sequence number Check sequence number Multicast

Delay measurements andtrend analysis

Link 1Link 2Link 4Delay Link1Delay Link2Delay Link3Delay Link4t = 11.2ms14.8ms3.8ms24.8mst = 21.2ms14.8ms3.8ms24.8mst = 31.2ms14.7ms3.8ms24.7mst = 41.2ms14.8ms3.8ms24.8mst = 51.2ms14.8ms3.8ms24.8mst = 61.2ms17.8ms3.8ms24.7mst = 71.2ms17.9ms3.8ms24.7mst = 81.2ms18.1ms3.8ms24.8ms

Link 3

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicIOT examples Industrial automation where networks are delay sensitive and can employ bicasting of traffic when delay / errors are detected in a give pathTimestamp 64 bit NTP timestamp seconds and pico seconds since epoch 1st jan 1970 0:0 UTC23

Example Generic customer meta-data: Geo-Location within your packets

24

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicAmending the OAM capabilities of IP

CapabilityExisting ToolsAdditionsContinuity CheckBFDLight-weight continuity check (check without sending extra traffic)Connectivity VerificationPing / ICMP echoEMCP supportAcknowledge different packet forwarding paths in routersPath Discovery andVerificationTracerouteEMCP supportAcknowledge different packet forwarding paths in routersProve correctness/integrity of forwarding pathDefect IndicationsIndicate if a forwarding policy (service chain) has not been metPerformance MonitoringIPPM (delay/packet loss)Metrics for live data traffic (delay, packet loss)

25

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicIn-Band OAM (iOAM)26

Gather telemetry and OAM information along the path within the data packet, as part of an existing/additional headerNo extra probe-traffic (as with ping, trace, ipsla)Transport optionsIPv6: Native v6 HbyH extension header or double-encapVXLAN-GPE: Embedded telemetry protocol headerSRv6: Policy-Element (proof-of-transit only)NSH: Type-2 Meta-Data (proof-of-transit only)... additional encapsulations being considered (incl. IPv4, MPLS)DeploymentDomain-ingress, domain-egress, and select devices within a domaininsert/remove/update the extension headerInformation export via IPFIX/Flexible-Netflow/publish into KafkaFast-path implementationHdriOAMPayload

iOAM domainNode-ID, Node-ID,..Ingress/Egress I/F, ..Timestamp, Timestamp,..Proof of TransitSequence numberGeneric customer data

2016 Cisco and/or its affiliates. All rights reserved. Cisco Public

IPFIX13456

2ACBDPayloadHdrPayloadHdrPayloadHdrr=45/c=17

A

1C4PayloadHdrr=45/c=39BA61C4

Insert POTmeta-dataPayloadHdrr=45/c=0

A

1

POT meta-dataPath-tracing data

Update POTmeta-data

Update POT meta-data

Update POT meta-dataPOT Verifier27

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicPer node scopeHop-by-Hop information processingDevice_Hop_LNode_IDIngress Interface IDEgress Interface IDTime-StampApplication Meta DataSet of nodes scopeHop-by-Hop information processingService Chain Validation (Random, Cumulative)Edge to Edge scopeEdge-to-Edge information processingSequence Number

iOAM: Information carried28

2016 Cisco and/or its affiliates. All rights reserved. Cisco PublicTracing Option 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Option Type | Opt Data Len |IOAM-trace-type| Elements-left | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+