Introduction to Content Switch C. Edward Chow Department of Computer Science University of Colorado at Colorado Springs

Post on 19-Dec-2015


Introduction to Content Switch

C. Edward Chow

Department of Computer Science
University of Colorado at Colorado Springs

12/22/2000 Edward Chow ACSD Project Status 2

Outline of the Talk

• What is a Content Switch?
• What Services it Can Provide
• Content Switch Example
• Related Technologies
• Content Switch Architecture and Basic Operations
• TCP Delay Binding and Related Improvement
• Content Switch Rule and Conflict Detection
• Related Load Balancing Research Results


Content Switch (CS)

• Routes packets based on higher-layer (Layer 5/7) headers and content.
• Examples:
– Direct Web traffic based on patterns in URLs, host tags, cookies.
– Route incoming email based on email address; connect POP/IMAP based on login.
• Web switches and the Intel XML Director/Accelerator are special cases of a content switch.


What Services It Can Provide

• Enabling premium services for e-commerce, ISP, and Web hosting providers

• Load balancing and highly available server clusters: Web, e-commerce, email, computing, file, SAN
• Policy-based networking, differentiated/QoS services
• Firewall, strengthening DoS protection, cache/firewall load balancing
• ‘Flash-crowd' management
• Email spam protection, virus detection/removal
• Applet authentication/filtering


F5 VRM Solution

[Figure: F5 VRM solution. Labels: Site I newyork.domain.com, Site II losangeles.domain.com, Site III tokyo.domain.com, User london.domain.com, BIG-IP, Server Array, Webmaster, Local DNS, 3-DNS, GLOBAL-SITE, Router, Internet.]


Intel Netstructure XML Director 7280

• Example of a rule:
Server1: create */order.asp & //Amount[Value >= 10000]


Phobos In-Switch

• Only load balancing switch in a PCI card form factor

• Plugs directly into any server PCI slot

• Supports up to 8,192 servers, ensuring availability and maximum performance

• Six different algorithms are available for optimum performance: Round Robin, Weighted Percentage, Least Connections, Fastest Response Time, Adaptive and Fixed.

• Provides failover to other servers for high-availability of the web site

• U.S. Retail $1995.00


E-Commerce Example: 1. Client

Client submits via HTTP POST (or SOAP) the following purchase in XML:

<purchase>
  <customerName>CCL</customerName>
  <customerID>111222333</customerID>
  <item>
    <productID>309121544</productID>
    <productName>IBM Thinkpad T21</productName>
    <unitPrice>5000</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>50000</subTotal>
  </item>
  <item>
    <productID>309121538</productID>
    <productName>Intel wireless LAN PC Card</productName>
    <unitPrice>200</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>2000</subTotal>
  </item>
  <totalAmount>52000</totalAmount>
</purchase>


E-Commerce Example: 2. Content Switch

• Content switch receives the packet.
• Recognizes it is an HTTP POST request from the HTTP request line:
POST /purchase.cgi HTTP/1.1
• Recognizes it is an XML document from the meta header:
content-type: TEXT/XML
• Parses the XML content.
• Extracts values of tag sequences:
purchase/totalAmount: 52000
purchase/customerName: CCL
• Rule 1 is matched and the packet is routed to one of the highSpeedServers.
Rule 1: if (xml.purchase/totalAmount > 5000) routeTo(highSpeedServers);
Rule 2: if (xml.purchase/customerName == CCL) routeTo(specialCustomerServers);
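
The routing decision above can be sketched in Python. This is a minimal illustration, not the prototype's code: `extract`, `route`, and the rule table are hypothetical names, the document is abbreviated to the two tags the rules use, and `xml.etree.ElementTree` stands in for the switch's own XML extraction. The thresholds and server group names come from the slide.

```python
import xml.etree.ElementTree as ET

# Abbreviated purchase document (only the tags the two rules look at).
PURCHASE = """<purchase>
  <customerName>CCL</customerName>
  <totalAmount>52000</totalAmount>
</purchase>"""

# Rules are tried in order; the first match wins (strict priority).
# Assumption: both tags are present in the document.
RULES = [
    (lambda v: float(v["purchase/totalAmount"]) > 5000, "highSpeedServers"),
    (lambda v: v["purchase/customerName"] == "CCL", "specialCustomerServers"),
]

def extract(xml_text, tag_sequences):
    """Return {tag sequence: text} for each 'root/child' path requested."""
    root = ET.fromstring(xml_text)
    values = {}
    for seq in tag_sequences:
        root_tag, _, rest = seq.partition("/")
        node = root.find(rest) if root.tag == root_tag else None
        values[seq] = node.text if node is not None else None
    return values

def route(xml_text):
    """Return the server group named by the first matching rule, or None."""
    values = extract(xml_text, ["purchase/totalAmount", "purchase/customerName"])
    for condition, server_group in RULES:
        if condition(values):
            return server_group
    return None
```

With the slide's values, Rule 1 fires first; a purchase under 5000 from customer CCL would fall through to Rule 2.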


No Free Lunch: Penalty of Having Content Switch

Increased packet processing time.
• For XML Director/Accelerator, it needs to parse the XML document and match tag sequences. 1-3? order of processing time

                           Layer 4 Switching     Layer 7 Switching
packet header extraction   fixed short fields    varying-length long fields
switch rule matching       hash table lookup     pattern matching

Size of XML Document (Bytes)   XML Content Extract Time (ms)
600                            14
7000                           21
67104                          53


Related Technologies

• Application-level solutions: proxy server; Apache/Tomcat/Servlet; Microsoft NLB
• Kernel-level Layer 4 load balancing solution: http://www.linuxvirtualserver.org/
– Joseph Mark’s presentation
– LVS-NAT (Network Address Translation) web page
– LVS-IP Tunnel web page
– LVS-DR (Direct Routing) web page
• Hardware solutions: Cisco 11000, F5 (BIG-IP), Alteon Web Systems, Foundry Networks (ServerIron). Good information: Foundry ServerIron Installation and Configuration Guide, May 2000.


Basic Operations of Content Switching

[Figure: basic operations of content switching (CS). Blocks: Incoming Packets; Header/Content Extraction; Packet Classification (CS Rule Matching Algorithm); CS Rules; CS Rule Editor; Packet Routing (Load Balancing); Network Path Info; Server Load Status; Forward Packet To Servers.]


Content Switch Architecture (Apostolopoulos 2000)


Efficient Software Architecture

• Tasks: millions of packets, with thousands of rules to match and load balancing algorithms to run.
• How to assign tasks to the processors and threads?
– Packet extraction (understand header formats, XML parsing)
– Content switching rule matching
– Packet routing (load balancing, bandwidth control)
• How much packet processing should controllers do?
• What can a controller do?
• A typical parallel processing problem?


TCP Delay Binding

[Figure: message sequence among client, content switch, and server, steps 1 to 11.
Between client and content switch: SYN(CSEQ); SYN(DSEQ) ACK(CSEQ+1); ACK(DSEQ+1); DATA(CSEQ+1) ACK(DSEQ+1); DATA(DSEQ+1) ACK(CSEQ+lenR+1); ACK(DSEQ+lenD+1); DATA(?) 2nd request ACK(?).
Between content switch and server: SYN(CSEQ); SYN(SSEQ) ACK(CSEQ+1); ACK(SSEQ+1); DATA(CSEQ+1) ACK(SSEQ+1); DATA(SSEQ+1) ACK(CSEQ+lenR+1); ACK(SSEQ+lenD+1).]

lenR: size of HTTP request. lenD: size of returned document.


Lesson Learned in Implementing TCP Delay Binding

• In our Linux 2.2 kernel-based content switch prototype, we found that the client sends duplicate requests after step 3.
• This overloads the content switch and the real server.
• Reasons:
– Client TCP time-out and retransmission
– Content switch printk() overhead: too many debug messages
– It could also happen when there are many content rules or slow server response.
• Solution: the content switch sends ACK(CSEQ+lenR+1) to stop the retransmission.
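
The sequence number bookkeeping behind delay binding can be sketched as follows. This is a minimal Python illustration, not the kernel code: the class and method names are hypothetical, and it assumes the switch rewrites numbers by the fixed offset between DSEQ (the number it gave the client) and SSEQ (the number the server chose), modulo 2^32.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

class DelayBinding:
    """Per-connection state kept by the content switch (hypothetical sketch)."""

    def __init__(self, cseq, dseq):
        self.cseq = cseq   # client's initial sequence number
        self.dseq = dseq   # number the switch chose when answering the client
        self.sseq = None   # server's number, known once the server answers

    def early_ack(self, len_r):
        """ACK number that acknowledges the whole lenR-byte request,
        so the client stops retransmitting while the switch is busy."""
        return (self.cseq + len_r + 1) % MOD

    def bind(self, sseq):
        """Record the server's initial sequence number."""
        self.sseq = sseq

    def server_to_client_seq(self, seq):
        """Rewrite a server sequence number into the client's DSEQ space."""
        return (seq - self.sseq + self.dseq) % MOD

    def client_to_server_ack(self, ack):
        """Rewrite a client ACK number into the server's SSEQ space."""
        return (ack - self.dseq + self.sseq) % MOD
```

For example, with CSEQ=1000 and a 200-byte request, `early_ack(200)` gives 1201, which matches ACK(CSEQ+lenR+1) on the slide.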


Pre-Allocate Server Scheme

[Figure: message sequence among client, content switch, and pre-allocated server, steps 1 to 5. Messages shown on both sides of the content switch: SYN(CSEQ); SYN(SSEQ) ACK(CSEQ+1); DATA(CSEQ+1) ACK(SSEQ+1); DATA(SSEQ+1) ACK(CSEQ+lenR+1); ACK(SSEQ+lenD+1).]

• Guess the routing decision based on IP/port #/history.
• Advantages:
– Faster than TCP delay binding.
– Possible direct route between client and server.
– Reduces session processing overhead: no need to convert the server sequence number.
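
One way to picture the "guess from IP/port/history" step is a small lookup table. The Python sketch below is an assumption about how such a guess could be kept, not the scheme's actual design: `PreAllocator`, its methods, and the choice of (client IP, destination port) as the key are all hypothetical.

```python
class PreAllocator:
    """Guess a server before the request content is seen (hypothetical sketch)."""

    def __init__(self, default_server):
        self.default_server = default_server
        self.history = {}  # (client_ip, dst_port) -> server chosen last time

    def guess(self, client_ip, dst_port):
        """Return the server used last time for this client and port,
        or the default server if there is no history."""
        return self.history.get((client_ip, dst_port), self.default_server)

    def confirm(self, client_ip, dst_port, guessed, actual):
        """Record the content-based decision; return True if the guess held.
        A False result is the case where the scheme falls back to
        TCP delay binding."""
        self.history[(client_ip, dst_port)] = actual
        return guessed == actual
```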


Degenerated to TCP Delay Binding If Guess Wrong

[Figure: message sequence among client, content switch, pre-allocated server, and right server. Messages shown: SYN(CSEQ); SYN(SSEQ) ACK(CSEQ+1); DATA(CSEQ+1) ACK(SSEQ+1); FIN(CSEQ+1); SYN(CSEQ); SYN(RSEQ) ACK(CSEQ+1); DATA(CSEQ+1) ACK(SSEQ+1); DATA(RSEQ+1) ACK(CSEQ+lenR+1); DATA(SSEQ+1) ACK(CSEQ+lenR+1); ACK(DSEQ+lenD+1) ACK(SSEQ+lenD+1). Annotation: sequence # conversion needed.]


Filter Process Scheme

[Figure: message sequence among client, content switch, and server, with a filter process run on the server; steps 1 to 8. Messages shown: SYN(CSEQ); SYN(DSEQ) ACK(CSEQ+1); DATA(CSEQ+1) ACK(DSEQ+1); step 4b Migrate(Data, CSEQ, DSEQ); SYN(CSEQ); SYN(SSEQ) ACK(CSEQ+1); DATA(CSEQ+1) ACK(SSEQ+1); DATA(SSEQ+1) ACK(CSEQ+lenR+1); DATA(DSEQ+1) ACK(CSEQ+lenR+1); ACK(DSEQ+lenD+1) ACK(SSEQ+lenD+1).]


Multiple HTTP Requests from One TCP Connection

• A keep-alive TCP connection may include multiple HTTP “GET” requests.
• Content switch examines each “GET” request and makes a new routing decision.
• Content switch establishes another connection with a different server based on the routing decision.
• The HTTP responses from different servers need to be interleaved and seen by the user as if they came from the same server.
• Solutions: in-order delivery (buffer requirement); out-of-order delivery (seq# tracking)?
• Problems: should we throw away earlier HTML requests if later requests are received?

[Figure: NAT approach. A client requests home.htm, Index.htm, uccs.jpg, and rocky.mid through the content switch, which connects to server1, server2, ..., server9.]
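
The in-order delivery option, and its buffer requirement, can be sketched as a reorder buffer. This Python fragment is illustrative only: `InOrderDelivery` is a hypothetical name, and it assumes each response is tagged with the index of its request within the keep-alive connection.

```python
class InOrderDelivery:
    """Release responses to the client in request order (hypothetical sketch)."""

    def __init__(self):
        self.next_to_send = 0  # index of the request whose response is due next
        self.pending = {}      # request index -> response buffered out of order

    def on_response(self, request_index, body):
        """Buffer one server response; return the responses that can now be
        forwarded to the client, in request order."""
        self.pending[request_index] = body
        ready = []
        while self.next_to_send in self.pending:
            ready.append(self.pending.pop(self.next_to_send))
            self.next_to_send += 1
        return ready
```

If server2 finishes uccs.jpg before server1 finishes home.htm, the image stays in `pending` until the page has been sent, which is the buffering cost the slide mentions.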


Multiple HTTP Requests from One TCP Connection

• Can servers return documents directly to client in keep-alive session case?

• Can equivalent VS-Tunnel or VS-DR be implemented using Content Switch?

[Figure: client, content switch, and server1, server2, ..., server9, with documents home.htm, uccs.jpg, and rocky.mid.]


Content Switch Rule Survey

Survey shows that existing switches support:
• rules in basic (condition action) or (action condition) form
• some define the condition as a class, then specify the action in a separate statement or command
• simple single conditional term
• command line interface (to facilitate incremental update?)
• actions can include reject, forward, put in queue (for bandwidth control, scheduling)


Content Switch Rule Design

• Rule syntax generic enough to support all intended features.
• Use simple C if-statement syntax. Rule: if (condition) { action }
– Easy to read
– Allows optimization using a C compiler
• Condition consists of multiple terms of the form
– variable relational_operator value, e.g.
xml.purchase/totalAmount > 50000
smtp.to == “[email protected]”
cookie.name == “servlet1”
bitmatch(64, 8, 0xff) == 64   # this means TTL=64; idea from the netfilter universal filter
– suffix(variable, string), e.g. suffix(url, “gif”)
– regex(variable, pattern), e.g. regex(url, “/purchase”)
• Action consists of reject, forward(server | queue), loadBalance(serverGroup, loadBalancingAlgorithm)
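
The three condition helpers can be sketched in Python. These are illustrative assumptions, not the prototype's C functions: the slide's bitmatch takes three arguments and implicitly reads the current packet, while the version here takes the packet bytes explicitly, and the reading "bit offset, number of bits, mask" is inferred from the TTL example (TTL is byte 8 of the IPv4 header, i.e. bit offset 64).

```python
import re

def suffix(value, ending):
    """True if the variable's value ends with the given string."""
    return value.endswith(ending)

def regex(value, pattern):
    """True if the pattern matches anywhere in the variable's value."""
    return re.search(pattern, value) is not None

def bitmatch(packet, bit_offset, nbits, mask):
    """Read nbits starting at bit_offset (counted from the first bit of the
    packet, most significant bit first) and apply the mask."""
    as_int = int.from_bytes(packet, "big")
    shift = len(packet) * 8 - bit_offset - nbits
    return (as_int >> shift) & ((1 << nbits) - 1) & mask

# A 20-byte IPv4 header with TTL (byte 8) set to 64; other fields are filler.
HEADER = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 6, 0, 0,
                10, 0, 0, 1, 10, 0, 0, 2])
```

With this header, `bitmatch(HEADER, 64, 8, 0xff) == 64` holds, mirroring the TTL term on the slide.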


Efficient CS Rule Matching

• Brute force, strict priority: rules are executed sequentially.
• Efficient rule matching method:
– Organize rules so that rules can be skipped based on existing content types.
– Utilize compiler optimization techniques.
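
One possible reading of "skip rules based on existing content types" is to group rules by the prefix of the variable they test (xml, smtp, cookie, url) and to evaluate only the groups whose content type is present in the request. The Python sketch below follows that assumption; `index_rules`, `match`, and the rule tuples are hypothetical. Note that grouping changes the evaluation order across groups compared with strict sequential priority.

```python
from collections import defaultdict

def index_rules(rules):
    """Group (variable, predicate, action) rules by the variable's prefix."""
    index = defaultdict(list)
    for variable, predicate, action in rules:
        index[variable.split(".", 1)[0]].append((variable, predicate, action))
    return index

def match(index, fields):
    """fields maps the variables found in the request to extracted values.
    Whole rule groups are skipped when their content type is absent."""
    present = {name.split(".", 1)[0] for name in fields}
    for content_type in index:          # groups, in first-seen order
        if content_type not in present:
            continue                    # skip every rule in this group
        for variable, predicate, action in index[content_type]:
            if variable in fields and predicate(fields[variable]):
                return action
    return None
```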


Simple CS Rule Editor GUI


Conflict Detection on Content Switching Rules

• Detect conflicts among rules or rule sets.
• Absolute conflict type:
r1: if (xml.purchase/customerName == “CCL”) {routeTo(r1)}
r2: if (xml.purchase/customerName == “CCL”) {routeTo(r2)}
• Potential conflict type:
r1: if (xml.purchase/totalAmount > 5000) {routeTo(quickServers)}
r2: if (xml.purchase/totalAmount > 20000) {routeTo(superServers)}
• Algorithm: build a tree for rules with the same variable; check operator and value to see whether they are the same or lead to a potential conflict; compare actions to decide the conflict type or duplication.
• The editor can build these trees while a user enters rules and warn about conflicts right away.
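
The classification step can be sketched in Python. This is a simplification under stated assumptions: it compares rules pairwise instead of building per-variable trees, handles only single-term conditions with the operators ==, > and <, and treats numeric values as real numbers. The function names and the tuple layout (variable, operator, value, action) are hypothetical.

```python
from itertools import combinations

def holds(op, bound, x):
    """True if x satisfies the single term 'variable op bound'."""
    return {"==": x == bound, ">": x > bound, "<": x < bound}[op]

def overlaps(op1, v1, op2, v2):
    """True if some value can satisfy both single-term conditions."""
    if op1 == "==":
        return holds(op2, v2, v1)
    if op2 == "==":
        return holds(op1, v1, v2)
    if op1 == op2:                      # both ">" or both "<"
        return True
    low, high = (v1, v2) if op1 == ">" else (v2, v1)
    return low < high                   # x > low and x < high

def classify(r1, r2):
    """Return 'duplicate', 'absolute conflict', 'potential conflict' or None."""
    (var1, op1, val1, act1), (var2, op2, val2, act2) = r1, r2
    if var1 != var2 or not overlaps(op1, val1, op2, val2):
        return None
    if (op1, val1) == (op2, val2):
        return "duplicate" if act1 == act2 else "absolute conflict"
    return None if act1 == act2 else "potential conflict"

def find_conflicts(rules):
    """Check every pair of rules and list the pairs that conflict."""
    found = []
    for a, b in combinations(rules, 2):
        kind = classify(a, b)
        if kind:
            found.append((a, b, kind))
    return found
```

Run on the slide's four example rules, this reports one absolute conflict (the two customerName rules) and one potential conflict (the two totalAmount rules).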


XML Tag Value Extraction

• An xmlContentExtract() function is built to extract the tag values of a list of unique tag sequences.
• It is based on Clark Cooper’s expat 1.0 XML parser.
• Its arguments include the pointer to an XML document, the pointer to the array of strings (unique XML tag sequences; we follow the XSL selector syntax), and the number of sequences.
• It returns a list of node structures, each with the tag sequence, its attribute, and its value.
• Currently, it supports one attribute, and each tag sequence needs to be unique.
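
The extraction idea can be sketched with Python's `xml.parsers.expat` module, which wraps the expat library. This is an illustration of the approach, not the C function described above: `xml_content_extract` is a hypothetical name, results are returned as tuples in document order, and the requested tag sequences are assumed not to be nested inside one another.

```python
import xml.parsers.expat

def xml_content_extract(document, tag_sequences):
    """Return [(tag_sequence, attributes, value)] for each requested sequence."""
    wanted = set(tag_sequences)
    path, results = [], []
    current = None  # [sequence, attributes, text parts] while inside a match

    def start(name, attrs):
        nonlocal current
        path.append(name)
        seq = "/".join(path)
        if seq in wanted:
            current = [seq, dict(attrs), []]

    def chars(data):
        # Collect text only while directly inside the matched element.
        if current is not None and "/".join(path) == current[0]:
            current[2].append(data)

    def end(name):
        nonlocal current
        if current is not None and "/".join(path) == current[0]:
            results.append((current[0], current[1], "".join(current[2]).strip()))
            current = None
        path.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(document, True)
    return results
```

Called with the sequences purchase/totalAmount and purchase/customerName, it yields the two values the rules on the e-commerce slide compare against.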


Status of UCCS ACSD Project

• A Linux-based content switch prototype is almost complete.
• It is based on Linux-2.2.16-3 and LVS.
• ip_forward.c, ip_masq.c, and ip_vs.c are modified to implement basic TCP delay binding.
• Preliminary tests had the real server return a web document and discovered the client retransmission problem.
• ip_cs.c is added for most of the content switching functions.
• HTTP header extraction and XML content extraction code are being integrated for testing.
• A simple Java-based ruleEdit program was created for rule editing.


Related Load Balancing Research Results

• Modified Apache status module to report
– Total bytes to be transferred by child processes
– Average document transfer speed

• Modified LB-DNS to receive server status and bandwidth probing results.

• LB-DNS returns the IP address of the best server based on a weight contributed by both server load and bandwidth.

• Modified WebStone benchmark to test the performance of load balancing web server clusters.
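
The slides do not give the weighting formula used by LB-DNS, so the Python sketch below is purely a hypothetical example of combining server load and probed bandwidth into one score. The function name, the linear formula, and the `load_weight` parameter are all assumptions; it also assumes at least one server reports nonzero bandwidth.

```python
def best_server(servers, load_weight=0.5):
    """servers maps IP -> (load in [0, 1], probed bandwidth in bits/s).
    Returns the IP with the highest combined score (hypothetical formula)."""
    max_bw = max(bw for _, bw in servers.values())

    def score(item):
        load, bw = item[1]
        # Lower load and higher bandwidth both raise the score.
        return load_weight * (1.0 - load) + (1.0 - load_weight) * (bw / max_bw)

    return max(servers.items(), key=score)[0]
```

Setting `load_weight` to 0 ranks by bandwidth alone, and setting it to 1 ranks by load alone.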


Load balancing Systems

[Figure: load balancing system. Components: Modified Web Server 1 ... Modified Web Server n; Statistics Gathering Daemon; LBA: Modified DNS. Labels: Server Delay; Request for Web pages; Server Ranking /tmp/StatFile; Bandwidth Probe Results.]


Connection Rate: LBA vs. Round-Robin

[Figure: server connection rate for 4 servers. X axis: update for LBA, per sec. Y axis: connections/sec (0 to 1000). Series: load balancing system, round-robin.]

Update for LBA (per sec)   Load balancing system   Round-robin
1                          418.2                   327.6
2                          656.6                   327.6
3                          907.9                   327.6
4                          420                     327.6
5                          636.7                   327.6
6                          322.6                   327.6
7                          711.6                   327.6
8                          420.5                   327.6
9                          638.3                   327.6
10                         670.6                   327.6
11                         683.4                   327.6
12                         899                     327.6

Round robin only run once.


Conclusion

• Content switch with generic rules can be easily configured for a wide variety of value-added services:
– Load balancing/highly available server farm
– Premium services
– Firewall
– Bandwidth control/traffic shaping
• Requires efficient SW/HW architecture and rule matching algorithms to reduce processing overhead.
• Content rule design and conflict detection are important and challenging.
• TCP delay binding can be improved.
• Servicing multiple requests in a keep-alive session introduces interesting problems.