Stuck in the middle: challenges and trends in optimizing middleware
Daniel M. Yellin
Director of Software Technology
IBM Research
June 2001
With thanks to a cast of players
• Paul Dantzig
• Joe Hellerstein
• Doug Kimelman
• Chet Murthy
• Darrell Reimer
• Gabi Zodik
• …
Outline
• Middleware introduction
• Paradigm shift from programs to compusystems
• Optimizing middleware
In the middle of what?
In the middle of distributed components
[Figure: horizontal composition. Middleware is the glue connecting App 1, App 2, App 3, a customer repository, a directory, and a database.]
In the middle of application and network services
[Figure: vertical composition. Middleware is a layer of abstraction between the application and Services 1 to 4.]
Middleware sales growing by leaps and bounds!
[Figure: bar chart, revenue for integration broker suites, 1998 to 2000, in millions of $ (axis from 0 to 2000); series: product license, and including services.]
Source: Application Integration Middleware Market, Gartner, 9/01
Why the demand?
• Business trends requiring integration
  – Enterprise Resource Planning (ERP), Supply Chain Management (SCM), Customer Relationship Management (CRM)
• Internet has accelerated the drive
  – New e-business processes that tie together multiple applications
• Business to business (B2B) will further accelerate this trend
Typical company web architecture
[Figure: a web browser talks to a web application server (servlets, JSPs, HTML, session pool, data beans, command beans, JNDI, JDBC, MQSeries). The server connects to an LDAP directory, a personalization database, and a remote database, and through business integration services (messaging, intelligent routing, message transformation, message-based and CICS Connect interfaces, agents) to back-end business applications. Callouts 1, 1a, 2, 3, 4, and 5 appear on the diagram.]
Outline
Middleware “introduction”
• Paradigm shift from programs to compusystems
• Three principles to optimization
What does it all mean?
• From micro to macro programming
  – Building the “application logic” itself is often a small part of the overall development process
  – More and more of the development process is about making the pieces work together (connecting the dots)!
void main(){int i = i+1;}
From programs to compusystems
• Ecosystem: a community of animals and plants and the environment with which it is interrelated (Webster’s)
• Compusystem: a community of software and hardware components and the middleware with which it is related (Yellin)
Characteristics of compusystems
• They are complex
  – Many parts
  – Component interactions are often hard to understand
• They are forever changing
  – Cyclic events throughout day, month, year, …
  – Permanent changes
    • New “species” introduced, compusystems may be combined, …
See Information Week, April 2, 2001, “Conquering Complexity”
Characteristics of compusystems
• They run forever
  – 24 x 7
  – No such thing as stopping to test, debug, fix
• They have a lot of history (calcification)
  – This constrains what you can do
    • E.g., too much data to reformat
    • E.g., policy system cannot be shut down until last policy holder dies
Outline
Middleware “introduction”
Paradigm shift from programs to compusystems
• Optimizing middleware
Typical performance issues
• “Why are we getting such erratic and poor performance?”
• Complex system
• Can’t separate app vs. platform issues
• Component expertise doesn’t extend to system
3 principles to optimizing middleware
• How can we better understand a compusystem to optimize it?
  – Understand
• How do you minimize the overhead of all the middleware (avoid being stuck in the middle)?
  – New programming models
• And how can we design for optimality when the system is forever changing?
  – Adapt
Understand
• Understanding systems is different from understanding individual organisms/platforms
• Understanding components or even pointwise interactions is not the same as understanding the compusystem
• Effects can
  – be non-local
  – interfere with one another
  – be visible only under heavy loads
Flows within a compusystem can be hard to follow, can spawn many threads
[Figure: the company web architecture diagram (web browser, web application server, LDAP, databases, business integration services, back-end applications) with a single flow traced through it.]
And you have a lot of concurrent flows
[Figure: the same architecture diagram with many flows traced through it at once.]
Example: WebSphere Performance Toolkit
[Figure: monitors attached to the browser, app server (servlets, JSPs, HTML, JVM), LDAP, databases, messaging, and back-end apps feed a single dashboard.]
WebSphere Performance Toolkit
• Key concepts
  – Monitor: real-time collection of “vital” signs from across the compusystem
  – Dashboard: information consolidated into a unified display
    • Shows what is happening in the compusystem, and can help make correlations between different parts of the compusystem apparent
WebSphere Performance Toolkit
• “My site is slow, now what?”
  – Dashboard gives immediate clues as to what to look for
  – What to display is critical
• Helps find correlations
  – Example: router was broken and was not “spraying” requests appropriately
  – Could not tell from looking at a single site, only from the burst pattern (a “wave” traveling from one machine to the next …)
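The router example suggests one kind of check a dashboard could automate. The following is a minimal sketch, assuming each monitor reports per-interval request counts for its server. The function name `spray_imbalance`, the data layout, and the sample numbers are hypothetical and are not part of the toolkit described here.

```python
from typing import Dict, List

def spray_imbalance(counts: Dict[str, List[int]]) -> List[float]:
    """For each interval, return the largest share of requests taken by one server.

    counts maps a server name to its request count per monitoring interval.
    With N servers and an even spray, every value should stay near 1/N.
    """
    servers = list(counts)
    intervals = len(counts[servers[0]])
    shares = []
    for i in range(intervals):
        total = sum(counts[s][i] for s in servers)
        top = max(counts[s][i] for s in servers)
        shares.append(top / total if total else 0.0)
    return shares

# Hypothetical samples: a "wave" of load moving from one machine to the next.
wave = {"m1": [90, 5, 5], "m2": [5, 90, 5], "m3": [5, 5, 90]}
even = {"m1": [33, 34, 33], "m2": [34, 33, 33], "m3": [33, 33, 34]}

print(spray_imbalance(wave))  # each interval is dominated by a single machine
print(spray_imbalance(even))  # each value is close to one third
```

In both samples every machine handles 100 requests over the whole period, so the per-machine totals look identical. Only the consolidated, per-interval view exposes the wave.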
Stress testing the system
[Figure: a stress rig drives the system (browser, app server with servlets, JSPs, HTML, and JVM, LDAP, databases, messaging, back-end apps) while monitors feed the dashboard.]
1. Spray the system with increasing loads
2. See what happens!
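The two stress-testing steps can be sketched as a simple load ramp. This is an illustration under stated assumptions, not the stress rig from the talk: `ramp_load`, `toy_system`, the queueing-style response curve, and all numbers are hypothetical stand-ins for a real load generator and a real system.

```python
from typing import Callable, List, Tuple

def ramp_load(measure: Callable[[int], float],
              start: int, step: int, max_load: int,
              limit_ms: float) -> List[Tuple[int, float]]:
    """Apply increasing loads and record response times.

    Stops early once the response time exceeds limit_ms, so the last
    recorded point marks where the system started to degrade.
    """
    results = []
    load = start
    while load <= max_load:
        response_ms = measure(load)
        results.append((load, response_ms))
        if response_ms > limit_ms:
            break
        load += step
    return results

def toy_system(load: int, capacity: int = 100, service_ms: float = 10.0) -> float:
    """Stand-in for the real system: a simple queueing-style curve (an assumption)."""
    if load >= capacity:
        return float("inf")
    return service_ms / (1.0 - load / capacity)

if __name__ == "__main__":
    for load, ms in ramp_load(toy_system, start=10, step=10, max_load=200, limit_ms=40.0):
        print(f"{load:4d} req/s -> {ms:.1f} ms")
```

In practice `measure` would drive the real system and read the response time from the monitors; the ramp only decides which load to try next and when to stop.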
Next steps
• Analysis
  – Can we automate the discovery of common performance errors?
    • Misconfiguration
    • Mismatch of parameters for particular load signatures
  – Interactive troubleshooter
  – Autonomic middleware
• Combine with predictive modeling techniques
  – Refine models as better data becomes available
  – Use models to suggest tests to be performed by stress rig
Understand (2)
• Understanding not only the dynamic behavior of compusystems but also their static structure (code base)
  – Trade off exactness for scale
  – Need to understand dependencies across heterogeneous code artifacts (HTML, servlets, Java, COBOL, …)
  – Example: a change in LDAP indexing had a ripple effect on the application, from 10 to 20 ms/request
Example: Asset Locator
Composed of two major modules
  – Information gathering engine (crawler)
  – Semantic search server
“eColabra: An Enterprise Collaboration & Reuse Environment”, Orit Edelstein, Avi Yaeli, Gabi Zodik, 4th International Workshop, NGITS'99, Zikhron-Yaakov, Israel, July 1999, http://iew3.technion.ac.il:8080/~ngits/
Information gathering architecture
[Figure: under a scheduler, crawlers gather assets from enterprise code repositories (ClearCase, WebDAV, TC, file system); analyzers for Java, C++, COBOL, HTML, XML, JSP, text, and Java class files process them; results go to a DB2 database and index server. Numbered callouts mark the steps.]
Repository analysis phase
[Figure: a categorizer (categorization engine with Java rules, HTML rules, and other rules) and a relationship analysis component (find relations, hierarchy, reference, ...) work against the DB2 database and index server.]
Run-time architecture
[Figure: three tiers. Client: Eclipse plugin or lightweight JavaScript-based client, exchanging XML/HTML queries and results. Search server: HTTP server / servlet engine with a query processor, query/result processors, a recursive navigator processor, and relationship analysis (package, category, hierarchy). Database server: DB2 database and index server, queried with SQL and IR.]
Asset Locator
• Impact analysis
  – “What resources are affected if I …?”
  – Resource statistics
• Reuse
  – Finding assets of a particular type, function, …
• Understanding
  – Relationships between many different assets of many different types
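The impact-analysis question can be answered by walking recorded relationships in reverse. The sketch below is one way to do that, not the Asset Locator implementation. It assumes relationships are available as (asset, dependency) pairs; the function `affected_by` and every asset name are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

def affected_by(change: str, depends_on: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return every asset that directly or transitively depends on `change`.

    depends_on holds (asset, dependency) pairs, as a relationship analysis
    might record them across HTML, JSP, Java, COBOL, and other artifacts.
    """
    dependents: Dict[str, Set[str]] = defaultdict(set)
    for asset, dependency in depends_on:
        dependents[dependency].add(asset)

    seen: Set[str] = set()
    queue = deque([change])
    while queue:
        current = queue.popleft()
        for asset in dependents[current]:
            if asset not in seen:       # also guards against cyclic relationships
                seen.add(asset)
                queue.append(asset)
    return seen

# Hypothetical assets and relationships, for illustration only.
relations = [
    ("order.jsp", "OrderServlet.java"),
    ("OrderServlet.java", "OrderBean.java"),
    ("OrderBean.java", "ORDERS.cbl"),
    ("report.html", "order.jsp"),
    ("help.html", "style.css"),
]
print(sorted(affected_by("ORDERS.cbl", relations)))
```

A breadth-first walk trades exactness for scale in the sense used earlier: it reports everything that could be affected, without analyzing whether a particular change actually alters behavior.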
Outline
Middleware “introduction”
Paradigm shift from programs to compusystems
Optimizing middleware
  Understanding
  – New programming models
  – Adaptive systems
New programming models
• Compusystems have a lot of redundancy
  – In performing the same computations
  – In marshalling in and out of normal forms
  – Use caching intelligently
    • Cache consistency can be hard
• Computations through the compusystem can have a long journey
  – Minimize length of pipeline
  – Shift computation and queues “stage left”
    • Minimize the holding of resources
Pipelines
[Figure: request pipeline with queues between stages: net router, HTTP server (cthreads, plugin), app server (servlet engine, ORB), resource connections.]
Pipelines
[Figure: the pipeline reduced to net router, HTTP server, and app server.]
Shift computations and queuing to the left
Example: Very Large Web Sites
[Figure: listing of page fragment files, including FTgeeepuuCSSDAY.frg, FTW400901CSELNK.frg, FTW400902CSELNK.frg, FTgeeepuuMSS.frg, 09302030.frg, FTgeeepuuRRS.frg, FTgeeepuuNVE.frg, common_top.frg, and News/TE/sports/Ft.frg (the “No News Today” fragment).]
Massive web sites with dynamic content
• Evolution of web sites
  – From static HTML, GIF, JPEG
  – To dynamic HTML, GIF, JPEG + database accesses, transactions + rich media
  – Massive number of changing “objects”
    • Objects may be assembled into many pages
• Challenges
  – Management
  – Throughput
  – Consistency
Object Dependency Graph
[Figure: a dependency parser and a page assembler turn object source files into composed objects and an object dependency graph.]
Object source files
  A.frag: <html><body> <p>This is the body</p> <!--% fragment(B.frag)--> <!--% fragment(C.frag)--> <img src="abc.gif"> </body></html>
  B.frag: <p>This is the B.frag</p> <!--% fragment(E.frag)-->
  C.frag: <p>This is the C.frag</p> <img src="xyz.gif">
  E.frag: <ol> <li>Item 1 in E frag <li>Item 2 in E frag </ol>
Composed objects
  A contains B and C; B contains E
Object dependency graph
  Composition edges link A with B, A with C, and B with E
  Reference edges link A with abc.gif and C with xyz.gif
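The roles of the dependency parser and the page assembler can be sketched in a few lines. This is an illustration, not the system shown on the slide. The fragment sources are a cleaned-up reading of the slide text, so the exact `<img>` markup is an assumption, and the names `parse_edges` and `assemble` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Directive syntax as it appears in the fragment sources on the slide.
FRAGMENT = re.compile(r"<!--%\s*fragment\(([^)]+)\)\s*-->")
IMAGE = re.compile(r'<img\s+src="([^"]+)"')

SOURCES: Dict[str, str] = {
    "A.frag": '<html><body><p>This is the body</p>'
              '<!--% fragment(B.frag)--><!--% fragment(C.frag)-->'
              '<img src="abc.gif"></body></html>',
    "B.frag": '<p>This is the B.frag</p><!--% fragment(E.frag)-->',
    "C.frag": '<p>This is the C.frag</p><img src="xyz.gif">',
    "E.frag": '<ol><li>Item 1 in E frag<li>Item 2 in E frag</ol>',
}

Edge = Tuple[str, str]

def parse_edges(sources: Dict[str, str]) -> Tuple[List[Edge], List[Edge]]:
    """Return (composition_edges, reference_edges) found in the fragment sources."""
    composition: List[Edge] = []
    reference: List[Edge] = []
    for name, text in sources.items():
        composition += [(name, child) for child in FRAGMENT.findall(text)]
        reference += [(name, image) for image in IMAGE.findall(text)]
    return composition, reference

def assemble(name: str, sources: Dict[str, str]) -> str:
    """Expand fragment directives recursively (assumes no cyclic includes)."""
    return FRAGMENT.sub(lambda m: assemble(m.group(1), sources), sources[name])

composition, reference = parse_edges(SOURCES)
print(composition)   # fragment-includes-fragment pairs
print(reference)     # fragment-references-image pairs
print(assemble("A.frag", SOURCES))
```

Keeping the edges separate from the assembled page is what lets a site answer "which pages must be rebuilt when E.frag changes?" without re-parsing every source file.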
Distributed storage model
[Figure: a cache holds objects by ID (Object ID1 to Object ID6) together with a dependencies hash table; HTML pages (Page 1 to Page 5, Page X, Page Y) are distributed over the network to another cache.]
Using caching effectively
• Four tier web serving architecture
  – Content management servers
  – Origin servers
  – Origin caches
  – Point of distribution caches
• Generation upon demand, production upon availability
• Intelligent cache expiration policies
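One reading of "production upon availability" is that pages are rebuilt as soon as their underlying data changes, instead of being invalidated and regenerated on the next request. The sketch below illustrates that reading only. `PageCache`, its methods, and the sample data are hypothetical, and the `used_by` table is a simplified stand-in for a dependencies hash table.

```python
from typing import Callable, Dict, Set

class PageCache:
    """Cache of rendered pages plus a table mapping each object to the pages using it."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self.render = render                      # hypothetical page generator
        self.pages: Dict[str, str] = {}           # page name -> rendered HTML
        self.used_by: Dict[str, Set[str]] = {}    # object id -> dependent pages

    def register(self, page: str, objects: Set[str]) -> None:
        """Record which objects a page is built from and render it once."""
        for obj in objects:
            self.used_by.setdefault(obj, set()).add(page)
        self.pages[page] = self.render(page)

    def object_changed(self, obj: str) -> Set[str]:
        """Re-render dependent pages as soon as new data is available.

        Pages are replaced in the cache rather than invalidated, so a later
        request never has to wait for generation.
        """
        stale = self.used_by.get(obj, set())
        for page in stale:
            self.pages[page] = self.render(page)
        return set(stale)

# Hypothetical data source and pages.
scores = {"score": "0-0"}
cache = PageCache(lambda page: f"<p>{page}: {scores['score']}</p>")
cache.register("Page1", {"score", "banner"})
cache.register("Page2", {"banner"})

scores["score"] = "1-0"
changed = cache.object_changed("score")
print(changed)                 # only Page1 depends on the score object
print(cache.pages["Page1"])
```

Pages that do not depend on the changed object stay untouched, which is the point of tracking dependencies instead of expiring the whole cache.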
The effectiveness of caching
[Figure: caching hierarchy from the server complex through pods and server caches (Schaumburg, Bethesda, Los Angeles, New York, San Francisco) to ISP caches and the Internet, with proportion of hits and proportion of data shown for the labels Domino and Flat. Percentages shown: 100%, 100%, 75% ?, 75% ?, 6%, 0.7%, 1.2%, 25%, 0.3%, 20%.]
“High-Performance Web Site Design Techniques”, Arun Iyengar, Jim Challenger, Daniel Dias, and Paul Dantzig, IEEE Internet Computing, vol. 4, no. 2, March/April 2000.
Web Traffic Volumes
Date                 1Q 98    2Q 98    2Q 99    3Q 99    2Q 00    3Q 00
Peak hits, 1 day     55 M     37 M     100 M    153 M    282 M    875 M
Peak hits / minute   110 K    147 K    430 K    374 K    964 K    1.2 M
Availability         100%     100%     100%     100%     100%     100%
Outline
Middleware “introduction”
Paradigm shift from programs to compusystems
Optimizing middleware
  Understanding
  New programming models
  – Adaptive systems
Adapt
• It is hard to predict how compusystems will behave
  – The performance characteristics of the individual components are often not known
  – Performance characteristics change due to changing conditions
    • Periodic events (e.g., monthly billing cycles)
    • Permanent changes (e.g., new systems added)
Examples: adaptive server tuning
• Goals
  – Generic agent for automated tuning
    • Automatically learn performance characteristics of target system(s)
    • Adapt system to changing workload and environment
    • Able to handle distribution, heterogeneity
[Figure: feedback loop with users, administrator, controller, sensor, and server; labeled signals are RPCs, reference value, queue length, tuning control, and server log.]
Control law:
u(t) = u(t-1) + K e(t)
[Figure: experimental results, K = 0.1]
[Figure: experimental results, K = 50]
[Figure: experimental results, K = 1]
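The effect of the gain K in the control law can be explored with a toy simulation. This sketch does not reproduce the experiments above. It assumes e(t) is the reference value minus the measured output, and it assumes a made-up linear plant in which the measured output is proportional to the previous control setting. `simulate`, `plant_gain`, and all numbers are hypothetical.

```python
from typing import List

def simulate(gain: float, reference: float = 20.0, plant_gain: float = 0.5,
             steps: int = 40, u0: float = 0.0) -> List[float]:
    """Run the control law u(t) = u(t-1) + K*e(t) against a toy plant.

    Toy plant (an assumption): measured output y(t) = plant_gain * u(t-1).
    Returns the absolute error |reference - y(t)| at every step.
    """
    u = u0
    errors = []
    for _ in range(steps):
        y = plant_gain * u      # sensor reading produced by the previous setting
        e = reference - y       # e(t): distance from the reference value
        u = u + gain * e        # control law from the slide
        errors.append(abs(e))
    return errors

if __name__ == "__main__":
    for k in (0.1, 1.0, 50.0):
        errs = simulate(k)
        print(f"K={k}: first error {errs[0]:.2f}, last error {errs[-1]:.3g}")
```

For this toy plant the error is multiplied by (1 - K * plant_gain) at every step, so a small gain converges slowly, a moderate gain converges quickly, and a gain that pushes that factor beyond magnitude 1 overshoots more on every step. Which gain works best on a real server depends on how the server actually responds to the tuning control.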
Future research
[Figure: the request pipeline again: net router, queue, HTTP server (cthreads, plugin), queue, app server (servlet engine, ORB), resource connections.]
Correlate optimal settings for multiple queues and connections
Other research directions
• Performance contracts
  – Provide components with signatures not only of the functions they provide, but also of the performance they need
    • E.g., guarantee that it will use resources [x,y,z]
    • E.g., guarantee that it will open [5-20] files
• Better models of compusystems
  – Including complexity models
    • E.g., pipeline length as a measure of complexity? (the number of distinct “components” a transaction must pass through from beginning to end)
• Self healing systems
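A performance contract of the kind proposed above could be represented as declared resource ranges checked against observed usage. This is a speculative sketch of the idea, not an existing API. The class `PerformanceContract`, the component name, and the `db_connections` resource are hypothetical; only the 5 to 20 open files range comes from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class PerformanceContract:
    """Declared resource bounds for a component: resource name -> (min, max)."""
    component: str
    bounds: Dict[str, Tuple[int, int]]

    def violations(self, observed: Dict[str, int]) -> List[str]:
        """Return a message for every observed value outside its declared range."""
        problems = []
        for resource, (low, high) in self.bounds.items():
            value = observed.get(resource, 0)   # unreported resources count as 0
            if not low <= value <= high:
                problems.append(
                    f"{self.component}: {resource}={value} outside [{low}, {high}]"
                )
        return problems

# Hypothetical component and resources; the file range is the slide's example.
contract = PerformanceContract(
    "OrderService", {"open_files": (5, 20), "db_connections": (1, 4)}
)
print(contract.violations({"open_files": 12, "db_connections": 2}))
print(contract.violations({"open_files": 42, "db_connections": 2}))
```

Checking at run time is only one option; the same declaration could in principle be used at deployment time to decide whether a set of components fits on a given server.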
Summary
Paradigm shift from micro to macro programming
  – Compusystems
Optimization
  – Requires understanding of these systems
  – Exploits new programming paradigms
  – Will require more adaptive techniques