stuck in the middle: challenges and trends in optimizing middleware daniel m. yellin director of...

57
Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Post on 20-Dec-2015

215 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Stuck in the middle: challenges and trends in optimizing

middleware

Daniel M. Yellin

Director of Software Technology

IBM Research

June 2001

Page 2: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

With thanks to a cast of players

• Paul Dantzig

• Joe Hellerstein

• Doug Kimelman

• Chet Murthy

• Darrell Reimer

• Gabi Zodik

• …

Page 3: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Outline

• Middleware introduction

• Paradigm shift from programs to compusystems

• Optimizing middleware

Page 4: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

In the middle of what?

Page 5: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

In the middle of distributed components

App 1

Customer repository

Directory

Data base

App 3

App 2

Horizontal Composition

GLUE

Page 6: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

In the middle of application and network services

Layer of abstraction

Application

Service 1Service

2Service

3Service

4

Middleware

VerticalComposition

Page 7: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Middleware sales growing by leaps and bounds!

0200

400600

8001000

12001400

16001800

2000

1998 1999 2000

Product LicenseIncluding Services

In millions of $

Source: Application Integration Middleware Market, Gartner, 9/01

Example: Revenue for Integration Broker Suites

Page 8: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Why the demand?

• Business trends requiring integration– Enterprise Resource Planning (ERP), Supply

Chain Management (SCM), Customer Relationship Management (CRM)

Page 9: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Why the demand?

• Business trends requiring integration– ERP, SCM, CRM

• Internet has accelerated the drive– New e-business processes that tie together

multiple applications

Page 10: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Why the demand?

• Business trends requiring integration– ERP, SCM, CRM

• Internet has accelerated the drive– New e-business processes that tie together

multiple applications

• Business to business (B2B) will further accelerate this trend

Page 11: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Web Browser

Remote

DatabasePersonalization Data Base

Servlet

JSP

HTML

MQ

Se r i

e s

J

N

D

I

JDBC

JSP

Session

Pool

Data

Bean

1

2

3

4

5

1a

Web Application Server

LDAP Directory

Back End

Business

Applications

Business

Integration

Services

Messaging

Intelligent Routing

Message Transform.

Message Based

Message Based

CICS Connect

Etc.

Agents

C

O

M

M

A

N

D

B

E

A

N

JDBC

Inte

rfac

es

Inte

rfac

es

Typical company web architecture

Page 12: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Outline

Middleware “introduction”

• Paradigm shift from programs to compusystems

• Three principles to optimization

Page 13: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

What does it all mean?

• From micro to macro programming– Building the “application logic” itself often a

small part of the overall development process– More and more of the development process is

about making the pieces work together (connecting the dots)!

void main(){int i = i+1;}

Page 14: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

From programs to compusystems

• Ecosystem: a community of animals and plants and the environment with which it is interrelated (Webster’s)

• Compusystem: a community of software and hardware components and the middleware with which it is related (Yellin)

Page 15: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Characteristics of compusystems

• They are complex– Many parts– Component interactions are often hard to

understand

• They are forever changing– Cyclic events throughout day, month, year,…– Permanent changes

• New “species” introduced, compusystems may be combined,…

See Information Week, April 2, 2001

“Conquering Complexity”

Page 16: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Characteristics of compusystems

• They run forever– 24 x 7– No such thing as stopping to test, debug, fix

• They have a lot of history (Calcification)– This constrains what you can do

• E.g., too much data to reformat

• E.g., policy system cannot be shut down until last policy holder dies

Page 17: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Outline

Middleware “introduction”Paradigm shift from programs to

compusystems

• Optimizing middleware

Page 18: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Typical performance issues

• “Why are we getting such erractic and poor performance?”

• Complex system• Can’t separate app v.

platform issues• Component expertise

doesn’t extend to system

Page 19: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

3 principles to optimizing middleware

• How can we better understand a compusystem to optimize it?

Page 20: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

3 principles to optimizing middleware

• Understand

• How do you minimize the overhead of all the middleware?– avoid being stuck in the middle!

understand

Page 21: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

3 principles to optimizing middleware

• Understand

• New programming models

• And how can we design for optimality when the system is forever changing?

Page 22: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

3 Principles to optimizing middleware

• Understand

• New programming models

• Adapt

Page 23: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Understand

• Understanding systems is different than understanding individual organisms/platforms

• Understanding components or even pointwise interactions is not the same as understanding the compusystem

• Effects can – be non-local– interfere with one another– Only visible under heavy loads

Page 24: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Flows within a compusystem can be hard to follow, can spawn

many threads

Web Browser

DatabasePerson. DB

Servlet

JSP

HTML

MQ

Se r i e s

J

N

D

I

JDBC

JSP

Session

Pool

Data

Bean

2

3

Web Application Server

LDAP

Back End

Applications

Business

Integration

Services

C

O

M

M

A

N

D

B

E

A

N

Inte

rfac

es

Inte

rfac

es

Page 25: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

And you have a lot of concurrent flows

Web Browser

DatabasePerson. DB

Servlet

JSP

HTML

MQ

Se r i e s

J

N

D

I

JDBC

JSP

Session

Pool

Data

Bean

2

3

Web Application Server

LDAP

Back End

Applications

Business

Integration

Services

C

O

M

M

A

N

D

B

E

A

N

Inte

rfac

es

Inte

rfac

es

Page 26: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Example: Websphere Performance Toolkit

Browser

DBDB

Servlet

JSP

HTML

JSP

App Server

LDAP

Back End

Apps

Messaging

JVM

monitor

monitor

dashboard

Page 27: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Websphere Performance Toolkit

• Key concepts– Monitor: real time collection of “vital” signs

from across the compusystem

– Dashboard: information consolidated into a unified display

• shows what is happening in the compusystem, and can help make apparent correlations between different parts of the compusystsm

Page 28: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Websphere Performance Toolkit

• “My site is slow, now what?”– dashboard gives immediate clues as to what to look for

– what to display is critical

• Helps find correlations – example: router was broken, was not “spraying”

requests appropriately

– could not tell from looking at single site, only by burst pattern (“wave” traveling from one machine to next …)

Page 29: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Stress testing the system

Browser

DBDB

Servlet

JSP

HTML

JSP

App Server

LDAP

Back End

Apps

Messaging

JVM

1. Spray the system with increasing loads2. See what happens!

Stress rig

monitor

dashboard

Page 30: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Next steps

• Analysis– Can we automate the discovery of common performance

errors?• misconfiguration • mismatch of parameters for particular load signatures

– Interactive trouble shooter– Autonomic middleware

• Combine with predictive modeling techniques– Refine models as better data becomes available– Use models to suggest tests to be performed by stress rig

Page 31: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Understand (2)

• Understanding not only the dynamic behavior of computsystems but also its static structure (code base)– Trade off exactness for scale– Need to understand dependencies across

heterogeneous code artifacts (html, servlets, Java, Cobol,…)

– Example: change in LDAP indexing had ripple effect on application. From 10->20 ms/request.

Page 32: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Example: Asset Locator

Composed of two major modules– information gathering engine (crawler)– semantic search server

eColabra: An Enterprise Collaboration & Reuse Environment", Orit Edelstein, Avi Yaeli, Gabi Zodik,

4th International Workshop, NGITS'99, Zikhron-Yaakov, Israel, July 1999,

http://iew3.technion.ac.il:8080/~ngits/

Page 33: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Information gathering architecture

112

2

3

45

67

8

9

10

11

Server

ClearCase

scheduler

Server

DB2 database & index server

Enterprise Code Repositories

WebDAV

TC

File System

Crawlers

Analyzers

Java

C++

COBOL

HTML

XML

JSP

Text

JavaClassFile

Page 34: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

ServerServer

CategorizerRelationship

Analysis

DB2 database &

index server Categorization

Engine

Javarules

htmlrules

...rules

find relationshierarchyreference...

RelationshipAnalysis

Repository analysis phase

Page 35: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Run-time architecture

DB2 database index server

DatabaseServerClient

Eclipse Plugin

LightweightJavaScript based

Search Server

HTTP Server / Servlet Engine

Query Processor

Query/Result Processors

XMLHTML

query/

results

Query (SQL, IR)

package, category, hierarchy

RelationshipAnalysis

Navigator ProcessorRecursive

Client

Data base server

Search server

Page 36: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Asset Locator

• Impact analysis– “What resources are affected if I …?”– Resource statistics

• Reuse– Finding assets of a particular type, function,…

• Understanding– Relationships between many different assets of

many different types

Page 37: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Outline

Middleware “introduction”Paradigm shift from programs to

compusystemsOptimizing middleware

Understanding– New programming models– Adaptive systems

Page 38: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

New programming models

• Compusystems have a lot of redundancy– In performing the same computations– In marshalling in and and out of normal forms– Use caching intelligently

• Cache consistency can be hard

• Computations through the compusystem can have a long journey– Minimize length of pipeline– Shift computation and queues “stage left”

• Minimize the holding of resources

Page 39: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Pipelines

cthreads

Net routerQueue

Queue

Http server

plugin

Queue

App server

Servletengine Orb

Resource connections

Page 40: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Pipelines

Net router

Http server App server

Shift computations and queuing to the left

Page 41: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

FTgeeepuuCSSDAY.frg

..XXXXXX.frg

FTW400901CSELNK.frg

FTW400902CSELNK.frg

FTgeeepuuMSS.frg

09302030.frg

FTgeeepuuRRS.frg

...NVSLIV.frg

...NVSCSS.frg

...NVSCLS.frg

...NVSMWS.frg

FTgeeepuuNVE.frg

News/TE/sports/Ft.frg <-- No News Today Fragment

common_top.frgExample:Very Large Web Sites

Page 42: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Massive web sites with dynamic content

• Evolution of web sites – From static html, gif, jpeg– To dynamic html, gif, jpeg + database accesses,

transactions + rich media– Massive number of changing “objects”

• Objects may be assembled into many pages• Challenges

– management– throughput– consistency

Page 43: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Object Dependency Graph

PageAssem bler

Dependency Parser

E.frag <ol> < li> Item 1 in E frag < li> Item 2 in E frag < /o l>

B .frag <p>This is the B .frag</p> < !--% fragm ent(E .frag)-->

A .frag<htm l><body> <p>This is the body</p> < !--% fragm ent(B .frag)--> < !--% fragm ent(C .frag)--> < im g src="abc.g if" --></body></htm l>

C .frag <p>This is the C .frag</p> < im g "src=xyz.g if">

O b je c t So urc e File s

AB

C

E

xyz.gif

C o m p o se d O b je c ts

abc.g if

A

B C

E

Cxyz.gif

C om position E dges

R eference E dges

O b je c t De p e nd e nc yG ra p h

abc.gif

A

Page 44: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Distributed storage model

Cache

Object ID1

Object ID2

Object ID3

Object ID5

Object ID6

Object ID4

DependenciesHashTable

Page1

Page2

Page3

Page4

Page5

HTMLPages

Network

Page 5Page YPage 4Page 3Page 2Page XPage 1

Cache

Page 45: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Using caching effectively

• Four tier web serving architecture– Content management servers– Origin Servers– Origin Caches– Point of distribution caches

• Generation upon demand

Production upon availability• Intelligent cache expiration policies

Page 46: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

The effectiveness of caching

ISP

Internet

Pods

Server Complex

Proportion of hitsProportion of data

Domino

Flat

ISP Caches

Server Caches

100%

100%

75% ?

75% ?

6%

0.7%

1.2%

25%

0.3%

20%

ServerCaches

ServerCaches

Schaumburg

ServerCaches

ServerCaches

Bethesda

Los Angeles

New York

San Francisco

ServerCaches "High-Performance Web Site Design

Techniques",

Arun Iyengar Jim Challenger, Daniel Dias, and Paul Dantzig,

In IEEE Internet Computing, vol. 4 #2, March/April 2000.

Page 47: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Web Traffic Volumes

DATE 1Q 98 2Q 98 2Q 99 3Q 99 2Q 00 3Q 00

Peak Hits1- Day

55 M 37M 100 M 153 M 282 M 875 M

Peak Hits / Minute

110 K 147 K 430 K 374 K 964 K 1.2 M

Availability 100% 100% 100% 100% 100% 100%

Page 48: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Outline

Middleware “introduction”Paradigm shift from programs to

compusystemsOptimizing middleware

UnderstandingNew programming models– Adaptive systems

Page 49: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Adapt

• It is hard to predict how compusystems will behave– The performance characteristics of the

individual components are often not known– Performance characteristics change due to

changing conditions • Periodic events (e.g., monthly billing cycles)

• Permanent changes (e.g., new systems added)

Page 50: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Examples: adaptive server tuning

• Goals– Generic agent for automated tuning

• Automatically learn performance characteristics of target system(s)

• Adapt system to changing workload and environment

• Able to handle distribution, heterogeneity

Page 51: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Users

Administrator

Controller

RPCs

Sensor

Server

Referencevalue

QueueLength

Tuningcontrol

ServerLog

Control law:

u(t) = u(t-1) + K e(t)

Page 52: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Experimental results: k= .1

Page 53: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Experimental results: k = 50

Page 54: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Experimental results: k = 1

Page 55: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Future research

cthreads

plugin

Net router

Queue

Http server

Queue

App server

Servletengine Orb

Resource connectionsCorrelate optimal settings formultiple queues and connections

Page 56: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Other research directions

• Performance contracts– Provide components with signatures not only of the

functions they provide, but performance they need • E.g., guarantee that it will use resources [x,y,z]• E.g., guarantee that it will open [5-20] files

• Better models of compusystems– Including complexity models

• E.g., pipeline length as a measure of complexity? (the # of distinct “components” a transaction must pass through from beginning to end)

• Self healing systems

Page 57: Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001

Summary

Paradigm shift from micro to macro programming– Compusystems

Optimization – Requires understanding of these systems– Exploits new programming paradigms– Will require more adaptive techniques