reliability tools and options professor ken birman dept. of computer science cornell university
Post on 22-Dec-2015
Reliability Tools and Options
Professor Ken Birman, Dept. of Computer Science
Cornell University
Last Time
We saw that reliability is a complex spectrum of properties and tradeoffs
We developed the idea of e-triage
And we glanced at some technologies
Today
Last of three lectures on reliability
Focus on technologies in more depth
  What can they do for us?
  How do they work?
  How well do they integrate?
  Limitations? Scalability issues?
Technologies
Communication Tools:
  TCP/IP
  Remote Procedure Call (or “method invocation”)
  Process group membership tracking and multicast
  Publish/subscribe (also called “MOMS”)
Checkpoint and Restart, perhaps with mirrored disks
Transactions and Databases
Web servers and Java/JNI/JavaScript
Components and Object-oriented architectures
Cluster fault-tolerance and load-balancing
Traditional Linux tools and “scripts”
Hardware reliability: fault-tolerant computers
Sorting Things Out
Computer scientists like to think in terms of big chunks of technology that they classify into categories
Often we talk about “layers”
  Lowest layers are close to the hardware
  Higher layers deal with things closer to the user who sits in front of a screen
Examples of Layers
Network Protocols
Operating System
Server Technologies
Middleware
Applications
What Makes a Layer?
A layer uses the services below it but nothing from above it
And the layer offers a set of services to things above it
Sometimes we imagine a layer as a thing that transforms a computer or a network into a new one with new properties!
Somewhat like looking through a set of magic eyeglasses, each one somehow transforming the world into a magic new world…
Examples of Layers
Network Protocols
Operating System
Server Technologies
Middleware
Applications
Operating System
Major ones are
  Windows (several variants), from Microsoft
  Linux (one of many versions of Unix)
  Macintosh OS
  Palm OS
  VxWorks, QNX
Many other minor ones
Operating System
It runs the hardware for one computer
  Also supports “processes”, manages memory and other resources, provides security
People refer to the OS as a “platform”
  Applications use OS features and run “on” it
  They don’t need to deal with special issues involving the hardware because the OS handles them
These days OS also includes components that handle networking
A modern OS is structured as a set of “objects”
Protocols
These are little programs that run in applications or in the OS
They work by sending messages over network connections
Goal is to do something useful in a distributed manner
  For example, the network can lose packets
  But web pages can’t tolerate missing chunks of data!
  So the web uses a protocol that resends lost packets
Representative Protocols?
Just look at two examples to get the feel
  Don’t worry about the details
  Idea is to understand the “kinds of things” each layer is doing, not the specifics
  We do teach the specifics in Cornell courses
  But any one of these would take weeks to cover in a comprehensive way
Communication Tools: TCP/IP
The basic communication technology of the Web
Works like a telephone call:
  Your browser connects to a server using its IP address (looks like 128.64.77.133)
  Your request is sent as a message over the connection. The result comes back.
  The connection automatically matches sending and receiving rates (easily fooled by noisy links!)
  Also, automatically corrects for data loss
TCP sliding window
[Figure: TCP sliding window. The sender provides data, the receiver consumes data, and a window of k “segments” (numbered mi, mi+1, ...) sits between them, with some slots not yet filled.]
IP packets carry segments
Receiver replies with acks and nacks; sender resends missing data
When an acknowledgement is received, the segment number keeps incrementing but the slot number is reused.
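The window behaviour described on this slide can be sketched in a few lines of Python. This is a simplified illustration, not real TCP: the function name, the `drop_once` parameter (a stand-in for a lossy network) and the policy of resending everything from the first gap are all assumptions made for the example.

```python
def send_with_sliding_window(data, k, drop_once):
    """Deliver `data` in order through a window of k slots.

    drop_once: segment numbers whose first transmission is lost
    (hypothetical stand-in for a lossy network).
    Returns (delivered data, list of (segment, slot) transmissions).
    """
    delivered = []        # what the receiver has consumed, in order
    transmissions = []    # one (segment number, slot number) pair per send
    to_drop = set(drop_once)
    base = 0              # oldest segment not yet acknowledged
    while base < len(data):
        # The sender transmits every segment currently inside the window.
        arrived = set()
        for seg in range(base, min(base + k, len(data))):
            transmissions.append((seg, seg % k))  # slot numbers are reused
            if seg in to_drop:
                to_drop.discard(seg)              # lost this one time only
            else:
                arrived.add(seg)
        # The receiver consumes in-order segments and acknowledges them.
        # The first gap stops the loop, so the sender resends from there.
        while base in arrived:
            delivered.append(data[base])
            base += 1
    return delivered, transmissions


delivered, transmissions = send_with_sliding_window(list("reliability"), 4, {2, 5})
print("".join(delivered), len(transmissions))
```

Segment numbers keep growing while `seg % k` cycles through the same k slots, which is the reuse the slide mentions. Real TCP adds timers, congestion control and byte-level sequence numbers, none of which are modelled here.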
TCP/IP: Pros and Cons
Simple, widely supported way to communicate
Overcomes packet loss, duplication, out of order delivery
But can reduce rate down to zero when network becomes congested, easily fooled by a noisy link
Also, connections can break even if neither endpoint actually fails.
Things that use TCP, like web browsers, inherit these benefits… and these problems!
Communication Tools: RPC
Idea is that each program declares a set of actions it can perform – “methods” that can be invoked using an “interface”
Client programs “bind” to interface
Send a message to invoke a method, reply comes back in form of a message too.
Special protocols overcome failure
The basic RPC protocol
[Figure: message flow between client and server, built up step by step]
Server: registers with name service
Client: “binds” to server
Client: prepares, sends request
Server: receives request, invokes handler, sends reply
Client: unpacks reply
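The steps in the figure can be walked through with a small in-process sketch. Everything here is hypothetical and chosen for illustration: the `NameService`, `Server` and `Client` classes, the JSON message format and the `calculator` interface. A real RPC system would send the bytes over a network and add timeouts and retransmission, which this sketch leaves out.

```python
import json


class NameService:
    """Maps interface names to servers (hypothetical, in-process)."""

    def __init__(self):
        self._servers = {}

    def register(self, name, server):
        self._servers[name] = server

    def lookup(self, name):
        return self._servers[name]


class Server:
    def __init__(self, handlers):
        self._handlers = handlers  # method name -> callable

    def receive(self, request_bytes):
        request = json.loads(request_bytes)              # receives request
        handler = self._handlers[request["method"]]
        result = handler(*request["args"])               # invokes handler
        return json.dumps({"result": result}).encode()   # sends reply


class Client:
    def __init__(self, name_service, interface):
        self._server = name_service.lookup(interface)    # "binds" to server

    def call(self, method, *args):
        message = {"method": method, "args": list(args)}
        request = json.dumps(message).encode()           # prepares request
        reply = self._server.receive(request)            # sends it, waits for reply
        return json.loads(reply)["result"]               # unpacks reply


names = NameService()
names.register("calculator", Server({"add": lambda a, b: a + b}))  # server registers
client = Client(names, "calculator")
total = client.call("add", 2, 3)
print(total)
```

The client never touches the handler directly; it only sees messages going out and coming back, which is why a lost reply and a crashed server look the same from its side.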
RPC Summary
Basic technology in most “client-server” situations with exception of the Web
Can hide packet loss but not server failure
Can certainly fail (due to timeout) when server and client are actually both healthy
Many limitations in terms of form of data you can send, packet size, etc.
When are they used?
TCP is used to transfer “objects”
  Usually objects are reasonably large
  Examples are email messages, files, web pages, copies of programs
RPC is used when a program asks for a service provided by some other program
  Best for small requests and replies
Examples of Layers
Network Protocols
Operating System
Middleware
Applications
Server Technologies
Concept Of Middleware
Middleware is any kind of a software tool that runs over a basic infrastructure
  Provides a standard set of services for some class of applications
Idea is that OS and network may be “too general”
  Middleware creates a better environment for some large class of applications that all share a need poorly addressed by the lower layers
Middleware is increasingly important
Communication Middleware Example: Multicast
Broad term covering a variety of one-to-many communication tools
We talk about the:
  Process group: set of programs for which membership is tracked
  Multicast: a way of sending data to group
  State transfer: brings a joining program up to date
  Order, atomicity: guarantee that messages are seen in same order by all members, despite failure
Virtual Synchrony Model
[Figure: timeline of processes p, q, r, s, t and the sequence of group views]
G0={p,q}
r, s request to join
G1={p,q,r,s}: r, s added; state xfer
p fails (crash)
G2={q,r,s}
t requests to join
G3={q,r,s,t}: t added, state xfer
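The sequence of views in the figure can be reproduced with a small bookkeeping sketch. The `Group` class and its methods are hypothetical names for this example. The sketch only tracks membership views and copies state to joiners; it does not implement the ordering or atomicity guarantees that a real virtual synchrony system provides when failures happen during a multicast.

```python
class Group:
    """Tracks membership views and one state copy per member."""

    def __init__(self, members, state):
        self.views = [sorted(members)]                   # G0
        self.state = {m: dict(state) for m in members}

    def join(self, *new_members):
        current = self.views[-1]
        donor = current[0]             # any current member could donate state
        for m in new_members:
            self.state[m] = dict(self.state[donor])      # state transfer
        self.views.append(sorted(current + list(new_members)))

    def fail(self, member):
        del self.state[member]
        self.views.append([m for m in self.views[-1] if m != member])

    def multicast(self, key, value):
        # Every member of the current view applies the same update.
        for m in self.views[-1]:
            self.state[m][key] = value


g = Group(["p", "q"], {"x": 0})
g.multicast("x", 1)
g.join("r", "s")      # G1: r, s added with state transfer
g.fail("p")           # G2: p crashes
g.multicast("x", 2)
g.join("t")           # G3: t added with state transfer
print(g.views)
```

Because t receives a copy of a current member's state when it joins, it starts out consistent with q, r and s even though it missed both multicasts.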
Communication Middleware Example: Publish/Subscribe
Packaging of one-to-many communication tools into an elegant, easily understood form
Idea is that data producers “publish” information, marked with “subjects” that each item is about
Subscribers “subscribe” to the subjects of interest to them
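A minimal sketch of the publish/subscribe idea, assuming an in-process `MessageBus` class (a hypothetical name) with subjects represented as plain strings. Spooling for playback, persistence and network delivery are left out.

```python
from collections import defaultdict


class MessageBus:
    def __init__(self):
        self._subscribers = defaultdict(list)   # subject -> list of callbacks

    def subscribe(self, subject, callback):
        self._subscribers[subject].append(callback)

    def publish(self, subject, item):
        # Deliver the item to everyone who subscribed to this subject.
        for callback in self._subscribers[subject]:
            callback(subject, item)


bus = MessageBus()
red_inbox, all_inbox = [], []
bus.subscribe("red", lambda subject, item: red_inbox.append(item))
bus.subscribe("red", lambda subject, item: all_inbox.append((subject, item)))
bus.subscribe("green", lambda subject, item: all_inbox.append((subject, item)))

bus.publish("red", "quote 1")
bus.publish("green", "quote 2")
bus.publish("blue", "quote 3")    # nobody subscribed, so nobody receives it
print(red_inbox, all_inbox)
```

Publishers and subscribers never refer to each other, only to subjects, which is what makes it easy to add new ones over time.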
Conceptually, a message “bus”
[Figure: a message bus connecting publishers, subscribers and spoolers]
Boxes are publishers (red / green subjects)
Circles are subscribers (same red / green subjects)
Disks represent spoolers used for playback
Flexible and easily extended over time
Supports huge numbers of subjects
Publish/Subscribe Pros and Cons
Conceptually very simple, popular
But in practice the infrastructure can be limiting and cumbersome
Often end up with more or less all processes receiving more or less all the messages, anyhow
Example of a technology that made more sense when computers were slower
When Are They Used?
Process groups?
  New York Stock Exchange, Swiss Exchange
  French air traffic control system
  AEGIS rebuild
Publish-Subscribe message bus
  Most trading floors
  Factory automation and process control
  Some internal use for gluing databases to web sites
Examples of Layers
Network Protocols
Operating System
Server Technologies
Middleware
Applications
Servers
Many modern technologies follow a client-server programming model
  You are the client
  The server handles incoming requests
This model is probably the big success of the 1980-2000 period for computing
Normally, client connects to server on network and uses some form of RPC to talk to it
Servers
Web servers
Database servers
Weblogic: a fancy web server that combines features needed for eCommerce sites
Mail servers, message queuing servers
Other application-specific servers
  E.g. computer-aided design, payroll, etc…
Servers
Secretly, most servers are a database, perhaps extended to know about a specific category of application or use
  We call these domain-specific refinements
  Idea is that an Oracle database, out of the box, is a very general platform, but a lot of work is needed to use it for, say, payroll
Databases use “transactional” model
Transactions and Databases
One of the very big, well supported technologies
Associated with databases
Each program “runs a transaction”
  begin
    action1
    action2
    action3
    ….
  commit or abort
Either entire transaction is performed, or entire transaction is erased (if disrupted by crash)
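The begin / commit-or-abort pattern can be tried out with Python's built-in `sqlite3` module. The `accounts` table, the names in it and the `transfer` helper are made up for this example; the `CHECK` constraint is simply a convenient way to force an abort partway through a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Run both updates as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # aborted: the credit to dst was undone as well


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

In the second call the credit to bob is applied first and then erased when the debit violates the constraint, so the database ends up as if the transaction had never started.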
ACID Properties
Atomicity: entire group of actions is treated as one “atomic unit”
Consistency: each transaction takes the database from one valid state to another
Isolation: more than one can run at the same time on the same database, but they are isolated from each other, as if only one ran at a time
Durability: committed transactions survive failures and recoveries
Pros and Cons
Mixture of a powerful model with powerful, comprehensive vendor support
More or less integrated with web
But recovery can be slow
And high availability databases usually sacrifice some aspects of ACID guarantee
Note that vendors offer “replication” products but nobody uses these – performance is terrible.
Hot topic: cluster-style parallel servers
  Clustering is a way to get scalability
Trends in Systems
Enough on layers
In previous lectures looked at business issues associated with the Internet
Today have also seen lots of technology
  Mixture of current systems
  Emerging products and systems
  Technologies
What comes next in distributed computing?
Ways of posing questions
As a business question: I want to get rich, what should I invest in?
  Ultimately a flakey and meaningless question
  Should ask “what should I learn about”
As a research question: I want to be famous, what should I invent?
  If you’re so smart, you should tell me!
As a big-picture question: Where is dramatic change inevitable?
  This question makes more sense than the others
Looking for Exciting Change
Our goal is to anticipate dramatic, unexpected change
Is there a methodology for identifying the big opportunities?
How can we apply it to networks and distributed computing?
Traditional Areas
File systems
Communications
Naming of objects, interoperation
Security
Resource management
Transactions
Extensibility
Emerging areas
Scalable service management
Tools for hosting data
Mechanisms for offloading work from customers onto 3rd party solution provider systems
QoS mechanisms
Power-aware and mobility support
Where are the big opportunities?
We could review these one topic at a time, but that might get dull
Can we develop a methodology for recognizing big opportunities and “leaping in”?
Technology trends
[Figure: bar chart by period (1985-1990, 1990-1995, 1995-2000, 2000-2005) for CPU MIPS, memory MB, LAN Mbits, WAN Mbits and O/S overhead; vertical axis 0 to 700. Source: Scientific American, Sept. 1995]
Note tremendous growth in WAN speeds
Typical latencies (milliseconds)
[Figure: latency in milliseconds on a log scale from 0.01 to 1000, by period (1985-1990, 1990-1995, 1995-2000, 2000-2005), for disk I/O, Ethernet RPC, ATM roundtrip and WAN roundtrip]
WAN, disk latencies are fairly constant due to physical limitations
O/S latency: the most expensive overhead on LAN communication!
[Figure: O/S overhead in proportional terms, 1985-1990 versus 1995-2000; vertical axis 0 to 40]
Suggests?
Notice that revolutionary opportunity is triggered by technical discontinuity
To predict a revolution…
  … just identify a technology sector about to be shaken up by a trend that breaks the usual relationships
  … predict “big things will happen”
Recent revolutions
Internet became much faster, more widely available
Operating systems became object oriented
Enabled the Web
Which enabled all sorts of B2B developments people knew were coming…
Other examples?
For a long time, PCs were slow and balky, but very cheap
But around 1990 technology gave us a fast, big PC
Suddenly, desktop world yielded to PC world
Price point can trigger a discontinuity
Other examples?
We used to be short on memory hence relied heavily on disks
But around 1985 memory sizes and cost changed the equation
Suddenly massive caches made sense
Giving us ideas like log-structured file systems and new styles of caching in file and database systems
A world where 100% hit rates made sense
Looking to the future?
Major discontinuities:
  Move from PC to PDA/telephone hybrids
  Mobility, disconnected operation
  Emergence of huge numbers of computing systems that need to cooperate
  Perhaps, some form of QoS?
Want to have an impact?
Trick is to zero in on one of these areas
Be an early player
For example, get a mobile hand-held system and start to play with it
Lots of things in the legacy infrastructure just aren’t right for it
Your opportunity: fix a few of them by doing the obvious things
And you’ll instantly be famous!
Mobile Trends
Nomadicity: increasingly powerful nomadic devices
  Anticipate fusion of web browser, telephone and also PDA functionality
  Some devices of this sort already exist, but they remain primitive
  Low bandwidth interaction a big obstacle right now: you can’t talk to it, but typing without a keyboard is a pain
Mobile trends
Communications standards
  We already are seeing widespread use of wireless Ethernet cards
  Bluetooth is the next big step: widespread low-power connectivity for small devices
  XML helps: data objects are readily understood… fewer proprietary standards
Mobile trends
Power conservation
  Also better understood
  Flexibility: compute faster or slower, move code or data, sleep or run more actively
  Signal strength also a factor
Mobile trends
Suggests a future in which
  We’ll move from place to place with our computing context
  In a given setting, devices find the appropriate local resources and can talk to them
  And device is smart about when to ship code, when to ship data
Mobile trends
But this also points to a missing link: exciting research opportunity
  How to do naming of objects in this new mobile world?
  User wants a single personalized name for resources and a single name space
  But we also need to share things
And how to organize or structure a nomadic or wireless environment
  Peer-to-peer and multi-peer opportunity will be enormous
Illustrating…
A discontinuous development
  From fixed infrastructure to mobile wireless one
  High performance but power-aware
  Fusion of previously independent technologies (voice, web, email)
Stress on existing infrastructure
  We tend to adapt the existing infrastructure to the new setting
  But a whole new approach may be needed
Driving…
New ideas in file systems
  How should we do file systems for mobile and wireless systems?
Communication
  How should we do point to point and multicast for wireless peer-to-peer or “ad-hoc” networks?
  Is TCP the right protocol for a wireless connection to a server?
The list goes on…
Dangers
It is easy to overreach
  People tend to try to do 10 things all at the same time…
  Need to be incremental
Challenge? Picking the right first step
  The right infrastructure can enable just about anything!
But we’re out of time…
Take-aways from this lecture series?
  Business roles in eCommerce
  Examples of existing sectors
  Some thought about business role in developing new technology-limited ventures
  And some review of how technologies are structured
  Leading to an angle on how to identify big emerging opportunity areas
What should I know?
If you want to remember just one thing…
  Remember the French air traffic control project
  Where the US project overreached and failed, the French went slowly, tested like crazy, and built a better system that really worked
Scalability and stability of technology is the key
Be French!
  Also drink moderate amounts of good red wine
  Visit http://www.fromages.com now and then
  Remember that vision of the world as 100 people…