Post on 14-Jul-2020
Distributed System: Lecture 2
Box Leangsuksun SWECO Endowed Professor, Computer Science Louisiana Tech University box@latech.edu
CTO, PB Tech International Inc. naibox@gmail.com
3/31/14 2
System Models
based on Professor Paul Francis notes, Cornell University
Models
• A model is a simplified representation of a system or phenomenon
• It provides an abstract, simplified but consistent description of a relevant aspect of a distributed system
– Mathematical representation
– Graphical notations
Why do we need modeling?
• To study physical systems without actually building them
• To help produce a better design
• To understand important aspects such as
– Performance
– Reliability
– Functionality (confirming that the system does what it should)
3/31/14 Towards survivable architecture 4
Why? Goals
• Compare alternatives
• Determine impacts (per feature)
• System tuning
• Quantify relative reliability/availability/performance
• Debugging
• Set expectations
How to measure or estimate
• Measurements
• Simulations
• Analytical modeling
Measurements
• Actual system construction
• Create a workload per the requirements
• Provides the best results
• Inherently difficult and inflexible
• Almost impossible to use for what-if analysis
Measurements (continued)
• Measure system or subsystem performance with tools
– gprof
– top, ps, etc.
– Benchmark programs (e.g. LINPACK, SPECmark, Winmark)
– PAPI, perfctr, perfmon, PerfSuite
• What about reliability measurement? Logs, traces, outage records.
Simulation
• A program that simulates important characteristics of the target system
• Flexible and easy to modify
• Good for what-if analysis
• Difficult to model every small detail
• Popular: cost-effective and flexible
• Accuracy suffers when details are left out
Analytical Modeling
• Mathematical description of the system
• Provides quick insight
– Helps guide a detailed simulation or a measurement-based study
• Results are much less believable or accurate
Performance
• Computation
– CPU
– Memory
– I/O, etc.
• Communication
– Latency
– Bandwidth
• Transaction
– Possibly more involved than a DB transaction
Some Criteria
• Throughput – number of completed requests per time unit
• Response time – amount of time from when a request is submitted until the first response is produced (not until the output is complete)
• CPU utilization – keep the CPU as busy as possible
• Turnaround time – amount of time to execute a particular request (finishing time − arrival time)
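The criteria above reduce to simple arithmetic on timestamps. A minimal sketch in C, assuming each request is logged with three timestamps in seconds; the struct and function names are illustrative and not from the lecture:

```c
#include <assert.h>
#include <stddef.h>

/* Timestamps for one request, in seconds (field names are hypothetical). */
struct request_times {
    double arrival;        /* when the request was submitted */
    double first_response; /* when the first response was produced */
    double finish;         /* when the request completed */
};

/* Response time: submission until the first response. */
double response_time(const struct request_times *r)
{
    return r->first_response - r->arrival;
}

/* Turnaround time: finishing time minus arrival time. */
double turnaround_time(const struct request_times *r)
{
    return r->finish - r->arrival;
}

/* Throughput: completed requests per second over an observation interval. */
double throughput(size_t completed, double interval_seconds)
{
    assert(interval_seconds > 0.0);
    return (double) completed / interval_seconds;
}
```

For a request that arrives at t = 10.0 s, gets its first response at 10.5 s and finishes at 14.0 s, the response time is 0.5 s and the turnaround time is 4.0 s.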
Stop Mar 25
• Important announcement: Midterm Exam April 10, 2014.
Performance issue discovery phase
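[Figure: project timeline from 1/19/2004 to 3/19/2004 (intermediate marks 2/1/2004 and 3/1/2004) across the phases Requirement, Architecture/design, Development/code, Test; a bar labeled "Re-design, code, re-test" spans 1/19/2004 - 3/19/2004]
Telecom industry architecture review: 1/3 of the issues were related to performance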
Simple example of effective memory access time
• Example
– H = cache hit probability
– Tm = memory access time
– Tc = cache access time
• What is the effective memory access time?
[Figure: CPU, cache, memory]
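One common answer to the question above is the hit-weighted average of the two access times. A minimal sketch in C, assuming a miss costs Tm alone; some textbooks charge Tc + Tm on a miss, which changes the second term. The function name is illustrative:

```c
#include <assert.h>

/*
 * Effective memory access time under the assumption that a hit costs tc
 * and a miss costs tm:  T_eff = H * Tc + (1 - H) * Tm
 * h is the cache hit probability; tc and tm share one time unit.
 */
double effective_access_time(double h, double tc, double tm)
{
    assert(h >= 0.0 && h <= 1.0);
    return h * tc + (1.0 - h) * tm;
}
```

With H = 0.9, Tc = 1 ns and Tm = 100 ns this gives about 10.9 ns, so even a 10% miss rate dominates the average.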
Example of modeling problem in DS
• Operation/transaction modeling for an e-commerce system
– Browsing (Tb) + submitting an order (Ts)
– Volume: 90% vs. 10%
– Weight: 20% vs. 80% (order)
– Order = 50 instructions + 10 memory accesses
Comparison (Lilja's book)
Factor | Analytical Modeling | Simulation | Measurement
Flexibility | High | High | Low
Cost | Low | Medium | High
Believability | Low | Medium | High
Accuracy | Low | Medium | High
System Models
• Physical model: represents the underlying hardware elements of a distributed system, abstracting away from specific details of the computer and networking technologies employed
• Architectural model: defines how the components of the system are placed, how they interact with one another, and how they are mapped onto the underlying network of computers
• Fundamental models:
– Interaction model: deals with communication among the components and with their timing and performance
– Failure model: specifies the faults that can occur and defines reliable communication and correct processes
– Security model: specifies possible threats and defines the concept of secure channels
Physical Model
• Represents the underlying hardware elements
[Figures; credit: http://www.krug-soft.com/ and http://cisco.com/]
Architectural Model
• Concerned with the placement of the system's parts and the relationships among them
• Example: client-server model, peer-to-peer model
• Abstracts the functions of the individual components
• Defines patterns for the distribution of data and workload
• Defines patterns of communication among the components
• Example: definition of server process, client process and peer process, and of the protocols for communication among processes; definition of the client/server model and its variations
Software and hardware service layers in distributed systems
[Figure: layers, top to bottom: Applications, services; Middleware; Operating system; Computer and network hardware. The operating system and hardware layers together are labeled Platform.]
Example of distributed weather monitoring systems (Architecture Model)
[Figure: components include the National Weather Service Web Site, Data Aggregator, RMI WeatherInfo Server, RMI WeatherInfo Client Application, Weather Web Service Server, Weather Web Service Web Client, Analytics, a relational database (MySQL) and a Weather Google Map Client, connected over a LAN via RMI, the IP socket API, HTTP and SOAP/REST XML; the connections are numbered 1 to 7]
Middleware
• A layer of software whose purpose is to mask heterogeneity and to provide a convenient programming model for application programmers.
• Middleware supports abstractions such as remote method invocation, group communication, event notification, replication of shared data, and real-time data streaming.
• Examples: Java RMI, grid software (Globus, Open Grid Services), Web services.
Clients invoke individual servers
[Figure: clients send invocations to servers and receive results; one server also invokes another server. Key: process, computer.]
EX: browser, web client
EX: Web server
EX: 1. File server, 2. Web crawler
A service provided by multiple servers
[Figure: clients access a service that is provided by several servers]
EX: Akamai (data duplication), now Amazon AWS (zones)
Web proxy server and caches
[Figure: clients reach web servers through proxy servers]
Proxy servers plus caches are used to provide increased availability and performance. They also play a major role in firewall-based security. http://www.interhack.net/pubs/fwfaq/
A distributed application based on peer processes
[Figure: three peer processes, each running application code together with coordination code]
Ex: distributed whiteboard application; music sharing
Web applets
a) Client request results in the downloading of applet code
[Figure: client, web server, applet code]
b) Client interacts with the applet
[Figure: client running the applet, web server]
EX: code streaming; mobile code
Interaction Models
• Within an address space (using paths as addresses)
• Socket-based communication: connection-oriented or connectionless
– A socket is an end-point of communication
– Let's look at some code + details
Socket based communication
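#include <arpa/inet.h>   /* inet_addr, htons */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, connect */

int sockfd;
struct sockaddr_in addr;

memset(&addr, 0, sizeof(addr));                     /* clear unused fields */
addr.sin_family = AF_INET;                          /* IPv4 */
addr.sin_addr.s_addr = inet_addr(SERV_HOST_ADDR);   /* dotted-quad server address */
addr.sin_port = htons(SERV_TCP_PORT);               /* port in network byte order */
sockfd = socket(AF_INET, SOCK_STREAM, 0);           /* connection-oriented (TCP) socket */
connect(sockfd, (struct sockaddr *) &addr, sizeof(addr));
do_stuff(stdin, sockfd);                            /* application logic over the connection */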
Classic view of network API
• Start with host name (maybe): foo.bar.com
• Get an IP address: gethostbyname() returns 10.5.4.3
• Make a socket (protocol, address): socket(); connect(); … gives sock_id
• Send byte stream (TCP) or packets (UDP) over the network
– TCP sock: bytes 1,2,3,4,5,6,7,8,9 … eventually arrive in order
– UDP sock: packets may or may not arrive
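The name-to-address and make-a-socket steps above can be sketched in C. This sketch swaps in getaddrinfo(), the newer POSIX interface, for the gethostbyname() call named on the slide; the helper names resolve_ipv4 and make_socket are hypothetical:

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve a host name (or numeric address) to dotted-quad IPv4 text.
 * Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, char *out, socklen_t out_len)
{
    struct addrinfo hints, *res;
    const struct sockaddr_in *sin;
    const char *text;

    assert(host != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only, as on the slide */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;
    sin = (const struct sockaddr_in *) res->ai_addr;  /* first result */
    text = inet_ntop(AF_INET, &sin->sin_addr, out, out_len);
    freeaddrinfo(res);
    return text ? 0 : -1;
}

/* Make a TCP (byte stream) or UDP (datagram) socket; returns the
 * descriptor, or -1 on failure. */
int make_socket(int use_tcp)
{
    return socket(AF_INET, use_tcp ? SOCK_STREAM : SOCK_DGRAM, 0);
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes to resolve_ipv4, then connect() the descriptor from make_socket(1) for TCP, or use sendto() on the descriptor from make_socket(0) for UDP.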
Protocol layering
• The communications stack consists of a set of layers, each providing a service to the layer above and using the services of the layer below
– Each service has a programming API, just like any software module
• Each service has to convey information to one or more peers across the network
• This information is contained in a header
– The headers are transmitted in the same order as the layered services
Protocol layering example
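[Figure: a browser process (HTTP, TCP, IP, Link1) is connected over Physical Link 1 to a router (Link1, IP, Link2), which is connected over Physical Link 2 to the web server process (HTTP, TCP, IP, link layer). H, T, I, L1 and L2 denote the HTTP, TCP, IP and link-layer headers carried by the packet.]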
Browser wants to request a page. Calls HTTP with the web address (URL). HTTP’s job is to convey the URL to the web server. HTTP learns the IP address of the web server, adds its header, and calls TCP.
[Packet so far: H]
TCP’s job is to work with server to make sure bytes arrive reliably and in order. TCP adds its header and calls IP. (Before that, TCP establishes a connection with its peer.)
[Packet so far: H T]
IP’s job is to get the packet routed to the peer through zero or more routers. IP determines the next hop from the destination IP address. IP adds its header and calls the link layer (e.g. Ethernet) with the next hop address.
[Packet so far: H T I]
The link’s job is to get the packet to the next physical box (here a router). It adds its header and sends the resulting packet over the “wire”.
[Packet so far: H T I L1]
The router’s link layer receives the packet, strips the link header, and hands the result to the IP forwarding process.
[Packet after the link header is stripped: H T I]
The router’s IP forwarding process looks at the destination IP address, determines what the next hop is, and hands the packet to the appropriate link layer with the appropriate next hop link address.
[Packet: H T I]
The packet goes over the link to the web server, after which each layer processes and strips its corresponding header.
[Packet on the second link: H T I L2; at the web server it is reduced to H T I, then H T, then H]
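The walk-through above, where each layer adds its header on the way down and strips it on the way up, can be mimicked on a text buffer. A minimal sketch in C, assuming short text tags such as "H|" stand in for real headers; the tags and helper names are hypothetical, and headers are written outermost first, the order in which they go on the wire:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PACKET 256

/* Prepend a header tag to the NUL-terminated packet held in a buffer of
 * cap bytes. Returns 0 on success, -1 if the result would not fit. */
int add_header(char *packet, size_t cap, const char *header)
{
    size_t hlen = strlen(header);
    size_t plen = strlen(packet);

    if (hlen + plen + 1 > cap)
        return -1;
    memmove(packet + hlen, packet, plen + 1);  /* shift payload right */
    memcpy(packet, header, hlen);              /* write header in front */
    return 0;
}

/* Strip a header tag from the front of the packet. Returns 0 on success,
 * -1 if the packet does not start with that header. */
int strip_header(char *packet, const char *header)
{
    size_t hlen = strlen(header);

    if (strncmp(packet, header, hlen) != 0)
        return -1;
    memmove(packet, packet + hlen, strlen(packet + hlen) + 1);
    return 0;
}
```

Starting from the payload "GET /" and adding "H|", "T|", "I|" and "L1|" in that order yields "L1|I|T|H|GET /"; the router then strips "L1|" and adds "L2|" before forwarding.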
Basic elements of any protocol header
• Demuxing field
– Indicates which is the next higher layer (or process, or context, etc.)
• Length field or header delimiter
– For the header, optionally for the whole packet
• Header format may be text (HTTP, SMTP (email)) or binary (IP, TCP, Ethernet)
Demuxing fields
• Ethernet: Protocol Number
– Indicates IPv4, IPv6 (old: AppleTalk, SNA, DECnet, etc.)
• IP: Protocol Number
– Indicates TCP, UDP, SCTP
• TCP and UDP: Port Number
– Well-known ports indicate FTP, SMTP, HTTP, SIP, many others
– Dynamically negotiated ports indicate specific processes (for these and other protocols)
• HTTP: Host field
– Indicates “virtual web server” within a physical web server
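Demultiplexing at each layer amounts to a lookup on one header field. A minimal sketch in C using the standard assigned values (EtherType 0x0800 for IPv4 and 0x86DD for IPv6; IP protocol numbers 6, 17 and 132; ports 21, 25, 80 and 5060); the function names are illustrative, and a real stack would dispatch to a handler instead of returning a name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Ethernet demux: the EtherType field selects the network-layer protocol. */
const char *ethertype_name(uint16_t ethertype)
{
    switch (ethertype) {
    case 0x0800: return "IPv4";
    case 0x86DD: return "IPv6";
    default:     return "unknown";
    }
}

/* IP demux: the protocol number selects the transport protocol. */
const char *ip_protocol_name(uint8_t protocol)
{
    switch (protocol) {
    case 6:   return "TCP";
    case 17:  return "UDP";
    case 132: return "SCTP";
    default:  return "unknown";
    }
}

/* TCP/UDP demux: a well-known port selects the application protocol. */
const char *well_known_port_name(uint16_t port)
{
    switch (port) {
    case 21:   return "FTP";
    case 25:   return "SMTP";
    case 80:   return "HTTP";
    case 5060: return "SIP";
    default:   return "unknown";
    }
}
```

A frame with EtherType 0x0800, IP protocol 6 and destination port 80 is therefore handed to IPv4, then TCP, then the HTTP server.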
IP (Internet Protocol)
• Three services:
– Unicast: transmits a packet to a specific host
– Multicast: transmits a packet to a group of hosts
– Anycast: transmits a packet to one of a group of hosts (typically the nearest)
• Destination and source are identified by IP address (32 bits for IPv4, 128 bits for IPv6)
• All services are unreliable
– A packet may be dropped, duplicated, or received in a different order
IP(v4) address format
• In binary, a 32-bit integer
• In text, this: “128.52.7.243”
– Each decimal number represents 8 bits (0 – 255)
• “Private” addresses are not globally unique:
– Used behind NAT boxes
– 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
• Multicast addresses start with 1110 as the first 4 bits (Class D address)
– 224.0.0.0/4
• Unicast and anycast addresses come from the same space
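The address ranges above can be checked with a prefix mask on the 32-bit integer form. A minimal sketch in C using inet_pton(); the helper names are illustrative:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Parse dotted-quad text into a 32-bit integer in host byte order.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr a;

    if (inet_pton(AF_INET, text, &a) != 1)
        return -1;
    *out = ntohl(a.s_addr);
    return 0;
}

/* True if the top `bits` bits of addr equal those of prefix (1 <= bits <= 32). */
int in_prefix(uint32_t addr, uint32_t prefix, int bits)
{
    uint32_t mask;

    assert(bits >= 1 && bits <= 32);
    mask = (uint32_t) (0xFFFFFFFFu << (32 - bits));
    return (addr & mask) == (prefix & mask);
}

/* Private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16. */
int is_private(uint32_t addr)
{
    return in_prefix(addr, 0x0A000000u, 8)
        || in_prefix(addr, 0xAC100000u, 12)
        || in_prefix(addr, 0xC0A80000u, 16);
}

/* Multicast: 224.0.0.0/4, i.e. the first four bits are 1110. */
int is_multicast(uint32_t addr)
{
    return in_prefix(addr, 0xE0000000u, 4);
}
```

The slide's example address 128.52.7.243 parses to 0x803407F3 and falls in neither range, while 10.5.4.3 from the socket slides is private.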
UDP (User Datagram Protocol)
• Runs above IP
• Same unreliable service as IP
– Packets can get lost anywhere:
  • Outgoing buffer at source
  • Router or link
  • Incoming buffer at destination
• But adds port numbers
– Used to identify “application layer” protocols or processes
• Also adds a checksum (optional)
TCP (Transmission Control Protocol)
• Runs above IP
– Port number and checksum like UDP
• Service is an in-order byte stream
– The application does not know exactly how the bytes are packaged into packets
• Flow control and congestion control
• Connection setup and teardown phases
• There can be considerable delay between bytes in at the source and bytes out at the destination
– Because of timeouts and retransmissions
• Works only with unicast (not multicast or anycast)
UDP vs. TCP
• UDP is more real-time
– Packet is sent or dropped, but is not delayed
• UDP has more of a “message” flavor
– One packet = one message
– But must add reliability mechanisms over it
• TCP is great for transferring a file or a bunch of email, but kind of frustrating for messaging
– Interrupts to application don’t conform to message boundaries
– No “Application Layer Framing”
• TCP is vulnerable to DoS (Denial of Service) attacks, because the initial packet consumes resources at the receiver
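Because TCP delivers a byte stream without message boundaries, applications that want messages usually add their own framing. One common approach, sketched here in C as an assumption rather than anything prescribed by the lecture, is a 2-byte big-endian length prefix in front of each message; the function names are hypothetical and the buffer stands in for bytes read from a TCP socket:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append [2-byte length][message] to buf at offset *used.
 * Returns 0 on success, -1 if the frame does not fit in cap bytes. */
int frame_append(uint8_t *buf, size_t cap, size_t *used,
                 const void *msg, uint16_t len)
{
    if (*used + 2u + len > cap)
        return -1;
    buf[*used] = (uint8_t) (len >> 8);        /* length, high byte */
    buf[*used + 1] = (uint8_t) (len & 0xFF);  /* length, low byte */
    memcpy(buf + *used + 2, msg, len);
    *used += 2u + len;
    return 0;
}

/* Try to extract one complete message from the first avail bytes of the
 * stream. Returns the number of bytes consumed, or 0 if the message is
 * not complete yet (or does not fit in out). */
size_t frame_extract(const uint8_t *stream, size_t avail,
                     uint8_t *out, size_t out_cap, uint16_t *out_len)
{
    uint16_t len;

    if (avail < 2)
        return 0;                             /* length prefix incomplete */
    len = (uint16_t) ((stream[0] << 8) | stream[1]);
    if (avail < 2u + len || len > out_cap)
        return 0;                             /* body incomplete or too big */
    memcpy(out, stream + 2, len);
    *out_len = len;
    return 2u + (size_t) len;
}
```

The receiver keeps calling frame_extract as bytes arrive; a return of 0 means "wait for more data", which is how the message boundaries that TCP does not preserve are recovered.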
Instructor’s Guide for Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, Edn. 4 © Pearson Education 2005
Figure 2.8 Real-time ordering of events
[Figure: processes X, Y, Z and A exchange messages m1, m2 and m3 through send and receive events, drawn against a physical time axis marked t1, t2, t3]
Figure 2.9 Processes and channels
[Figure: process p sends message m from its outgoing message buffer over a communication channel to the incoming message buffer of process q, which receives it]
Failure Model: Omission and arbitrary failures
Class of failure | Affects | Description
Fail-stop | Process | Process halts and remains halted. Other processes may detect this state.
Crash | Process | Process halts and remains halted. Other processes may not be able to detect this state.
Omission | Channel | A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer.
Send-omission | Process | A process completes a send, but the message is not put in its outgoing message buffer.
Receive-omission | Process | A message is put in a process’s incoming message buffer, but that process does not receive it.
Arbitrary (Byzantine) | Process or channel | Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.
Figure 2.11 Timing failures
Class of failure | Affects | Description
Clock | Process | Process’s local clock exceeds the bounds on its rate of drift from real time.
Performance | Process | Process exceeds the bounds on the interval between two steps.
Performance | Channel | A message’s transmission takes longer than the stated bound.
Dependability Modeling
• Includes reliability modeling and availability modeling
• Can show that a designed system meets its performance and dependability requirements
• Provides a good mechanism for examining the behavior of a system, from the design stage through implementation and final deployment
Dependability
• Two measures
– Reliability (MTTF)
– Availability (ratio of uptime to total time)
Reliability
• Definition: The reliability R(t) of a system at time t is the probability that no system failure has occurred in the interval [0,t). If X is a random variable that represents the time to occurrence of system failure, then R(t) = P(X > t).
• Unreliability = 1 − R(t)
Reliability
• Definition: The MTTF (mean time to failure) of a system is the expected time until the occurrence of the (first) system failure. If X is a random variable that represents the time to occurrence of system failure, then MTTF = E[X].
• Given the system reliability R(t), the MTTF can be computed as
$\mathrm{MTTF} = \int_0^{\infty} R(t)\,dt$
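As a worked instance of the integral, assume a constant failure rate λ, so that the time to failure X is exponentially distributed. This distribution is an assumption made for illustration; the lecture does not fix one.

```latex
R(t) = P(X > t) = e^{-\lambda t},
\qquad
\mathrm{MTTF} = \int_0^{\infty} e^{-\lambda t}\,dt
             = \left[-\frac{1}{\lambda}\,e^{-\lambda t}\right]_0^{\infty}
             = \frac{1}{\lambda}
```

For example, with λ = 0.001 failures per hour the MTTF is 1000 hours, and the probability of surviving that long is R(1000) = e^{-1} ≈ 0.368.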
Availability
• A measurement representing the ratio of uptime to total time
• High availability: the ability of a system to perform its function continuously (without interruption) for a significantly longer period of time than the reliabilities of its individual components would suggest.
• High availability is most often achieved through fault tolerance.
Degree of Availability
System Type | Unavailability (minutes/year) | Availability (percent) | Availability Class
Unmanaged | 50,000 | 90 | 1
Managed | 5,000 | 99 | 2
Well-managed | 500 | 99.9 | 3
Fault-tolerant | 50 | 99.99 | 4
High Availability | 5 | 99.999 | 5
Very High Availability | 0.5 | 99.9999 | 6
Ultra Availability | 0.05 | 99.99999 | 7
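The unavailability column follows from the availability percentage. A minimal sketch in C, assuming a 365-day year (525,600 minutes); the table's figures are rounded, for instance 90% works out to 52,560 minutes rather than 50,000. The function name is illustrative:

```c
#include <assert.h>

#define MINUTES_PER_YEAR (365.0 * 24.0 * 60.0)  /* 525,600; assumes no leap day */

/* Expected downtime in minutes per year for a given availability in percent. */
double downtime_minutes_per_year(double availability_percent)
{
    assert(availability_percent >= 0.0 && availability_percent <= 100.0);
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_YEAR;
}
```

Five nines (99.999%) gives about 5.26 minutes of downtime per year, matching the "High Availability" row.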
Availability
• Definition: The availability A(t) of a system at time t is the probability that the system is functioning correctly at time t.
• As with reliability, in some applications it is more convenient to compute the system unavailability U(t) = 1 − A(t).
• Availability = MTTF / (MTTF + MTTR), where MTTR is the mean time to repair
• Steady-state availability: $A_{\text{steady}} = \lim_{t \to \infty} A(t)$
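The ratio above is straightforward to compute. A minimal sketch in C, assuming MTTF and MTTR are given in the same time unit; the function name is illustrative:

```c
#include <assert.h>

/* Availability = MTTF / (MTTF + MTTR), both in the same time unit. */
double availability_from_mttf(double mttf, double mttr)
{
    assert(mttf >= 0.0 && mttr >= 0.0 && mttf + mttr > 0.0);
    return mttf / (mttf + mttr);
}
```

A server that fails on average every 999 hours and takes 1 hour to repair is available 99.9% of the time; halving the repair time improves availability as much as doubling the MTTF.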
Modeling Techniques
• Non-state-space
– Fault tree
– Reliability block diagram
• State-space
– Continuous-time Markov chain
– Stochastic Petri net
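For the non-state-space techniques, a reliability block diagram combines component availabilities with two rules: blocks in series must all be up, while redundant blocks in parallel fail only if every one is down. A minimal sketch in C, assuming components fail independently; the function names are illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* Series: the system is up only if every component is up. */
double series_availability(const double *a, size_t n)
{
    double up = 1.0;
    size_t i;

    for (i = 0; i < n; i++) {
        assert(a[i] >= 0.0 && a[i] <= 1.0);
        up *= a[i];
    }
    return up;
}

/* Parallel (redundant): the system is down only if every component is down. */
double parallel_availability(const double *a, size_t n)
{
    double down = 1.0;
    size_t i;

    for (i = 0; i < n; i++) {
        assert(a[i] >= 0.0 && a[i] <= 1.0);
        down *= 1.0 - a[i];
    }
    return 1.0 - down;
}
```

Two redundant servers that are each 99% available give 99.99% in parallel; putting that pair in series with a 99.9% switch brings the total back down to roughly 99.89%. State-space models are needed when failures and repairs are not independent, for example with failover delays.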
Example of system
Fault Tree
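Availability Model
[Figure: timeline for server S1 with periods labeled "Server up" and "Server down & repair"]
Availability model: HA-OSCAR dual head model
[Figure: timelines over time for S1, S2 and the combined S1&S2]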
HA-OSCAR SRN model
• Server sub-model
• Switches
• Compute nodes
Server Sub Model
• P Server up
• P Server down
• Failover
• P server repair
• Failback
• S is up and ready
• S takes control
• S Server down
• S repair
Compute node sub model
Switch sub model
Class discussion/Exercise
• Say we have to design and develop a disaster warning system that has interfaces to multiple systems and performs event analysis for possible disasters/dangers
• High-level requirements
– Open interface
– Scalable to many subscribers for event notification
– 24/7 availability
Summary
• When designing or analyzing systems, you want to examine the architectural model at a high level.
• Subsequent steps explore the fundamental models, such as the interaction model, security model, failure model and reliability model.
Case study in Cloud-based EKG system