
Distributed System: Lecture 2

Box Leangsuksun SWECO Endowed Professor, Computer Science Louisiana Tech University box@latech.edu

CTO, PB Tech International Inc. naibox@gmail.com

3/31/14 2

System Models

Based on Professor Paul Francis's notes, Cornell University


Models

•  A simplified representation of a system or phenomenon

•  Provides an abstract, simplified but consistent description of a relevant aspect of a distributed system
–  Mathematical representation
–  Graphical notations

Why do we need modeling?

•  To study physical systems without actually building them

•  Helps produce a better design
•  Helps in understanding important aspects such as
–  Performance
–  Reliability
–  Not to mention confirming functionality

Why? Goals

•  Compare alternatives
•  Determine impacts (per feature)
•  System tuning
•  Quantify relative reliability/availability/performance
•  Debugging
•  Set expectations

How to measure or estimate

•  Measurements
•  Simulations
•  Analytical modeling

Measurements

•  Actual system construction
•  Create a workload per requirements
•  Provides the best results
•  Inherently difficult and inflexible
•  Almost impossible for what-if analysis

Measurements (continued)

•  Measure system or subsystem performance with tools
–  gprof
–  top, ps, etc.
–  Benchmark programs (e.g. LINPACK, SPECmark, Winmark)
–  PAPI, perfctr, perfmon, PerfSuite

•  What about reliability measurement? Logs, traces, outages.

Simulation

•  A program to simulate important characteristics of targeted systems

•  Flexible and easy to modify
•  Good for what-if analysis
•  Difficult to model every small detail
•  Popular: cost-effective and flexible
•  Suffers from the difficulty of capturing details

Analytical Modeling

•  Mathematical description of the system
•  Provides quick insight
–  Helps guide detailed simulation or measurement-based studies

•  Results are much less believable or accurate

Performance

•  Computation
–  CPU
–  Memory
–  I/O, etc.

•  Communication
–  Latency
–  Bandwidth

•  Transaction
–  Possibly more involvement than just the DB

Some Criteria

•  Throughput – # of completed requests per time unit

•  Response time – amount of time it takes from when a request was submitted until the first response is produced, not output

•  CPU utilization – keep the CPU as busy as possible

•  Turnaround time – amount of time to execute a particular request (finishing time – arrival time)
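The criteria above reduce to simple arithmetic on timestamps. The following is a minimal C sketch, assuming each request carries three timestamps in the same time unit; the struct and function names are illustrative and do not come from the slides.

```c
#include <stddef.h>

/* One completed request; all times are in the same unit (e.g. milliseconds).
 * This layout is an assumption made for the example. */
struct request {
    double arrival;        /* when the request was submitted */
    double first_response; /* when the first response was produced */
    double finish;         /* when the request completed */
};

/* Response time: from submission until the first response is produced. */
double response_time(const struct request *r) {
    return r->first_response - r->arrival;
}

/* Turnaround time: finishing time minus arrival time. */
double turnaround_time(const struct request *r) {
    return r->finish - r->arrival;
}

/* Throughput: completed requests per time unit over an observation window.
 * Returns 0 for an empty or invalid window. */
double throughput(size_t completed, double window) {
    return window > 0.0 ? (double)completed / window : 0.0;
}
```

For a request that arrives at time 10, produces its first response at 12 and finishes at 25, the response time is 2 and the turnaround time is 15.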

Stop Mar 25

•  Important announcement: Midterm Exam April 10, 2014.

Performance issue discovery phase

[Figure: project timeline from 1/19/2004 to 3/19/2004 (with 2/1/2004 and 3/1/2004 marked) across the phases Requirement, Architecture/design, Development/code and Test, followed by re-design, code, re-test]

Telecom industry architecture review: 1/3 of the issues were related to performance

Simple example of effective memory access time

•  Example
–  H = cache hit probability
–  Tm = memory access time
–  Tc = cache access time

•  What is the effective memory access time?

[Figure: CPU, cache, memory]
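One common way to answer the question is a weighted average of the two access times. The sketch below assumes the simplest model, in which a hit costs Tc and a miss costs Tm only; the slide does not state which model the lecture used, so treat the formula and the function name as assumptions. If a miss also pays for the failed cache lookup, the miss term becomes Tc + Tm instead.

```c
/* Effective memory access time under a simple two-level model (assumed):
 *   T_eff = H * Tc + (1 - H) * Tm
 * h  : cache hit probability, between 0 and 1
 * tc : cache access time
 * tm : memory access time (same unit as tc) */
double effective_access_time(double h, double tc, double tm) {
    return h * tc + (1.0 - h) * tm;
}
```

With H = 0.9, Tc = 1 ns and Tm = 100 ns, this model gives roughly 10.9 ns, which shows how strongly even a 10% miss rate dominates the average.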

Example of modeling problem in DS

•  Operation/transaction modeling for an e-commerce system
–  Browsing order Tb + submitting order Ts
–  90% vs. 10% (volume)
–  Weight 20% vs. 80% order
–  Order = 50 instructions + 10 mem

Comparison (Lilja's book)

Factor        | Analytical Modeling | Simulation | Measurement
Flexibility   | High                | High       | Low
Cost          | Low                 | Medium     | High
Believability | Low                 | Medium     | High
Accuracy      | Low                 | Medium     | High

System Models

•  Physical model: represents the underlying hardware elements of a distributed system and abstracts away from specific details of the computer and networking technologies employed

•  Architectural model: defines the way in which the components of the system are placed, how they interact with one another, and the way in which they are mapped onto the underlying network of computers

•  Fundamental models:
–  Interaction model deals with communication details among the components and their timing and performance details.
–  Failure model gives a specification of faults and defines reliable communication and correct processes.
–  Security model specifies possible threats and defines the concept of secure channels.

Physical Model

•  Represents underlying hardware elements

[Figure: hardware images. Credit: http://www.krug-soft.com/ Credit: http://cisco.com/]

Architectural Model

•  Concerned with the placement of its parts and the relationships among them.
•  Example: client-server model, peer-to-peer model
•  Abstracts the functions of the individual components.
•  Defines patterns for distribution of data and workload.
•  Defines patterns of communication among the components.
•  Example: definition of server process, client process and peer process and protocols for communication among processes; definition of the client/server model and its variations.

Software and hardware service layers in distributed systems

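[Figure: layered stack, top to bottom]

•  Applications, services
•  Middleware
•  Operating system
•  Computer and network hardware

The operating system and the computer and network hardware together form the platform.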

Example of distributed weather monitoring systems (Architecture Model)

[Figure: architecture diagram with elements numbered 1 to 7. Labels: National Weather Service Web Site; Data Aggregator; RMI WeatherInfo Server; RMI WeatherInfo Client Application; Weather Web Service Web Client; Analytics; Weather Web Service Server; Relational Database (MySQL); Weather Google Map Client. Connections: RMI, IP Socket API, HTTP, HTTP SOAP/REST XML, LAN.]

Middleware

•  Layer of software whose purpose is to mask the heterogeneity and to provide a convenient programming model for application programmers.

•  Middleware supports such abstractions as remote method invocation, group communications, event notification, replication of shared data, real-time data streaming.

•  Examples: Java RMI, grid software (Globus, Open Grid Services), Web services.

Clients invoke individual servers

[Figure: clients send invocations to servers and receive results; a server can itself invoke another server. Key: process, computer.]

•  EX: browser, web client
•  EX: Web server
•  EX: 1. File server, 2. Web crawler

A service provided by multiple servers

[Figure: clients access a service that is implemented by several servers]

EX: Akamai (data duplication), now Amazon AWS (zones)

Web proxy server and caches

[Figure: clients reach web servers through proxy servers]

Proxy servers + cache are used to provide increased availability and performance. They also play a major role in firewall-based security. http://www.interhack.net/pubs/fwfaq/

A distributed application based on peer processes

[Figure: three peer processes, each made up of application code and coordination code]

Ex: distributed whiteboard application; music sharing

Web applets

a) Client request results in the downloading of applet code

[Figure: client, web server, applet code]

b) Client interacts with the applet

[Figure: client, applet, web server]

EX: Code streaming; mobile code

Interaction Models

•  Within address space (using path as addresses)

•  Socket-based communication: connection-oriented, connectionless
–  A socket is an end-point of communication
–  Let's look at some code + details

Socket based communication

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int sockfd;
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = inet_addr(SERV_HOST_ADDR);
addr.sin_port = htons(SERV_TCP_PORT);      /* network byte order */
sockfd = socket(AF_INET, SOCK_STREAM, 0);  /* TCP socket */
connect(sockfd, (struct sockaddr *) &addr, sizeof(addr));
do_stuff(stdin, sockfd);

Classic view of network API

•  Start with host name (maybe), e.g. foo.bar.com
•  Get an IP address: gethostbyname() returns e.g. 10.5.4.3
•  Make a socket (protocol, address): socket(); connect(); … yield a sock_id
•  Send byte stream (TCP) or packets (UDP) through the network
–  TCP sock: bytes 1,2,3,4,5,6,7,8,9 … eventually arrive in order
–  UDP sock: packets may or may not arrive
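The name-to-address step of this classic view can be sketched in C as follows. The slide names gethostbyname(); this sketch swaps in the newer POSIX call getaddrinfo(), which has largely replaced it. The helper name resolve_ipv4 is made up for the illustration, and the code is a minimal sketch limited to IPv4, not a full client.

```c
/* Ask for the POSIX declarations when compiling in a strict C mode. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <string.h>

/* Resolve host (a name such as "foo.bar.com" or a dotted-decimal string)
 * to its first IPv4 address, written as text into out.
 * Returns 0 on success, -1 on failure. Hypothetical helper. */
int resolve_ipv4(const char *host, char *out, size_t outlen) {
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, as in the slide */
    hints.ai_socktype = SOCK_STREAM;  /* we intend to use TCP */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    /* With ai_family = AF_INET every result is a struct sockaddr_in. */
    struct sockaddr_in *sin = (struct sockaddr_in *)res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen);

    freeaddrinfo(res);
    return p != NULL ? 0 : -1;
}
```

The resulting address is what a client would place in a struct sockaddr_in before calling socket() and connect(). A 16-byte output buffer is large enough for any dotted-decimal IPv4 address.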

Protocol layering

•  Communications stack consists of a set of services, each providing a service to the layer above, and using services of the layer below
–  Each service has a programming API, just like any software module
•  Each service has to convey information to one or more peers across the network
•  This information is contained in a header
–  The headers are transmitted in the same order as the layered services

Protocol layering example

[Figure: the browser process runs HTTP over TCP over IP over Link1; a router runs IP over Link1 and Link2; the web server process runs HTTP over TCP over IP over its link layer. Physical Link 1 joins the browser host to the router and Physical Link 2 joins the router to the web server host. H, T, I, L1 and L2 mark the HTTP, TCP, IP and link headers carried by the packet at each step.]

Browser wants to request a page. Calls HTTP with the web address (URL). HTTP’s job is to convey the URL to the web server. HTTP learns the IP address of the web server, adds its header, and calls TCP. [Headers on the packet: H]

TCP’s job is to work with the server to make sure bytes arrive reliably and in order. TCP adds its header and calls IP. (Before that, TCP establishes a connection with its peer.) [Headers on the packet: H T]

IP’s job is to get the packet routed to the peer through zero or more routers. IP determines the next hop from the destination IP address. IP adds its header and calls the link layer (e.g. Ethernet) with the next hop address. [Headers on the packet: H T I]

The link’s job is to get the packet to the next physical box (here a router). It adds its header and sends the resulting packet over the “wire”. [Headers on the packet: H T I L1]

The router’s link layer receives the packet, strips the link header, and hands the result to the IP forwarding process. [Headers on the packet: H T I]

The router’s IP forwarding process looks at the destination IP address, determines what the next hop is, and hands the packet to the appropriate link layer with the appropriate next hop link address. [Headers on the packet: H T I]

The packet goes over the link to the web server, after which each layer processes and strips its corresponding header. [Headers on the packet: H T I L2, then H T I, then H T, then H]
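The add-a-header and strip-a-header steps of this walk-through can be mimicked with a toy buffer. The C sketch below uses short strings as stand-ins for real headers; push_header and pop_header are illustrative names, not part of any real protocol stack. It places each lower layer's header at the front of the buffer, so the link header ends up outermost.

```c
#include <string.h>

/* Prepend the string hdr to the len-byte packet held in buf (capacity cap).
 * Returns the new packet length, or 0 if the result would not fit. */
size_t push_header(char *buf, size_t len, size_t cap, const char *hdr) {
    size_t hlen = strlen(hdr);
    if (len > cap || hlen > cap - len)
        return 0;
    memmove(buf + hlen, buf, len);  /* shift the payload to make room */
    memcpy(buf, hdr, hlen);         /* write the header in front */
    return len + hlen;
}

/* Strip an hlen-byte header from the front of the packet.
 * Returns the new length; a packet shorter than hlen is left untouched. */
size_t pop_header(char *buf, size_t len, size_t hlen) {
    if (hlen > len)
        return len;
    memmove(buf, buf + hlen, len - hlen);
    return len - hlen;
}
```

Starting from the payload "URL", pushing "H", "T", "I" and "L1" yields "L1ITHURL" on the first link. The router pops the 2-byte link header, pushes "L2", and the web server then pops each header in turn until only "URL" remains.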

Basic elements of any protocol header

•  Demuxing field
–  Indicates which is the next higher layer (or process, or context, etc.)
•  Length field or header delimiter
–  For the header, optionally for the whole packet
•  Header format may be text (HTTP, SMTP (email)) or binary (IP, TCP, Ethernet)

Demuxing fields

•  Ethernet: Protocol Number
–  Indicates IPv4, IPv6, (old: AppleTalk, SNA, DECnet, etc.)

•  IP: Protocol Number
–  Indicates TCP, UDP, SCTP

•  TCP and UDP: Port Number
–  Well known ports indicate FTP, SMTP, HTTP, SIP, many others
–  Dynamically negotiated ports indicate specific processes (for these and other protocols)

•  HTTP: Host field
–  Indicates “virtual web server” within a physical web server
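Each demuxing field above is, in effect, a lookup key that selects the next handler. A minimal C sketch of three such lookups follows; the function names are invented for the example, while the numeric values are the standard assigned ones (EtherType 0x0800 and 0x86DD, IP protocol numbers 6, 17 and 132, and the well-known ports shown).

```c
#include <stdint.h>

/* Ethernet demux: EtherType -> network-layer protocol. */
const char *ethertype_name(uint16_t ethertype) {
    switch (ethertype) {
    case 0x0800: return "IPv4";
    case 0x86DD: return "IPv6";
    default:     return "unknown";
    }
}

/* IP demux: protocol number -> transport protocol. */
const char *ip_protocol_name(uint8_t proto) {
    switch (proto) {
    case 6:   return "TCP";
    case 17:  return "UDP";
    case 132: return "SCTP";
    default:  return "unknown";
    }
}

/* TCP/UDP demux: a few well-known ports -> application protocol. */
const char *well_known_port_name(uint16_t port) {
    switch (port) {
    case 21:   return "FTP";
    case 25:   return "SMTP";
    case 80:   return "HTTP";
    case 5060: return "SIP";
    default:   return "unknown";
    }
}
```

A real stack dispatches to handler functions rather than returning names, but the structure is the same: one field per layer chooses who processes the payload next.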

IP (Internet Protocol)

•  Three services:
–  Unicast: transmits a packet to a specific host
–  Multicast: transmits a packet to a group of hosts
–  Anycast: transmits a packet to one of a group of hosts (typically nearest)
•  Destination and source identified by the IP address (32 bits for IPv4, 128 bits for IPv6)
•  All services are unreliable
–  Packet may be dropped, duplicated, and received in a different order

IP(v4) address format

•  In binary, a 32-bit integer
•  In text, this: “128.52.7.243”
–  Each decimal number represents 8 bits (0 – 255)
•  “Private” addresses are not globally unique:
–  Used behind NAT boxes
–  10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

•  Multicast addresses start with 1110 as the first 4 bits (Class D address)
–  224.0.0.0/4

•  Unicast and anycast addresses come from the same space
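The text-to-integer mapping and the prefix ranges listed above can be checked with a few lines of C. This is a sketch under the assumption of a POSIX system (it relies on inet_pton() and ntohl()); the helper names parse_ipv4, in_prefix, is_private and is_multicast are illustrative.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Parse dotted-decimal text into a 32-bit integer in host byte order.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int parse_ipv4(const char *text, uint32_t *out) {
    struct in_addr a;
    if (inet_pton(AF_INET, text, &a) != 1)
        return -1;
    *out = ntohl(a.s_addr);
    return 0;
}

/* True if addr falls inside prefix/len (both in host byte order). */
int in_prefix(uint32_t addr, uint32_t prefix, int len) {
    uint32_t mask = (len == 0) ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - len));
    return (addr & mask) == (prefix & mask);
}

/* Private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16. */
int is_private(uint32_t addr) {
    return in_prefix(addr, 0x0A000000u, 8)
        || in_prefix(addr, 0xAC100000u, 12)
        || in_prefix(addr, 0xC0A80000u, 16);
}

/* Multicast: first four bits are 1110, i.e. 224.0.0.0/4. */
int is_multicast(uint32_t addr) {
    return in_prefix(addr, 0xE0000000u, 4);
}
```

For the slide's example, "128.52.7.243" parses to 0x803407F3: each of the four decimal numbers fills one byte of the 32-bit value.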

UDP (User Datagram Protocol)

•  Runs above IP
•  Same unreliable service as IP
–  Packets can get lost anywhere:
   •  Outgoing buffer at source
   •  Router or link
   •  Incoming buffer at destination

•  But adds port numbers
–  Used to identify “application layer” protocols or processes

•  Also a checksum, optional

TCP (Transmission Control Protocol)

•  Runs above IP
–  Port number and checksum like UDP

•  Service is in-order byte stream
–  Application does not absolutely know how the bytes are packaged in packets
•  Flow control and congestion control
•  Connection setup and teardown phases
•  Can be considerable delay between bytes in at source and bytes out at destination
–  Because of timeouts and retransmissions

•  Works only with unicast (not multicast or anycast)

UDP vs. TCP

•  UDP is more real-time
–  Packet is sent or dropped, but is not delayed

•  UDP has more of a “message” flavor
–  One packet = one message
–  But must add reliability mechanisms over it

•  TCP is great for transferring a file or a bunch of email, but kind of frustrating for messaging
–  Interrupts to application don’t conform to message boundaries
–  No “Application Layer Framing”

•  TCP is vulnerable to DoS (Denial of Service) attacks, because the initial packet consumes resources at the receiver
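Because a TCP byte stream does not preserve message boundaries, applications that want messages usually add their own framing on top. One common approach, shown here as a hedged sketch rather than anything the slides prescribe, is a length prefix: each message is preceded by a 2-byte big-endian length. The names frame_encode and frame_decode are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Write one frame into out: a 2-byte big-endian length, then the message.
 * Returns the number of bytes written, or 0 if the frame does not fit. */
size_t frame_encode(uint8_t *out, size_t cap, const void *msg, uint16_t len) {
    if (cap < 2u + len)
        return 0;
    out[0] = (uint8_t)(len >> 8);
    out[1] = (uint8_t)(len & 0xFF);
    memcpy(out + 2, msg, len);
    return 2u + len;
}

/* Try to extract one complete message from the first n received bytes.
 * On success copies it into msg, stores its length in *len, and returns the
 * number of stream bytes consumed. Returns 0 if more bytes are needed or if
 * msg (capacity msgcap) is too small. */
size_t frame_decode(const uint8_t *in, size_t n,
                    void *msg, size_t msgcap, uint16_t *len) {
    if (n < 2)
        return 0;
    uint16_t l = (uint16_t)((in[0] << 8) | in[1]);
    if (n < 2u + l || msgcap < l)
        return 0;
    memcpy(msg, in + 2, l);
    *len = l;
    return 2u + l;
}
```

A receiver would append whatever recv() returns to a buffer and call frame_decode in a loop, so two messages delivered in one read, or one message split across reads, are both handled.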

Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 4 © Pearson Education 2005

Figure 2.8 Real-time ordering of events

[Figure: processes X, Y, Z and A exchange messages m1, m2 and m3 along a physical time axis; send and receive events are marked, along with times t1, t2 and t3]

Figure 2.9 Processes and channels

[Figure: process p sends message m from its outgoing message buffer over a communication channel to the incoming message buffer of process q, which receives it]

Failure Model: Omission and arbitrary failures

Class of failure | Affects | Description
Fail-stop | Process | Process halts and remains halted. Other processes may detect this state.
Crash | Process | Process halts and remains halted. Other processes may not be able to detect this state.
Omission | Channel | A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer.
Send-omission | Process | A process completes a send, but the message is not put in its outgoing message buffer.
Receive-omission | Process | A message is put in a process’s incoming message buffer, but that process does not receive it.
Arbitrary (Byzantine) | Process or channel | Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.

Figure 2.11 Timing failures

Class of failure | Affects | Description
Clock | Process | Process’s local clock exceeds the bounds on its rate of drift from real time.
Performance | Process | Process exceeds the bounds on the interval between two steps.
Performance | Channel | A message’s transmission takes longer than the stated bound.

Dependability Modeling

•  Includes reliability modeling and availability modeling
•  A designed system can be shown to meet performance and dependability requirements.
•  Provides a good mechanism for examining the behavior of a system, right from the design stage to implementation and final deployment.

Dependability

•  Two measures
–  Reliability (MTTF)
–  Availability (ratio of uptime/total)

Reliability

•  Definition: The reliability R(t) of a system at time t is the probability that the system failure has not occurred in the interval [0,t). If X is a random variable that represents the time to occurrence of system failure, then R(t)=P(X>t).

•  unreliability = 1-R(t)

Reliability

•  Definition: The MTTF of a system is the expected time until the occurrence of the (first) system failure. If X is a random variable that represents the time to occurrence of system failure, then MTTF = E[X].

•  Given the system reliability R(t), the MTTF can be computed as

$$ \mathrm{MTTF} = \int_0^{\infty} R(t)\,dt $$
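As a worked instance of this integral, assume the time to failure is exponentially distributed with a constant failure rate λ. This is a common modeling assumption, not something the slides state. Then R(t) = P(X > t) = e^{-λt}, and the MTTF follows directly:

```latex
\mathrm{MTTF} = \int_0^{\infty} R(t)\,dt
              = \int_0^{\infty} e^{-\lambda t}\,dt
              = \left[ -\frac{1}{\lambda} e^{-\lambda t} \right]_0^{\infty}
              = \frac{1}{\lambda}
```

For example, under this assumption a failure rate of λ = 0.001 failures per hour corresponds to an MTTF of 1000 hours.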

Availability

•  A measure expressed as the ratio of uptime to total time

•  High availability - ability of a system to perform its function continuously (without interruption) for a significantly longer period of time than the reliabilities of its individual components would suggest.

•  High availability is most often achieved through fault tolerance.

Degree of Availability

System Type | Unavailability (minutes/year) | Availability (in percent) | Availability Class
Unmanaged | 50,000 | 90 | 1
Managed | 5,000 | 99 | 2
Well-managed | 500 | 99.9 | 3
Fault-tolerant | 50 | 99.99 | 4
High Availability | 5 | 99.999 | 5
Very High Availability | 0.5 | 99.9999 | 6
Ultra Availability | 0.05 | 99.99999 | 7

Availability

•  Definition: Availability A(t) of a system at time t is the probability that the system is functioning correctly at time t.

•  Like the reliability measure, in some applications it is better to compute the system unavailability U(t) = 1 -A(t).

•  Availability = MTTF / (MTTF + MTTR)
•  Steady-state availability: A_steady = lim A(t) as t → ∞
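The availability ratio and the minutes-per-year figures used to classify systems are easy to compute. The C sketch below assumes MTTF and MTTR are given in the same unit, that their sum is positive, and that a year has 365 days; the function names are illustrative.

```c
/* Availability from mean time to failure and mean time to repair.
 * Both arguments use the same unit; mttf + mttr must be positive. */
double availability(double mttf, double mttr) {
    return mttf / (mttf + mttr);
}

/* Expected downtime in minutes per 365-day year for availability a (0..1). */
double downtime_minutes_per_year(double a) {
    return (1.0 - a) * 365.0 * 24.0 * 60.0;
}
```

With MTTF = 99 hours and MTTR = 1 hour the availability is 0.99, which corresponds to about 5,256 minutes of downtime per year. An availability of 99.999% corresponds to about 5.3 minutes per year.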

Modeling Techniques

•  Non-state-space
–  Fault tree
–  Reliability block diagram

•  State-space
–  Continuous-time Markov chain
–  Stochastic Petri net

Example of system

[Figure: example system]

Fault Tree

[Figure: fault tree]

Availability Model

[Figure: timeline of server states (server up; server down and repair) for servers S1 and S2 over time]

HA-OSCAR dual head model

[Figure: dual-head model, S1 & S2]

HA-OSCAR SRN model

•  Server sub-model
•  Switches
•  Compute nodes

Server Sub Model

•  P server up
•  P server down
•  Failover
•  P server repair
•  Failback

•  S is up and ready
•  S takes control
•  S server down
•  S repair

Compute node sub model

[Figure: compute node sub-model]

Switch sub model

[Figure: switch sub-model]

Class discussion/Exercise

•  Say we have to design and develop a disaster warning system that has interfaces to multiple systems and performs event analysis for possible disasters/dangers

•  High-level requirements
–  Open interface
–  Scalable for many subscribers for event notification
–  24/7 availability

Summary

•  When designing systems or analyzing systems, you want to examine at the high level the architectural model.

•  Subsequent steps will explore fundamental models such as interaction model, security model, failure model, reliability model etc.

Case study in Cloud-based EKG system
