
Distributed System: Lecture 2

Box Leangsuksun
SWECO Endowed Professor, Computer Science, Louisiana Tech University
box@latech.edu

CTO, PB Tech International Inc.
naibox@gmail.com

3/31/14 2

System Models

based on Professor Paul Francis notes, Cornell University


Models

•  A model is a simplified representation of a system or phenomenon.
•  It provides an abstract, simplified but consistent description of a relevant aspect of a distributed system, using:
–  Mathematical representation
–  Graphical notations

Why do we need modeling?

•  To study physical systems without actually building them.
•  To help produce a better design.
•  To understand important aspects such as:
–  Performance
–  Reliability
–  Functionality (confirming that the system does what it should)


Why? Goals

•  Compare alternatives
•  Determine impacts (per feature)
•  System tuning
•  Quantify relative reliability, availability, and performance
•  Debugging
•  Set expectations

How to measure or estimate

•  Measurements
•  Simulations
•  Analytical modeling

Measurements

•  Actual system construction
•  Create a workload per requirements
•  Provides the best results
•  Inherently difficult and inflexible
•  Almost impossible to use for what-if analysis

Measurements (continued)

•  Measure system or subsystem performance with tools:
–  gprof
–  top, ps, etc.
–  Benchmark programs (e.g. LINPACK, SPECmark, WinMark)
–  PAPI, perfctr, perfmon, PerfSuite
•  What about reliability measurement? Use logs, traces, and outage records.

Simulation

•  A program that simulates the important characteristics of the targeted system
•  Flexible and easy to modify
•  Good for what-if analysis
•  Popular because it is cost-effective and flexible
•  Difficult to model every small detail, so results suffer where detail is left out

Analytical Modeling

•  Mathematical description of the system
•  Provides quick insight
–  Helps guide detailed simulation or measurement-based studies
•  Results are much less believable or accurate

Performance

•  Computation
–  CPU
–  Memory
–  I/O, etc.
•  Communication
–  Latency
–  Bandwidth
•  Transaction
–  Possibly involves more than the database (DB)

Some Criteria

•  Throughput: number of completed requests per time unit
•  Response time: amount of time from when a request was submitted until the first response is produced (not until the complete output is delivered)
•  CPU utilization: fraction of time the CPU is busy; the goal is to keep the CPU as busy as possible
•  Turnaround time: amount of time to execute a particular request (finishing time minus arrival time)
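The criteria above can be turned into a small calculation. The following is a minimal sketch in C, assuming each request is described by three timestamps; the `request_t` type and all function names are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one request; all times are in seconds. */
typedef struct {
    double arrival;        /* when the request was submitted */
    double first_response; /* when the first response was produced */
    double finish;         /* when the request completed */
} request_t;

/* Throughput: completed requests per unit of observed time. */
double throughput(size_t completed, double interval)
{
    assert(interval > 0.0);
    return (double)completed / interval;
}

/* Response time: from submission until the first response. */
double response_time(const request_t *r)
{
    assert(r != NULL);
    return r->first_response - r->arrival;
}

/* Turnaround time: finishing time minus arrival time. */
double turnaround_time(const request_t *r)
{
    assert(r != NULL);
    return r->finish - r->arrival;
}

/* CPU utilization: fraction of the interval during which the CPU was busy. */
double cpu_utilization(double busy, double interval)
{
    assert(interval > 0.0 && busy >= 0.0 && busy <= interval);
    return busy / interval;
}
```

For a request that arrives at 1.0 s, produces its first response at 1.5 s, and finishes at 4.0 s, the response time is 0.5 s and the turnaround time is 3.0 s.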

Stop Mar 25

•  Important Announcement: Midterm Exam April 10, 2014.

Performance issue discovery phase

[Figure: project timeline from 1/19/2004 to 3/19/2004, with milestones at 2/1/2004 and 3/1/2004, showing the phases Requirement, Architecture/design, Development/code, and Test, followed by re-design, code, and re-test]

Telecom industry architecture review: 1/3 of the issues found were related to performance

Simple example of effective memory access time

•  Example
–  H = cache hit probability
–  Tm = memory access time
–  Tc = cache access time
•  What is the effective memory access time?
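One common way to answer this question is a weighted average over hits and misses. The sketch below assumes that a miss costs one full memory access Tm (some textbooks charge Tc + Tm on a miss instead); the symbol T_eff and the numeric values are illustrative assumptions, not taken from the lecture.

```latex
\[ T_{\mathrm{eff}} = H \, T_c + (1 - H) \, T_m \]
% Illustrative values: H = 0.95, T_c = 2 ns, T_m = 100 ns
\[ T_{\mathrm{eff}} = 0.95 \times 2\,\mathrm{ns} + 0.05 \times 100\,\mathrm{ns}
                    = 1.9\,\mathrm{ns} + 5\,\mathrm{ns} = 6.9\,\mathrm{ns} \]
```

Even with a 95% hit rate, the misses contribute most of the average in this example, which is why the hit probability H dominates the result.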


Example of modeling problem in DS

•  Operation/transaction modeling for an e-commerce system
–  Browsing order (Tb) + submitting order (Ts)
–  90% vs. 10% (volume)
–  Weight: 20% vs. 80% (order)
–  Order = 50 instructions + 10 memory accesses

Comparison (Lilja's book)

Factor          Analytical Modeling   Simulation   Measurement
Flexibility     High                  High         Low
Cost            Low                   Medium       High
Believability   Low                   Medium       High
Accuracy        Low                   Medium       High


System Models

•  Physical Model represents underlying hardware elements of a distributed system that abstracts away from specific details of the computer and networking technologies employed

•  Architectural model defines the way in which the components of the system are placed and how they interact with one another and the way in which they are mapped onto the underlying network of computers.

•  Fundamental models:
–  Interaction model deals with communication details among the components and their timing and performance details.
–  Failure model gives a specification of faults and defines reliable communication and correct processes.
–  Security model specifies possible threats and defines the concept of secure channels.

Physical Model

•  represents underlying hardware elements


Credit: http://www.krug-soft.com/
Credit: http://cisco.com/


Architectural Model

•  Concerned with the placement of a system's parts and the relationships among them.
•  Example: client-server model, peer-to-peer model
•  Abstracts the functions of the individual components.
•  Defines patterns for distribution of data and workload.
•  Defines patterns of communication among the components.
•  Example: definition of server process, client process, and peer process, and protocols for communication among processes; definition of the client/server model and its variations.


Software and hardware service layers in distributed systems

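[Figure: layers, from top to bottom: Applications, services; Middleware; Operating system; Computer and network hardware. The operating system and the computer and network hardware together form the Platform.]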

[Figure: Example of distributed weather monitoring systems (Architecture Model). Components shown: National Weather Service Web Site; Data Aggregator; RMI WeatherInfo Server and RMI WeatherInfo Client Application (RMI over the IP Socket API); Weather Web Service Server with Analytics and a MySQL relational database; Weather Web Service Web Client and Weather Google Map Client (HTTP, SOAP/REST, XML).]


Middleware

•  Layer of software whose purpose is to mask the heterogeneity and to provide a convenient programming model for application programmers.

•  Middleware supports such abstractions as remote method invocation, group communications, event notification, replication of shared data, real-time data streaming.

•  Examples: Java RMI, grid software (Globus, Open Grid Services), Web services.


Clients invoke individual servers

[Figure: clients send invocations to servers and receive results; a server may itself invoke another server. The key distinguishes processes from computers.]

EX: browser, web client
EX: Web server
EX: 1. File server, 2. Web crawler


A service provided by multiple servers

[Figure: clients invoke a service that is provided by multiple servers]

EX: Akamai (data duplication), now Amazon AWS (zones)


Web proxy server and caches

[Figure: clients reach web servers through a proxy server]

Proxy servers with caches are used to provide increased availability and performance. They also play a major role in firewall-based security. http://www.interhack.net/pubs/fwfaq/


A distributed application based on peer processes

[Figure: three peer processes, each combining Application code with Coordination code]

Ex: distributed whiteboard application; music sharing


Web applets

a) Client request results in the downloading of applet code

b) Client interacts with the applet

[Figure: in (a) the applet code travels from the web server to the client; in (b) the applet runs at the client]

EX: Code streaming; mobile code


Interaction Models

•  Within address space (using path as addresses)

•  Socket-based communication: connection-oriented, connectionless
–  A socket is an end-point of communication
–  Let's look at some code + details


Socket based communication

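    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>

    /* SERV_HOST_ADDR, SERV_TCP_PORT, and do_stuff() are defined elsewhere. */
    int sockfd;
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));                    /* clear unused fields */
    addr.sin_family = AF_INET;                         /* IPv4 */
    addr.sin_addr.s_addr = inet_addr(SERV_HOST_ADDR);  /* dotted-decimal server address */
    addr.sin_port = htons(SERV_TCP_PORT);              /* port in network byte order */

    sockfd = socket(AF_INET, SOCK_STREAM, 0);          /* TCP (stream) socket */
    connect(sockfd, (struct sockaddr *) &addr, sizeof(addr));
    do_stuff(stdin, sockfd);                           /* error checks omitted for brevity */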


Classic view of network API

•  Start with host name (maybe): foo.bar.com
•  Get an IP address: gethostbyname() returns 10.5.4.3
•  Make a socket (protocol, address): socket(); connect(); … yields sock_id
•  Send byte stream (TCP) or packets (UDP) across the network
–  TCP sock: bytes 1,2,3,4,5,6,7,8,9 … eventually arrive in order
–  UDP sock: packets may or may not arrive
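The name-to-address step can be sketched as follows. This is a minimal sketch, not the lecture's code: it uses `getaddrinfo()`, the POSIX replacement for the older `gethostbyname()` call, and the function name `resolve_ipv4` is a hypothetical helper introduced for illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Resolve `host` (a name such as "foo.bar.com", or a dotted-decimal string)
 * to its first IPv4 address, written as text into `out`.
 * Returns 0 on success, -1 on failure.
 */
int resolve_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    assert(host != NULL && out != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;       /* IPv4 only, to match the example address */
    hints.ai_socktype = SOCK_STREAM; /* we intend to open a TCP socket next */

    if (getaddrinfo(host, NULL, &hints, &res) != 0 || res == NULL)
        return -1;

    /* Take the first result and convert the binary address to text. */
    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    const char *ok = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen);

    freeaddrinfo(res);
    return ok != NULL ? 0 : -1;
}
```

The returned text (or, in real code, the `sockaddr` from `res->ai_addr` itself) is what the later `socket()` and `connect()` calls need.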


Protocol layering

•  The communications stack consists of a set of layered services, each providing a service to the layer above and using the services of the layer below
–  Each service has a programming API, just like any software module
•  Each service has to convey information to one or more peers across the network
•  This information is contained in a header
–  The headers are transmitted in the same order as the layered services


Protocol layering example

[Figure: a browser process and a web server process connected through a router over Physical Link 1 and Physical Link 2]


Browser wants to request a page. Calls HTTP with the web address (URL). HTTP’s job is to convey the URL to the web server. HTTP learns the IP address of the web server, adds its header, and calls TCP.


TCP’s job is to work with server to make sure bytes arrive reliably and in order. TCP adds its header and calls IP. (Before that, TCP establishes a connection with its peer.)


IP’s job is to get the packet routed to the peer through zero or more routers. IP determines the next hop from the destination IP address. IP adds its header and calls the link layer (e.g. Ethernet) with the next hop address.


The link’s job is to get the packet to the next physical box (here a router). It adds its header and sends the resulting packet over the “wire”.


The router’s link layer receives the packet, strips the link header, and hands the result to the IP forwarding process.


The router’s IP forwarding process looks at the destination IP address, determines what the next hop is, and hands the packet to the appropriate link layer with the appropriate next hop link address.


The packet goes over the link to the web server, after which each layer processes and strips its corresponding header.
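The walk-through above, where each layer adds its header on the way down and strips it on the way up, can be sketched with a toy packet buffer. This is a minimal sketch under simplifying assumptions: headers are short text tags rather than real binary headers, and `push_header` and `pop_header` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Prepend `header` to the packet currently held in `buf` (length `*len`),
 * as a layer does before calling the layer below.
 * Returns 0 on success, -1 if the result would not fit in `cap` bytes.
 */
int push_header(char *buf, size_t *len, size_t cap, const char *header)
{
    size_t hlen;

    assert(buf != NULL && len != NULL && header != NULL && *len <= cap);
    hlen = strlen(header);
    if (hlen > cap - *len)
        return -1;

    memmove(buf + hlen, buf, *len); /* shift the payload to make room */
    memcpy(buf, header, hlen);      /* the header now precedes the payload */
    *len += hlen;
    return 0;
}

/*
 * Strip a header of known length from the front of the packet, as a
 * receiving layer does before handing the rest to the layer above.
 * Returns 0 on success, -1 if the packet is shorter than the header.
 */
int pop_header(char *buf, size_t *len, size_t hlen)
{
    assert(buf != NULL && len != NULL);
    if (hlen > *len)
        return -1;

    memmove(buf, buf + hlen, *len - hlen);
    *len -= hlen;
    return 0;
}
```

Starting from the payload `URL` and pushing `[HTTP]`, `[TCP]`, `[IP]`, and `[ETH]` in that order yields `[ETH][IP][TCP][HTTP]URL`: the lowest layer's header is outermost, so the router can strip and replace the link header without touching the IP, TCP, or HTTP headers.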


Basic elements of any protocol header

•  Demuxing field
–  Indicates which is the next higher layer (or process, or context, etc.)
•  Length field or header delimiter
–  For the header, optionally for the whole packet
•  Header format may be text (HTTP, SMTP (email)) or binary (IP, TCP, Ethernet)


Demuxing fields

•  Ethernet: Protocol Number
–  Indicates IPv4, IPv6 (old: AppleTalk, SNA, DECnet, etc.)
•  IP: Protocol Number
–  Indicates TCP, UDP, SCTP
•  TCP and UDP: Port Number
–  Well-known ports indicate FTP, SMTP, HTTP, SIP, many others
–  Dynamically negotiated ports indicate specific processes (for these and other protocols)
•  HTTP: Host field
–  Indicates “virtual web server” within a physical web server
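The first two demuxing steps can be sketched as table lookups on the header fields. This is a minimal sketch: the numeric values are the standard assigned ones (EtherType 0x0800 for IPv4 and 0x86DD for IPv6; IP protocol 6 for TCP, 17 for UDP, 132 for SCTP), while the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { NET_UNKNOWN, NET_IPV4, NET_IPV6 } net_proto_t;
typedef enum {
    TRANSPORT_UNKNOWN, TRANSPORT_TCP, TRANSPORT_UDP, TRANSPORT_SCTP
} transport_proto_t;

/* Ethernet demux: the 16-bit protocol number (EtherType) selects the network layer. */
net_proto_t demux_ethertype(uint16_t ethertype)
{
    switch (ethertype) {
    case 0x0800: return NET_IPV4;
    case 0x86DD: return NET_IPV6;
    default:     return NET_UNKNOWN;
    }
}

/* IP demux: the 8-bit Protocol Number selects the transport layer. */
transport_proto_t demux_ip_protocol(uint8_t protocol)
{
    switch (protocol) {
    case 6:   return TRANSPORT_TCP;
    case 17:  return TRANSPORT_UDP;
    case 132: return TRANSPORT_SCTP;
    default:  return TRANSPORT_UNKNOWN;
    }
}

/* Read the Protocol field (byte offset 9) from a raw IPv4 header. */
transport_proto_t demux_ipv4_header(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 20 || (hdr[0] >> 4) != 4) /* too short, or version is not 4 */
        return TRANSPORT_UNKNOWN;
    return demux_ip_protocol(hdr[9]);
}
```

Port-number and HTTP Host demuxing follow the same pattern one layer up: the receiver reads one field and uses it to pick the next handler.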


IP (Internet Protocol)

•  Three services:
–  Unicast: transmits a packet to a specific host
–  Multicast: transmits a packet to a group of hosts
–  Anycast: transmits a packet to one of a group of hosts (typically the nearest)
•  Destination and source are identified by the IP address (32 bits for IPv4, 128 bits for IPv6)
•  All services are unreliable
–  A packet may be dropped, duplicated, or received in a different order


IP(v4) address format

•  In binary, a 32-bit integer
•  In text, this: “128.52.7.243”
–  Each decimal number represents 8 bits (0 – 255)
•  “Private” addresses are not globally unique:
–  Used behind NAT boxes
–  10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
•  Multicast addresses start with 1110 as the first 4 bits (Class D address)
–  224.0.0.0/4
•  Unicast and anycast addresses come from the same space


UDP (User Datagram Protocol)

•  Runs above IP
•  Same unreliable service as IP
–  Packets can get lost anywhere:
     •  Outgoing buffer at source
     •  Router or link
     •  Incoming buffer at destination
•  But adds port numbers
–  Used to identify “application layer” protocols or processes
•  Also a checksum, optional


TCP (Transmission Control Protocol)

•  Runs above IP
–  Port number and checksum like UDP
•  Service is an in-order byte stream
–  The application does not know exactly how the bytes are packaged into packets
•  Flow control and congestion control
•  Connection setup and teardown phases
•  There can be considerable delay between bytes in at the source and bytes out at the destination
–  Because of timeouts and retransmissions
•  Works only with unicast (not multicast or anycast)


UDP vs. TCP

•  UDP is more real-time
–  A packet is sent or dropped, but is not delayed
•  UDP has more of a “message” flavor
–  One packet = one message
–  But reliability mechanisms must be added over it
•  TCP is great for transferring a file or a bunch of email, but kind of frustrating for messaging
–  Interrupts to the application don’t conform to message boundaries
–  No “Application Layer Framing”
•  TCP is vulnerable to DoS (Denial of Service) attacks, because the initial packet consumes resources at the receiver
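Because TCP delivers a byte stream with no message boundaries, applications that want messages usually add their own framing. The following is a minimal sketch of one common approach, a length prefix in front of each message; the 4-byte big-endian prefix and the names `frame_encode` and `frame_decode` are assumptions for illustration, not a standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HDR 4 /* assumed format: 4-byte big-endian length prefix */

/* Write one framed message into `out`; returns bytes written, or 0 if it does not fit. */
size_t frame_encode(const void *msg, uint32_t len, uint8_t *out, size_t cap)
{
    assert(out != NULL && (msg != NULL || len == 0));
    if (cap < FRAME_HDR || len > cap - FRAME_HDR)
        return 0;
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    if (len > 0)
        memcpy(out + FRAME_HDR, msg, len);
    return FRAME_HDR + (size_t)len;
}

/*
 * Try to extract one complete message from the bytes received so far.
 * On success, stores the payload pointer and length and returns the number
 * of stream bytes consumed; returns 0 if more bytes are needed.
 */
size_t frame_decode(const uint8_t *in, size_t avail,
                    const uint8_t **msg, uint32_t *len)
{
    uint32_t n;

    assert(in != NULL && msg != NULL && len != NULL);
    if (avail < FRAME_HDR)
        return 0;
    n = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
      | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (avail - FRAME_HDR < n)
        return 0; /* message still incomplete: wait for more bytes */
    *msg = in + FRAME_HDR;
    *len = n;
    return FRAME_HDR + (size_t)n;
}
```

The receiver keeps appending whatever `recv()` returns to a buffer and calls `frame_decode` until it returns 0, so a message split across several reads (or several messages arriving in one read) is handled the same way.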

Instructor’s Guide for Coulouris, Dollimore and

Figure 2.8 Real-time ordering of events

[Figure: messages (including m2) exchanged among processes, with their receive events plotted against physical time at t1, t2, and t3]

Figure 2.9 Processes and channels

[Figure: process p places message m in its outgoing message buffer; the communication channel carries it to the incoming message buffer of process q, which receives m]

Failure Model: Omission and arbitrary failures

Class of failure        Affects              Description
Fail-stop               Process              Process halts and remains halted. Other processes may detect this state.
Crash                   Process              Process halts and remains halted. Other processes may not be able to detect this state.
Omission                Channel              A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer.
Send-omission           Process              A process completes a send, but the message is not put in its outgoing message buffer.
Receive-omission        Process              A message is put in a process’s incoming message buffer, but that process does not receive it.
Arbitrary (Byzantine)   Process or channel   Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.

Figure 2.11 Timing failures

Class of failure   Affects   Description
Clock              Process   Process’s local clock exceeds the bounds on its rate of drift from real time.
Performance        Process   Process exceeds the bounds on the interval between two steps.
Performance        Channel   A message’s transmission takes longer than the stated bound.

Dependability Modeling

•  Includes reliability modeling and availability modeling
•  A designed system can be shown to meet its performance and dependability requirements.
•  Provides a good mechanism for examining the behavior of a system, from the design stage through implementation to final deployment.

Dependability

•  Two measures
–  Reliability (MTTF)
–  Availability (ratio of uptime/total)

Reliability

•  Definition: The reliability R(t) of a system at time t is the probability that the system failure has not occurred in the interval [0,t). If X is a random variable that represents the time to occurrence of system failure, then R(t)=P(X>t).

•  unreliability = 1-R(t)

Reliability

•  Definition: The MTTF (mean time to failure) of a system is the expected time until the occurrence of the (first) system failure. If X is a random variable that represents the time to occurrence of system failure, then MTTF = E[X].

•  Given the system reliability R(t), the MTTF can be computed as

MTTF = ∫_0^∞ R(t) dt
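As a worked instance of this integral, assume (this is an assumption, not stated in the lecture) that failures occur at a constant rate λ, so the time to failure X is exponentially distributed:

```latex
\[ R(t) = P(X > t) = e^{-\lambda t} \]
\[ \mathrm{MTTF} = \int_0^{\infty} R(t)\,dt
                 = \int_0^{\infty} e^{-\lambda t}\,dt
                 = \frac{1}{\lambda} \]
% Illustrative value: \lambda = 0.001 failures per hour gives MTTF = 1000 hours
```

Under that assumption the MTTF is simply the reciprocal of the failure rate; other failure-time distributions give different closed forms but use the same integral.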

Availability

•  A measure expressed as the ratio of uptime to total time

•  High availability - ability of a system to perform its function continuously (without interruption) for a significantly longer period of time than the reliabilities of its individual components would suggest.

•  High availability is most often achieved through fault tolerance.

Degree of Availability

System Type              Unavailability (minutes/year)   Availability (percent)   Availability Class
Unmanaged                50,000                          90                       1
Managed                  5,000                           99                       2
Well-managed             500                             99.9                     3
Fault-tolerant           50                              99.99                    4
High Availability        5                               99.999                   5
Very High Availability   0.5                             99.9999                  6
Ultra Availability       0.05                            99.99999                 7

Availability

•  Definition: Availability A(t) of a system at time t is the probability that the system is functioning correctly at time t.

•  Like the reliability measure, in some applications it is better to compute the system unavailability U(t) = 1 -A(t).

•  Availability = MTTF / (MTTF + MTTR), where MTTR is the mean time to repair
•  Steady-state availability: A_steady = lim A(t) as t → ∞
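The availability formula, and its link to the minutes-per-year figures in the Degree of Availability table, can be checked with a few lines of C. This is a minimal sketch; the function names are hypothetical, and a year is taken as 365 days.

```c
#include <assert.h>

#define MINUTES_PER_YEAR (365.0 * 24.0 * 60.0) /* 525,600; leap years ignored */

/* Availability from mean time to failure and mean time to repair (same units). */
double availability(double mttf, double mttr)
{
    assert(mttf > 0.0 && mttr >= 0.0);
    return mttf / (mttf + mttr);
}

/* Expected downtime per year implied by an availability in [0, 1]. */
double downtime_minutes_per_year(double avail)
{
    assert(avail >= 0.0 && avail <= 1.0);
    return (1.0 - avail) * MINUTES_PER_YEAR;
}
```

With MTTF = 999 hours and MTTR = 1 hour, availability is 0.999 (99.9%), which implies about 526 minutes of downtime per year. That matches the order of magnitude of the table's rounded entry of 500 minutes/year for class 3.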

Modeling Techniques

•  Non-state-space
–  Fault tree
–  Reliability block diagram
•  State-space
–  Continuous-time Markov chain
–  Stochastic Petri net
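For the non-state-space case, a reliability block diagram is evaluated by combining component availabilities in series and in parallel. The following is a minimal sketch that assumes components fail independently (an assumption these simple formulas require); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Series block: the system is up only if every component is up. */
double series_availability(const double *a, size_t n)
{
    double up = 1.0;

    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0.0 && a[i] <= 1.0);
        up *= a[i];
    }
    return up;
}

/* Parallel (redundant) block: the system is down only if all components are down. */
double parallel_availability(const double *a, size_t n)
{
    double down = 1.0;

    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0.0 && a[i] <= 1.0);
        down *= 1.0 - a[i];
    }
    return 1.0 - down;
}
```

Two components that are each 90% available give about 81% in series but about 99% in parallel, which is the basic argument for redundant (for example dual-head) designs. State-space models such as Markov chains are needed when repair, failover, or shared dependencies break the independence assumption.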

Example of system

Fault Tree

Availability Model

[Figure: model with the states "Server up" and "Server down & repair"]

Availability model

HA-OSCAR dual head model

HA-OSCAR SRN model

• Server sub-model

• Switches

• Compute nodes

Server Sub Model

• P Server up
• P Server down
• Failover
• P server repair
• Failback
• S is up and ready
• S takes control
• S Server down
• S repair

Compute node sub model

Switch sub model

Class discussion/Exercise

•  Say we have to design and develop a disaster warning system that has interfaces to multiple systems and performs event analysis for possible disasters/dangers

•  High-level requirements
–  Open interface
–  Scalable to many subscribers for event notification
–  24/7 availability


Summary

•  When designing systems or analyzing systems, you want to examine at the high level the architectural model.

•  Subsequent steps will explore fundamental models such as interaction model, security model, failure model, reliability model etc.

Case study in Cloud-based EKG system
