DISTRIBUTED SYSTEMS AND THE INTERNET
Distributed System Fundamentals
Basic Structure of Distributed System
Computing Models in Distributed System
Networking and Internetworking
The Internet
Technical Issues in Distributed System
Lan Jin
Tsinghua University
California State University-Fresno
DISTRIBUTED SYSTEM FUNDAMENTALS
General Characteristics of DS
Multiple computers + communication network + message passing + single system image
Lack of central memory and a global clock
Distributed hardware - multicomputer & network
Distributed software and information
Distributed control not relying on global state, but on consensus and agreement protocol
Major goal: Transparency with no knowledge of what, where, and how.
DISTRIBUTED SYSTEM FUNDAMENTALS
Challenges of Distributed Systems
Transparency: access, location, mobility, replication, concurrency, parallelism, scaling, failure, network
Heterogeneity: network, computer h/w, OS, prog. languages, vendors
Openness or Flexibility: microkernel & open services
Scalability & Reconfigurability
Reliability & Availability: redundancy, fault tolerance
Security: firewall, encryption
Concurrency & Performance
DISTRIBUTED SYSTEM FUNDAMENTALS
Why Distributed Systems?
Economics: A decisive price/performance advantage over traditional time-sharing systems
Improved reliability & availability through replication
Modular scalability
Many applications are inherently distributed, with a great demand for communication, information sharing, and resource sharing among computers.
The Internet is the greatest worldwide distributed system.
BASIC STRUCTURE OF DISTRIBUTED SYSTEM
Middleware in Distributed System
Network connecting multiple computer platforms
Middleware as the infrastructure between:
◊ OS + comm. protocol
◊ distributed applications interacting via the network
[Figure: layers, top to bottom: Distributed applications, Middleware, Operating system, Computer & network h/w; further labels: platform, Network]
BASIC STRUCTURE OF DISTRIBUTED SYSTEM
Basic Functions of Middleware
masks the heterogeneity of computer architectures
hides the underlying networked environment’s complexity
facilitates the interaction among distributed software modules
QoS management, information security, RPC, RMI, remote DB access, …
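One common way to mask the heterogeneity of computer architectures is to agree on a wire format that does not depend on the sender's byte order. The sketch below is only an illustration of that idea: the record layout (a 32-bit id followed by a 64-bit float) and the names `WIRE_FORMAT`, `marshal`, and `unmarshal` are assumptions, not something the slides specify.

```python
import struct

# Hypothetical record layout: unsigned 32-bit id, then a 64-bit float.
# "!" selects network (big-endian) byte order with standard sizes,
# so the encoding is the same on every host architecture.
WIRE_FORMAT = "!Id"


def marshal(record_id: int, value: float) -> bytes:
    """Convert local values into an architecture-independent byte string."""
    return struct.pack(WIRE_FORMAT, record_id, value)


def unmarshal(data: bytes) -> tuple:
    """Rebuild local values from the wire representation."""
    record_id, value = struct.unpack(WIRE_FORMAT, data)
    return record_id, value


wire = marshal(258, 1.5)
print(len(wire), wire.hex())
print(unmarshal(wire))
```

Both sides only need to share the format string; neither needs to know whether the other host is big-endian or little-endian.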
BASIC STRUCTURE OF DISTRIBUTED SYSTEM
Middleware Environments
RPC, RMI, ROI middleware
OMG CORBA, DCOM middleware
Integrating QoS management into middleware
Middleware supporting mobile computing
Middleware supporting ubiquitous computing
Mobile code and mobile agent
Jini, JavaSpaces, JavaBeans, … middleware
COMPUTING MODELS
Client-server (Pull) model
◊ Connectionless Request-Reply protocol
◊ Synchronous, RPC-style communication model
Push model
◊ Publish/subscribe system, workflow system
◊ Asynchronous communication model
Peer-to-peer (P2P) interaction model
◊ Serverless file sharing
◊ Event-based middleware architecture
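The difference between the pull and push models is mainly a question of who initiates an exchange. The following in-process sketch contrasts the two; the class names `Server` and `Broker`, the topic name, and the values are hypothetical, and no real network or asynchronous delivery is modeled.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Server:
    """Pull model: the client asks, then waits for the reply."""

    def __init__(self) -> None:
        self.data = {"temperature": 21}

    def handle_request(self, key: str) -> int:
        return self.data[key]


class Broker:
    """Push model: subscribers register once and are notified later."""

    def __init__(self) -> None:
        self.subscribers: DefaultDict[str, List[Callable[[int], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[int], None]) -> None:
        self.subscribers[topic].append(callback)

    def publish(self, topic: str, value: int) -> None:
        # Deliver the new value to every subscriber of this topic.
        for callback in self.subscribers[topic]:
            callback(value)


# Pull: the client initiates every exchange.
server = Server()
pulled = server.handle_request("temperature")

# Push: the publisher initiates; the subscriber only reacts.
broker = Broker()
received: List[int] = []
broker.subscribe("temperature", received.append)
broker.publish("temperature", 22)
broker.publish("temperature", 23)

print(pulled, received)
```

In a real publish/subscribe system the callbacks would be messages delivered over the network, typically without the publisher waiting for the subscribers.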
COMPUTING MODELS
Variations on the Computing Models
Partitioned or replicated servers
Servers may in turn be clients of other servers
[Figure: clients exchange request and reply messages with servers; one server also acts as a (Client) of another server]
COMPUTING MODELS
Variations on the Computing Models (continued)
Proxy servers and caches
Mobile code
Mobile agent
Network computer
Thin client
[Figure: proxy servers and caches; labels: Client, proxy server, Web server]
[Figure: mobile code; labels: Client, Applet code, Web server, Applet, Server, (Local site)]
NETWORKING & INTERNETWORKING
Types of Networks
Personal Area Network (PAN), WPAN (wireless)
Local Area Network (LAN), WLAN (wireless)
Metropolitan Area Network (MAN)
Wide Area Network (WAN)
Internetworks, the Internet

Type   Range         Data rate (Mbps)   Latency (ms)   Examples
LAN    1-2 km        10-1000            1-10           Ethernet, Token Ring
WPAN   >10 m         1-2                               Bluetooth
WLAN   0.15-1.5 km   2-11               5-20           WaveLAN
MAN    2-50 km       1-600              1-10           DSL, ATM
WAN    worldwide     0.010-600          100-500        ISDN, BISDN, ATM
WWAN   worldwide     0.010-2            100-500        the Internet (wireless)
NETWORKING & INTERNETWORKING
Optical Networks
WDM/DWDM handle 160-320 wavelengths/fiber.
Optical Ethernet and MAN: 10-40 Gbps over 70 km.
TDM reduces the cost: 2000 Gb/s on a single fiber.
Advanced optical fiber eliminates cross-talk problem.
The last-mile problem: LEC on DSL or wireless links.
Wireless Networks
2G: ≤ 14.4 Kbps; 3G: 2 Mbps; 4G: > 50 Mbps.

Band   System      Bit rate   Users   Radius   Spatial capacity
SRW    802.11b     11 Mbps    3       100 m    1 Kbps/m²
SRW    Bluetooth   1 Mbps     10      10 m     30 Kbps/m²
SRW    802.11a     54 Mbps    12      50 m     83 Kbps/m²
UWB                50 Mbps    6       10 m     1000 Kbps/m²
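The spatial capacity figures in the wireless table appear consistent with the aggregate bit rate of all users divided by a circular coverage area, that is, bit rate × users / (π × radius²). This formula is an inference from the table's numbers, not something the slide states; the sketch below simply recomputes the column under that assumption.

```python
import math


def spatial_capacity_kbps_per_m2(bit_rate_mbps: float, users: int, radius_m: float) -> float:
    """Aggregate bit rate of all users divided by the circular coverage area."""
    total_kbps = bit_rate_mbps * 1000 * users
    area_m2 = math.pi * radius_m ** 2
    return total_kbps / area_m2


# (system, bit rate in Mbps, users, radius in m) copied from the table.
rows = [
    ("802.11b", 11, 3, 100),
    ("Bluetooth", 1, 10, 10),
    ("802.11a", 54, 12, 50),
    ("UWB", 50, 6, 10),
]

for system, rate, users, radius in rows:
    capacity = spatial_capacity_kbps_per_m2(rate, users, radius)
    print(f"{system:10s} {capacity:8.1f} Kbps/m2")
```

The computed values land close to the table's rounded entries, which suggests the short range of Bluetooth and UWB is what gives them their high capacity per unit area.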
THE INTERNET
Internet Protocols: TCP/IP Protocol Suite
TCP directly supports applications (e.g., HTTP).
TCP: reliable connection-oriented communication
UDP: unreliable connectionless communication
IP datagrams: basic Internet transmission mechanism
IP supports WAN applications, e.g., file transfer, email.
Internet application protocols: HTTP, SMTP, FTP, telnet, NNTP by TCP; DNS by UDP
[Figure: protocol layers (Application, Transport, Internetwork, Network interface, Underlying network) and the units passed between them. Internetworking: Message, Internetwork packets, Network-specific packets. The Internet: Messages (UDP) or streams (TCP), UDP or TCP packets, IP datagrams, Network-specific frames.]
THE INTERNET
UDP vs. TCP in Client-Server Computing
UDP: unnecessary to establish and release a connection.
TCP: needs to establish a connection - a connect request is followed by an accept from the server.
UDP: transmits datagrams without acknowledgement or retries.
TCP: retransmits if not acknowledged within a timeout.
UDP: the limited length of datagrams is inadequate for long messages.
TCP: avoids implementing multi-packet protocols.
UDP: difficult to decide on the server buffer size.
TCP: message size is decided before transmitting it.
UDP: no flow control.
TCP: flow control matches the speeds of writing to and reading from a stream by a producer/consumer paradigm.
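The connectionless-versus-connection-oriented contrast can be seen directly in the socket API. The sketch below runs both protocols over the loopback interface; the function names and payloads are hypothetical, and it assumes a host where loopback UDP delivers the two datagrams in order without loss, which UDP itself does not guarantee.

```python
import socket

HOST = "127.0.0.1"  # loopback only; port 0 lets the OS pick a free port


def udp_demo() -> list:
    """Two sends arrive as two separate datagrams; no connection is set up."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind((HOST, 0))
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"first", receiver.getsockname())
        sender.sendto(b"second", receiver.getsockname())
        return [receiver.recvfrom(1024)[0] for _ in range(2)]
    finally:
        sender.close()
        receiver.close()


def tcp_demo() -> bytes:
    """After connect/accept, two sends arrive as one undivided byte stream."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((HOST, 0))
    listener.listen(1)
    listener.settimeout(2.0)
    try:
        # Connect request from the client side.
        client = socket.create_connection(listener.getsockname(), timeout=2.0)
        client.sendall(b"first")
        client.sendall(b"second")
        client.close()
        # Accept on the server side, then read until the client's close.
        conn, _ = listener.accept()
        chunks = []
        with conn:
            while True:
                chunk = conn.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        return b"".join(chunks)
    finally:
        listener.close()


print(udp_demo())
print(tcp_demo())
```

With UDP the receiver sees message boundaries but must cope with loss itself; with TCP the boundaries disappear, so an application that needs them has to add its own framing.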
THE INTERNET
History of the Internet
1957 Forming of ARPA
1967 1st paper on the ARPAnet - the 1st WAN
1969 1st 4 hosts on the ARPAnet (UCLA, UCSB, UofUtah, Stanford)
1972 1st email program and Telnet standard
1973 1st international connection to ARPAnet
1973 FTP developed
1981 BITnet, CSnet using ARPAnet technology
1982 TCP/IP established as an Internet standard
1982 The name Internet is assigned
1984 DNS introduced
(To be continued)
THE INTERNET
History of the Internet (continued)
1989 130,000 computers connected to the Internet
1990 First commercially available dial-up Internet access
1991 1st code for the World Wide Web
1991 Gopher created as a nongraphics-based browser
1993 Mosaic, the 1st graphics-based browser
1993 1,776,000 computers, 130 Web servers
1995 Java released by Sun Microsystems
1995 6,642,000 computers, 23,500 Web servers
1997 19,540,000 computers, 1,203,096 Web servers
1999 56,218,000 computers, 6,598,697 Web servers
THE INTERNET
Wireless Internet
WAP (Wireless Application Protocol)
WAP offers a small, extensible protocol stack to handle mobile communications more efficiently.
XML: a powerful extensible alternative to HTML
WML: a small set of XML for wireless network
HDML: compact HTML for handheld devices
[Figure: WAP microbrowser running on WAP-enabled devices; WAP protocol stack and HDML/WML on the wireless network; WAP gateway; TCP/IP stack and HTML/XML; Web server]
THE INTERNET
Bringing the Internet to Next Generation
Limitations of current WAN:
◊ do not deal with congestion effectively
◊ poor support for QoS
◊ low reliability
◊ no clear definition of the semantics of shared state
Internet2 Project
◊ High-speed gigapops at > 155 Mbps
◊ vBNS at 622 Mbps - 2.4 Gbps
◊ IP Multicast protocol and IPv6
◊ Digital audio and video frameworks
◊ QoS
◊ Distributed storage management
NGI - a US government program along with vBNS
THE INTERNET
Extending Internet Markup Languages
Limitations of HTML:
◊ Presentation rather than content orientation
◊ No extensibility
◊ No data validation capabilities
Enter XML
◊ lets information publishers invent their own tags.
◊ addresses only content.
◊ supports validation by using off-the-shelf (OTS) XML parsers.
XML Benefits
◊ extensibility
◊ presentation/content
◊ support for multiple views of the same content
◊ support for document and validation of structured data
◊ selective (field-sensitive) queries over the Internet
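Publisher-defined tags and field-sensitive queries can be illustrated with a few lines of Python's standard `xml.etree.ElementTree` parser. The document, its tag names, and the price threshold are invented for this sketch. Note that ElementTree only checks that a document is well formed; validating against a DTD or schema would need a different tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names are invented by the "publisher".
DOCUMENT = """
<catalog>
  <book isbn="111"><title>Networks</title><price>40</price></book>
  <book isbn="222"><title>Middleware</title><price>55</price></book>
</catalog>
"""

root = ET.fromstring(DOCUMENT)

# Field-sensitive query: select titles by the value of the price field.
expensive = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 50
]
print(expensive)

# An off-the-shelf parser rejects a document that is not well formed.
try:
    ET.fromstring("<catalog><book></catalog>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)
```

Because the tags describe content rather than presentation, the same document could be rendered in several views or queried by other fields without changing it.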
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Interprocess Communication
Primitive operations of communication and synchronization
Message-Passing mechanism: synchronous or blocking vs. asynchronous or non-blocking
Network communication mechanism
Multicast communication between groups of processes
Client-server communication
◊ Blocking vs. non-blocking primitives
◊ Buffered vs. unbuffered primitives
◊ Reliable vs. unreliable primitives
Client-server exchange protocols
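The blocking versus non-blocking distinction can be tried out locally with a thread-safe queue standing in for a message buffer. This is a sketch under that assumption: the `mailbox` name, the capacity of one message, and the delays are all hypothetical, and threads in one process replace separate machines.

```python
import queue
import threading
import time

# Buffered channel with room for one message.
mailbox: "queue.Queue[str]" = queue.Queue(maxsize=1)

# Non-blocking receive: returns at once, even though nothing has arrived.
try:
    mailbox.get_nowait()
    got_message = True
except queue.Empty:
    got_message = False


def sender() -> None:
    time.sleep(0.1)        # the message is produced a little later
    mailbox.put("hello")   # blocking send: would wait if the buffer were full


thread = threading.Thread(target=sender)
thread.start()

# Blocking receive: the caller is suspended until a message arrives.
# The timeout only guards against waiting forever.
message = mailbox.get(timeout=2.0)
thread.join()

print(got_message, message)
```

A non-blocking caller has to poll or be notified later, while a blocking caller gets simpler code at the cost of being suspended.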
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Remote Procedure Call (RPC) / Remote Method Invocation (RMI)
Programming models for distributed applications
◊ Remote procedure call (RPC)/Remote method invocation (RMI)
◊ Event notification: event-based programming model
Middleware layer: RPC, RMI, and events built on request-reply protocol and external data representation
RMI by Java
◊ Request invocation - Response
◊ Interface compiler generates client stub & server skeleton from IDL.
◊ Stub & skeleton perform marshaling and unmarshaling.
◊ RMI passes full objects as operation parameters & return values.
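The division of labor between a client stub and a server skeleton can be sketched by hand. The version below is an assumption-laden illustration, not Java RMI: the classes are written manually rather than generated by an interface compiler, JSON stands in for the external data representation, and a direct function call stands in for the request-reply protocol.

```python
import json
from typing import Any, Callable, Dict


class Skeleton:
    """Server side: unmarshals a request, calls the procedure, marshals the reply."""

    def __init__(self) -> None:
        self.procedures: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self.procedures[name] = func

    def dispatch(self, request: bytes) -> bytes:
        call = json.loads(request.decode("utf-8"))
        result = self.procedures[call["method"]](*call["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Stub:
    """Client side: makes a remote call look like a local one."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self.transport = transport  # stands in for the request-reply protocol

    def call(self, method: str, *args: Any) -> Any:
        request = json.dumps({"method": method, "args": list(args)}).encode("utf-8")
        reply = self.transport(request)
        return json.loads(reply.decode("utf-8"))["result"]


skeleton = Skeleton()
skeleton.register("add", lambda a, b: a + b)

# The "network" is a direct function call here; only bytes cross it.
stub = Stub(transport=skeleton.dispatch)
print(stub.call("add", 2, 3))
```

Only marshaled bytes pass between the two sides, so the transport could be replaced by a socket without changing the caller.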
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Unavailability of Up-to-Date Global State
Unavailability of global memory and global clock
Unpredictable message delays
To implement distributed system-wide control:
◊ Algorithms based on arriving at a consensus
◊ Clock synchronization for temporal ordering of events
Logical clock: Lamport’s virtual time or vector time
Message ordering in group communication
Defining a coherent (consistent) global state
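Logical clocks order events without a global clock. The sketch below shows the usual update rules for a Lamport clock and a vector clock; the class and function names and the three-process scenario are chosen for illustration, and message passing is simulated by handing timestamps between objects.

```python
from typing import List


class LamportClock:
    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event or send: advance the clock and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, sent_time: int) -> int:
        """Receive: move past the sender's timestamp if it is ahead."""
        self.time = max(self.time, sent_time) + 1
        return self.time


class VectorClock:
    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.clock: List[int] = [0] * n

    def tick(self) -> List[int]:
        """Local event or send: advance only this process's own entry."""
        self.clock[self.pid] += 1
        return list(self.clock)

    def receive(self, sent: List[int]) -> List[int]:
        """Receive: take the element-wise maximum, then count the receive event."""
        self.clock = [max(a, b) for a, b in zip(self.clock, sent)]
        self.clock[self.pid] += 1
        return list(self.clock)


def happened_before(u: List[int], v: List[int]) -> bool:
    """True if the event stamped u causally precedes the event stamped v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


# Process 0 sends to process 1; process 2 acts independently.
p0, p1, p2 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
send = p0.tick()
recv = p1.receive(send)
other = p2.tick()

print(send, recv, other)
print(happened_before(send, recv), happened_before(send, other))
```

Lamport timestamps give an ordering consistent with causality, while vector timestamps can additionally tell that two events, such as `send` and `other` here, are concurrent.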
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Distributed Control Algorithms
Distributed mutual exclusion
Distributed deadlock detection
Distributed election algorithms
Processor (workload) allocation
Distributed (process, thread) scheduling
Fault tolerance in DS
Distributed agreement
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Distributed File System
Distributed file model
Naming and name transparency
File and directory service interface
Semantics of file sharing
Caching and cache consistency
File replication and update
Network file systems
TECHNICAL ISSUES IN DISTRIBUTED SYSTEM
Atomic Transactions
Transaction Model
ACID Properties of transactions
Implementation of transactions:
◊ Private workspace
◊ Write-ahead log
Two-phase commit protocol
Concurrency control
◊ Compatible locks and deadlock prevention
◊ Optimistic concurrency control
◊ Timestamp ordering
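The two-phase commit protocol named above can be outlined as a voting phase followed by a completion phase. The sketch below is a simplified, failure-free illustration: the `Participant` class, its `can_commit` flag, and the state names are hypothetical, and crashes, timeouts, and logging are deliberately left out.

```python
from typing import List


class Participant:
    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        """Phase 1: vote yes only if the local work can be committed."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (voting): the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (completion): commit only on a unanimous yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


ok = [Participant("a", True), Participant("b", True)]
mixed = [Participant("a", True), Participant("b", False)]
print(two_phase_commit(ok), two_phase_commit(mixed))
```

A single no vote aborts the transaction everywhere, which is what gives the transaction its all-or-nothing property across sites.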