Verteilte Systeme (Distributed Systems)
Karl M. Gö[email protected]
http://www.infosys.tuwien.ac.at/teaching/courses/VerteilteSysteme/
2
Some slides based on material from this book (Prentice Hall, 2005)
6
Concepts, Paradigms, Technologies
Concepts: Communication; Processes & Concurrency; Naming and Discovery; Coordination & Agreement; Replication & Consistency; Dependability and FT; Transactions; Security; (Persistency, Durability)
Technologies: CORBA, J2EE, COM+, .NET, Web Services, Grid, P2P, Pervasive, Multimedia, WWW, Databases, VoIP, ...
How is naming and discovery realized in COM+ technology?
Systems Engineering
7
Paradigms and Characteristics (1)
Classifications are mixed and not orthogonal!
Focused entity: data item, tuple, file, document, object, component, service, resource, stream, ...
Communication mechanisms:
  Transparency vs. coordination-based
  Request/reply
  Message passing (document exchange)
  DSM and associative tuple spaces
  Event and notification systems: publish/subscribe; subject-based (interrupt signaling/exception handling)
8
Paradigms and Characteristics (2)
Communication patterns:
  synchronous (blocking) vs. asynchronous (non-blocking)
  transient vs. persistent
Time: real-time vs. non-real-time, sync vs. async, ...
Coupling: tight vs. loose (temporal, referential, ...)
Scale and granularity: client/server to pervasive
Mobility and topology: structured, unstructured, ad-hoc, overlay, ...
Technologies, standards, and big players: OMG, J2EE, Microsoft, IBM, SAP, IETF, ITU, ...
9
Communication coupling
A taxonomy of coordination models
e.g. "ad-hoc" communication; event and subject-based; publish/subscribe; TIB/Rendezvous
e.g. tuple spaces; persistent DSM; associative; JavaSpaces
e.g. transient message passing; RPC; RMI
e.g. persistent message passing
Technology Overview
Distributed object-based systems
Components and EAI
Coordination-based systems
WWW, SOA, Grid, P2P
Lecture summary
12
Distributed object-based systems
CORBA
  ORB
  IDL (language mapping)
DCOM
  Builds ORPC on top of DCE RPC
  Integration of binary components from different languages (e.g. Visual Basic, Java, C++)
Java RMI
  Single-language middleware
16
CORBA Object Model
The general organization of a CORBA system.
22
CORBA Language Mapping
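[Figure: the IDL compiler reads the IDL and generates stub code and skeleton code. In programming language "A", the client code and the stub code pass through the language "A" compiler and linker to form the client with its stub. In programming language "B", the object implementation code and the skeleton code pass through the language "B" compiler and linker to form the server object with its skeleton. Stub and skeleton communicate through the Object Request Broker.]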
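The stub/skeleton split shown on this slide can be illustrated with a minimal sketch. It is not CORBA: JSON stands in for the wire format, a direct function call stands in for the ORB, and the `Calculator` servant with its `add` operation is a hypothetical interface invented for the illustration. In a real system the stub and skeleton would be generated from the IDL rather than written by hand.

```python
import json


class Calculator:
    """Hypothetical servant: the object implementation on the server side."""

    def add(self, a, b):
        return a + b


class Skeleton:
    """Server side: unmarshals a request and dispatches it to the servant."""

    def __init__(self, servant):
        self.servant = servant

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self.servant, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Stub:
    """Client side: offers the servant's interface but marshals every call."""

    def __init__(self, transport):
        self.transport = transport  # callable taking request bytes, returning reply bytes

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self.transport(request)
        return json.loads(reply.decode("utf-8"))["result"]


skeleton = Skeleton(Calculator())
stub = Stub(skeleton.handle)  # in-process stand-in for the ORB
total = stub.add(2, 3)
```

The client only ever sees `stub.add`; everything that crosses the boundary is a byte string, which is what allows client and server to be written in different languages.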
43
Portable Object Adapter
Mapping of CORBA object identifiers to servants.
a) The POA supports multiple servants.
b) The POA supports a single servant.
OID: POA assigned or user assigned
Activation: explicit or on demand
1. Multiple OIDs, single servant
2. One servant for all objects
3. Single servant, many objects and types (DSI)
Thread per request, connection, object, ...
Policy separated from mechanism!
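The two mappings (one servant per object id versus a single servant for all objects) can be sketched as follows. This is an illustration of the idea only and does not use the CORBA POA API; `ObjectAdapter`, `Account`, and `DefaultServant` are hypothetical names, and threading and activation policies are left out.

```python
class Account:
    """Hypothetical servant holding the state of exactly one object."""

    def __init__(self, balance):
        self.balance = balance

    def invoke(self, oid, operation):
        return f"{operation} on {oid}: balance={self.balance}"


class DefaultServant:
    """Hypothetical servant that serves every object id handed to it."""

    def invoke(self, oid, operation):
        return f"{operation} on {oid}: handled by default servant"


class ObjectAdapter:
    """Maps object ids to servants; the policy is chosen at construction time."""

    def __init__(self, default_servant=None):
        self.active_object_map = {}
        self.default_servant = default_servant

    def activate(self, oid, servant):
        self.active_object_map[oid] = servant

    def dispatch(self, oid, operation):
        servant = self.active_object_map.get(oid, self.default_servant)
        if servant is None:
            raise KeyError(f"no servant for object id {oid!r}")
        return servant.invoke(oid, operation)


# a) one servant per object id
adapter_a = ObjectAdapter()
adapter_a.activate("acc-1", Account(10))
adapter_a.activate("acc-2", Account(20))

# b) a single servant for all object ids
adapter_b = ObjectAdapter(default_servant=DefaultServant())
```

The dispatch mechanism is identical in both cases; only the policy (what is registered, and whether a default exists) differs, which is the point of separating policy from mechanism.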
46
Object References
The organization of an IOR with specific information for IIOP binding!
49
CORBA Services
Overview of CORBA services.
Service: Description
Collection: Facilities for grouping objects into lists, queues, sets, etc.
Query: Facilities for querying collections of objects in a declarative manner
Concurrency: Facilities to allow concurrent access to shared objects
Transaction: Flat and nested transactions on method calls over multiple objects
Event: Facilities for asynchronous communication through events
Notification: Advanced facilities for event-based asynchronous communication
Externalization: Facilities for marshaling and unmarshaling of objects
Life cycle: Facilities for creation, deletion, copying, and moving of objects
Licensing: Facilities for attaching a license to an object
Naming: Facilities for systemwide naming of objects
Property: Facilities for associating (attribute, value) pairs with objects
Trading: Facilities to publish and find the services an object has to offer
Persistence: Facilities for persistently storing objects
Relationship: Facilities for expressing relationships between objects
Security: Mechanisms for secure channels, authorization, and auditing
Time: Provides the current time within specified error margins
A key to success for each technology is the set of services and development support it provides!
62
COM Object Model
DCOM
CORBA
The difference between language-defined and binary interfaces.
64
Type Library and Registry
The overall architecture of DCOM.
Type library ~ CORBA interface repository
Registry ~ CORBA implementation repository
78
Java RMI
Integrated into the language (homogeneous)
  interface construct used to define remote objects
  remote interface indicates remotely callable object
Objects are passed between JVMs
  Marshaling = serialization (homogeneous)
rmic = IDL compiler
rmid = daemon listening for incoming RMI calls
rmiregistry ~ directory service (supports binding)
Similar to CORBA but supports only one language
Subtle differences local/remote: cloning (server only), synchronize (proxy only)
Remote object passed by reference (refers to proxy implementation) (implementation handle) (only one language + class loader facility)
Technology Overview
Distributed object-based systems
Components and EAI
Coordination-based systems
WWW, SOA, Grid, P2P
Lecture summary
84
Enterprise Application Integration
Today we are developing tomorrow's legacy systems, which have to be integrated the next day.
Approaches to replace N technologies by one new technology usually end up with N+1 technologies.
Enterprise Application Integration (EAI) is the creation of business solutions by combining applications using common middleware, interfaces, standards, and toolchains.
Integration at presentation, data, or functional layer
85
Legacy Systems
proven functionality
tested code
historical investment
often highly efficient and robust
they work
87
Where can we agree?
There will be no consensus on hardware.
There will be no consensus on operating system.
There will be no consensus on network protocols.
There will be no consensus on programming language.
There must be consensus on interfaces and interoperability (interaction and composition)!
Standards, virtualization, components, services
88
Virtualization in Distributed Systems
(a) A process virtual machine, with multiple instances of (application, runtime) combinations. E.g., JVM.
(b) A virtual machine monitor, with multiple instances of (applications, OS) combinations. E.g., VMware, Xen
93
Why use components?
Concentration on the core competence
Encapsulation of solutions in components
Open for COTS products
Re-use of components
"Avoid developing software – use components" (OMG conference, Vienna 2001)
"Components are for composition" (C. Szyperski)
Components are well-known and proven in other domains
94
Component-based Software Engineering
Components: CBSE and Product Lines
"Buy before build. Reuse before buy" (Fred Brooks, 1975(!))
95
Product Line
Application A Application B
Components of Mercedes E class cars are 70% identical.
Components of Boeing 757 and 767 are 60% identical.
Most effort is integration instead of development!
Quality, time to market, but complexity; re-use
97
Component vs. Object
Component
  Independent development
  Third-party development
  Binary integration
  Component + Component == Component
  Connection through composition
  Heterogeneous
  Black-box re-use
Object
  Banana-gorilla problem
  Local development
  Source integration
  Object + Object != Object
  Connection through references (homogeneous)
  White-box re-use through inheritance
[Figure labels: Class, Component]
98
Component Granularity
[Figure: x-axis "Proportion of application addressed by component" (0 to 100%), with granularity levels routine/operation, program/class, coarse-grained component, application package; curves "cost-effectiveness of use and reuse" and "match to requirements, flexibility/speed of change".]
101
Component lifecycle
Component repository
Composition environment
Run-time environment
Component model
Component-based architecture for field devices
110
Enterprise Java Beans: Goals
EJB as integration technology
The following concepts are most relevant:
  EJB – Enterprise Java Beans
  RMI – Remote Method Invocation
  JNDI – Java Naming and Directory Interface
  JMS – Java Message Service
  JDBC – access to databases
  SQLJ – static embedded SQL for Java
  JDO – Java Data Objects
  J2EE Connector (for legacy integration)
  JTS/JTA – Java Transaction Service and Java Transaction API
111
EJB Architecture
[Figure: three tiers. Tier 1: client. Tier 2: web container (JSP file, servlet) and EJB server with an EJB container hosting the enterprise beans. Tier 3: database server.]
EJB server: run-time environment for containers, thread management, OS resources, load balancing, directory service, ...
EJB container: run-time environment for beans, life-cycle management, instance pooling, distribution, service interfaces (standard and proprietary), e.g. user administration, ...
Services: JNDI, JTS, Persistence, JMS, Security Policy, ...
113
EJB Roles
[Figure: roles around the EJB server and EJB container: EJB provider (business logic), application assembler, EJB deployer (installs EJBs using server tools), EJB container provider, EJB server provider, and client (uses client contracts); the EJB server accesses the DB.]
Technology Overview
Distributed object-based systems
Components and EAI
Coordination-based systems
WWW, SOA, Grid, P2P
Lecture summary
129
Coordination-based systems
To achieve highly scalable and open distributed systems, components must be loosely coupled
Explicit communication and coordination of components/activities (instead of transparency)
Event-based systems (also called publish/subscribe) attempt to achieve loose coupling through event communication (notifications)
Key concept: loose coupling – separation between computation and coordination
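A minimal sketch of subject-based publish/subscribe may help make the loose coupling concrete. It is an in-process illustration only; `EventBus` and the subject names are hypothetical, and a real event service would add network transport, persistence, and filtering.

```python
from collections import defaultdict


class EventBus:
    """Subject-based publish/subscribe: publishers never reference subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, subject, callback):
        self._subscribers[subject].append(callback)

    def publish(self, subject, event):
        # Copy the list so that callbacks may subscribe further handlers safely.
        for callback in list(self._subscribers[subject]):
            callback(event)


bus = EventBus()
received = []
bus.subscribe("stock/ACME", received.append)
bus.subscribe("stock/ACME", lambda event: received.append(("audit", event)))
bus.publish("stock/ACME", {"price": 42})
bus.publish("stock/OTHER", {"price": 1})  # no subscriber, silently dropped
```

The publisher only names a subject, so subscribers can be added or removed without touching it (referential uncoupling). Delivery here is still synchronous, so this sketch is temporally coupled.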
145
Principles of JavaSpaces™
Temporal and referential uncoupling
Tuples with serialized Java objects: rich typing (type-safe), methods, subtypes
Entries are leased
Transactions
Persistent or transient (durability)
Operations:
  write
  read (blocking), readIfExists (non-blocking)
  take (blocking), takeIfExists (non-blocking)
  notify
read() is based on template matching
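The operations listed on this slide can be sketched with a small in-memory tuple space. This is not the JavaSpaces API: entries are plain Python tuples, `None` in a template acts as a wildcard, and leases, transactions, and notify are omitted. The class and method names are chosen for the illustration.

```python
import threading


class TupleSpace:
    """In-memory tuple space with template matching (None matches any field)."""

    def __init__(self):
        self._entries = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template, entry):
        return len(template) == len(entry) and all(
            t is None or t == e for t, e in zip(template, entry)
        )

    def _find(self, template):
        for entry in self._entries:
            if self._matches(template, entry):
                return entry
        return None

    def write(self, entry):
        with self._cond:
            self._entries.append(tuple(entry))
            self._cond.notify_all()  # wake up blocked takers

    def read_if_exists(self, template):
        with self._cond:
            return self._find(template)

    def take_if_exists(self, template):
        with self._cond:
            entry = self._find(template)
            if entry is not None:
                self._entries.remove(entry)
            return entry

    def take(self, template, timeout=None):
        """Blocks until a matching entry exists; returns None on timeout."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                return None
            entry = self._find(template)
            self._entries.remove(entry)
            return entry


space = TupleSpace()
space.write(("job", 1, "resize"))
peeked = space.read_if_exists(("job", None, None))  # entry stays in the space
taken = space.take(("job", None, "resize"))         # entry is removed
leftover = space.read_if_exists(("job", None, None))
```

Writer and taker never refer to each other and need not run at the same time, which is the temporal and referential uncoupling named above.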
146
Overview of JavaSpaces
The general organization of a JavaSpace in Jini.
158
Distributed Shared Memory
Abstraction for sharing data (without actually sharing physical memory)
Effective for parallel applications
Less effective for client/server applications
DSM runtime support performs message passing and replication/caching
Processors actually sharing memory: groups of 4, practical limit is 10, 64 with hierarchy
Distributed memory multiprocessors and clusters (high-speed network) typically scale better
Scalability of DSM is a well-known problem (similar to replication: consistency is relaxed)
160
Message passing vs. DSM
Message passing: marshalling, process protection, heterogeneous, synchronization through messages (agreement problem), programmer is aware of communication costs
DSM: direct sharing (even pointers!), processes may interfere, homogeneous, synchronization through locks and semaphores, programmer may be unaware of communication costs
Efficiency depends on patterns of data sharing
Message passing cannot be avoided altogether in a distributed system!
Technology Overview
Distributed object-based systems
Components and EAI
Coordination-based systems
WWW, SOA, Grid, P2P
Lecture summary
171
The World Wide Web
Key: Document (A), HTTP, URL, loose coupling
176
Architectural Overview
The principle of using server-side CGI programs.
200
Web Services and SOA – motivation
EAI – Enterprise Application Integration (MoM) (note: was an argument for CBSE as well)
WfMS – Workflow Management Systems, BPEL
CBSE – Components are not obsolete! SOA provides a virtual component model
WWW – Loose coupling: heterogeneous, flexible, and dynamic orchestration
Re-use (note: was an argument for CBSE, middleware, ...)
Interface management (note: ditto)
Business integration ("business goals with IT")
202
Virtualizing Components
[Figure: a virtual component (Web service) is implemented by concrete components on different platforms: StP (DBMS), assembly (.NET), (E)JB (J2EE), ...]
203
Web Services – introduction
Effectiveness of simple protocols
Complex applications with service integration
Web service != web server
Data representation and marshalling: XML
SOAP protocol: how to package messages
WSDL: service description ("IDL")
UDDI: naming and discovery (did not work)
XML Security: documents signed or encrypted
Coordination through explicit protocols
204
Organizing Into A Platform
[Figure: Web services platform stack. Transport: transports. Messaging: XML, non-XML. Description: interface + bindings, policy. Quality of Service: security, reliable messaging, transactions. Components: atomic, composite; choreography, protocols, state. Spanning the layers: discovery, negotiation, agreement.]
205
The Bus And Standards
[Figure: the same platform stack annotated with standards. Transport: HTTP, TCP/IP, SMTP, FTP, … Messaging: SOAP, WS-Addressing (XML); JMS, RMI/IIOP, .. (non-XML). Description: WSDL* (interface + bindings), WS-Policy* (policy). Quality of Service: WS-Security* (security), WS-RM (reliable messaging), WS-AT, WS-BA, … (transactions). Components: BPEL, WS-C, WS-N*, …, WS-RF (atomic, composite, choreography, grouping, state). Discovery, negotiation, agreement: UDDI, WS-Addressing, MEX, …]
209
Combinations of web services
[Figure: a client uses a travel agent service, which combines the services flight booking a/b, hotel booking a/b, and hire car booking a/b.]
Value-added services from third parties provide new functionality
210
Web Services – principles (1)
Interface offers a collection of operations, provided by a variety of different resources (programs, objects, databases, ...)
Messages
  XML-formatted SOAP messages call operations of interfaces
  REST (representational state transfer): URLs and HTTP messages used to manipulate data resources
Amazon, Google, eBay, ... offer web services to manipulate their web resources (e.g. procurement application @Amazon, 'sniping' @eBay)
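How an XML-formatted SOAP message carries an operation call can be sketched with the standard library. The envelope namespace is the SOAP 1.1 one; the application namespace `http://example.org/travel`, the `bookFlight` operation, and its parameters are hypothetical. Headers, faults, and the HTTP transport are left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/travel"  # hypothetical application namespace


def build_request(operation, params):
    """Packages an operation call as Envelope/Body/operation/parameters."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Extracts the operation name and its parameters from an envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body
    operation = call.tag.split("}", 1)[1]          # strip the namespace
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("bookFlight", {"origin": "VIE", "destination": "LHR"})
operation, params = parse_request(message)
```

In the REST style the same request would instead be expressed as an HTTP method applied to a resource URL, without an envelope around it.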
211
Web Services – principles (2)
Communication patterns
  asynchronous exchange of documents
  synchronous request/reply
  event/notification also available
No particular programming model
  no remote object reference
  no garbage collection
XML representation:
  more space (human readable?), binary versions available
  more time to process
212
Web Services – principles (3)
Service references: URL (URI)
Activation of services:
  continuous operation
  activation on demand
Transparency: none, which need not be bad
However, for convenience, handling of XML and SOAP is hidden by APIs and/or tools.
  Proxies vs. dynamic invocation
  Conversion to SOAP/XML static or on-the-fly
235
What is BPEL
A language to specify the behavior of business processes
  Between Web services...
  ...and as Web services
Same language to define executable processes and business protocols
Executable processes
  Can be performed in all compliant environments (portability)
  Interoperability between heterogeneous environments
Abstract processes
  Specify constraints of message exchange
  Are "views" on internal processes
Combination of graph-based language (IBM WSFL) and calculus-based language (Microsoft XLANG)
244
Grid – motivation
Middleware to enable the sharing of resources on a large scale (mainly data or computer power for data-intensive applications).
Management coordinates the use of resources.
245
Heterogeneous Resources
Distributed physical clusters and storage
246
Virtual clusters and storage
The Grid: Virtualizing Resources
Grid Middleware
Service "Bus" as Grid middleware
250
Cloud Computing
Computing power as a configurable, pay-per-use service
252
Peer to peer (P2P) – aims
Enable large (global) scale by eliminating (centrally) managed servers and infrastructures (administration and fault recovery cost, bandwidth bottleneck).
Build a reliable resource sharing layer over an unreliable and untrusted collection of (unpredictable) nodes (probabilistic).
Exploit available resources and construct applications that are scalable, reliable, and secure.
Data and computational resources are contributed by many hosts (nodes) in an unmanaged way to participate in the provision of a uniform service.
253
Peer to peer (P2P) – challenges
All nodes have the same functional capabilities and responsibilities
Key problem: placement of data objects and subsequent provision for access, while balancing workload and availability.
Algorithms for placement and retrieval are a key aspect:
  decentralized
  self-organising
  self-balancing (storage and processing load)
Most effective with immutable data (file sharing, web caching, ...).
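One widely used placement technique is consistent hashing, sketched below. The slides do not name a specific algorithm, so this is a swapped-in example rather than the lecture's method; `HashRing` and the node and file names are hypothetical, and replication and virtual nodes are omitted.

```python
import bisect
import hashlib


def _position(key):
    """Maps a string to a point on the ring (first 64 bits of its SHA-1 hash)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Each key is stored on the first node at or after its ring position."""

    def __init__(self, nodes=()):
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        bisect.insort(self._ring, (_position(node), node))

    def remove(self, node):
        self._ring.remove((_position(node), node))

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[index % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"file-{i}" for i in range(100)]
before = {key: ring.lookup(key) for key in keys}
ring.remove("node-b")  # a node leaves the overlay
after = {key: ring.lookup(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Only the keys that lived on the departed node are relocated, and any node can compute a placement locally, which fits the decentralized and self-organising requirements above.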
255
Why P2P - from research point of view
Challenges for future Internet applications
  Scalability (nodes, users)
  Dynamics (mobility, QoS-aware flexibility)
  Heterogeneity (scopes of control)
  Security (anonymity, censorship, availability)
  Dependability (and performance)
Centralized systems
  single point of failure
  single point of attack
  bottleneck
256
Why P2P - Reality
267
Other systems
Mobile and pervasive (ubiquitous) systems:
  wireless connectivity of portable devices
  device miniaturization and integration of computing devices with our everyday physical world
  deal with frequent change!
Distributed multimedia
  Continuous streams of data in real-time (video, VoIP, ...)
  timely processing and delivery
  Flow specifications and QoS contracts/management
  Voice-data convergence and service integration
Technology Overview
Distributed object-based systems
Components and EAI
Coordination-based systems
WWW, SOA, Grid, P2P
Lecture summary
273
Design goals in distributed systems
Resource sharing (collaborative, competitive)
Transparency
  Hiding internal structure, complexity
Openness
  Portability, interoperability, ...
  Services provided by standard rules
  Separating policy from mechanism
Scalability
  Ability to expand the system easily
Concurrency
  inherently parallel (not just simulated)
Fault Tolerance (FT), availability
277
Dealing with complexity
Abstraction (and modeling)
  Client, server, service
  Interface versus implementation
Information hiding (encapsulation)
  Interface design
Separation of concerns
  Layering (filesystem example: bytes, disk blocks, files)
  Client and server
  Components (granularity issues)
  Policy vs. mechanism
278
Communication models
Multiprocessors: shared memory
Multicomputers: message passing
Synchronization in shared memory:
  Semaphores (atomic mutex variable)
  Monitors: an abstract data type whose operations may be invoked by concurrent threads; different invocations are synchronized
Synchronization in multicomputers: blocking in message passing
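The monitor idea can be sketched as a bounded buffer whose operations are synchronized by one lock and two condition variables. The class name `BoundedBuffer` and the capacity are chosen for the illustration; Python's `queue.Queue` provides the same behaviour ready-made.

```python
import threading


class BoundedBuffer:
    """Monitor: all operations hold the same lock, so invocations never interleave."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        lock = threading.Lock()
        self._not_full = threading.Condition(lock)   # both conditions share one lock
        self._not_empty = threading.Condition(lock)

    def put(self, item):
        with self._not_full:
            while len(self._items) >= self._capacity:
                self._not_full.wait()   # releases the lock while waiting
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.pop(0)
            self._not_full.notify()
            return item


buffer = BoundedBuffer(capacity=2)
producer = threading.Thread(target=lambda: [buffer.put(i) for i in range(5)])
producer.start()
consumed = [buffer.get() for _ in range(5)]
producer.join()
```

The producer blocks whenever two items are pending, and the consumer blocks on an empty buffer; in a multicomputer the same coordination would come from blocking send and receive operations instead of shared state.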
279
Architectural Styles
Important styles of architecture for distributed systems
Layered architectures (OSI)
Object-based architectures (and components)
Data-centered architectures (file based, database, resourceful WS, ...)
Event-based architectures
... and combinations thereof
280
The 8 Fallacies of Distributed Computing
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
Essentially everyone, when they first build a distributed application, makes the above eight assumptions. All prove to be false in the long run and all cause big trouble and painful learning experiences. (Peter Deutsch)
281
Concepts of distributed systems
Communication
Concurrency and operating system support (competitive, cooperative)
Naming and discovery
Synchronization and agreement
Consistency and replication
Fault-tolerance
Security
282
Communication
Communication is the distinguishing characteristic of distributed applications
Communication mechanisms may be explicit or implicit
Different models of communication exist
  Synchronization and persistence
Discrete and continuous media have distinct communication requirements
Remote procedures, distributed objects, message queues, and streams are just four types of abstractions that can be used
283
Operating System Support
Concurrency is naturally present in a distributed system and needs operating system support
Concurrency may be exploited in several ways in distributed systems:
  To improve performance by hiding delays due to blocking
  To structure high-performance servers
  To structure clients that hide server replication
Other paradigms with different operating system support
  code migration
  virtualization
284
Naming and discovery
Names are organized in name spaces, implemented in hierarchies and layers
A naming service provides the mapping (resolution): name → attribute (typically address)
Consistency of a distributed name service depends on the update algorithms used
Caching and replication increase performance/availability
A directory service provides a way to structure a name space according to attributes
A discovery service supports ad hoc networks, dynamics, and large scale (e.g., P2P)
Mobility is supported by location services
Distributed garbage collection is challenging
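Name resolution in a hierarchical name space can be sketched as a walk through nested contexts, one label at a time. `NameContext`, the path syntax, and the bound addresses (taken from the documentation address range 192.0.2.0/24) are hypothetical; distribution across servers, caching, and replication are left out.

```python
class NameContext:
    """One node of a hierarchical name space: label -> address or sub-context."""

    def __init__(self):
        self.bindings = {}

    def bind(self, path, address):
        *parents, leaf = path.strip("/").split("/")
        context = self
        for label in parents:
            context = context.bindings.setdefault(label, NameContext())
            if not isinstance(context, NameContext):
                raise ValueError(f"{label!r} is already bound to an address")
        context.bindings[leaf] = address

    def resolve(self, path):
        """Resolves the path label by label, starting at this context."""
        node = self
        for label in path.strip("/").split("/"):
            if not isinstance(node, NameContext) or label not in node.bindings:
                raise KeyError(path)
            node = node.bindings[label]
        return node


root = NameContext()
root.bind("/at/tuwien/infosys/www", "192.0.2.10:80")
root.bind("/at/tuwien/infosys/mail", "192.0.2.25:25")
address = root.resolve("/at/tuwien/infosys/www")
```

In a distributed naming service each context may live on a different server, so every step of the loop can turn into a remote lookup, which is where caching and replication pay off.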
285
Synchronization and agreement
Distributed processes need to synchronize their actions to ensure cooperation or fair competition
Lack of a global clock makes synchronization difficult
Often, ordering is enough: logical clocks and vector timestamps reduce the cost of synchronization
Distributed agreement algorithms are required when processes need to coordinate their actions.
Mutex, election, global state, ...
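Ordering without a global clock can be sketched with vector timestamps. The sketch assumes a fixed, known number of processes and an in-process hand-over of the timestamp in place of a real message; the class and function names are chosen for the illustration.

```python
class VectorClock:
    """Vector clock for one process in a fixed group of n_processes."""

    def __init__(self, process_id, n_processes):
        self.pid = process_id
        self.clock = [0] * n_processes

    def tick(self):
        """Local event: increment the own component."""
        self.clock[self.pid] += 1
        return list(self.clock)

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, timestamp):
        """Merge component-wise, then count the receive as a local event."""
        self.clock = [max(a, b) for a, b in zip(self.clock, timestamp)]
        self.clock[self.pid] += 1
        return list(self.clock)


def happened_before(a, b):
    """a -> b iff a <= b in every component and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
sent = p0.send()              # [1, 0]
local = p1.tick()             # [0, 1], concurrent with the send
received = p1.receive(sent)   # [1, 2]
```

Here `sent` happened before `received`, while `sent` and `local` are concurrent: neither timestamp dominates the other, which a single Lamport counter could not reveal.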
286
Consistency and replication
Replication can help to achieve better performance and fault tolerance
Chosen replication protocol depends on different parameters:
  consistency requirements, read/write ratio, number of clients, etc.
  consistency model
  update propagation methods
Most important protocols:
  Primary-backup replication
  Coordinator-cohort/update-everywhere replication
  Active replication
  Quorum-based protocols
  Epidemic protocols
Need to be adapted to the domain: distributed object system, file system, database system, service-oriented system, P2P system, etc.
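A quorum-based protocol can be sketched with versioned replicas and the usual overlap conditions R + W > N and 2W > N. The sketch is a single-process simulation: `QuorumStore` is a hypothetical name, the `reachable` lists stand in for whichever replicas answered, and concurrency and failure detection are ignored.

```python
class QuorumStore:
    """N replicas, each holding a (version, value) pair."""

    def __init__(self, n, r, w):
        if r + w <= n:
            raise ValueError("R + W must exceed N so read and write quorums overlap")
        if 2 * w <= n:
            raise ValueError("2W must exceed N so two write quorums overlap")
        self.replicas = [(0, None) for _ in range(n)]
        self.r, self.w = r, w

    def write(self, value, reachable):
        if len(reachable) < self.w:
            raise RuntimeError("write quorum not available")
        # The write quorum overlaps the previous one, so it sees the latest version.
        version = max(self.replicas[i][0] for i in reachable) + 1
        for i in reachable:
            self.replicas[i] = (version, value)

    def read(self, reachable):
        if len(reachable) < self.r:
            raise RuntimeError("read quorum not available")
        # At least one replica in the read quorum holds the latest write.
        _, value = max((self.replicas[i] for i in reachable), key=lambda pair: pair[0])
        return value


store = QuorumStore(n=5, r=2, w=4)
store.write("x=1", reachable=[0, 1, 2, 3])
first = store.read(reachable=[3, 4])    # replica 4 is stale, replica 3 is current
store.write("x=2", reachable=[1, 2, 3, 4])
second = store.read(reachable=[0, 1])   # replica 0 is stale, replica 1 is current
```

Choosing a small R and a large W favours reads, and the opposite choice favours writes, which is how the read/write ratio mentioned above enters the protocol choice.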
287
Dependability and fault tolerance
Dependability is a holistic concept
Distributed systems can suffer partial failures
Distributed systems can provide fault-tolerance
Faults can be due to process failures or communication failures
Process replication (process groups) can help deal with process failures
Reliable communication can be built on top of unreliable communication mechanisms
Lost-reply problem has to be dealt with in client/server architectures
A reliable multicast (group communication) is in many cases necessary for providing fault-tolerant distributed algorithms
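One common way to deal with the lost-reply problem is to tag each request with an id and let the server cache its replies, so that a retransmitted request is answered without being executed twice. The sketch below simulates this in one process; `Server`, `LossyChannel`, and `deposit` are hypothetical names, and the channel loses replies deterministically to keep the example reproducible.

```python
class Server:
    """Executes each request id at most once and caches the reply."""

    def __init__(self):
        self.balance = 0
        self.executions = 0
        self.reply_cache = {}  # request id -> reply

    def handle(self, request_id, amount):
        if request_id in self.reply_cache:
            return self.reply_cache[request_id]  # duplicate: resend the old reply
        self.balance += amount
        self.executions += 1
        self.reply_cache[request_id] = self.balance
        return self.balance


class LossyChannel:
    """Delivers every request but loses the first `lost_replies` replies."""

    def __init__(self, server, lost_replies):
        self.server = server
        self.lost_replies = lost_replies

    def call(self, request_id, amount):
        reply = self.server.handle(request_id, amount)  # the server did execute
        if self.lost_replies > 0:
            self.lost_replies -= 1
            raise TimeoutError("reply lost")
        return reply


def deposit(channel, request_id, amount, retries=3):
    """Client side: retransmit with the same request id until a reply arrives."""
    for _ in range(retries):
        try:
            return channel.call(request_id, amount)
        except TimeoutError:
            continue
    raise TimeoutError("giving up")


server = Server()
channel = LossyChannel(server, lost_replies=1)
balance = deposit(channel, "req-1", 100)
```

Without the reply cache the retransmission would deposit the amount twice, because the client cannot tell a lost request from a lost reply.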
288
Security
Demand for security is (unfortunately) obvious: e-banking, e-government, online auctions, etc.
Security services are necessary to protect communications and transactions in open networks
Security can be provided by secure channels and authorization services
Authorization requires authentication and access control
Encryption is used for secure communication
Public key and secret key cryptography can be used for authentication (e.g. digital signatures)
Distribution of encryption keys must be managed by a trusted third party or out-of-band communication.
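Secret-key message authentication can be sketched with an HMAC from the standard library. Note that this is not a digital signature: both parties hold the same key, so the tag proves integrity and origin only between them. The key and the message are made-up values, and the key is assumed to have been distributed out of band.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Computes an authentication tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recomputes the tag and compares in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"hypothetical key distributed out of band"
message = b"transfer 100 EUR to account 42"
tag = sign(shared_key, message)

genuine = verify(shared_key, message, tag)
tampered = verify(shared_key, b"transfer 900 EUR to account 42", tag)
```

A public-key signature would additionally let third parties verify the message without being able to forge it, at the price of managing key pairs and certificates.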
292
Concepts, Paradigms, Technologies
Concepts: Communication; Processes & Concurrency; Naming and Discovery; Coordination & Agreement; Replication & Consistency; Dependability and FT; Transactions; Security; (Persistency, Durability)
Technologies: CORBA, J2EE, COM+, .NET, Web Services, Grid, P2P, Pervasive, Multimedia, WWW, Databases, VoIP, ...
How is naming and discovery realized in COM+ technology?
Systems Engineering
295
Future Prospects?
SOA and Web services (standards-based)
Grid and cloud computing (aggregation of nodes)
Mobility (portable devices) and ad-hoc (MANET)
Voice-data convergence and service integration
Pervasive/ambient/ubiquitous computing (billions of nodes and new kinds of applications)
Ultra large scale systems (complexity, emerging behaviour)
Adaptive (self-*, autonomous) systems
Bio-inspired methods
306
Follow-up lectures
Advanced Distributed Systems
Dependable Systems
Adaptive Systems
Autonomic and Bio-inspired Systems
Application examples: MMOG
Technologien Verteilter Systeme (Technologies of Distributed Systems)
Software Architekturen (Software Architectures)
Entwurfsmethoden für Verteilte Systeme (Design Methods for Distributed Systems)
More lectures: http://www.infosys.tuwien.ac.at/teaching/
307
Interested? Join our national and international projects, where research meets industry!
Internship (Praktikum), diploma thesis (Diplomarbeit), dissertation
http://www.infosys.tuwien.ac.at/
http://www.infosys.tuwien.ac.at/staff/kmg/
http://www.dedisys.org/
http://www.dedisys.org/trade/