A Fault Tolerance Concept for Distributed OSGi Applications - Fabian Meyer
DESCRIPTION
Computer systems are becoming increasingly complex, which makes it ever harder to ensure their correct operation and to correct errors promptly. Because OSGi-based distributed applications are increasingly widely used, this work focuses on fault tolerance for them. The designed concept increases the reliability of such applications while remaining transparent: it does not interfere with their normal operation. A service that has been made fault tolerant using the developed concept does not differ from any other OSGi service and can be used in exactly the same way. To achieve fault tolerance, redundant instances of the service are distributed among several nodes. Each replica is given a role, either active or passive. Active replicas process service calls; passive replicas take their place if they fail. The number of replicas and their roles can be configured per service. A proxy consolidates service calls and relays them to the corresponding service instances. It analyzes every call and its result, which allows both hardware and software faults to be tolerated. The concept is designed to use only standardized OSGi interfaces and procedures. To identify the replicas of a service and the framework each one runs on, the RemoteServiceAdmin from the OSGi Enterprise Specification is used; it allows an imported service to be mapped to its origin. The described concept is available for free from the web server of the distributed systems lab of the Hochschule RheinMain.
TRANSCRIPT
COPYRIGHT © 2008-2011 OSGi Alliance. All Rights Reserved
A Fault Tolerance Concept for Distributed OSGi Applications
Patrick Deuster, Fabian Meyer, Reinhold Kröger
RheinMain University of Applied Sciences
September 21, 2011
OSGi Alliance Community Event 2011
Agenda
• Motivation
• Related Work
• Approach
• Evaluation
• Conclusion
Motivation
• OSGi is a commonly used service platform
• Distribution supported by the Remote Services specification in R4.2
• Remote Services can be used as a basis for a fault tolerance concept
• Based on Patrick Deuster’s M.Sc. thesis [1] at Wiesbaden University of Applied Sciences
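As a brief illustration of the Remote Services mechanism mentioned above: a service is marked for remote export through a standard service property. The sketch below only builds that property set; the class and method names are hypothetical, and the registration itself is left out so the example runs without an OSGi framework on the classpath.

```java
import java.util.Dictionary;
import java.util.Hashtable;

/** Hypothetical helper: builds the properties that mark a service for remote export. */
final class RemoteExportProperties {

    // Standard property key defined by the OSGi Remote Services specification.
    static final String EXPORTED_INTERFACES = "service.exported.interfaces";

    /**
     * Returns properties asking the distribution provider to export every
     * interface under which the service is registered ("*").
     */
    static Dictionary<String, Object> exportAllInterfaces() {
        Dictionary<String, Object> props = new Hashtable<>();
        props.put(EXPORTED_INTERFACES, "*");
        return props;
    }
}
```

In a bundle, this dictionary would be passed as the properties argument of `BundleContext.registerService`; a distribution provider such as the one used in the evaluation could then make the service available to other frameworks.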
Requirements
• No modification of the underlying OSGi framework implementation
• Transparency for service consumers
• Administration interface
• Configurable redundancy mechanisms
• Synchronization of replicas
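The requirement for configurable redundancy can be pictured as a small planning step: given the available nodes and the configured number of active and passive replicas, decide which node hosts which role. The following is a minimal sketch under that assumption; `ReplicaPlanner`, `Role`, and the first-come assignment order are hypothetical and not taken from the presented implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: assigns active/passive roles to the replicas of one service. */
final class ReplicaPlanner {

    enum Role { ACTIVE, PASSIVE }

    /**
     * Gives the first {@code active} nodes the ACTIVE role and the next
     * {@code passive} nodes the PASSIVE role; remaining nodes host no replica.
     */
    static Map<String, Role> assignRoles(List<String> nodes, int active, int passive) {
        if (active < 1 || passive < 0) {
            throw new IllegalArgumentException("need at least one active replica");
        }
        if (active + passive > nodes.size()) {
            throw new IllegalArgumentException("not enough nodes for the requested replicas");
        }
        // LinkedHashMap keeps the node order, so the plan is easy to read back.
        Map<String, Role> plan = new LinkedHashMap<>();
        for (int i = 0; i < active + passive; i++) {
            plan.put(nodes.get(i), i < active ? Role.ACTIVE : Role.PASSIVE);
        }
        return plan;
    }
}
```

For example, `ReplicaPlanner.assignRoles(Arrays.asList("node-a", "node-b", "node-c"), 2, 1)` would describe a "2 Active / 1 Passive" configuration.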
Related Work
• DR-OSGi [2]
  • Aspect oriented
  • Caching, redundancy
  • Major disadvantage: Can only be bound to local replica
• Towards reliable OSGi framework and applications [3]
  • Proxy layer
  • Service call forwarding to replicas
  • Major disadvantage: Modification of the OSGi framework implementation
• FT-OSGi [4]
  • Proxy layer
  • Configuration and synchronisation
  • Major disadvantage: No transparency for service consumers
Approach
• Redundant service instances
• Active and passive replicas
• Proxy layer
• Service call forwarding from proxy to service replicas
• Flexible result evaluation in proxy to determine reply to caller
• OSGi Remote Services specification used
• Apache ZooKeeper used for group communication to establish consistent views
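The proxy-related points above can be sketched with a plain Java dynamic proxy. This is an illustration under stated assumptions, not the authors' implementation: `FaultTolerantProxy` is a hypothetical name, replicas are local objects rather than imported remote services, any exception thrown by a replica is treated as a failure, and majority voting stands in for the "flexible result evaluation" (the slides do not say which strategies are offered).

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a proxy that forwards calls to replicas and evaluates results. */
final class FaultTolerantProxy<T> implements InvocationHandler {

    private final List<T> active;   // replicas that process every call
    private final List<T> passive;  // standbys, promoted when an active replica fails

    private FaultTolerantProxy(List<T> active, List<T> passive) {
        this.active = new ArrayList<>(active);
        this.passive = new ArrayList<>(passive);
    }

    /** Creates an object that consumers use exactly like the plain service interface. */
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> iface, List<T> active, List<T> passive) {
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[] {iface}, new FaultTolerantProxy<>(active, passive));
    }

    @Override
    public synchronized Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        Map<Object, Integer> votes = new HashMap<>();
        int i = 0;
        while (i < active.size()) {
            try {
                Object result = method.invoke(active.get(i), args);
                votes.merge(result, 1, Integer::sum);
                i++;
            } catch (InvocationTargetException e) {
                // Assumption: any exception from a replica counts as a failure.
                // Drop the replica and promote a passive one into its slot, if any;
                // the loop then retries the same index.
                active.remove(i);
                if (!passive.isEmpty()) {
                    active.add(i, passive.remove(0));
                }
            }
        }
        if (votes.isEmpty()) {
            throw new IllegalStateException("no replica answered");
        }
        // Result evaluation: reply with the most frequent result (ties are arbitrary).
        Object best = null;
        int bestCount = 0;
        for (Map.Entry<Object, Integer> entry : votes.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```

Because the caller only sees the service interface, the redundancy stays hidden from consumers. A real proxy would additionally need to distinguish communication faults from application exceptions and to keep its replica view consistent across nodes, which is the role the slides assign to ZooKeeper.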
Architecture
Configuration Setup
Runtime
Performance Evaluation (1/2)
• Call time (2000 service calls)
Configuration           Avg (ms)   Min (ms)   Max (ms)   Std. Dev. (ms)
No Fault Tolerance           2.4          1         26              1.3
1 Active / 1 Passive         7.1          4         64              3.6
2 Active / 1 Passive         8.2          4         61              5.6
• Reconfiguration time (50 failovers)
Type                          Avg (ms)   Min (ms)   Max (ms)   Std. Dev. (ms)
Service Instance Failover         11.1          1         30              7.0
Framework Instance Failover     3881.9       3013       5391            687.7
Setup:
CPU: Intel Core 2 6320
Phys. Mem.: 3 GB
Java: JDK 1.6
OSGi-FW.: Eclipse Equinox 3.6
RS-Impl.: Eclipse Communication Framework
Performance Evaluation (2/2)
[Figure not preserved; surviving labels: 4.1 ms, 4.1 ms]
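The summary columns used in the performance tables (average, minimum, maximum, standard deviation) could be derived from raw call durations roughly as follows. This is a sketch only: `CallTimeStats` is a hypothetical name, and the slides do not state whether the sample or the population standard deviation was reported; the sample form (division by n - 1) is assumed here.

```java
/** Hypothetical sketch: summary statistics over measured durations in milliseconds. */
final class CallTimeStats {

    final double avg;
    final double min;
    final double max;
    final double stdDev;

    private CallTimeStats(double avg, double min, double max, double stdDev) {
        this.avg = avg;
        this.min = min;
        this.max = max;
        this.stdDev = stdDev;
    }

    /** Computes the statistics; at least two samples are needed for the sample deviation. */
    static CallTimeStats of(double[] millis) {
        if (millis.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double sum = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double m : millis) {
            sum += m;
            min = Math.min(min, m);
            max = Math.max(max, m);
        }
        double avg = sum / millis.length;
        double squaredDiffs = 0.0;
        for (double m : millis) {
            squaredDiffs += (m - avg) * (m - avg);
        }
        // Assumption: sample standard deviation, i.e. division by (n - 1).
        double stdDev = Math.sqrt(squaredDiffs / (millis.length - 1));
        return new CallTimeStats(avg, min, max, stdDev);
    }
}
```

A measurement loop would record one duration per service call (for instance with `System.nanoTime()` before and after the call) and pass the collected values to `CallTimeStats.of`.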
Conclusion
• Design and implementation of a fault tolerance concept for distributed OSGi applications
  • Transparent proxy layer with call forwarding
  • Automated replica distribution
  • Active and passive redundancy
  • Flexible result evaluation in proxy to determine reply to caller
  • State synchronization between replicas
• Future Work:
  • Detailed performance evaluation
  • Utilization in a real application
  • Cooperation with industrial partners
Thank you for your attention!
Questions?
References
[1] Deuster, Patrick: Ein Fehlertoleranzkonzept für verteilte OSGi-Anwendungen. Master's thesis, Wiesbaden University of Applied Sciences (2011).
[2] Kwon, Young-Woo; Tilevich, Eli; Apiwattanapong, Taweesup: DR-OSGi: Hardening Distributed Components with Network Volatility Resiliency. In: Middleware (2009), 1-20.
[3] Ahn, Heejune; Oh, Hyukjun; Sung, Chang O.: Towards reliable OSGi framework and applications. In: Proceedings of the 2006 ACM Symposium on Applied Computing. ACM (SAC '06).
[4] Torrão, Carlos; Carvalho, Nuno A.; Rodrigues, Luís: FT-OSGi: Fault Tolerant Extensions to the OSGi Service Platform. In: Proceedings of the Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009 on On the Move to Meaningful Internet Systems: Part 1. Springer-Verlag (OTM '09).