phd defense slides
DESCRIPTION
Software is moving towards evolutionary architectures that can easily accommodate changes and integrate new functionality. This matters in a wide range of applications, from plugin-based end-user applications to critical applications with high availability requirements. Dynamic component-based platforms allow software to evolve at runtime by allowing components to be loaded and executed without forcing applications to restart. However, the flexibility of such a mechanism requires applications to cope with errors caused by inconsistencies in the update process or by faulty behavior from components introduced during execution. This is especially true when dealing with third-party components, which make it harder to predict the impact (e.g., runtime incompatibilities, application crashes) and to maintain application dependability when integrating such code into the application. Components whose origin or quality attributes are unknown can be considered untrustworthy, since they can introduce faults into applications when combined with other components, even if unintentionally. The quality of components is harder to evaluate when components are combined, especially if the combination happens on the fly. We are interested in reducing the impact of untrustworthy components that are deployed at runtime and could compromise application dependability. This thesis applies techniques that move a step towards dependable dynamic component-based applications by addressing three dependability attributes: reliability, maintainability and availability. We propose the use of strong component isolation boundaries, providing a fault-contained environment for separately running untrustworthy components.
Our solution combines three approaches: (i) the dynamic isolation of components, governed by a runtime reconfigurable policy; (ii) a self-healing component isolation container; and (iii) the use of aspects for separating dependability concerns from functional code.
TRANSCRIPT
Kiev SANTOS DA GAMA Laboratoire d’Informatique de Grenoble
Université de Grenoble
Towards Dependable Dynamic Component-Based Applications
Thesis publicly defended on 6 October 2011, before the jury:
Ms Claudia RONCANCIO, Professor, Ensimag - Grenoble INP, President
Mr Gilles MULLER, Research Director, INRIA, Reviewer
Mr Lionel SEINTURIER, Professor, Institut Univ. de France & Univ. de Lille, Reviewer
Mr Ivica CRNKOVIC, Professor, Mälardalen University, Examiner
Mr Gaël THOMAS, Associate Professor (Maître de Conférences), Univ. Pierre et Marie Curie, Examiner
Mr Didier DONSEZ, Professor, Université Joseph Fourier, Advisor
Mr Peter KRIENS, Technical Director, OSGi Alliance, Invited member
Extensible Applications
Different elements (components) easily pluggable into the application
6000+ extensions at firefox.org
06 October 2011 PhD Defense Kiev Gama
Components from Many Sources
Crash
Whose fault is it?
Who is liable? User/Administrator? Plugin Provider? Platform (i.e. the browser)?
What can be done about it?
Should the whole application pay the price for someone else’s fault?
“A chain is as strong as its weakest link”
“A component system is only as strong as its weakest component” [Szyperski 2002]
Main Question
How can we provide a flexible mechanism for executing untrustworthy components while minimizing risks to the application?
Back to the browsers: Isolation Trend
Fault is contained. Browser remains intact
Limitations
No automatic recovery of faulty plugins
No monitoring for diagnosis and fault avoidance
OK for browsers. What about other contexts?
Critical Applications
Availability > 99%
Unavailability = losses (money, data, lives)
Business-critical: banking, eCommerce
Non-stop systems
Dynamic reconfigurations needed at runtime with minimal system disruption
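To make the availability figure concrete, the sketch below converts an availability ratio into the downtime it tolerates per year. The class and method names are illustrative and not taken from the thesis, and a 365-day year is assumed.

```java
/** Illustrative helper (not from the thesis): yearly downtime allowed by an availability target. */
final class AvailabilityBudget {
    // Assumption: a non-leap year of 365 days.
    private static final double HOURS_PER_YEAR = 365 * 24;

    /** Maximum yearly downtime, in hours, for an availability ratio in [0, 1]. */
    static double allowedDowntimeHours(double availability) {
        if (availability < 0.0 || availability > 1.0) {
            throw new IllegalArgumentException("availability must be between 0 and 1");
        }
        return (1.0 - availability) * HOURS_PER_YEAR;
    }

    public static void main(String[] args) {
        for (double a : new double[] {0.99, 0.999, 0.9999}) {
            System.out.printf("availability %.2f%% -> %.2f hours of downtime per year%n",
                    a * 100, allowedDowntimeHours(a));
        }
    }
}
```

At 99% the budget is roughly 87.6 hours per year, which is why each restart avoided during a reconfiguration counts.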
Dynamic Reconfiguration: Potential source of faults
System
Parts Repository (plugins, components, elements, etc.)
Main Question
How can we provide a flexible mechanism for executing untrustworthy components while minimizing risks to the application in a dynamic environment?
STATE OF THE ART
OBJECTIVES AND PROPOSITIONS
IMPLEMENTATION
VALIDATION
CONCLUSIONS AND PERSPECTIVES
STATE OF THE ART
I. COMPONENTS
II. DEPENDABILITY
III. ISOLATION
Components
Software Component
Component Platform
Component Quality
Software Component
“A component is a static abstraction with plugs” [Nierstrasz 1995]
“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” [Szyperski 2002]
Component Platform
“A platform is the substrate that allows for installation of components … such that these can be instantiated and activated.” [Szyperski 2002]
Component Quality
• “ilities” (reliability, maintainability, usability, etc.)
• Quality attributes difficult to evaluate
– Sometimes subjective
– May involve many subcharacteristics
• Combined components ≠ combined attributes
– Hard to predict or test all possible compositions
– Worse in dynamic platforms
Need to execute untrustworthy components while still ensuring system dependability
Dependability
“the ability to avoid service failures that are more frequent and more severe than is acceptable” [Avizienis 2004]
Dependability involves other attributes (e.g., availability, reliability, maintainability)
Dependability in a changing environment: Resilience
Ability to recover/adjust from changes
Fault Tolerance
Typically implemented through redundancy techniques
Fault containment as a means to reduce fault impact
Types of Fault
• Deterministic
– Programming errors
– Abnormal behavior (intentional or not)
– Reproducible bugs
• Non-deterministic
– Race conditions
– Hardware origin: electric noise, bit flips, cosmic rays
It may happen with trustworthy code
Recovery Mechanisms
Recovery
– Self-healing
– Recovery-oriented Computing
– Autonomic Computing
– Resilient Systems
Isolation
Means of protection from other users (humans, systems, components)
Avoiding harms
– Destroyed/modified data
– Data read without permission
– Degraded service
Privacy
Fault containment
Isolation Techniques
Hardware-enforced
– Process-based
– Virtualization
Software-based
– Application-level domains
– Security Managers
[Figure: isolation diagrams. Labels: Process, Domain, Policy]
Techniques Summary
Technique                   Privacy   Fault Containment
Process-based               Yes       Yes
Virtualization              Yes       Yes
Security Managers           Yes       No
Application-level Domains   Yes       Yes
Component Isolation
• Component Object Model
– In-process
– Out-of-process server
• .NET Platform
– Application Domains
– Security managers
• Java
– Security managers
– Class loaders
– Isolates
Component Isolation Summary
Technique                   Privacy   Fault Containment
COM (in-process)            Yes       No
COM (out-of-process)        Yes       Yes
.NET Application Domains    Yes       Yes
.NET Security Managers      Yes       No
Java Security Managers      Yes       No
Java Class loaders          Yes       No
Java Isolates               Yes       Yes
Limitations of Studied Approaches as Dependable Component Platforms
No automatic recovery from faults
Lack of fault monitoring mechanisms
Decision about isolation is made at design time
OBJECTIVES AND PROPOSITIONS
Vision
Still live with failure
Minimize the impact of untrustworthy components
More dependable dynamic component-based applications
Objectives
Flexible Isolation of Components
Automatic Recovery from Faults
Propositions
Dynamic isolation of components
I. Component Isolation Containers
II. Runtime Reconfigurable Policy
Self-healing Container
I. Continuous Monitoring
II. Automatic Recovery
Example Scenario
RFID Reader
Sensor
RFID Application
Report Generator Data Gathering
Dynamic Isolation of Components
I. Component Isolation Containers
– Component quarantine
– A “sandbox” approach
– Fault confinement
II. Runtime Reconfigurable Policy
– Isolation at runtime (i.e., dynamic)
– Promotion of components
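A minimal sketch of how a runtime reconfigurable policy with promotion might look. Everything here is hypothetical (class name, component names, the rule that unknown components start in quarantine); the slides do not show the actual policy language.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical runtime-reconfigurable isolation policy; names are illustrative only. */
final class IsolationPolicy {
    enum Target { TRUSTED, SANDBOX }

    // Explicit decisions per component name; the map can be changed while the application runs.
    private final Map<String, Target> rules = new ConcurrentHashMap<>();

    /** Decides where a component should run. Assumption: unknown components are quarantined. */
    Target placementOf(String componentName) {
        return rules.getOrDefault(componentName, Target.SANDBOX);
    }

    /** Reconfigures the policy at runtime by promoting a component to the trusted platform. */
    void promote(String componentName) {
        rules.put(componentName, Target.TRUSTED);
    }

    public static void main(String[] args) {
        IsolationPolicy policy = new IsolationPolicy();
        System.out.println(policy.placementOf("reader.b")); // quarantined by default
        policy.promote("reader.b");
        System.out.println(policy.placementOf("reader.b")); // now on the trusted platform
    }
}
```

Keeping the decision in data rather than in component code is what lets isolation be decided at runtime instead of at design time.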
[Figure, four slides: Dynamic Isolation of Components. Labels: Report Generator, Data Gathering, Communication, Sensor X, Sensor Y, Reader A, Reader B.]
I. Component Isolation Containers (labels: Crash, Crash, The fault is contained)
II. Runtime Reconfigurable Policy (labels: New Reader, Check, Persistence; then Change, Apply changed policy, Promoted component)
How Many Sandboxes?
N sandboxes vs. one sandbox
How to group components?
Criteria
• Trustworthiness
– Different levels
• Cohesion
– Same provider
– Similar functionality
• Coupling
– Dependencies
– Intensive communication
Self-Healing Container
I. Continuous Monitoring
– Problem diagnosis
– Observation for future promotion (quarantine period)
II. Automatic Recovery
– Re-established execution
[Figure, three slides: Self-Healing Container. Labels: Report Generator, Data Gathering, Sensor X, Sensor Y, Reader A, Reader B.]
I. Continuous Monitoring (labels: Crash, Crash)
II. Automatic Recovery (label: Recovery)
Summary
Propositions
Dynamic isolation of components
I. Component isolation containers
II. Runtime reconfigurable policy
Self-healing container
I. Continuous monitoring
II. Automatic recovery
Differences from other approaches
– Flexible isolation
– Self-healing isolation container
IMPLEMENTATION
COMPONENT ISOLATION
I. TARGET COMPONENT PLATFORM
II. ISOLATION APPROACH
III. ISOLATION TECHNIQUES USED
IV. RECONFIGURABLE POLICY
SELF-HEALING SANDBOX
I. AUTONOMIC MANAGER
II. FAULT MODEL
Target Component Platform
(un)Installation of components at runtime
Non-stop applications
OSGi
– A module system for Java applications
– Used in industry and academia
Isolation Approach
Approach used for isolating components
Two component platforms:
– Trusted
– Sandbox (quarantine)
Replicated components (for type dependency purposes)
Mutually exclusive states
[Figure: Trusted Platform and Sandbox Platform]
Isolation Approach: Mutually Exclusive States
[Figure. Labels: Virtual Perspective (OSGi); Trusted Platform (MainOSGi); Sandbox Platform (SandboxOSGi), Fault Contained Environment; Bundle A, Bundle B, Bundle C, Bundle D, each shown as STARTED or RESOLVED on each platform. Legend: Trustworthy, Untrustworthy.]
Untrustworthy components are active on the sandbox platform
Trustworthy components are active on the trusted platform
Isolation Approach: Virtual Perspective
[Figure: same diagram as the previous slide, with the Virtual Perspective (OSGi) shown above MainOSGi and SandboxOSGi.]
Actually two running platforms
Impression of having a single application
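The mutually exclusive states and the virtual perspective can be sketched as a small model. This is an illustration under assumptions, not the thesis code: the class, its methods and the two-state enum are hypothetical simplifications of the OSGi bundle life cycle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model of bundles replicated on two platforms; not the thesis implementation. */
final class VirtualPerspective {
    enum State { RESOLVED, STARTED }

    private final Map<String, State> trusted = new LinkedHashMap<>();
    private final Map<String, State> sandbox = new LinkedHashMap<>();

    /** Installs the bundle on both platforms and starts it on exactly one of them. */
    void install(String bundle, boolean trustworthy) {
        trusted.put(bundle, trustworthy ? State.STARTED : State.RESOLVED);
        sandbox.put(bundle, trustworthy ? State.RESOLVED : State.STARTED);
    }

    State trustedStateOf(String bundle) { return trusted.get(bundle); }

    State sandboxStateOf(String bundle) { return sandbox.get(bundle); }

    /** Single-application view: a bundle appears STARTED if either platform runs it. */
    State stateOf(String bundle) {
        if (!trusted.containsKey(bundle)) {
            throw new IllegalArgumentException("unknown bundle: " + bundle);
        }
        boolean running = trusted.get(bundle) == State.STARTED
                || sandbox.get(bundle) == State.STARTED;
        return running ? State.STARTED : State.RESOLVED;
    }

    /** Invariant of the approach: a bundle is never STARTED on both platforms at once. */
    boolean isMutuallyExclusive(String bundle) {
        return !(trusted.get(bundle) == State.STARTED && sandbox.get(bundle) == State.STARTED);
    }
}
```

A caller that only uses `stateOf` sees one application, while the two maps record where each bundle actually executes.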
Isolation Techniques Used
Strong isolation containers with fault containment
– Domain-based (Java Isolates)
– Process-based (Java Virtual Machine)
[Figure. Labels: Process (JVM), Process (JVM); Process (MVM) containing Isolate, Isolate]
Communication between Containers
[Figure: MainOSGi and SandboxOSGi, each hosting Bundles A to D, shown in two configurations]
Domain-based: two Java Isolates in one JVM (MVM), communication via sockets or the Link API (JSR-121)
Process-based: two JVMs, communication via sockets
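As a rough illustration of socket-based communication between the two containers, the sketch below exchanges one request and one reply over a loopback socket. The line-based protocol, class name and method names are assumptions made for the example; the slides only state that sockets (or the Link API) are used. Both sides run in one JVM here purely for demonstration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Illustrative one-line request/reply between two containers; the protocol is hypothetical. */
final class ContainerLink {

    /** Sandbox side: answers a single request on the given server socket, then returns. */
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket peer = server.accept()) {
            BufferedReader in = reader(peer);
            PrintWriter out = writer(peer);
            out.println("reply:" + in.readLine());
        }
    }

    /** Trusted side: sends one request line and waits for the reply line. */
    static String call(int port, String request) throws IOException {
        try (Socket peer = new Socket(InetAddress.getLoopbackAddress(), port)) {
            PrintWriter out = writer(peer);
            BufferedReader in = reader(peer);
            out.println(request);
            return in.readLine();
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // autoFlush = true so println pushes the line onto the socket immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    /** Runs both sides locally on an ephemeral port and returns the reply. */
    static String demo(String request) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sandbox = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sandbox.start();
            String reply = call(server.getLocalPort(), request);
            sandbox.join();
            return reply;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("demo failed", e);
        }
    }
}
```

Because the only shared channel is the socket, a crash on the sandbox side surfaces on the trusted side as an I/O error rather than as a crash of the whole application.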
Reconfigurable Policy
Isolation Policy Model
Self-healing Sandbox
The sandbox with an automatic recovery mechanism
An autonomic manager for the sandbox
– External application
– Control loop using a sense, analyze and react principle
Fault detection and forecast
– Pragmatic approach based on a fault model
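The sense, analyze and react principle can be sketched as a single loop iteration. This is a simplified illustration: the memory metric, the threshold and the action names are assumptions, and the reaction is only recorded rather than performed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

/** Illustrative sense-analyze-react loop; the metric, threshold and actions are hypothetical. */
final class ControlLoop {
    enum Action { NONE, REBOOT_SANDBOX }

    private final DoubleSupplier memoryUsage; // sensor: fraction of sandbox heap in use (0..1)
    private final double memoryLimit;         // assumed policy threshold
    private final List<Action> executed = new ArrayList<>();

    ControlLoop(DoubleSupplier memoryUsage, double memoryLimit) {
        this.memoryUsage = memoryUsage;
        this.memoryLimit = memoryLimit;
    }

    /** One iteration: sense a metric, analyze it against the limit, react if needed. */
    Action step() {
        double sensed = memoryUsage.getAsDouble();              // sense
        Action decision = sensed > memoryLimit                  // analyze
                ? Action.REBOOT_SANDBOX : Action.NONE;
        if (decision != Action.NONE) {
            executed.add(decision);                             // react (recorded only in this sketch)
        }
        return decision;
    }

    /** Actions decided so far, in order. */
    List<Action> executedActions() {
        return executed;
    }
}
```

In the architecture described on the slides the manager is an external application, so a loop like this would keep running even when the sandbox itself hangs or crashes.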
Self-healing Sandbox Architecture
[Figure: component diagram with <<use>> and <<delegate>> relations. Labels: Autonomic Manager (Watchdog, Monitor, Policy Evaluator, Strategy Executor, Script Interpreter, Knowledge); HeartbeatProbe, SensorProbe, EffectorProbe; Monitoring MBean, EffectorMBean; Trusted Platform and Sandbox Platform, each with PlatformProxy, Core and Service Registry; Isolation Policy Eval.]
Self-healing Sandbox Control Loop Details
[Figure: control loop of the Autonomic Manager around the Sandbox. Phases: Monitor (M), Analyze and Plan (AP), Execute (E), Knowledge (K). Labels: Sensors, Effectors, Watchdog, Monitor, Policy Evaluator, Strategy Executor, Knowledge, Script Repository, Sys. Admin.]
Fault Model
Hypotheses of faults
General issues
– Resource consumption (e.g., CPU, memory)
– Crashes (e.g., errors from wrapped native libraries)
Specific dynamism mishandling issues
– Dangling objects (stale services)
[Figure: fault taxonomy. Labels: Faulty Behavior, Resource Usage, Unresponsiveness, Excessive Thread Allocation, CPU, Memory, Stale Service, Denial of Service, Crash, Application Hang]
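To illustrate the stale service case from the fault model: a consumer keeps using a service object after its provider has unregistered it. The toy registry below is a hypothetical stand-in for a real service registry and only shows how such a retained reference could be flagged.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Toy registry (not the OSGi API) used to illustrate a stale, dangling service reference. */
final class StaleServiceCheck {
    // Identity-based set: two distinct service objects are never confused, even if equal.
    private final Set<Object> registered = Collections.newSetFromMap(new IdentityHashMap<>());

    void register(Object service) {
        registered.add(service);
    }

    void unregister(Object service) {
        registered.remove(service);
    }

    /** A retained reference is stale once its service has left the registry. */
    boolean isStale(Object retainedReference) {
        return !registered.contains(retainedReference);
    }
}
```

A monitor could run a check like this over the references each consumer retains and report the consumers that still hold unregistered services.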
Separation of Concerns
Dependability as a crosscutting concern
Aspect-oriented programming approach
All dependability code in aspects
[Figure: the Aspect Weaver combines Application code and Aspects into Woven code]
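The idea of keeping dependability code out of functional code can be approximated without an aspect weaver. The sketch below uses `java.lang.reflect.Proxy` as a stand-in for weaving, which is a different technique from the aspect-oriented approach named on the slide; the `Reader` interface and the fallback value are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Dynamic-proxy stand-in for an aspect: fault handling lives outside the functional code. */
final class DependabilityProxy {
    /** Hypothetical service interface used only for this example. */
    interface Reader {
        String read();
    }

    /**
     * Wraps a service so that exceptions thrown by the target are contained.
     * The fallback must be compatible with the return type of the wrapped methods.
     */
    @SuppressWarnings("unchecked")
    static <T> T guarded(Class<T> type, T target, Object fallback) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                (proxy, method, args) -> {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        // Crosscutting concern: contain the fault instead of propagating it.
                        return fallback;
                    }
                });
    }

    public static void main(String[] args) {
        Reader faulty = () -> { throw new IllegalStateException("driver crash"); };
        Reader safe = guarded(Reader.class, faulty, "unavailable");
        System.out.println(safe.read());
    }
}
```

The functional `Reader` implementation contains no error handling of its own; the containment logic is applied from outside, which is the separation the slide argues for.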
Implementation Summary
Propositions and their implementation
Dynamic isolation of components
I. Component isolation containers: domain-based (Isolates), process-based (multiple JVMs)
II. Runtime reconfigurable policy: DSL
Self-healing container
I. Continuous monitoring and II. Automatic recovery: Autonomic Manager
VALIDATION
EXPERIMENTS USE CASE
DOMAIN-BASED VS. PROCESS-BASED
TEST PLATFORM
SELF-HEALING CONTAINER VALIDATION
Experiments Use Case
Aspire RFID FP7 project
RFID network
– Non-stop servers collecting data
– Plug-and-play devices
– Native code for drivers puts stability at risk
[Figure: RFID network. Labels: RFID Readers + Sensors, Edge, Premise, EPC IS, ONS; RFID Application with RFID Reader and Sensor]
Experiments Use Case: Process-based vs. Domain-based
Criteria
– Memory footprint
– Application startup
– Sandbox reboot time
Virtual machines used
– MVM (Java 1.5)
– Sun/Oracle HotSpot JVM 1.5
– Sun/Oracle HotSpot JVM 1.6
[Figure: three configurations, each with a Trusted Platform and a Sandbox Platform: two JVM 1.5 processes; two Isolates in one MVM (Java 1.5); two JVM 1.6 processes]
Results
[Chart: memory footprint in MB (axis 0 to 90) for MVM (2 Isolates), 2 x JVM 1.5 and 2 x JVM 1.6. Legend: Single JVM (Domain-based), Sandbox, Trusted platform]
Isolation Containers    Application Startup time (ms)   Sandbox Crash detection time (ms)   Sandbox Reboot time (ms)
MVM (Multi-Isolate)     3186                            32                                  303
MVM 1.5 (Multi-JVM)     3449                            697                                 3064
JVM 1.5                 3945                            660                                 3047
JVM 1.6                 3859                            658                                 2537
Mean time to repair on the sandbox is shorter when using Isolates
Footprint of our solution using process-based isolation is equivalent to domain-based isolation
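As a worked example of the repair-time claim, the sketch below adds crash detection time and reboot time for each row of the results table. Treating repair time as the sum of these two columns is an assumption made here for illustration; the slides do not give the formula.

```java
/** Worked arithmetic on the results table; "repair = detection + reboot" is an assumption. */
final class RepairTime {
    /** Assumed repair time in milliseconds: crash detection plus sandbox reboot. */
    static long repairMillis(long detectionMs, long rebootMs) {
        return detectionMs + rebootMs;
    }

    public static void main(String[] args) {
        String[] names = {"MVM (Multi-Isolate)", "MVM 1.5 (Multi-JVM)", "JVM 1.5", "JVM 1.6"};
        long[][] rows = {{32, 303}, {697, 3064}, {660, 3047}, {658, 2537}}; // {detection, reboot}
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + repairMillis(rows[i][0], rows[i][1]) + " ms");
        }
    }
}
```

Under that assumption the Isolate-based sandbox is repaired in 335 ms, against 3195 ms to 3761 ms for the multi-process configurations, roughly a tenfold difference.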
Generic Test Platform
Fault deployment instead of fault injection
– Emulation of erroneous behavior based on our fault model
– Fault injection at the interface level does not represent actual application usage
Management probes for triggering the faults
[Figure: test platform. Labels: JVM with Sandbox OSGi (Report Generator, Sensor X, Sensor Y, Core Interfaces, Sensor Aggregator, Reader Simulator A, Reader Simulator B), TestProbe (x4), MBeanServer, RMI Connector; JVM with Management and Monitoring Console (JConsole, VisualVM)]
Self-healing Container Validation
Fault detection
– Fault model
Event causality
– Heuristic for event correlation
– Updates that trigger abnormal behavior
– Useful for finding faulty components
Prediction of faults (e.g., stale service retainers, out-of-memory errors)
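One simple way to correlate updates with abnormal behavior is a time window: components updated shortly before a fault become suspects. The sketch below is only an illustration of that idea; the event model, the window size and the ranking are assumptions, and the slides do not describe the actual heuristic.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative time-window heuristic; the event model and window are assumptions. */
final class EventCorrelation {
    /** An update (install, start, etc.) of a component at a given time. */
    static final class Event {
        final String component;
        final long timeMs;

        Event(String component, long timeMs) {
            this.component = component;
            this.timeMs = timeMs;
        }
    }

    /** Returns components updated at most windowMs before the fault, most recent first. */
    static List<String> suspects(List<Event> updates, long faultTimeMs, long windowMs) {
        List<Event> candidates = new ArrayList<>();
        for (Event update : updates) {
            long age = faultTimeMs - update.timeMs;
            if (age >= 0 && age <= windowMs) {
                candidates.add(update);
            }
        }
        // Most recent update first: it is treated as the most likely cause.
        candidates.sort((a, b) -> Long.compare(b.timeMs, a.timeMs));
        List<String> names = new ArrayList<>();
        for (Event candidate : candidates) {
            names.add(candidate.component);
        }
        return names;
    }
}
```

For example, with updates of readerA at 1000 ms, sensorX at 4000 ms and readerB at 4800 ms, a fault at 5000 ms and a 1500 ms window yield readerB followed by sensorX as suspects.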
Results
Proper actions taken upon abnormal behavior
Correlation of events was possible
Conclusions and Perspectives
Dynamic Isolation of Components
– Component Isolation Containers (“sandboxes”)
– Runtime Reconfigurable Policy
Self-healing Container
– Continuous Monitoring
– Automatic Recovery
Missing Characteristics
– Fine-grained monitoring
– Automatic promotion of well-behaving components
– Automatic replacement of faulty components (e.g., taken from a repository)
Perspectives
Automated component promotion
– Correlation of historical events
– Rating component trustworthiness
– Open issue: how to automatically evaluate component trust?
Resource monitoring at component level
Diversity of isolation environments
– Embedded systems
– Cloud computing
[Thanks|Merci|Obrigado|Gracias]
?