linux kernel extensions to minimize effects of software aging - clei2010

Outline Introduction Prototype Summary

Linux Kernel extensions to minimize effects of Software Aging

Ariel Sabiguero Andres Aguirre Fabricio Gonzalez

Daniel Pedraja Agustín Van Rompaey

Instituto de Computación, Facultad de Ingeniería, Universidad de la República
J. Herrera y Reissig 565, Montevideo, Uruguay

{asabigue|aaguirre}@fing.edu.uy {fabgonz|danigpc|fenix.uy}@gmail.com

20/10/2010

A. Sabiguero, A. Aguirre, F. Gonzalez, D. Pedraja, A. Van Rompaey Linux Kernel extensions to minimize effects of Software Aging

1 Introduction
    Concepts
    Finer-grained rejuvenation

2 Prototype
    Problem definition
    Key challenges addressed
    Kernel modifications performed
    Validation
    Performance testing

3 Summary
    ...ongoing work
    finally...

Concepts

Soft Errors

A soft error is a transient failure in semiconductors that eventually causes a loss of data integrity in memory.

It implies a change in a program or a data value.

Soft errors do not imply permanent damage to the system's hardware; the only damage is to the data being processed.

Software Aging & Rejuvenation

The term software aging refers to the deterioration in the availability of OS resources caused by data corruption.

Software rejuvenation comprises proactive fault-management techniques that restore the system's internal state in order to prevent failures from occurring.

Finer-grained rejuvenation

A new approach

Instead of proactive rejuvenation of a full process or system, we work at a finer granularity.

We take advantage of the fact that program code and parts of program data remain constant during program execution.

We apply reactive rejuvenation to the constant areas of the system when they are modified.

Relevance of R.O. memory

State-of-the-art software engineering practice recommends against writing programs that modify their own instructions.

Modern systems allow certain sections of a program to be defined as read-only, meaning that they remain constant throughout program execution.

Different portions of code and data are marked R.O. at compile time.

Modern compilers enforce the use of R.O. memory in their native formats (ELF, Executable and Linking Format, on Linux; PE, Portable Executable, on Windows).

Problem definition

Objective & target platform

Detect and handle the occurrence of Soft Errors in R.O. memory.

Platform

O.S.: GNU Linux Kernel 2.6.25.9
Distribution: OpenSuSE 11.0
Architecture: Intel x86

Key challenges addressed

Read-Only Memory in Linux

Characteristics

Frame granularity
Protection scheme: user space only
Frames shared between tasks

Read-only subset: frames mapped into one or more processes with read-only access in every mapping
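The read-only subset can be illustrated with a small user-space sketch. The mapping table, the frame numbers and the `read_only_subset` helper are hypothetical; the prototype works on the kernel's own frame bookkeeping, which this transcript does not show.

```python
from collections import defaultdict

def read_only_subset(mappings):
    """Return the frames whose every mapping is read-only.

    `mappings` is an iterable of (frame, task, writable) tuples, a
    hypothetical stand-in for the kernel's frame-to-task bookkeeping.
    """
    writable_anywhere = defaultdict(bool)
    for frame, _task, writable in mappings:
        # A single writable mapping disqualifies the frame.
        writable_anywhere[frame] = writable_anywhere[frame] or writable
    return {frame for frame, w in writable_anywhere.items() if not w}

# Frame 1 is shared read-only by two tasks; frame 2 is writable in task B;
# frame 3 is writable in its only mapping.
mappings = [
    (1, "A", False),
    (1, "B", False),
    (2, "A", False),
    (2, "B", True),
    (3, "A", True),
]
protected = read_only_subset(mappings)
```

Only frame 1 qualifies here, because a frame that is writable in any task may legitimately change and cannot be treated as constant.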

Error detection mechanism

Memory change-detection algorithm

Frame level
Error detection code: CRC32

Search strategies

System-level frame polling
Task-subset polling
Task scheduler checks
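Frame-level change detection with CRC32 can be sketched in user space as follows. This is only an illustration under stated assumptions: the 4096-byte frame size, the function names and the simulated memory image are not taken from the slides, and the polling shown is the simplest of the listed strategies (a scan over all frames).

```python
import zlib

FRAME_SIZE = 4096  # assumed frame size (typical x86 page); not stated in the slides

def frame_checksums(memory, frame_size=FRAME_SIZE):
    """Compute one CRC32 per frame of a bytes-like memory image."""
    return [
        zlib.crc32(memory[off:off + frame_size])
        for off in range(0, len(memory), frame_size)
    ]

def poll_frames(memory, reference, frame_size=FRAME_SIZE):
    """Return the indexes of frames whose CRC32 differs from the reference."""
    current = frame_checksums(memory, frame_size)
    return [i for i, (now, ref) in enumerate(zip(current, reference)) if now != ref]

# Simulated read-only region of three frames, checksummed while known good.
memory = bytearray(b"\x90" * (3 * FRAME_SIZE))
reference = frame_checksums(memory)

# Simulate a soft error: flip one bit in the second frame.
memory[FRAME_SIZE + 10] ^= 0x01
corrupted = poll_frames(memory, reference)
```

CRC32 detects every single-bit error, so the scan reports exactly the frame that was modified; the cost is one pass over each polled frame per check.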

Error Handling actions

Error correction code: Hamming

Automatic File Rejuvenation

User space rejuvenation assistance

Error details ⇒ high-granularity actions
Agent notifications ⇒ synchronous actions
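The slides name Hamming as the error correction code but do not give its parameters. As an illustration only, the sketch below uses the textbook Hamming(7,4) code, which corrects any single-bit error in a 7-bit codeword; the function names and the choice of (7,4) are assumptions, not the prototype's implementation.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (list of bits, position 1 first)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8  # index 0 unused so list indexes match positions 1..7
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    return code[1:]

def hamming74_correct(codeword):
    """Return (nibble, error_position); error_position is 0 if no error was found."""
    code = [0] + list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1  # the syndrome is the position of the flipped bit
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome

word = hamming74_encode(0b1011)
word[4] ^= 1  # corrupt position 5 (list index 4)
value, position = hamming74_correct(word)
```

Unlike CRC32, which only detects a change, the syndrome locates the damaged bit, so a single-bit error can be repaired in place without reloading the data.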

Kernel modifications performed

Kernel Map [figure not reproduced in this transcript]

Validation

Ensuring correctness of the implementation

Motivation: Separate bugs from Soft Errors

Challenges

Low error probability in our typical scenario
Hardware error generation is difficult and expensive

Fault Injection

Software-based memory error simulation
Kernel-integrated vs. high-level
Exposed as a system call
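Software-based error simulation can be illustrated with a user-space bit-flip injector. The prototype exposes injection as a system call whose interface is not shown in the slides, so the `inject_bit_flip` helper, the zero-filled 4 KiB frame and the fixed seed are all assumptions made for this sketch.

```python
import random
import zlib

def inject_bit_flip(memory, rng):
    """Flip one randomly chosen bit in `memory` (a bytearray); return (offset, bit)."""
    offset = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[offset] ^= 1 << bit
    return offset, bit

rng = random.Random(2010)  # fixed seed so a test run is repeatable
frame = bytearray(4096)    # assumed 4 KiB frame, filled with zeros
before = zlib.crc32(frame)

offset, bit = inject_bit_flip(frame, rng)
after = zlib.crc32(frame)
detected = before != after
```

Injecting a known fault and then checking that it is detected helps separate implementation bugs from genuine soft errors, which are too rare to wait for in a typical test environment.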

Performance testing

Case study

We evaluated the impact on an IO-bound application and on a CPU-bound one.

Methodologically, we compare benchmarks run on the modified kernel against the same benchmarks run on a standard (vanilla) kernel.

Performance varies between tasks depending on the resources they use:

Memory corruption correction routines barely compete with IO-bound loads.
CPU-bound applications compete for the same resource, which impacts system performance.

Case study: performance results

[Figure panels: IO-bound, CPU-bound]

...ongoing work

Future work

Address the loss of cache locality.

Consider power consumption due to continuous 100% CPU usage.

Focus on embedded solutions

Improve CPU usage (a different approach than on desktops).
Test on architectures other than x86.

Wish: to be able to test in an environment with a higher probability of soft errors (EMI).

finally...

Conclusions

We built and tested a prototype with the expected characteristics.

The rejuvenation implementation is based on software instead of the traditional hardware-based scheme.

Our approach avoids a full system restart or a full process restart for the kind of errors addressed.

Being simple and non-intrusive, it is applicable to any piece of (Linux) software without any modifications.

Thank you for your time

Questions?
