Towards Adaptive Fault Tolerance on ROS for Advanced Driver Assistance Systems

Matthieu Amy

Jean-Charles Fabre, Michael Lauer

Toulouse, France

2017. 7. 3.


Context and trends

From ADAS to autonomous driving, e.g. ACC (Adaptive Cruise Control), TJP (Traffic Jam Pilot), etc.

Agile development process: rapid prototyping, meaning short validation time.

Remote dynamic updates: maintenance, improvements, new features, novel business.

Tesla vehicles regularly receive over-the-air software updates that add new features and functionality. When an update is available, you’ll be notified on the center display with an option to install immediately, or schedule the installation for a later time. Connect your vehicle to your home’s Wi-Fi network for the fastest possible download time.


Safety-critical systems: stringent dependability issues, despite fast evolution!

Resilient Computing: persistence of dependability despite changes


Motivations and objectives of our on-going work!

Fast evolution, agile development, time to market, over-the-air updates…


Problem statement and key concepts

Once the system is deployed, it faces changes due to maintenance or evolution.

System designers cannot predict everything in advance.

Persistence of dependability requires the adaptation of safety mechanisms.

Key concepts for Adaptive Fault Tolerance (AFT)
–  Separation of concerns
–  Design for adaptation
–  Remote fine-grained updates


Outline

•  Introduction to Adaptive Fault Tolerant Computing
•  What runtime support for AFT as a Lego system: ROS?
•  How to combine AFT with over-the-air updates of critical ADAS?
•  A simple experimental platform

Change model

Design for adaptation of FTMs

Component-based implementation

Transitions between FTMs


Assumptions and FTM Characteristics

[Figure: transitions between the FTMs PBR, LFR, PBR ⊕ TR, and LFR ⊕ TR; diagram labels: FT and A, R]

PBR = Primary-Backup Replication, LFR = Leader-Follower Replication, TR = Time Redundancy

Transitions
•  Trigger: high rate of HW transient faults observed
•  Trigger: non-deterministic SW application version
•  Trigger: bandwidth drop below a given threshold
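The transition diagram can be read as a small state machine over FTMs. The Python sketch below is one possible encoding. The pairing of each trigger with a particular edge is an assumption made for illustration (the slide text lists the triggers but does not say which edge each one labels), and the names `Trigger`, `TRANSITIONS`, and `next_ftm` are hypothetical.

```python
from enum import Enum


class Trigger(Enum):
    HW_TRANSIENT_FAULTS = "high rate of HW transient faults observed"
    NON_DETERMINISTIC_SW = "non-deterministic SW application version"
    BANDWIDTH_DROP = "bandwidth drop below a given threshold"


# FTM configurations named on the slide ("+" stands for the composition sign).
FTMS = {"PBR", "LFR", "PBR+TR", "LFR+TR"}

# ASSUMPTION: the edge chosen for each trigger is illustrative only.
TRANSITIONS = {
    ("PBR", Trigger.HW_TRANSIENT_FAULTS): "PBR+TR",
    ("LFR", Trigger.HW_TRANSIENT_FAULTS): "LFR+TR",
    ("LFR", Trigger.NON_DETERMINISTIC_SW): "PBR",
    ("LFR+TR", Trigger.NON_DETERMINISTIC_SW): "PBR+TR",
    ("PBR", Trigger.BANDWIDTH_DROP): "LFR",
    ("PBR+TR", Trigger.BANDWIDTH_DROP): "LFR+TR",
}


def next_ftm(current, trigger):
    """Return the FTM to switch to, or the current one if no edge applies."""
    if current not in FTMS:
        raise ValueError(f"unknown FTM: {current}")
    return TRANSITIONS.get((current, trigger), current)
```

Keeping the table as plain data means a new FTM or trigger can be added without touching the lookup logic.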


Componentization of FTM

[Figure: request (→) and reply (←) flow through fault-tolerant processing around the application service]


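[Figure: the FTM in front of the application service; its protocol takes each request (→) through syncBefore, proceed, and syncAfter, keeps a replyLog, and returns the reply (←)]

Before-Proceed-After: a generic framework for implementing any FTM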
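The Before-Proceed-After idea can be sketched as a small class with three overridable hooks. This is a minimal illustration, not the authors' implementation: the class and method names (`FTM`, `handle`, `TracingFTM`) are hypothetical, and the use of the reply log to answer duplicate requests is an assumption about what "replyLog" is for.

```python
class FTM:
    """Generic Before-Proceed-After skeleton; subclasses override the hooks."""

    def __init__(self, service):
        self.service = service   # the application service being protected
        self.reply_log = {}      # request id -> reply already sent

    def sync_before(self, request):
        """Hook run before the computation (e.g. forward the request)."""

    def proceed(self, request):
        """Default computation: call the application service once."""
        return self.service(request)

    def sync_after(self, request, reply):
        """Hook run after the computation (e.g. send a checkpoint)."""

    def handle(self, req_id, request):
        # ASSUMPTION: a request seen before is answered from the log.
        if req_id in self.reply_log:
            return self.reply_log[req_id]
        self.sync_before(request)
        reply = self.proceed(request)
        self.sync_after(request, reply)
        self.reply_log[req_id] = reply
        return reply


class TracingFTM(FTM):
    """Example subclass that only records the order of the three steps."""

    def __init__(self, service):
        super().__init__(service)
        self.trace = []

    def sync_before(self, request):
        self.trace.append("before")

    def proceed(self, request):
        self.trace.append("proceed")
        return super().proceed(request)

    def sync_after(self, request, reply):
        self.trace.append("after")
```

A concrete mechanism would differ from another only in the bodies of the three hooks, which is what makes fine-grained replacement of a single step plausible.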


Is ROS a good candidate for AFT in automotive embedded systems?

“BMW has been working on automated driving for the last decade, steadily implementing more advanced features ranging from emergency stop assistance and autonomous highway driving to fully automated valet parking and 360° collision avoidance. Several of these projects were presented at the 2015 Consumer Electronics Show, and as it turns out, the cars were running ROS for both environment detection and planning.” (Michael Aeberhard (BMW): Automated Driving with ROS at BMW, May 31, 2016)


What is ROS?

Publish-subscribe middleware
•  Rosmaster: communication master
•  Nodes: isolated processes
•  TCP/IP communication
–  Topic for asynchronous communications
–  Service for synchronous interaction

[Figure: implementation of an asynchronous communication (Topic)]
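The difference between the two interaction styles can be shown with a toy, in-process stand-in. This sketch does not use the ROS API: `MiniMaster` and its methods are hypothetical names, and routing every message through one object is a simplification (in ROS 1 the master handles name registration and lookup while nodes exchange data directly).

```python
from collections import defaultdict, deque


class MiniMaster:
    """Toy registry of topics and services, for illustration only."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic name -> callbacks
        self.services = {}                    # service name -> handler
        self.pending = deque()                # published, not yet delivered

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Asynchronous: the publisher returns at once; delivery is deferred.
        self.pending.append((topic, message))

    def spin(self):
        # Deliver every queued message to the subscribers of its topic.
        while self.pending:
            topic, message = self.pending.popleft()
            for callback in self.subscribers[topic]:
                callback(message)

    def advertise_service(self, name, handler):
        self.services[name] = handler

    def call(self, name, request):
        # Synchronous: the caller waits for the handler's reply.
        return self.services[name](request)
```

A publisher never learns whether anyone received its message, whereas a service caller always gets a reply or an error; this is why the FTM graphs use services at the client and server boundaries and topics inside the mechanism.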


Design for FTM adaptation on ROS

•  Topics (0)
•  Nodes (2)
–  Client
–  Server

Services: clt2srv (client to server)

[Figure: generic computation graph for FTM; boxes represent nodes. The Client is linked to the Server by the clt2srv service.]


Design for FTM adaptation on ROS

•  Topics (6)
–  pxy2pro
–  pro2bfr, bfr2prd, prd2aft
–  aft2pro
–  pro2pxy
•  Nodes (5+2)
–  Client
–  Server
–  Proxy
–  Protocol
–  Before, Proceed, After

Services: clt2pxy (client to proxy) and prd2srv (proceed to server)

[Figure: generic computation graph for FTM; boxes represent nodes, links are services or topics. The FTM (Proxy, Protocol, Before, Proceed, After) sits between the Client and the Server.]
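The graph can be written down as plain data and checked for consistency. In the sketch below, the direction of each link is inferred from its name (`x2y` read as "from x to y"), which is an assumption, and the helper names are hypothetical.

```python
# Topic links as (source node, topic name, destination node).
TOPICS = [
    ("Proxy", "pxy2pro", "Protocol"),
    ("Protocol", "pro2bfr", "Before"),
    ("Before", "bfr2prd", "Proceed"),
    ("Proceed", "prd2aft", "After"),
    ("After", "aft2pro", "Protocol"),
    ("Protocol", "pro2pxy", "Proxy"),
]

# Service links as (caller node, service name, provider node).
SERVICES = [
    ("Client", "clt2pxy", "Proxy"),
    ("Proceed", "prd2srv", "Server"),
]


def nodes():
    """Collect every node that appears at either end of a link."""
    found = set()
    for src, _, dst in TOPICS + SERVICES:
        found.update((src, dst))
    return found


def is_closed_loop(edges):
    """True if each link starts where the previous one ended and the
    chain finally returns to the node it started from."""
    for (_, _, dst), (src, _, _) in zip(edges, edges[1:]):
        if dst != src:
            return False
    return edges[0][0] == edges[-1][2]
```

With these definitions the six topics form one loop from the Proxy through the Protocol and the three processing nodes and back, and the node count matches the "5+2" on the slide.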


Implementing PBR on ROS

[Figure: PBR computation graph; links are services or topics.
Client: Client, Recovery, Proxy.
Primary (master): Protocol, Before, Proceed, After, Server_M, CD_M.
Backup (slave): Protocol, Before, Proceed, After, Server_S, CD_S.
Links: clt2pxy, pxy2pro, pro2bfr, bfr2prd, prd2aft, aft2pro, pro2pxy, prd2srv_M, prd2srv_S, getstate, setstate, aft2aft, cd2rec, recovery.]
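The behaviour behind the PBR graph can be sketched without ROS. The code below is an assumption-laden illustration, not the authors' code: the primary serves each request and then copies its state to the backup, and a recovery step promotes the backup when the primary is found dead. The class names and the `alive` flag are hypothetical; `get_state` and `set_state` mirror the getstate and setstate links of the figure.

```python
class Replica:
    """A stateful server replica; the state is a running total."""

    def __init__(self, name):
        self.name = name
        self.state = 0
        self.alive = True

    def process(self, value):
        # Stands in for the call made over prd2srv.
        self.state += value
        return self.state

    def get_state(self):
        return self.state

    def set_state(self, state):
        self.state = state


class PrimaryBackup:
    def __init__(self):
        self.primary = Replica("Server_M")
        self.backup = Replica("Server_S")

    def request(self, value):
        if not self.primary.alive:
            self.recover()
        reply = self.primary.process(value)
        # After step: checkpoint the primary state to the backup.
        self.backup.set_state(self.primary.get_state())
        return reply

    def recover(self):
        # Promote the backup, which already holds the last checkpoint,
        # and start a fresh (hypothetical) spare as the new backup.
        self.primary, self.backup = self.backup, Replica("spare")
```

Because the checkpoint is taken after every request, the promoted backup continues from the last reply the client received.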


Implementing TR on ROS

[Figure: TR computation graph; links are services or topics.
Client: Client, Proxy.
TR (master): Protocol, Before, Proceed, After, Server_M.
Links: clt2pxy, pxy2pro, pro2bfr, bfr2prd, prd2aft, aft2pro, pro2pxy, prd2srv_M, getstate_M, setstate_M, aft2bfr.]
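Time redundancy can be sketched as "run twice from the same state and compare". The exact protocol is not spelled out on the slide, so the version below is an assumption: it saves the state, runs the request twice, and uses a third run as a tie-breaker when the first two disagree. `FlakyCounter` is a hypothetical server with injectable faults; `get_state` and `set_state` mirror the getstate_M and setstate_M links.

```python
class FlakyCounter:
    """Stateful server whose n-th call can be corrupted on purpose."""

    def __init__(self, faulty_calls=()):
        self.state = 0
        self.calls = 0
        self.faulty_calls = set(faulty_calls)

    def get_state(self):
        return self.state

    def set_state(self, state):
        self.state = state

    def process(self, value):
        self.calls += 1
        self.state += value
        if self.calls in self.faulty_calls:
            self.state += 1000  # simulated transient corruption
        return self.state


def time_redundant_call(server, request):
    """Execute twice from the same saved state; a third run breaks a tie."""
    saved = server.get_state()
    first = server.process(request)
    server.set_state(saved)
    second = server.process(request)
    if first == second:
        return second
    server.set_state(saved)
    third = server.process(request)
    if third in (first, second):
        return third
    raise RuntimeError("no two executions agree")
```

The loop back from the After step to the Before step (aft2bfr in the figure) is a natural place for the repeated executions, though that reading is also an assumption.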


Combining FTM on ROS

•  Protocol node is a software rack of nodes
–  Before
–  Proceed → activation of services or protocols
–  After
•  Protocol node can substitute for the Proceed node
–  It can be viewed as a frontend of the server

[Figure: generic composition graph for FTM; links are services or topics. FTM1 (Proxy, Protocol, Before, Proceed, After) sits between the Client and the Server, with links clt2srv, pxy2pro, pro2bfr, bfr2prd, prd2aft, aft2pro, pro2pxy, prd2srv.]

[Figure: generic composition graph with two FTMs; links are services or topics. In FTM1 (Proxy, Protocol, Before, After) the Proceed node is replaced by the Protocol node of FTM2, whose own Before, Proceed, and After nodes reach the Server through prd2srv.]
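The substitution of a Protocol node for a Proceed node amounts to nesting one pipeline inside another. A minimal sketch, with hypothetical names (`make_protocol`, `server`) and a trace list added only to make the call order visible:

```python
def make_protocol(name, proceed, trace):
    """Build a Before-Proceed-After pipeline whose Proceed step is any
    callable: the server itself or another protocol."""

    def protocol(request):
        trace.append(f"{name}.before")
        reply = proceed(request)
        trace.append(f"{name}.after")
        return reply

    return protocol


def server(request):
    # Placeholder application service.
    return request * 2


trace = []
ftm2 = make_protocol("FTM2", server, trace)  # inner mechanism calls the server
ftm1 = make_protocol("FTM1", ftm2, trace)    # outer mechanism proceeds via FTM2
```

Calling `ftm1` runs the outer Before, the whole of FTM2, then the outer After, which is the sense in which the inner Protocol acts as a frontend of the server.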


Combining PBR+TR on ROS

[Figure: PBR ⊕ TR computation graph; links are services or topics.
Client: Client, Recovery, Proxy.
Primary (master): PBR Protocol with Before and After, CD_M; in place of Proceed, a TR Protocol with Before, Proceed, After, and Server_M.
Backup (slave): PBR Protocol with Before and After, CD_S; in place of Proceed, a TR Protocol with Before, Proceed, After, and the slave server.
PBR links: clt2pxy, pxy2pro, pro2bfr, bfr2prd_M, bfr2prd_S, prd2aft_M, prd2aft_S, aft2pro, pro2pxy, aft2aft, cd2rec, recovery, getstate_M, restorestate_S.
TR links (per replica): pro2bfr, bfr2prd, prd2aft, aft2pro, aft2bfr, prd2srv_M / prd2srv_S, getstate_M / getstate_S, setstate_M / setstate_S.]


Lessons learnt

•  ROS nodes
–  Confinement area / space partitioning
–  Graph of nodes / active components
•  Node control
–  Manipulation of the nodes (add, remove)
–  Suspend/activate nodes using Unix signals sent by an Adaptation Node
–  Buffering of messages
•  Bindings
–  Bindings at initialization only (notion of remapping)
–  Port management function added to nodes and invoked by a Recovery Node as a service
•  Summary
–  Dynamicity of control and bindings solved using ROS features + Unix signals + additional logic in the application nodes + sysadmin nodes (Adaptation and Recovery Nodes)
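Suspending and reactivating a node process with Unix signals can be sketched as follows. The slides only say "Unix Signals"; the choice of SIGSTOP and SIGCONT is an assumption, as are the function names, and a short-lived Python child process stands in for a ROS node. The sketch needs a POSIX system.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Freeze the process; SIGSTOP cannot be caught or ignored."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a stopped process run again."""
    os.kill(pid, signal.SIGCONT)


def suspend_resume_demo():
    """Stop and continue a child process, reporting both observations."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        suspend(child.pid)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        resume(child.pid)
        # WCONTINUED makes waitpid report the continuation.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
        return stopped, continued
    finally:
        child.kill()
        child.wait()
```

A stopped process does not read its sockets, so messages sent to it in the meantime have to be buffered somewhere, which is consistent with the "Buffering of messages" item above.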


Experimental platform for development and validation of resilient ADAS


Why an experimental platform?

•  Motivation
–  Development of a simulation platform for ADAS
–  Failure mode analysis using fault injection techniques
•  Status
–  Development of Traffic Jam Pilot ADAS
–  Dependable computing architecture
•  On-going work
–  Over-the-air updates: improvements / variants of the TJP, dynamic reconfiguration of FTM
–  Validation by fault injection


Global Platform Architecture

•  Platform
–  ROS implementation of the TJP
–  Duplex architecture and FT strategy
•  Gazebo: 3D simulator
–  The car dynamics
–  Virtual sensors
•  Real sensors

[Figure: platform architecture, including physical sensors]


Use Cases – TJP ADAS

[Figure: follower car and master car, showing the separating distance and the setpoint distance]

The TJP automatically adjusts the speed of the follower car to maintain a safe distance from the master car.


Simulation with Gazebo

•  TJP with two cars
–  Master simulating a traffic jam
–  Follower with sensors controlling a distance
•  Plugins: sensors (follower only)
–  Laser sensor (distance)
–  Inertial Measurement Unit (speed)

[Figure: Gazebo 3D simulation environment]


•  Master
–  Speed profile
•  Follower
–  Speed set point


Functional validation

Entity | Input | Output | Test

ROS
distanceSecurityCalculator | Current speed | Setpoint distance | Give a speed value, check the setpoint distance calculated
controllerPID | Setpoint distance + separating distance | Speed command | Check the speed value
cmdManager | Speed command | Speed command | Check the priority management
realUltrasonicSensor | Real distance | Speed command | Check the data read by the sensor

GAZEBO
sensorSensor | Car + obstacle | Separating distance | Check the data read by the sensor
imuSensor | Car moving | Current speed | Check the data read by the sensor
cmdFollowerCar | Speed command | X | Check that the car is moving when a speed command is received
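The first two rows of the table describe a chain from current speed to setpoint distance to speed command. The sketch below shows one way such a chain could look. Nothing in it comes from the slides beyond the input and output names: the time-headway rule, the default parameters, the gains, and all function names are assumptions chosen for illustration.

```python
def setpoint_distance(speed_mps, headway_s=2.0, standstill_m=2.0):
    """ASSUMED rule: a standstill gap plus a constant time headway."""
    return standstill_m + headway_s * speed_mps


class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def speed_command(pid, current_speed, separating_distance, dt=0.1):
    """A gap larger than the setpoint asks for more speed, a smaller
    one for less; the command is never negative."""
    error = separating_distance - setpoint_distance(current_speed)
    return max(0.0, current_speed + pid.step(error, dt))
```

The functional tests in the table map directly onto such functions: feed a speed and check the setpoint distance, then feed both distances and check the resulting speed command.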


FMEA (Failure Mode and Effects Analysis)


Prototyping Mock-up

•  Simulator (PC)
–  Cars
–  Virtual sensors
•  Physical platform
–  Arduino Uno
–  Raspberry Pi 3
–  Real ultrasonic sensor

[Figure: physical sensors]


Conclusion


Summary

SoC (separation of concerns): ROS nodes, component mapping to nodes

D4A (design for adaptation): componentized FT design patterns, Protocol-Before-Proceed-After

Nodes management: Unix system calls and ROS commands

Dynamic binding: ROS services, ports, topics; additional logic to create ports and topics

Experimental platform: mock-up for validation, hardware support, executive support