
TERESA - 611153 - FP7/2013-2016

Deliverable D5.2: Action Library

Project Acronym: TERESA
Project Full Title: Telepresence Reinforcement-Learning Social Agent
Grant Agreement no. 611153
Due date: M20 - Jul. 31, 2015
Delivery: July 31, 2015
Lead partner: UvA
Dissemination level: Public
Status: Submitted
Version: 1.0


DOCUMENT INFO

Date        Version  Author        Comments
22.07.2015  v0.1     João Messias  Initial version. Action descriptions have been included.
29.07.2015  v0.2     João Messias  Sections 4 and 5 completed.
29.07.2015  v0.3     João Messias  Added figures.
30.07.2015  v0.4     João Messias  Full version. Pending formatting corrections.
31.07.2015  v1.0     João Messias  Final version.


Contents

1  Executive Summary ................................................... 5
2  Contributors ........................................................ 5
3  Introduction ........................................................ 6
4  Hierarchical Control for Human-Telepresence Robot Interaction ....... 8
   4.1  The Body Pose / HRI Control Problem ............................ 8
   4.2  Actions as Generic Integrated Processes ........................ 12
5  Identifying Socially Normative HRI Actions ......................... 14
6  The Action Library ................................................. 18
   6.1  Preliminaries .................................................. 18
   6.2  Body Pose Control Actions ...................................... 19
   6.3  Communication Actions .......................................... 24
   6.4  Summary ........................................................ 25
   6.5  An Example of the Use of the Action Library: a Preliminary
        Body Pose Control Policy ....................................... 27
   6.6  Actions as Useful High-Level Pilot Commands .................... 29
7  Conclusions and Future Work in Task 5.3 ............................ 29


List of Figures

1  The TERESA telepresence robot adjusting its body pose in order to face an
   interaction target at a specific distance, while matching his height. ..... 6
2  The body pose of the TERESA robot can be fully specified through its base
   position (x, y), orientation θ, head height ρ and head tilt φ. ............ 8
3  The effects of low- and high-level robot control on decision-making
   complexity. ............................................................... 11
4  Hierarchical decision-making / control scheme for the TERESA robot. ....... 12
5  An example of a body pose action as a generic input/output process. ....... 13
6  A high-level representation of our proof-of-concept body pose control
   policy as a Finite State Machine. ......................................... 28
7  Implementation of the proof-of-concept body pose control policy as a
   Hierarchical Finite State Machine. ........................................ 28


List of Tables

1  A summary of the proposed actions, showing their input / output
   requirements, and how they are implemented. ............................... 26


1 Executive Summary

This Deliverable describes the work that was carried out in the context of Task 5.3 of the TERESA project, which has the goal of specifying a set of socially appropriate actions that the TERESA telepresence robot may autonomously execute while its users are engaged in conversation. These actions are meant as abstractions of potentially complex and temporally extended lower-level control rules that define the body pose of the robot with respect to its interaction targets (i.e. the orientation of the robot, its head height and tilt, etc.), and which can therefore directly influence an ongoing conversation between its remote and local users. The resulting action library forms a basis for the high-level body pose control policies for the robot, which will later be developed in the context of Task 5.4, and which will select the most appropriate of these actions (or combination of actions) during execution, based on its expected social utility or cost.

In this document, we discuss in detail the definition and purpose of the aforementioned actions as a part of the overall control architecture for the robot; and, based on the input from earlier work on determining socially normative telepresence robot behavior (cf. Deliverable 3.1), we identify and specify the actions that were found to be relevant to the objective of maintaining socially intelligent human-robot interaction.

2 Contributors

This work has resulted from a collaborative effort between the University of Amsterdam (UvA) and the University of Twente (UT), with additional input from Giraff Technologies (GT). UvA, as the lead consortium partner on Work Package 5, provided the main functional requirements for this action library, and designed the control architecture that will incorporate and combine these actions into the complete body pose behaviors of the TERESA robot. UT provided insight into socially normative robot behavior, which was fundamental in identifying the actions that could be relevant during human-robot interaction. GT was consulted regarding the hardware and software limitations of the TERESA robot.

The main author of this deliverable is João Messias, post-doctoral researcher at UvA. Other direct contributors to this work are Shimon Whiteson (UvA), Maarten van Someren (UvA), Kyriacos Shiarlis (UvA), Jered Vroon (UT) and Gwenn Englebienne (UT).


3 Introduction

The TERESA telepresence robot is intended as a semi-autonomous intelligent system that can behave as a social agent during its interaction with human users; that is, as an agent that respects social constraints and adheres to socially normative behavior. By being capable of social behavior, the TERESA robot will be more than just a mobile communication platform, in that it will extend the ability of remote users to project themselves socially through the robot. The semi-autonomous quality of the robot will facilitate its remote operation, and decrease the cognitive load of its remote users during social interactions.

The combined goal of Tasks 5.3 and 5.4 of the TERESA project is to ensure the socially intelligent (semi-)autonomous behavior of the TERESA robot while its users are conversing through the robot. The autonomous behavior of the robot in those social situations primarily involves the control of its body pose – that is, the position and orientation of its base, and the height and tilt of its head with respect to an Interaction Target¹ (Figure 1). However, since the robot is essentially a communication device, socially normative behavior may also require functionalities related to communication that do not have a direct physical effect on the body pose of the robot, such as controlling the volume of its speakers or displaying messages to its users.

Deciding simultaneously on appropriate controls and communication behaviors poses a challenging decision-making problem, which would be infeasible to approach directly and without any abstraction regarding the capabilities of the robot. Furthermore, it would be difficult to incorporate the existing results from previous tasks in the project (specifically, Task 3.1 [2]) on what constitutes socially normative robot behavior, since those results are descriptive in nature, and not directly translatable into constraints or desired properties on the low-level actuation of the robot.

A suitable approach to this decision-making problem is to view it as a hierarchical control process. This means that, instead of attempting to derive control rules that operate directly on the space of possible actuator inputs to the robot, we will attempt to learn or plan the

¹ In this Deliverable, we use the terminology of [2]. An Interaction Target is a user physically co-located with the robot, with whom the telepresent user (the Visitor) interacts.

Figure 1: The TERESA telepresence robot adjusting its body pose in order to face an interaction target at a specific distance, while matching his height. At the lowest level of abstraction, this requires the control of the linear and angular velocity of the base of the robot, and of the height and tilt of the robot's screen.


appropriate behavior of the robot at multiple levels of abstraction. In this multi-level hierarchical control scheme, the actions of each level abstract the decision-making processes that happen below it. This approach can leverage the regularities in the lowest levels of decision-making, and facilitates the integration of the existing prior knowledge on socially normative behavior at a matching level of abstraction.

In this Deliverable, which describes the work of Task 5.3, we define a set of actions for human-telepresence robot interaction that still have a concrete connection with the low-level control space of the robot. This establishes the first level of abstraction of our hierarchical approach to decision-making. Future work in Task 5.4 will focus on combining these actions to produce complex socially normative control policies for social conversation.

This library of actions is developed in a way that is sufficiently generic to accommodate potentially any decision-making formalism at higher levels of abstraction, while being seamless to integrate with the ROS²-based software architecture of the robot. Furthermore, we explicitly take into account a set of functional requirements, which were derived from the sociological observations of Task 3.1, and which ensure that the library provides socially normative actions for all of the planned use-cases of the TERESA robot.

This document is organized as follows. In Section 4, we discuss the problem of controlling the TERESA robot for the purpose of semi-autonomous human-telepresence robot interaction, and we review our motivation for viewing this control problem as a hierarchical decision-making process. The functional requirements for our action library, which were derived from the observations of Task 3.1 on socially normative robot behavior, are presented in Section 5. The action library itself is described in Section 6, where we detail each of our actions in a standardized, process-oriented format. We also present a small case study of our action library and describe the work in progress to implement and integrate all of the proposed actions. Finally, Section 7 contains concluding comments on Task 5.3 and discusses future work concerning the action library.

² Robot Operating System – the robot middleware used in the TERESA project.


4 Hierarchical Control for Human-Telepresence Robot Interaction

In order to develop solutions for the autonomous control of the TERESA robot during human-robot interaction (HRI), we must first specify the control problem that we are trying to address. In this section, we analyze our body pose / HRI control problem in detail and explain why a hierarchical decomposition of the problem is a particularly suitable approach (Section 4.1). Then, we discuss the formalization of each of the actions in our abstract action library as a generic process that can be seamlessly integrated with the software architecture of the robot (Section 4.2).

4.1 The Body Pose / HRI Control Problem

In terms of its actuation capabilities, the TERESA telepresence robot can be characterized as a differential-drive platform, equipped with a raised screen of controllable height and tilt (which we will refer to as the ‘head’ of the robot) [3]. At any given time, the configuration of the robot, which we refer to as its body pose, can be fully described as: the position (x, y) of the base of the robot in a planar environment with respect to a static Cartesian frame (the world frame); the orientation of the base with respect to the same frame, θ; the height of the head, ρ; and its tilt, φ. These quantities are represented in Figure 2.

The inputs to the actuators of the robot, in turn, consist of the wheel speeds (ω_l, ω_r), and the linear (vertical) and angular speed of the head, (v_h, ω_h).

We can relate the body pose of the robot to its control inputs over time by following the basic analysis of the kinematics of differential-drive vehicles [4].

[Diagram: world frame axes e_Wx, e_Wy, e_Wz; robot frame axes e_Rx, e_Ry; marked points (0, 0, 0), (x, y, 0) and (x, y, ρ); tilt angle φ.]

Figure 2: The body pose of the TERESA robot can be fully specified through its base position (x, y), orientation θ, head height ρ and head tilt φ.


Let r and d represent, respectively, the radius of each of the wheels of the robot and the distance between the wheels along their common axis of rotation. Then:

\[
\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \\ \dot{\rho} \\ \dot{\varphi} \end{bmatrix}
=
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
r/2 & r/2 & 0 & 0 \\
0 & 0 & 0 & 0 \\
r/d & -r/d & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} \omega_l \\ \omega_r \\ v_h \\ \omega_h \end{bmatrix}
\tag{1}
\]

If we represent the body pose of the robot as q = [x y θ ρ φ]ᵀ and the control vector as u = [ω_l ω_r v_h ω_h]ᵀ, we can simplify this expression as:

\[
\dot{\mathbf{q}} = J(\mathbf{q})\,\mathbf{u} \tag{2}
\]
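To make the kinematic map of Eqs. (1) and (2) concrete, the following sketch evaluates q̇ = J(q)u numerically and integrates it with a simple Euler step. This is illustrative only: the wheel radius and separation are placeholder values, not the actual dimensions of the TERESA platform, and the sign convention for θ̇ follows the matrix in Eq. (1).

```python
import numpy as np

R_WHEEL = 0.1   # wheel radius r [m] (assumed, not the robot's real value)
D_AXLE = 0.4    # wheel separation d [m] (assumed)

def jacobian(q, r=R_WHEEL, d=D_AXLE):
    """Build J(q), mapping u = [w_l, w_r, v_h, w_h] to qdot = [dx, dy, dtheta, drho, dphi]."""
    theta = q[2]
    rot = np.array([
        [np.cos(theta), -np.sin(theta), 0, 0, 0],
        [np.sin(theta),  np.cos(theta), 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
    ])
    drive = np.array([
        [r / 2,  r / 2, 0, 0],
        [0,      0,     0, 0],   # no lateral motion in the body frame
        [r / d, -r / d, 0, 0],
        [0,      0,     1, 0],   # head height driven directly by v_h
        [0,      0,     0, 1],   # head tilt driven directly by w_h
    ])
    return rot @ drive

def step(q, u, dt=0.01):
    """One Euler step of the body pose at the robot's 100 Hz control rate."""
    return q + jacobian(q) @ u * dt
```

For instance, equal wheel speeds yield pure translation along the current heading, with θ unchanged.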

With this representation of the TERESA robot as a nonlinear dynamical system, and given a reference body pose q_ref, it is relatively straightforward to design control laws of the form u = g(q, q_ref) that provide the appropriate controls over time in order to achieve the given reference [4]. This is the approach that characterizes most (asocial) robot control applications.
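As an illustration of such a reference-tracking law (not the project's actual controller), a simple proportional scheme might look as follows. The gains, wheel radius and separation are hypothetical, and the differential-drive inversion assumes the sign convention of Eq. (1).

```python
import numpy as np

# Illustrative proportional gains for forward speed, turning, head height, head tilt.
K_V, K_W, K_H, K_T = 0.8, 1.5, 1.0, 1.0

def g(q, q_ref, r=0.1, d=0.4):
    """Sketch of u = g(q, q_ref): return [w_l, w_r, v_h, w_h] steering q toward q_ref."""
    dx, dy = q_ref[0] - q[0], q_ref[1] - q[1]
    err = np.arctan2(dy, dx) - q[2]
    err = np.arctan2(np.sin(err), np.cos(err))   # wrap heading error to [-pi, pi]
    v = K_V * np.hypot(dx, dy)                   # desired forward speed
    w = K_W * err                                # desired angular speed
    # Invert the drive map of Eq. (1): v = r(w_l + w_r)/2, w = r(w_l - w_r)/d
    w_l = (v + w * d / 2.0) / r
    w_r = (v - w * d / 2.0) / r
    v_h = K_H * (q_ref[3] - q[3])                # head height control
    w_h = K_T * (q_ref[4] - q[4])                # head tilt control
    return np.array([w_l, w_r, v_h, w_h])
```

When the reference lies straight ahead, the heading error vanishes and both wheels are driven at the same speed.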

However, since our ultimate goal is to develop mechanisms for social navigation and body pose control on the TERESA robot, we must consider additional constraints on its dynamics, and/or social costs that we will need to optimize. These social costs may depend on variables that are extrinsic to the robot, such as the number of social actors in the vicinity of the robot, or the distance and orientation of the robot with respect to an interaction target. Therefore, we need a more complete representation of the environment than the one used in Eqs. (1) and (2).

Let f = (f_1, f_2, …, f_k) represent a tuple of k heterogeneous features of the environment. The domain of each feature f_i may be either discrete or continuous. A continuously-valued feature may represent quantitative aspects of the environment that may be socially relevant for the motion of the robot, such as the distance to a person in its vicinity; while a feature with a discrete domain may represent qualitative aspects of the environment, such as the facial expressions of the telepresent visitor.

Let F represent the space of possible feature tuples describing the environment (not including the body pose of the robot), Q the space of possible body poses, and U the space of possible control inputs to the robot. The most generic representation of a social cost function is simply as a function that maps these elements to a real-valued representation of cost, that is, of the form c : F × Q × U → ℝ. Given such a cost function, our objective is now to attempt to select controls for the robot, based on its body pose and environment features, which minimize (in expectation) the social costs accrued by the robot (or maximize social utility). We refer to this objective as finding a control policy for the robot. This, however, raises important difficulties:

1. The features that are relevant for social navigation and body pose control are not fully known a priori. Identifying these features is a challenge in itself, and constitutes one of the running goals of Task 5.1;

2. The environment features are only partially observable to the robot. That is, if the ‘true’ environment features at time t are encoded by f_t ∈ F, the robot cannot know with certainty the values of f_t, due to its limited sensing capabilities. This means, for example, that the robot can only estimate the facial expressions of its users in real time, but this estimation may have some associated error.

More formally, there is a space of possible observations that the sensors of the robot can produce, O. At time t, the robot will only have access to a noisy and potentially incomplete observation o_t ∈ O, that is stochastically related to the true "state" of the environment and the respective controls of the robot, (f_t, q_t, u_t), through a probability distribution Pr(o_t | f_t, q_t, u_t). The form of this distribution is not known a priori;

3. The evolution of (some of) the environment features over time is unknown and cannot be described as a closed-form dynamical system, such as in Eqs. (1) and (2). This is particularly problematic for the qualitative features of the environment, such as the facial expressions or body poses of the people interacting with the robot. Furthermore, it is likely that changes in these features occur, on average, at a much slower rate than the control loop of the robot;

4. We must take into account that the telepresence system is still being partially operated by a human and that its main purpose is to be a (mobile) communication device. The social acceptance of the robot by its users during conversation may depend on the execution of communication-oriented functions that do not have a direct effect on the body pose of the robot. These functions can include, for instance, raising or lowering the volume of the speakers on the robot, or displaying a visual message to the telepresent visitor or the interaction targets. These actions are not contemplated in a control-theoretic view of the problem, but the control of the robot may indirectly depend on the ability of the robot to perform these functions autonomously during execution. Essentially, we are facing a decision-making problem that is simultaneously related to various aspects of HRI, and not exclusively to the control of the body pose of the robot.
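To make the social-cost formulation c : F × Q × U → ℝ concrete, a feature tuple and cost function could be sketched as below. The feature names, thresholds and weights are purely illustrative; the actual socially relevant features are the subject of Task 5.1.

```python
from dataclasses import dataclass

@dataclass
class Features:
    """Illustrative environment feature tuple f = (f_1, ..., f_k); not the Task 5.1 set."""
    distance_to_target: float   # continuous feature [m]
    num_bystanders: int         # discrete feature
    target_expression: str      # qualitative feature, e.g. "neutral", "uncomfortable"

def social_cost(f: Features, q, u) -> float:
    """A toy c(f, q, u): penalize personal-space violations and abrupt actuation."""
    cost = 0.0
    if f.distance_to_target < 1.2:               # assumed proxemics threshold
        cost += 10.0 * (1.2 - f.distance_to_target)
    if f.target_expression == "uncomfortable":   # qualitative feature raises cost
        cost += 5.0
    cost += 0.1 * sum(abs(x) for x in u)         # mild penalty on control effort
    return cost
```

A policy would then select controls that minimize this cost in expectation, given only the noisy observations discussed in point 2.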

From the above, points 3 and 4 are those that most clearly invalidate a purely control-theoretic approach to the problem of controlling the TERESA robot during conversation. Indeed, point 4 implies that we are dealing not only with the problem of deciding on actuator speeds so as to control the physical pose of the robot, but with a problem of selecting generic "commands" / inputs to the robot during its interaction with its users. We will refer to this problem hereafter, in a more general sense, as the HRI control problem of the TERESA robot.

These points also suggest that the HRI control problem may be best approached from a decision-theoretic perspective, using for instance the framework of Markov Decision Processes (MDPs) to model the robot and its environment as a stochastic system, in which the actions of the robot and the state of the system are semantically open-ended [7]. However, due to the partial observability of the problem (point 2 above), and the fact that the stochastic dynamics of the system are mostly unknown, the application of these decision-theoretic formalisms becomes challenging. Partially Observable MDPs have had limited application in scenarios with a priori unknown dynamics (i.e. in the class of Reinforcement Learning (RL) problems), although there are encouraging recent results [9]. Moreover, these frameworks are known to be difficult to scale up to large decision-making problems (with many actions, observations, and possible "states" of the environment).

This issue is compounded by the fact that the robot has a fast, synchronous control rate (100 Hz), but any changes that occur in terms of the qualitative features that are relevant for social conversation are asynchronous³ and may be separated by relatively long periods of time (point 3 above). This means that any control policy that operates directly at the control rate of the robot must consider the influence that those controls may have on occurrences that may only take place after a large number of iterations of that control loop (see Figure 3a). A direct consequence of this issue is that the problem of determining the best control policy becomes potentially intractable, due to the large number of decisions, and their respective possible consequences, that must be accounted for.

³ An asynchronous process is one in which each step of the process can take a random amount of time to elapse.


[Diagram: (a) low-level control — controls (actuator velocities) fanning out into possible features, possible observations and social costs at each tick; (b) high-level control — a single action ("Move to Interaction Range") leading to possible features, possible observations, social costs, and the best next action.]

Figure 3: The effects of low- and high-level robot control on decision-making complexity. (a) If the full HRI control policy operates directly on the low-level control inputs of the robot, then it needs to predict the possible states of the system and their corresponding observations at each future "tick" of the controller, up to a large decision-making horizon. Since the system is not deterministic, the fan-out of possible future states and observations increases exponentially, leading to a blow-up in the complexity of solving the decision-making problem. (b) If, instead, the HRI control policy has access to a library of abstract "actions" that encapsulate simple behaviors which are known to be socially normative (such as moving nearer to an interaction target that demonstrates hearing problems), then that policy needs only to reason over the possible outcomes of that temporally extended action.

A better approach to this problem is to exploit existing knowledge about the robot, its environment, and its desired functionality, to divide the decision-making problem into smaller, easier-to-solve sub-problems (Figure 3b). This is the central motivation behind hierarchical approaches to decision-making. In the context of Decision Theory, hierarchical approaches have been successfully applied to challenging robot applications [10, 5].

The hierarchical approach to decision-making splits the control problem across multiple levels of abstraction, establishing a tree-like structure of decision-makers (see Figure 4). If we look at such a structure from the top down, then at each level an abstract decision-making policy may select one of its possible abstract "actions" (the elements in the discrete codomain of the decision-making policy of that level). Each of those actions can be thought of as "waking up" a lower-level decision-maker, which contains its own decision-making policy.
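This top-down delegation can be sketched in miniature. The one-dimensional "world" below (where the state is simply the distance to the interaction target) and the action names are entirely illustrative, not part of the TERESA implementation.

```python
class AbstractAction:
    """An abstract action wrapping a lower-level policy that runs to termination."""
    def __init__(self, name, policy, done):
        self.name, self.policy, self.done = name, policy, done

    def run(self, state):
        """'Wake up' the lower-level decision-maker until this action terminates."""
        for _ in range(1000):              # guard against non-termination
            if self.done(state):
                return state, "success"
            state = self.policy(state)     # one low-level control step
        return state, "failure"

# Toy actions over a 1-D state: distance to the interaction target [m].
approach = AbstractAction("Approach", policy=lambda d: d - 0.1, done=lambda d: d <= 1.5)
retreat = AbstractAction("Retreat", policy=lambda d: d + 0.1, done=lambda d: d >= 2.0)

def high_level_policy(d):
    """The upper level selects among abstract actions, not actuator speeds."""
    return approach if d > 1.5 else retreat
```

Starting 3 m away, the high level selects Approach once; the lower level then runs many control steps before the next abstract decision is needed.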

In general, a decision-making hierarchy can be arbitrarily deep, but there is one important constraint — at the lowest level of abstraction, the decision-making policy must actually produce a non-abstract output. That is, for robot control to be feasible, the first (lowest) level of abstraction must represent processes that actually produce control inputs or commands to the robot.

This establishes the motivation for the present task, and its definition of an HRI control "action library": we aim to take a hierarchical approach to the problem of controlling the TERESA robot during conversation, thereby breaking that problem down into one of selecting between different body pose or communication actions that are socially normative, instead of deciding directly on the control inputs of the robot at each time step. Doing so will allow us to approach the HRI control problem as a whole (instead of focusing solely on body pose control) and will significantly reduce the complexity of the decision-making problem at the higher levels of abstraction. This approach, however, requires that we establish a first level of abstraction, upon


[Diagram: hierarchy spanning Social Conversation, Body Pose Control Policies (Task 5.4, e.g. Approach, Retreat), the Action Library (Task 5.3: Action #1 … Action #N), Navigation Modules (WP 4) and Robot Controls, alongside Social Navigation, Robot Control Mode, Comm. Functions and Manual Control.]

Figure 4: Hierarchical decision-making / control scheme for the TERESA robot. The action library that is presented in this Deliverable provides a first level of abstraction upon which the HRI control policies of Task 5.4 can be built. Besides abstracting the control processes that are necessary to drive the robot to socially useful body poses, these actions can also encapsulate communication-oriented functions that are not related to body pose control (such as controlling the volume of the speakers).

which the remainder of the hierarchy can be learned or built. The action library described in this Deliverable constitutes this first abstraction layer.

As hinted above, most of the actions in this action library encapsulate low-level control processes that are able to drive the TERESA robot to certain body poses, or perform certain communication-specific functions, which are socially normative (and hopefully useful) to its interaction targets. Indeed, as we will discuss in Section 5, these actions are defined according to the results of the identification of socially normative robot behavior carried out in Task 3.1⁴. Each of the actions proposed here follows from a functional description of an HRI behavior that is known to be socially normative at least in some conditions. Equipped with this action library, the job of the HRI control policy that will be developed in Task 5.4 will be to provide a way of selecting between these actions so that the robot behaves in a socially normative way at all times during conversation (potentially not only in a socially normative way, but in one that maximizes perceived social utility).

4.2 Actions as Generic Integrated Processes

Even though we have argued above for the advantages of a decision-theoretic formalization of our HRI control problem, this is by no means the only formalism that is applicable to that problem and that may allow us to obtain a hierarchical HRI control policy that respects social costs or constraints. Other possibilities include, for instance, methods for Supervisory Control in Discrete-Event Systems [1]; Machine Learning methodologies such as Structured Learning [8]; or logic-based planning methods such as Constraint Satisfaction Problem solvers [6]. Each of these approaches is associated with a different formalization of the decision-making problem and consequently may require a different representation of what constitutes an action.

Rather than binding the semantic definition of our HRI actions to one of these formalisms, which could have later consequences with respect to the formal approaches that we may take

⁴ "Development of socially normative robot behaviors for navigation and body-pose". The progress of this Task is described in [2].


[Diagram: the "Move To Interaction Range" action as a block with a start / stop control signal; Arguments (Person ID; target distance); Input Data (depth images, LRF data); Output Data (robot wheel speeds); and an Outcome (success / failure).]

Figure 5: An example of a body pose action as a generic input/output process.

to this problem, we opt here to provide a generic interpretation of an action as an input-output process that can be easily integrated with the existing software architecture of the robot. This embodies the main motivation of defining a library of abstract actions that nevertheless produce concrete outputs to the rest of the robot architecture.

Taking this generic, process-oriented approach to action definition implies that any higher-level modules that implement HRI control policies are responsible for casting these processes into their own specific ontological representations of the decision-making problem. At the same time, it allows arbitrarily complex HRI control policies to be defined, while remaining seamless to deploy on the actual robot.

Our process-oriented view of actions is exemplified in Figure 5. Each action takes in a specific set of arguments and input data (those described in the figure are simply examples), and produces an output that is usable by the rest of the system. The action can be externally triggered or interrupted through a control signal, which is meant to be provided by higher-level decision-making modules. Once the action terminates, it can also return a signal representing the type of outcome of the process (typically the success or failure of the action).
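In simplified form, the interface of Figure 5 can be sketched as a generic process object. This is only an illustration of the pattern — on the robot itself this role would typically be played by a ROS action server — and all names below are hypothetical.

```python
import threading

class ActionProcess:
    """A generic input/output process in the spirit of Figure 5: started with a
    set of arguments, it consumes streaming input data, publishes output usable
    by the rest of the system, and terminates with an outcome signal."""

    def __init__(self, transfer, arguments):
        self.transfer = transfer      # maps (arguments, input datum) -> output
        self.arguments = arguments    # e.g. {"person_id": 3, "target_distance": 1.5}
        self.outcome = None           # set on termination: "success" / "preempted"
        self._stop = threading.Event()

    def start(self, input_stream, publish):
        """Run until the input stream ends or an external stop signal arrives."""
        for datum in input_stream:    # e.g. depth images, laser range-finder data
            if self._stop.is_set():
                self.outcome = "preempted"
                return
            publish(self.transfer(self.arguments, datum))  # e.g. wheel speeds
        self.outcome = "success"

    def stop(self):
        """External control signal from a higher-level decision-making module."""
        self._stop.set()
```

A higher-level policy would construct such a process with its arguments, call `start`, and read `outcome` once the action terminates.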

As we have discussed in the previous section, the actions of the robot that are relevant for HRI include not only body pose control processes, but also communication-specific functions. Both of these cases can be modeled as input / output processes. The only necessary difference between them is in the form of their output, which will take values either in the space of control inputs of the robot (in the case of body pose control actions), or in a discrete domain of scripted commands (in the case of communication-specific actions).

It is also important to note that some actions are parallelizable, in the sense that they can be concurrently executed, if and only if their execution is independent of one another and their outputs are non-overlapping. More formally, an action a can represent any process with a transfer function of the form g_a : X_a → Y_a, where X_a and Y_a are respectively the input and output domains of action a. Furthermore, if two actions a and a′ have transfer functions with codomains Y_a and Y_a′ such that Y_a ∩ Y_a′ = ∅, then a and a′ can be executed concurrently. This is the case, for instance, between body pose and communication-specific actions, which are always parallelizable. But even within each of these two classes there may be parallelizable actions — body pose actions that control only the base, or only the head of the robot can also be executed concurrently. This highlights the need to clearly identify the control resources that are used by each abstract action.
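As a minimal illustration of this disjointness condition (the resource names here are our own, not identifiers from the TERESA software), each action's output codomain can be modelled as a set of named control resources:

```python
# Illustrative sketch: each action's output codomain Y_a is modelled as a
# set of named control resources, and the condition Y_a ∩ Y_a' = ∅ is
# tested to decide whether two actions may run concurrently.
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Action:
    name: str
    inputs: FrozenSet[str]   # X_a: the data the action reads
    outputs: FrozenSet[str]  # Y_a: the resources the action writes

def parallelizable(a: Action, b: Action) -> bool:
    """Two actions may run concurrently iff their output domains are disjoint."""
    return a.outputs.isdisjoint(b.outputs)

turn = Action("Turn to Interaction Targets",
              frozenset({"person_tracking", "odometry"}),
              frozenset({"wheel_motors"}))
head = Action("Stabilize Head Pose",
              frozenset({"person_tracking"}),
              frozenset({"stalk_and_head"}))
volume = Action("Adjust Speaker Volume",
                frozenset(), frozenset({"activemq"}))

assert parallelizable(turn, head)    # base and head actuators are disjoint
assert parallelizable(turn, volume)  # body pose vs. communication action
```

Under this model, 'Turn to Interaction Targets' and 'Stabilize Head Pose' are parallelizable because they write to disjoint actuator sets, exactly as argued above.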

D5.2: Action Library Page 13 of 30


TERESA - 611153

5 Identifying Socially Normative HRI Actions

Past work in Task 3.1 of the TERESA project has sought to characterize socially normative telepresence robot behavior in controlled experimental conditions [2]. The observations that were collected in that work, coupled with additional input from ongoing work in Task 3.2, provide a basis for the definition of the functional requirements of our socially normative HRI control action library. An identification of these requirements is fundamental in order to ensure that the system is capable of socially normative behavior.

In this section, we collect and discuss the functional requirements that were identified from an analysis of the aforementioned input, and through a direct collaborative effort between UT and UvA.

Requirement 1. While operating autonomously, the robot must attempt to behave in a socially appropriate way at all times.

Discussion: This general requirement ensures that the action library contains a set of actions that is sufficient, in the sense that at least one action in the library is socially appropriate for any given situation within the planned use-cases of the TERESA robot⁵.

Requirement 2. During interaction, the robot must be capable of a limited (local) form of navigation. This form of navigation is characterized as follows: the robot should be able to traverse sufficiently small distances that the social costs of its motion are negligible and do not influence its path. However, the robot must be capable of avoiding collisions, and the motion of the robot should be smooth (according to the observations of [2]).

Discussion: The most basic use-case of the TERESA robot, which has been discussed in [2], covers three phases: an Approach phase, in which the robot moves from an initial position towards an interaction target, or a group of interaction targets; a Conversation phase, during which the telepresent visitor converses with the interaction target(s); and a Retreat phase, during which the robot moves away from the interaction target (group). Functionally, the Approach and Retreat phases depend on the social navigation modules being developed in Work Package 4, while the Conversation phase is to be handled by the HRI control policy that is the focus of tasks T5.3 and T5.4. This implicitly establishes two distinct modes of operation: Social Navigation and Social Conversation (see Figure 4). Each of these two modes of operation requires exclusive control of the base of the robot, and as such, there must be a well-defined set of conditions that define the situations in which each of these modes should be active.

This implies that, whenever the HRI control policy is initiated, it is reasonable to assume that the robot should already be in the vicinity of its interaction target(s). However, since the system is stochastic and the robot has only access to limited and potentially noisy sources of information, the exact initial conditions in which the robot starts executing its HRI control policy are not necessarily always the same. For example, while approaching the interaction target(s), the robot may erroneously estimate that it has reached its desired position. Further information could then reveal the need to reposition the robot before continuing interaction, which would require a local form of navigation. The HRI control policy must ensure that this is possible, either by handing over control of the robot to the social navigation mode once again, or by selecting HRI actions which are capable of performing this local navigation within their own

⁵This requirement also implies that the HRI control policy should select from the action library (one of) the action(s) that is deemed to be socially appropriate for a given situation. However, this is beyond the scope of the definition of the action library itself.


control processes. The latter option may be preferable since it avoids delays and potential inconsistencies in switching between different modes of operation.
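The exclusive control of the base by one mode of operation at a time, as discussed above, amounts to a simple resource-arbitration rule. The following sketch is purely illustrative (the mode names follow the text; the class is not part of the TERESA architecture):

```python
# Hypothetical sketch of the mutual exclusion implied by Requirement 2:
# the base of the robot is a single resource, so at most one mode (Social
# Navigation, Social Conversation, or manual control) may hold it at a time.
class BaseArbiter:
    MODES = {"social_navigation", "social_conversation", "manual"}

    def __init__(self):
        self.active = None  # no mode holds the base initially

    def request(self, mode: str) -> bool:
        """Grant exclusive control of the base, or refuse if already held."""
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if self.active is None or self.active == mode:
            self.active = mode
            return True
        return False

    def release(self, mode: str) -> None:
        if self.active == mode:
            self.active = None

arbiter = BaseArbiter()
assert arbiter.request("social_navigation")        # Approach phase
assert not arbiter.request("social_conversation")  # must wait for a hand-over
arbiter.release("social_navigation")
assert arbiter.request("social_conversation")      # Conversation phase
```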

Requirement 3. The robot must be capable of rotating in place in order to face an interaction target, or the center of an F-formation of a group of interaction targets.

Discussion: One of the observations of Task 3.1 was that the heading of the robot with respect to its interaction targets during conversation can significantly influence the social acceptance of the robot. The experimental results of that work also confirmed that the socially normative behavior for the robot while interacting with a group of social actors is to face the center of the F-formation of the group (i.e. the mean of the positions of all of the social actors, including the robot). The robot should, therefore, be capable of correcting its orientation if necessary. Moreover, the robot should not move its base forward or backward while doing so, since this could force the interaction target(s) to correct their own orientation with respect to the robot, and could influence its social acceptance. This restriction on angular motion is also supported by the findings of Task 3.1, which showed that human controllers tend to adjust the body pose of the robot during conversation through simple in-place rotations.

Requirement 4. The robot must be capable of controlling its distance with respect to an interaction target, or to the center of an F-formation of a group of interaction targets.

Discussion: Although the experimental observations of Task 3.1 did not find a significant correlation between the social acceptance of the robot and the distance at which the robot interacts with its users, it is still possible to conclude that:

1. Human controllers had a tendency to position the robot at a distance with respect to their interaction targets that is between the personal and social spaces of the subjects (1.25 m). This provides a good baseline distance at which the robot should attempt to position itself, if there is no other prior information regarding the preferences of a particular interaction target;

2. Elderly subjects sometimes prefer a smaller interaction distance, especially those that suffer from hearing problems.

Although some functionality that would allow the robot to navigate to a particular position in its vicinity is already prescribed by Requirement 2, here we make it explicit that the HRI control policy needs to reason over proxemics during interaction, and that therefore it may be necessary to control the motion of the robot in terms of its distance to its interaction target(s).

Furthermore, this requirement, coupled with Requirement 2, also contemplates the possible need to follow a moving interaction target to a different location.

Requirement 5. The robot must be capable of determining, in real-time, the best position from which it can interact with a group of interaction targets. If that reference position changes during conversation, the robot should be capable of repositioning itself to that reference.

Discussion: This requirement arises from the potential need to accommodate changes in the positions of a group of interaction targets, or a varying number of participants in a conversation.


Requirement 6. The robot must be capable of adjusting the volume of its speakers during conversation.

Discussion: Concurrent work in Task 3.2 has led to the conclusion that, in situations where the interaction target(s) suffer from hearing difficulties, raising the volume of the speakers on the robot is the preferred socially normative response. Therefore, the HRI action library must account for this functionality.

Requirement 7. If the telepresent visitor is disconnected during conversation, the robot must be capable of terminating the ongoing interaction in a socially acceptable manner. Furthermore, the robot must be capable of autonomously returning to a predefined location in its operational environment.

Discussion: We cannot place any restrictions or provide performance guarantees on the connection of the telepresent visitor to the TERESA system. Therefore, the system must be prepared to handle an abrupt disconnection by the visitor, or other connectivity issues, at arbitrary times during its execution. If this situation occurs during a conversation, the robot should be capable of using its multimedia devices to inform the interaction targets of these connectivity issues, and potentially terminate the conversation in a socially normative way.

The last part of this requirement implies that the HRI control policy must be prepared to hand over the control of the robot to the navigation module in such a situation.

Requirement 8. The robot must be capable of querying the telepresent visitor for input in real-time.

Discussion: This requirement ensures that, functionally, the software architecture of the robot can communicate with the interface that is used by the telepresent visitor in real-time. This is important in order to ensure that we can develop synergetic HRI control policies that explicitly account for the input of the telepresent visitor during execution.

Requirement 9. The telepresent visitor may take manual control of the robot at any time, except in situations that may be unsafe to any person or to the robot. If the visitor takes manual control of the robot, the robot will only resume autonomous operation following an explicit command from the visitor.

Discussion: This requirement ensures that manual control by the visitor is explicitly modeled as one of the possible control modes of the robot, in addition to those for Social Navigation and Social Conversation (see Figure 4). Additionally, this control mode should take precedence over the other two, as much as possible.

Functionally, this requirement also implies that the HRI control policy of the robot must be preemptable, that is, that any action may be interrupted without due termination at any time.

As a final note, the functional requirements for the HRI action library that were here presented have been formulated while taking the capabilities of the TERESA robot into account. Some HRI actions that would otherwise be considered relevant for socially normative behavior are not contemplated by these requirements, since the hardware and/or planned software of the robot lack the functionalities to satisfy them. One particular example concerns the behavior of the robot during interactions that require the robot to turn to a specific object in its vicinity that may be the subject of the conversation, or having the robot focus on particular aspects of the interaction target (such as his/her hands). These use-cases would require perceptual


functionalities that are not available. In such cases, the best solution is to allow the telepresent visitor to control the robot directly, thereby compensating for the limitations of the robot. This also highlights the need to explicitly consider the HRI control problem as a collaborative decision-making process between the robot and its telepresent user.


6 The Action Library

6.1 Preliminaries

In this section, we will describe in detail the contents of our action library. This library was created by identifying a minimal set of socially normative behaviors that cover all of the requirements discussed in Section 5, while taking into account the hardware and software limitations of the TERESA robot.

Each of the actions in this library is defined as a process (cf. Section 4.2) that can be characterized by the following elements:

• Name: A descriptive, human-readable label for the action.

• Functional Description: A description of the desired functionality for the action, matching one or more of the requirements discussed in Section 5. The functional description establishes the internal constraints of the process, in that any implementation of the action must guarantee that all of the elements of the desired functionality are respected;

• Use-case Examples: An example of one or more use-cases for the action, within the scope of one or more of the target scenarios of the TERESA system as a whole;

• Arguments: A set of input variables that must be passed to the action, and which encode the controllable parameters of its process;

• Terminating Conditions: Since the execution of an action is meant to be mediated by higher-level modules, those modules must be informed whenever the action has completed its intended function. The terminating conditions for an action define its possible outcomes, which constitute the possible ways in which an action may end its execution, and are associated with labels that the action may return upon termination. The terminating conditions of an action are representative of the specific goals that the action aims to achieve, and are fundamental to the control flow between the different levels of the decision-making hierarchy. As will become apparent in our action library, different actions may abstract very similar processes, but with different terminating conditions;

• Input Requirements: The input data that must be made available to the action during its execution. This input may be composed of sensor data, features extracted from that data, or any other data type, as long as it is readable during run-time by the process underlying the action;

• Output Requirements: The actuators of the robot that the action will use (in the case of a body pose control action), or the communication functionalities that it will need to access. These represent the elements of the robot architecture to which the action must have exclusive access during its execution;

• Implementation: A proposed mechanism which can implement (or in some cases already implements) the process which the action represents.

• (If applicable) Notes: Any additional notes concerning the definition of the action. For those actions that are already implemented in the software architecture of the TERESA robot, we include references to videos showcasing their execution.
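The elements above can be collected into a minimal abstract interface. This sketch is illustrative only (the class and outcome names are our own; the deliverable does not prescribe a concrete API):

```python
# Illustrative sketch of an action as a preemptable process with arguments,
# I/O requirements, and labeled outcomes, mirroring the elements listed above.
from abc import ABC, abstractmethod
from enum import Enum

class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    PREEMPTED = "preempted"   # interrupted by a higher-level module

class LibraryAction(ABC):
    name: str = ""
    input_requirements: set = set()    # data the action must be able to read
    output_requirements: set = set()   # resources needing exclusive access

    def __init__(self, **arguments):
        self.arguments = arguments     # controllable parameters of the process

    @abstractmethod
    def step(self) -> "Outcome | None":
        """Advance the underlying process; return an Outcome on termination,
        or None while the action is still running."""

class TurnToInteractionTargets(LibraryAction):
    name = "Turn to Interaction Targets"
    input_requirements = {"person_tracking", "odometry"}
    output_requirements = {"wheel_motors"}

    def step(self):
        # A real implementation would command the wheel motors here; this
        # stub terminates immediately, for illustration only.
        return Outcome.SUCCEEDED

action = TurnToInteractionTargets(target_ids=[3], threshold_rad=0.05)
assert action.step() is Outcome.SUCCEEDED
```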


6.2 Body Pose Control Actions

Name Turn to Interaction Targets

Functional Description The robot should rotate its base so as to face a specific interaction target, or the centroid of an F-formation of multiple interaction targets. Once the robot is facing its interaction targets, this action terminates, and the robot should no longer update its orientation (unless the action is executed again). The robot should not move forward or backward during the execution of this action.

Use-case Examples • While in a conversation with an interaction target, the robot may need to (re-)orient itself towards the interaction target before performing other actions;
• While in a conversation with a group of people, the robot may need to turn to face a specific interaction target in the group, for instance in order to face the speaker, or the person that is being spoken to.

Arguments A set of IDs of interaction targets (with known locations), which the robot should turn to face.

Terminating Conditions Succeeds if: the relative orientation with respect to the centroid of the set of positions of the interaction targets is below a given threshold, and the rate of change of that relative orientation is below a maximum residual value;
Fails if: robot motion is blocked or I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data; odometry.

Run-Time Output Req. Dedicated control of wheel motors.

Implementation PID Controller.
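As a rough sketch of how the terminating conditions above interact with a PID-style controller, the following proportional heading controller declares success once both the orientation error and its rate of change fall below thresholds (gains and tolerances are invented for the example, not the values tuned on the TERESA robot):

```python
# Illustrative proportional heading controller for 'Turn to Interaction
# Targets'. Success requires both a small heading error and a small rate
# of change, matching the terminating conditions described above.
import math

def centroid(positions):
    xs, ys = zip(*positions)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def heading_error(robot_pose, target_positions):
    """Signed angle from the robot's heading to the targets' centroid, in (-pi, pi]."""
    x, y, theta = robot_pose
    cx, cy = centroid(target_positions)
    desired = math.atan2(cy - y, cx - x)
    return math.atan2(math.sin(desired - theta), math.cos(desired - theta))

def turn_step(robot_pose, target_positions, prev_error, dt,
              kp=1.5, err_tol=0.05, rate_tol=0.02):
    error = heading_error(robot_pose, target_positions)
    rate = (error - prev_error) / dt
    succeeded = abs(error) < err_tol and abs(rate) < rate_tol
    angular_velocity = kp * error      # forward velocity stays zero
    return angular_velocity, error, succeeded

# Robot at the origin facing +x, one target on the +x axis: already aligned.
w, e, done = turn_step((0.0, 0.0, 0.0), [(2.0, 0.0)], prev_error=0.0, dt=0.1)
assert done and abs(w) < 1e-9
```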


Name Maintain Heading to Interaction Targets

Functional Description The robot should rotate its base so as to continuously face a specific interaction target, or the centroid of an F-formation of multiple interaction targets, even if the interaction target(s) move(s) from their initial locations. The robot should not move forward or backward during the execution of this action.

Use-case Examples • While in a conversation with an interaction target, the default socially normative robot behavior is to face the person directly and continuously during the conversation (or the centroid of the F-formation of social actors in the case of multiple interaction targets). However, the robot should not adjust its distance w.r.t. the interaction target(s) if this distance is still within a maximum proxemic range.

Arguments A set of IDs of interaction targets (with known locations), which the robot should continuously face.

Terminating Conditions Fails if: robot motion is blocked or I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data; odometry.

Run-Time Output Req. Dedicated control of wheel motors.

Implementation PID Controller.

Notes This action is different from the ‘Turn to Interaction Targets’ action in that it does not have a successful termination condition, which means that the execution of this action must be interrupted, following other exogenous events. This also means that the underlying control process must consider the velocities of the interaction targets as well as their positions, in order to ensure stability. In control-theoretic terms, this action addresses a posture tracking problem, whereas the Turn to Interaction Targets action concerns a posture stabilization problem [4].

This action is already implemented. Its execution can be seen at:


Name Move to Interaction Range

Functional Description The robot should move its base so as to achieve a specific distance to an interaction target, or to the centroid of an F-formation of multiple interaction targets. While moving, the robot should avoid obstacles in its path. Once the robot achieves its reference distance to the interaction target(s), the action terminates, and the robot should no longer update its target position (unless the action is executed again).

Use-case Examples • Before a conversation can take place through the telepresence robot, it should move to a socially acceptable distance of the interaction target(s). Afterwards, the interaction target should be free to set his or her own preferred distance to the robot. This means that the robot should not move continuously in an attempt to maintain the exact same distance to the interaction target throughout the conversation;
• This action can also be used to back away from an interaction target that is too close to the robot. This can occur, for instance, if the robot is inadvertently blocking someone’s path. In that case, this action may be called with a reference distance that is larger than the current distance between the robot and the user, which would cause the robot to drive backwards.

Arguments A set of IDs of interaction targets (with known locations); a scalar representing the desired distance to the interaction target (if there is only one such target), or to the centroid of the F-formation of multiple interaction targets.

Terminating Conditions Succeeds if: the distance to the interaction target (or to the centroid of the group of targets) is within a maximum deviation of the specified reference, and the rate of change of the distance w.r.t. the interaction target(s) is below a maximum residual value;
Fails if: the distance to the interaction target (or to the centroid of the group of targets) is above an absolute maximum range, or robot motion is blocked, or I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data; odometry; laser-range finder data.

Run-Time Output Req. Dedicated control of wheel motors.

Implementation Lyapunov posture stabilization function.

Notes This action is already implemented. Its execution can be seen at:
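The terminating conditions above can be sketched as follows. The simple proportional law and the thresholds are assumptions for illustration; the actual implementation uses a Lyapunov posture stabilization function:

```python
# Illustrative sketch of 'Move to Interaction Range': drive the distance to
# the targets' centroid toward a reference, succeed when the error and its
# rate are small, and fail when the target is beyond an absolute max range.
import math

def range_step(robot_xy, centroid_xy, reference, prev_dist, dt,
               kv=0.8, dist_tol=0.10, rate_tol=0.05, max_range=6.0):
    dist = math.hypot(centroid_xy[0] - robot_xy[0],
                      centroid_xy[1] - robot_xy[1])
    if dist > max_range:
        return 0.0, dist, "failure"        # target beyond absolute range
    rate = (dist - prev_dist) / dt
    error = dist - reference               # positive: too far; negative: too close
    if abs(error) < dist_tol and abs(rate) < rate_tol:
        return 0.0, dist, "success"
    # Drive forward when too far, backward when too close (the "back away"
    # use-case described above).
    return kv * error, dist, "running"

v, d, outcome = range_step((0.0, 0.0), (1.3, 0.0), reference=1.25,
                           prev_dist=1.3, dt=0.1)
assert outcome == "success" and v == 0.0
```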


Name Maintain Interaction Range

Functional Description If the distance between the robot and an interaction target, or to the centroid of an F-formation of multiple interaction targets, is greater than a given maximum distance, the robot should move its base so as to reduce the distance to its interaction target(s). If the robot is closer to its interaction target(s) than the maximum distance, it should control its heading so as to face the interaction target(s), but should not move its base. While moving, the robot should avoid obstacles in its path.

Use-case Examples • This action can be used to ensure that a maximum proxemic range between the robot and its interaction targets is maintained, for instance, while following a user to a different location in the environment. However, this action still allows the user to come closer to the robot (i.e. the robot does not back away from the user while executing this action).

Arguments A set of IDs of interaction targets (with known locations); a scalar representing the maximum distance to an interaction target, or to the centroid of an F-formation of multiple interaction targets.

Terminating Conditions Fails if: robot motion is blocked or I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data; odometry; laser-range finder data.

Run-Time Output Req. Dedicated control of wheel motors.

Implementation Lyapunov posture stabilization function.

Notes This action is already implemented. Its execution can be seen at:

Name Stabilize Head Pose

Functional Description The robot should adjust its head height so as to match the height of an interaction target (or the average height of a group of interaction targets). It should also adjust its head tilt so as to minimize the interaction target’s viewing angle to the screen of the robot (or the average viewing angle of a group of interaction targets).

Use-case Examples • This action can be used concurrently with most other body pose control actions, ensuring that the screen of the robot is always appropriately positioned to allow the interaction target(s) to easily view the telepresent visitor’s facial expressions, and vice-versa.

Arguments A set of IDs of interaction targets (with known head positions).

Terminating Conditions Fails if: I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data.

Run-Time Output Req. Dedicated control of stalk and head tilt actuators.

Implementation PID Controller.

Notes This action is already implemented. Its execution can be seen at:


Name Reposition for Conversation

Functional Description While in a conversation with a group of interaction targets, the robot should determine the most socially appropriate position to interact with the group. The robot should then reposition itself to that location by moving its base. While moving, the robot should avoid obstacles in its path.

Use-case Examples • While conversing with one or more interaction targets, if another person joins the group, or if one of the current social actors leaves the group, the robot may need to readjust its position in order to accommodate the updated group of interaction targets.

Arguments A set of IDs of interaction targets, associated with known positions in the local frame of the robot.

Terminating Conditions Succeeds if: the base pose of the robot matches the perceived optimal pose to interact with the specified group, and the absolute linear and angular velocities of the robot are below a residual value;
Fails if: robot motion is blocked or I/O requirements not satisfied.

Run-Time Input Req. Person localization / tracking data; odometry; laser-range finder data.

Run-Time Output Req. Dedicated control of wheel motors.

Implementation Lyapunov posture stabilization function.

Name Look Around

Functional Description The robot should pan the scene by rotating its base in place, executing a pre-defined motion. The robot may also adjust its head height or tilt during the execution of this action. The robot should not move forwards or backwards.

Use-case Examples • The image-based sensors of the robot cover a limited field-of-view in front of the robot. This partial observability induces a measure of uncertainty on the features of the environment surrounding the robot. In some situations, it may be beneficial to perform a scripted action such as the one described above, in order to gather more information and reduce perceptual uncertainty;
• It may also be useful for the telepresent visitor to manually trigger this action in order to quickly become aware of the surroundings of the robot.

Arguments —

Terminating Conditions Succeeds if: the scripted motion of the robot was carried out;
Fails if: robot motion is blocked or I/O requirements not satisfied.

Run-Time Input Req. Odometry

Run-Time Output Req. Dedicated control of wheel motors; dedicated control of stalk and head tilt actuators.

Implementation Open-loop controller
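An open-loop script of this kind could be generated as below (the sweep amplitude, speed, and time step are invented for the example; note that the net rotation is zero and the linear velocity is always zero, as the functional description requires):

```python
# Illustrative open-loop 'Look Around' script: a fixed sequence of timed
# angular-velocity commands that sweeps the base left, then right past the
# starting heading, then back, with no sensory feedback.
def look_around_script(sweep_rad=1.0, speed=0.5, dt=0.05):
    """Return a list of (linear_v, angular_v) commands, one per time step."""
    steps = round(sweep_rad / (speed * dt))
    commands = []
    commands += [(0.0, +speed)] * steps        # sweep left
    commands += [(0.0, -speed)] * (2 * steps)  # sweep right past the start
    commands += [(0.0, +speed)] * steps        # return to initial heading
    return commands

cmds = look_around_script()
assert all(v == 0.0 for v, _ in cmds)          # never moves forward/backward
assert abs(sum(w for _, w in cmds)) < 1e-9     # net rotation is zero
```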


6.3 Communication Actions

Name Adjust Speaker Volume

Functional Description The robot should change the volume of its speakers to a specified setting.

Use-case Examples • As a response to perceived hearing problems of the interaction targets.

Arguments The level to which the speaker volume should be set.

Terminating Conditions Succeeds if: a response is received from the Giraff software, via ActiveMQ, confirming that the speaker level has been updated; this response must arrive within a fixed time limit.
Fails if: a successful response to the request to adjust the speaker volume is not received within a fixed time limit, or a negative response to the request is received.

Run-Time Input Req. —

Run-Time Output Req. Connection with the ActiveMQ server.

Implementation ActiveMQ publish / subscribe mechanism.
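The ActiveMQ exchange itself is not reproduced here; the sketch below only illustrates the timeout-bounded outcome logic described above, with a standard-library queue standing in for the reply subscription (field names such as "ok" are assumptions):

```python
# Illustrative outcome logic for 'Adjust Speaker Volume': wait for a
# confirmation reply within a fixed time limit; a missing or negative
# reply maps to failure, matching the terminating conditions above.
import queue

def adjust_volume_outcome(reply_queue, timeout_s=2.0):
    """Classify the action outcome from the (stand-in) reply channel."""
    try:
        reply = reply_queue.get(timeout=timeout_s)
    except queue.Empty:
        return "failure"                  # no reply within the time limit
    return "success" if reply.get("ok") else "failure"

replies = queue.Queue()
replies.put({"ok": True, "volume": 7})    # hypothetical positive reply
assert adjust_volume_outcome(replies) == "success"

assert adjust_volume_outcome(queue.Queue(), timeout_s=0.01) == "failure"
```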

Name User Disconnect Fallback

Functional Description The robot should play back recorded media on its screen and through its speakers, to inform its current interaction targets that the telepresent visitor has been disconnected or that there has not been any activity by the visitor within a fixed time limit.

Use-case Examples • Whenever the telepresent visitor’s connection is dropped, or whenever the visitor is away from the interface for an extended period of time. This action may be followed by a navigation action to return the robot to its ‘home’ position. This would prevent the robot from being left in the environment without a human operator.

Arguments —

Terminating Conditions Succeeds if: the scripted behavior associated with this action is executed until its end;
Fails if: the behavior returns errors in its execution.

Run-Time Input Req. —

Run-Time Output Req. Connection with the ActiveMQ server.

Implementation ActiveMQ publish / subscribe mechanism.


Name Ask for Input

Functional Description The robot should send a request for input from the user, in the form of a selection from a list of possible entries. This request should cause the Giraff interface to display a dialog that will prompt the visitor to select one element from that list as a response to the input request.

Use-case Examples • A response to a yes/no input request, by using this action, may be used as a precondition to other body pose actions, in order to prevent the autonomous control policy from inopportunely taking control of the robot away from the telepresent visitor. This may be especially relevant for those actions which involve moving the base of the robot during conversation (e.g. the ‘Reposition for Conversation’ and ‘Look Around’ actions), since this may prevent the telepresent visitor from interacting with the people in the vicinity of the robot while this motion is taking place.

Arguments A list of options, from which the user should select one (and only one).

Terminating Conditions Option #N if: the user selects the N-th element from the list, and that response is received within a fixed time limit;
Timeout if: no reply is received from the user within a fixed time limit.

Run-Time Input Req. —

Run-Time Output Req. Connection with the ActiveMQ server.

Implementation ActiveMQ publish / subscribe mechanism.
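The 'Option #N' / 'Timeout' outcome mapping can be sketched in the same style (again with a standard-library queue standing in for the ActiveMQ reply channel; the reply format is an assumption):

```python
# Illustrative outcome logic for 'Ask for Input': a reply selecting the
# N-th entry maps to "Option #N"; no valid reply within the time limit
# maps to "Timeout", matching the terminating conditions above.
import queue

def ask_for_input(options, reply_queue, timeout_s=10.0):
    try:
        selection = reply_queue.get(timeout=timeout_s)
    except queue.Empty:
        return "Timeout"
    if 0 <= selection < len(options):
        return f"Option #{selection + 1}"
    return "Timeout"                      # malformed reply treated as no reply

replies = queue.Queue()
replies.put(0)                            # visitor picks the first entry
assert ask_for_input(["yes", "no"], replies) == "Option #1"
assert ask_for_input(["yes", "no"], queue.Queue(), timeout_s=0.01) == "Timeout"
```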

6.4 Summary

We summarize the contents of our action library in Table 1. There, we provide an overview of the input / output requirements of each of the proposed actions. The actions that are parallelizable can be clearly identified as those with non-overlapping output requirements.


| Action | Wheel Motors | Stalk & Head Motors | ActiveMQ | Person Tracking | Laser Range-Finders | Odometry | Implementation |
|---|---|---|---|---|---|---|---|
| Turn to Interaction Targets | X | | | X | | X | PID Controller |
| Maintain Heading to Interaction Targets | X | | | X | | X | PID Controller |
| Move to Interaction Range | X | | | X | X | X | non-linear controller |
| Maintain Interaction Range | X | | | X | X | X | non-linear controller |
| Stabilize Head Pose | | X | | X | | | PID Controller |
| Reposition for Conversation | X | | | X | X | X | non-linear controller |
| Look Around | X | X | | | | X | Open-loop |
| Adjust Speaker Volume | | | X | | | | ActiveMQ publish / subscribe |
| User Disconnect Fallback | | | X | | | | ActiveMQ publish / subscribe |
| Ask for Input | | | X | | | | ActiveMQ publish / subscribe |

Table 1: A summary of the proposed actions, showing their input / output requirements (exclusive control of actuators and communication channels, and required run-time data), and how they are implemented. The first seven rows are body pose actions; the last three are communication actions.

D5.2:A

ctionLibrary

Page26

of30


TERESA - 611153

6.5 An Example of the Use of the Action Library: a Preliminary Body Pose Control Policy

We will now describe a small case-study of our action library. As a way of validating our proposed hierarchical HRI control methodology on the TERESA robot, we have implemented a subset of the actions that were described in this document, and combined them through a proof-of-concept body pose control policy, in a way that is fully integrated with the software architecture of the robot. This case-study was demonstrated in the First Review Meeting of the TERESA project in Seville, Spain, as a way of showcasing the ability of the TERESA system to consider proxemics.

The proof-of-concept body pose control policy, which is summarized in Figures 6 and 7, was implemented as a two-level Hierarchical Finite State Machine (FSM), and can be described as follows:

• At the top level of abstraction, the policy consists of a simple three-state FSM (Figure 6). These states, Normal Mode, Close Mode and Far Mode, abstract the control of the robot at different proxemic ranges to an interaction target;

• While in Normal Mode, which is the initial state of the policy, and upon detecting the presence of the interaction target, the robot moves to a predefined baseline distance of that person (by executing the Move To Interaction Range action), while maintaining an appropriate head height and tilt (by concurrently executing the Stabilize Head Pose action). After reaching its designated baseline distance, the robot executes the action Maintain Heading To Interaction Targets. This allows the interaction target to set their own distance to the robot;

• The policy will enter Close Mode whenever the robot detects that the interaction target is showing signs of hearing problems. This detection is produced by analyzing multiple sources of input in real time: a set of hand-written classifiers over the body pose of the interaction target, which is in turn tracked by the on-board depth camera of the robot; and an audio-processing module that estimates the background noise of the scene based on the data from the on-board microphone array.

While in Close Mode, the robot will reduce its distance to the interaction target while maintaining an appropriate orientation and head pose, essentially by executing the same actions as in Normal Mode, albeit with different arguments (a smaller reference distance to the interaction target). In this state, however, the robot can also back away from the interaction target if the user comes too close to the robot. This is achieved by re-executing the Move To Interaction Range action.

• The policy will enter Far Mode if the interaction target moves outside of the social space of the robot (in our implementation, this corresponds to a maximum distance of 2 m). While in Far Mode, the robot will execute the Maintain Interaction Range action, so that it follows the user at this maximum proxemic range, enabling the telepresent visitor to converse with a moving interaction target.
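As a rough sketch, the top-level mode switching described above can be written as a transition table. The state and event names here are our own, chosen to mirror Figure 6 rather than taken from the TERESA code, and any transitions back to Normal Mode are omitted since they are not described above:

```python
# Top-level proxemic modes of the proof-of-concept policy (cf. Figure 6).
TRANSITIONS = {
    ('normal', 'hearing_problems_detected'): 'close',
    ('normal', 'user_outside_social_space'): 'far',
    ('close', 'user_outside_social_space'): 'far',
    ('far', 'hearing_problems_detected'): 'close',
}

def step(state, event):
    """Advance the mode FSM; events with no outgoing edge from the
    current mode leave the mode unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Each mode then binds a different set of concurrently executing actions (and action arguments), which is exactly the separation of concerns the two-level hierarchy provides.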

A video of the execution of this policy can be found at:.

This policy was implemented in our ROS-based software architecture, by using a specialized software library for the implementation of FSMs (the SMACH library). Figure 7 shows the layout of the hierarchical FSM as it was implemented. We emphasize, however, that the actions themselves are implemented as modular, independent processes, and are not dependent upon




Figure 6: A high-level representation of our proof-of-concept body pose control policy as a Finite State Machine.

Figure 7: Implementation of the proof-of-concept body pose control policy as a Hierarchical Finite State Machine. The shaded rectangles represent the top-level abstract states of the FSM, and the ellipses represent the states of the bottom-level FSM, most of which correspond directly to actions described in this document (other low-level states, which do not correspond to robot behaviors, are also necessary to mediate the control flow of the policy). The green-shaded states represent the initial conditions of the FSM. The edges between the states are associated with specific termination conditions.

this Hierarchical FSM interpretation of the control policy. Later work on Task 5.4 will focus on learning / developing more sophisticated HRI control policies using potentially different control formalisms.


6.6 Actions as Useful High-Level Pilot Commands

As we have previously discussed, our main motivation for the development of this action library is to enable the development of (semi-)autonomous HRI control policies. However, these actions may provide one additional advantage: since they represent well-defined robot behaviors that aim at being socially appropriate in certain conditions, they may also facilitate the control of the robot by the telepresent visitor, by letting the visitor select and trigger the execution of one of these behaviors, instead of controlling the robot directly through its actuators. For this potential use-case, only the body pose control actions are applicable.

We plan to integrate additional functionality into the Giraff Pilot interface that will enable the telepresent user to control the robot using the proposed body pose control actions. This will also enable us to evaluate if this control mode is useful to the users of the TERESA system, and if so, which actions are best suited to high-level human control of the robot.

7 Conclusions and Future Work in Task 5.3

In this task, we have identified a set of abstract, socially normative actions that enable the high-level control of the TERESA robot during social conversation.

This identification followed from the experimental observations of Task 5.3 regarding the characteristics of socially normative robot behavior [2], and from an analysis of the capabilities of the robot. These actions will enable the progress of Task 5.4, allowing us to develop higher-level HRI control policies.

Our action library was defined in a process-oriented way that enables its easy integration with the software architecture of the TERESA robot. A subset of the actions presented in this library has already been implemented and demonstrated in practice.

We consider the proposed objectives of Task 5.3 to have been attained according to the work plan for Work Package 5, without significant deviations. Immediate future work on the action library will focus on the implementation of all of the actions proposed here. These actions will be deployed in Experiment 3 of the TERESA project, as a basis for novel HRI control policies that will be concurrently developed in the context of Task 5.4.


Bibliography

[1] Christos G. Cassandras and Stéphane Lafortune. Introduction to Discrete Event Systems. Springer Publishing Company, Incorporated, 2nd edition, 2010.

[2] TERESA Consortium. Deliverable D3.1: Normative Behavior Report.

[3] TERESA Consortium. Deliverable D6.3: First version of semi-autonomous telepresence system.

[4] Carlos Canudas de Wit, Bruno Siciliano, and Georges Bastin. Theory of Robot Control. Springer Science & Business Media, 2012.

[5] Amalia Foka and Panos Trahanias. Real-time hierarchical POMDPs for autonomous robot navigation. Robotics and Autonomous Systems, 55(7):561–571, 2007.

[6] Vipin Kumar. Algorithms for constraint-satisfaction problems: A survey. AI Magazine, 13(1):32, 1992.

[7] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, 2014.

[8] Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, pages 1453–1484, 2005.

[9] Nikos Vlassis and Marc Toussaint. Model-free reinforcement learning as mixture learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1081–1088. ACM, 2009.

[10] Stefan J. Witwicki, José Carlos Castillo, Jesús Capitán, João V. Messias, João C. Reis, Pedro U. Lima, Francisco S. Melo, and Matthijs T. J. Spaan. A testbed for autonomous robot surveillance. In Proceedings of the 2014 International Conference on Autonomous Agents and Multi-agent Systems, pages 1635–1636. International Foundation for Autonomous Agents and Multiagent Systems, 2014.
