Lecture Notes in Artificial Intelligence 5641 Edited by R. Goebel, J. Siekmann, and W. Wahlster Subseries of Lecture Notes in Computer Science

Anna Esposito Robert Vích (Eds.)

Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions

COST Action 2102 International Conference
Prague, Czech Republic, October 15-18, 2008
Revised Selected and Invited Papers

Series Editors

Randy Goebel, University of Alberta, Edmonton, Canada
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany

Volume Editors

Anna Esposito
Second University of Naples, Department of Psychology
and IIASS, International Institute for Advanced Scientific Studies
Via G. Pellegrino 19, 84019 Vietri sul Mare (SA), Italy
E-mail: [email protected]

Robert Vích
Institute of Photonics and Electronics
Academy of Sciences of the Czech Republic
Chaberská 57, 182 52 Prague 8, Czech Republic
E-mail: [email protected]

Library of Congress Control Number: 2009931057

CR Subject Classification (1998): I.5, H.5, I.2.7, I.2.10, I.4

LNCS Sublibrary: SL 7 – Artificial Intelligence

ISSN 0302-9743
ISBN-10 3-642-03319-9 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-03319-3 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

springer.com

© Springer-Verlag Berlin Heidelberg 2009
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12731275 06/3180 5 4 3 2 1 0

This book is dedicated to:

Maria Marinaro A person of exceptional human and ethical qualities

and a scientist of outstanding value.

and to all those:

who posit questions whose answers raise new questions driving the emotional sense of any scientific work

Preface

This volume brings together the peer-reviewed contributions of the participants at the COST 2102 International Conference on “Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions” held in Prague, Czech Republic, October 15–18, 2008.

The conference was sponsored by COST (European Cooperation in the Field of Scientific and Technical Research, www.cost.esf.org/domains_actions/ict) in the domain of Information and Communication Technologies (ICT) for disseminating the research advances developed within COST Action 2102: “Cross-Modal Analysis of Verbal and Nonverbal Communication” (http://cost2102.cs.stir.ac.uk).

COST 2102 research networking has contributed to modifying the conventional theoretical approach to the cross-modal analysis of verbal and nonverbal communication, replacing the concept of face-to-face communication with that of body-to-body communication and developing the idea of embodied information. Information is no longer the result of a difference in perception and is no longer measured in terms of quantity of stimuli, since the research developed in COST 2102 has proved that human information processing is a nonlinear process that cannot be seen as the sum of the numerous pieces of information available. Considering simply the pieces of information available results in a model of the receiver as a mere decoder and produces a huge simplification of the communication process. What has emerged from COST 2102 research is that human information processing does rely on several communication modes but also, more importantly, strongly depends on the context in which the communication process is instantiated. The implication is a change of perspective in which the research focus moves from “communicative tools” to “communicative instances” and calls for investigations that take into account the environment and the context in which communicative acts take place. The consequences in ICT research should lead to the development of instantiated interactive dialogue systems and instantiated intelligent avatars able to act by exploiting contextual and environmental signals and to process them by combining previous experience (memory) adapted to the problem instance.

Currently, advances in COST 2102 research have shown the implementation of interactive systems such as:

• Visitors controlled an avatar in a lively multi-user 3D environment where characters follow natural and realistic behavior patterns.
• Virtual trainer that monitors the user's behavior.
• Demonstration of the use of motion capture, physical simulation and kinematics on a single body.
• Multimodal signal processing for dance synthesis by analysis.
• Alternative augmentative communication systems.
• Infant voice-controlled robot.
• Data of interactions and vocal and facial emotional expressions that are exploited for developing new algorithms and mathematical models for vocal and facial expression recognition.
• Software showing progress made in HMI, as far as spoken dialogue is concerned.
• Remote health monitoring.
• Telecommunication.

Some of these implementations and results were presented at ICT 2008 in Lyon, France, during November 23–25, 2008 (http://www.cost.esf.org/about_cost/cost_stories/ICT-2008, http://www.cost.esf.org/events/ICT-2008-I-s-to-the-Future).

The conference in Prague was developed around the COST 2102 main themes and benefited from a special session on “Emotions and ICT” jointly organized with COST 298 (http://www.cost298.org/).

This book is roughly arranged into three sections, according to a thematic classification, even though the COST 2102 research field is largely interdisciplinary and complex, involving expertise in computer graphics, animation, artificial intelligence, natural language processing, cognitive and psychological modeling of human–human and human–machine interaction, linguistics, communication, and artificial life, as well as cross-fertilization between social sciences and engineering (psychology, sociology, linguistics, neuropsychology).

The first section, “Emotions and ICT,” deals with themes related to the cross-fertilization between studies on ICT practices of use and cross-modal analysis of verbal and nonverbal communication.

The second section, “Verbal and Nonverbal Features of Computational Phonetics,” presents original studies devoted to the modeling of verbal and nonverbal phonetics.

The third section, “Algorithmic and Theoretical Analysis of Multimodal Interfaces,” presents theoretical and practical implementations of original studies devoted to the analysis of speech, gestures, face and head movements as well as to learning issues in human–computer interaction and to algorithmic solutions for noisy environments in human–machine exchanges.

The editors would like to thank the COST-ICT Programme for supporting the realization of the conference and the publication of this volume, and in particular the COST Science Officers Gian Mario Maggio, Francesca Boscolo and Sophie Beaubron for their constant help and guidance.

Our gratitude goes to the staff of the Charles University in Prague, and in particular to Jan Volin for making available the space and people to help in the conference organization. The Prague Academia of Science, Institute of Photonics and Electronics, is deeply acknowledged for contributing to the event, in particular, Petr Horák for his hard and invaluable work.

Special appreciation goes to the International Institute for Advanced Scientific Studies, and in particular to Tina Marcella Nappi, Michele Donnarumma, and Antonio Natale, for their invaluable editorial and technical support in the organization of this volume.

The editors are extremely grateful to the contributors and the keynote speakers, whose work stimulated an extremely interesting interaction with the attendees, and to the COST 2102 International Scientific Committee for the accurate review work, for their dedication, and their valuable selection process.

May 2009
Anna Esposito
Robert Vích

Organization

International Advisory and Organizing Committee

Robert Vích, Institute of Photonics and Electronics, Prague, Czech Republic
Anna Esposito, Second University of Naples and IIASS, Italy
Eric Keller, University of Lausanne, Switzerland
Marcos Faundez-Zanuy, University of Mataro, Barcelona, Spain
Petr Horák, Institute of Photonics and Electronics, Prague, Czech Republic
Amir Hussain, University of Stirling, UK
Dagmar Dvořáková, Media Communication Department, Prague, Czech Republic
Jitka Veroňková, Institute of Phonetics, Charles University, Prague, Czech Republic
Jitka Pečenková, Institute of Photonics and Electronics, Prague, Czech Republic
Irena Vítková, Media Communication Department, Prague, Czech Republic
Jan Volín, Institute of Phonetics, Charles University, Prague, Czech Republic

International Scientific Committee

Uwe Altmann, Technische Universität Dresden, Germany
Hicham Atassi, Brno University of Technology, Czech Republic
Nikos Avouris, University of Patras, Greece
Ruth Bahr, University of South Florida, USA
Gérard Bailly, ICP, Grenoble, France
Marian Bartlett, University of California, San Diego, USA
Štefan Beňuš, Constantine the Philosopher University, Nitra, Slovakia
Niels Ole Bernsen, University of Southern Denmark, Denmark
Jonas Beskow, Royal Institute of Technology, Sweden
Horst Bishof, Technical University Graz, Austria
Peter Birkholz, Aachen University, Germany
Jean-Francois Bonastre, Université d'Avignon, France
Nikolaos Bourbakis, ITRI, Wright State University, Dayton, USA
Maja Bratanić, University of Zagreb, Croatia
Antonio Calabrese, Istituto di Cibernetica – CNR, Naples, Italy
Paola Campadelli, Università di Milano, Italy
Nick Campbell, ATR Human Information Science Labs, Kyoto, Japan
Antonio Castro Fonseca, Universidade de Coimbra, Portugal
Aleksandra Cerekovic, Faculty of Electrical Engineering, Croatia
Josef Chaloupka, Technical University of Liberec, Czech Republic
Mohamed Chetouani, Université Pierre et Marie Curie, France
Gerard Chollet, CNRS-LTCI, Paris, France
Muzeyyen Ciyiltepe, Gulhane Askeri Tip Academisi, Ankara, Turkey
Anton Čižmár, Technical University of Košice, Slovakia
Nicholas Costen, Manchester Metropolitan University, UK
Vlado Delic, University of Novi Sad, Serbia
Marion Dohen, ICP, Grenoble, France
Francesca D’Olimpio, Second University of Naples, Italy
Thierry Dutoit, Faculté Polytechnique de Mons, Belgium
Laila Dybkjær, University of Southern Denmark, Denmark
Matthias Eichner, Technische Universität Dresden, Germany
Aly El-Bahrawy, Faculty of Engineering, Cairo, Egypt
Engin Erzin, Koc University, Istanbul, Turkey
Anna Esposito, Second University of Naples, and IIASS, Italy
Joan Fàbregas Peinado, Escola Universitaria de Mataro, Spain
Sascha Fagel, Technische Universität Berlin, Germany
Nikos Fakotakis, University of Patras, Greece
Marcos Faundez-Zanuy, Escola Universitaria de Mataro, Spain
Dilek Fidan, Ankara University, Turkey
Leopoldina Fortunati, Università di Udine, Italy
Carmen García-Mateo, University of Vigo, Spain
Björn Granström, Royal Institute of Technology (KTH), Sweden
Marco Grassi, Università Politecnica delle Marche, Italy
Maurice Grinberg, New Bulgarian University, Bulgaria
Mohand Said Hacid, Université Claude Bernard Lyon 1, France
Jaakko Hakulinen, University of Tampere, Finland
Ioannis Hatzilygeroudis, University of Patras, Greece
Immaculada Hernaez, University of the Basque Country, Spain
Javier Hernando, Technical University of Catalonia, Spain
Wolfgang Hess, Universität Bonn, Germany
Dirk Heylen, University of Twente, The Netherlands
Rüdiger Hoffmann, Technische Universität Dresden, Germany
David House, Royal Institute of Technology (KTH), Sweden
Amir Hussain, University of Stirling, UK
Ewa Jarmolowicz, Adam Mickiewicz University, Poznan, Poland
Kristiina Jokinen, University of Helsinki, Finland
Jozef Juhár, Technical University Košice, Slovak Republic
Zdravko Kacic, University of Maribor, Slovenia
Maciej Karpinski, Adam Mickiewicz University, Poznan, Poland
Eric Keller, Université de Lausanne, Switzerland
Adam Kendon, University of Pennsylvania, USA
Stefan Kopp, University of Bielefeld, Germany
Jacques Koreman, University of Science and Technology, Norway
Robert Krauss, Columbia University, New York, USA
Maria Koutsombogera, Inst. for Language and Speech Processing, Greece
Bernd Kröger, Aachen University, Germany
Gernot Kubin, Graz University of Technology, Austria
Alida Labella, Second University of Naples, Italy
Yiannis Laouris, Cyprus Neuroscience and Technology Institute, Cyprus
Børge Lindberg, Aalborg University, Denmark
Wojciech Majewski, Wroclaw University of Technology, Poland
Pantelis Makris, Neuroscience and Technology Institute, Cyprus
Raffaele Martone, Second University of Naples, Italy
Dominic Massaro, University of California - Santa Cruz, USA
David McNeill, University of Chicago, USA
Nicola Melone, Second University of Naples, Italy
Katya Mihaylova, University of National and World Economy, Sofia, Bulgaria
Michal Mirilovič, Technical University of Košice, Slovakia
Peter Murphy, University of Limerick, Ireland
Antonio Natale, Salerno University and IIASS, Italy
Eva Navas, Escuela Superior de Ingenieros, Bilbao, Spain
Delroy Nelson, University College London, UK
Géza Németh, Budapest University of Technology, Hungary
Friedrich Neubarth, Research Inst. Artificial Intelligence, Austria
Giovanna Nigro, Second University of Naples, Italy
Anton Nijholt, University of Twente, The Netherlands
Jan Nouza, Technical University of Liberec, Czech Republic
Igor Pandzic, Faculty of Electrical Engineering, Croatia
Harris Papageorgiou, Inst. for Language and Speech Processing, Greece
Ana Pavia, Spoken Language Systems Laboratory, Portugal
Catherine Pelachaud, Université de Paris 8, France
Bojan Petek, University of Ljubljana, Slovenia
Harmut R. Pfitzinger, University of Munich, Germany
Francesco Piazza, Università Politecnica delle Marche, Italy
Neda Pintaric, University of Zagreb, Croatia
Isabella Poggi, Università di Roma 3, Italy
Jiří Přibil, Academy of Sciences, Czech Republic
Anna Přibilová, Slovak University of Technology, Slovakia
Michael Pucher, Telecommunications Research Center Vienna, Austria
Jurate Puniene, Kaunas University of Technology, Lithuania
Giuliana Ramella, Istituto di Cibernetica – CNR, Naples, Italy
Kari-Jouko Räihä, University of Tampere, Finland
José Rebelo, Universidade de Coimbra, Portugal
Luigi Maria Ricciardi, Università di Napoli “Federico II”, Italy
Matej Rojc, University of Maribor, Slovenia
Algimantas Rudzionis, Kaunas University of Technology, Lithuania
Vytautas Rudzionis, Kaunas University of Technology, Lithuania
Milan Rusko, Slovak Academy of Sciences, Slovak Republic
Zsófia Ruttkay, Pazmany Peter Catholic University, Hungary
Bartolomeo Sapio, Fondazione Ugo Bordoni, Rome, Italy
Yoshinori Sagisaka, Waseda University, Tokyo, Japan
Silvia Scarpetta, Salerno University, Italy
Ralph Schnitker, Aachen University, Germany
Jean Schoentgen, Université Libre de Bruxelles, Belgium
Stefanie Shattuck-Hufnagel, MIT, Cambridge, USA
Zdeněk Smékal, Brno University of Technology, Czech Republic
Stefano Squartini, Università Politecnica delle Marche, Italy
Piotr Staroniewicz, Wroclaw University of Technology, Poland
Vojtěch Stejskal, Brno University of Technology, Czech Republic
Marian Stewart-Bartlett, University of California, San Diego, USA
Jianhua Tao, Chinese Academy of Sciences, P.R. China
Jure F. Tasič, University of Ljubljana, Slovenia
Murat Tekalp, Koc University, Istanbul, Turkey
Kristinn Thórisson, Reykjavík University, Iceland
Isabel Trancoso, Spoken Language Systems Laboratory, Portugal
Luigi Trojano, Second University of Naples, Italy
Wolfgang Tschacher, University of Bern, Switzerland
Markku Turunen, University of Tampere, Finland
Henk Van Den Heuvel, Radboud University Nijmegen, The Netherlands
Robert Vích, Academy of Sciences, Czech Republic
Klára Vicsi, Budapest University of Technology, Hungary
Leticia Vicente-Rasoamalala, Alchi Prefectural University, Japan
Hannes Högni Vilhjálmsson, Reykjavík University, Iceland
Jane Vincent, University of Surrey, Guildford, UK
Vogel, University of Dublin, Ireland
Jan Volín, Charles University, Czech Republic
Rosa Volpe, Université De Perpignan Via Domitia, France
Yorick Wilks, University of Sheffield, UK
Matthias Wimmer, Technische Universität München, Germany
Matthias Wolf, Technische Universität Dresden, Germany
Bencie Woll, University College London, UK
Bayya Yegnanarayana, Institute of Information Technology, India
Jerneja Žganec Gros, Alpineon Development and Research, Slovenia
Goranka Zoric, Faculty of Electrical Engineering, Croatia

Sponsors

• COST - European Science Foundation: COST ACTION 2102: “Cross-Modal Analysis of Verbal and Nonverbal Communication”
• Institute of Photonics and Electronics, Academy of Sciences, Prague, Czech Republic
• Media and Communication, Academy of Sciences, Prague, Czech Republic
• Institute of Phonetics, Faculty of Philosophy and Arts, Charles University, Prague, Czech Republic
• Institute of Applied Physics, Johann Wolfgang University, Frankfurt/Main, Germany
• Institute of Phonetics, Johann Wolfgang University, Frankfurt/Main, Germany
• Institute of Acoustics and Speech Communication, Dresden University of Technology, Dresden, Germany
• International Institute for Advanced Scientific Studies, Italy
• Second University of Naples, Caserta, Italy
• Regione Campania, Italy
• Provincia di Salerno, Italy

Table of Contents

I Emotions and ICT

Cross-Fertilization between Studies on ICT Practices of Use and Cross-Modal Analysis of Verbal and Nonverbal Communication
Leopoldina Fortunati, Anna Esposito, and Jane Vincent

Theories without Heart
Leopoldina Fortunati

Prosodic Characteristics and Emotional Meanings of Slovak Hot-Spot Words
Stefan Benus and Milan Rusko

Affiliations, Emotion and the Mobile Phone
Jane Vincent

Polish Emotional Speech Database – Recording and Preliminary Validation
Piotr Staroniewicz and Wojciech Majewski

Towards a Framework of Critical Multimodal Analysis: Emotion in a Film Trailer
Maria Bortoluzzi

Biosignal Based Emotion Analysis of Human-Agent Interactions
Evgenia Hristova, Maurice Grinberg, and Emilian Lalev

Emotional Aspects in User Experience with Interactive Digital Television: A Case Study on Dyslexia Rehabilitation
Filomena Papa and Bartolomeo Sapio

Investigation of Normalised Time of Increasing Vocal Fold Contact as a Discriminator of Emotional Voice Type
Peter J. Murphy and Anne-Maria Laukkanen

Evaluation of Speech Emotion Classification Based on GMM and Data Fusion
Martin Vondra and Robert Vích

Spectral Flatness Analysis for Emotional Speech Synthesis and Transformation
Jiří Přibil and Anna Přibilová

II Verbal and Nonverbal Features of Computational Phonetics

Voice Pleasantness of Female Voices and the Assessment of Physical Characteristics
Vivien Zuta

Technical and Phonetic Aspects of Speech Quality Assessment: The Case of Prosody Synthesis
Jana Tuckova, Jan Holub, and Tomas Dubeda

Syntactic Doubling: Some Data on Tuscan Italian
Anna Esposito

Perception of Czech in Noise: Stability of Vowels
Jitka Veronkova and Zdena Palkova

Challenges in Segmenting the Czech Lateral Liquid
Radek Skarnitzl

Implications of Acoustic Variation for the Segmentation of the Czech Trill /r/
Pavel Machac

Voicing in Labial Plosives in Czech
Annett B. Jorschick

Normalization of the Vocalic Space
Jan Volín

III Algorithmic and Theoretical Analysis of Multimodal Interfaces

Gaze Behaviors for Virtual Crowd Characters
Helena Grillon, Barbara Yersin, Jonathan Maïm, and Daniel Thalmann

Gestural Abstraction and Restatement: From Iconicity to Metaphor
Nicla Rossini

Preliminary Prosodic and Gestural Characteristics of Instructing Acts in Polish Task-Oriented Dialogues
Maciej Karpinski

Polish Children’s Gesticulation in Narrating (Re-telling) a Cartoon
Ewa Jarmołowicz-Nowikow

Prediction of Learning Abilities Based on a Cross-Modal Evaluation of Non-verbal Mental Attributes Using Video-Game-Like Interfaces
Yiannis Laouris, Elena Aristodemou, and Pantelis Makris

Automatic Sentence Modality Recognition in Children’s Speech, and Its Usage Potential in the Speech Therapy
David Sztaho, Katalin Nagy, and Klara Vicsi

Supporting Engagement and Floor Control in Hybrid Meetings
Rieks op den Akker, Dennis Hofs, Hendri Hondorp, Harm op den Akker, Job Zwiers, and Anton Nijholt

Behavioral Consistency Extraction for Face Verification
Hui Fang and Nicholas Costen

Protecting Face Biometric DCT Templates by Means of Pseudo-random Permutations
Marcos Faundez-Zanuy

Facial Expressions Recognition from Image Sequences
Zahid Riaz, Christoph Mayer, Michael Beetz, and Bernd Radig

Czech Artificial Computerized Talking Head George
Josef Chaloupka and Zdenek Chaloupka

An Investigation into Audiovisual Speech Correlation in Reverberant Noisy Environments
Simone Cifani, Andrew Abel, Amir Hussain, Stefano Squartini, and Francesco Piazza

Articulatory Speech Re-synthesis: Profiting from Natural Acoustic Speech Data
Dominik Bauer, Jim Kannampuzha, and Bernd J. Kroger

A Blind Source Separation Based Approach for Speech Enhancement in Noisy and Reverberant Environment
Alessio Pignotti, Daniele Marcozzi, Simone Cifani, Stefano Squartini, and Francesco Piazza

Quantitative Analysis of the Relative Local Speech Rate
Jan Janda

Czech Spontaneous Speech Collection and Annotation: The Database of Technical Lectures
Josef Rajnoha and Petr Pollak

BSSGUI – A Package for Interactive Control of Blind Source Separation Algorithms in MATLAB
Jakub Petkov and Zbynek Koldovsky

Accuracy Analysis of Generalized Pronunciation Variant Selection in ASR Systems
Vaclav Hanzl and Petr Pollak

Analysis of the Possibilities to Adapt the Foreign Language Speech Recognition Engines for the Lithuanian Spoken Commands Recognition
Rytis Maskeliunas, Algimantas Rudzionis, and Vytautas Rudzionis

MLLR Transforms Based Speaker Recognition in Broadcast Streams
Jan Silovsky, Petr Cerva, and Jindrich Zdansky

Author Index

A. Esposito and R. Vích (Eds.): Cross-Modal Analysis, LNAI 5641, pp. 1–4, 2009. © Springer-Verlag Berlin Heidelberg 2009

Cross-Fertilization between Studies on ICT Practices of Use and Cross-Modal Analysis of Verbal and Nonverbal Communication

Leopoldina Fortunati1, Anna Esposito2, and Jane Vincent3

1 Faculty of Education, Multimedia Science and Technology, University of Udine, Via Prasecco, 3 – 33170 Pordenone, Italy
[email protected]

2 Second University of Naples, Department of Psychology, and IIASS, Italy
[email protected]

3 Faculty of Arts and Human Sciences, University of Surrey, Stag Hill, Guildford, Surrey GU2 7XH
[email protected]

Abstract. The following are comments and considerations on how Information and Communication Technology (ICT) will exploit research on cross-modal analysis of verbal and nonverbal communication.

Keywords: Communication behaviors, multimodal signals, human-machine interaction.

1 Introduction

First, let us propose some reflections on how the cross-fertilization between the field of studies on ICT use practices and the cross-modal analysis of Verbal and Nonverbal Communication [1-4] could take place in a reciprocal and productive way. One notion that might be corrected in current studies is that of face-to-face communication, which should give way to body-to-body communication. In sociological studies the expression face-to-face, which is still used in many technical studies, has been criticized and superseded by the expression body-to-body [6]. In fact, as the whole tradition of studies on non-verbal language points out, individuals talk with their whole body, not only with their face, although the face is a very strategic site of the communication process. In the terminology, too, one should consider, or at least be aware of, all the variables that compose the communication process.

Another notion that might be more deeply problematised is that of information. There is an approach to information which can be summarized with Bateson’s words “Information is the perception of a difference”. Maybe the result of a perception can be measured and counted. But the possibility of measuring it might give the illusion that it is possible to construct a scientific analysis only by reducing the problem to the measurement of the perception result. In effect this would be a wrong approach, because the reality of information is much more complex since it concerns human beings. If one enlarges the notion of information through the lens of a sociological contribution and one claims that information is also a relational concept, the

Page 18: Lecture Notes inArtificial Intelligence 5641 · theoretical approach to the cross-modal analysis of verbal and nonverbal communica-tion changing the concept of face to face communication

2 L. Fortunati, A. Esposito, and J. Vincent

measurability of information becomes complicated, since in the field of research of cross-modal analysis of Verbal and Nonverbal Communication it might be necessary to activate also the notion of e-actors and their power[9, 7]. So, looking at the problem of information from a user’s point of view, it comes out that the interest of the receiver/e-actor is not simply to have the greatest possible amount of information. This means that it not true that the more information one has, the better it is, because considering only the amount of information has the consequence of conceptualising the role of the receiver only as being a mere decoder of information and to propose a reductionist vision of the communication process. On the contrary, the amount of information which one tends to obtain is related to several aspects of the communication process. Let us focus on two of these aspects: the power of the receiver and the quality of information. In regard to the first aspect, compare, for example, reading a book and seeing a movie on TV. In the first case the only channel involved, writing, gives to the reader information that is limited in detail, but this means that the reader can handle the information better. In this case the reader is able to exploit his/her imagination and co-operate with the writer to a great extent. In the second case, the co-operation by the audience is reduced, since the product that is consumed is formalized at a much higher level. What is at the stake in these two examples is the difference in the audience’s power over the consumption of the product, which in the latter case is in a certain sense reduced by the greater amount of information. But the problem of course is not only of the power of e-actors on the product, otherwise one should not be able to understand why audiences celebrated the advent of television. 
The exercise of this power by the e-actor may be addressed in some contexts and situations by characteristics of the product other than the amount of information contained in it, such as, for example, its pleasantness and relaxing qualities. These are the cases in which e-actors like to consume products which require less commitment and involvement from them. This has to do with the ambiguous essence of consumption, which should actually be understood as productive labour that e-actors often aspire to reduce to a minimum. This aspiration has often been misunderstood by scholars studying patterns and styles of TV consumption through the lens of “audiences’ passivity”.

Another aspect of the information process, through which we can study the issue of information, is its quality, that is, its efficacy. An indirect measurement of the efficacy of the communication process might be memory. In the middle and long terms one remembers what one sees much better than what one reads [5]. This means that more detailed information is more effective than less detailed information in the memorization process.

Third, continuing our attempt to cross-fertilize the field of cross-modal analysis of Verbal and Nonverbal Communication with the main tenets of the field of ICT users’ studies, we recall that the communication process should always be studied in its multiple social contexts, because it is also shaped by the social organization of relationships and so is intelligible only in concrete situations, local practices and contexts. It is the same concern that was expressed by de Saussure with regard to language. Meaning is not understandable if it is not situated in a broad context, which may be the sentence, as the minimal unit of a text. Take, for example, the research carried out on audio and visual cues in interpersonal communication: its results might be fully understandable only when they are situated in a social context. But this would imply, at the same time, that we reflect on the need to bring together different methodological approaches. To what extent does it make sense to continue to study these issues only in the laboratory? Does it, instead, make more sense to study them in the laboratory but also in social contexts, and so try to design a completely different kind of research?

Fourth, another important and recent tenet of psycho-sociological studies has been to see the fundamental role of emotion in the communication process and in the use of ICTs [8]. This approach is very important in overcoming the implicit premise that the communication process is rational, a premise that does not take into account the role played by emotions in it. This debate, that of electronic, mediated emotion, is still in its infancy, although it is very lively and ever-growing. It is now recognized quite broadly that emotion always accompanies the process of using technologies, their practices and the transformations of the social organizations in which the technologies are used. This should also inspire research projects in physical or engineering studies.

Fifth, another important issue concerns the signal. If it is acceptable that, for the operationalisation of a research study, a complex concept such as communication has to be reduced to a simpler notion such as a signal, then later on it is always necessary to come back to the complex notion of communication. This is the only way to avoid dangerous shortcomings in the design of the research. Following the same line of reasoning, when the signal is chosen for operational reasons it should not be seen as part of a communicative act, but rather as part of a communication process, constituted by immaterial labour and by a message which is its product and which has an economic and normative impact, affects social roles and power, the organization and structure of social relationships, and so on. In this case it would be wiser to start from the multi-dimensionality of the signal and then to declare that, given the difficulty in analysing this multi-dimensional nature, only one aspect is selected. In that way, the researcher would be more easily aware that it is possible to arrive only at a partial and limited conclusion, avoiding a metonymical conclusion (in which a part is understood and presented for the whole). Moreover, it would be better to consider all the variables and then to decide on a post-selection of the significant variables by means of a factor analysis. So, in this case the choice of the variables considered in the design of a research study might be justified, otherwise not.

Finally, to conclude this attempt to cross-fertilize the field of cross-modal analysis of Verbal and Nonverbal Communication with the main tenets of the field of ICT users’ studies, we would propose some points emerging from the first interdisciplinary dialogue. When research on emotion is designed, it is always worth remembering that a) emotions are a continuum; b) different cultural perceptions of emotion derive from the fact that emotions have different archetypical, symbolic and metaphorical histories; c) in our multicultural society more inter-cultural experiments on and studies of emotion are needed; d) one should problematise emotions more: fear is both positive and negative (depending on the context).

References

1. Esposito, A., Hussain, A., Marinaro, M., Martone, R. (eds.): Multimodal Signals: Cognitive and Algorithmic Issues. LNCS, vol. 5398. Springer, Heidelberg (2009), http://www.springer.com/computer/artificial/book/978-3-642-00524-4


2. Esposito, A., Bourbakis, N., Avouris, N., Hatzilygeroudis, I. (eds.): HH and HM Interaction. LNCS, vol. 5042. Springer, Heidelberg (2008)

3. Esposito, A., Faundez-Zanuy, M., Keller, E., Marinaro, M. (eds.): COST Action 2102. LNCS, vol. 4775. Springer, Heidelberg (2007)

4. Esposito, A.: The Amount of Information on Emotional States Conveyed by the Verbal and Nonverbal Channels: Some Perceptual Data. In: Stilianou, Y., et al. (eds.) COST 277. LNCS, vol. 4391, pp. 249–268. Springer, Heidelberg (2007)

5. Fortunati, L.: Gli italiani al telefono. Angeli, Milano (1995)

6. Fortunati, L.: Is Body-to-body communication still the prototype? The Information Society 21(1), 1–9 (2005)

7. Fortunati, L., Vincent, J., Gebhardt, J., Petrovčič, A. (eds.): Interaction in Broadband Society. Peter Lang, Berlin (2009)

8. Fortunati, L., Vincent, J.: Introduction. In: Vincent, J., Fortunati, L. (eds.) Electronic Emotion. The Mediation of Emotion via Information and Communication Technologies. Peter Lang, Oxford (2009)

9. Haddon, L., Mante, E., Sapio, B., Kommonen, K.-H., Fortunati, L., Kant, A. (eds.): Everyday Innovators. Researching the Role of Users in Shaping ICTs. Springer, Heidelberg (2005)


A. Esposito and R. Vích (Eds.): Cross-Modal Analysis, LNAI 5641, pp. 5–17, 2009. © Springer-Verlag Berlin Heidelberg 2009

Theories without Heart

Leopoldina Fortunati

Faculty of Education, Multimedia Science and Technology, University of Udine Via Prasecco, 3 – 33170 Pordenone, Italy


Abstract. In general, sociological theories are, or at least seem to be, without heart. However, many fundamental sociological notions such as solidarity, social cohesion and identity are highly emotional. In the field of information and communication technologies studies there is a specific theory, that of domestication (Silverstone and Hirsch [58], Silverstone and Haddon [59], Haddon and Silverstone [31]), inside which several research studies on the emotional relationship with ICTs have flourished (Fortunati [19], [20]; Vincent [63]). In this paper I will focus on this theory, which is one of the frameworks most commonly applied to understand the integration of ICTs into everyday life. I argue that emotion empowers sociological theories when its analysis is integrated into them. To conclude, I will discuss the seminal idea proposed by Star and Bowker [62] to consider emotion as an infrastructure of body-to-body communication.

Keywords: Domestication theory, emotion, information and communication technologies, infrastructure, body, body-to-body communication, mediated communication.

1 Introduction

In general, sociological theories do not worry about emotions; they are theories without heart. Nevertheless, if one examines seminal sociological notions such as solidarity, social cohesion and identity, what are they if not a “bundle of emotion, binding social actors to the central symbols of society”, as Illuz ([36]: 2) writes? So, sociological theories are only apparently without heart, as they are inhabited by emotions, although not in any explicit way. Theories on new media are no exception in this respect. However, there is a theory among those currently applied in studies on ICTs (Information and Communication Technologies), domestication theory (Silverstone and Hirsch [58], Silverstone and Haddon [59], Haddon and Silverstone [31]), that is in a certain sense an exception, since it includes several research studies that start from the observation of emotional reactions to the diffusion and integration of ICTs in the domestic setting. It is not by chance that this theory has allowed the development of several studies which have also taken into account the emotional aspects of the integration of ICTs in the domestic sphere (Fortunati [19], [20]; Haddon [33]). Apart from these studies related to domestication theory, another seminal approach to emotion is that advanced by Star and Bowker [62], who see emotion and the body as an infrastructure of body-to-body communication. Considering emotion as an infrastructure perhaps also allows us to reflect in a more sophisticated way on the role of emotion in mediated communication.

The aim of this article is to show how the explicit integration of emotion into sociological theorization cannot but lead to a more powerful conceptualisation. Domestication theory, which through cross-fertilization with domesticity theory has included emotion in its conceptual framework, is a good example of the capacity of theories to acquire symbolic and conceptual strength when they integrate emotion. In the next section I will analyze the development of sociological theorization between rationality and emotion. Then, I will examine domestication theory as the theory that has allowed the development (even if slowly) of emotion as one of the key elements in the understanding of the integration of information and communication technologies in everyday life. To conclude, I will discuss the seminal idea of considering emotion as an infrastructure of body-to-body communication (Star and Bowker [62]).

2 Sociological Theory between Rationality and Emotions

Sociological analysis is generally based on the development of a discourse which focuses on the rational layer of social behaviour. However, this rational aspect often remains in the background or presents itself as a taken-for-granted premise. Sociological theoretical activity in particular represents itself as an even more radically rational undertaking. However, Roger Silverstone, who was one of the main theorists of the “domestication” approach, reminds us ([61]: 229) that “all concepts are metaphors. They stand in place of the world. And in so doing they mask as well they reveal it”.

The approach proposed by Silverstone opens up a lot of issues: the first is that theoretical activity is much less rational than it represents itself to be. In fact, metaphors introduce tension between two aspects of meaning which might be contradictory or which stress a latent or an unexpected relation (Lakoff and Johnson [40]). Metaphors, as well as symbols (Jung [37]), are the rhetorical tools that convey in language the ambiguity that one psychologically needs to express. The need to express ambiguity derives from the fact that we are emotionally ambiguous, since, as Freud pointed out, we love and hate at the same time. Metaphors and symbols are two of the features that Umberto Eco characterises as forms of hyper-codification [11]. They are such since they require a supplementary effort from the speaker or writer to be formulated and from the receiver to be understood. The tension of metaphors and symbols corresponds to an emotional dimension which enters the rational sphere, breaking up the game (Marchese [46]). The game is always that when one speaks, what one says and what one does not say (or what one says in an allusive way) are both relevant.

Sociological theorization is in particular an activity where a new statement accompanies people into conceptual territories that need to be explored, where the old encounters the new, which is rarely clear, as in the case of domestication theory. But this element, far from being a limitation, represents instead an advantage, as Silverstone himself underlines, allowing multiple interpretations and empirical experimentation and research. The success of a sociological theory is given precisely by the precarious equilibrium between what is said and what is not said, between the power of logos and the perturbing presence and energy of emotion. But this equilibrium is very much strengthened by the simplicity of a theory. Only simple concepts survive over time and domestication is not an exception, argues Silverstone ([61]: 229).

The second aspect of the question is that, on the one hand, the major part of our conceptual system is partially structured in terms of other concepts, that is, it is structured in a metaphoric sense (Lakoff and Johnson [40]), and that, on the other hand, metaphors are mainly a conceptual construction. Even the very name of this theory, ‘domestication’, is a metaphor, which was inspired by the work of the anthropologist Kopytoff [39]. Kopytoff proposed that we should see cultural objects as biological beings and analyse their ‘life’ as such. He was in fact convinced that they cease to be mere commodities when they are culturally integrated into cultivated home environments. When the object is a technological artefact, this taming is an even more necessary requirement. In fact, technology has a specific strength, consisting in its capacity to produce movement. Movement is the watershed between beings and inanimate objects (Fortunati [13]). It is a capacity which qualifies technology as a particularly powerful object. It is exactly this power that needs to be domesticated to avoid a situation where the technologies revolt against those who use them. When, then, the technologies in question are ICTs, the power of the movement that technology brings with it is particularly strong. In fact, ICTs deal with the intellectual and communicative processes of individuals. It is not by chance that Maldonado [44] calls them “intellective machines” and Rheingold [55] “tools for thought”. It is the particular power of ICTs that has in a certain sense pushed scholars to compare them with ‘wild animals’ which needed to be tamed. The process of their integration into the household context was termed domestication. In Kopytoff’s analysis of the taming of cultural objects seen as “wild animals” there is an implicit emotional narrativity. What is the emotion one feels towards wild animals if not fear, terror?
The process of domestication has enabled human beings to overcome this fear by building a process of familiarity and intimacy with wild animals, by taming them. This feeling of fear (and terror) is one of the classical emotions that humankind has felt when confronted by technology. The other one is wonder, or what Mosco [49] calls the “digital sublime”. The sense of wonder as an emotional reaction to technology has been well documented since ancient times and has found a new formulation in the Weberian notion of enchantment with technology.

This opposition between the two main emotions tracing the mood towards technology inspired the debate on technology and society until the Second World War and continues to present itself within many of the studies on society and technology in the guise of technological determinism. Silverstone [61], in his last attempt to take stock of domestication theory, analyses the technological determinist approach, arguing that its internal logic is based on the most important emotional pole, wonder and fear, that has traditionally surrounded the attitude of society towards technology. More recently, Sally Wyatt [66] has pointed out that technological determinism continues to live and be revived in those mainstream studies and theories that aim to reject it in principle. It is a matter of fact that technological determinism is strong since technologies, as I mentioned above, are perceived as the powerful extension of the human body and are humanized. Machines are human projections and become human creatures (Katz [38]).


The relevant evidence for my discourse is that in the first period when we encounter a technology the mainstream narrativity on domestication describes a path in which the technology is integrated into everyday life, becomes invisible and a kind of normativity elbows in with the consequence that emotion disappears from the majority of research frameworks referring to domestication theory. Second, one cannot understand the persistence of technological determinism if one does not analyze technologies in their complex reality, which is constituted not only by rationality and science, but also by emotions (Haddon et al. [32], Vincent and Fortunati [64]), symbols, metaphors, myths (Mumford [51], [52]; Mosco [49]; Fortunati [13]; Fortunati, Katz, Riccini, [22]) and narrativity (Carey [4]; Silverstone [60]). If one shares Susan Langer’s idea that technique is a way to handle problems, one should conclude that technologies have an emotional technique.

However, in the quantitative research projects I have carried out with other colleagues both in Italy and in Europe, it emerges that fear and wonder have disappeared from the emotional repertoires which people activate in their relationship with technologies. This relationship has changed, moving from an episodic and extraordinary event to widespread ownership of these devices and practices of use based on the daily routine of an increasing number of people. One can distinguish two emotional dimensions: one is satisfaction, made up of emotions such as interest, joy, relaxation, amusement, satisfaction, curiosity, enthusiasm and surprise, and the other is dissatisfaction, made up of emotions such as indifference, irritation, boredom, anger, frustration, anxiety/stress, unpleasantness and embarrassment (Fortunati [17]). So, the emotional repertoires can by now be conceptualized in terms of an assessment of consumption. Clearly the domestication of ICTs has been the process which has also accompanied the disenchantment with the technologies of information and communication, which has occurred because of standardized operativity in the domestic setting.

In effect, Kopytoff’s approach and the metaphor of domestication proposed by Silverstone have captured the meaning of the process of “technological naturalization” that has taken place with ICTs. As Contarello et al. [8] have shown in a research study on social representations of the human body and ICTs carried out at international level (including countries such as Italy, the Netherlands, Spain, Romania and Russia), the fact that the technologies of information and communication get closer to and penetrate the human body is interpreted by respondents as a process of naturalization of technologies and thus “a naturalization of what is artificial, rather than the other way around”. So, far from a rhetoric that makes ICTs appear as a ‘deus-ex-machina’, this research stresses instead the necessity to see them as imitating the human body, which is the model that has inspired all machines and technologies. The results of this research are not surprising, since the most usual examples of ontological metaphors are those in which physical objects are considered as if they were persons (Lakoff and Johnson [40]: 53-54). The personification of objects allows humans to understand a wide range of experiences with non-human entities in terms of human motivations, characteristics and activities. So, this behaviour describes the fact that many people are able to develop a meaning of technological artefacts only by conceptualising them in human terms. In the particular case of domestication theory, the shift is even more complex, because it happens through an intermediate stage which passes by means of a metaphor with wild species.


3 Domestication Theory and the Coming to Light of Studies on Emotion and ICT

Domestication theory, which is one of the theories widely drawn upon to understand and describe the use and the integration of ICTs in everyday life (Silverstone and Hirsch [58], Silverstone and Haddon [59]), is also the one which has allowed and encouraged in a second stage the coming to light of a series of studies on the emotional relationship with information and communication technologies. For this reason I will focus here on domestication theory, trying to expose its main tenets and to show through which paths and theoretical cross-fertilization this theory has created a fertile terrain to develop a discourse and an empirical tradition of research on emotion and ICT.

Domestication, argues Roger Silverstone ([61]: 232), has to be seen as a consumption process in which consumption is framed as “linked to invention and design, and to the public framing of technologies as symbolic objects of value and desire”. The inspirers were Jean Baudrillard [1], Michel de Certeau [10] and Daniel Miller [48], who at the end of the 1980s were all observing that the classical boundary between production and consumption was blurring. The world was changing under the mobilization and struggles not only of the working class but also of other new social actors, such as women and youth. The hierarchical structures and the strategies of separation among the old economic and social structures were losing their strength. Their studies showed that in the consumption process commodities are subject to an intense activity of attribution and transformation of particular meanings, as well as of symbolic and affective production. The domestic sphere emerged from their studies not as the place where the command scheme embodied in the commodity was merely executed, but as the place where the subjectivity and agency of consumers/users reinterpreted the commodity, often in an unexpected way. Of course the space of rebellion, or at least of non-cooperation with the logics of the capitalistic process in the consumption sphere, remained of limited influence until the moment in which capital discovered that not only workers but also consumers could be an independent variable in the process. The recognition by these authors of the consumption sphere as a productive sphere in an economic sense, and also as a sphere where struggles, conflicts and autonomous determination by a multitude of actors take place, has been of great importance.

The merit of domestication theory is that it has linked this discourse about consumption to several other strands, such as innovation theory (Rogers [56]), showing in a certain sense that the diffusion process is not as linear as Rogers imagined it. Another strand which domestication theory included under its umbrella is the question of the design of technology, which should consequently be seen as an on-going activity which does not stop at the factory walls but continues in the sphere of everyday life, and which also involves consumers/users (Schuler and Namioka [57]). These two issues, innovation and design, have been developed considerably over the past years, reflecting a large amount of empirical research and debate. Users/consumers have been conceptualised as e-actors, including in this notion the complexity of the rich debate on the role and agency of buyers/users/consumers/stakeholders (Fortunati et al. [24], Gebhardt et al. [27]). Further strands that in principle can be linked to this theory include Goffman’s tradition of the public frame [29] and the social-psychological tradition of social thinking, which has been explored in the framework of research on social representations (Moscovici [50]).

This latter framework of research on social representations has constituted the theoretical background of a series of studies on the new media that have been carried out in the last fifteen years (Fortunati and Manganelli [23], Contarello and Fortunati [7]). This group of studies is important since, as I will discuss below, they add an important dimension regarding the integration of technologies of communication and information into the household context: the dimension of the integration of new technologies at the socio-cognitive level, in the system of social thought as developed in the public sphere. In these studies the process of domestication is investigated by exploring how these technologies are socially elaborated and shared with the purpose of constructing a common understanding of them. To better understand the role of new media, Joachim Höflich [35] has recently proposed combining the powerful notion of ‘frame’ with that other powerful notion of ‘social representation’. A final connection can be made again with the analysis of Baudrillard regarding the system of objects [2] and their symbolic and cultural value.

In practice, domestication theory describes the adoption and use of ICTs in four dimensions: the first is appropriation, the second is objectivation, the third is integration and the fourth is conversion. Appropriation is the process that involves human agency and which describes the interaction between the human and the technological in a constant dialectic of change. Objectivation is constituted by tactics of placing the new technology inside the domestic sphere, by reorganizing the house space and restructuring the micro-politics of gender and generation relationships and the command over the domestic space. Integration is the process of injecting the practices of using the new media into the rhythms and pauses of domestic life, inside and outside the formal boundaries of the household. In practice, time management has been found to be one of the three main reasons for using ICTs in Europe (Fortunati and Manganelli [21]). Conversion, which implies recognition, is made up of “the perpetuation of the helix of the design-domestication interface” (Silverstone [61]: 234) and involves display and the development of skills, competences and literacies. This last element of domestication has also resulted in many studies and much research (Chiaro and Fortunati [5]; Williams, Stewart, Slack [65]). The late Silverstone stressed that consumption includes five dimensions: commodification, appropriation, objectivation, integration and conversion. These different dimensions are often confused and they are somehow inspired by the two stages of social integration at the cognitive level depicted by Moscovici [50] in his theory of social representation. These two stages are anchorage and objectivisation: the former enables people to incorporate something with which they are not familiar into their network of categories, integrating it cognitively into the pre-existing system of thought.
Objectivisation (whose name recalls the second element of domestication) helps to make something abstract concrete, to give ideas a material consistency and to embody conceptual schemes. These respective stages have the purpose of describing the conceptual and social integration of technologies in everyday life. While domestication is sensitive to the material and immaterial part of the integration process, understood as practical behaviour, but also as attitudes, opinions and values/symbols, the social representation approach is developed mainly on the cognitive side.

In both theories, domestication and social representations, however, although emotion plays an important role, this role was not explicitly discussed, at least in the first stage of research in this field. And yet almost all human cognition depends on and uses concrete structures such as the sensory-motor system and emotion (Lakoff and Johnson [40]). This initial neglect of the role of emotion in the domestication process is even more surprising since the domestication metaphor develops, as I showed above, from the conceptualisation of the new technologies as being like wild animals, and so implicitly includes a noticeable emotional tension. This indifference towards emotion, however, was already overcome in the early 1990s, when domestication theory was cross-fertilized by domesticity theory (Fortunati [12]). It was through the contribution of domesticity theory that domestication theory found a way to develop further its approach to consumption, including the immaterial part of it, which is constituted by emotion, affection, communication, information and so on. In effect, the approach to consumption formulated by domestication theory was in part new and in part old. It was new since this theory understood the consumption sphere as a production sphere; it was old because in the first period of research it did not try to understand what this could imply. The questions which were avoided in this phase were: production of what? By whom? For which purposes? With what effects? Only when the analysis of the social functioning of the reproduction of the labour force was merged with domestication theory (Fortunati [19], [20]) did it become possible to answer all these questions in an appropriate way. Fortunati’s analysis showed that the consumption process should be seen as part of the process of reproducing the labour force, which takes place in the domestic sphere and which has a specific worker: women.
The cross-fertilization between domestication and domesticity theory allowed us to understand that production inside consumption should be seen as the production of value, of surplus value, and that this process is the main productive process of the whole economic system (Fortunati [15], [16]; Hochschild [34]). In this framework, emotion represents a substantive part of the immaterial labour carried out in the reproductive process, and technology represents a tool to make women intensify their labour. Studying the short-circuit of the emotional relationship with ICTs and the social role of electronic emotion allowed the blossoming of a series of studies on this topic (Vincent [63]; Lasen [41]). The specific role of emotion has been examined in several later studies, at both the quantitative and the qualitative level, which now constitute an important strand of domestication studies (Vincent and Fortunati [64]).

4 Emotions as Infrastructure of Body-to-Body Communication

On top of this cross-fertilization between domestication and domesticity theories, another original approach to the understanding of emotion in the communicative process is that proposed by Star and Bowker [62]. These two scholars define emotion and the human body as components of the infrastructure in face-to-face, or even better body-to-body, communication (Fortunati [14]). Their approach allows us to better capture the role of emotion in accompanying the process of communication in co-presence, but also in accompanying the practices of using information and communication technologies as well as the transformation of the social organizations in which these technologies are used.

What transformation of emotion do we experience as an infrastructure of body-to-body communication when the communication process is mediated? To understand and depict this transformation I refer to a classical concept introduced by Marx [47] in the Economic and Philosophical Manuscripts of 1844, namely ‘alienation’. Many intellectuals and scholars of the 20th century have reflected on the issue of alienation and offered valid contributions. Among them I cite here, for example, Gorz [30], who stated that individuals’ loss of control over themselves has meant an ever-growing loss of control over their needs, desires, thoughts and the image that they have of themselves. I argue that in the 20th century alienation has also involved the communication sphere, where information and communication technologies have separated the body from the communication of emotion, words and non-verbal signals. The mediation of an artificial transformer (such as the mobile phone and the internet) has in fact strengthened the communication process, but at the same time it has structurally provoked a separation between the body and the personal and social capacity of individuals to communicate verbally, in writing or through visual representations.

This aspect of alienation is inevitable, since the development of the capitalist system has implied the rupture of the unity between mind and body and consequently a development of the mind separate from that of the body. This separation means that the mind has more chances than the body to be the protagonist in the communication process. As Manovich [45] argues, the main point of tele-presence is not individuals’ presence through ICTs, but their absence through ICTs (anti-presence). In fact it is no longer necessary to be physically present in a certain place to affect the surrounding reality. As a consequence, the distance between the subject and the object or another subject becomes the crucial point, since it is distance that shapes perception. The tele-absence of the body confines it not only to a kind of secondary role in mediated communication but also to the condition of being a minority. Its affordances, needs and desires are mainly ignored. With the advent of the tele-absence of the body, the physical and emotional infrastructures of the communicative process become separated and meet different destinies. While emotions, because of their specific essence as an inner energy which simultaneously implicates cognition, affect, evaluation and motivation (Illouz [36]), adapt themselves to mediated communication in various ways, the physical infrastructure of the body shows more inertia towards change in the communication sphere. In fact, the separation of emotions from the body leads them to a better destiny, since this separation does not automatically imply that individuals are destined to live emotion in a worse way. On the contrary, the destiny of the body in mediated communication is to be ignored in its potency and peculiarities and to be treated as in absentia. The body in computer-mediated communication is expected to be still, seated on a chair.
Consider all the health problems that affect the body, and especially the pain in parts of it such as the neck, the arm, the wrist and so on. The role of the body is less sacrificed, of course, in mobile communication, in which it can move. However, in both cases, the script of the body is in fact reduced to micro-gestures, often to wrong postures, and to the use of only two senses, mainly sight and hearing, and also the voice. This limited use of the body cannot help but seriously distort the communication of emotions, which, although they are linked to social and cultural contexts and shared norms, remain “body matters” (Frijda [26]). Here a question is inevitable: to what extent are emotions sacrificed in their separation from the body? When the body is separated from communication, in a certain sense a sort of alienation of emotion is produced, since emotions simultaneously implicate cognition, affect, evaluation, motivation and the body. As a consequence, conceding a growing share of our own communicative activity to mediated communication might involve risks in terms of psychophysical well-being (Contarello [6]). The problem is that on many occasions one cannot choose the best channel but must rather use the channel that one has at one’s disposal to communicate, and that ICTs are often precious tools for overcoming spatial, temporal and economic limitations, and also for exploring other dimensions of oneself. The definition of electronic emotion advanced by Fortunati and Vincent [25] might help to figure out what these dimensions are. For them an electronic emotion is: “an emotion felt, narrated or showed, which is produced or consumed, for example in a telephone or mobile phone conversation, in a film or a TV program or in a website, in other words mediated by a computational electronic device. Electronic emotions are emotions lived, re-lived or discovered through machines. Through ICTs, emotions are on one hand amplified, shaped, stereotyped, re-invented and on the other sacrificed, because they must submit themselves to the technological limits and languages of a machine.
Mediated emotions are emotions which are expressed at a distance from the interlocutor or the broadcaster, and which consequently take place during the break up of the unitary process which usually provides the formation of attitudes and which consists of cognition, emotion and behaviour”.

A specific study on the social representation of the human body and ICTs has investigated the relationship between the human body and new technologies (namely the mobile phone and the computer/internet) with the purpose of understanding how this relationship is socially conceptualized in this period, which has been named that of ‘mass prosthetization’. The results indicate that social thought still sees a clear opposition between the human body and the new media (Contarello and Fortunati [7]), which seem destined for divergent development. For respondents, however, the body remains central, although in absentia. Its centrality emerges in an oblique but constant way in many studies carried out on the use of telecommunications. But here I argue that the importance of the body also resonates in the importance that respondents attribute to the dimension of convenience. Elsewhere I argued that convenience might be considered a major need and at the same time a true principle governing the use of ICTs. In the majority of quantitative and qualitative studies carried out in the last two decades on information and communication technologies, convenience was a recurrent motivation for using these tools (Fortunati [19], [20]; Chiaro and Fortunati [5]). But what lies behind convenience if not the reasons of the body, that is, the concern to spare it useless effort and fatigue? A concern regarding bodily efficiency and health? The notion of convenience in the use of ICTs represents the application of Occam’s Law. The convenience law, which is the systematic tendency to save energy and avoid causing fatigue to the body through the use of these technologies, is connected with the reasons of the body, which become a priority in the decision-making process regarding whether to use a device or to resort to body-to-body communication, which device to use, and how much to use it. This law is particularly evident in the research carried out in the period of the diffusion and appropriation of these devices, which for the mobile phone and the internet was the second part of the 1990s. This law sets in motion, however, a chain of contradictions, in the sense that the body, which in principle inspires actions and strategies to save it from fatigue and effort, often ends up, in reality, being sacrificed.

5 Conclusion

By presenting and discussing the example of domestication theory I have tried to show that: 1) domestication theory is powerful because it is highly metaphorical and therefore emotional; 2) the cross-fertilization of this theory with domesticity theory has allowed the development of a variety of research studies, both qualitative and quantitative, on the emotional integration of ICTs in everyday life. Furthermore, by presenting and discussing the notion of emotion as an infrastructure of body-to-body communication, I have tried to show to what extent the perspective proposed by Star and Bowker [62] is seminal in detecting what happens to emotion in mediated communication.

Body-to-body and mediated communication represent broad fields which increasingly require multidisciplinary approaches and challenge traditional methods of research. The problem is that it is not easy to merge different traditions of investigation, theories and methodologies. Recently there have been some attempts to merge different sociological approaches within the same strand, namely the Sociology of Technology and Science, Communication studies, Mass Media studies and ICT Users’ studies (Boczkowski and Lievrouw [3]; Lievrouw and Livingstone [42]; Fortunati [18]). This paper represents another attempt to cross-fertilize the tradition of ICT user studies with the field of cross-modal analysis of Verbal and Nonverbal Communication. Only the future will show whether these operations will open a fruitful dialogue among different disciplines. However, it is clearly necessary to merge different traditions in order to mutually correct inconsistencies and errors.

This analysis has shown the need to strengthen the investigation of electronic emotion at both a theoretical and an empirical level, and to develop further the cross-fertilization between these two fields of research, which until yesterday were very disconnected. Theories with heart are needed in order to understand properly processes as complex as body-to-body and mediated communication.

References

1. Baudrillard, J.: Selected Writings: Jean Baudrillard. Poster, M. (ed.). Polity Press, Cambridge (1988)

2. Baudrillard, J.: The system of objects. Verso, London (2005)

3. Boczkowski, P., Lievrouw, L.A.: Bridging STS and Communication Studies: Research on Media and Information Technologies. In: Hackett, E.J., Amsterdamska, O., Lynch, M., Wajcman, J. (eds.) New Handbook of Science and Technologies Studies. MIT Press, Cambridge (2008)