
16th IEEE International Conference on Robot & Human Interactive Communication, August 26-29, 2007, Jeju, Korea

Real-Time Face Tracking and Gesture Recognizing Embedded Quadruped Robot with a Tele-operation Server

H.C. Shin¹, Y.K. Kim², D.H. Hwang³

¹,³Intelligent Robot Division, Electronics and Telecommunications Research Institute, Daejeon, Korea, [email protected], [email protected]

²Dept. of Computer Software & Engineering, Korea University of Science and Technology, Daejeon, Korea, [email protected]

Abstract - In this research, a real-time face tracking and gesture recognizing embedded quadruped robot with a tele-operation server is presented. Using an i.MX21 embedded system, this robot transfers MPEG-4 video to the tele-operation server, which processes the image data for gesture recognition, face detection, and human following. The developed quadruped robot has four 3-DOF legs and pan-tilt actuators. This system can recognize the user's face and hand motion and serve various music contents to the robot user.

    I. INTRODUCTION

The service robot market shows rapid growth in vacuum cleaning, home security, and content services for education and entertainment. To be useful, a home service robot must have high reliability, low price, low power consumption, simple structure, and high performance. An embedded robot system can be proposed for these requirements. But because an embedded system is generally planned for a specified function and a small system, an embedded robot system may have insufficient computing power. For this problem, the URC (Ubiquitous Robotic Companion) was proposed [1, 2]. In this system, complex computational jobs such as face detection, face recognition, voice recognition, SLAM (Simultaneous Localization and Mapping), and TTS (Text to Speech) are processed on a remote server, and the results are transferred to the mobile robot as shown in Fig. 1. Because the mobile robot is only a tele-operated mobile terminal, we can make a highly reliable, low-price, low-power-consumption, simply structured mobile robot. On the other hand, there are various locomotion methods for mobile robots. A quadruped robot has advantages such as easy rough-road movement and visual and emotional impression. In this study we developed a real-time face tracking and gesture recognizing embedded quadruped robot with a URC tele-operation server.

    II. SYSTEM CONFIGURATION

A. Robot Configuration

The developed robot is a quadruped and has 12 actuators for its 4 legs, 2 actuators for pan-tilt, a CMOS camera for face tracking and gesture recognition, and a microphone and speaker, as shown in Fig. 2.

The main processor is the i.MX21 from Motorola and the operating system is Linux 2.4.20. The vision camera is a low-cost CMOS OV9650 module. The i.MX21 processor encodes raw

Fig. 1 URC (Ubiquitous Robotic Companion): robots connected to a tele-operation server through wireless communication and the Internet

vision data to MPEG-4 at 30 fps QVGA (320x240). It can play stored AC97-format sound files or a 16-bit 16 kHz audio stream transferred from the server. The robot microphone transfers a 16-bit 16 kHz audio stream to the server. An ATMEGA128 processor controls the actuators through RS485.
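For context, the raw stream sizes behind these figures can be checked with a little arithmetic. This is an editorial sketch: the 12 bits/pixel YUV420 assumption for the raw camera output is ours, not stated in the paper.

```python
# Back-of-envelope bandwidth for the robot's streams, using the specs
# quoted above: QVGA (320x240) video at 30 fps and 16-bit 16 kHz mono audio.
# The YUV420 (12 bits/pixel) raw-frame assumption is illustrative.

def raw_video_bps(width=320, height=240, fps=30, bits_per_pixel=12):
    """Raw YUV420 bandwidth before MPEG-4 encoding, in bits/s."""
    return width * height * bits_per_pixel * fps

def raw_audio_bps(sample_rate=16000, bits_per_sample=16, channels=1):
    """Uncompressed PCM audio bandwidth, in bits/s."""
    return sample_rate * bits_per_sample * channels

# Raw video is ~27.6 Mbps, which is why MPEG-4 encoding on the i.MX21 is
# needed to fit the stream onto the wireless LAN; raw audio is 256 kbps.
```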

    B. Server Structure

The data transferred from the robot through wireless LAN (IEEE 802.11g) flows into the data handler and is classified into vision and audio data. The data handler hands the received video over to the MPEG-4 decoder and the still image sampler, and the MPEG-4 decoder generates motion vectors. The sampled still image is transferred to the face detector, and the face detection result is reported to the robot main controller. The server main controller analyzes the dominant motion vector and commands locomotion and pan-tilt control to follow the robot user. For a service robot, human finding and tracking is an essential function, and visual tracking can provide a more natural and useful service [3]. If the server decides that the robot has approached the human closely enough regarding face size and hand gesture, the server commands the robot to present various audio contents. If there is no human, the navigation controller offers locomotion data for free navigation in a given space. The audio stream generator generates an audio stream from TTS, music files, and the server microphone. The generated audio stream is transferred to the robot, and the robot plays the downloaded audio stream.

Fig. 1 contents - Mobile Robot Terminal: minimum embedded processor, minimum sensors, minimum actuators, minimum components for human-robot interaction. Tele-operation Server: navigation, face detection & recognition, voice recognition, text to speech, multimedia content handling, etc.

978-1-4244-1635-6/07/$25.00 ©2007 IEEE.

WP-30

956 Authorized licensed use limited to: Khajeh Nasir Toosi University of Technology. Downloaded on December 21, 2009 at 05:23 from IEEE Xplore. Restrictions apply.

Fig. 2 Developed quadruped robot (CMOS camera, speaker, main board, microphone, leg actuators)

Fig. 5 Tele-operation server structure: on the server (Windows XP), a data handler receives the compressed vision and audio streams over wireless LAN (IEEE 802.11g) and feeds the vision data decoder, still image sampler, and face detector; the navigation controller and robot main controller issue robot locomotion, pan & tilt actuation, and audio commands from motion vector analysis; the audio stream generator (text to speech, music, microphone) returns an audio stream to the robot

The face detector detects the human face location in the robot vision data with the AdaBoost algorithm from Intel's open computer vision library. As shown in Table 1, the face detector runs at 7.0-13.1 fps depending on the background complexity on an Intel Xeon 3 GHz processor. The face detector can detect a human face within approximately 3.5 m from the robot, as shown in Fig. 6. Even though the face detection success rate is not 1.0, the robot can approach the human once a face is detected.

Table 1 Face detection performance
  Background   Complex        Simple
  Frame rate   7.0±0.2 fps    13.1±0.3 fps
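The detection step above can be sketched with the OpenCV cascade API. This is a minimal sketch, not the authors' code: the modern cv2 interface and the box_to_face_params helper name are our assumptions; the cascade XML file is an asset bundled with OpenCV.

```python
# Sketch of the server-side face detection step. The paper uses the
# AdaBoost cascade detector from Intel's OpenCV library; the cv2 calls
# below are the modern equivalent API, and box_to_face_params (which
# produces the (X_face, Y_face) / (F_w, F_h) values reported to the robot
# controller, cf. Fig. 12) is a hypothetical helper name.

def box_to_face_params(x, y, w, h):
    """Convert a detector box to face center and size."""
    return (x + w // 2, y + h // 2), (w, h)

def detect_faces(frame_bgr):
    """Run the OpenCV Haar/AdaBoost cascade on a sampled still image."""
    import cv2  # deferred so the pure helper above works without OpenCV
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return [box_to_face_params(*box)
            for box in cascade.detectMultiScale(gray, 1.1, 5)]
```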

Fig. 3 Embedded main board

Fig. 6 Face detection performance: detection success rate versus human-robot distance (m)

Fig. 4 Robot configuration

C. Communication between robot and server

The robot transmits QVGA MPEG-4 video streaming up to 30 fps and a 16-bit 16 kHz audio stream to the server. We can


recommend UDP/IP as the adequate protocol for massive video and audio streams. Because of short data delay and high stability, UDP/IP shows better closed-loop control performance than TCP/IP in a poor wireless LAN environment [4]. In this research, the robot and server are connected using the UDP/IP protocol and IEEE 802.11g wireless LAN. The end-to-end delay of the vision stream through the CMOS camera, MPEG-4 encoding, wireless LAN, MPEG-4 decoding, and monitor display was 119±33 ms. The received video stream from the robot was 27.0±0.4 fps at a 302±17.3 kbps bitrate.
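The transport choice above can be sketched with plain datagram sockets: the sender never blocks on retransmission, and a sequence-number prefix lets the receiver drop late datagrams instead of waiting, which is the behavior that favors UDP over TCP on a poor wireless link. The port number and 4-byte packet header are illustrative assumptions; the paper does not specify a packet layout.

```python
# Sketch of the UDP video transport between robot and server.
# Each MPEG-4 chunk is prefixed with a big-endian sequence number so the
# receiver can discard out-of-order datagrams rather than stall the loop.
import socket
import struct

VIDEO_PORT = 5004  # illustrative; the paper does not specify ports

def send_frame(sock, addr, seq, payload):
    """Send one encoded chunk as a single datagram (no retransmission)."""
    sock.sendto(struct.pack("!I", seq) + payload, addr)

def recv_frame(sock):
    """Receive one datagram and split off its sequence number."""
    data, _ = sock.recvfrom(65535)
    (seq,) = struct.unpack("!I", data[:4])
    return seq, data[4:]
```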

III. QUADRUPED GAIT CONTROL

The developed quadruped robot has forward, backward, and left/right-turn static gaits. There are various studies on quadruped robot gaits [5, 6], and leg and body movement sequences make up the robot locomotion, as shown in Fig. 7 [7].

Fig. 7 Possible gait combinations (legs 1-4; sequences ±A, ±B, ±C)

In this study, ±A are applied to the forward and backward gaits and ±C are applied to the right and left turn gaits. Fig. 8 shows the forward gait sequence as an example. In step 1, the robot steps leg 1 forward and draws leg 4. In step 2, it moves its mass center forward. In step 3, it draws and steps down leg 3 and steps leg 2 forward. In step 4, it moves its mass center forward. In step 5, it draws and steps down leg 4 and steps leg 1 forward. These sequences are repeated for forward locomotion.

Fig. 8 Forward gait sequence
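The five-phase forward gait described above can be written as a plain state table. This is a sketch under our own naming, not the authors' firmware; a real controller would map each named action to leg-joint angle trajectories.

```python
# The forward gait of Fig. 8 as a repeating action table (editorial sketch).
# Each entry lists the leg/body actions of one step of the cycle.

FORWARD_GAIT = [
    ("step forward leg 1", "draw leg 4"),                   # step 1
    ("shift mass center forward",),                         # step 2
    ("draw and step down leg 3", "step forward leg 2"),     # step 3
    ("shift mass center forward",),                         # step 4
    ("draw and step down leg 4", "step forward leg 1"),     # step 5
]

def gait_actions(step_index):
    """Return the actions for the given step; the cycle repeats."""
    return FORWARD_GAIT[step_index % len(FORWARD_GAIT)]
```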

  • Fig. 11 Tele-operation control

The face detection result on the MPEG-4 video stream from the robot's CMOS camera is shown in Fig. 12. The face detector returns the face location (X_face, Y_face) and face size (F_w, F_h) to the robot controller. With the face location and pan-tilt angle, we can calculate the human face direction with respect to the robot camera and robot body, as shown in Fig. 13. Because the field of view of the CMOS camera is approximately 50 by 40 degrees, the face direction angles θ_face_x, θ_face_y, and θ_face can be determined as shown in (1).

Fig. 12 Definition of face position (X_face, Y_face) in the 320x240 image

    θ_face_x = 25 (X_face / 160 − 1)
    θ_face_y = 20 (Y_face / 120 − 1)                    (1)
    θ_face   = θ_face_x + θ_pan

Fig. 13 Parameter definition

The pan-tilt control command is determined as shown in (2).

    θ_pan^(i+1)  = θ_pan^i               if |θ_face_x| < θ_th
    θ_pan^(i+1)  = θ_pan^i + θ_face_x    if |θ_face_x| > θ_th
                                                        (2)
    θ_tilt^(i+1) = θ_tilt^i              if |θ_face_y| < θ_th
    θ_tilt^(i+1) = θ_tilt^i + θ_face_y   if |θ_face_y| > θ_th

The robot locomotion command is determined as shown in (3).

    ω_robot = turn right    if θ_face > θ_face_th
    ω_robot = turn left     if θ_face < −θ_face_th
                                                        (3)
    V_robot = forward       if F_h < F_h_th
    V_robot = backward      if F_h > F_h_th

V. GESTURE RECOGNITION

We can obtain motion vectors from the MPEG-4 decoding procedure, and they provide useful information about human gestures [9, 10]. For motion compensation, the MPEG-4 decoder combines the so-called error picture with the reference frame, using the movement information carried by the decoded motion vector. This motion vector is generated by the IDCT block as shown in Fig. 14. In this study, we used the Microsoft FDAM decoder.

Fig. 14 Motion vector generation

Decoded QVGA MPEG-4 produces 20x15 motion vectors, as shown in Figs. 15-18.
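The face-direction and control laws of (1)-(3) can be sketched directly in code. The threshold values below are illustrative placeholders; the paper does not list θ_th, θ_face_th, or F_h_th.

```python
# Equations (1)-(3) as a sketch. Symbol names are transliterated from the
# paper; all threshold defaults are illustrative assumptions.

def face_direction(x_face, y_face, theta_pan):
    """Eq. (1): face pixel position -> angles, for a 50x40 deg FOV
    QVGA (320x240) camera whose image center is (160, 120)."""
    th_x = 25.0 * (x_face / 160.0 - 1.0)
    th_y = 20.0 * (y_face / 120.0 - 1.0)
    return th_x, th_y, th_x + theta_pan

def pan_tilt_update(theta_pan, theta_tilt, th_x, th_y, th_threshold=2.0):
    """Eq. (2): move pan/tilt only when the face leaves a dead band."""
    if abs(th_x) > th_threshold:
        theta_pan += th_x
    if abs(th_y) > th_threshold:
        theta_tilt += th_y
    return theta_pan, theta_tilt

def locomotion_command(theta_face, face_h, theta_th=10.0, face_h_th=60):
    """Eq. (3): turn toward the face; approach until it looks big enough."""
    if theta_face > theta_th:
        omega = "turn right"
    elif theta_face < -theta_th:
        omega = "turn left"
    else:
        omega = "none"
    v = "forward" if face_h < face_h_th else "backward"
    return omega, v
```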


Fig. 16 Horizontal gesture of user

Fig. 18 Horizontally dominant motion vector trajectory (vertical vs. horizontal grid position)

As shown in Figs. 15 and 16, the existing motion vectors are grouped and the dominant movement is calculated. The center of the dominant motion vector group is traced, as shown in Figs. 17 and 18. These trajectories are classified as vertical or horizontal using the trajectory size and slope information. We assigned the vertical trajectory as the robot calling command and the horizontal trajectory as the sound play command.

If face detection is true and the dominant motion vector trajectory is vertical, the robot follows the user. If the robot has approached the human closely enough, the robot and server regard a horizontal dominant motion vector as a music play command, as shown in Fig. 19.

Fig. 17 Vertically dominant motion vector trajectory (vertical vs. horizontal grid position)

    Fig. 19 Gesture recognition and human following procedure
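The grouping-and-classification steps above can be sketched as two small functions: first take the centroid of the grid cells that carry nonzero motion vectors, then label the traced centroid trajectory from its extent. Function names and the minimum-extent threshold are our illustrative assumptions.

```python
# Sketch of the dominant-motion-vector gesture classifier described above,
# operating on the 20x15 motion vector grid of a decoded QVGA MPEG-4 frame.

def dominant_center(vectors):
    """Centroid of the (x, y) grid cells whose motion vector is nonzero.
    `vectors` is a list of (x, y, mv_x, mv_y) tuples; returns None if the
    frame has no motion."""
    pts = [(x, y) for x, y, mvx, mvy in vectors if mvx or mvy]
    if not pts:
        return None
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))

def classify_trajectory(centers, min_extent=3.0):
    """Label a traced centroid trajectory 'vertical' (robot calling
    command) or 'horizontal' (sound play command), else None."""
    xs = [c[0] for c in centers]
    ys = [c[1] for c in centers]
    dx, dy = max(xs) - min(xs), max(ys) - min(ys)
    if max(dx, dy) < min_extent:
        return None
    return "vertical" if dy > dx else "horizontal"
```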


  • VI. CONCLUSION

In this research, a real-time face tracking and gesture recognizing embedded quadruped robot and a tele-operation server are presented. Using an i.MX21 embedded system, this robot transfers MPEG-4 video to the tele-operation server, which processes the image data for gesture recognition, face detection, and human following. We realized face detection using the OpenCV library and the gesture recognition function using MPEG-4 decoded motion vector analysis. The developed robot and server recognize the user's vertical hand motion as a calling gesture during face detection. The developed quadruped robot follows the user using four 3-DOF legs and 2-DOF pan-tilt actuators. If the distance between robot and user is close enough, the robot and server detect the user's horizontal hand motion as a music play command using dominant motion vector analysis.

VII. REFERENCES

[1] Ha, Y., Sohn, J., Cho, Y., and Yoon, H., "Towards Ubiquitous Robotic Companion: Design and Implementation of Ubiquitous Robotic Service Framework," ETRI Journal, vol. 27, no. 6, Dec. 2005, pp. 666-676.

[2] Cho, Y.J. and Oh, S.R., "Fusion of IT and RT: URC (Ubiquitous Robotic Companion) program," Journal of the Robotics Society of Japan, vol. 23, no. 5, 2005, pp. 22-25.

[3] Matsumoto, Y. and Zelinsky, A., "Real-time face tracking system for human-robot interaction," Proc. IEEE Int. Conf. on Systems, Man, and Cybernetics (SMC '99), vol. 2, 1999, pp. 830-835.

[4] Ploplys, N.J. and Alleyne, A.G., "UDP network communications for distributed wireless control," American Control Conference 2003, vol. 4, pp. 3335-3340.

[5] Duffert, U. and Hoffmann, J., "Reliable and Precise Gait Modeling for a Quadruped Robot," Lecture Notes in Computer Science, vol. 4020, 2006, pp. 49-58.

[6] Fujita, M. and Kitano, H., "Development of an autonomous quadruped robot for robot entertainment," Autonomous Robots, vol. 5, no. 1, 1998, pp. 7-18.

[7] McGhee, R.B. and Iswandhi, G.I., "Adaptive Locomotion of a Multilegged Robot over Rough Terrain," IEEE Trans. on Systems, Man and Cybernetics, vol. 9, no. 4, 1979, pp. 176-182.

[8] Koo, T.W. and Yoon, Y.S., "Dynamic instant gait stability measure for quadruped walking robot," Robotica, vol. 17, no. 1, 1999, pp. 59-70.

[9] Lee, J.S., Rhee, K.Y., and Kim, S.D., "Moving target tracking algorithm based on the confidence measure of motion vectors," Proc. Int. Conf. on Image Processing, vol. 1, 2001, pp. 369-372.

[10] Velten, J. and Kummert, A., "Motion Vector Estimation for Tracking of Hands in Video Based Industrial Safety Applications," Vision, Modeling and Visualization, 2000, pp. 65-70.
