

A Framework for Continuous Group Activity Recognition Using Mobile Devices: Concept and Experimentation

Amin Bakhshandeh Abkenar, Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Australia, [email protected]

Advisors: Seng W. Loke and Wenny Rahayu, Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Australia, {s.loke,[email protected]}@latrobe.edu.au

Arkady Zaslavsky, CSIRO, [email protected]

Abstract—Group Activity Recognition (GAR) is a challenging research area in context-aware computing which has attracted much attention recently. Many studies have been conducted in the field of activity recognition (AR) along with their applications in domains such as health, smart homes, daily living and life logging. However, many open issues still exist. The lack of an energy-efficient approach is one of the most vital issues in the context of AR. GAR work often suffers from energy consumption issues because, beyond the AR process itself, it requires more interaction among group members and more complex recognition processes. Moreover, almost all work in GAR is technology-oriented and assumes that our real-life environment remains fixed once the system has been established, but this may not be the case. Hence, we propose a framework called GroupSense for GAR towards addressing these issues. Also, a relatively simple scheme for GAR, with a protocol for the exchange of information required for GAR, has been implemented, tested and evaluated. We then conclude with lessons learnt for GAR.

Keywords— group activity recognition; energy efficient group activity recognition; context-aware mobile computing; mobile sensing

I. INTRODUCTION

Activity Recognition (AR), which can be exploited significantly in developing context-aware systems, has attracted more and more attention recently. So far, a number of techniques and models have been used to recognize activities in pervasive systems. Activity Recognition has been used in several domains such as health, home/work automation, and activities of daily living (ADL) such as walking, running and jogging, as well as providing services based on the activity currently being undertaken. Group/Collaborative Activity Recognition (GAR) is also an open issue in the context-aware research area [1]. Being able to recognize group activities such as walking together, being in a meeting, participating in a party, performing group soccer exercises, bush walking, running in a marathon, and working together in a warehouse can be useful in different applications. Time and energy optimization of workers' jobs in a warehouse, helping trainers and trainees in group games, sharing group activity/context status with those interested in being

1 https://www.dropbox.com/sh/6egaesforegdc52/_nuXQkGokg

joined by others in social activities, and notification about a lost person during a bush walk are among the practical uses of GAR we aim to address in the broader goals of this paper. Another open issue in any context-aware mobile application is energy consumption, especially for continuously sensing applications in which various sensors such as GPS, accelerometer, microphone and gyroscope are used to recognize context and situations; these applications should be able to run continuously and long enough.

The aim of this research is to propose a novel multi-layer framework for continuous group activity recognition in which the components (e.g., sensors, classifiers, message senders and receivers) can work in an energy-efficient way. Moreover, there is a trade-off between accuracy or responsiveness (e.g., quick change detection) and energy consumption for this sort of system. Hence, our model would adaptively adjust these parameters based on current conditions.

Our GAR framework consists of three main layers, and the novelty of our work is that, by separating GAR functionality into layers, any suitable energy-saving method for a given layer can be applied within that layer, so that multiple energy-saving methods can be applied across the layers to provide cumulative energy savings from all layers. There are other potential applications from further analysis of stored data, so that knowledge can be extracted from the recognized group activities' data, such as who is interested in going jogging with others, and when and where; which dancing club is the most popular; or even, at a larger scale, the level of social interaction in a suburb, town, state or country. Our conceptual design has been presented in a way that is as generic as possible, and it also provides flexibility by detecting new devices joining a group or devices leaving the group.

We can summarize the contributions of this research as follows:

• proposing a conceptualization and a model for group activity recognition;

• designing an extensible three-layer framework called GroupSense for continuous group activity recognition in which energy-saving methods can be applied in each layer; and



• prototyping and experimentation that shows the practicality of our conceptualization and highlights issues.

II. RELATED WORK

Group activity recognition in context-aware computing is the process of detecting or inferring members who are performing the same activity, or who are collaborating and interacting within a group to achieve a particular, more complex goal. A number of studies have been conducted in this area, which can be divided into two main categories: 1) vision-based and 2) sensor-based. Inherently, working with sensor data, especially from mobile devices, is very challenging. Accelerometer values are extremely sensitive to the sensor's position and movement. Hence, for physical activity detection, wearable accelerometers [3] are used to achieve higher accuracy. According to our survey, the majority of work in GAR falls into the first category. Social interaction detection [4, 5], sport team activity detection [6] and collective activity recognition [7] are examples of the use of image processing techniques in GAR with video cameras as input, where infrastructure must be established and video streams alone do not provide sufficiently informative data for more complex group activity detection.

The sensor-based approach has been used in many human activity recognition works across a wide range of domains. Taking advantage of wearable sensors (with smartphones being the most available and non-obtrusive devices) in GAR seems to be a necessity. Another major advantage of mobile sensing results from the fact that the sensors are part of all contemporary mobile phones and provide richer information about a user, their current activity and their context. Gordon et al. [2] defined Multi-User Activity Recognition (MAR) and differentiated it from GAR: whereas GAR aims to recognize an activity of the group as an entity, MAR aims to obtain multiple users' activities in parallel. They also proposed a distributed architecture using coffee cups, each of which is equipped with sensors such as an accelerometer, to detect whether individuals are drinking together or not. Group activity can also be recognized by detecting signal patterns and grouping individuals based on them [3]. Each group activity has properties that make the recognition process challenging, for example, the number of participants and their roles. Hirano and Maekawa [8] employed both supervised and unsupervised techniques to cluster participants into groups based on an activity model and to classify participants into their most likely roles. They also used Bluetooth signal strength to measure users' closeness, which is not very accurate and is sensitive to noise and obstacles in the environment. Gordon et al. [9] conducted a survey of energy-efficiency methods for AR and summarized them into three categories: 1) dynamic sample-rate adaptation of sensors, 2) turning sensors on and off intelligently, and 3) designing the components of the system hierarchically so that more energy-consuming components wake up only if required. Up to the time of writing, only the work in [2] has considered energy efficiency in GAR, where the effect of abstraction level on the accuracy of activity detection was investigated. Gordon et al. [2] showed that by exchanging more abstract information (less communication volume) over the wireless network, energy consumption is reduced by 10%, while the reduction in GAR accuracy is negligible. Also, unlike other work, we focus on continuous recognition of group activities, not just once-off recognition.

III. MODELLING AND CONCEPTUAL DESIGN

This section briefly explains our formal model and the conceptual design of our proposed framework.

A. Formal Modelling and Definition.

We provide a conceptualization of GAR as follows. Sensor (s), Context (c), Atomic Activity (AA) and Complex Activity (CA) have been defined as essential parts of every activity recognition system. The key definitions of our framework are as follows:

• Sensor Weights Table (SWT). The SWT is a table/list which contains all activities together with the sensors that might be used for their recognition; each sensor, for each activity, is associated with a weight that denotes how essential that particular sensor is for the recognition of that activity. Each activity can be recognized by a set of sensors. For example, the atomic activity "Driving" can be recognized using only the accelerometer, or a combination of accelerometer and GPS.

• Individuals I, or Observers/Clients, in GroupSense are the smartphones of the persons whose behaviour our system intends to monitor; each receives sensor data using its own sensors or from the environment.

• Monitor m is a member of the set of individuals (m ∈ I) who is responsible for receiving all other group members' messages and continuously performing the inference over the received data. In our P2P model, one device is selected as the monitor. In larger-scale scenarios, we might also need to share different groups' information with each other; in this case, each group's monitor communicates with the monitors of all other groups, facilitated by a central cloud. This approach can be used in crowd behaviour detection, which is a novel field of research.

• SRI (Same Range Individuals). SRI ⊆ I is a set of individuals who are physically located within the same range.

• Group Activity. Given a group G, a group activity GA of G, which is an activity performed by a set of individuals, is defined by an expression of the form:

GA = ⟨ E, [t_1, t_2], [n_1, n_2], CI, GAT ⟩

where E combines individual–activity pairs, either time-independently,

E = (i_1, a_1) ⊙ (i_2, a_2) ⊙ … ⊙ (i_k, a_k),

or with temporal ordering,

E = (i_1, a_1) • (i_2, a_2) • … • (i_k, a_k),

and where:

o (i_j, a_j) denotes that individual i_j performs activity a_j, for 1 ≤ j ≤ k.

o a_j is of the form AA or CA, where a_j ∈ A and A = AA ∪ CA, which contains all atomic and complex activities that might be used in the group activity recognition process.

o In [t_1, t_2], t_1 and t_2 denote the minimum and maximum duration over which GA might be undertaken, respectively.

o In [n_1, n_2], n_1 and n_2 denote the minimum and maximum number of individuals that can participate in GA.

o ⊙ denotes a logical relationship between pairs of individuals and activities in a group, ⊙ ∈ {∧, ∨, ⊕}:
  - ∧ (conjunction): (i_x, a_x) ∧ (i_y, a_y) means both take place. [AND operator]
  - ∨ (disjunction): (i_x, a_x) ∨ (i_y, a_y) means at least one of the two must happen. [OR operator]
  - ⊕ (exclusive disjunction): (i_x, a_x) ⊕ (i_y, a_y) means if one occurs, the other must not occur. [XOR operator]
  The • operation denotes temporal relationships between individual activities, whereas the ⊙ operation indicates only the occurrence of an activity by an individual and is time-independent.

o CI is a matrix presenting all context information for all individuals.

o GAT (Group Activity Type). Each group activity is categorized into one of three types:
  1. Multiple-Individual-One-Activity (MIOA)
  2. Multiple-Individual-Multiple-Activities (MIMA)
  3. Multiple-Individual-Collaborative-Activities (MICA)
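To make the notation concrete, a simple illustrative instance (our own example with hypothetical bounds, not one of the paper's evaluated scenarios) of an MIOA group activity such as "jogging together" could be written as:

GA_jogging = ⟨ (i_1, Jogging) ∧ (i_2, Jogging) ∧ … ∧ (i_k, Jogging), [20 min, 120 min], [2, 10], CI, MIOA ⟩

Every participating individual must be performing the same atomic activity "Jogging", the activity is expected to last between 20 minutes and two hours, between 2 and 10 individuals may take part, and CI would carry context such as participants' locations. The monitor's rule repository (Section III.B) would hold one such expression for each recognizable group activity.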

B. Conceptual Design and Generic Architecture for Energy-Efficient GAR

As can be seen in Figure 1, we have proposed a three-layer conceptual model for group activity recognition, as follows:

• Sensing Layer (Observer Side).

• Individual's Activity Recognition Layer (Observer Side): Essentially, our conceptualization in the previous section assumes that every group activity consists of a combination of the atomic or complex activities undertaken by the individuals of the group. Therefore, this layer recognizes every individual activity using data from the Sensing Layer. Any sort of classifier or activity ontology for individual activity recognition can be used in this layer.

• Group Activity Recognition Layer (Monitor Side): Initially, the Group Detector detects groups based on either the level of physical closeness or the similarity of individuals (not implemented), and then the Message Receiver passes all messages from observers to the Group Activity Recognizer, which infers group activities using rules in a rule repository. The rule repository contains all predefined expressions for each group activity.

• Energy Controller. According to our three-layer design, it is possible to apply energy-saving methods for group activity recognition in all the different layers. Dynamic/adaptive sensor sampling, sensor selection, prediction, sensing offloading and hierarchical component-based design are examples of techniques that have been used for energy saving in the context of AR, and almost all existing work uses only one such technique. With our approach, any energy-saving method can be applied within its relevant layer.
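As a minimal sketch of how these layers and the Energy Controller could be wired together, the Java interfaces below are our own assumptions; the paper does not prescribe an API, so the names and methods are illustrative only.

```java
// Hypothetical sketch of the three-layer GroupSense design; names are illustrative.
import java.util.Map;

/** Sensing Layer (observer side): produces raw sensor readings. */
interface SensingLayer {
    float[] readAccelerometerWindow();      // one window of raw accelerometer samples
    void setSamplingRateHz(int hz);         // hook used by the EnergyController
}

/** Individual Activity Recognition Layer (observer side). */
interface IndividualActivityRecognizer {
    String classify(float[] sensorWindow);  // e.g. "Walking", "Running", "Cycling"
}

/** Group Activity Recognition Layer (monitor side). */
interface GroupActivityRecognizer {
    // deviceId -> latest recognized individual activity
    String inferGroupActivity(Map<String, String> memberActivities);
}

/** Energy Controller: applies a saving method inside whichever layer it targets. */
class EnergyController {
    private final SensingLayer sensing;

    EnergyController(SensingLayer sensing) { this.sensing = sensing; }

    /** Example: dynamic sample-rate adaptation, one of the techniques listed above. */
    void adaptSamplingRate(boolean activityChangedRecently) {
        // Sample faster while activities are changing, slower when they are stable.
        sensing.setSamplingRateHz(activityChangedRecently ? 50 : 5);
    }
}
```

In such a split, a saving method like adaptSamplingRate lives entirely inside the Sensing Layer's hook, so other methods (e.g., sensor selection in the Sensing Layer, or message batching in the GAR layer) could be added independently, which is the cumulative-saving property claimed above.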

IV. IMPLEMENTATION AND EXPERIMENTS

As far as possible, we build on existing tools. On the mobile client side, any classic Bluetooth (BT) Android device with a built-in accelerometer can be used in GroupSense. The devices we used in our experiments were a Galaxy S2, a Google Nexus S, a Google Nexus 4 and Nexus 7 tablets. Ideally, walking, running, cycling and driving should be recognized by our classifier. To develop our classifier, we used the Weka (Waikato Environment for Knowledge Analysis) data mining package; in particular, we used Weka's J48, a Java implementation of the C4.5 algorithm. We collected training data from the Android device and then, offline, fed the data into J48 to build our classifiers; once built, the classifiers are loaded onto the Android device. Our current implementation is a first attempt at proving our concept for GAR.
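As a rough sketch of this offline training step (under our own assumptions: the ARFF file name, attribute layout and model file name below are illustrative, not the paper's actual dataset), Weka's standard J48 API can be used as follows:

```java
// Illustrative offline training step; "activity_train.arff" is a hypothetical file name.
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.SerializationHelper;
import weka.core.converters.ConverterUtils.DataSource;

public class TrainActivityClassifier {
    public static void main(String[] args) throws Exception {
        // Load accelerometer feature windows labelled walking/running/cycling/driving.
        Instances data = DataSource.read("activity_train.arff");
        data.setClassIndex(data.numAttributes() - 1);   // last attribute is the activity label

        J48 tree = new J48();                           // Weka's C4.5 implementation
        tree.buildClassifier(data);

        // Serialize the trained model so it can be shipped with the Android app.
        SerializationHelper.write("activity_j48.model", tree);
    }
}
```

On the device, the serialized model would be deserialized (e.g., via SerializationHelper.read) and queried once per sensing window.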

Fig. 1. Three-layer architecture



This protocol can be taken as a foundation for more complete scenarios with more parameters, such as more sensors or classifiers for even more complex group activities. According to our proposed three-layer architecture, any group activity detection application can use the following generic protocol as the main phases of GAR:

1. Device discovery and connection establishment.
2. AtomicActivityClassifierThread.
3. SendMessageThread.
4. ReceiveMessageThread: the monitor continuously listens for observers' messages.
5. GroupActivityDetectorThread.
6. NearbyDetectorThread.
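A rough observer-side sketch of phases 2 and 3 is given below, reusing the hypothetical SensingLayer and IndividualActivityRecognizer interfaces from the sketch in Section III.B; the Bluetooth connection set-up (phase 1) and the monitor-side threads (phases 4-6) are omitted, and the reporting interval is an assumption of ours.

```java
// Illustrative observer-side skeleton; the Bluetooth plumbing is omitted.
import java.io.OutputStream;

class AtomicActivityClassifierThread extends Thread {
    private final SensingLayer sensing;
    private final IndividualActivityRecognizer recognizer;
    private volatile String latestActivity = "Unknown";

    AtomicActivityClassifierThread(SensingLayer s, IndividualActivityRecognizer r) {
        this.sensing = s;
        this.recognizer = r;
    }

    @Override public void run() {
        while (!isInterrupted()) {
            // Phase 2: classify one window of accelerometer data.
            latestActivity = recognizer.classify(sensing.readAccelerometerWindow());
        }
    }

    String getLatestActivity() { return latestActivity; }
}

class SendMessageThread extends Thread {
    private final AtomicActivityClassifierThread classifier;
    private final OutputStream toMonitor;   // e.g. a BluetoothSocket output stream

    SendMessageThread(AtomicActivityClassifierThread c, OutputStream out) {
        this.classifier = c;
        this.toMonitor = out;
    }

    @Override public void run() {
        try {
            while (!isInterrupted()) {
                // Phase 3: periodically report the recognized atomic activity to the monitor.
                toMonitor.write((classifier.getLatestActivity() + "\n").getBytes());
                Thread.sleep(2000);   // reporting interval is an assumption
            }
        } catch (Exception e) {
            // Connection lost; the monitor's NearbyDetectorThread would notice the absence.
        }
    }
}
```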

The key aim of our experiments based on the proposed model is to investigate, in terms of power consumption, whether it is feasible to keep the GroupSense prototype running for a long enough duration.

Overall, the experiments indicate that continuous small-group activity recognition with a simple protocol for up to one or two hours using classic, unoptimized Bluetooth (v2.0) is feasible in terms of both energy efficiency and data transfer. We also note that energy drain does not increase exponentially with the number of devices or with duration. However, beyond several hours and with more than four devices, further optimizations will be needed, and classic Bluetooth will not be adequate. In terms of energy efficiency, our layered approach itself does not add significant unnecessary power consumption.

V. EVALUATION AND CONCLUSION

This paper has presented a conceptualization of group activity recognition, defining group activities compositionally in terms of the activities of individuals within the group. We also proposed a generic three-layer framework (named GroupSense) for continuous group activity recognition with smartphones. An experimentally validated simplified model and protocol for a P2P network were developed to illustrate and analyze the feasibility of our conceptualization in terms of energy consumption. Almost all work in GAR is technology-oriented and assumes that our real-life environment remains static once the system has been built. Our model has the capability of extending a group and adding individuals to it simply by exploiting device discovery protocols. While running continuous device discovery and frequently communicating activity states to the monitor might sound energy-draining, our experiments indicate that this is feasible and practical for up to one or two hours of continuous operation with up to three or four devices, even with no energy-saving method applied in our current prototype and with classic Bluetooth (instead of BLE) used for the P2P connections.

According to our results, in a P2P network with four devices, GroupSense can stay alive for approximately 4 hours. For example, if four people go bush walking for 1 hour, monitoring them consumes only 10% of the battery. However, in the current prototype we used only the accelerometer, and more complete scenarios require more sensors, such as GPS and microphone. Hence, optimizing power consumption seems essential for more complex continuous group activity recognition; our results can therefore be treated as a worst-case scenario, given that we did not apply complex optimizations.

We aim to further enhance and extend our work as follows:

• applying energy-saving methods, such as optimizing the number of messages exchanged between observers and the monitor, and comparing against the current implementation;

• trying more complex GAR using multiple sensor types;

• running the app with more than four users and evaluating its scalability; and

• using other technologies such as Wi-Fi Direct, Bluetooth Low Energy or a centralized cloud-based approach.

REFERENCES

[1] O. Incel, M. Kose, and C. Ersoy, "A Review and Taxonomy of Activity Recognition on Mobile Phones," BioNanoScience, vol. 3, pp. 145-171, 2013.

[2] D. Gordon, J.-H. Hanne, M. Berchtold, A. Shirehjini, and M. Beigl, "Towards Collaborative Group Activity Recognition Using Mobile Devices," Mobile Networks and Applications, vol. 18, pp. 326-340, 2013.

[3] D. Roggen, M. Wirz, G. Tröster, and D. Helbing, "Recognition of Crowd Behavior from Mobile Sensors with Pattern Analysis and Graph Clustering Methods," Networks and Heterogeneous Media, vol. 6, 2011.

[4] V. Ramanathan, B. Yao, and L. Fei-Fei, "Social Role Discovery in Human Events," in Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, 2013, pp. 2475-2482.

[5] K. N. Tran, A. Bedagkar-Gala, I. A. Kakadiaris, and S. K. Shah, "Social cues in group formation and local interactions for collective activity analysis," in Proceedings of International Conference on Computer Vision Theory and Applications, 2013.

[6] C. Direkoğlu and N. O'Connor, "Team Activity Recognition in Sports," in Computer Vision – ECCV 2012, vol. 7578, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, Eds. Springer Berlin Heidelberg, 2012, pp. 69-83.

[7] W. Choi and S. Savarese, "A Unified Framework for Multi-target Tracking and Collective Activity Recognition," in Computer Vision – ECCV 2012. vol. 7575, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, Eds., ed: Springer Berlin Heidelberg, 2012, pp. 215-230.

[8] T. Hirano and T. Maekawa, "A hybrid unsupervised/supervised model for group activity recognition," presented at the Proceedings of the 17th annual international symposium on International symposium on wearable computers, Zurich, Switzerland, 2013.

[9] D. Gordon, J. Czerny, and M. Beigl, "Activity recognition for creatures of habit," Personal and Ubiquitous Computing, pp. 1-17, 2013.
