LogMe - Where have I been, what have I done?
João Augusto Curto Leiria Dias
Thesis to obtain the Master of Science Degree in
Information Systems and Computer Engineering
Supervisor: Professor Daniel Jorge Viegas Gonçalves
Examination Committee
Chairperson: Professor José Carlos Alves Pereira Monteiro
Supervisor: Professor Daniel Jorge Viegas Gonçalves
Member of the Committee: Professor João Coelho Garcia
November 2014
Acknowledgments
I take this opportunity to express my profound gratitude and deep regards to my guide Professor
Daniel Jorge Viegas Gonçalves for his exemplary guidance, monitoring and constant encouragement
throughout the course of this thesis.
I also want to thank all of my closest friends. I’m not nominating anyone particularly because I don’t
want to risk forgetting any of them. I’m sure they know who they are.
I especially thank my mom, dad and sister. My hard-working parents have sacrificed everything for
me and my sister and provided unconditional love and care. I would not have made it this far without
them. I must also mention the rest of my extended family; I could not wish for a better one.
The best outcome of these past years is finding my best friend, soul mate and fiancée Joana
Barracosa. She always had faith in me, even when I didn’t. I truly thank her for sticking by my side
unconditionally. Without her none of this would be possible.
Resumo
Lifelogging é o processo de rastreio de dados pessoais gerados pelo utilizador sobre a sua vida, através
do uso de algumas ferramentas. Lifelogging e ferramentas de autorrastreio são um conceito em expansão,
que se tornou cada vez mais popular nos últimos anos, em parte porque a tecnologia hoje em
dia permite fazê-lo, mas também porque as pessoas querem saber mais sobre elas próprias, para assim
poderem melhorar as suas vidas. Hoje temos uma grande variedade de dispositivos que registam
e analisam os dados recolhidos, como o FitBit, o Narrative Clip, etc. No início da era das "atividades" de
lifelogging, as pessoas enfrentavam algumas dificuldades, devido ao facto de os computadores serem
a única ferramenta disponível, o que era uma enorme limitação, já que tinham que esperar até estar
perto de um computador para registar os dados. Embora este já não seja um problema, existem outros
problemas hoje em dia, como qual é a melhor ferramenta para usar, falta de tempo, falta de motivação
e, principalmente, diferentes fontes de dados, o que torna muito difícil reunir todas as informações que
registamos. Para tal, criámos um sistema capaz de permitir tanto a recolha manual, como automática,
facilitando assim o processo de recolha e integração e executando-o também de forma eficiente.
Palavras-chave: Registo diário, Autorrastreio, Informática Pessoal, Auto-Quantificação
Abstract
Lifelogging is the process of tracking personal data about oneself and one's own life through the
use of dedicated tools. Lifelogging and self-tracking are an expanding concept that has become
increasingly popular in recent years, in part because technology now makes it possible, but also
because people want to learn more about themselves so that they can improve their lives. Today we have
a wide variety of devices that register and analyze the collected data, such as the FitBit, the Narrative Clip,
and others. In the early days of lifelogging activities, people faced some difficulties because
computers were the only tool available for the task, which was a huge limitation: they had to wait until
they were near a computer to register their data. Although this is no longer a problem, other problems remain
nowadays, such as choosing the best tool to use, lack of time, lack of motivation and, especially, scattered data
sources, which makes it very hard to gather all the information we are logging. Therefore, we have created
a system that supports both manual and automatic collection, thus facilitating the collection and integration
process and also performing it efficiently.
Keywords: Lifelogging, Self-tracking, Personal informatics, Quantified self
Contents
Acknowledgments
Resumo
Abstract
List of Tables
List of Figures
Nomenclature
Glossary
1 Introduction
1.1 Motivation
1.2 Research Goals
1.3 Achievements
1.4 Document Structure
2 Related Work
2.1 Reflective Learning
2.2 Data Collection
2.3 Data Analysis
2.4 Social Networks
2.5 Discussion
3 Solution
3.1 LogMe User
3.1.1 Data Collection
3.1.2 Reminders' System
3.1.3 Data Exporting
3.2 BackEnd
3.2.1 System Architecture
3.2.2 Manual Data Collection
3.2.3 Automatic Data Collection
3.2.4 Data Quality
3.2.5 Reminders
4 Evaluation and Results
4.1 Prototype
4.2 Evaluation
4.2.1 Initial Form
4.2.2 Tests' Guide
4.2.3 Final Questionnaire
4.3 Experimental Results
4.3.1 Analysis of Initial Form Results
4.3.2 Analysis of Tests' Results
4.3.3 Analysis of Final Questionnaire's Results
4.4 Experimental Results' Discussion
5 Conclusions
5.1 Future Work
6 Appendix A
Bibliography
List of Tables
4.1 Users' ages summarized
4.2 Results of the first set of exercises, in terms of efficiency
4.3 Results of the second set of exercises, in terms of efficiency
4.4 Results of the second set of exercises, in terms of efficiency
4.5 Individual and Overall SUS Scores
List of Figures
1.1 Fitbit
1.2 MyFitnessPal
1.3 Endomondo
2.1 The Stage-Based Model of Personal Informatics Systems
2.2 Fitbit+ notification
2.3 FitBit device
2.4 Narrative Clip device
2.5 ActivMON on a user's wrist
2.6 A reminder notification used in the study
2.7 Logging frequency comparison
2.8 The mobile widget for the feed of observations
2.9 QS Spiral Visualization Examples
2.10 Spark Visualization Example
2.11 Your.FlowingData Visualizations
2.12 QuitNow! Application
2.13 Examples of Sparktweets usage
3.1 System Blocks Example
3.2 LogMe Homepage
3.3 Manual data collection: creating a new variable
3.4 Manual data collection: suggestions' list based on the possible values
3.5 Automatic data collection: new variable creation from FitBit
3.6 Dealing with conflicts: change the value inserted
3.7 Dealing with conflicts: adding the value inserted
3.8 Automatic data collection from a Google Spreadsheet: dealing with conflicts
3.9 List of reminders that shows the missing values
3.10 Data Exporting: presenting the values collected for the Lunch Location variable
3.11 Database Model of our application
3.12 Application layers
3.13 Setting up a new variable from a Google spreadsheet
3.14 Data import format required
4.1 Tracking activities' results
4.2 Reasons that refrain users from keeping track of their activities
4.3 Answers to question 1
4.4 Answers to question 2
4.5 Answers to question 3
4.6 Answers to question 4
4.7 Answers to question 5
4.8 Answers to question 6
4.9 Answers to question 7
4.10 Answers to question 8
4.11 Answers to question 9
4.12 Answers to question 10
4.13 Answers to question 1
4.14 Answers to question 2
4.15 Answers to question 3
4.16 Answers to question 4
4.17 Answers to question 5
Chapter 1
Introduction
Personal Informatics refers to a set of tools that allow users to collect relevant personal information
for self-monitoring and self-reflection: for example, weight, distance walked, geographic location (tracks),
computer activities, etc. Self-reflection is also referred to as Reflective Learning, which means returning
to and evaluating past experiences in order to promote continuous learning and improve future experiences.
These tools help people gain better knowledge of their behaviors and habits. The fact
that a person self-tracks a particular activity does not mean that some related aspect will automatically
improve. However, it will provide previously unknown insights, which allow that person
to take the steps needed to make the changes required for a possible improvement. That is, thus, one
of the main advantages of self-tracking. This type of information collection is associated with names like
lifelogging, living by numbers, personal analysis, quantified self and self-tracking.
One of the first lifelogging projects was MyLifeBits [9]. It was inspired by Vannevar Bush's hypothetical
Memex computer system [6], described as "a theoretical proto-hypertext computer system in which an
individual compresses and stores all of their books, records, and communications, which then become
mechanized so that they may be consulted with exceeding speed and flexibility." Gordon Bell was the
experiment subject in MyLifeBits. He digitized all the documents he had read or produced, CDs, emails,
and so on. He also gathered web pages browsed, and phone and instant-messaging conversations.
Until recently, these kinds of tools were only available on traditional computers, which brought some
problems: for example, records could only be made while sitting at a computer. Sooner or later, this kind of
limitation would lead to a serious lack of motivation to keep registering one's activities.
More recently, the number of mobile device users (smartphones and tablets) has grown exponentially,
and with it the number of tools for these kinds of practices. Another important
promoter in this area is the Quantified Self group, an international collaboration of users and makers
of self-tracking tools. They hold annual conferences to discuss and present new tools and systems for
lifelogging.
1.1 Motivation
Although we can collect information about any type of activity, the vast majority of these systems
collect data about physical activity, food ingested, weight, personal finances, geographic location,
among other things. For each of these categories, there is a large number of alternatives for
recording personal data, for example:
• FitBit1 tracker measures steps taken, distance walked, calories burned, floors climbed, and activity
duration and intensity. It also measures sleep quality: how long it takes to fall asleep, how often
people wake up over the course of the night, and how long they are actually asleep (Fig. 1.1);
• MyFitnessPal2 is a free smartphone app and website that tracks diet and exercise to determine
optimal nutrient and caloric intake for the user's goals (Fig. 1.2);
• Endomondo3 is also a mobile phone app and website where you can track your running, cycling
and other sports activity (Fig. 1.3).
Beyond those referred to above, a large number of other tracking tools are available.
Figure 1.1: Fitbit
Figure 1.2: MyFitnessPal
Figure 1.3: Endomondo
It turns out that each of these applications typically covers only a particular activity, i.e., an application
that tracks geographic location does not track personal finances, so several applications are
needed by those who want to track various activities. Also, the majority of these systems
have their own predefined data-analysis methods, which may be somewhat limited, making it very hard to
correlate all the data we are enthusiastically recording and preventing users from making their own analyses.
Besides, there is also the need for manual insertion, because there are things users want to
register that are not available in those applications.
1 http://www.fitbit.com
2 http://www.myfitnesspal.com
3 http://www.endomondo.com
1.2 Research Goals
The goal of this thesis is to mitigate this gap by creating an application that can collect
data from multiple self-tracking sources and gather that data in one place, thus facilitating the collection and
integration of all the data recorded. This will allow the user to make a more reliable and
complete analysis and draw more solid conclusions.
Another important aspect is how to keep the user's motivation high: the process of using
this tool must be simple and fast, so that the user wants to keep using it. And, certainly, the quality and
usefulness of the conclusions drawn from the data registered in it have a major impact.
Also important, and something existing tools sometimes fail at, is the ability to export all data
recorded to date, as many users like to do their own analyses and correlate data as they wish,
whenever they want. Allowing this extraction lets users freely use their data, as is their
own right.
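The export described above can be as simple as serializing each tracked variable's values to a portable format such as CSV. A minimal sketch, where the `entries` structure and field names are illustrative and not LogMe's actual schema:

```python
import csv
import io
from datetime import date

# Hypothetical representation of logged values of a tracked variable.
entries = [
    {"variable": "Lunch Location", "date": date(2014, 5, 12), "value": "Cafeteria"},
    {"variable": "Lunch Location", "date": date(2014, 5, 13), "value": "Home"},
]

def export_csv(entries):
    """Serialize logged entries to CSV so users can analyze them elsewhere."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["variable", "date", "value"])
    writer.writeheader()
    for e in entries:
        # ISO dates keep the export unambiguous across spreadsheet tools.
        writer.writerow({**e, "date": e["date"].isoformat()})
    return buf.getvalue()

print(export_csv(entries))
```

A user could then load the resulting file into any spreadsheet or statistics package and correlate variables freely.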
As multiple data sources are used, the data should be normalized. For the question "What did I have for
lunch today?", the answer may be available in two different places and with different texts that nevertheless
refer to the same thing: for example, the answers "steak" and "steak beef".
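Normalization of this kind can be approximated with fuzzy string matching: map a new free-text answer onto an existing canonical value when the two are similar enough, otherwise keep it as a new value. A sketch using Python's standard `difflib`; the 0.6 similarity cutoff is an assumed tuning parameter, not a value taken from this work:

```python
import difflib

def normalize(answer, known_values, threshold=0.6):
    """Map a free-text answer onto a known canonical value when they are
    close enough; otherwise keep the answer as a brand-new value."""
    matches = difflib.get_close_matches(answer.lower(),
                                        [v.lower() for v in known_values],
                                        n=1, cutoff=threshold)
    if matches:
        # Return the canonical (original-case) value that matched.
        for v in known_values:
            if v.lower() == matches[0]:
                return v
    return answer

known = ["steak", "salad", "pasta"]
print(normalize("steak beef", known))  # close to "steak" -> canonical "steak"
print(normalize("sushi", known))       # no close match -> kept as "sushi"
```

The cutoff trades false merges ("pasta" vs. "pastry") against missed merges, so in practice such a step is usually paired with user confirmation, as in the conflict-handling dialogs of Chapter 3.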
1.3 Achievements
With this thesis we have developed a system for collecting data with a high level of quality. All
data-collection mechanisms were developed with the primary goal of making the task as easy
and quick as possible for the user, so that motivation is maintained while using the application. Data
collection in this system is divided into two modes: insertion of real-time data and automatic data
importation from external sources. Thus, one of our major concerns throughout the development of this
application was to create mechanisms to handle and eliminate data redundancy and "noise": since we
work with data from different sources, situations easily arise where different terms
refer to the same thing.
1.4 Document Structure
The rest of the document is organized as follows. The second chapter (Related Work) presents and
discusses some existing systems: a review of what has been done in this
area, the main applications, and approaches to existing problems in lifelogging. The third chapter
(Solution) states our methodology and the main requirements for our system, and describes
our solution in more detail, from the standpoint of the user as well as of the backend. The fourth chapter
(Evaluation and Results) explains how we evaluated the functionality of our system and presents our
experimental results. Finally, we conclude in Chapter 5 with a summary of the proposed methodology
and the results achieved, suggesting some guidelines for future research.
Chapter 2
Related Work
In this chapter we address several works whose conclusions are relevant to
the objectives of this thesis.
2.1 Reflective Learning
Personal informatics systems allow the collection of personally relevant information so that users
can obtain some self-knowledge: for example, food ingested, how many steps taken, etc. But what can be
done with this data? The vast majority of these systems merely build a collection of all the data gathered
for future analysis, without helping the collector use it for introspection. Rivera-Pelayo
et al. [17] present a framework that combines reflective-learning techniques with quantified-self
tools, in which three main dimensions of support were identified: tracking cues, triggering,
and recalling and revisiting experiences:
a) Tracking cues: capturing and keeping track of certain data as basis for the whole reflective learning
process.
b) Triggering: fostering the initiation of reflective processes in the learner, based on the gathered data
and the analysis performed on it.
c) Recalling and revisiting experiences: supporting learners in recalling and revisiting through the
enrichment and presentation of data in order to make sense of past experiences.
Tracking means the observation of a person and their context, in a way that helps the reflective
process. Rivera-Pelayo et al. characterize this dimension in the following sub-dimensions:
a) Tracking means: Two main ways of tracking exist: self-reporting, often through specialized software,
and hardware sensors that directly track behavior.
b) Tracked Aspects: Of crucial importance to Quantified-Self (QS) applications is the selection of the
data about experiences and outcomes that is being tracked; what is tracked is likely to have a
large effect on user acceptance and on efficiency for reflective learning. The tracked aspects found in
QS applications can be classified in the following way:
- Emotional Aspects: Such as mood, stress, interest, anxiety, etc.
- Private and Work Data: Data from work processes and our lives, such as photographs,
browser history, digital documents, music, or the use of particular software, etc.
- Physiological Data: These are physical indicators and biological signals that describe a per-
son’s state of health. The main approaches comprise the measurement of physical activity
(for applications focusing on sport) and factors indicating health and sickness (e.g. glucose
level).
- General activity: Data about a user's general activity, such as the number of cigarettes, cups
of coffee, or hours spent on a certain activity.
c) Purposes: Another important classification dimension is the purpose of a QS application:
the goal the user tries to achieve by using it. This purpose drives and guides
which measures are tracked and which means are appropriate.
Triggers are responsible for starting the actual reflection process; their role is to raise
awareness. They are separated into two kinds:
a) Active Triggering: Occurs when the application itself sends a notification to capture the
user's attention. To support active triggering, the application has to analyze the data
already collected in order to detect a situation likely to begin the process of reflection, such as
a deviation from a set target.
b) Passive Triggering: Systems that only support this type of triggering do not show any notifications;
they simply present the data when it is requested. It is up to the user, when
visualizing the data, to find something that starts the process of reflective learning.
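The active-triggering idea above can be sketched as a periodic check of collected data against a user-set target. The linear pacing rule, the 50% threshold, and the message wording below are illustrative assumptions, not part of the framework in [17]:

```python
def active_trigger(steps_today, daily_goal, hour):
    """Decide whether to notify the user, based on deviation from a target.

    Returns a notification message when the user is far behind pace,
    or None (the passive case: data is only shown when requested).
    """
    # Naive assumption: progress toward the goal should be roughly linear
    # over the 24 hours of the day.
    expected_by_now = daily_goal * hour / 24
    if steps_today < 0.5 * expected_by_now:
        return (f"You're at {steps_today} steps; about "
                f"{int(expected_by_now)} would keep you on pace for {daily_goal}.")
    return None

print(active_trigger(steps_today=1200, daily_goal=10000, hour=18))  # fires
print(active_trigger(steps_today=8000, daily_goal=10000, hour=18))  # no trigger
```

Real systems would of course run such a check on a schedule and deliver the message through a platform notification rather than printing it.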
Recalling and Revisiting Experiences can be influenced by different aspects. The way the
data is enriched and presented to the user facilitates revisiting the data to analyze past experiences
and reflect on them. Therefore, for the process of reflective learning, QS tools should
give greater support in the following dimensions: contextualization, data fusion, data analysis, and
visualization:
a) Contextualization: The tracked data can be enhanced with context information, i.e., with any
information that can be added to the data in order to assist the process of reflection. According to
Rivera-Pelayo et al. [17], context information can be: Social Context, Spatial Context, Historical
Context, Item Metadata and Context from other Datasets.
b) Data Fusion: One important aid to the reflection process can be the fusion and comparison of
objective data (i.e. measured by sensors), self-assessment (i.e. self-reported data from the user), and peer
and group assessment (data reported by others about a user). There may be differences and discrepancies
between these views that can foster reflection.
c) Data Analysis: Different forms of data processing help present useful measurements to the user
(e.g. number of cups of tea per day/week, average mood of my colleagues, etc.).
d) Visualization: Presentation and visualization forms should be chosen that are attractive and
intuitive for users and that, at the same time, foster the analysis of the data for reflective-learning
purposes; visualization is otherwise one of the major barriers.
Another relevant work is presented by Li et al. [12], who conducted a study that attempted to
understand the main difficulties in the use of self-tracking tools. From that study, a model based on
five stages was born (Fig. 2.1): preparation, collection, integration, reflection, and action.
The main barriers found in each of the stages are also discussed.
Figure 2.1: The Stage-Based Model of Personal Informatics Systems
The Preparation stage occurs before people start collecting personal information. This stage con-
cerns itself with people’s motivation to collect personal information, how they determine what information
they will record, and how they will record it. Barriers in the Preparation stage are related to determining
what information to collect and what collection tool to use.
The Collection stage is the time when people collect information about themselves. During this
stage, people observe different personal information, such as their inner thoughts, their behavior, their
interactions with people, and their immediate environment. Some problems occurred because of the
user, either because they lacked time, lacked motivation, or did not remember to collect information.
Integration is the stage that lies between the Collection and Reflection stages, where the collected
information is prepared, combined, and transformed for the user to reflect on. Integration barriers prevent
users from transitioning from collection to reflection. Users encountered these problems when the
collected data came from multiple inputs, when reflection happened across multiple outputs, and when the
format of the collected data differed from the format needed for reflection.
The Reflection stage is when the user reflects on their personal information. This stage may involve
looking at lists of collected personal information or exploring or interacting with information visualiza-
tions. Barriers in the Reflection stage prevent users from exploring and understanding information about
themselves. These problems occurred because of lack of time or difficulties retrieving, exploring, and
understanding information.
The Action stage is the stage when people choose what they are going to do with their newfound
understanding of themselves. Some people reflect on the information to inform them on what actions to
take. Most systems do not have specific suggestions on what to do next, which is a barrier to applying
understanding of personal information.
A paper [16] by Pina et al. presents a system that aims to reduce sedentary behavior in the workplace:
Fitbit+. As the name indicates, this system uses a regular Fitbit activity tracker, which detects
when the user has been sitting for a long period of time. When such a period is detected, a notification is sent
to the user's computer (see Fig. 2.2) to encourage him/her to take a walk. This notification can
be a specific action (e.g. "Thirsty? Go grab a quick drink from the nearest water fountain"), a reminder
(e.g. "Moving helps with creativity. Take a short walk around the office to help yourself solve a difficult
problem") or even informative feedback (e.g. "You've taken 5 breaks today! Keep up the good work!").
Fitbit+ is a good example of a system that uses collected data to trigger the user to initiate an action.
Figure 2.2: Fitbit+ notification
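The Fitbit+ behavior described above amounts to a simple rule over recent activity data. A minimal sketch, reusing the notification texts quoted from the paper; the one-hour threshold and the function shape are assumptions for illustration, not the paper's actual parameters:

```python
import random

# Assumed threshold: notify after an hour without detected movement.
SEDENTARY_LIMIT_MIN = 60

MESSAGES = [
    "Thirsty? Go grab a quick drink from the nearest water fountain.",
    "Moving helps with creativity. Take a short walk around the office "
    "to help yourself solve a difficult problem.",
]

def check_sedentary(minutes_since_last_movement, breaks_taken):
    """Return a prompt when the user has been sitting too long, positive
    feedback right after a break, or None when nothing should be shown."""
    if minutes_since_last_movement >= SEDENTARY_LIMIT_MIN:
        return random.choice(MESSAGES)
    if minutes_since_last_movement == 0 and breaks_taken > 0:
        return f"You've taken {breaks_taken} breaks today! Keep up the good work!"
    return None

print(check_sedentary(75, breaks_taken=2))  # prompts the user to move
print(check_sedentary(0, breaks_taken=5))   # informative feedback
```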
Technology, such as computers, television and video game consoles, is often blamed for the general
decrease in physical activity observed nowadays. However, technology may actually be helpful in assisting
people to increase and maintain their levels of physical activity.
2.2 Data Collection
There are a number of smartphone apps that use the phone's internal accelerometer and/or external
sensors to monitor users' activity and then present the recorded information. MyFitnessPal and
Endomondo, mentioned in the introduction, are both smartphone applications that do not depend on any
external device. There are also stand-alone wearable tracking devices on the market. Some of these
devices are aimed at whole-day activity monitoring and others at workout monitoring.
However, all incorporate an aspect of self-monitoring by presenting the user with information about
their current and past status.
Before this wave of specialized applications and devices, lifelogging already existed, supported by
some "homemade" methods. These methods, which some people still use nowadays, range from a
simple text file to a more complex spreadsheet: for example, a spreadsheet to track daily expenses,
a text document to take note of important events or commitments, or an online form with several
questions to answer every day. This can become quite chaotic when dealing with large amounts
of data, since these applications were not designed for this kind of activity. Because of that, it is
natural that users feel attracted to systems that are easy and quick to use, or even fully automatic, not
requiring any human intervention.
An example of a wearable device is the FitBit. FitBit is a company founded in 2007, known for
the product of the same name (Fig. 2.3). The device uses an accelerometer to sense user movement. It
is able to track steps taken, distance walked, calories burned, floors climbed, and activity duration and
intensity. It also measures sleep quality: how long it takes to fall asleep, how often users wake up over
the night, and how long they are actually asleep. When connected to a computer, the device sends the
user's data to the FitBit website, where, in a personal account space, it is possible to see an overview of
physical activity, set and track goals, keep food and activity logs, and interact with friends.
Another wearable lifelogging device is the Narrative Clip1. It is a small camera that automatically
takes one picture every 30 seconds while being worn throughout the day. The project is currently in its
launch phase and was initially funded via the crowdfunding site Kickstarter. Narrative Clip follows in the
steps of SenseCam2, also a lifelogging camera, which was used in the MyLifeBits project discussed
earlier. Like these devices, there are many others, each with its own particularities:
for example, the Nike FuelBand3, Withings Scale4, Jawbone Up5, Sleeptracker6, etc.
Figure 2.3: FitBit device
Figure 2.4: Narrative Clip device
In [5], Burns et al. say that "different users will employ technology in different roles within their lives. Users who are highly physically active may use personal informatics to support their current activity regime. Users who do not do regular physical activity but who see the need to do so, may use personal informatics to help motivate them to change. And people who have little interest in exercise and are unwilling to change are likely to be non-users." Burns et al. also discuss Interface Complexity and Engagement, that is, devices or applications with high-complexity and high-engagement interfaces: high complexity in that a large amount of information is presented to the user, usually in the form of numbers and graphs (distance traveled, time taken, calories burned, etc.), and high engagement in that users must commit time to regularly monitor and understand the information presented.
Active users who employ technology in a supporting role will value this ”rich” presentation of infor-
mation. Their high level of intrinsic motivation to exercise means they will be willing to commit the time
1 http://getnarrative.com
2 http://research.microsoft.com/sensecam
3 http://nikeplus.nike.com
4 http://www.withings.com
5 http://jawbone.com/up
6 http://www.sleeptracker.com
and effort to engage with such interfaces. Less active users, who employ technology in a motivational
role, will benefit from devices that use low complexity and low engagement interfaces. These interfaces
should be informative, yet simple to engage with. And if users choose to disengage with the interface,
they should be able to re-engage at a later date with minimal effort.
The authors developed and evaluated a device that embodies such an interface - ActivMON (Fig.
2.5). ActivMON is a watch-like device worn on the wrist which incorporates an accelerometer and LED
light. The accelerometer measures the user’s level of physical activity and the LED changes colour to
show the user’s current activity level as compared to a daily goal. ActivMON is able to alert the user to
others’ physical activity in near real-time, as the LED flashes when other users wearing ActivMON are
doing physical activity.
Figure 2.5: ActivMON on a user’s wrist.
While some logging can easily be automatic (e.g. location, weather, calendar data, etc.), some by
definition require self-logging. Food intake, pain levels, and mood are examples that are quite difficult
to automatically determine, but can be quite easy for users to enter on their own. However, in practice,
people tend to forget and/or lack the motivation to self-log. To address this issue, Bentley and Tollmar
in [3] show how a simple reminder on a mobile phone can increase self-logging frequency. This study
was conducted in two different parts. In the pilot study users had to rely on their memory to remember
to record their food intake. In the full study users had an application with a notification system (Fig. 2.6)
to help them remember. The manual food logging was rarely used. In the first week only a few users
tried it out, and after that, no more than two out of ten users logged food on the same day. After day 12,
only one user sporadically logged food for the rest of the month. This contrasts with the full study with
reminders enabled where 63% of users logged food each day. This percentage stayed quite consistent
throughout the month, showing the power of simple reminders to promote sustained logging (Fig. 2.7).
A system that also mixes data from different sources is presented in [20] by Tollmar et al. It receives daily step count and sleep data from a Fitbit device, weight and body fat from a Withings scale, and location and calendar free/busy data from a mobile phone. Additionally, a mobile phone widget for manually logging food and exercise was provided. The authors were also working on adding further automatically collected attributes such as weather, pollen levels, etc. The Mashups system performs a statistical analysis of all uploaded data for each user on a nightly basis and identifies correlations between inputs as well as significant deviations that occur during particular time periods (e.g. weekends
vs. weekdays). This system then produces a feed of relevant well-being observations that users can view on a widget on their mobile phone or on a website (Fig. 2.8). This feed can contain items such as "Your sleep is more interrupted on nights before you have very busy days" or "On Tuesdays you walk significantly less than other days." These insights are typically difficult for people to reach on their own, given the difficulty of visualizing data from multiple sources over longer time periods.
Figure 2.6: A reminder notification used in the study
Figure 2.7: Logging frequency comparison
Figure 2.8: The mobile widget for the feed of observations.
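The kind of nightly analysis Mashups performs can be illustrated with a small sketch. This is not the authors' actual code: the data, the correlation threshold, and the observation text below are invented for illustration. It simply computes a Pearson correlation between two daily data streams and emits an observation when the correlation is strong.

```python
# Illustrative sketch (not the Mashups implementation): correlate two daily
# streams and emit a well-being observation when the correlation is strong.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / ((vx * vy) ** 0.5)

# One value per day (hypothetical): calendar busy-hours for the next day and
# minutes of interrupted sleep during the preceding night.
busy_hours      = [2, 9, 8, 3, 10, 9, 2]
sleep_interrupt = [10, 45, 40, 12, 50, 48, 8]

r = pearson(busy_hours, sleep_interrupt)
if r > 0.7:  # illustrative threshold
    print("Your sleep is more interrupted on nights before you have very busy days.")
```

In practice such a system would run this over every pair of collected attributes and keep only the statistically significant findings; the sketch shows only the core computation.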
2.3 Data Analysis
Data visualization is one of the main ways by which users of personal informatics tools analyze and
make sense of their data in the reflection stage. The challenge is to find meaningful and beautiful ways
to visualize the data, capturing users' attention and facilitating understanding of their behavioural patterns.
As self-tracking tools become outfitted with more advanced sensors and features, users are able to
track multiple streams of data. Take, for example, BodyTrack’s7 environmental base station. In addition
to tracking multiple indoor air quality and environmental variables such as temperature and humidity, the
BodyTrack website also allows users to combine streams of data into one view. For instance, users can
7 http://www.bodytrack.org
see their physical activity alongside the weather, their sleep quality, and heart rate if they choose to.
However, it is not easy to identify patterns that matter just by visualizing streams of raw data.
Larsen et al. proposed a visualization technique called QS Spiral [11]. Like the name suggests, this
representation is based on a spiral, where a full circle corresponds to a time span (hour, day, week, or
year). The outer arcs are thicker, and become progressively thinner as they approach the center, so that
most space is allocated for information in the outer ring, representing the most recent time frame (See
Fig.2.9). This kind of visualization facilitates data analysis and helps identify periodic patterns.
Figure 2.9: QS Spiral Visualization Examples
Spark is another visualization technique, presented by Fan et al. in [8]. This approach is different in that it represents physical activity data using abstract art. As the authors put it: "We want to explore physical activity as a work of art that people generate and display in their homes." Spark uses data from a Fitbit device, and its visualizations are based on circles representing physical activity data. Variations in the data influence the circles' characteristics: for example, step count influences circle size, and time influences location (see Fig. 2.10). Like Spark, there are other systems that use real-world metaphors to represent data, for example a garden [7] or a fish tank [13].
Figure 2.10: Spark Visualization Example.
Yet another way of presenting personal information is given by Kim and Giunchiglia in [10]. eLifeLog is a platform that works mostly with photos, and its main focus is to build a timeline with those photos. Apart from photos it gathers other information such as location, time, events, people, etc., and builds various types of visualizations with that information (e.g. a map with pins on visited locations; clicking a pin shows the photos taken there). To use it, users have to install and configure it on their computers, so this system may not be suitable for less experienced users.
Personal Informatics systems often deal in domains and utilize data that are just that: personal.
These systems make use of data that we create through our daily activities and help us review it in
a way that encourages reflection and self-knowledge. Because of that, it is especially important that
designers of Personal Informatics systems think about how their systems may be used and impact
users, because they are dealing in domains that are fundamentally tied to individuals’ ideas of self.
Sosik and Cosley conducted a study [19] on how Personal Informatics systems that aggregate and display data back to users can have negative effects. For example, tools designed to encourage weight loss and physical activity strive to help users reach their goals by tracking data such as calories consumed, amount of exercise and/or current weight. One way these tools motivate users is by having users set goals in the system and then displaying the collected data back to them as positive or negative progress is made. Presenting these data without considering users' mental states and potential reactions to the data can be harmful. Overly negative feedback can discourage use (and users).
2.4 Social Networks
Social networks are increasingly linked to lifelogging. The amount of personal information that people put on these networks (Twitter8, Facebook9, Google+10, etc.) is so great that it can be compared to a diary.
Your.FlowingData11 (YFD) is a self-tracking system that uses Twitter as storage for records. Users can record what they eat, when they go to sleep, how much television they watch, how many cigarettes they smoke, their weight, or anything else they want. To use YFD, users just have to follow @yfd on Twitter; then, by sending private messages to @yfd, they record their actions. Afterwards, when accessing their personal space on the YFD webpage, users can analyze their records through a set of graphs. Fig. 2.11 shows an example of the visualizations produced by YFD.
Here are some examples of Your.FlowingData usage12:
• d yfd weigh 160
• d yfd exercised arms
• d yfd watched Back to the Future
• d yfd played xbox at 20:00
• d yfd goodnight at 11pm
8 http://www.twitter.com
9 http://www.facebook.com
10 http://plus.google.com
11 http://your.flowingdata.com
12 d yfd is the Twitter syntax to send a private message to @yfd
Like any other system that relies on text provided by the user, YFD suffers from data quality problems. These problems typically occur when the user refers to the same thing by more than one name, as we briefly mentioned in the introduction. Imagine that, at the end of the month, you want to know how many times you have eaten steak. If you always recorded those meals as "Steak", there is no problem, and the system will give you the exact number of times you ate steak. That will not be the case if some of those meal records were made as "cow beef" or "grilled meat". It is impossible for the system to know that "Steak", "cow beef" and "grilled meat" refer to the same thing.
Figure 2.11: Your.FlowingData Visualizations
Another aspect where social networks play a big role is motivation. In [15], Nundy tells us how his cousin used Facebook to help him quit smoking.
”On January 4th, four days after smoking his last cigarette, he updated his status: ”bring it on day 5!”
Within hours, three people responded that they ”Like” his comment; five others commented favourably
with messages such as ”Good for you!!!” and ”Keep it going, bro.” Encouraged by the support he re-
ceived, my cousin posted another update three days later ”is one week non-smoking!!!” Again, within
hours, eight people responded that they liked his comment and another two offered congratulatory re-
marks. Though he didn’t necessarily realize it at the time, my cousin was creating a community of
supporters through Facebook.”
This story is a perfect example of how a social network can provide motivation. By sharing with his friends that he was trying to quit smoking, Nundy's cousin made a sort of commitment to those who replied, and most likely he did not want to let them down. The great thing about this motivation factor is that it works both ways: not only did Nundy's cousin get the support to stay on the right path, but his friends also got motivated to better themselves. Quitting smoking is just one example of how social networks can help in the process of lifelogging; the same reasoning applies to, for example, weight loss.
In the previous case, we saw how a social network can be used to keep a record of something through status updates, and how to get something positive out of it through the comments received. No third-party systems or apps were involved, but there are applications (i.e. smartphone applications) for self-tracking that give users the possibility to share their data to a social network from within the application. An example is the QuitNow! application. It assists in quitting the smoking habit, providing motivational statistics and achievements over time (Fig. 2.12), which users can share on any social network.
Figure 2.12: QuitNow! Application
Another recent phenomenon on social networks, especially Twitter, are Sparktweets. These are a way of producing a bar chart with unicode characters. The funny and interesting part here is the kind of information that people represent this way, within Twitter's limit of 140 characters. Fig. 2.13 shows some examples.
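The idea behind a sparktweet can be sketched in a few lines; the sleep data below is made up for illustration. The code simply maps each value onto one of eight unicode block characters:

```python
# Illustrative sketch of a "sparktweet": a bar chart made of unicode block
# characters, compact enough to fit in a 140-character tweet.
BARS = "▁▂▃▄▅▆▇█"

def sparktweet(values):
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    # map each value onto one of the 8 block heights
    return "".join(BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values)

sleep_hours = [7.5, 6, 8, 5, 4.5, 9, 8.5]  # hypothetical week of sleep data
print("my sleep this week: " + sparktweet(sleep_hours))
```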
2.5 Discussion
Most of the time, when someone starts self-tracking some aspect of their life, the main objective is to later observe the tracked data. Weight control, food ingestion, etc.: everything can be tracked. But why would someone want to self-track something? To keep control. To change something that may not be right. And to be able to do that, the tool used to track must support Reflective Learning. As we discussed earlier, there are tools that are used for lifelogging but were not designed for it. Those tools are not prepared to give the user any kind of feedback about the collected data; therefore, it is difficult for the user to properly reflect and draw conclusions from the tracked data. Systems with Reflective Learning support give much more feedback to the user, presenting more than just a collection of raw data. This brings us the first requirement for our system: Reflective Learning supportive.
Figure 2.13: Examples of Sparktweets usage
When we discussed the Your.FlowingData system, we saw the data quality issue. This problem affects any self-tracking mechanism that relies on data given by the user, and no technique can fully prevent it from occurring. As we are going to deal with different sources of data, this is something that concerns us. In order to reduce the probability of data quality issues, we are going to have a data normalization mechanism, so that we can produce trusted and accurate conclusions. Another aspect to take into account is the content of those conclusions, since, as we saw earlier, when things are phrased in the wrong way they can have a negative effect. This leads us to another requirement: Data quality.
In terms of graphical interface, we saw that different users value different types of interfaces: active users who employ technology in a supporting role will value a "richer" presentation of information, while less active users, who employ technology in a motivational role, will benefit from devices that use low-complexity and low-engagement interfaces. This detail raises the question "What kind of users will use my system?". If they are users who value all kinds of information, the interface will have to be more complete, and thus more complex. If they are users who do not want to be bothered with "boring" information, the interface will be simpler. Since we aim to build an application that pleases all kinds of users, the third requirement is: Balanced Interface Complexity.
With the recent boom of social network usage, people tend to share more and more personal data with their connections, and lifeloggers are no exception. As we analyzed in the related work, when we combine social networks with lifelogging systems, we get some positive and motivational effects. When a lifelogging application supports a social network, the exposed information is normally wrapped in some predefined text, for example: "This week I have lost <weight loss value>, and I am currently at <current weight value>". The previous example is something that can raise positive reactions. But what if it were just: "I am currently at <current weight value>"? It is completely different. Once again, we will have to pay attention to how we expose user information. With this, another requirement is: Support for social networks.
When using an application (not only self-tracking applications), users want it to be as quick and smooth as possible. In a tracking application this is one of the most important points: users do not want to spend too much time taking note of something. Older applications with no defined standards, or applications not directed at lifelogging, often fail in this respect. For example, if you meet a friend on your way to work and want to take note of that, it has to be something easy to do while walking. This brings another requirement: Easy to use and non time consuming.
By developing an application that is easy to use and does not take much time, we also cover part of our last requirement, which is to keep motivation. But the most important factor in keeping the user motivated is the quality of the conclusions presented. When we talk about the quality of the conclusions, we refer not only to the quality of the content but also to the way they are presented to the user. By using visualization techniques like the ones discussed before, users will want to watch their data grow and take form, and so they will be motivated to provide more and more data. Choosing the correct visualization for each type of data is also an important point.
With this, the five main requirements to our application are:
• R1: Reflective Learning supportive
• R2: Data Quality
• R3: Balanced Interface Complexity
• R4: Easy to use and Non time consuming
• R5: Keep motivation
Chapter 3
Solution
In order to build a system which properly supports reflective learning (requirement R1), we followed the approaches from Rivera-Pelayo et al. [17] and Li et al. [12] discussed earlier, by implementing a system that supports the three main dimensions of support (tracking, triggering and recalling), with proper means to address the main barriers identified for each of the stages in the Five-Stage-Based model.
Since the main goal of this project is to provide an easy and quick way to collect data from different lifelogging data sources, this accomplishes the needs of the tracking dimension. To support the triggering dimension, we implemented a system based on active notifications to the user, in order to initiate the reflection process and help keep the user motivated (requirement R5). These notifications mostly remind the user to insert values that are missing. Since the recording of data is the main focus of this project, we have to overcome the barriers identified in the preparation and collection stages.
Most lifelogging systems that rely on text data provided by users have to deal with data quality issues, like the ones discussed for the Your.FlowingData system. To surpass this problem there are several possible approaches: auto-completion/suggestions or predefined values. We used both approaches in our application. Data quality is directly related to normalization. Effectively processing data and making it suitable for natural language processing is challenging because informally inputted text data is usually very noisy, which is why normalization is an essential step if we want high-quality data. This covers requirement R2.
In terms of Interface Complexity and Engagement, our system tries to please both sides. We wanted to give users the power to select the level of interface complexity, in other words, the amount of visible information. Regarding the level of engagement, we tried to keep it as low as possible, although it is directly related to the level of complexity, in the sense that when one is high, the other is also high. Users who prefer a high level of information have more information to analyze and thus have to commit more time to it; but again, our objective was to always have interfaces that are easy to engage with, no matter the amount of information presented (requirement R3).
By providing a clean and easy-to-use interface, we also covered requirements R4 and R5. If an interface is easy to use, it will most likely not be time consuming, and by not being time consuming, it will keep users motivated to continue using it. Requirement R5 is also covered by the quality of the conclusions presented.
Our system consists of three main blocks: Setup, Collection and Visualization (See Fig.3.1).
The setup block is where the user feeds the system with existing data, whether by providing their credentials to an existing and supported system, like Fitbit, or by uploading an Excel spreadsheet with personal records. It is also in this block that the user assists the system by locating and identifying redundant data between the different data sources. All the information is stored in a database for persistence.
The collection block is where the system supports the collection of information. In order to record an action, for example "eat", the user has to set it up beforehand. The subject of an action also has to be preconfigured in order to avoid data quality issues.
The visualization block is where we present the user with views of the data provided to the system, allowing them to choose a variable and a collection period, and whether the data is shown as a list or exported to CSV.
Figure 3.1: System Blocks Example
3.1 LogMe User
In this section we describe our solution to the issues referred to earlier, from the user's perspective. In Fig. 3.2 we can see the homepage of our application, which is the main spot for the user to start using it. To facilitate the insertion process, some fields come pre-filled (Date and Time).
Figure 3.2: LogMe Homepage
In this thesis, we focused mainly on simplifying data collection, addressing manual collection, automatic collection, merging data from multiple sources and normalization; the reminders system; and data exporting.
3.1.1 Data Collection
To address the problem of data collection, our application was implemented with the user's needs in mind.
We mostly relied on the 10 Usability Heuristics for User Interface Design [14]. These are 10 general
principles for interaction design and they’re called ”heuristics” because they are broad rules and not
specific usability guidelines.
1. Visibility of system status: The system should always keep users informed about what is going
on, through appropriate feedback within reasonable time.
2. Match between system and the real world: The system should speak the user’s language,
with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow
real-world conventions, making information appear in a natural and logical order.
3. User control and freedom: Users often choose system functions by mistake and will need a
clearly marked ”emergency exit” to leave the unwanted state without having to go through an
extended dialogue. Support undo and redo.
4. Consistency and standards: Users should not have to wonder whether different words, situa-
tions, or actions mean the same thing. Follow platform conventions.
5. Error prevention: Even better than good error messages is a careful design which prevents a
problem from occurring in the first place. Either eliminate error-prone conditions or check for them
and present users with a confirmation option before they commit to the action.
6. Recognition rather than recall: Minimize the user’s memory load by making objects, actions, and
options visible. The user should not have to remember information from one part of the dialogue
to another. Instructions for use of the system should be visible or easily retrievable whenever
appropriate.
7. Flexibility and efficiency of use: Accelerators unseen by the novice user may often speed up
the interaction for the expert user such that the system can cater to both inexperienced and expe-
rienced users. Allow users to tailor frequent actions.
8. Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant
or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of
information and diminishes their relative visibility.
9. Help users recognize, diagnose, and recover from errors: Error messages should be ex-
pressed in plain language (no codes), precisely indicate the problem, and constructively suggest
a solution.
10. Help and documentation: Even though it is better if the system can be used without documen-
tation, it may be necessary to provide help and documentation. Any such information should be
easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too
large.
Having this in mind, we focused on obtaining a user interface that is easy and pleasurable to use:
• Quick response times, as users are impatient (after all, who likes to wait?);
• Helpful texts when errors occur (small details can make a huge impact);
• A user interface consistent with what users are used to and know: similar buttons, elements kept where they are usually placed, and recognizable icons, as users do not want to waste time guessing what a symbol means;
• Easy and pleasant to read, attending to basic typographic elements: line heights, letter spacing, (relative) font sizes, legible fonts, the right colors (as certain colors can evoke feelings and emotions, we need to make sure that they not only convey the emotions we want but also keep the text readable) and font styles (bold and italic can help catch users' attention).
In our application, data collection can be achieved in two different ways, manually and automatically.
Next we will detail each one of these options.
Manual Data Collection
Users can insert a value directly on the application homepage, by selecting an already created variable. If the variable was created with a set of allowed values, those will appear in the suggestions list, used to minimize errors (see Fig. 3.4). Users can also access the "Configure Vars" option on the menu to create a new variable (see Fig. 3.3).
Figure 3.3: Manual data collection: creating a new variable
Figure 3.4: Manual data collection: suggestions’ list based on the possible values
Automatic Data Collection
The automatic data collection requires a few more steps. The user has to choose which variable will be collected from the external source, picking it from a list of possible variables. For example, in Fig. 3.5 we created a new variable named Lunch Location, which will collect its values from the column "Almoco (localizacao)" in the Google Spreadsheet.
Merge Data from Multiple Sources
Combining data residing in different sources and providing users with a unified view of these data,
we think that has been one of the gaps in lifelogging. By merging data from multiple sources we can
generate powerful insights that will help us make better choices, towards our goals. Each data source
can be seen as a puzzle piece, where you can see a general shape of a puzzle’s picture, but you can’t
see the entire picture. Without every piece, there are gaps that you can only fill in by guessing what
21
Figure 3.5: Automatic data collection: new variable creation from FitBit
the full picture looks like, which can make you make incorrect assumptions about what may best solve
a problem and, at the most extreme scenario, can harm you. When put together, multiple data sources
can give us powerful insights to make important decisions, allowing us to better identify the context of
our daily habits. It can help us validate our assumptions and change what’s wrong with confidence and
not based on a subjective opinion.
In this thesis we used two different sources: Google Spreadsheets and FitBit. Our choice fell on these sources as they are widely used and readily available:
• FitBit: one of the best-known lifelogging gadgets; it tracks steps taken, distance traveled, calories burned, stairs climbed and active minutes throughout the day. At night, it tracks sleep;
• Google Spreadsheets: it is easy to access, create and edit spreadsheets anywhere (phone, tablet or computer); changes are automatically saved as you type, and spreadsheets can be accessed even without a network connection (just activate offline editing in the browser or in the files on your mobile device);
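As an illustration of the unified view these two sources enable, here is a simplified sketch in Python (the actual system is implemented in ASP.NET, and the field names and values below are hypothetical): records from each source are merged into one view, keyed by date.

```python
# Illustrative sketch: merging per-day records from two lifelogging sources
# into one unified view. Field names and values are hypothetical.
fitbit = {
    "2014-05-01": {"steps": 9500, "sleep_min": 410},
    "2014-05-02": {"steps": 3200, "sleep_min": 350},
}
spreadsheet = {
    "2014-05-01": {"lunch_location": "home"},
    "2014-05-02": {"lunch_location": "restaurant"},
}

merged = {}
for source in (fitbit, spreadsheet):
    for day, fields in source.items():
        # each day's record accumulates the fields from every source
        merged.setdefault(day, {}).update(fields)

print(merged["2014-05-02"])
```

With the streams side by side, one can start asking contextual questions such as whether step counts differ on days with a particular lunch location, which no single source could answer alone.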
Normalization
Normalization means adjusting values measured on different scales to a notionally common scale,
allowing it to be compared in a meaningful way. Text normalization is an important problem in natural
language processing, it means converting ’informally inputted’ text into the canonical form, by eliminating
’noises’ (Zhu et al.).
For instance, if a search for ”resume” is to match the word ”resume”, then the text would be normal-
ized by removing diacritical marks; and if ”elizabeth” is to match ”Elizabeth”, the text would be converted
to a single case. Applying to our thesis, ”cookie” or ”chocolate cookie” may refer to the same thing. The
quality of this pre processing is a key point for the quality of the conclusions and is something that the
user cannot be worried about.
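The two normalization steps just described (removal of diacritical marks and case folding) can be sketched as follows. This is an illustrative Python sketch, not the application's actual implementation; note that normalization alone does not resolve synonyms like "Steak" vs. "grilled meat", which is why the application also relies on predefined value lists.

```python
# Illustrative sketch of text normalization: strip diacritics and fold case so
# that user-typed variants match a single canonical form.
import unicodedata

def normalize(text):
    # decompose accented characters into base letter + combining mark,
    # drop the combining marks, then lowercase and trim whitespace
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower().strip()

print(normalize("Résumé"))  # prints "resume"
```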
If the user manually inserts a value that does not appear in the suggestions list, the application presents two options: add the inserted value to the allowed values (see Fig. 3.7) or change the inserted value to another value in the suggestions list (see Fig. 3.6).
Figure 3.6: Dealing with conflicts: change the value inserted
Figure 3.7: Dealing with conflicts: adding the value inserted
If the variable collected from an external source has been configured with a list of possible values, and there is any conflict between the imported values and the allowed values, a list of conflicts is presented to the user (see Fig. 3.8). At this point, the user is given several options to solve these conflicts.
Figure 3.8: Automatic data collection from a Google Spreadsheet: dealing with conflicts
3.1.2 Reminders’ System
Motivation is the combination of desire, values and beliefs that drives someone to take action. These motivating factors, or the lack of them, are at the root of why people behave the way they do.
You can influence your motivations: if someone considers something important and assigns value to it, they are more likely to do the work it takes to achieve their goal. However, nowadays we lead stressful lives, with lots to think about and no time to do everything we want, need or have to do. If you keep forgetting to register lunch, for example, by the time you notice it (maybe at the end of the week?) you no longer have the drive, or perhaps the mental availability, to do it. Thus, motivation drops and the probability of achieving your goal falls tremendously. This is why a reminders' system is so important and essential. If you have something that keeps reminding you of what you have to register, then you start seeing results, your motivation rises, turning into a virtuous circle, and the odds of a good outcome are greatly increased.
In our application, we have a reminders' system that goes through all the variables that have an
associated frequency and checks which of them have missing values for the current day.
It then presents the user with a list of reminders informing which values are missing (see Fig. 3.9).
An "Insert Now" option takes the user directly to the insertion function.
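The reminder check described above can be sketched as follows. This is an illustrative Python sketch (the application itself is ASP.Net); the data model here, with variables as dictionaries and records keyed by variable name, is our own assumption, not the thesis's actual schema.

```python
from datetime import date

def missing_today(variables, records, today=None):
    """Return the names of variables that have a frequency but no
    record for the current day (hypothetical data model)."""
    today = today or date.today()
    reminders = []
    for var in variables:
        if var.get("frequency") is None:
            continue  # only variables with an associated frequency are checked
        if today not in records.get(var["name"], []):
            reminders.append(var["name"])
    return reminders

variables = [{"name": "Lunch", "frequency": "1x day"},
             {"name": "Notes", "frequency": None}]
records = {"Lunch": []}
print(missing_today(variables, records))  # ['Lunch']
```

The list returned would feed the reminders shown on the home page, each with its "Insert Now" shortcut.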
3.1.3 Data Exporting
In fig. 3.10 we can also see that we chose to collect that value from 01-01-2014 till 01-10-2014, once
per day.
Figure 3.9: List of reminders that shows the missing values
To see the values collected, users must click the "View" option on the menu. Users can choose
the variable whose values to present, with the possibility of choosing the associated period, and can
either list the results or export them to a CSV file (see fig. 3.10).
Figure 3.10: Data Exporting: presenting the values collected for the Lunch Location variable
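The listing and CSV export just described can be sketched as follows (an illustrative Python sketch; the application itself is ASP.Net, and the function name and column headers are our own assumptions):

```python
import csv
import io
from datetime import date

def export_csv(records, start, end):
    """Filter (timestamp, value) records by the chosen period and
    render them as CSV text, as the export button might do."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["timestamp", "value"])
    for ts, value in records:
        if start <= ts <= end:
            writer.writerow([ts.isoformat(), value])
    return out.getvalue()

records = [(date(2013, 3, 1), "Rialva"), (date(2014, 2, 1), "Zafran")]
print(export_csv(records, date(2013, 1, 1), date(2013, 12, 31)))
```

Only the records inside the selected period reach the output, mirroring the period filter on the "View" page.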
3.2 BackEnd
To implement this application, several technological choices had to be made. The first was between
a standalone and a web application. The fact that a web application is accessible from any computer
with an internet connection, and also that it requires no installation (beyond the initial deployment of the
server where the application runs), were the main reasons that led us to opt for this approach.
3.2.1 System Architecture
The language chosen for this implementation was ASP.Net (C#)1, since it is a language we were
already familiar with and knew would perfectly suit our purposes. To implement some features, we
also resorted to JavaScript/jQuery. Regarding the layout, we used the Bootstrap CSS framework2 and
1 http://www.asp.net
2 http://getbootstrap.com
UIKit3. For data storage, we discussed several options, such as MySQL, Microsoft SQL Server and
SQLite4. Again, in order to make the initial setup as easy as possible, we ended up choosing a
SQLite database, as it does not require any installation or configuration.
Database Model
In figure 3.11 we present the database model used in our project. Later in the document, we will
explain how our application makes use of this database model.
Figure 3.11: Database Model of our application
Implementation Layers
This application was implemented following a layered approach. We have three different layers (see
fig. 3.12):
• Presentation
• Business
• Data
The presentation layer is the one responsible for the information delivery and formatting to be sent
to the business layer for further processing, and also for presenting information to the user.
3 http://getuikit.com
4 http://www.sqlite.org
The business layer is where all the information processing happens. This is where all the rules and
calculations needed for the proper functioning of the application are executed.
Finally, the data layer provides all necessary methods for the interactions performed on the database
by the business layer.
Figure 3.12: Application layers
Each variable that is created corresponds to an entry in the Vars table, where each entry keeps
various types of information, such as: a unique id, name, data source, frequency and possible
values. Regarding our ER model, we could have added a third table to keep the possible values of
each variable. For the sake of simplicity, we ended up storing all the allowed values together, using a
special character to separate them. This decision is completely transparent to the rest of the system,
since the data layer is responsible for dealing with this detail.
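The serialization of allowed values into a single column can be sketched as below. This is an illustrative Python sketch; the thesis only says "a special character" is used, so the `|` separator here is a hypothetical choice of ours.

```python
SEP = "|"  # hypothetical separator; the thesis does not name the character

def pack_allowed(values):
    """Serialize the allowed values into the single Vars column."""
    return SEP.join(values)

def unpack_allowed(column):
    """Deserialize the column back into a list; the data layer hides
    this detail from the business and presentation layers."""
    return column.split(SEP) if column else []

col = pack_allowed(["Rialva", "Zafran", "Farm"])
print(unpack_allowed(col))  # ['Rialva', 'Zafran', 'Farm']
```

Callers above the data layer only ever see the list, never the packed string, which is what makes the decision transparent.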
Concerning the source of the data, this version of the application supports the following:
• User: All data is entered manually by the user;
• Google Spreadsheets: data is collected automatically from a Google Spreadsheet via the official
API;
• Fitbit: data is collected automatically by the official Fitbit API.
The application was implemented so as to achieve a modular system for the introduction of new
external data sources. There is a main class, called Plugin, which the data-access class implementations
extend; it makes available a set of methods that the system later uses without having to worry about
which external source it is interacting with. This way, any system that provides an API to access its data
can easily be integrated into our application.
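The Plugin hierarchy can be sketched as follows (an illustrative Python sketch; the actual implementation is in C#, and the method name and return shape are our own assumptions, not the thesis's API):

```python
class Plugin:
    """Main class from which data-access implementations extend."""
    def fetch(self, last_consumed):
        """Return (position, timestamp, value) tuples newer than
        last_consumed, whatever 'position' means for the source."""
        raise NotImplementedError

class FitbitPlugin(Plugin):
    def fetch(self, last_consumed):
        # stub: a real implementation would call the Fitbit API
        return [("2014-10-02", "2014-10-02", 1000)]

class SpreadsheetPlugin(Plugin):
    def fetch(self, last_consumed):
        # stub: a real implementation would read new spreadsheet rows
        return [(7, "2014-10-02", "Rialva")]

# The system iterates over plugins without knowing the concrete source:
for plugin in (FitbitPlugin(), SpreadsheetPlugin()):
    print(plugin.fetch(last_consumed=None))
```

Adding a new external source then amounts to writing one more subclass, without touching the rest of the system.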
On the other hand, records live in another table (named Records), and each entry stores information
such as the record timestamp, its value, a link (foreign key) to the corresponding variable and a
field indicating whether the record is pending or not. Pending records will be addressed in the Data
Quality section.
For this version of the application we implemented access to two external sources of information
(Google Spreadsheets and Fitbit). These choices are due to the fact that both platforms are widely used
in lifelogging practice and also because these were two systems we had at our disposal, namely a
Fitbit device, with which we performed tests.
3.2.2 Manual Data Collection
To manually enter a record via the application home page:
• We choose the variable to which we want to add the record, the record timestamp and its value.
• In case the chosen variable has a list of possible values associated, these are presented in the
form of suggestions, so that the user can make a choice without errors. From a more technical
perspective, this part of the application was simpler to implement, because it only requires the user
to select the variable (listed in a dropdown menu), indicate the date and enter the value.
3.2.3 Automatic Data Collection
The automatic data collection requires specific background jobs:
Variable creation
When we create a variable whose data source is external, the user is asked to fill in a few more
settings, as shown in the LogMe User section and as we can see in fig. 3.13. It may
seem that there are a lot of fields to fill in, but this is done only the first time, and it will ease all
future processes and save time. Our application provides access to two external data
sources, Fitbit and Google Spreadsheets.
Fitbit provides an API that allows us to collect any information that is generated/collected by the
device. For the purposes of this application we implemented access to three values: Steps, Water
and Sleep; respectively, the number of steps in a given day, the quantity of water ingested, and the hours
of sleep. When we perform a request to the Fitbit API, we can choose the format of the response: JSON
or XML. We chose XML because of the great .NET support for dealing with this format and also
because of our familiarity with it. Below we show an example of a response to the GetWater request,
indicating that on the given date a total of 1000ml of water was consumed (two records of 500ml):
<result>
  <summary>
    <water>1000</water>
  </summary>
  <water>
    <apiWaterLogImplV1>
      <amount>500</amount>
      <logId>487208310</logId>
    </apiWaterLogImplV1>
    <apiWaterLogImplV1>
      <amount>500</amount>
      <logId>487208796</logId>
    </apiWaterLogImplV1>
  </water>
</result>
Since there are no official libraries, we implemented from scratch the methods responsible for making
the requests and parsing the XML responses.
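Parsing such a GetWater response can be sketched with standard XML tooling (an illustrative Python sketch using the standard library; the thesis's implementation is in .NET, so the function name here is our own):

```python
import xml.etree.ElementTree as ET

RESPONSE = """<result>
  <summary><water>1000</water></summary>
  <water>
    <apiWaterLogImplV1><amount>500</amount><logId>487208310</logId></apiWaterLogImplV1>
    <apiWaterLogImplV1><amount>500</amount><logId>487208796</logId></apiWaterLogImplV1>
  </water>
</result>"""

def parse_water(xml_text):
    """Extract the daily total and the individual (amount, logId) entries."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("summary/water"))
    logs = [(int(e.findtext("amount")), e.findtext("logId"))
            for e in root.findall("water/apiWaterLogImplV1")]
    return total, logs

print(parse_water(RESPONSE))  # (1000, [(500, '487208310'), (500, '487208796')])
```

The summary element gives the daily total directly, while the individual log entries preserve each 500ml record.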
Google Spreadsheets, contrary to Fitbit, offers an official library that simplifies access to the data
through a set of methods; by using it, we have access to every spreadsheet stored in our Google
Drive. In this case, we can access a list of all the existing spreadsheets and the sheets within each one.
Within a sheet, we can navigate through its cells and retrieve data.
Once the variable is set, we proceed to the initial data collection. In this stage, the system makes use
of the access plugin to link to the outside source. Using the settings provided by the user, it collects
all data in the fields the user selected, within the time interval specified, and updates the
variable to indicate the last value consumed, so it knows where to start next time. If the variable has been
configured with a list of possible values and there is any conflict between the imported values and the
allowed values, a list of conflicts will be presented to the user, who will be given several
options to solve them (see fig. 3.8). Concluding this step, the process of variable creation is
complete.
Background job
The process of automatic data collection occurs each time users load the homepage of our application.
Regarding the periodicity of this operation, and in order to obtain better performance,
we pondered whether it should be performed at a fixed time interval instead of being executed
every time the application is loaded. We opted for the latter approach, since it only has an
impact on performance when there is new data to be imported, which does not happen most of the time.
At the end of every automatic data collection operation, the system updates a field in the variable's
database record, which holds information about the location of the last value consumed. In the case of
a Fitbit variable, this is a date; in the case of a Google spreadsheet, it is the row number. With this, the
system knows where to start collecting more data without user intervention.
Figure 3.13: Setting up a new variable from a Google spreadsheet
This application also supports importing data from an Excel spreadsheet. Unlike
Google spreadsheets, such an import is not associated with any created variable; its sole purpose is to make
a one-off data import into the system. Regarding data import from a Google spreadsheet or from
an Excel spreadsheet, and since these can have several layouts, our application requires that a certain
format be followed: the fields to register are distributed among the columns, the
first column holds the record's timestamp, and each row represents a record that can affect more than
one field (see fig. 3.14).
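The required import layout (timestamp in the first column, fields in the remaining columns, one record per non-empty cell) can be sketched as follows. This is an illustrative Python sketch; the function name and the in-memory representation are our own assumptions.

```python
def rows_to_records(header, rows):
    """Convert spreadsheet rows in the required layout into
    (timestamp, field, value) records; each row may affect
    more than one field, empty cells are skipped."""
    records = []
    for row in rows:
        timestamp, values = row[0], row[1:]
        for field, value in zip(header[1:], values):
            if value != "":
                records.append((timestamp, field, value))
    return records

header = ["date", "Lunch spot", "Money withdrawn"]
rows = [["01-01-2014", "Rialva", "20"],
        ["02-01-2014", "", "40"]]
print(rows_to_records(header, rows))
```

The first example row produces two records (one per filled field), while the second produces only one, matching the "one row, possibly several fields" rule.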
Authentication with Protected External Data Sources
When we talk about systems that deal with personal information and allow access to the data through
an API, these accesses are not public and an authentication mechanism is always involved.
For the two implemented systems we used authentication through the OAuth protocol.
The alternative to OAuth would be to ask users to introduce their access credentials somewhere in
our application, which is not always seen favorably, as there have been cases of dishonest
developers who keep those credentials and use them for other purposes. That is why we decided on OAuth.
In the OAuth system, authentication is done in two steps:
1. The user is redirected to a login page of the entity in question;
2. After a successful login, the user is redirected back to our application, bringing a token which is
provided to the application. This token serves to sign our requests to the
respective APIs. At any given time, the user is able to revoke access, causing the token to no
longer be valid and invalidating future requests.
Figure 3.14: Data import format required
3.2.4 Data Quality
A major goal of this thesis is to enable the user to build a single data source in which data is
normalized and consistent. In order to achieve this, we paid great attention to, and implemented,
various mechanisms that allow the user to reach a very high level of data quality. Next we discuss
these mechanisms and where they come into play.
Manual Insertion
When manually entering a value that is not part of the list of possible values, upon submitting the
record the system detects that the value is not allowed and lets the user decide what to do
next, giving two options:
1. Add the value to the list of possible values and proceed with the registration. This new value
then becomes one of the allowed values, and the next time it is inserted it is already part of the
suggestions made for this variable.
2. Replace the value entered by one of the existing allowed values.
Automatic data collection
When the system connects to an external data source and collects new data, it will check for possible
31
conflicts between these new data collected and the allowed values. If there are any conflicts, a table will
be presented with all ”conflicting” values, where you can choose the action to take for each case. Like in
the manual insertion, here will also be presented the chance to add the new values to the list of possible
values or to change the value with one presented in the list displayed. Additionally two other options are
available (see fig. 3.8):
1. to edit the value and add it to the set of allowed values;
2. to ignore all occurrences of that value, not importing any of those values.
Only at the conclusion of these steps is the data loaded into the database.
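The four conflict actions (add, replace, edit-and-add, ignore) can be sketched in one resolution step. This is an illustrative Python sketch; the `decisions` mapping stands in for the user's choices in the conflict table, and all names are our own assumptions.

```python
def resolve_conflicts(values, allowed, decisions):
    """Apply the user's decision to every imported value that is not
    allowed; 'decisions' maps a conflicting value to an
    (action, argument) pair. Returns the values to load plus the
    (possibly extended) set of allowed values."""
    allowed = set(allowed)
    loaded = []
    for v in values:
        if v in allowed:
            loaded.append(v)
            continue
        action, arg = decisions[v]
        if action == "add":
            allowed.add(v); loaded.append(v)       # accept as a new allowed value
        elif action == "replace":
            loaded.append(arg)                     # arg: an existing allowed value
        elif action == "edit":
            allowed.add(arg); loaded.append(arg)   # corrected value becomes allowed
        elif action == "ignore":
            pass                                   # drop every occurrence

    return loaded, allowed

loaded, allowed = resolve_conflicts(
    ["Rialva", "charrua", "Zafrann"],
    ["Rialva", "Zafran", "Farm"],
    {"charrua": ("edit", "Charrua"), "Zafrann": ("replace", "Zafran")})
print(loaded)  # ['Rialva', 'Charrua', 'Zafran']
```

Only after every conflicting value has a decision does the resulting list get loaded, matching the rule that conflict resolution cannot be postponed.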
Normalization
In summary, the process of data normalization occurs every time new records are added to the database:
on manual data insertion and on automatic data collection from external sources.
This process is one of the key aspects of this application, because it is through it
that we can guarantee users a set of consistent and correct data.
After each data entry, and if conflicts exist, the user is obligated to choose one of the options
mentioned above; it is not possible to postpone this operation. This decision was
considered and discussed, and we chose this path because, if it were possible to leave it for later, users
would eventually forget about it or simply ignore it, accumulating a large amount of data to be
processed later on, which might lead them to stop using the application because of the increased amount of
time needed to deal with it.
3.2.5 Reminders
In the previous paragraphs, our data quality concern was related to the normalization of the data, so
that we could have a coherent data set. Another form of data quality relates to the completeness of the
data: for example, a variable with a frequency of once per day should not have days with no records.
To achieve this level of data quality, we provide means to prevent such "holes" in the data.
On our application home page, after the execution of the automatic data collection, the system cycles
through all the variables that have a frequency and checks which of them have missing
values for the current day. Then, it presents the user with a list of reminders informing which records
are missing (see fig. 3.9). The fact that we only check missing records for the current day
is deliberate, since this operation requires some processing. If we were to verify the completeness of
the data since the "beginning of time" on the homepage, this would take more time than acceptable
(and, as mentioned earlier, users are impatient). Therefore, and since we consider that there should be
an option to make this more exhaustive verification, a section was created off the home page, where the
user can observe all data that is missing.
Another important aspect is that, if the user has a large number of values missing for the current day,
the system will only present the first five. Again, this was implemented deliberately, in order not to affect
the initial page loading time.
Chapter 4
Evaluation and Results
To evaluate the usability of the data gathering and extraction process, a set of testers were given the
system to try and comment on. This evaluation consisted of a session composed of three parts:
• An initial form where we traced the testers' profile (e.g. age, native language, experience with lifelogging
applications, etc.);
• Exercises, where a set of actions to perform on the prototype was presented, covering all the
possible solutions to a certain problem (like the data quality issue mentioned earlier);
• A final questionnaire, aimed at the usability and qualitative assessment of the system.
Our universe of testers was composed of a group of 16 people who demonstrated interest in using
this kind of tool, with some experience in using technological gadgets (such as smartphones, tablets and
computers), both female and male, and aged between 18 and 50 years old.
4.1 Prototype
Users were given a set of tasks to perform, using all the different approaches. By
observing users and measuring task completion rates and times, the results were processed and analyzed.
After the tests, we used questionnaires to evaluate the results. Testers were given a set of questions
after using the prototype. The questionnaire encompassed several areas, such as the reason to use the
application, its usefulness, data quality, ease of use, whether it changed and/or helped their lives, motivation
levels, etc. Users also had the possibility to give feedback in a free text field.
The tests' guide, the questionnaires, the data collected from them and the testers' feedback were analyzed
and studied, and the conclusions are presented in the next sections.
4.2 Evaluation
In this section we will present the 3 parts of the testing session.
4.2.1 Initial Form
In order to characterize the testers' universe, we used the form presented in 6.
4.2.2 Tests’ Guide
Below we present the tests' guide used, composed of three different exercises (creation of new variables,
insertion of values and data export), each one with a set of actions to perform on the prototype.
1. Creation of new variables
1.1. New variable (name: ”Money withdrawn”, source data: user, frequency: 1x day);
1.2. New variable (name: ”Lunch spot”, source data: Google Spreadsheet, frequency: 1x day,
possible values: ”Rialva”, ”Zafran”, ”Farm”). The imported values that are not included in the
possible values must be corrected, added or ignored (it is up to the user);
1.3. New variable (name: ”Sleep hours”, source data: Fitbit, frequency: 1x day).
2. Values insertion
2.1. Import data from Excel sheet to ”Money withdrawn” variable;
2.2. Add the value ”20” to the variable ”Money withdrawn”;
2.3. Enter a value that is missing on the day;
2.4. Insert the value ”Charrua” in the variable ”Lunch spot”.
3. Data Exporting
3.1. List all records for the variable ”Lunch spot” between dates ”01-01-2013” and ”31-12-2013”;
3.2. Export to CSV the previous listing.
4.2.3 Final Questionnaire
After finishing the proposed exercises, users responded to a final questionnaire about the application.
This questionnaire is divided into two parts: in the first we used the System Usability Scale (SUS) (Brooke [4]),
and in the second we have specific questions related to the application.
The System Usability Scale is a reliable tool, with a simple ten-item scale giving a global view of
subjective assessments of usability. Each question has five response options, from
Strongly agree to Strongly disagree. It was created to assess ease of use and to evaluate
a wide variety of products and services, including hardware, software, mobile devices, websites and
applications. The usability of a system can only be measured by taking into account the context of use
of the system (defined by the ISO standard ISO 9241 Part 11) and can be measured by evaluating three
features:
• Effectiveness: if users can successfully achieve their objectives;
• Efficiency: how much effort and resource is needed to achieve those objectives;
• Satisfaction: if the experience was satisfactory.
It is possible to map the questionnaire items onto the quality components indicated by Nielsen in 10
Usability Heuristics for User Interface Design:
• Ease of learning: 3, 4, 7 and 10;
• Efficiency: 5, 6 and 8;
• Memorability: 2;
• Minimization of errors: 6;
• Satisfaction: 1, 4, 9.
SUS has become an industry standard, with references in over 1300 articles and publications. Its
major benefits include that it is a very easy scale to administer to participants; it can be used on small
sample sizes with reliable results; and it is valid, as it can effectively differentiate between usable and
unusable systems. This is why our choice fell on this method to measure our system's usability.
Below we can see the final questionnaire that was distributed after the execution of the tests. Answers
were given according to a scale from 5 (Strongly Agree) to 1 (Strongly Disagree), with the exception of the
last question (Global Appreciation), whose scale goes from 5 (Very Good) to 1 (Terrible).
1. Final Questionnaire - Part One
Q1 I think that I would like to use this system frequently.
Q2 I found the system unnecessarily complex.
Q3 I thought the system was easy to use.
Q4 I think that I would need the support of a technical person to be able to use this system.
Q5 I found the various functions in this system were well integrated.
Q6 I thought there was too much inconsistency in this system.
Q7 I would imagine that most people would learn to use this system very quickly.
Q8 I found the system very cumbersome to use.
Q9 I felt very confident using the system.
Q10 I needed to learn a lot of things before I could get going with this system.
2. Final Questionnaire - Part Two
(a) I think this application is useful.
(b) I can see myself using this application every day.
(c) The proposed exercises were easy to understand and execute.
(d) The application has responded quickly.
(e) Global Appreciation.
4.3 Experimental Results
To open the testing session, we made an introduction to lifelogging, because not all users were
familiar with the term. We gave a short description of the concept, which everyone already knew, although
they did not know it was referred to by that term. Briefly, we described lifelogging as the recording, storage
and distribution of everyday situations, with the support of technology.
4.3.1 Analysis of Initial Form Results
After that clarification and introduction, in which we explained the application's purpose, we started by
distributing our initial form to the users.
Tests were executed by 16 people, 7 male and 9 female. On table 4.1 we can see the ages summa-
rized.
Age  18-30  31-40  41-50
     44%    31%    25%
Table 4.1: Users’ ages summarized
75% of the users have at least a college degree; 44% work in the technology area; and 100%
often use a computer, mobile phone and/or tablet. These results were expected, as working with
these kinds of tools/applications implies some technology knowledge.
As for tracking activities, the results are shown in fig. 4.1. 38% of users do not track their activities;
the rest track their activities using Excel sheets and mobile phone applications, which is
also an expected result, as it is the most affordable way.
Figure 4.1: Tracking activities’ results
In the last question, we wanted to know the main reason that refrains users from keeping
track of their activities. Clearly, lack of time, lack of motivation and the fact that data is spread across several
devices do not work favorably, as people tend to become demotivated (see fig. 4.2).
Figure 4.2: Reasons that refrain users from keeping track of their activities
4.3.2 Analysis of Tests’ Results
Once the initial form was filled in by every user, we delivered the tests' guide to all users at the same
time and the timers were reset for each user in each exercise. Given the users' characteristics
(including their high familiarity with technology), the error rate during the tests' execution was zero.
We also attribute this result to the fact that this is a system with a low degree of difficulty.
The first set of exercises covers the creation of new variables from three different sources: manually,
Google spreadsheet and Fitbit. This way, we can also measure the satisfaction level for each data
source. The exercises were easily executed by every user, although exercise 1.2 took longer than the
other two; this was expected, considering that users had to deal with conflicts. The first and third
exercises have low standard deviations which, in our opinion, would be expected since the course of the
task is fixed. The second exercise is different, as the user needs to handle conflict resolution,
which implies a choice that differs from user to user and may, in turn, have different durations.
Thus, this exercise has a higher standard deviation compared to the other two. Table 4.2 summarizes the
results of the first set of exercises, in terms of efficiency, measured in seconds.
                1.1      1.2      1.3
Min             19s      130s     30s
Max             31s      180s     40s
Average         23.94s   154.25s  35.38s
Std. Deviation  3.94s    16.15s   3.12s
Table 4.2: Results of the first set of exercises, in terms of efficiency
In the second set of exercises, whose goal is to instruct users on the insertion of values,
we asked users to insert values in different situations, for example into a variable imported from an Excel
sheet. These exercises all have a low standard deviation; this result was also expected, since all
exercises, except the fourth, had a fixed course. Exercise 2.4, although it has two possible courses,
takes approximately the same amount of time in both cases. Table 4.3 summarizes the results
of the second set of exercises, in terms of efficiency, measured in seconds.
The third set of exercises aims to show users how they can export data: in the first
exercise the user inserts a time range and a variable to list all the associated records; in the second
exercise the user is asked to export the previous list to CSV. The results obtained from these two exercises
are exactly what was expected. The small variation present in exercise 3.1 relates only to the
time taken by users to insert the start and end dates. The duration of exercise 3.2 is the same for
                2.1     2.2     2.3     2.4
Min             30s     10s     13s     13s
Max             40s     15s     21s     22s
Average         34.12s  12.19s  17.63s  15.88s
Std. Deviation  3.39s   1.47s   2.29s   3.31s
Table 4.3: Results of the second set of exercises, in terms of efficiency
all cases, as it boils down to pressing a button. Table 4.4 summarizes the results of the third set of
exercises, in terms of efficiency, measured in seconds.
                3.1     3.2
Min             20s     1s
Max             25s     1s
Average         22.81s  1s
Std. Deviation  1.55s   0s
Table 4.4: Results of the third set of exercises, in terms of efficiency
4.3.3 Analysis of Final Questionnaire’s Results
After executing the proposed exercises and noting how long each user took on each one, we asked
our users to answer a final questionnaire. As mentioned earlier, this questionnaire
is divided into two parts: a first part where we used the System Usability Scale (SUS), and a second part
with specific questions related to the application.
The first part of the questionnaire is the SUS. SUS scores range from 0 to 100 and, in
order to evaluate what constitutes a good result, we looked into several studies, as opinions can vary slightly.
In "An Empirical Evaluation of the System Usability Scale" [2], the author analyzed 2324 assessments
with an overall average of 70.14 (69.69 when the assessments are divided
up by project). Thus, the author claims that good systems score between 70 and 80
points, and exceptional systems score 90 or more. In "Measuring Usability with the System Usability Scale
(SUS)" [18], from a study with over 5000 users across 500 different evaluations, the average result was
68. Based on both studies, we can conclude that a score above 70 indicates a good system.
The odd items are phrased positively and the even items negatively, for example:
• This website was easy to use.
• It was difficult to find what I needed on this website.
The major reason for this is to minimize extreme response bias and acquiescence bias. Thus, we
grouped redundant questions to simplify our analysis.
On table 4.5 we can see the SUS scores:
Our overall SUS Score is 74.1, which means that we have a good system and we are on the right
track! Even though a SUS score can range from 0 to 100, it is not a percentage.
User  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  SUS Score
U1    4   2   5   2   4   1   5   1   5   3    85
U2    3   3   4   3   5   1   4   1   4   3    72.5
U3    2   3   5   1   5   1   3   3   4   4    67.5
U4    5   1   5   1   5   1   5   1   5   1    100
U5    5   2   4   2   4   2   4   2   4   2    77.5
U6    4   1   5   4   3   2   3   2   4   2    70
U7    2   5   2   5   4   3   2   4   2   4    27.5
U8    5   4   3   2   4   2   5   2   4   4    67.5
U9    4   1   3   3   3   3   5   1   5   2    75
U10   5   1   4   3   4   2   4   1   5   4    77.5
U11   5   1   5   2   4   1   5   2   4   2    87.5
U12   5   4   2   4   4   2   3   3   3   4    50
U13   4   2   5   1   5   1   4   2   5   1    90
U14   4   3   4   2   4   1   4   1   4   3    75
U15   5   1   5   3   5   1   5   1   5   1    95
U16   4   2   4   4   4   1   4   2   4   4    67.5
Average Overall SUS Score: 74.1
Table 4.5: Individual and Overall SUS Scores
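The per-user scores in Table 4.5 follow the standard SUS scoring rule, which can be computed as below (a minimal Python sketch of the standard formula, not code from the thesis):

```python
def sus_score(responses):
    """Standard SUS scoring: odd items contribute (r - 1), even items
    contribute (5 - r); the sum is multiplied by 2.5 to reach 0-100."""
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

print(sus_score([4, 2, 5, 2, 4, 1, 5, 1, 5, 3]))  # U1's answers -> 85.0
```

For example, U4's answers (5, 1, 5, 1, ...) yield the maximum score of 100, matching the table.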
On fig. 4.3, we can see that around 80% of the users would like to use our system frequently, which
is a good sign and reinforces our work.
Figure 4.3: Answers to question 1
The results in figs. 4.4, 4.5 and 4.6 relate to the system's complexity; we can verify that the answers are
coherent and that around 65% of users think the system is easy to use.
On figs. 4.7 and 4.8, the results are about the system’s consistency, both having a positive response
from around 85% of the users.
The results in figs. 4.9 and 4.10 are related to responsiveness. We can conclude that the application
is not time consuming, as its responses are quick enough.
The results in figs. 4.11 and 4.12 are related to the application's complexity; from these we can conclude that
Figure 4.4: Answers to question 2
Figure 4.5: Answers to question 3
Figure 4.6: Answers to question 4
some background knowledge is needed to use it.
In the second part, we asked users specific questions about our application.
On fig. 4.13, we can see the results about the usefulness of our application, and almost 100% of the
Figure 4.7: Answers to question 5
Figure 4.8: Answers to question 6
Figure 4.9: Answers to question 7
users responded positively.
In fig. 4.14, and in line with the previous answers, most of the users raise the possibility of using our
application.
Figure 4.10: Answers to question 8
Figure 4.11: Answers to question 9
Figure 4.12: Answers to question 10
Figure 4.13: Answers to question 1
Figure 4.14: Answers to question 2
On fig. 4.15, users indicated that the exercises proposed were easy. Once again, this is in agreement
with the answers given in the first part of the questionnaire.
Figure 4.15: Answers to question 3
In fig. 4.16, it is possible to verify that, according to most of the users, the application responds
quickly.
Figure 4.16: Answers to question 4
Finally, in fig. 4.17, the global appreciation is very positive, which leads us to think that our system
can be a great addition!
4.4 Experimental Results’ Discussion
Overall, the results of our experiments were positive, to the extent that most people considered
our application helpful and the exercises adequate. Also, a SUS score of 74 is
very good and exciting, although we know there is still a lot to improve. The overall assessment
was also positive, as most of the users see themselves using this application in the future.
Figure 4.17: Answers to question 5
Regarding our achievements, we can say that we have satisfied requirement R1 (Reflective Learning
support), as our application provides adequate means to address the main barriers identified for
each of the stages in the Five-Stage-Based model (Preparation, Collection, Integration, Reflection
and Action), through our data collection processes, data quality and normalization, reminders' system
and data exporting.
As for requirement R2 (Data Quality), we used auto completion/suggestions and predefined values, also
data quality is directly related to normalization, which we also executed.
Requirement R3 (Balanced Interface Complexity) is met because we give users the power to select the
amount of visible information.
Requirement R4 (Easy to use and Non time consuming) is addressed by providing a clean and easy
interface, which in turn makes the application non time consuming.
Finally, requirement R5 (Keep motivation) is achieved through our active notification system, which
prompts the user to initiate the reflection process and helps keep them motivated, along with the clean,
easy and non time consuming interface.
We can conclude that we accomplished most of our goals, as we have created a system that allows
both manual and automatic collection, thus facilitating the collection and integration process.
Chapter 5
Conclusions
Lifelogging provides a great way to understand what is wrong in your lifestyle and to become aware of
what is and is not helping you towards your goals, such as physical activity, losing weight, quitting smoking,
improving food habits, etc. There are lots of applications and devices that help you record data, such as
FitBit, QuitNow!, etc. However, we still face several problems when lifelogging: lack of time, poor
conclusions, data spread across several applications, which prevents users from analyzing it as a whole, etc.
In this work we proposed a solution that performs a mash-up of different lifelogging data sources, building
a system that properly supports reflective learning and data quality, lets users select different data
sources, and is easy to use and non time consuming, which is an important feature nowadays. By fulfilling
these requirements we created a system that solves (or greatly minimizes) the problems indicated above.
At the end of this work we are pleased with the results and believe that we have developed something
useful. As already mentioned, we currently have many personal information collection systems, each of
which works alone, not allowing its users to know how certain factors can influence others, for example
how physical activity relates to diet. We have accomplished most of our goals: our system allows both
manual and automatic collection, always aiming to facilitate the collection and integration process.
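The mash-up idea at the heart of this conclusion can be illustrated with a small sketch: per-day records from independent sources are merged into a single row per day, so the user can analyze the data as a whole. The source names and fields below are hypothetical:

```python
from collections import defaultdict

# Hypothetical per-day records from two independent sources.
fitbit   = [{"date": "2014-10-01", "steps": 9500},
            {"date": "2014-10-02", "steps": 4200}]
food_log = [{"date": "2014-10-01", "calories": 2100}]

def mash_up(*sources):
    """Merge per-day records from several sources into one row per day."""
    days = defaultdict(dict)
    for source in sources:
        for record in source:
            days[record["date"]].update(record)
    return [days[d] for d in sorted(days)]

for row in mash_up(fitbit, food_log):
    print(row)
```

Once all sources share one row per day, cross-source questions (steps versus calories, for instance) become simple column operations.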
5.1 Future Work
Regarding future work, one of the most obvious features to implement is data visualization. Since we
are now able to aggregate data from several sources with a high level of data quality, it would be very
interesting and useful to present results based on the correlation between all the variables collected.
collected. It is also important to take into account what types of information visualization best suit each
scenario.
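As a sketch of the correlation analysis envisaged here, the Pearson correlation coefficient between two aggregated daily series could be computed as below; the step and calorie figures are invented for illustration, not collected data:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical daily logs: step counts vs. calories consumed.
steps    = [3000, 8000, 12000, 5000, 10000]
calories = [2600, 2100, 1900, 2400, 2000]
print(round(pearson(steps, calories), 2))  # strong negative correlation
```

A coefficient near +1 or −1 would flag a pair of variables worth surfacing in a visualization; values near 0 suggest no linear relationship.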
Another feature that we think would add great value to this application would be to give users the
possibility to set up a new external data source by themselves. This way each user could customize the
application according to their lifelogger profile.
Integration with social networks is another issue to be addressed. Sharing information on social
networks has become one of the main online activities nowadays, and it can be an important topic
when talking about motivation. Many systems already provide specific functionality to help users share
information on social networks such as Facebook or Twitter. Normative social influence is a type of
social influence leading to conformity. It is defined as "the influence of other people that leads us to
conform in order to be liked and accepted by them" [1]. Basically, when people that you identify as your
peers appear to believe or behave in a certain way, you are far more likely to believe or behave similarly.
For example, if all of your Facebook friends suddenly appear to be running, you’re more likely to become
a runner yourself, even if you hate running. You begin to develop the feeling that everyone else is doing
it except you. And in the world of normative social influence, the logical outcome is conformity. You find
yourself running as well. But it goes both ways: every time you post about your goals, you provoke the
same influence on your friends. As they start to succumb, their conformity reinforces the norm which, in
turn, makes you more likely to stay on the right course. In addition, as your friends like or comment on
your goal-oriented updates, the normative influence increases the probability that you will stick to your
posted goals.
Appendix A
Initial form given to the users:
1. Gender
• Male
• Female
2. Age
• 18-30
• 31-40
• 41-50
3. What is your completed educational degree?
• Did Not Complete High School
• High School
• Some College
• Bachelor’s Degree
• Master’s Degree
• Advanced Graduate work or Ph.D.
4. What is your professional area?
• I’m still a student
• Technologies
• Arts
• Education
• Health
• Other
5. Do you often use a computer, mobile phone and/or tablet?
• Yes
• No
6. Do you frequently use gadgets to track your activities?
• Yes, but I just use my mobile phone and apps installed.
• Yes, I use my mobile phone and a tracking device.
• Yes, a specific tracking device.
• No, I use an Excel sheet or similar.
• No, I don’t track my activities.
• Other
7. What are the reasons that prevent you from tracking your activities?
• Lack of time
• Lack of motivation
• Data spread across several devices, which makes it impossible to correlate and analyze everything
• Not interested
• Other
Bibliography
[1] E. Aronson, T. D. Wilson, R. M. Akert, and B. Fehr. Social psychology. Pearson Education Canada,
2012.
[2] A. Bangor, P. T. Kortum, and J. T. Miller. An empirical evaluation of the system usability scale. Intl.
Journal of Human–Computer Interaction, 24(6):574–594, 2008.
[3] F. Bentley and K. Tollmar. The power of mobile notifications to increase wellbeing logging behavior.
In ACM SIGCHI International Conference on Human Factors in Computing Systems, 2013.
[4] J. Brooke. SUS: a quick and dirty usability scale. Usability Evaluation in Industry, 189:194, 1996.
[5] P. Burns, C. Lueg, and S. Berkovsky. Using personal informatics to motivate physical activity: Could
we be doing it wrong? In Proceedings of the 2012 ACM annual conference extended abstracts on
Human Factors in Computing Systems Extended Abstracts, CHI EA ’12, 2012.
[6] V. Bush. As we may think. volume 1, pages 36–44, New York, NY, USA, Apr. 1979. ACM. doi:
10.1145/1113634.1113638. URL http://doi.acm.org/10.1145/1113634.1113638.
[7] S. Consolvo, P. Klasnja, D. W. McDonald, D. Avrahami, J. Froehlich, L. LeGrand, R. Libby,
K. Mosher, and J. A. Landay. Flowers or a robot army?: encouraging awareness & activity with
personal, mobile displays. In Proceedings of the 10th international conference on Ubiquitous com-
puting, UbiComp ’08, pages 54–63, New York, NY, USA, 2008. ACM. ISBN 978-1-60558-136-1.
doi: 10.1145/1409635.1409644. URL http://doi.acm.org/10.1145/1409635.1409644.
[8] C. Fan, J. Forlizzi, and A. Dey. Spark: Visualizing physical activity using abstract, ambient art. In
Proceedings of the 2012 ACM annual conference extended abstracts on Human Factors in Com-
puting Systems Extended Abstracts, CHI EA ’12, 2012.
[9] J. Gemmell, R. Lueder, and G. Bell. The MyLifeBits lifetime store. In Proceedings of the 2003 ACM
SIGMM workshop on Experiential telepresence, ETP ’03, pages 80–83, New York, NY, USA, 2003.
ACM. ISBN 1-58113-775-3. doi: 10.1145/982484.982500. URL http://doi.acm.org/10.1145/
982484.982500.
[10] P. H. Kim and F. Giunchiglia. The open platform for personal lifelogging: the elifelog architecture.
CHI EA ’13, 2013. ISBN 978-1-4503-1952-2.
[11] J. Larsen, A. Cuttone, and S. Lehmann. Qs spiral: Visualizing periodic quantified self data. In ACM
SIGCHI International Conference on Human Factors in Computing Systems, 2013.
[12] I. Li, A. Dey, and J. Forlizzi. A stage-based model of personal informatics systems. In Proceedings
of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’10, pages 557–566,
New York, NY, USA, 2010. ACM. ISBN 978-1-60558-929-9. doi: 10.1145/1753326.1753409. URL
http://doi.acm.org/10.1145/1753326.1753409.
[13] J. J. Lin, L. Mamykina, S. Lindtner, G. Delajoux, and H. B. Strub. Fish’n’steps: encouraging physical
activity with an interactive computer game. In Proceedings of the 8th international conference on
Ubiquitous Computing, UbiComp’06, Berlin, Heidelberg, 2006. Springer-Verlag.
[14] J. Nielsen. 10 usability heuristics for user interface design. Nielsen Norman Group: Evidence-
Based User Experience Research, Training, and Consulting, 1995.
[15] S. Nundy. Quit smoking by using facebook. http://www.kevinmd.com/blog/2010/03/
quit-smoking-facebook.html, 2010.
[16] L. R. Pina, E. Ramirez, and W. G. Griswold. Fitbit+: A behavior-based intervention system to reduce
sedentary behavior. In Pervasive Computing Technologies for Healthcare (PervasiveHealth), 2012
6th International Conference on, pages 175–178. IEEE, 2012.
[17] V. Rivera-Pelayo, V. Zacharias, L. Muller, and S. Braun. Applying quantified self approaches to sup-
port reflective learning. In Proceedings of the 2nd International Conference on Learning Analytics
and Knowledge, LAK ’12, pages 111–114, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-
1111-3. doi: 10.1145/2330601.2330631. URL http://doi.acm.org/10.1145/2330601.2330631.
[18] J. Sauro. Measuring usability with the system usability scale (sus), 2011.
[19] V. Sosik and D. Cosley. Thinking about side effects of personal informatics systems. In Proceedings
of the 2012 ACM annual conference extended abstracts on Human Factors in Computing Systems
Extended Abstracts, CHI EA ’12, 2012.
[20] K. Tollmar, F. Bentley, and C. Viedma. Mobile health mashups: Making sense of multiple streams
of wellbeing and contextual data for presentation on a mobile device. In Pervasive Computing
Technologies for Healthcare (PervasiveHealth), 2012 6th International Conference on, pages 65–
72. IEEE, 2012.
[21] C. Zhu, J. Tang, H. Li, H. T. Ng, and T. Zhao. A unified tagging approach to text normalization. In
Annual Meeting of the Association for Computational Linguistics, volume 45, page 688. Citeseer, 2007.