towards nlg for physiological data monitoring with body area...

6
http://www.diva-portal.org This is the published version of a paper presented at 14th European Workshop on Natural Language Generation, Sofia, Bulgaria, August 8-9, 2013. Citation for the original published paper : Banaee, H., Ahmed, M U., Loutfi, A. (2013) Towards NLG for Physiological Data Monitoringwith Body Area Networks. In: 14th European Workshop on Natural Language Generation (pp. 193-197). N.B. When citing this work, cite the original published paper. Permanent link to this version: http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-30257

Upload: others

Post on 26-Feb-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

http://www.diva-portal.org

This is the published version of a paper presented at 14th European Workshop on Natural LanguageGeneration, Sofia, Bulgaria, August 8-9, 2013.

Citation for the original published paper:

Banaee, H., Ahmed, M U., Loutfi, A. (2013)Towards NLG for Physiological Data Monitoringwith Body Area Networks.In: 14th European Workshop on Natural Language Generation (pp. 193-197).

N.B. When citing this work, cite the original published paper.

Permanent link to this version:http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-30257

Page 2: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

Proceedings of the 14th European Workshop on Natural Language Generation, pages 193–197,Sofia, Bulgaria, August 8-9 2013. c©2013 Association for Computational Linguistics

Towards NLG for Physiological Data Monitoringwith Body Area Networks

Hadi Banaee, Mobyen Uddin Ahmed and Amy LoutfiCenter for Applied Autonomous Sensor Systems

Orebro University, Sweden{hadi.banaee,mobyen.ahmed,amy.loutfi}@oru.se

Abstract

This position paper presents an on-goingwork on a natural language generationframework that is particularly tailored forsummary text generation from body areanetworks. We present an overview ofthe main challenges when considering thistype of sensor devices used for at homemonitoring of health parameters. This pa-per describes the first steps towards the im-plementation of a system which collectsinformation from heart rate and respira-tion rate using a wearable sensor. The pa-per further outlines the direction for futurework and in particular the challenges forNLG in this application domain.

1 Introduction

Monitoring of physiological data using body areanetworks (BAN) is becoming increasingly popularas advances in sensor and wireless technology en-able lightweight and low costs devices to be easilydeployed. This gives rise to applications in homehealth monitoring and may be useful to promotegreater awareness of health and prevention for par-ticular end user groups such as the elderly (Ahmedet al., 2013). A challenge however, is the large vol-umes of data which is produced as a result of wear-able sensors. Furthermore, the data has a num-ber of characteristics which currently make auto-matic methods of data analysis particularly diffi-cult. Such characteristics include the multivariatenature of the data where several dependent vari-ables are captured as well as the frequency of mea-surements for which we still lack a general under-standing of how particular physiological parame-ters vary when measured continuously.

Recently many systems of health monitoringsensors have been introduced which are designedto perform massive and profound analysis in the

area of smart health monitoring systems (Baigand Gholamhosseini, 2013). Also several researchhave been done to show the applications and ef-ficiency of data mining approaches in healthcarefields (Yoo et al., 2012). Such progress in thefield would be suitable to combine with state ofthe art in the NLG community. Examples of suit-able NLG systems include the system proposed byReiter and Dale (2000) which suggested an archi-tecture to detect and summarise happenings in theinput data, recognise the significance of informa-tion and its compatibility to the user, and gener-ate a text which shows this knowledge in an un-derstandable way. A specific instantiation of thissystem on clinical data is BabyTalk project, whichis generated summaries of the patient records invarious time scales for different end users (Portetet al., 2009; Hunter et al., 2012). While theseworks have made significant progress in the field,this paper will outline some remaining challengesthat have yet to be addressed for physiological datamonitoring which are discussed in this work. Thepaper will also present a first version of an NLGsystem that has been used to produce summariesof data collected with a body area network.

2 Challenges in Physiological DataMonitoring with BAN

2.1 From Data Analysis to NLG

One of the main challenges in healthcare area ishow to analyse physiological data such that valu-able information can help the end user. To have ameaningful analysis of input signals, preprocess-ing the data is clearly an important step. Thisis especially true for wearable sensors where thesignals can be noisy and contain artifacts in therecorded data. Another key challenge in physio-logical data monitoring is mapping from the manydata analysis approaches to NLG. For examplefinding hidden layers of information with unsuper-

193

Page 3: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

vised mining methods will be enable the system tomake a representation of data which is not pro-ducible by human analysis alone. However, do-main rules and expert knowledge are important inorder to consider a priori information in the dataanalysis. Further external variables (such as med-ication, food, stress) may also be considered in asupervised analysis of the data. Therefore, there isa challenge to balance between data driven tech-niques that are able to find intrinsic patterns in thedata and knowledge driven techniques which takeinto account contextual information.

2.2 End User / ContentA basic issue in any design of a NLG system is un-derstanding the audience of the generated text. Forhealth monitoring used e.g. at home this issue ishighly relevant as a variety of people with diversebackgrounds may use a system. For example, aphysician should have an interpretation using spe-cial terms, in contrast for a lay user where infor-mation should be presented in a simple way. Forinstance, for a decreasing trend in heart rate lowerthan defined values, the constructed message forthe doctor could be: “There is a Bradycardia at. . . ”. But for the patient itself it could be just:“Your heart rate was low at . . . ”. It is also im-portant to note that the generated text for the sameuser in various situations should also differ. Forinstance a high heart rate at night presents a dif-ferent situation than having a high heart rate dur-ing the high levels of activity. Consequently, allthe modules in NLG systems (data analysis, docu-ment planning, etc.) need to consider these aspectsrelated to the end user.

2.3 Personalisation / Subject ProfilingPersonalisation differs from context awarenessand is effective to generate messages adapted tothe personalised profile of each subject. One pro-file for each subject is a collection of informationthat would be categorised to: metadata of the per-son (such as age, weight, sex, etc.), the history ofhis/her signals treatments and the extracted fea-tures such as statistical information, trends, pat-terns etc. This profiling enables the system to per-sonalise the generated messages. Without profil-ing, the represented information will be shallow.For instance, two healthy subjects may have dif-ferent baseline values. Deviations from the base-line may be more important to detect than thresh-old detection. So, one normal pattern for one in-

Document Planning

Data Analysis

Single / Batch Measurement

Uni / Multi parameter Analysis

Event-based Message Handler

Personal Profiles

-metadata-events-patterns- ...

Text

Summary-based Message Handler

Ontology

Microplanning and Realisation

Data PreprocessingExpert

Knowledge

Global info. Message Handler

Ranking Functions

Figure 1: System architecture of text generation from phys-iological data.

dividual could be an outlier for another individualconsidering his/her profile.

3 System Architecture

In this section we outline a proposed system ar-chitecture, which is presented in Figure 1. So farthe handling of the single and batch measurementsand the data analysis have been implemented aswell as first version of the document planning. Formicroplanning and realisation modules, we em-ployed the same ideas in NLG system proposedby Reiter and Dale (2000).

3.1 Data CollectionBy using wearable sensor, the system is able torecord continuous values of health parameters si-multaneously. To test the architecture, more than300 hours data for two successive weeks have beencollected using a wearable sensor called Zephyr(2013), which records several vital signs such asheart rate, respiration, temperature, posture, activ-ity, and ECG data. In this work we have primar-ily considered two parameters, heart rate (HR) andrespiration rate (RR) in the generated examples.

3.2 Input MeasurementsTo cover both short-term and long-term healthcaremonitoring, this system is designed to support twodifferent channels of input data. The first channelis called single measurement channel which is acontinuous recorded data record. Figure 2 showsan example of a single measurement. In the fig-ure, the data has been recorded for nine continuoushours of heart rate and respiration data which cap-ture health parameters during the sequential activi-

194

Page 4: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

20:30 21:45 23:00 00:15 01:30 02:45 04:00 05:15 06:30 07:45

0

20

40

60

80

100

120

140

160

180

200

220

HH:MM

bp

m

HR

RR

Watching TVExercising Walking

Walking Sleeping

Figure 2: An example of single measurement, 13 hours ofheart rate (HR) and respiration rate (RR).

40

80Mar 13

40

80Mar 14

40

80Mar 15

40

80Mar 16

40

80Mar 18

0 h 1 h 2 h 3 h 4 h 5 h 6

40

80

Hours

bpm

Mar 19

Figure 3: An example of batch measurement included heartrate for 6 nights.

ties such as exercising, walking, watching TV, andsleeping. To have a long view of health parame-ters, the system is also designed to analyse a batchof measurements. Batch measurements are sets ofsingle measurements. Figure 3 presents an exam-ple of a batch of measurements that contain all thereadings during the night for a one week period.This kind of input data allows the system to make arelation between longitudinal parameters and canrepresent a summary of whole the dataset.

3.3 Data Analysis

To generate a robust text from the health param-eters, the data analysis module extracts the infor-mative knowledge from the numeric raw data. Theaim of data analysis module is to detect and repre-sent happenings of the input signals. The primarystep to analyse the measurements is denoising andremoving artifacts from the raw data. In this work,by using expert knowledge for each health param-eter, the artifact values are removed. Meanwhile,to reduce the noise in the recorded data, a series ofsmoothing functions (wavelet transforms and lin-ear regression (Loader, 2012)) have been applied.

In this framework an event based trend detec-tion algorithm based on piecewise linear segmen-tation methods (Keogh et al., 2003) for the timeseries has been used. In addition, general statisticsare extracted from the data such as mean, mode,frequency of occurrence etc. that are fed into thesummary based message handler. As an ongoingwork, the system will be able to recognise mean-ingful patterns, motifs, discords, and also deter-mine fluctuation portions among the data. Also formulti-parameter records, the input signals wouldbe analysed simultaneously to detect patterns andevents in the data. Therefore the particular noveltyof the approach beyond other physiological dataanalysis is the use of trend detection.

3.4 Document Planning

Document planning is responsible to determinewhich messages should appear, how they shouldbe combined and finally, how they should be ar-ranged as paragraphs in the text. The messagesin this system are not necessarily limited to de-scribing events. Rather, the extracted informationfrom the data analysis can be categorised into oneof three types of messages: global information,event based, and summary based messages. Foreach type of message category there is a separateranking function for assessing the significance ofmessages for communicating in the text. The or-der of messages in the final text is a function basedon (1) how much each message is important (valueof the ranking function for each message) (2) theextracted relations and dependencies between thedetected events. The output of document plan-ning module is a set of messages which are organ-ised for microplanning and realisation. Documentplanning contains both event based and summarybased messages as described below.

Event based Message Handler: Most of theinformation from the data analysis module are cat-egorised as events. Event in this system is an ex-tracted information which happens in a specifictime period and can be described by its attributes.Detected trends, patterns, and outliers and alsoidentified relations in all kinds of data analysis(single/batch measurement or uni/multi parame-ter) are able to be represented as events in the text.The main tasks of the event based message handlerare to determine the content of events, constructand combine corresponding messages and their re-lations, and order them based on a risk function.

195

Page 5: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

The risk function is subordinate to the features ofthe event and also expert knowledge to determinehow much this event is important.

Summary based Message Handler: Linguis-tic summarisation of the extracted knowledge datais a significant purpose of summary based messagehandler. With inspiration from the works doneby Zadeh (2002) and Kacprzyk et. al (2008), werepresent the summary based information consid-ering the possible combination of conditions forthe summary of data. The proposed system usesfuzzy membership function to map the numericdata into the symbolic vocabularies. For instanceto summarise the treatments of heart rate duringall nights of one week in linguistic form, we de-fine a fuzzy function to identify the proper rangeof low/medium/high heart rate level or specify aproper prototype for representing the changes suchas steadily/sharply or fluctuated/constant. Here,the expert knowledge helps to determine this task.

The validity of these messages is measured by adefined formula in linguistic fuzzy systems calledtruth function which shows the probability of pre-cision for each message. The system uses thisindicator as a ranking function to choose mostimportant messages for text. The main tasks ofsummary based message handler are: determiningthe content of the summaries, constructing corre-sponding messages, and ordering them based onthe truth function to be appeared in the final text.The summary based message handler is not con-sidered in previous work in this domain.

3.5 Sample Output

The implemented interface is shown in Figure 4which is able to adapt the generated text with fea-tures such as health parameters, end user, mes-sage handler etc.. Currently our NLG system pro-vides the following output for recorded signalswhich covers global information and trend detec-tion messages. Some instances of generated textare shown, below. The first portion of messages ineach text is global information which includes ba-sic statistical features related to the input signals.An example of these messages for an input data is:“This measurement is 19 hours and 28 minutes which started

at 23:12:18 on February 13th and finished at 18:41:08 on the

next day.”

“The average of heart rate was 61 bpm. However most of the

time it was between 44 and 59 bpm. The average of respira-

tion rate was 19 bpm, and it was between 15 and 25 bpm.”

Figure 4: A screenshot of the implemented interface.

Regarding to the event based messages, an ex-ample of the output text extracted from the trenddetection algorithm is:“Between 6:43 and 7:32, the heart rate suddenly increased

from 50 to 108 and it steadily decreased from 90 to 55 be-

tween 11:58 and 17:21.”

4 Future Work

So far we have described the challenges and thebasic system architecture that has been imple-mented. In this section we outline a number ofsample outputs intended for future work whichcaptures e.g. multivariate data and batch of mea-surement. We foresee that there is a non-trivial in-teraction between the event message handler andthe summary message handler. This will be fur-ther investigated in future work.

Samples for single measurement:“Since 9:00 for half an hour, when respiration rate became

very fluctuated, heart rate steadily increased to 98.”

“Among all high levels of heart rate, much more than half are

very fluctuated.”

Samples for batch of measurements:“During most of the exercises in the last weeks, respiration

rate had a medium level.”

“During most of the nights, when your heart rate was low,

your respiration rate was a little bit fluctuated.”

Other messages could consider the comparisonbetween the history of the subject and his/her cur-rent measurement to report personalised unusualevents e.g.:“Last night, during the first few hours of sleep, your heart

rate was normal, but it fluctuated much more compared to

the similar times in previous nights.”

In this work we have briefly presented a pro-posed NLG system that is suitable for summaris-ing data from physiological sensors using naturallanguage representation rate. The first steps to-wards an integrated system have been made andan outline of the proposed system has been given.

196

Page 6: Towards NLG for Physiological Data Monitoring with Body Area …oru.diva-portal.org/smash/get/diva2:641704/FULLTEXT01.pdf · 2017. 3. 6. · This is the published version of a paper

ReferencesMobyen U. Ahmed, Hadi Banaee, and Amy Loutfi.

2013. Health monitoring for elderly: an applica-tion using case-based reasoning and cluster analysis.Journal of ISRN Artificial Intelligence, vol. 2013, 11pages.

Mirza M. Baig and Hamid Gholamhosseini. 2013.Smart health monitoring systems: an overview ofdesign and modeling. Journal of Medical Systems,37(2):1–14.

James Hunter, Yvonne Freer, Albert Gatt, Ehud Re-iter, Somayajulu Sripada, and Cindy Sykes. 2012.Automatic generation of natural language nurs-ing shift summaries in neonatal intensive care:BT-Nurse. Journal of Artificial Intelligence inMedicine, 56(3):157–172.

Janusz Kacprzyk, Anna Wilbik, and SlawomirZadrozny. 2008. Linguistic summarization of timeseries using a fuzzy quantifier driven aggregation.Fuzzy Sets and Systems, 159(12):1485–1499.

Eamonn J. Keogh, Selina Chu, David Hart, andMichael Pazzani. 2003. Segmenting time series:a survey and novel approach. Data Mining In TimeSeries Databases, 57:1–22.

Catherine Loader. 2012. Smoothing: local regressiontechniques. Springer Handbooks of ComputationalStatistics, 571-596.

Franois Portet, Ehud Reiter, Albert Gatt, Jim Hunter,Somayajulu Sripada, Yvonne Freer, and CindySykes. 2009. Automatic generation of textual sum-maries from neonatal intensive care data. Journal ofArtificial Intelligence, 173:789–816.

Ehud Reiter and Robert Dale. 2000. Building naturallanguage generation systems. Cambridge Univer-sity Press, Cambridge, UK.

Illhoi Yoo, Patricia Alafaireet, Miroslav Marinov, KeilaPena-Hernandez, Rajitha Gopidi, Jia-Fu Chang, andLei Hua. 2012. Data mining in healthcare andbiomedicine: a survey of the literature. Journal ofMedical Systems, 36(4):2431–2448.

Lotfi A. Zadeh. 2002. A prototype centered approachto adding deduction capabilities to search engines.Annual Meeting of the North American Fuzzy In-formation Processing Society, (NAFIPS 2002) 523–525.

Zephyr. http://www.zephyr-technology.com, AccessedApril 10, 2013.

197